Age | Commit message (Collapse) | Author | Files | Lines |
|
Fixes: 5611389876a748e19b7593d4eb426ced7a6ed31f
Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
An out of tree platform (p8dtu) uses a different IPMI OEM command
for IPMI_PARTIAL_ADD_ESEL. This exposed some assumptions about the BMC
implementation in our core code.
Now, with platform.bmc, each platform can dictate (or detect) the BMC
that is present. We allow it to be set at runtime rather than purely
statically in struct platform as it's possible to have differing BMC
implementations on the one machine (e.g. AMI BMC or OpenBMC).
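A minimal sketch of the idea, assuming hypothetical names (`ex_bmc_platform`, `ex_platform`, `ex_set_bmc_platform()`) and made-up command values; the real skiboot structures and OEM command codes may differ:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one BMC implementation. */
struct ex_bmc_platform {
	const char *name;
	uint32_t partial_add_esel_cmd;	/* OEM command for partial add eSEL */
};

/* Hypothetical platform: bmc is a pointer so it can be swapped at runtime. */
struct ex_platform {
	const char *name;
	const struct ex_bmc_platform *bmc;
};

/* Illustrative values only, not real IPMI command codes. */
static const struct ex_bmc_platform ex_bmc_a = { "bmc-a", 0x1001 };
static const struct ex_bmc_platform ex_bmc_b = { "bmc-b", 0x2002 };

static struct ex_platform ex_plat = { "example", &ex_bmc_a };

/* Called from a platform probe once the BMC actually present is known. */
static void ex_set_bmc_platform(const struct ex_bmc_platform *bmc)
{
	assert(bmc != NULL);
	ex_plat.bmc = bmc;
}

/* Core code asks the platform instead of hard-coding one command. */
static uint32_t ex_esel_command(void)
{
	return ex_plat.bmc->partial_add_esel_cmd;
}
```

Keeping the BMC description behind a pointer is what lets a single platform definition cover machines that ship with different BMC firmware.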
Acked-by: Jeremy Kerr <jk@ozlabs.org>
[stewart@linux.vnet.ibm.com: remove enum, update (C) years]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently Hostboot populates the /bmc/sensors dt node and the corresponding
sensors only for BMC platforms. For FSP platforms, Hostboot does not populate
any FSP sensors (management sensors), and no firmware progress sensor exists
on FSP platforms. As a result, OPAL incorrectly sets the firmware status
on sensor id "00", which does not exist.
On a FSP system:
cat /sys/firmware/opal/msglog | grep -i setting
[ 21.189204883,6] IPMI: setting fw progress sensor 00 to 07
[ 21.189559121,6] IPMI: setting fw progress sensor 00 to 13
cat /sys/firmware/opal/msglog | grep -i skiboot
[ 84.127416495,5] SkiBoot skiboot-5.4.0-rc3 starting...
On a BMC system:
cat /sys/firmware/opal/msglog | grep -i setting
[ 3.166286901,6] IPMI: setting fw progress sensor 05 to 14
[ 14.259153338,6] IPMI: setting fw progress sensor 05 to 07
[ 14.469070593,5] IPMI: Resetting boot count on successful boot
[ 15.001210324,6] IPMI: setting fw progress sensor 05 to 13
So this patch fixes this incorrect setting on an FSP system, and sets the sensor
only if OPAL has initialised the ipmi sensors and a corresponding sensor exists
for the given sensor type in the device tree.
After patch:
On a FSP system:
cat /sys/firmware/opal/msglog | grep -i setting
On a BMC system:
cat /sys/firmware/opal/msglog | grep -i setting
[ 3.164859816,6] IPMI: setting fw progress sensor 05 to 14
[ 14.024941077,6] IPMI: setting fw progress sensor 05 to 07
[ 14.211514767,5] IPMI: Resetting boot count on successful boot
[ 14.252554375,6] IPMI: setting fw progress sensor 05 to 13
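The guard described above can be sketched roughly as follows. All names are hypothetical, the return codes are local stand-ins for the OPAL ones, and the lookup table replaces the real device tree parsing:

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_OPAL_SUCCESS		0
#define EX_OPAL_UNSUPPORTED	(-7)	/* stand-in value for this sketch */
#define EX_SENSOR_TYPES		256

static bool ex_sensors_present;		/* true once any sensor was registered */
static bool ex_type_present[EX_SENSOR_TYPES];
static uint8_t ex_type_to_id[EX_SENSOR_TYPES];
static uint8_t ex_last_id, ex_last_value;

/* Would be driven by the sensors found in the device tree. */
static void ex_add_sensor(uint8_t type, uint8_t id)
{
	ex_sensors_present = true;
	ex_type_present[type] = true;
	ex_type_to_id[type] = id;
}

/* Refuse to touch a sensor that was never discovered. */
static int ex_set_sensor(uint8_t type, uint8_t value)
{
	if (!ex_sensors_present || !ex_type_present[type])
		return EX_OPAL_UNSUPPORTED;
	ex_last_id = ex_type_to_id[type];
	ex_last_value = value;
	return EX_OPAL_SUCCESS;
}
```

With no sensors registered (the FSP case), the set request is rejected instead of being sent to a default sensor id of 00.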
Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
[stewart@linux.vnet.ibm.com: return OPAL_UNSUPPORTED on !sensors_present,
make ipmi_sensor_type_present() static in ipmi-sensor.c]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fix typos, alignment and letter-case mismatches to make the code
clearer.
Signed-off-by: Mukesh Ojha <mukesh02@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
Commit 127a7dac added eSEL ID to SEL event in reverse order (0700 instead
of 0007). This code fixes this issue by adding ID in proper order.
Sample SEL event output without this patch:
Event Data (RAW) : 050700
Sample SEL event output with this patch:
Event Data (RAW) : 050005
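The two orderings differ only in which byte of the 16-bit ID is stored first. A small sketch with hypothetical helper names, assuming the ID occupies two consecutive event data bytes as the samples suggest:

```c
#include <assert.h>
#include <stdint.h>

/* Store a 16-bit record ID most significant byte first (0x0007 -> 00 07). */
static void ex_put_id_msb_first(uint8_t *dst, uint16_t id)
{
	assert(dst);
	dst[0] = (uint8_t)(id >> 8);
	dst[1] = (uint8_t)(id & 0xff);
}

/* The reversed layout (0x0007 -> 07 00), shown for comparison. */
static void ex_put_id_lsb_first(uint8_t *dst, uint16_t id)
{
	assert(dst);
	dst[0] = (uint8_t)(id & 0xff);
	dst[1] = (uint8_t)(id >> 8);
}
```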
Fixes: 127a7dac (IPMI: Add SEL event with eSEL record ID)
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Namely around PNOR access requests and OCC reset.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
ipmi-opal.c contains OPAL API related functions. Commit a561cf7b added IPMI
support for FSP based systems. Now both FSP and AMI BMC based systems make use
of these functions. Hence move this file from the hw specific dir to the core dir.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently these exist for some parts of the source tree, but not all of it.
They're nice if you are only modifying code in one part of the tree, as the
full test suite can be a little slow.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We pre-allocate an IPMI message for the PANIC event and use that memory to
send the PANIC event to the BMC. Presently we return NULL if we have not
initialized the PANIC event message, so we are not able to log early failure
events.
This patch tries to initialize the ipmi message instead of returning NULL.
Also initialize elog before ipmi_sel_init. Otherwise we will not be able
to create an elog message.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Presently we use queue method (ipmi_queue_msg) to send eSEL logs
to BMC.
There are cases like assert() where we want to commit messages
synchronously. This patch checks for log severity and logs PANIC
messages synchronously to BMC (Similar to what we do in FSP based
system).
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Commit a5299ba2 dropped non-severe events from logging to the BMC, but I forgot
to release the error log structure.
Fixes: a5299ba2 (IPMI: Only log events that needs attention)
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The IPMI Specs document (29.7 Event Data Field Formats) describes the
Event Data 1 field for a discrete sensor as follows:
  [7:6] - 00b = unspecified byte 2
          01b = previous state and/or severity in byte 2
          10b = OEM code in byte 2
          11b = sensor-specific event extension code in byte 2
  [5:4] - 00b = unspecified byte 3
          01b = reserved
          10b = OEM code in byte 3
          11b = sensor-specific event extension code in byte 3
  [3:0] - Offset from Event/Reading Code for discrete event state
The "System Firmware Progress" offset in the "System Firmware
Progress" Sensor being 0x02, we should be using 0xc2 in the event data
1 field.
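The arithmetic behind 0xc2 can be checked with a short sketch. The helper name is hypothetical; the bit positions are the ones listed above:

```c
#include <assert.h>
#include <stdint.h>

#define EX_BYTE2_SENSOR_SPECIFIC	0x3	/* [7:6] = 11b */
#define EX_BYTE3_UNSPECIFIED		0x0	/* [5:4] = 00b */

/* Pack the three sub-fields of Event Data 1 into one byte. */
static uint8_t ex_event_data1(uint8_t byte2_code, uint8_t byte3_code,
			      uint8_t offset)
{
	assert(byte2_code <= 0x3 && byte3_code <= 0x3 && offset <= 0xf);
	return (uint8_t)((byte2_code << 6) | (byte3_code << 4) | offset);
}
```

With a sensor-specific extension code in byte 2, byte 3 unspecified and offset 0x02, the packed value is (0x3 << 6) | 0x02 = 0xc2.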
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The patch corrects the order of the timestamp and manuf_id
attributes, which currently are reversed from what is stated
in the specs. (32. SEL Record Formats)
We don't use them in skiboot so there should not be any
consequences. This is mostly for the records and for qemu
powernv.
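For reference, a sketch of a 16-byte OEM SEL record with the timestamp ahead of the manufacturer ID. The struct and field names are illustrative and assume the OEM timestamped record type; they are not copied from skiboot:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout of a timestamped OEM SEL record (16 bytes). */
struct ex_oem_sel {
	uint8_t id[2];		/* record ID */
	uint8_t type;		/* record type */
	uint8_t timestamp[4];	/* timestamp comes first ... */
	uint8_t manuf_id[3];	/* ... followed by the manufacturer ID */
	uint8_t oem_data[6];	/* OEM defined payload */
};

/* Compile-time checks that the field order gives the expected offsets. */
_Static_assert(offsetof(struct ex_oem_sel, timestamp) == 3, "timestamp offset");
_Static_assert(offsetof(struct ex_oem_sel, manuf_id) == 7, "manuf_id offset");
_Static_assert(sizeof(struct ex_oem_sel) == 16, "record size");
```

Using only byte arrays keeps the layout free of padding, so the offsets can be asserted at compile time.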
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
hw/ipmi/ipmi-rtc.c:41:28: warning: incorrect type in argument 1 (different base types)
hw/ipmi/ipmi-rtc.c:41:28: expected restricted leint32_t [usertype] le_val
hw/ipmi/ipmi-rtc.c:41:28: got unsigned int [unsigned] [addressable] [usertype] result
hw/ipmi/ipmi-rtc.c:66:12: warning: incorrect type in assignment (different base types)
hw/ipmi/ipmi-rtc.c:66:12: expected unsigned int [unsigned] [usertype] tv
hw/ipmi/ipmi-rtc.c:66:12: got restricted leint32_t
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Our normal sequence for a soft power action (IPMI 'power soft' or
'power cycle') involves receiving a SEL from the BMC, sending a message
to Linux's opal platform support which instructs the host OS to shut
down, and finally the host requesting OPAL to cut power.
When the host is not yet up we will send the message to /dev/null, and
no action will be taken. This patch changes that behaviour to perform
the action immediately if we know how.
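A rough sketch of that decision, using hypothetical names throughout (the real code checks a boot-complete flag and calls into the platform; none of these identifiers are taken from skiboot):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ex_result { EX_SENT_TO_HOST, EX_DONE_NOW, EX_IGNORED };

/* Hypothetical platform hook; NULL means "we don't know how". */
struct ex_platform {
	int (*power_down)(void);
};

static int ex_power_down_calls;

/* Demo hook that only counts how often it was invoked. */
static int ex_demo_power_down(void)
{
	ex_power_down_calls++;
	return 0;
}

static enum ex_result ex_soft_power_off(bool host_booted,
					const struct ex_platform *p)
{
	assert(p);
	if (host_booted)
		return EX_SENT_TO_HOST;	/* real code would notify the host OS */
	if (p->power_down) {
		p->power_down();	/* host not up: act immediately */
		return EX_DONE_NOW;
	}
	return EX_IGNORED;		/* no handler available */
}
```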
Signed-off-by: Joel Stanley <joel@jms.id.au>
[stewart@linux.vnet.ibm.com: modify checking of OPAL_BOOT_COMPLETE flag, typo]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The caller usually has it and it avoids additional mftb() which
can be expensive.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[stewart@linux.vnet.ibm.com: fix run-timer unit test]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Presently the abort() function does not work on BMC based machines. The
system hangs after an abort/assert call, and we have to reboot the machine
from the BMC (IPMI command or BMC console).
This patch introduces attention functionality for BMC based machines.
It logs an eSEL event that contains the OPAL version, file info and backtrace,
and then calls cec_reboot, which takes care of rebooting the host.
Note:
- This patch uses ipmi_queue_msg() instead of ipmi_queue_msg_sync() as
we are having some issues with the sync path. This will be resolved once we
sort out [1].
- This patch calls cec_reboot to reboot the machine after logging the eSEL
event. It queues an IPMI message, and bt_poll() has to keep working until we
pass the reboot IPMI message to the BMC. Hence we have a while loop with
time_wait_ms(). Alternatively we could use xscom_trigger_xstop(), but it
would stop immediately and eSEL logging would fail.
[1] https://lists.ozlabs.org/pipermail/skiboot/2015-August/001824.html
Sample eSEL output after assert call:
------------------------------------
[hegdevasant@hegdevasant bin]$ strings fir01bmc.150820.120511.eSel.binary
BB821410
AT8335-GTA000000000000
AT8335-GTA000000000000UD
ATDESC
OPAL version : skiboot-5.1.1-44-geae3999-hegdevasant-dirty-bb31bfd
File info : core/init.c:463:0
CPU 0060 Backtrace:
S: 0000000031d83bc0 R: 000000003006086c .ipmi_terminate+0x110
S: 0000000031d83c60 R: 0000000030017f90 ._abort+0x80
S: 0000000031d83ce0 R: 0000000030017fd8 .assert_fail+0x34
S: 0000000031d83d60 R: 0000000030013dcc .load_and_boot_kernel+0x784
S: 0000000031d83e30 R: 000000003001437c .main_cpu_entry+0x57c
S: 0000000031d83f00 R: 0000000030002544 boot_entry+0x194
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We want to use MAX_PEL_SIZE in other code (like attention) as well.
Hence move this to ipmi.h.
Also rename MAX_PEL_SIZE to IPMI_MAX_PEL_SIZE to reflect that it is an
IPMI specific macro.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently we allocate an ipmi_msg for every eSEL event. But in the PANIC
path it's not advised to allocate memory. Hence pre-allocate an ipmi_msg
for the PANIC event.
Note that we continue to allocate memory for normal events. Also, with the
current implementation we can log only one eSEL event in the PANIC path.
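The pattern can be sketched as below. The message type and helper names are hypothetical and the message is reduced to a small buffer; only the split between a static PANIC message and heap allocation for normal events reflects the description:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct ex_msg {
	bool in_use;
	bool preallocated;
	unsigned char data[64];
};

/* Reserved up front so the PANIC path never has to allocate. */
static struct ex_msg ex_panic_msg = { .preallocated = true };

static struct ex_msg *ex_get_msg(bool panic)
{
	struct ex_msg *m;

	if (panic) {
		if (ex_panic_msg.in_use)
			return NULL;	/* only one PANIC event at a time */
		ex_panic_msg.in_use = true;
		return &ex_panic_msg;
	}
	m = calloc(1, sizeof(*m));	/* normal events still allocate */
	if (m)
		m->in_use = true;
	return m;
}

static void ex_put_msg(struct ex_msg *m)
{
	assert(m);
	if (m->preallocated)
		m->in_use = false;	/* static storage is never freed */
	else
		free(m);
}
```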
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Presently we allocate ipmi_msg separately for the SEL and eSEL events,
which is not required. Instead we can use the same memory for sending the
SEL event, as these two events are serialized.
This is useful when we pre-allocate memory for the PANIC path.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We log a SEL event with the eSEL record ID for every eSEL event. Presently
the SEL event is added to the tail of the IPMI queue. This works fine for
normal events, but it fails in the terminate immediate path, as the reboot
message will be sent before the SEL event.
This patch adds the message to the head of the IPMI queue, so that we can
log the SEL event before sending the reboot.
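The ordering effect is easy to see with a toy queue. This is a sketch with hypothetical types, not the skiboot list implementation:

```c
#include <assert.h>
#include <stddef.h>

struct ex_qmsg {
	int id;
	struct ex_qmsg *next;
};

struct ex_queue {
	struct ex_qmsg *head;
	struct ex_qmsg *tail;
};

/* Normal path: append behind everything already queued. */
static void ex_queue_tail(struct ex_queue *q, struct ex_qmsg *m)
{
	assert(q && m);
	m->next = NULL;
	if (q->tail)
		q->tail->next = m;
	else
		q->head = m;
	q->tail = m;
}

/* Urgent path: jump ahead of everything already queued. */
static void ex_queue_head(struct ex_queue *q, struct ex_qmsg *m)
{
	assert(q && m);
	m->next = q->head;
	q->head = m;
	if (!q->tail)
		q->tail = m;
}

static struct ex_qmsg *ex_dequeue(struct ex_queue *q)
{
	struct ex_qmsg *m = q->head;

	if (m) {
		q->head = m->next;
		if (!q->head)
			q->tail = NULL;
	}
	return m;
}
```

If a reboot message is already waiting, inserting the SEL event at the head means it is dequeued first.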
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
eSEL logging fails if the eSEL event size is a multiple of the IPMI
buffer size.
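The commit does not spell out the cause. One common way such a bug arises is computing the final chunk length as `total % chunk_max`, which is zero for exact multiples. A hedged sketch of a split that handles that case, with hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Walk a payload of 'total' bytes in pieces of at most 'chunk_max' bytes.
 * Returns the number of pieces and stores the length of the final one.
 */
static size_t ex_split(size_t total, size_t chunk_max, size_t *last_len)
{
	size_t offset = 0, chunks = 0, len = 0;

	assert(chunk_max > 0 && last_len);
	while (offset < total) {
		len = total - offset;
		if (len > chunk_max)
			len = chunk_max;
		offset += len;	/* the last piece is the one ending at 'total' */
		chunks++;
	}
	*last_len = len;
	return chunks;
}
```

For a 64-byte payload and 32-byte chunks this yields two chunks with a full-size final chunk, rather than a zero-length tail.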
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
See https://github.com/lucasdemarchi/codespel
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Our PEL logs don't contain a timestamp as we don't have a time source.
Hence create a SEL event with the eSEL record ID for every eSEL log. This
event will be used to get the PEL event time.
The new SEL event contains the eSEL record ID.
Sample output:
-------------
SEL Record ID : 0016
Record Type : 02
Timestamp : 08/09/2015 12:35:16
Generator ID : 0020
EvM Revision : 04
Sensor Type : System Event
Sensor Number : 61
Event Type : Generic Discrete
Event Direction : Assertion Event
Event Data (RAW) : 011400
Description : State Asserted
Sensor ID : System Event (0x61)
Entity ID : 1.0
Sensor Type (Discrete): System Event
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Presently we are passing the PEL log without adding an Extended SEL record.
Hence logging the eSEL event is failing.
This patch sends the Extended SEL structure before sending the actual PEL log,
so that the BMC understands it is an eSEL log and logs it appropriately.
eSEL format:
<IPMI SEL header> : <eSEL record> : <PEL data>
Note that we use sensor type "System Event (0x12)" for logging OPAL
events.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Cc: Alistair Popple <alistair@popple.id.au>
Cc: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Presently we are logging all the events to the service processor (FSP/BMC).
But on BMC machines we should only log events that require attention.
As per the PEL spec, we should log events with severity >= 0x22 and the
"service action flag" set to "on". But in our case, all OPAL originated logs
are marked as report externally.
So let's log all events that originate from OPAL (presently all logs,
as the payload is not logging any PEL events) and have severity >= 0x22.
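The resulting filter reduces to a single comparison. A sketch with a hypothetical helper name; only the 0x22 threshold comes from the text above:

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_SEVERITY_THRESHOLD	0x22	/* threshold quoted from the PEL spec */

/* Decide whether an OPAL originated event is forwarded to the BMC. */
static bool ex_should_log_to_bmc(uint8_t severity)
{
	return severity >= EX_SEVERITY_THRESHOLD;
}
```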
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Alistair Popple <alistair@popple.id.au>
Cc: Jeremy Kerr <jk@ozlabs.org>
[stewart@linux.vnet.ibm.com: s/SEL:/IPMI:/ in prlog]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently we assume the max PEL size is 64K. But in reality the log size is
much smaller than this. Hence set the max size to 2K.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Cc: Alistair Popple <alistair@popple.id.au>
Cc: Jeremy Kerr <jk@ozlabs.org>
Acked-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
In the error path of the OPAL function that receives the ipmi message,
the function returns the error code without deleting the message
containing the response. The kernel does not claim this message
later and continues with the subsequent ipmi commands. This leads to
a mismatch between each subsequent ipmi command and its response.
Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com>
Cc: Alistair Popple <alistair@popple.id.au>
Acked-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This was useful in development when we were diagnosing BMC issues. It's
just noisy now, so drop it.
We still print out the SEL received with the command and netfn as this
may be useful in diagnosing failed reboots and power offs in the future.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
The message was sometimes re-queued and always freed. Hilarity ensues.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The current watchdog code calls ipmi_queue_msg_sync from a timer, which
leads to calling opal_poll_events() recursively and consequently to an
abort().
This patch ensures we don't send synchronous messages from a timer.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Not all ipmi related functions check for a valid backend before
attempting to use it. Under normal circumstances this should not
happen as the platform should always register an ipmi backend. However
a system should be able to boot without a functional ipmi backend,
which is sometimes the case during system bringup.
This patch adds presence checks for an ipmi backend before attempting
to use it, thus allowing a system with a non-functional backend to
boot without ipmi.
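A minimal sketch of such a presence check, assuming a hypothetical backend structure and error code rather than the actual skiboot ones:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_OK		0
#define EX_NO_BACKEND	(-1)	/* stand-in error code for this sketch */

struct ex_ipmi_backend {
	int (*queue_msg)(int msg_id);
};

static const struct ex_ipmi_backend *ex_backend;

/* Demo backend that accepts every message. */
static int ex_demo_queue(int msg_id)
{
	(void)msg_id;
	return EX_OK;
}

static const struct ex_ipmi_backend ex_demo_backend = { ex_demo_queue };

static void ex_register_backend(const struct ex_ipmi_backend *b)
{
	assert(b == NULL || b->queue_msg != NULL);
	ex_backend = b;
}

static bool ex_ipmi_present(void)
{
	return ex_backend != NULL;
}

/* Every entry point checks for a backend before dereferencing it. */
static int ex_ipmi_queue_msg(int msg_id)
{
	if (!ex_ipmi_present())
		return EX_NO_BACKEND;
	return ex_backend->queue_msg(msg_id);
}
```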
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Experiments determined that we need to set the assertion bits as well as
the event data bytes.
To verify the sensor is being set on your BMC, use ipmitool to query the
SEL logs:
$ ipmitool sel list
System Firmware Progress #0x05 | Motherboard initialization | Asserted
System Firmware Progress #0x05 | Memory initialization | Asserted
System Firmware Progress #0x05 | System boot initiated | Asserted
Signed-off-by: Joel Stanley <joel@jms.id.au>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Now that opal.h includes opal-api.h, there are a bunch of files that
include both but don't need to.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This is an attempt to make it clearer how to use the sensor set ipmi
command, and to do the correct thing to our BMC.
The boot count was using the incorrect set mask and type. It now conforms
to what the AMI BMC expects.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Rather than doing the last disable in the interrupt path, do it on final
reset. This is a temporary workaround for incorrect pretimeout
behaviour.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
|
|
We're doing the IPMI sensor type mapping the wrong way around; we want
to map sensor types to sensor IDs.
Also, change the #defines to properly reflect that they're types.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
|
|
The boot count sensor is a discrete sensor that is set once the system
is up and running.
On successful boot, the BMC expects the sensor to be set to 2.
Signed-off-by: Joel Stanley <joel@jms.id.au>
|
|
This allows setting a given IPMI sensor to a given value. There are
helpers for setting the firmware boot progress that will be used for
updating the BMC with the host boot progress.
The sensor ids are parsed from the device tree. If the sensor cannot
be found we will silently continue.
Signed-off-by: Joel Stanley <joel@jms.id.au>
|
|
We need to pass the PNOR access status to the OCCs, as they may write to
the PNOR in the event of a checkstop.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
|
|
BMC based systems contain a PNOR to provide flash storage. The host
normally has exclusive access to the PNOR, however the BMC may use IPMI
to request access to perform functions such as updating the firmware.
Indicate to users of the flash that the device is busy by taking the
lock, and setting a per-flash busy flag, which causes flash operations
to return OPAL_BUSY.
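The busy-flag behaviour can be sketched as follows. The names and return values are local stand-ins, and the lock mentioned above is left out to keep the example short:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_OPAL_SUCCESS	0
#define EX_OPAL_BUSY	(-2)	/* stand-in value for this sketch */

struct ex_flash {
	bool busy;	/* set while the BMC owns the device */
};

/* BMC requests access: mark the flash busy unless it already is. */
static bool ex_flash_reserve(struct ex_flash *f)
{
	assert(f);
	if (f->busy)
		return false;
	f->busy = true;
	return true;
}

/* BMC is done: host operations may proceed again. */
static void ex_flash_release(struct ex_flash *f)
{
	assert(f);
	f->busy = false;
}

/* Host side operation: bail out early while the flash is reserved. */
static int ex_flash_read(struct ex_flash *f)
{
	assert(f);
	if (f->busy)
		return EX_OPAL_BUSY;
	return EX_OPAL_SUCCESS;	/* real code would perform the read here */
}
```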
Minor changes from Jeremy Kerr
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
|
|
This change adds a SEL event handler (triggered through the SMS_ATN
facility), to call prd_occ_reset().
For multi-chip OpenPower machines, we'll need to look up the proper
sensor IDs, once we have that information available in the device tree.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|