|
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
[ Upstream commit d8e13853e506e00713d15fa5e23457ba21a16829 ]
Fuzzing revealed a crash where pkcs7_get_signed_data was accessing beyond
the bounds of the object, despite valid data being passed in to
mbedtls_pkcs7_parse_der.
Further investigation revealed that pkcs7_get_content_info_type will
reset *p to start if the second call to mbedtls_asn1_get_tag fails,
but not if the first call fails.
mbedtls_asn1_get_tag does indeed advance *p even in some failure
cases, so a reset is required.
Reset *p to start if the first call to mbedtls_asn1_get_tag fails.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
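A minimal sketch of the pointer-reset pattern from the commit above, assuming a toy tag reader rather than the real mbedtls API. The names `toy_get_tag` and `toy_get_content_type` are hypothetical; the point is only that a helper which may advance `*p` before failing needs its caller to restore `*p` on every error path.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical tag reader: consumes the tag byte, then the length byte.
 * Like the behaviour described above, it can fail after *p has moved.
 */
static int toy_get_tag(const uint8_t **p, const uint8_t *end,
		       size_t *len, uint8_t tag)
{
	if (end - *p < 1 || **p != tag)
		return -1;
	(*p)++;				/* tag consumed */
	if (end - *p < 1)
		return -2;		/* failure with *p already advanced */
	*len = *(*p)++;
	if ((size_t)(end - *p) < *len)
		return -3;		/* failure with *p already advanced */
	return 0;
}

/* Caller restores *p to its starting value on every failure path. */
static int toy_get_content_type(const uint8_t **p, const uint8_t *end,
				size_t *len)
{
	const uint8_t *start = *p;
	int rc;

	rc = toy_get_tag(p, end, len, 0x30);
	if (rc) {
		*p = start;		/* reset after the first call too */
		return rc;
	}
	rc = toy_get_tag(p, end, len, 0x06);
	if (rc) {
		*p = start;
		return rc;
	}
	return 0;
}
```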
|
|
[ Upstream commit 8dd8b6e4abb4d61cdf98470f3fe5cb750def7a18 ]
We need to actually free the pkcs7 structure, not just pass it to
mbedtls_pkcs7_free().
Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
[ Upstream commit 0c265ace91b9d9ee08e09392a7d4a78a1301a3ab ]
If a declared size is smaller than uuid size, we end up allocating
with an allocation of a 'negative' number, which is a huge 64 bit
number.
This will probably then fail with an OPAL_NO_MEM, but it will be
better to catch it and return OPAL_PARAMETER instead.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
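The 'negative' allocation size in the commit above comes from unsigned wrap-around. The sketch below illustrates the check under assumed names (`TOY_UUID_SIZE`, `TOY_PARAMETER` and `toy_payload_size` are hypothetical stand-ins, not the skiboot code): the declared size is validated before the subtraction instead of relying on the allocator to reject a huge request.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_UUID_SIZE	16u	/* hypothetical: size of the leading UUID */
#define TOY_OK		0
#define TOY_PARAMETER	(-1)	/* hypothetical stand-in for a parameter error */

/*
 * Compute the payload size that follows the UUID. Without the check,
 * declared_size - TOY_UUID_SIZE wraps to a huge unsigned value when
 * declared_size is smaller than the UUID.
 */
static int toy_payload_size(size_t declared_size, size_t *payload_size)
{
	if (declared_size < TOY_UUID_SIZE)
		return TOY_PARAMETER;

	*payload_size = declared_size - TOY_UUID_SIZE;
	return TOY_OK;
}
```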
|
|
[ Upstream commit 15da2fd447c04a9f6ea53b8f8bdfaa7cbc6ea520 ]
Catch another OOB read picked up by the fuzzer.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
[ Upstream commit 56658ad4a0249cdf516e6bc21781cce901965998 ]
Currently, in `verify_signature`, the return code `rc` is initialized
to 0 (our success value). While looping through the ESLs in the given
secvar, the function breaks out of the loop if the remaining data in
the secvar is not enough to contain another ESL. This break did not
set a return code, so the successful return code can pass to the end
of the function if the first iteration meets this condition. In other
words, if a current secvar has a size that is less than the minimum
size for an ESL, then it will approve any update.
In response to this bug, this commit returns an error code if
the described condition is met. Additionally, a test case has been
added to ensure that this unlikely event is handled correctly.
Fixes: 87562bc5c1a6 ("secvar/backend: add edk2 derived key updates processing")
Signed-off-by: Nick Child <nick.child@ibm.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
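The failure mode in the commit above can be sketched with a toy loop. This is an illustration under assumed names (`toy_verify`, `TOY_MIN_ENTRY_SIZE`), not the real `verify_signature`: the return code starts as a failure, and the early break sets its own error, so leaving the loop can never report success by accident.

```c
#include <stddef.h>

#define TOY_MIN_ENTRY_SIZE 28u	/* hypothetical minimum entry size */

/*
 * Walk fixed-size toy entries and report success only when an entry
 * whose first byte equals 'wanted' is found.
 */
static int toy_verify(const unsigned char *data, size_t size,
		      unsigned char wanted)
{
	int rc = -1;		/* start from failure, not success */
	size_t off = 0;

	while (off < size) {
		if (size - off < TOY_MIN_ENTRY_SIZE) {
			rc = -2;	/* too little data left: explicit error */
			break;
		}
		if (data[off] == wanted) {
			rc = 0;		/* the only path that reports success */
			break;
		}
		off += TOY_MIN_ENTRY_SIZE;
	}
	return rc;
}
```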
|
|
[ Upstream commit 355176a9405c83320748f804e8655e6a8ee2324f ]
Currently, in `validate_esl_list`, the return code is initialized to zero
(our success value). While looping through the ESLs in the submitted ESL
chain, the loop will break if there is not enough data to meet minimum ESL
requirements. This condition was not setting a return code, meaning that the
successful return code can pass to the end of the function if there is extra
data at the end of the ESL. As a consequence, any properly signed update can
successfully commit any data (as long as it is less than the min size of an
ESL) to the secvars.
This commit will return an error if the described condition is met. This
means all data in the appended ESL of an auth file must be accounted for. No
extra bytes can be added to the end since, on success, this data will become
the updated secvar.
Additionally, a test case has been added to ensure that this commit
addresses the issue correctly.
Fixes: 87562bc5c1a6 ("secvar/backend: add edk2 derived key updates processing")
Signed-off-by: Nick Child <nick.child@ibm.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
[ Upstream commit 0917fd18ac30f8935563d26629a02f210a485687 ]
Currently, the loop in validate_esl_list is not iterating through the ESL
entries. As a consequence, all entries after the first are not being
validated and can contain any data. In order to iterate, the pointer to the
ESL buffer must be incremented by the number of bytes already read.
This commit also adds a new test case and file. The file is
`multipletrimmedKEK.h`; its array is very similar to the one in
`trimmedKEK.h`, except that only the second ESL in the chain is invalid,
which exercises the condition that this commit fixes.
Fixes: 87562bc5c1a6 ("secvar/backend: add edk2 derived key updates processing")
Signed-off-by: Nick Child <nick.child@ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
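A hedged sketch of walking a chain of length-prefixed entries, covering both this commit (advance the buffer pointer on each iteration) and the previous one (reject trailing bytes). The header layout `struct toy_entry_hdr` and the function `toy_validate_chain` are hypothetical and much simpler than a real ESL; header fields are read in host byte order for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical entry header: total entry size and payload size. */
struct toy_entry_hdr {
	uint32_t total_size;
	uint32_t payload_size;
};

/*
 * Validate every entry in the chain. Returns the number of entries,
 * or -1 if any entry is malformed or bytes are left over at the end.
 */
static int toy_validate_chain(const uint8_t *buf, size_t len)
{
	int count = 0;

	while (len > 0) {
		struct toy_entry_hdr hdr;

		if (len < sizeof(hdr))
			return -1;	/* trailing bytes: not a full entry */
		memcpy(&hdr, buf, sizeof(hdr));

		if (hdr.total_size < sizeof(hdr) || hdr.total_size > len)
			return -1;
		if (hdr.payload_size != hdr.total_size - sizeof(hdr))
			return -1;

		/* Advance past this entry so the next one is checked. */
		buf += hdr.total_size;
		len -= hdr.total_size;
		count++;
	}
	return count;
}
```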
|
|
[ Upstream commit 8a31163a0271f11b4597bca4e803f559e38e3d24 ]
Currently, `get_esl_cert` receives a data buffer containing an ESL and its
length. It returns a data buffer holding the certificate that is contained
inside the ESL. The ESL has header info that contains the certificate's
`size` and the size of the header (`sig_data_offset`). We use this
information to copy `size` bytes starting `sig_data_offset` bytes after the
start of the given ESL buffer. Currently we check that the length of the ESL
buffer is at least `sig_data_offset` bytes, but we do not check that it
also has enough bytes to contain `size` bytes of the certificate. This
becomes problematic if some data at the end of the ESL gets lost. Since the
ESL claims it has more data than it actually does, this leads to a buffer
over-read. Even worse, this buffer over-read can go unnoticed: the last
256 bytes of the ESL are usually the x509 2048 bit signature, so the extra
garbage bytes that are copied will appear to be a valid RSA signature.
To resolve this, this commit ensures that the ESL buffer length is large
enough to hold the data that it claims it contains.
Lastly, a new test case is added to test the described condition. It
includes a new test file `trimmedKEK.h`, which contains an array holding
a valid KEK auth file minus 5 bytes, making it invalid.
Fixes: 87562bc5c1a6 ("secvar/backend: add edk2 derived key updates processing")
Signed-off-by: Nick Child <nick.child@ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
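The two-part bounds check described above might look like the following sketch. `toy_get_cert` and its parameters are hypothetical; the comparison is written as `cert_size > esl_len - data_offset` so that a hostile `cert_size` cannot overflow an addition.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return a pointer to the certificate inside the ESL buffer, or NULL
 * if the buffer is too short to hold what its header claims.
 */
static const uint8_t *toy_get_cert(const uint8_t *esl, size_t esl_len,
				   size_t data_offset, size_t cert_size)
{
	/* The header itself must fit in the buffer. */
	if (data_offset > esl_len)
		return NULL;

	/* The claimed certificate bytes must fit after the header. */
	if (cert_size > esl_len - data_offset)
		return NULL;

	return esl + data_offset;
}
```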
|
|
[ Upstream commit 5be38b672c1410e2f10acd3ad2eecfdc81d5daf7 ]
unpack_timestamp() calls le32_to_cpu() for endian conversion of
uint16_t "year" value. This patch fixes the code to use le16_to_cpu().
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
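To see why the conversion width matters, the sketch below simulates the byte swaps a big-endian host would perform. The helpers `toy_swab16` and `toy_swab32` are hypothetical stand-ins for what `le16_to_cpu()` and `le32_to_cpu()` do on such a host. A little-endian year of 2021 (bytes `E5 07`) is loaded by a big-endian CPU as 0xE507; a 16-bit swap recovers 2021, while a 32-bit swap moves the value into the upper half, so truncating back to `uint16_t` yields 0.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. */
static uint16_t toy_swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Swap the four bytes of a 32-bit value. */
static uint32_t toy_swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8) |
	       ((v & 0xff000000u) >> 24);
}
```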
|
|
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Increase timeout introduced in commit 6bf21350da32 ("uart: Drop
console write data if BMC becomes unresponsive") when running under
SIMICS.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Move to qemu version powernv-6.0. Also add required packages.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
This isn't currently used in skiboot but may be used by external
users of skiboot's secvar code.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
We use these types but haven't included the header: every file that
includes edk2.h has already included it.
This might not be true for other users of edk2.h and skiboot's secvar
code, so include it explicitly.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
If allocating the secure variable name of a secure variable struct,
`secvar->key`, fails then the secvar struct should be freed before
returning NULL. Previously, if this allocation fails, then only the
`secvar->key` is freed (which is likely a typo) leaving the allocated
`secvar` struct allocated and returning NULL. This memory leak can be
seen with the static analysis tool `cppcheck`. After running valgrind
tests, this commit ensures that memory is properly freed if an error
occurs when allocating the `key` field of the `secvar` struct.
Signed-off-by: Nick Child <nick.child@ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
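The cleanup rule from the commit above, sketched with hypothetical types (`struct toy_secvar`, `toy_alloc_secvar`, `toy_free_secvar`) and the standard allocator rather than skiboot's: when the inner allocation fails, the outer struct is what must be freed before returning NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable struct with a separately allocated key. */
struct toy_secvar {
	char *key;
	size_t key_len;
};

static struct toy_secvar *toy_alloc_secvar(const char *key, size_t key_len)
{
	struct toy_secvar *var = calloc(1, sizeof(*var));

	if (!var)
		return NULL;

	var->key = malloc(key_len ? key_len : 1);
	if (!var->key) {
		/* Free the struct itself; var->key is NULL here. */
		free(var);
		return NULL;
	}

	memcpy(var->key, key, key_len);
	var->key_len = key_len;
	return var;
}

static void toy_free_secvar(struct toy_secvar *var)
{
	if (!var)
		return;
	free(var->key);
	free(var);
}
```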
|
|
Functions `get_esl_cert`, `validate_esl_list` and
`get_esl_signature_list_size` all contain the same debug print
statement. This statement prints the size of the ESL.
`validate_esl_list` calls `get_esl_cert` so the same debug information
prints twice when validating the newly submitted ESL. Additionally,
the same debug prints twice when validating the current ESL since
`get_esl_cert` and `get_esl_signature_list_size` are both called by
the function `verify_signature`. Since `get_esl_cert` is the common
factor, this commit removes the other two print statements (and adds
some information to an error message to maintain clarity, in case
`validate_esl_list` fails before calling `validate_esl_cert`). After
double checking that these functions are not being called anywhere
else, the only real change is to reduce the number of redundant print
statements for the secvar update process.
Signed-off-by: Nick Child <nick.child@ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
During opal boot, in imc_init(), 24x7/IMC microcode state is checked
and if it is not in running or pause state, currently all the
imc devices are removed from device tree. Instead, remove only
the nest imc devices. Core/Thread/Trace imc devices are not related
to 24x7 microcode. Patch adds a function to remove specific imc
device type and the same is used, when pause_microcode() fails, to
remove nest imc device types from the device tree.
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
imc_init() checks for the 24x7 microcode state at boot to
check whether the microcode is in proper state (running or paused).
But in a larger system, loading of 24x7 microcode by OCC gets delayed.
Because of this, imc_init() removes imc devices from the device tree.
Moving imc_init() function towards end of the main_cpu_entry()
works around this.
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
The build fails with the error below on a recent distro (fedora-rawhide).
sample failure:
---------------
[ HOSTCC ] hw/ipmi/test/run-fru.c
In file included from hw/ipmi/test/run-fru.c:10:
hw/ipmi/test/../ipmi-fru.c: In function 'fru_fill_product_info':
hw/ipmi/test/../ipmi-fru.c:80:17: error: this 'if' clause does not guard... [-Werror=misleading-indentation]
80 | if (rc < 1) return OPAL_PARAMETER; rc; })
| ^~
hw/ipmi/test/../ipmi-fru.c:102:18: note: in expansion of macro 'FRU_INSERT_STRING'
102 | index += FRU_INSERT_STRING(&buf[index], info->manufacturer);
| ^~~~~~~~~~~~~~~~~
hw/ipmi/test/../ipmi-fru.c:80:52: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'
80 | if (rc < 1) return OPAL_PARAMETER; rc; })
| ^~
hw/ipmi/test/../ipmi-fru.c:102:18: note: in expansion of macro 'FRU_INSERT_STRING'
102 | index += FRU_INSERT_STRING(&buf[index], info->manufacturer);
| ^~~~~~~~~~~~~~~~~
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Dan Horák <dan@danny.cz>
|
|
openssl is needed by libstb.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Dan Horák <dan@danny.cz>
|
|
Ideally we should move to fedora34. But looks like docker repository
doesn't have fedora34-ppc64le image. Hence moving to fedora33 for now.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Dan Horák <dan@danny.cz>
|
|
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Dan Horák <dan@danny.cz>
|
|
Convert scripts to Python3, as Python2 reached EOL in 2020.
Fixes: https://github.com/open-power/skiboot/issues/225
Signed-off-by: Dan Horák <dan@danny.cz>
[Fixed directory walking logic in generate-fwts-olog - Vasant]
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
The structures we import from EDK2 are expected to be packed.
The code we imported does this with a #pragma pack, but it doesn't
restore the original non-packed state at the end of the header.
Rather than changing that, just explicitly pack every structure.
The resulting skiboot.elf has the same disassembly (objdump -dr)
and readelf -a output, but I haven't been able to test this on
a real machine.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
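A small illustration of per-structure packing, assuming GCC or Clang attribute syntax. The struct names are hypothetical and unrelated to the real EDK2 definitions; the attribute applies only to the structure it is attached to, so nothing declared later in the translation unit is affected.

```c
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler may insert padding after 'type'. */
struct toy_unpacked {
	uint8_t type;
	uint32_t size;
};

/* Explicitly packed: no padding, and no lingering pragma state. */
struct toy_packed {
	uint8_t type;
	uint32_t size;
} __attribute__((packed));

_Static_assert(sizeof(struct toy_packed) == 5,
	       "packed struct should have no padding");
```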
|
|
Commit f397cc30bdf8 ("phb4: Only escalate freezes on MMIO load where
necessary") introduced a change to restrict escalation to the chips that
actually need it. However, it missed one case which still causes the
escalation on every chip. As a result, EEH recovery performs a full
PHB reset on some chips where it is not necessary. This patch fixes that.
Also, add a check for p9 chip in phb4_escalation_required() function.
Cc: skiboot-stable@lists.ozlabs.org
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Commit e73cf72d1f97 ("phb4: make endian-clean") accidentally missed
printing the correct value for the PCI device secondary status register.
Before this patch:
[ 1654.399387394,3] PHB#0033[3:3]: devCmdStatus = 00100107
[ 1654.399389575,3] PHB#0033[3:3]: devSecStatus = 00100107
after this patch:
[ 1620.415289504,3] PHB#0033[3:3]: devCmdStatus = 00100107
[ 1620.415291622,3] PHB#0033[3:3]: devSecStatus = 00002000
Fixes: e73cf72d ("phb4: make endian-clean")
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
This patch implements a circumvention for HW557787. It disables the
TCE cache line buffer as, under heavy loads, there's a possibility of
an entry being re-allocated incorrectly.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Follow the inclusive terminology from the "Conscious Language in
your Open Source Projects" guidelines [*] and replace the word
"whitelist" appropriately.
[*] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Follow the inclusive terminology from the "Conscious Language in
your Open Source Projects" guidelines [*] and replace the word
"whitelist" appropriately.
[*] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Follow the inclusive terminology from the "Conscious Language in
your Open Source Projects" guidelines [*] and replace the word
"whitelist" appropriately.
[*] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
The BT IRQ may preempt the BT timer if the BMC responds to the host just as
a BT message times out. When the BT IRQ preempts the BT timer, inflight_bt_msg
is not properly protected by bt.lock.
We then see the following log:
[29006114.163785853,3] BT: seq 0x81 netfn 0x0a cmd 0x23: Timeout sending message
[29006114.288029290,3] BT: seq 0x81 netfn 0x0b cmd 0x23: Timeout sending message
[29006114.288917798,3] IPMI: Incorrect netfn 0x0b in response
This may cause a 'CPU Hardlock UP', 'memory refree', 'kernel crash' or other failures.
Signed-off-by: lixg <867314078@qq.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
This just keeps the requested delta and uses it to adjust subsequent
rtc_read calls.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Currently we do not account for cancelled timer requests. So in some
corner cases we may schedule a new timer request with
new-timer-value > inflight-timer-value.
Explicitly check the new_target value against the inflight timer value.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
We schedule a timer and wait for the `timer expiry` interrupt from SBE.
If we get a new timer request which is less than the inflight timer
expiry value, we can update the timer (essentially sending a new timer
chip-op; SBE takes care of stopping the inflight timer and scheduling the
new one).
SBE runs at a much slower speed than the host CPU. If we do continuous timer
updates like below, then SBE will be busy handling PSU side timer
messages and will not get time to handle FIFO side requests.
send timer chip-op -> Got ACK -> send timer chip-op
Hence this patch limits the number of continuous timer updates, and we
restart sending timer requests as soon as we get the timer expiry interrupt.
Rate limit value (2) is suggested by SBE team.
With this patch:
If our timer requests are: 2ms, 1500us, 1000us and 800us
(and requests arrive after each message is sent)
We will schedule the timer for 2ms and then update the timer for 1500us and 1000us
(These updates happen after getting the ACK interrupt from SBE)
We will not send the 800us request.
At 1000us we get `timer expiry` and we are good to send the next timer request
(At this stage both the 1000us and 800us timeouts have happened. We will
schedule the next timer request with timeout value 500us (1500-1000)).
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
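The rate limit in the commit above can be approximated by a small state machine. This is a simplified sketch under assumptions: all names (`struct toy_timer`, `toy_request`, `toy_expiry`, `TOY_MAX_UPDATES`) are hypothetical, targets are plain microsecond values, and ACK handling is ignored. It reproduces only the counting behaviour from the worked example: one initial schedule, at most two updates, then nothing until expiry.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_UPDATES 2	/* rate limit value from the commit message */

struct toy_timer {
	bool inflight;		/* a timer is currently scheduled */
	uint64_t target_us;	/* expiry target of the inflight timer */
	unsigned int updates;	/* updates sent since the last expiry */
};

/* Returns true if a timer chip-op would be sent for this request. */
static bool toy_request(struct toy_timer *t, uint64_t target_us)
{
	if (!t->inflight) {
		t->inflight = true;
		t->target_us = target_us;
		t->updates = 0;
		return true;
	}
	if (target_us >= t->target_us)
		return false;	/* inflight timer already fires early enough */
	if (t->updates >= TOY_MAX_UPDATES)
		return false;	/* rate limited until the next expiry */

	t->target_us = target_us;
	t->updates++;
	return true;
}

/* Timer expiry interrupt: updates are allowed again. */
static void toy_expiry(struct toy_timer *t)
{
	t->inflight = false;
	t->updates = 0;
}
```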
|
|
Timer flow:
- OPAL sends timer chip-op to SBE and waits for ACK
- Until we get ACK interrupt from SBE we will not schedule any new timer
- Once we get ACK either we wait for timer expiry -OR- schedule
new one if new-timer-request < inflight-timer-timeout value.
- If we get new timer request while processing current one
p9_sbe_update_timer_expiry code sets `has_new_target` and we
schedule it in ACK path (p9_sbe_timer_resp()).
p9_sbe_timer_resp() is a callback handler and it is called without a lock.
It does not check whether the timer message (timer_ctrl_msg) is busy or not.
So in theory we may hit the scenario below and corrupt msg_list.
CPU 1 -> Timer ACK (callback handler) -- it is not holding any lock
CPU 2 -> Grabbed sbe_timer_lock -> scheduled timer --> done
CPU 3 -> p9_sbe_update_timer_expiry() -> sees timer is busy -> sets has_new_target -> done
CPU 1 -> gets chance to grab sbe_timer_lock -> saw has_new_target -> Called p9_sbe_timer_schedule() --> List corrupted !
This patch adds timer message busy check in p9_sbe_timer_resp().
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Use the method provided by Frederic:
Add the "ibm, maximum link speed" attribute to the PHB device tree at index 0.
The phb4.c code will look for it and set up the link correctly.
Signed-off-by: LuluTHSu <Lulu_Su@wistron.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
This reverts commit 5262cdd1b99f77bca5951fc8132f9795ef0c2b87.
On link reset/retrain, this method cannot maintain the max-link-speed limit, so remove it.
Signed-off-by: LuluTHSu <Lulu_Su@wistron.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Commit 80fd2e963bd4 ("xscom: Don't log xscom errors caused by OPAL
calls") ensured that xscom errors caused due to XSCOM read/write OPAL
calls aren't logged in the error-log since the caller of the OPAL call
is expected to handle it.
However we are continuing to print the prerror() in the OPAL log
regarding the same. This patch reduces the severity of the log from
PR_ERROR to PR_INFO for the xscom read and write made via OPAL calls.
Tested-by: Pavaman Subramaniyam <pavsubra@in.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Print info only for xscom read/writes made via opal calls
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
XIVE VPs are structures describing the vCPUs of guests. When starting
a guest, these are allocated and enabled and some checks are done on
the location of the associated ENDs, which describe the event
queues. If the block of the VP and the block of the ENDs do not match,
the XIVE driver asserts.
Unfortunately, there is no way to check that a VP identifier is part
of a VP block that was previously allocated and it is relatively easy
to crash the host with a bogus VP id. That can be done with a QEMU
hack on a machine using vsmt.
Simply remove the assert, the OS should gracefully handle the error.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reported-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Fix log message and convert perror to prlog.
Also reduce message severity, as it is an informational message, not an error.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Looks like HBRT sets top bit in pcbaddress before making OCMB SCOM request.
We have to clear that bit so that we can find proper address range
for SCOM operation.
Sample failure:
[ 2578.156011925,3] OCMB: no matching address range!
[ 2578.156044481,3] scom_read: to 80000028 off: 8006430d4008c000 rc = -26
Also move HRMOR_BIT macro to common include file (hdata/spira.h -> skiboot.h).
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
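Clearing the top bit before the address-range lookup amounts to a single mask. In this sketch `TOY_TOP_BIT` and `toy_clear_top_bit` are hypothetical names standing in for the macro and code the commit touches; the input value is the offset from the sample failure above.

```c
#include <stdint.h>

/* Hypothetical stand-in for the most significant bit of a 64-bit address. */
#define TOY_TOP_BIT (UINT64_C(1) << 63)

/* Strip the top bit so the address can be matched against known ranges. */
static uint64_t toy_clear_top_bit(uint64_t addr)
{
	return addr & ~TOY_TOP_BIT;
}
```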
|
|
If we have duplicate xscom nodes then we will fail to attach the xscom
node to the device tree and we will fail eventually. Better to call assert()
and fail here.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Dan Horák <dan@danny.cz>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Dan Horák <dan@danny.cz>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Commit '6b08928d - build/lds: place debug sections according to
defaults' introduced a DEBUG_SECTIONS macro that is effectively
resetting the location pointer back to zero, making the next section
(builtin_kernel) collide with the earlier sections.
Fix by moving these sections to the very end.
Error message:
$ make KERNEL=zImage.epapr
[CC] asm/asm-offsets.s
[GN] include/asm-offsets.h
<...>
[LD] skiboot.tmp.elf
ld: section .builtin_kernel LMA [0000000000000000,0000000000285d87]
overlaps section .head LMA [0000000000000000,0000000000003897]
ld: section .naca LMA [0000000000004000,000000000000505f] overlaps
section .builtin_kernel LMA [0000000000000000,0000000000285d87]
make: *** [/skiboot/Makefile.main:333: skiboot.tmp.elf] Error 1
Fixes: 6b08928d - build/lds: place debug sections according to defaults
Signed-off-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Sample output from Cédric:
-------------------------
[ 88.294111649,7] cpu_idle_p9 called on cpu 0x063c with pm disabled
[ 88.289365222,7] cpu_idle_p9 called on cpu 0x025f with pm disabled
[ 88.289900684,7] cpu_idle_p9 called on cpu 0x045f with pm disabled
[ 88.302621295,7] CHIPTOD: Base TFMR=0x2512000000000000
[ 88.289899701,7] cpu_idle_p9 called on cpu 0x0456 with pm disabled
LOCK ERROR: Deadlock detected @0x30402740 (state: 0x0000000400000001)
[ 88.332264757,3] ***********************************************
[ 88.332300051,3] < assert failed at core/lock.c:32 >
[ 88.332328282,3] .
[ 88.332347335,3] .
[ 88.332364894,3] .
[ 88.332377963,3] OO__)
[ 88.332395458,3] <"__/
[ 88.332412628,3] ^ ^
[ 88.332450246,3] Fatal TRAP at 00000000300286a0 .lock_error+0x64 MSR 9000000000021002
[ 88.332501812,3] CFAR : 00000000300414f4 MSR : 9000000000021002
[ 88.332536539,3] SRR0 : 00000000300286a0 SRR1 : 9000000000021002
[ 88.332574644,3] HSRR0: 0000000030020024 HSRR1: 9000000000001000
[ 88.332610635,3] DSISR: 00000000 DAR : 0000000000000000
[ 88.332650628,3] LR : 0000000030028690 CTR : 00000000300f9fa0
[ 88.332684451,3] CR : 20002000 XER : 00000000
[ 88.332712767,3] GPR00: 0000000030028690 GPR16: 0000000032c98000
[ 88.332748046,3] GPR01: 0000000032c9b0a0 GPR17: 0000000000000000
[ 88.332784060,3] GPR02: 0000000030169d00 GPR18: 0000000000000000
[ 88.332822091,3] GPR03: 0000000032c9b310 GPR19: 0000000000000000
[ 88.332861357,3] GPR04: 0000000030041480 GPR20: 0000000000000000
[ 88.332897229,3] GPR05: 0000000000000000 GPR21: 0000000000000000
[ 88.332937051,3] GPR06: 0000000000000010 GPR22: 0000000000000000
[ 88.332968463,3] GPR07: 0000000000000000 GPR23: 0000000000000000
[ 88.333007333,3] GPR08: 000000000002cbb5 GPR24: 0000000000000000
[ 88.333041971,3] GPR09: 0000000000000000 GPR25: 0000000000000000
[ 88.333081073,3] GPR10: 0000000000000000 GPR26: 0000000000000003
[ 88.333114301,3] GPR11: 3839616263646566 GPR27: 0000000000000211
[ 88.333156040,3] GPR12: 0000000020002000 GPR28: 000000003042a134
[ 88.333189222,3] GPR13: 0000000000000000 GPR29: 0000000030402740
[ 88.333225638,3] GPR14: 0000000000000000 GPR30: 0000000000000001
[ 88.333259730,3] GPR15: 0000000000000000 GPR31: 0000000000000000
CPU 0211 Backtrace:
S: 0000000032c9b3b0 R: 0000000030028690 .lock_error+0x54
S: 0000000032c9b440 R: 0000000030028828 .add_lock_request+0xd0
S: 0000000032c9b4f0 R: 0000000030028a9c .lock_caller+0x8c
S: 0000000032c9b5a0 R: 0000000030021b30 .__mcount_stack_check+0x70
S: 0000000032c9b650 R: 00000000300fabb0 .list_check_node+0x1c
S: 0000000032c9b6f0 R: 00000000300fac98 .list_check+0x38
S: 0000000032c9b790 R: 00000000300289bc .try_lock_caller+0xac
S: 0000000032c9b830 R: 0000000030028ad8 .lock_caller+0xc8
S: 0000000032c9b8e0 R: 0000000030028d74 .lock_recursive_caller+0x54
S: 0000000032c9b980 R: 0000000030020cb8 .console_write+0x48
S: 0000000032c9ba30 R: 00000000300445a8 .vprlog+0xc8
S: 0000000032c9bc20 R: 0000000030044630 ._prlog+0x50
S: 0000000032c9bcb0 R: 0000000030029204 .cpu_idle_p9+0x74
S: 0000000032c9bd40 R: 0000000030029628 .cpu_idle_pm+0x4c
S: 0000000032c9bde0 R: 0000000030023fe0 .__secondary_cpu_entry+0xa0
S: 0000000032c9be70 R: 0000000030024034 .secondary_cpu_entry+0x40
S: 0000000032c9bf00 R: 0000000030003290 secondary_wait+0x8c
CPU 0x4:
opal_run_pollers ->
check_stacks -> takes stack_check_lock lock
prlog ->
console_write -> waits for con_lock
CPU 0x211
cpu_idle_p9 ->
prlog ->
console_write -> Takes con_lock lock
list_check_node -> tries to take stack_check_lock and hits deadlock.
I think we don't need to hold `stack_check_lock` while printing
backtraces. Instead it makes sense to hold backtrace lock (bt_lock)
and print output.
Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
platforms/astbmc/witherspoon.c:557:28: warning: Using plain integer as NULL pointer
Signed-off-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|