diff options
author | Kevin Buettner <kevinb@redhat.com> | 2025-08-23 20:54:30 -0700 |
---|---|---|
committer | Kevin Buettner <kevinb@redhat.com> | 2025-08-25 15:39:18 -0700 |
commit | bf0f85df125412b06b209db41e950715b2af406e (patch) | |
tree | 80328ff7656d20cfbe15709171b4f57722713a4b | |
parent | fa8b445d00c518de27064db3dff22b6c476058ea (diff) | |
download | binutils-bf0f85df125412b06b209db41e950715b2af406e.zip binutils-bf0f85df125412b06b209db41e950715b2af406e.tar.gz binutils-bf0f85df125412b06b209db41e950715b2af406e.tar.bz2 |
Fix tekhex format related gdb.base/dump.exp failures
On s390x, a big-endian machine, I'm seeing these test failures:
FAIL: gdb.base/dump.exp: array as memory, tekhex; file restored ok
FAIL: gdb.base/dump.exp: array as memory, tekhex; value restored ok
FAIL: gdb.base/dump.exp: array as value, tekhex; file restored ok
FAIL: gdb.base/dump.exp: array as value, tekhex; value restored ok
FAIL: gdb.base/dump.exp: array copy, tekhex; file restored ok
FAIL: gdb.base/dump.exp: array copy, tekhex; value restored ok
FAIL: gdb.base/dump.exp: array partial, tekhex; file restored ok
FAIL: gdb.base/dump.exp: array partial, tekhex; value restored ok
FAIL: gdb.base/dump.exp: dump array as memory, tekhex
FAIL: gdb.base/dump.exp: dump array as value, tekhex
FAIL: gdb.base/dump.exp: dump struct as memory, tekhex
FAIL: gdb.base/dump.exp: dump struct as value, tekhex
FAIL: gdb.base/dump.exp: reload array as memory, tekhex; value restored ok
FAIL: gdb.base/dump.exp: reload array as value, tekhex; value restored ok
FAIL: gdb.base/dump.exp: reload struct as memory, tekhex; value restored ok
FAIL: gdb.base/dump.exp: reload struct as value, tekhex; value restored ok
FAIL: gdb.base/dump.exp: struct as memory, tekhex; file restored ok
FAIL: gdb.base/dump.exp: struct as memory, tekhex; value restored ok
FAIL: gdb.base/dump.exp: struct as value, tekhex; file restored ok
FAIL: gdb.base/dump.exp: struct as value, tekhex; value restored ok
FAIL: gdb.base/dump.exp: struct copy, tekhex; file restored ok
FAIL: gdb.base/dump.exp: struct copy, tekhex; value restored ok
It turns out that there's a subtle bug in move_section_contents in
bfd/tekhex.c. The bug is that when attempting to write a buffer that
starts with a zero byte, the function will return false, an error
condition, without writing anything. But it also doesn't set
bfd_error, so GDB ends up displaying whatever the last unrelated error
was, e.g.:
warning: writing dump file '.../intstr1.tekhex' (No such file or directory)
When I investigated this, the bfd error was set during failure to
open a separate debug file for the test case, which is totally
unrelated to this problem.
The reason this fails on big endian machines is that the test case
writes out structs and arrays of int initialized to small values. On
little endian machines, the small integer is the first byte, so the
error doesn't occur. On big endian machines, a zero byte occurs
first, triggering the error.
On the GDB side of things, I've made a one line change to the test
case to cause the error to also happen on little endian machines. I
simply shift value of the first field in the struct left by 16 bits.
That leaves at least one zero byte on both sides of the non-zero part
of the int. I shifted it by 16 because, for a moment, there was a
question in my mind about what would happen with a second zero byte,
but it turns out that it's not a problem.
On the bfd side of things, take a look at move_section_contents() and
find_chunk() in tekhex.c. The scenario is this: we enter
move_section_contents with locationp pointing at a character buffer
whose first byte is zero. The 'get' parameter is false, i.e. we're
writing, not reading. The other critical fact is that the
abfd->tdata.tekhex_data->data is NULL (0).
I'm going to go through the execution path pretty much line by line
with commentary below the line(s) just executed.
char *location = (char *) locationp;
bfd_vma prev_number = 1; /* Nothing can have this as a high bit. */
I can't say that the comment provides the best explanation about
what's happening, but the gist is this: later on, chunk_number will
have it's low bits masked away, therefore no matter what it is, it
can't possibly be equal to prev_number when it's set to 1.
struct data_struct *d = NULL;
BFD_ASSERT (offset == 0);
for (addr = section->vma; count != 0; count--, addr++)
{
Set d to NULL and enter the loop.
/* Get high bits of address. */
bfd_vma chunk_number = addr & ~(bfd_vma) CHUNK_MASK;
bfd_vma low_bits = addr & CHUNK_MASK;
Use CHUNK_MASK, which is 0x1fff, to obtain the chunk number, i.e.
whatever's left after masking off the low 13 bits of addr, and
low_bits, which are the low 13 bits of addr. chunk_number matters for
understanding this bug, low_bits does not. Remember that no matter
what addr is, once you mask off the low 13 bits, it can't be equal to 1.
bool must_write = !get && *location != 0;
!get is true, *location != 0 is false, therefore the conjunction is
false, and furthermore must_write is false. I.e. even though we are
writing, we don't transfer zero bytes to the chunk - this is why
must_write is false. (The reason this works is that a chunk, once
allocated, is zero'd as part of the allocation using bfd_zalloc.
Therefore we can skip transferring zero bytes and, if enough of them
are skipped one after another, chunk allocation simply doesn't happen.
That's a good thing.)
if (chunk_number != prev_number || (!d && must_write))
For the reason provided above, chunk_number != prev_number is true.
The other part of the disjunction doesn't matter since the first part
is true. This means that the if-block is entered.
/* Different chunk, so move pointer. */
d = find_chunk (abfd, chunk_number, must_write);
find_chunk is entered with must_write set to false. Now, remember
where we left off here, because we're going to switch to find_chunk.
static struct data_struct *
find_chunk (bfd *abfd, bfd_vma vma, bool create)
{
(Above 3 lines indented to distinguish code from commentary.)
When we enter find_chunk, create is false because must_write was false.
struct data_struct *d = abfd->tdata.tekhex_data->data;
d is set to NULL since abfd->tdata.texhex_data->data is NULL (one of
the conditions for the scenario).
vma &= ~CHUNK_MASK;
while (d && (d->vma) != vma)
d = d->next;
d is NULL, so the while loop doesn't execute.
if (!d && create)
...
d is NULL so !d is true, but create is false, so the condition
evaluates to false, meaning that the if-block is skipped.
return d;
find_chunk returns NULL, since d is NULL.
Back in move_section_contents:
if (!d)
return false;
d is NULL (because that's what find_chunk returned), so
move_section_contents returns false at this point.
Note that find_section_contents has allocated no memory, nor even
tried to transfer any bytes beyond the first (zero) byte. This
is a bug.
The key to understanding this bug is to observe that find_chunk can
return NULL to indicate that no chunk was found. This is especially
important for the read (get=true) case. But it can also be NULL
to indicate a memory allocation error. I toyed around with the
idea of using a different value to distinguish these cases, i.e.
something like (struct data_struct *) -1, but although bfd contains
plenty of code where -1 is used to indicate various interesting
conditions for scalars, there's no prior art where this is done
for a pointer. Therefore the idea was discarded in favor of
modifying this statement:
if (!d)
return false;
to:
if (!d && must_write)
return false;
This works because, in find_chunk, the only way to return a NULL
memory allocation error is for must_write / create to be true. When
it is true, if bfd_zalloc successfully allocates a chunk, then that
(non-NULL) chunk will be returned at the end of the function. When it
fails, it'll return NULL early. The point is that when bfd_zalloc()
fails and returns NULL, must_write (in move_section_contents) / create
(in find_chunk) HAD to be true. That provides us with an easy test
back in move_section_contents to distinguish a memory-allocation-NULL
from a block-not-found-NULL.
The other NULL return case happens when the end of the function is
reached when either searching for a chunk to read or attempting to
find a chunk to write when abfd->tdata.tekhex_data->data is NULL. But
for the latter case, must_write was false, which does not (now, with
the above fix) trigger the early return of false.
(Alan Modra approved the bfd/tekhex.c change.)
Approved-By: Simon Marchi <simon.marchi@efficios.com> (GDB)
-rw-r--r-- | bfd/tekhex.c | 2 | ||||
-rw-r--r-- | gdb/testsuite/gdb.base/dump.c | 2 |
2 files changed, 2 insertions, 2 deletions
diff --git a/bfd/tekhex.c b/bfd/tekhex.c index 7e4e698..4ac6116 100644 --- a/bfd/tekhex.c +++ b/bfd/tekhex.c @@ -655,7 +655,7 @@ move_section_contents (bfd *abfd, { /* Different chunk, so move pointer. */ d = find_chunk (abfd, chunk_number, must_write); - if (!d) + if (!d && must_write) return false; prev_number = chunk_number; } diff --git a/gdb/testsuite/gdb.base/dump.c b/gdb/testsuite/gdb.base/dump.c index bdcafbf..14b66b1 100644 --- a/gdb/testsuite/gdb.base/dump.c +++ b/gdb/testsuite/gdb.base/dump.c @@ -35,7 +35,7 @@ main() for (i = 0; i < ARRSIZE; i++) intarray[i] = i+1; - intstruct.a = 12 * 1; + intstruct.a = (12 * 1) << 16; intstruct.b = 12 * 2; intstruct.c = 12 * 3; intstruct.d = 12 * 4; |