aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorKevin Buettner <kevinb@redhat.com>2025-08-23 20:54:30 -0700
committerKevin Buettner <kevinb@redhat.com>2025-08-25 15:39:18 -0700
commitbf0f85df125412b06b209db41e950715b2af406e (patch)
tree80328ff7656d20cfbe15709171b4f57722713a4b
parentfa8b445d00c518de27064db3dff22b6c476058ea (diff)
downloadbinutils-bf0f85df125412b06b209db41e950715b2af406e.zip
binutils-bf0f85df125412b06b209db41e950715b2af406e.tar.gz
binutils-bf0f85df125412b06b209db41e950715b2af406e.tar.bz2
Fix tekhex format related gdb.base/dump.exp failures
On s390x, a big-endian machine, I'm seeing these test failures: FAIL: gdb.base/dump.exp: array as memory, tekhex; file restored ok FAIL: gdb.base/dump.exp: array as memory, tekhex; value restored ok FAIL: gdb.base/dump.exp: array as value, tekhex; file restored ok FAIL: gdb.base/dump.exp: array as value, tekhex; value restored ok FAIL: gdb.base/dump.exp: array copy, tekhex; file restored ok FAIL: gdb.base/dump.exp: array copy, tekhex; value restored ok FAIL: gdb.base/dump.exp: array partial, tekhex; file restored ok FAIL: gdb.base/dump.exp: array partial, tekhex; value restored ok FAIL: gdb.base/dump.exp: dump array as memory, tekhex FAIL: gdb.base/dump.exp: dump array as value, tekhex FAIL: gdb.base/dump.exp: dump struct as memory, tekhex FAIL: gdb.base/dump.exp: dump struct as value, tekhex FAIL: gdb.base/dump.exp: reload array as memory, tekhex; value restored ok FAIL: gdb.base/dump.exp: reload array as value, tekhex; value restored ok FAIL: gdb.base/dump.exp: reload struct as memory, tekhex; value restored ok FAIL: gdb.base/dump.exp: reload struct as value, tekhex; value restored ok FAIL: gdb.base/dump.exp: struct as memory, tekhex; file restored ok FAIL: gdb.base/dump.exp: struct as memory, tekhex; value restored ok FAIL: gdb.base/dump.exp: struct as value, tekhex; file restored ok FAIL: gdb.base/dump.exp: struct as value, tekhex; value restored ok FAIL: gdb.base/dump.exp: struct copy, tekhex; file restored ok FAIL: gdb.base/dump.exp: struct copy, tekhex; value restored ok It turns out that there's a subtle bug in move_section_contents in bfd/tekhex.c. The bug is that when attempting to write a buffer that starts with a zero byte, the function will return false, an error condition, without writing anything. But it also doesn't set bfd_error, so GDB ends up displaying whatever the last unrelated error was, e.g.: warning: writing dump file '.../intstr1.tekhex' (No such file or directory) When I investigated this, the bfd error was set during failure to open a separate debug file for the test case, which is totally unrelated to this problem. The reason this fails on big endian machines is that the test case writes out structs and arrays of int initialized to small values. On little endian machines, the small integer is the first byte, so the error doesn't occur. On big endian machines, a zero byte occurs first, triggering the error. On the GDB side of things, I've made a one line change to the test case to cause the error to also happen on little endian machines. I simply shift value of the first field in the struct left by 16 bits. That leaves at least one zero byte on both sides of the non-zero part of the int. I shifted it by 16 because, for a moment, there was a question in my mind about what would happen with a second zero byte, but it turns out that it's not a problem. On the bfd side of things, take a look at move_section_contents() and find_chunk() in tekhex.c. The scenario is this: we enter move_section_contents with locationp pointing at a character buffer whose first byte is zero. The 'get' parameter is false, i.e. we're writing, not reading. The other critical fact is that the abfd->tdata.tekhex_data->data is NULL (0). I'm going to go through the execution path pretty much line by line with commentary below the line(s) just executed. char *location = (char *) locationp; bfd_vma prev_number = 1; /* Nothing can have this as a high bit. */ I can't say that the comment provides the best explanation about what's happening, but the gist is this: later on, chunk_number will have it's low bits masked away, therefore no matter what it is, it can't possibly be equal to prev_number when it's set to 1. struct data_struct *d = NULL; BFD_ASSERT (offset == 0); for (addr = section->vma; count != 0; count--, addr++) { Set d to NULL and enter the loop. /* Get high bits of address. */ bfd_vma chunk_number = addr & ~(bfd_vma) CHUNK_MASK; bfd_vma low_bits = addr & CHUNK_MASK; Use CHUNK_MASK, which is 0x1fff, to obtain the chunk number, i.e. whatever's left after masking off the low 13 bits of addr, and low_bits, which are the low 13 bits of addr. chunk_number matters for understanding this bug, low_bits does not. Remember that no matter what addr is, once you mask off the low 13 bits, it can't be equal to 1. bool must_write = !get && *location != 0; !get is true, *location != 0 is false, therefore the conjunction is false, and furthermore must_write is false. I.e. even though we are writing, we don't transfer zero bytes to the chunk - this is why must_write is false. (The reason this works is that a chunk, once allocated, is zero'd as part of the allocation using bfd_zalloc. Therefore we can skip transferring zero bytes and, if enough of them are skipped one after another, chunk allocation simply doesn't happen. That's a good thing.) if (chunk_number != prev_number || (!d && must_write)) For the reason provided above, chunk_number != prev_number is true. The other part of the disjunction doesn't matter since the first part is true. This means that the if-block is entered. /* Different chunk, so move pointer. */ d = find_chunk (abfd, chunk_number, must_write); find_chunk is entered with must_write set to false. Now, remember where we left off here, because we're going to switch to find_chunk. static struct data_struct * find_chunk (bfd *abfd, bfd_vma vma, bool create) { (Above 3 lines indented to distinguish code from commentary.) When we enter find_chunk, create is false because must_write was false. struct data_struct *d = abfd->tdata.tekhex_data->data; d is set to NULL since abfd->tdata.texhex_data->data is NULL (one of the conditions for the scenario). vma &= ~CHUNK_MASK; while (d && (d->vma) != vma) d = d->next; d is NULL, so the while loop doesn't execute. if (!d && create) ... d is NULL so !d is true, but create is false, so the condition evaluates to false, meaning that the if-block is skipped. return d; find_chunk returns NULL, since d is NULL. Back in move_section_contents: if (!d) return false; d is NULL (because that's what find_chunk returned), so move_section_contents returns false at this point. Note that find_section_contents has allocated no memory, nor even tried to transfer any bytes beyond the first (zero) byte. This is a bug. The key to understanding this bug is to observe that find_chunk can return NULL to indicate that no chunk was found. This is especially important for the read (get=true) case. But it can also be NULL to indicate a memory allocation error. I toyed around with the idea of using a different value to distinguish these cases, i.e. something like (struct data_struct *) -1, but although bfd contains plenty of code where -1 is used to indicate various interesting conditions for scalars, there's no prior art where this is done for a pointer. Therefore the idea was discarded in favor of modifying this statement: if (!d) return false; to: if (!d && must_write) return false; This works because, in find_chunk, the only way to return a NULL memory allocation error is for must_write / create to be true. When it is true, if bfd_zalloc successfully allocates a chunk, then that (non-NULL) chunk will be returned at the end of the function. When it fails, it'll return NULL early. The point is that when bfd_zalloc() fails and returns NULL, must_write (in move_section_contents) / create (in find_chunk) HAD to be true. That provides us with an easy test back in move_section_contents to distinguish a memory-allocation-NULL from a block-not-found-NULL. The other NULL return case happens when the end of the function is reached when either searching for a chunk to read or attempting to find a chunk to write when abfd->tdata.tekhex_data->data is NULL. But for the latter case, must_write was false, which does not (now, with the above fix) trigger the early return of false. (Alan Modra approved the bfd/tekhex.c change.) Approved-By: Simon Marchi <simon.marchi@efficios.com> (GDB)
-rw-r--r--bfd/tekhex.c2
-rw-r--r--gdb/testsuite/gdb.base/dump.c2
2 files changed, 2 insertions, 2 deletions
diff --git a/bfd/tekhex.c b/bfd/tekhex.c
index 7e4e698..4ac6116 100644
--- a/bfd/tekhex.c
+++ b/bfd/tekhex.c
@@ -655,7 +655,7 @@ move_section_contents (bfd *abfd,
{
/* Different chunk, so move pointer. */
d = find_chunk (abfd, chunk_number, must_write);
- if (!d)
+ if (!d && must_write)
return false;
prev_number = chunk_number;
}
diff --git a/gdb/testsuite/gdb.base/dump.c b/gdb/testsuite/gdb.base/dump.c
index bdcafbf..14b66b1 100644
--- a/gdb/testsuite/gdb.base/dump.c
+++ b/gdb/testsuite/gdb.base/dump.c
@@ -35,7 +35,7 @@ main()
for (i = 0; i < ARRSIZE; i++)
intarray[i] = i+1;
- intstruct.a = 12 * 1;
+ intstruct.a = (12 * 1) << 16;
intstruct.b = 12 * 2;
intstruct.c = 12 * 3;
intstruct.d = 12 * 4;