|
A recent discussion about what commands are allowed during
gdb.Breakpoint.stop made me wonder whether there would be fewer restrictions
if we did those commands as part of a breakpoint command list instead.
Attribute gdb.Breakpoint.commands is a string with gdb commands, so I
tried implementing a new class PyCommandsBreakpoint, derived from
gdb.Breakpoint, that supports a py_commands method.
My original idea was to forbid setting PyCommandsBreakpoint.commands, and do:
...
def py_commands(self):
    print("VAR: %d" % self.var)
    self.var += 1
    gdb.execute("continue")
...
but as it turns out 'gdb.execute("continue")' does not behave the same way as
continue. I've filed PR python/32454 about this.
So the unsatisfactory solution is to first execute
PyCommandsBreakpoint.py_commands:
...
def py_commands(self):
    print("VAR: %d" % self.var)
    self.var += 1
...
and then:
...
self.commands = "continue"
...
I was hoping for a better outcome, but having done the work of writing this, I
suppose it has use as a test-case, perhaps also as an example of how to work
around PR python/32454.
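For anyone who wants to experiment with the idea, here is a rough,
self-contained sketch of how such a class could be wired together. The
dispatch through a gdb.events.stop listener is an assumption on my part, not
necessarily how the test-case added by this commit does it:
...
import gdb

class PyCommandsBreakpoint(gdb.Breakpoint):
    def __init__(self, spec):
        super().__init__(spec)
        self.var = 0

    def py_commands(self):
        print("VAR: %d" % self.var)
        self.var += 1

def stop_handler(event):
    # Only breakpoint stops carry a list of breakpoints.
    if not isinstance(event, gdb.BreakpointEvent):
        return
    for bp in event.breakpoints:
        if isinstance(bp, PyCommandsBreakpoint):
            bp.py_commands()
            # Workaround for PR python/32454: let the breakpoint command
            # list do the continue, rather than calling
            # gdb.execute("continue") from here.
            bp.commands = "continue"

gdb.events.stop.connect(stop_handler)
...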
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32454
|
|
On s390x-linux, with test-case gdb.base/finish-pretty.exp I ran into:
...
(gdb) finish
Run till exit from #0 foo () at finish-pretty.c:28
main () at finish-pretty.c:40
40 return v.a + v.b;
Value returned has type: struct s. Cannot determine contents
(gdb) FAIL: $exp: finish foo prettyprinted function result
...
The function being finished is foo, which returns a value of type struct s.
The ABI [1] specifies:
- that the value is returned in a storage buffer allocated by the caller, and
- that the address of this buffer is passed as a hidden argument in r2.
GDB fails to print the value when finishing foo, because it doesn't know the
address of the buffer.
Implement the gdbarch_get_return_buf_addr hook for s390x to fix this.
This is based on ppc_sysv_get_return_buf_addr, the only other implementation
of gdbarch_get_return_buf_addr. For readability I've factored out
dwarf_reg_on_entry.
There is one difference with ppc_sysv_get_return_buf_addr: only
NO_ENTRY_VALUE_ERROR is caught. If this patch is approved, I intend to submit
a follow-up patch to fix this in ppc_sysv_get_return_buf_addr as well.
The hook is not guaranteed to work, because it attempts to get the value r2
had at function entry.
The hook can be called after function entry, and the ABI doesn't guarantee
that r2 is the same throughout the function.
Using -fvar-tracking adds debug information, which allows the hook to succeed
more often, and indeed after adding this to the test-case, it passes.
Do likewise in one more test-case.
Tested on s390x-linux.
Fixes:
- gdb.ada/finish-large.exp
- gdb.base/finish-pretty.exp
- gdb.base/retval-large-struct.exp
- gdb.cp/non-trivial-retval.exp
- gdb.ada/array_return.exp
I've also enabled the hook for s390, and from the ABI I get the
impression that it should work, but I haven't been able to test it.
[1] https://github.com/IBM/s390x-abi
|
|
The Linaro CI reported a regression on arm-linux in test-case
gdb.base/sigstep.exp following commit 7b46460a619 ("[gdb/symtab] Apply
workaround for PR gas/31115 a bit more") [1]:
...
(gdb) return^M
Make __default_sa_restorer return now? (y or n) n^M
Not confirmed^M
(gdb) FAIL: $exp: return from handleri: \
leave signal trampoline (got interactive prompt)
...
After installing package glibc-debuginfo and adding --with-separate-debug-dir
to the configure flags, I managed to reproduce the FAIL.
The regression is actually caused by a progression, in the sense that the
function name for the signal trampoline is now found.
After reading up on the signal trampoline [2] and the return command [3], my
understanding is that forced returning from the signal trampoline is
potentially unsafe, given that for instance the process signal mask won't be
restored.
Fix this by:
- rather than using the name, using "signal trampoline" in the query, and
- adding a warning about returning from a signal trampoline,
giving us:
...
(gdb) return^M
warning: Returning from signal trampoline does not fully restore pre-signal \
state, such as process signal mask.^M
Make signal trampoline return now? (y or n) y^M
87 dummy = 0; dummy = 0; while (!done);^M
(gdb) PASS: $exp: return from handleri: leave signal trampoline (in main)
...
Tested on x86_64-linux.
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
[1] https://linaro.atlassian.net/browse/GNU-1459
[2] https://man7.org/linux/man-pages/man2/sigreturn.2.html
[3] https://sourceware.org/gdb/current/onlinedocs/gdb.html/Returning.html
|
|
I ran make-check-all.sh with gdb.linespec/explicit.exp, and the only problems
were found with target board stabs.
With target board unix the test-case runs in two seconds, but with target
board stabs it takes 12 seconds due to a timeout.
Stabs support in gdb has been unmaintained for a while, and there's an ongoing
discussion to deprecate and remove it (PR symtab/31210).
It seems unnecessary to exercise this unmaintained feature in
make-check-all.sh, so drop it.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31210
|
|
This changes gdbpy_lookup_static_symbols to pass the 'flags' parameter
to expand_symtabs_matching. This should refine the search somewhat.
Note this is "just" a performance improvement, as the loop over
symtabs already checks 'flags'.
v2 also removes 'SEARCH_GLOBAL_BLOCK' and updates py-symbol.exp to
verify that this works properly. Thanks to Tom for this insight.
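For context, gdbpy_lookup_static_symbols is the worker behind
gdb.lookup_static_symbols; a minimal use from the Python side looks like this
(the symbol name is made up):
...
(gdb) python
>for sym in gdb.lookup_static_symbols("some_static_var"):
>    print(sym.name, sym.symtab.filename)
>end
...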
Co-Authored-By: Tom de Vries <tdevries@suse.de>
|
|
Now that the GDB 16 branch has been created,
this commit bumps the version number in gdb/version.in to
17.0.50.DATE-git
For the record, the GDB 16 branch was created
from commit ee29a3c4ac7adc928ae6ed1fed3b59c940a519a4.
Also, as a result of the version bump, the following changes
have been made in gdb/testsuite:
* gdb.base/default.exp: Change $_gdb_major to 17.
|
|
There are two tests that fail in gdb.base/startup-with-shell.exp when
using the native-extended-remote board. I plan to fix these issues,
and I've posted a series that does just that:
https://inbox.sourceware.org/gdb-patches/cover.1730731085.git.aburgess@redhat.com
But until that series is reviewed, I thought I'd merge this commit,
which marks the FAILs as XFAIL and links them to the relevant bug
number.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28392
Tested-By: Guinevere Larsen <guinevere@redhat.com>
|
|
This commit implements the gdbarch_core_parse_exec_context method for
FreeBSD.
This is much simpler than for Linux. On FreeBSD, at least in the
version (13.x) that I have installed, there are additional entries in
the auxv vector that point directly to the argument and environment
vectors, which makes it trivial to find this information.
If these extra auxv entries are not available on earlier FreeBSD, then
that's fine. The fallback behaviour will be for GDB to act as it
always has up to this point; you'll just not get the extra
functionality.
Other differences compared to Linux are that FreeBSD has
AT_FREEBSD_EXECPATH instead of AT_EXECFN; the AT_FREEBSD_EXECPATH is
the full path to the executable. On Linux AT_EXECFN is the command
the user typed, so this can be a relative path.
This difference is handy as on FreeBSD we don't parse the mapped files
from the core file (are they even available?). So having the EXECPATH
means we can use that as the absolute path to the executable.
However, if the user ran a symlink then AT_FREEBSD_EXECPATH will be
the absolute path to the symlink, not to the underlying file. This is
probably a good thing, but it does mean there is one case we test on
Linux that fails on FreeBSD.
Consider this: on Linux we create a symlink to an executable, then run the
symlink and generate a core file. Now delete the symlink and load the core
file. On Linux GDB will still find (and open) the original
executable. This is because we use the mapped file information to
find the absolute path to the executable, and the mapped file
information only stores the real file names, not symlink names.
This is a total edge case; I only added the deleted symlink test
originally because I could see that this would work on Linux. Though
it is neat that Linux finds this, I don't feel too bad that this fails
on FreeBSD.
Other than this, everything seems to work on x86-64 FreeBSD (13.4),
which is all I have set up right now. I don't see why other
architectures wouldn't work too, but I haven't tested them.
|
|
GDB already has a limited mechanism for auto-loading the executable
corresponding to a core file; this can be found in the function
locate_exec_from_corefile_build_id in corelow.c.
However, this approach uses the build-id of the core file to either
look in the debug directory (for a symlink back to the executable) or
ask debuginfod. This is great, and works fine if the core file
is a "system" binary, but often, when I'm debugging a core file, it's
part of my development cycle, so there's no build-id symlink in the
debug directory, and debuginfod doesn't know about the binary either,
so GDB can't auto load the executable....
... but the executable is right there!
This commit builds on the earlier commits in this series to make GDB
smarter.
On GNU/Linux, when we parse the execution context from the core
file (see linux-tdep.c), we already grab the command pointed to by
AT_EXECFN. If this is an absolute path then GDB can use this to
locate the executable, a build-id check ensures we've found the
correct file. With this small change GDB suddenly becomes a lot
better at auto-loading the executable for a core file.
But we can do better! Often the AT_EXECFN is not an absolute path.
If it is a relative path then we check for this path relative to the
core file. This helps if a user does something like:
$ ./build/bin/some_prog
Aborted (core dumped)
$ gdb -c corefile
In this case the core file in the current directory will have an
AT_EXECFN value of './build/bin/some_prog', so if we look for that
path relative to the location of the core file this might result in a
hit; again, a build-id check ensures we found the right file.
But we can do better still! What if the user moves the core file? Or
the user is using some tool to manage core files (e.g. the systemd
core file management tool), and the user downloads the core file to a
location from which the relative path no longer works?
Well in this case we can make use of the core file's mapped file
information (the NT_FILE note). The executable will be included in
the mapped file list, and the path within the mapped file list will be
an absolute path. We can search for mapped file information based on
an address within the mapped file, and the auxv vector happens to
include an AT_ENTRY value, which is the entry address in the main
executable. If we look up the mapped file containing this address
we'll have the absolute path to the main executable, a build-id check
ensures this really is the file we're looking for.
It might be tempting to jump straight to the third approach, however,
there is one small downside to the third approach: if the executable
is a symlink then the AT_EXECFN string will be the name of the
symlink, that is, the thing the user asked to run. The mapped file
entry will be the name of the actual file, i.e. the symlink target.
When we auto-load the executable based on the third approach, the file
loaded might have a different name to that which the user expects,
though the build-id check (almost) guarantees that we've loaded the
correct binary.
But there's one more thing we can check for!
If the user has placed the core file and the executable into a
directory together, for example, as might happen with a bug report,
then neither the absolute path check nor the relative path check
will find the executable. So GDB will also look for a file with the
right name in the same directory as the core file. Again, a build-id
check is performed to ensure we find the correct file.
Of course, it's still possible that GDB is unable to find the
executable using any of these approaches. In this case, nothing
changes: GDB will check in the debug info directory for a build-id
based link back to the executable, and if that fails, GDB will ask
debuginfod for the executable. If this all fails, then, as usual, the
user is able to load the correct executable with the 'file' command,
but hopefully, this should be needed far less from now on.
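To summarise, the search order described above boils down to something like
the following Python sketch. The helper names are hypothetical and the real
logic lives in C++ (corelow.c and linux-tdep.c); this is just the shape of
the algorithm:
...
import os

def locate_core_executable(core_path, exec_fn, entry_mapped_path,
                           build_id_matches):
    # exec_fn is the AT_EXECFN string, entry_mapped_path is the mapped
    # file containing AT_ENTRY (or None), and build_id_matches(path)
    # stands in for GDB's build-id verification of a candidate.
    core_dir = os.path.dirname(core_path)
    candidates = []
    if os.path.isabs(exec_fn):
        candidates.append(exec_fn)                          # absolute AT_EXECFN
    else:
        candidates.append(os.path.join(core_dir, exec_fn))  # relative to core
    if entry_mapped_path is not None:
        candidates.append(entry_mapped_path)                # mapped file at AT_ENTRY
    candidates.append(os.path.join(core_dir,
                                   os.path.basename(exec_fn)))  # next to core
    for path in candidates:
        if os.path.exists(path) and build_id_matches(path):
            return path
    return None   # fall back to debug directory / debuginfod, as before
...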
|
|
We have a few tests that load core files, which depend on GDB not
auto-loading the executable that matches the core file. One of these
tests (corefile-buildid.exp) exercises GDB's ability to load the
executable via the build-id links in the debug directory, while the
other two tests are just written assuming that GDB hasn't auto-loaded
the executable.
In the next commit, GDB is going to get better at finding the
executable for a core file, and as a consequence these tests could
start to fail if the testsuite is being run using a compiler that adds
build-ids by default, and is on a target (currently only Linux) with
the improved executable auto-loading.
To avoid these test failures, this commit updates some of the tests.
coredump-filter.exp and corefile.exp are updated to unload the
executable should it be auto-loaded. This means that the following
output from GDB will match the expected patterns. If the executable
wasn't auto-loaded then the new step to unload is harmless.
The corefile-buildid.exp test needed some more significant changes.
For this test it is important that the executable be moved aside so
that GDB can't locate it, but we do still need the executable around
somewhere, so that the debug directory can link to it. The point of
the test is that the executable _should_ be auto-loaded, but using the
debug directory, not using GDB's context parsing logic.
While looking at this test I noticed two additional problems. First, we
were creating the core file more times than we needed. We only need
to create one core file for each test binary (total two), while we
previously created one core file for each style of debug info
directory (total four). The extra core files should be identical, and
were just overwriting each other: harmless, but still pointless work.
The other problem is that after running an earlier test we modified
the test binary in order to run a later test. This means it's not
possible to manually re-run the first test as the binary for that test
is destroyed.
As part of the rewrite in this commit I've addressed these issues.
This test does change many of the test names, but there should be no
real changes in what is being tested after this commit. However, when
the next commit is added, and GDB gets better at auto-loading the
executable for a core file, these tests should still be testing what
is expected.
|
|
Extend the core file context parsing mechanism added in the previous
commit to also store the environment parsed from the core file.
This environment can then be injected into the inferior object.
The benefit of this is that when examining a core file in GDB, the
'show environment' command will now show the environment extracted
from a core file.
Consider this example:
$ env -i GDB_TEST_VAR=FOO ./gen-core
Segmentation fault (core dumped)
$ gdb -c ./core.1669829
...
[New LWP 1669829]
Core was generated by `./gen-core'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000401111 in ?? ()
(gdb) show environment
GDB_TEST_VAR=FOO
(gdb)
There's a new test for this functionality.
|
|
Add a new gdbarch method which can read the execution context from a
core file. An execution context, for this commit, means the filename
of the executable used to generate the core file and the arguments
passed to the executable.
In later commits this will be extended further to include the
environment in which the executable was run, but this commit is
already pretty big, so I've split that part out into a later commit.
Initially this new gdbarch method is only implemented for Linux
targets, but a later commit will add FreeBSD support too.
Currently when GDB opens a core file, GDB reports the command and
arguments used to generate the core file. For example:
(gdb) core-file ./core.521524
[New LWP 521524]
Core was generated by `./gen-core abc def'.
However, this information comes from the psinfo structure in the core
file, and this struct only allows 80 characters for the command and
arguments combined. If the command and arguments exceed this then
they are truncated.
Additionally, neither the executable nor the arguments are quoted in
the psinfo structure, so if, for example, the executable was named
'aaa bbb' (i.e. contains white space) and was run with the arguments
'ccc' and 'ddd', then when this core file was opened by GDB we'd see:
(gdb) core-file ./core.521524
[New LWP 521524]
Core was generated by `./aaa bbb ccc ddd'.
It is impossible to know if 'bbb' is part of the executable filename,
or another argument.
However, the kernel places the executable command onto the user stack;
this is pointed to by the AT_EXECFN entry in the auxv vector.
Additionally, the inferior arguments are all available on the user
stack. The new gdbarch method added in this commit extracts this
information from the user stack and allows GDB to access it.
The information on the stack is writable by the user, so a user
application can start up, edit the arguments, override the AT_EXECFN
string, and then dump core. In this case GDB will report incorrect
information. However, it is worth noting that the psinfo structure is
also filled (by the kernel) by just copying information from the user
stack, so if the user edits the on-stack arguments, the values
reported in psinfo will change too; the new approach is no worse than
what we currently have.
The benefit of this approach is that GDB gets to report the full
executable name and all the arguments without the 80 character limit,
and GDB is aware which parts are the executable name, and which parts
are arguments, so we can, for example, style the executable name.
Another benefit is that, now that we know all the arguments, we can poke
these into the inferior object. This means that after loading a core
file a user can 'show args' to see the arguments used. A user could
even transition from core file debugging to live inferior debugging
using, e.g. 'run', and GDB would restart the inferior with the correct
arguments.
Now the downside: finding the AT_EXECFN string is easy, as the auxv entry
points directly to it. However, finding the arguments is a little
trickier. There's currently no easy way to get a direct pointer to
the arguments. Instead, I've got a heuristic which I believe should
find the arguments in most cases. The algorithm is laid out in
linux-tdep.c, I'll not repeat it here, but it's basically a search of
the user stack, starting from AT_EXECFN.
If the new heuristic fails then GDB just falls back to the old
approach, asking bfd to read the psinfo structure for us, which gives
the old 80 character limited answer.
For testing, I've run this series on (all GNU/Linux) x86-64, s390,
ppc64le, and the new test passes in each case. I've done some very
basic testing on ARM which does things a little different than the
other architectures mentioned, see ARM specific notes in
linux_corefile_parse_exec_context_1 for details.
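As an aside, the auxv entries this parsing starts from can be inspected by
hand once a core file is loaded, without touching any GDB internals; a small
example (core file path reused from above):
...
(gdb) core-file ./core.521524
(gdb) python
>auxv = gdb.execute("info auxv", to_string=True)
>for line in auxv.splitlines():
>    if "AT_EXECFN" in line or "AT_ENTRY" in line:
>        print(line)
>end
...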
|
|
This commit adds support for a `gstack' command which Fedora has
been carrying for many years. gstack is a natural counterpart to
the gcore command. Whereas gcore dumps a core file, gstack prints
stack traces of a running process.
There are many improvements over Fedora's version of this script.
The dependency on procfs is gone; gstack will run anywhere gdb
runs. The only runtime dependencies are bash and awk.
The script includes suggestions from gdb/32325 to include
versioning and help. [If this approach to gdb/32325 is acceptable,
I could propagate the solution to gcore/gdb-add-index.]
I've rewritten the documentation, integrating it into the User Manual.
The manpage is now output using this one source.
Example run (on x86_64 Fedora 40):
$ gstack --help
Usage: gstack [-h|--help] [-v|--version] PID
Print a stack trace of a running program
-h, --help Print this message then exit.
-v, --version Print version information then exit.
$ gstack -v
GNU gstack (GDB) 16.0.50.20241119-git
$ gstack 12345678
Process 12345678 not found.
$ gstack $(pidof emacs)
Thread 6 (Thread 0x7fd5ec1c06c0 (LWP 2491423) "pool-spawner"):
#0 0x00007fd6015ca3dd in syscall () at /lib64/libc.so.6
#1 0x00007fd60b31eccd in g_cond_wait () at /lib64/libglib-2.0.so.0
#2 0x00007fd60b28a61b in g_async_queue_pop_intern_unlocked () at /lib64/libglib-2.0.so.0
#3 0x00007fd60b2f1a03 in g_thread_pool_spawn_thread () at /lib64/libglib-2.0.so.0
#4 0x00007fd60b2f0813 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#5 0x00007fd6015486d7 in start_thread () at /lib64/libc.so.6
#6 0x00007fd6015cc60c in clone3 () at /lib64/libc.so.6
#7 0x0000000000000000 in ??? ()
Thread 5 (Thread 0x7fd5eb9bf6c0 (LWP 2491424) "gmain"):
#0 0x00007fd6015be87d in poll () at /lib64/libc.so.6
#1 0x0000000000000001 in ??? ()
#2 0xffffffff00000001 in ??? ()
#3 0x0000000000000001 in ??? ()
#4 0x000000002104cfd0 in ??? ()
#5 0x00007fd5eb9be320 in ??? ()
#6 0x00007fd60b321c34 in g_main_context_iterate_unlocked.isra () at /lib64/libglib-2.0.so.0
Thread 4 (Thread 0x7fd5eb1be6c0 (LWP 2491425) "gdbus"):
#0 0x00007fd6015be87d in poll () at /lib64/libc.so.6
#1 0x0000000020f9b558 in ??? ()
#2 0xffffffff00000003 in ??? ()
#3 0x0000000000000003 in ??? ()
#4 0x00007fd5d8000b90 in ??? ()
#5 0x00007fd5eb1bd320 in ??? ()
#6 0x00007fd60b321c34 in g_main_context_iterate_unlocked.isra () at /lib64/libglib-2.0.so.0
Thread 3 (Thread 0x7fd5ea9bd6c0 (LWP 2491426) "emacs"):
#0 0x00007fd6015ca3dd in syscall () at /lib64/libc.so.6
#1 0x00007fd60b31eccd in g_cond_wait () at /lib64/libglib-2.0.so.0
#2 0x00007fd60b28a61b in g_async_queue_pop_intern_unlocked () at /lib64/libglib-2.0.so.0
#3 0x00007fd60b28a67c in g_async_queue_pop () at /lib64/libglib-2.0.so.0
#4 0x00007fd603f4d0d9 in fc_thread_func () at /lib64/libpangoft2-1.0.so.0
#5 0x00007fd60b2f0813 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#6 0x00007fd6015486d7 in start_thread () at /lib64/libc.so.6
#7 0x00007fd6015cc60c in clone3 () at /lib64/libc.so.6
#8 0x0000000000000000 in ??? ()
Thread 2 (Thread 0x7fd5e9e6d6c0 (LWP 2491427) "dconf worker"):
#0 0x00007fd6015be87d in poll () at /lib64/libc.so.6
#1 0x0000000000000001 in ??? ()
#2 0xffffffff00000001 in ??? ()
#3 0x0000000000000001 in ??? ()
#4 0x00007fd5cc000b90 in ??? ()
#5 0x00007fd5e9e6c320 in ??? ()
#6 0x00007fd60b321c34 in g_main_context_iterate_unlocked.isra () at /lib64/libglib-2.0.so.0
Thread 1 (Thread 0x7fd5fcc45280 (LWP 2491417) "emacs"):
#0 0x00007fd6015c9197 in pselect () at /lib64/libc.so.6
#1 0x0000000000000000 in ??? ()
Since this is essentially a complete rewrite of the original
script and documentation, I've chosen to only keep a 2024 copyright date.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Consider operate-and-get-next [1] in bash:
...
$ <echo 1>echo 1<enter>
1
$ <echo 2>echo 2<enter>
2
$ <Ctrl-r>(reverse-i-search)`': <echo 1>echo 1<Ctrl-o>
1
$ echo 2<Ctrl-o>
2
$ echo 1
...
So, typing Ctrl-o:
- executes the recalled command, and
- prefills the next one (which then can be executed again with Ctrl-o).
We have the same functionality in gdb, but there is a difference when
recalling the last command from history. With bash we have no prefill:
...
$ <echo 1>echo 1<enter>
1
$ <Ctrl-r>(reverse-i-search)`': <echo 1>echo 1<Ctrl-o>
1
$
...
but with gdb we do have a prefill:
...
(gdb) echo 1\n
1
(gdb) <Ctrl-r>(reverse-i-search)`': <echo 1>echo 1\n<Ctrl-o>
1
(gdb) echo 1\n
...
Following the principle of least surprise [2], I think gdb should do what bash
does.
Fix this by:
- signalling this case in gdb_rl_operate_and_get_next using
"operate_saved_history = -1", and
- handling operate_saved_history == -1 in
gdb_rl_operate_and_get_next_completion.
Tested on aarch64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
PR cli/32485
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32485
[1] https://www.man7.org/linux/man-pages/man3/readline.3.html
[2] https://en.wikipedia.org/wiki/Principle_of_least_astonishment
|
|
On openSUSE Leap 15.6 ppc64le-linux, with gdb.linespec/explicit.exp I run
into:
...
(gdb) b -source thread_pointer.h FAIL: $exp: complete after -source: tab complete "b -source thr"
Quit^M
...
The test-case already contains a related workaround:
...
# Get rid of symbols from shared libraries, otherwise
# "b -source thr<tab>" could find some system library's
# source.
gdb_test_no_output "nosharedlibrary"
...
but that doesn't work in this case because the debug info is in the executable
itself:
...
The File Name Table (offset 0xb5):
Entry Dir Time Size Name
1 0 0 0 abi-note.c
2 1 0 0 types.h
3 2 0 0 stdint-intn.h
4 2 0 0 stdint-uintn.h
5 3 0 0 elf.h
6 4 0 0 thread_pointer.h
...
due to debug info in some glibc object file.
Fix this by:
- using -nostdlib, ensuring only debug info from the three test-case sources
is present in the executable, and
- adding a _start wrapping main.
Tested on x86_64-linux and ppc64le-linux.
Reviewed-By: Keith Seitz <keiths@redhat.com>
PR testsuite/31229
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31229
|
|
'rbreak' searches symbols and then sets a number of breakpoints. If
setting one of the breakpoints fails, then 'rbreak' will terminate
before examining the remaining symbols.
However, it seems to me that it is better for 'rbreak' to keep going
in this situation. That is what this patch implements.
This problem can be seen by writing an Ada program that uses "pragma
import" to reference a symbol that does not have debug info. In this
case, the program will link but setting a breakpoint on the imported
name will not work.
I don't think it's possible to write a reliable test for this, as it
depends on the order in which symtabs are examined.
New in v2: rbreak now shows how many breakpoints it made and also how
many errors it encountered.
Regression tested on x86-64 Fedora 40.
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
After posting this series:
https://inbox.sourceware.org/gdb-patches/cover.1733742925.git.aburgess@redhat.com
I got a failure report from the Linaro CI system. I eventually
tracked the issue down to a filename clash with glibc. I was able to
reproduce the issue when I installed the glibc debug information on to
my local machine, and ran the gdb.base/dlmopen.exp test as updated in
the above series.
Here's what's happening:
There is a file called dlmopen.c within glibc; within the glibc source
tree the file can be found at ./dlfcn/dlmopen.c. When this file is
compiled it appears that the glibc build system first enters the dlfcn
directory, and then compiles the file using the relative path
./dlmopen.c; here's a snippet of the DWARF:
<0><d5d27>: Abbrev Number: 12 (DW_TAG_compile_unit)
<d5d28> DW_AT_producer : (alt indirect string, offset: 0x16433) t
<d5d2c> DW_AT_language : 29 (C11)
<d5d2d> DW_AT_name : (indirect line string, offset: 0x5c8f): dlmopen.c
<d5d31> DW_AT_comp_dir : (indirect line string, offset: 0xb478): /usr/src/debug/glibc-2.38-19.fc39.x86_64/dlfcn
<d5d35> DW_AT_low_pc : 0x8a4c0
<d5d3d> DW_AT_high_pc : 408
<d5d3f> DW_AT_stmt_list : 0x68ec1
The important thing here is the DW_AT_name, which is just "dlmopen.c".
The gdb.base/dlmopen.exp test also has a source file called
"dlmopen.c".
The dlmopen.exp test makes use of the clean_restart TCL proc, which
calls gdb_reinitialize_dir, which resets the source directories search
path to '$cdir:$cwd', and then prepends the test source directory to
the front of the list, so the source directory search path will look
something like:
/tmp/src/gdb/testsuite/gdb.base/gdb.base:$cdir:$cwd
In the existing test we try to place a breakpoint on 'dlmopen.c:64'.
This is the line tagged 'bp.main' in the source file. This currently
works fine. GDB searches through the symtabs and finds two matches:
the test's dlmopen.c and the glibc dlmopen.c. For each, GDB tries to
convert line 64 into an address.
For the testsuite source file this is fine, we get the address of the
line tagged 'bp.main' from the source, and the breakpoint is created.
For the glibc source file though, at least for the version available
to me, line 64 happens to be the closing '}' of a function, and there
isn't a line table entry for this exact line. So GDB searches forward
looking for the next line in order to place a breakpoint there. The
next line GDB finds is the start of the next function, and so GDB
rejects this location due to commit:
commit dcaa85e58c4ef50a92908e071ded631ce48c971c
Date: Wed May 1 10:47:47 2024 +0100
gdb: reject inserting breakpoints between functions
So we managed to avoid creating two breakpoint locations in this case,
but only by pure good luck.
In my updates to the test though I try to create a breakpoint at line
61 in addition to the breakpoint at line 64. So now the breakpoint
spec is 'dlmopen.c:61'.
Just as before, GDB identifies that 'dlmopen.c' could mean two files,
and searches for line 61 in both. The test source works as expected
and the breakpoint is created in the desired location.
But this time, line 61 in the glibc source file is an actual line,
with actual code, and so GDB places a breakpoint at this location.
This second breakpoint, in glibc is entirely unexpected (by the
dlmopen.exp test script). Unfortunately, the inferior hits this
second glibc breakpoint before it hits the actual breakpoint within
the main test executable, this throws the test off and causes some
failures.
In trying to fix this, I did wonder if I could just specify the full
path to the source file, instead of using just 'dlmopen.c:61'.
However, this doesn't work.
Remember that the glibc source file is recorded as just 'dlmopen.c'.
So, when GDB tries to figure out the absolute path to this source
file, the source directory search path is used. In this case, the
first entry in the source directory search path is the gdb.base/
directory in the GDB source tree. GDB looks in this directory and
finds a dlmopen.c, and so GDB assumes that this is the file in
question.
Thus, GDB actually thinks that both files _are_ the same source file.
Indeed, when GDB stops at the incorrect (glibc) breakpoint, and lists
the source code, it actually lists the source code from the correct
file. This confused me to begin with: GDB reported the wrong
function (the glibc function), but listed code from the correct file
and line.
Now on my machine I have installed the package that provides the glibc
source code. If I change the source directory search path so that
$cdir is first instead of the gdb.base/ from the GDB source tree, this
fixes the problem of listing the wrong file. GDB does not realise that
the files are different, and if I create the breakpoint using the
absolute path then only a single breakpoint location is created.
However, this relies on the developer having both the glibc debug
information and the glibc source package installed; this doesn't seem
like a great requirement to have in place.
So instead, I propose that we just take the easy way out and rename the
test source file. By doing this all the issues are avoided. The test
now creates a breakpoint at 'dlmopen-main.c:61', and there is only one
file with this name found, so we only get a single breakpoint location
created.
I renamed the source file, but not the dlmopen.exp file, because the
test already makes use of multiple source files, so having a range of
different names didn't feel that bad. But if this bothers people, I
could rename both the .exp and main .c file; just let me know.
If you want to explore this issue for yourself then try with
installing the glibc debug information for your system, and ensure
that your GDBs under test are able to find the glibc debug
information. You can then either apply the series I linked above, or,
you can modify the existing test source so that the line tagged as
'bp.main' becomes line 61; I just deleted 3 lines from the big comment
at the head of the file.
Of course, reproducing this does depend on how glibc is compiled,
which could change from system to system, or over time. I reproduced
this issue on Fedora 39 with glibc-2.38-19.
With this patch applied I no longer see any regressions when I apply
the above linked series.
While making these changes I took the opportunity to update the test
script to make better use of standard_testfile and build_executable.
Reviewed-By: Keith Seitz <keiths@redhat.com>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This patch reuses the "title" style for titles -- in particular the
header line of a list display.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Keith Seitz <keiths@redhat.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
Currently the "title" style is only used when printing command names.
The "title" name itself is probably a misnomer, but meanwhile this
patch changes the existing uses to instead use the new "command" style
for consistency.
The "title" style is not removed; see the next patch.
Reviewed-By: Keith Seitz <keiths@redhat.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
This fixes a formatting issue and corrects a comment in the new
gdb.ada/lazy-string.exp. I meant to do this in an earlier patch but
forgot to save.
|
|
Currently, if you create a lazy string while in Ada language mode, the
string will be rendered strangely, like:
"["d0"]["9f"]["d1"]["80"]["d0"]["b8"]...
This happens because ada_printstr does not really handle UTF-8
decoding.
This patch changes ada_language::printstr to use generic_printstr when
UTF-8 is used.
Note that this code could probably be improved some more -- the
current patch only addresses the narrow case of the Python API. I've
filed a follow-up bug (PR ada/32413) for the remaining changes.
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
Commit 1411185a ("Introduce and use gnat_version_compare") changed the
Ada tests to use a new proc for version checking. Unfortunately this
patch inadvertently reversed the sense of the test in
packed_array_assign.exp.
After fixing this, I went through that patch again and looked for
other problems. I found one spot where the wrong syntax was used, and
some others where I believe the sense of the test was inverted.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32444
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
The test-case gdb.dap/scopes.exp contains the following outdated comment:
...
# setVariable isn't implemented yet, so use the register name.
...
Now that setVariable is implemented, use it to set variable scalar, and remove
the bit that sets the first register. That part is known to fail on s390x,
because the first register isn't writeable [1].
Tested on x86_64-linux.
Suggested-By: Tom Tromey <tom@tromey.com>
Approved-By: Tom Tromey <tom@tromey.com>
[1] https://sourceware.org/pipermail/gdb-patches/2024-December/213823.html
|
|
With test-case gdb.dap/step-out.exp on s390x-linux, I get:
...
>>> {"seq": 7, "type": "request", "command": "scopes", "arguments": {"frameId": 0}}
Content-Length: 569^M
^M
{"request_seq": 7, "type": "response", "command": "scopes", "body": {"scopes": [{"variablesReference": 1, "name": "Locals", "presentationHint": "locals", "expensive": false, "namedVariables": 1, "line": 35, "source": {"name": "step-out.c", "path": "/home/vries/gdb/src/gdb/testsuite/gdb.dap/step-out.c"}}, {"variablesReference": 2, "name": "Registers", "presentationHint": "registers", "expensive": false, "namedVariables": 114, "line": 35, "source": {"name": "step-out.c", "path": "/home/vries/gdb/src/gdb/testsuite/gdb.dap/step-out.c"}}]}, "success": true, "seq": 21}PASS: gdb.dap/step-out.exp: get scopes success
FAIL: gdb.dap/step-out.exp: three scopes
...
The problem is that the test-case expects three scopes:
...
lassign $scopes scope reg_scope return_scope
...
but the return_scope is missing because this doesn't work:
...
$ gdb -q -batch outputs/gdb.dap/step-out/step-out \
-ex "b function_breakpoint_here" \
-ex run \
-ex finish
...
Value returned has type: struct result. Cannot determine contents
...
This is likely caused by a problem in gdb, but there's nothing wrong with the
DAP support.
support.
Fix this by:
- allowing two scopes, and
- declaring the tests of return_scope unsupported.
Tested on s390x-linux.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Since commit e69d35f45e0 ("Use ui-out table in "maint print reggroups""),
test-case gdb.python/py-arch-reg-groups.exp fails with check-read1:
...
FAIL: $exp: Same number of registers groups found
FAIL: $exp: all register groups match
...
Fix this by adding a gdb_test_multiple clause that matches the command.
Tested on x86_64-linux.
|
|
We discovered that attempting to print a very large string-like array
would succeed on the CLI, but in DAP would cause the "variables"
request to fail with:
value requires 67038491 bytes, which is more than max-value-size
This turns out to be a limitation in Value.format_string, which
de-lazy-ifies the value.
This patch fixes this problem by introducing a new NoOpStringPrinter
class, and then using it for string-like values. This printer returns
a lazy string, which solves the problem.
Note there are some special cases where we do not want to return a
lazy string. I've documented these in the code. I considered making
gdb.Value.lazy_string handle these cases -- for example it could
return 'self' rather than a lazy string in some situations -- but this
approach was simpler.
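Conceptually the new printer is tiny; roughly this (a sketch of the idea, not
necessarily the exact code added to GDB):
...
class NoOpStringPrinter:
    """Return a string-like value as a gdb.LazyString.

    Returning a lazy string means the (possibly huge) contents are only
    fetched when, and as far as, they are actually displayed."""

    def __init__(self, value):
        self.value = value

    def to_string(self):
        return self.value.lazy_string()
...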
|
|
gdbpy_create_lazy_string_object will throw an exception if you pass it
a NULL pointer without also setting length=0 -- the default,
length==-1, will fail. This seems bizarre. Furthermore, it doesn't
make sense to do this check for array types, as an array can have a
zero length. This patch cleans up the check and makes it specific to
TYPE_CODE_PTR.
|
|
Currently, gdb.Value.lazy_string will allow the conversion of any
object to a "lazy string". However, this was never the intent and is
weird besides. This patch changes this code to correctly throw an
exception in the non-matching cases.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=20769
|
|
I added a new test using gdb_py_test_silent_cmd, and then was
surprised to find out that the new test passed -- it caused a Python
exception and I had expected it to fail. This patch fixes this proc
to detect this situation and fail.
|
|
While testing DAP, we found a situation where a compiler-generated
variable caused the "variables" request to fail -- the variable in
question being an apparent 67-megabyte string.
It seems to me that artificial variables like this aren't interesting
to DAP users, and the gdb CLI omits these as well.
This patch changes DAP to omit these variables, adding a new
gdb.Symbol.is_artificial attribute to make this possible.
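For reference, the same kind of filtering is now possible from user Python
code as well; a minimal sketch that skips artificial symbols in the selected
frame's block:
...
frame = gdb.selected_frame()
for sym in frame.block():
    if sym.is_artificial:
        continue   # compiler-generated, skip it like DAP now does
    if sym.is_variable or sym.is_argument:
        print(sym.name, "=", sym.value(frame))
...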
|
|
PR dap/32090 points out that gdb's DAP "launch" sequencing is
incorrect. The current approach (which is itself a 2nd
implementation...) was based on a misreading of the spec. The spec
has since been clarified here:
https://github.com/microsoft/debug-adapter-protocol/issues/497
The clarification here is that a client is free to send the "launch"
(or "attach") request at any point after the "initialized" event has
been sent by gdb. However, the "launch" does not cause any action to
be taken -- and does not send a response -- until after
"configurationDone" has been seen.
This patch implements this by arranging for the launch and attach
commands to return a DeferredRequest object.
All the tests needed updates. I've also added a new test that checks
that the deferred "launch" request can be cancelled. (Note that the
cancellation is lazy -- it also waits until configurationDone is seen.
This could be fixed, but I was not sure whether it is important to do
so.)
Finally, the "launch" command has a somewhat funny sequencing now.
Simply sending the command and waiting for a response yielded strange
results if the inferior did not stop -- in this case, the response was
never sent. So now, the command is split into two parts, with some
setup being done synchronously (for better error propagation) and the
actual "run" being done async.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32090
Reviewed-by: Kévin Le Gouguec <legouguec@adacore.com>
|
|
After the commit:
commit 03ad29c86c232484f9090582bbe6f221bc87c323
Date: Wed Jun 19 11:14:08 2024 +0100
gdb: 'target ...' commands now expect quoted/escaped filenames
it was no longer possible to pass GDB the name of a core file
containing any special characters (white space or quote characters) on
the command line. For example:
$ gdb -c /tmp/core\ file.core
Junk after filename "/tmp/core": file.core
(gdb)
The problem is that the above commit changed the 'target core' command
to expect quoted filenames, so before the above commit a user could
write:
(gdb) target core /tmp/core file.core
[New LWP 2345783]
Core was generated by `./mkcore'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000401111 in ?? ()
(gdb)
But after the above commit the user must write:
(gdb) target core /tmp/core\ file.core
or
(gdb) target core "/tmp/core file.core"
This is part of a move to make GDB's filename argument handling
consistent.
Anyway, the problem with the '-c' command line flag is that it
forwards the filename unmodified through to the 'core-file' command,
which in turn forwards to the 'target core' command.
So when the user, at a shell writes:
$ gdb -c "core file.core"
this arrives in GDB as the unquoted string 'core file.core' (without
the single quotes). GDB then forwards this to the 'core-file'
command as if the user had written this at a GDB prompt:
(gdb) core-file core file.core
Which then fails to parse due to the unquoted white space between
'core' and 'file.core'.
The solution I propose is to escape any special characters in the core
file name passed from the command line before calling 'core-file'
command from main.c.
I've updated the corefile.exp test to include a test for passing a
core file containing a white space character. While I was at it I've
modernised the part of corefile.exp that I was touching.
|
|
The recent commit <HASH> moved an initialization of an objfile_holder in
syms_from_objfile_1 much earlier in the function, to better deal with
when GDB is unable to read the objfile format.
However, there is an early exit from syms_from_objfile_1 when the
objfile can be understood, but has no symbols. That was not releasing
the objfile_holder, so the objfile was being unlinked from the program
space, but the process of reading the objfile was being continued,
leading to use-after-frees flagged by the Address Sanitizer.
This commit fixes that UAF by making the objfile_holder release the
objfile right before the early exit.
This commit also changes the test gdb.base/dump.exp since that was the
original test that flagged the UAF, but at the end of the test the
generated files were being deleted, meaning we couldn't redo the test
manually after the fact. That final deletion was removed.
Reported-by: Simon Marchi <simark@simark.ca>
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
After the commit:
commit b9de07a5ff74663ff39bf03632d1b2ea417bf8d5
Date: Thu Oct 10 11:37:34 2024 +0100
gdb: fix handling of DW_AT_entry_pc of inlined subroutines
GDB's buildbot CI testing highlighted this assertion failure:
(gdb) c
Continuing.
../../binutils-gdb/gdb/block.h:203: internal-error: set_entry_pc: Assertion `start >= this->start () && start < this->end ()' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
FAIL: gdb.base/break-probes.exp: run til our library loads (GDB internal error)
This assertion was in the new function set_entry_pc and is asserting
that the default_entry_pc() value is within the block's start/end
range.
The default_entry_pc() is the value GDB will use as the entry-pc if
the DWARF doesn't specifically override the entry-pc. This value is
calculated as:
1. The start address of the first sub-range within the block, if the
block has more than 1 range, or
2. The low address (from DW_AT_low_pc) for the block.
If the block only has a single range then this means the block was
defined with low/high pc attributes (case #2 above). These low/high
pc values are what block::start() and block::end() return. This means
that by definition, if the block is contiguous, the above assert
cannot trigger, as 'start', the default_entry_pc() value, would be equivalent
to block::start().
This means that, for the assert to trigger, the block must have
multiple ranges, and the first address of the first range is not
within the block's low/high address range. This seems wrong.
I inspected the state at the time the assert triggered and discovered
the block's start() address. Then I removed the assert and restarted
GDB. I was now able to inspect the blocks at the offending address:
(gdb) maintenance info blocks 0x7ffff7dddaa4
Blocks at 0x7ffff7dddaa4:
from objfile: [(objfile *) 0x44a37f0] /lib64/ld-linux-x86-64.so.2
[(block *) 0x46b30c0] 0x7ffff7ddd5a0..0x7ffff7dde8a6
entry pc: 0x7ffff7ddd5a0
is global block
symbol count: 4
is contiguous
[(block *) 0x46b3020] 0x7ffff7ddd5a0..0x7ffff7dde8a6
entry pc: 0x7ffff7ddd5a0
is static block
symbol count: 9
is contiguous
[(block *) 0x46b2f70] 0x7ffff7ddda00..0x7ffff7dddac3
entry pc: 0x7ffff7ddda00
function: __GI__dl_find_dso_for_object
symbol count: 4
is contiguous
[(block *) 0x46b2e10] 0x7ffff7dddaa4..0x7ffff7dddac3
entry pc: 0x7ffff7dddaa4
inline function: __GI__dl_find_dso_for_object
symbol count: 5
is contiguous
[(block *) 0x46b2a40] 0x7ffff7dddaa4..0x7ffff7dddac3
entry pc: 0x7ffff7dddaa4
symbol count: 1
is contiguous
[(block *) 0x46b2970] 0x7ffff7dddaa4..0x7ffff7dddac3
entry pc: 0x7ffff7dddaa4
symbol count: 2
address ranges:
0x7ffff7ddda0e..0x7ffff7ddda77
0x7ffff7ddda90..0x7ffff7ddda96
I've left everything in for context, but the only really interesting
bit is the very last block; its low/high range is:
0x7ffff7dddaa4..0x7ffff7dddac3
but it has separate ranges:
0x7ffff7ddda0e..0x7ffff7ddda77
0x7ffff7ddda90..0x7ffff7ddda96
which are all outside the low/high range. This is what triggers the
assert. But why does that block exist at all?
What I believe is happening is that we're running into a bug in older
versions of GCC. The buildbot failure was with an 8.5 gcc, and Tom de
Vries also reported seeing failures when using version 7 and 8 gcc,
but not with gcc 9 and onward.
Looking at the DWARF I can see that the problematic block is created
from this DIE:
<4><15efb>: Abbrev Number: 83 (DW_TAG_lexical_block)
<15efc> DW_AT_abstract_origin: <0x15e9f>
<15efe> DW_AT_low_pc : 0x7ffff7dddaa4
<15f06> DW_AT_high_pc : 31
which links via DW_AT_abstract_origin to:
<2><15e9f>: Abbrev Number: 80 (DW_TAG_lexical_block)
<15ea0> DW_AT_ranges : 0x38e0
<15ea4> DW_AT_sibling : <0x15eca>
And so we can see that <15efb> has got both low/high pc attributes and
a ranges attribute.
If I widen my checking to parents of DIE <15efb> then I see that they
also have DW_AT_abstract_origin; however, there is something
interesting going on: the parent DIEs are linking to a different DIE
tree than <15efb>.
What I believe is happening is this: we have an abstract instance
tree, rooted at a DW_TAG_subprogram, which contains all the
blocks, variables, parameters, etc., that you would expect. As this is
an abstract instance, there are no low/high pc attributes, and no
ranges attributes in this tree. This makes sense.
Now elsewhere we have a DW_TAG_subprogram (not
DW_TAG_inlined_subroutine) which links via
DW_AT_abstract_origin to the abstract DW_TAG_subprogram. This case is
documented in the DWARF 5 spec in section 3.3.8.3, and describes an
Out-of-Line Instance of an Inlined Subroutine. Within this out of
line instance many of the DIEs correctly link back, using
DW_AT_abstract_origin to the abstract instance tree. This tree also
includes the DIE <15e9f>, which is where our problem DIE references.
Now, to really confuse things, within this out-of-line instance we
have a DW_TAG_inlined_subroutine, which is another instance of the
same abstract instance tree! This would seem to indicate a recursive
call to the inline function, and the compiler, for some reason, needed
to instantiate an out of line instance of this function.
And it is within this nested, inlined subroutine, that the problem DIE
exists. The problem DIE is referencing the corresponding DIE within
the out of line instance tree, but I am convinced this must be a (long
fixed) GCC bug, and that the problem DIE should be referencing the DIE
within the abstract instance tree.
I'm aware that the above is pretty confusing. The actual DWARF would
be around 200 lines long, so I'd like to avoid dumping it in here.
But here's my attempt at representing what's going on in a minimal
example. The numbers down the side represent the section offset, not
the nesting level, and I've removed any attributes that are not
relevant:
<1> DW_TAG_subprogram
<2> DW_TAG_lexical_block
<3> DW_TAG_subprogram
DW_AT_abstract_origin <1>
<4> DW_TAG_lexical_block
DW_AT_ranges ...
<5> DW_TAG_inlined_subroutine
DW_AT_abstract_origin <1>
<6> DW_TAG_lexical_block
DW_AT_abstract_origin <4>
DW_AT_low_pc ...
DW_AT_high_pc ...
The lexical block at <6> is linking to <4> when it should be linking
to <2>.
There is one additional thing that we might wonder about, which is,
when calculating the low/high pc range for a block, why does GDB not
make use of the range information and expand the range beyond the
defined low/high values?
The answer to this is in dwarf_get_pc_bounds_ranges_or_highlow_pc in
dwarf2/read.c. This is where the low/high bounds are calculated. What
we see is that GDB first checks for a low/high attribute pair, and if
that is present, this defines the address range for the block. Only
if there is no DW_AT_low_pc do we check for the DW_AT_ranges, and use
that to define the extent of the block. And this makes sense, section
3.5 of the DWARF-5 spec says:
The lexical block entry may have either a DW_AT_low_pc and DW_AT_high_pc
pair of attributes or a DW_AT_ranges attribute whose values encode the
contiguous or non-contiguous address ranges, respectively, of the machine
instructions generated for the lexical block...
Section 3.5 is specifically about lexical blocks, but the same
wording, about it being either low/high OR ranges is repeated for
other DW_TAG_ types.
So this explains why GDB doesn't use the ranges to expand the problem
block's ranges; as the first DIE has low/high addresses, these are
used, and the ranges are not consulted.
It is only later in dwarf2_record_block_ranges that we create a range
based off the low/high pc, and then also process the ranges data; this
allows the problem block to exist with ranges that are outside the
low/high range.
To solve this I considered a number of options:
1. Prevent loading certain attributes from an abstract instance.
Section 3.3.8.1 of the DWARF-5 spec talks about which attributes are
appropriate to place in an abstract instance. Any attribute that
might vary between instances should not appear in an abstract
instance. DW_AT_ranges is included as an example in the
non-exhaustive list of attributes that should not appear in an
abstract instance.
Currently in dwarf2_attr (dwarf2/read.c), when we see a
DW_AT_abstract_origin attribute, we always follow this to try and find
the attribute we are looking for. But we could change this function
so that we prevent this following for attributes that we know should
not be looked up in an abstract instance. This would solve the
problem in this case by preventing us finding the DW_AT_ranges in the
incorrect abstract instance.
2. Filter the ranges.
Having established a block's low/high address range in
dwarf_get_pc_bounds_ranges_or_highlow_pc, we could allow
dwarf2_record_block_ranges to parse the ranges, but we could reject
any range that extends outside the blocks defined start and end
addresses.
For well-behaved DWARF, where we have either low/high or ranges, the
block's start/end are defined from the range data, and so, by
definition, every range would be acceptable.
But in our problem case we would reject all of the invalid ranges.
This is my least favourite solution as it feels like rejecting the
ranges is tackling the problem too late on.
3. Don't try to parse ranges when we have low/high attributes.
This option involves updating dwarf2_record_block_ranges to match the
behaviour of dwarf_get_pc_bounds_ranges_or_highlow_pc, and, I believe,
to match the DWARF spec: don't try to read range data from
DW_AT_ranges if we have low/high pc attributes.
In our case this solves the issue because the problematic DIE has the
low/high attributes, and it then links to the wrong DIE which happens
to have DW_AT_ranges. With this change in place we don't even look
for the DW_AT_ranges.
If the problem were reversed, and the initial DIE had DW_AT_ranges,
but the incorrectly referenced DIE had the low/high pc attributes,
we would pick up the wrong addresses, but this wouldn't trigger any
asserts. The reason is that dwarf_get_pc_bounds_ranges_or_highlow_pc
would also find the low/high addresses from the incorrectly referenced
DIE, and so we would just end up with a block which had the wrong
address ranges, but the block would be self consistent, which is
different to the problem we hit here.
In the end, in this commit I went with solution #3, having
dwarf_get_pc_bounds_ranges_or_highlow_pc and
dwarf2_record_block_ranges be consistent seems sensible. However, I
do wonder if in the future we might want to explore solution #1 as an
additional safety feature.
With this patch in place I'm able to run the gdb.base/break-probes.exp test
without seeing the assert that CI testing highlighted. I see no
regressions when testing on x86-64 GNU/Linux with gcc 9.3.1.
Note: the diff in this commit looks big, but it's really just me
indenting the code.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
A failure of 'runto_main' in 'start_structs_test' results in a TCL
error. The return value of the 'start_structs_test' function is evaluated
inside an if conditional clause, which expects a boolean value. Return
'-1' on failure to avoid the error.
Reviewed-By: Keith Seitz <keiths@redhat.com>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
In commit 922ab963e1c ("[gdb/python] Handle empty PYTHONDONTWRITEBYTECODE") I
added a test in gdb.python/py-startup-opt.exp that checks the
"show python dont-write-bytecode" output.
Then in commit 348290c7ef4 ("[gdb/python] Warn and ignore ineffective python
settings") I changed the output of "show python dont-write-bytecode" after
python initialization.
I tested these changes individually and found no problems, but after
committing both, the test started failing, which the Linaro CI reported.
Fix this by updating the expected output.
While we're at it, make the test a bit more generic by testing
"show python $setting" in all cases.
Tested on x86_64-linux, using:
- PYTHONDONTWRITEBYTECODE=
- PYTHONDONTWRITEBYTECODE=1
- unset PYTHONDONTWRITEBYTECODE
|
|
This changes the "maint print reggroups" command to use a ui-out table
rather than printf.
It also fixes a typo I noticed in a related test case name; and lets
us finally remove the leading \s from the regexp in completion.exp.
Reviewed-by: Christina Schimpe <christina.schimpe@intel.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
This changes various "maint print" register commands to use ui-out
tables rather than the current printf approach.
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
With test-case gdb.arch/pr25124.exp, I run into:
...
PASS: gdb.arch/pr25124.exp: disassemble thumb instruction (1st try)
PASS: gdb.arch/pr25124.exp: disassemble thumb instruction (2nd try)
DUPLICATE: gdb.arch/pr25124.exp: disassemble thumb instruction (2nd try)
...
Fix this by using a comma instead of parentheses.
Tested on arm-linux.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
When using PYTHONDONTWRITEBYTECODE with an empty string we get:
...
$ PYTHONDONTWRITEBYTECODE= gdb -q -batch -ex "show python dont-write-bytecode"
Python's dont-write-bytecode setting is auto (currently on).
...
This is incorrect; it should be off.
The actual setting is correct, that was already fixed in commit 24d2cbc42cc
("set/show python dont-write-bytecode fixes"), in function
python_write_bytecode.
Fix this by:
- factoring out new function env_python_dont_write_bytecode out of
python_write_bytecode, and
- using it in show_python_dont_write_bytecode.
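For reference, the semantics the helper has to implement (the variable only
takes effect when set to a non-empty string) come down to something like this
Python sketch; the real helper is C++ code in GDB:
...
import os

def env_python_dont_write_bytecode():
    val = os.environ.get("PYTHONDONTWRITEBYTECODE")
    # Only a non-empty value disables writing bytecode.
    return val is not None and val != ""
...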
Tested on x86_64-linux, using test-case gdb.python/py-startup-opt.exp and:
- PYTHONDONTWRITEBYTECODE=
- PYTHONDONTWRITEBYTECODE=1
- unset PYTHONDONTWRITEBYTECODE
Approved-By: Tom Tromey <tom@tromey.com>
PR python/32389
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32389
|
|
When running test-case gdb.python/py-startup-opt.exp with empty
PYTHONDONTWRITEBYTECODE:
...
$ cd build/gdb/testsuite
$ PYTHONDONTWRITEBYTECODE= make check \
RUNTESTFLAGS=gdb.python/py-startup-opt.exp
...
I get:
...
end^M
dont_write_bytecode is off^M
(gdb) FAIL: $exp: attr=dont_write_bytecode: testname: input 6: end
...
The problem is that the test-case expects dont_write_bytecode to be
on, which is incorrect because PYTHONDONTWRITEBYTECODE only has effect if set
to a non-empty string [1].
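The following standalone snippet (independent of GDB) demonstrates that
behaviour; with an empty value, sys.dont_write_bytecode stays False
(assuming python isn't also run with -B):
...
import os
import sys

# PYTHONDONTWRITEBYTECODE only takes effect when set to a non-empty
# string; an empty value behaves as if the variable were unset.
print("env value:", repr(os.environ.get("PYTHONDONTWRITEBYTECODE")))
print("sys.dont_write_bytecode:", sys.dont_write_bytecode)
...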
Fix this by correctly setting expectations in the test-case.
Tested on x86_64-linux, with:
- PYTHONDONTWRITEBYTECODE=
- PYTHONDONTWRITEBYTECODE=1
- unset PYTHONDONTWRITEBYTECODE
Approved-By: Tom Tromey <tom@tromey.com>
[1] https://docs.python.org/3/using/cmdline.html#envvar-PYTHONDONTWRITEBYTECODE
|
|
The test gdb.reverse/i386-avx-reverse.exp assumed that any x86-like CPU
would have AVX instructions, because I didn't know how to check for AVX
instruction support explicitly. This commit updates the test to use the
pre-existing TCL proc have_avx.
Also update the comment at the top of the test, since it was copied from
a different test.
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
When building gdb with --with-expat=no and running test-case
gdb.base/reset-catchpoint-cond.exp we get:
...
(gdb) catch syscall write^M
warning: Can not parse XML syscalls information; \
XML support was disabled at compile time.^M
Unknown syscall name 'write'.^M
(gdb) FAIL: $exp: mode=syscall: catch syscall write
...
Fix this by skipping the test for --with-expat=no.
Tested on x86_64-linux.
|
|
When building gdb with --disable-tui, we run into:
...
(gdb) python print(type(gdb.TuiWindow))^M
Python Exception <class 'AttributeError'>: \
module 'gdb' has no attribute 'TuiWindow'^M
Error occurred in Python: module 'gdb' has no attribute 'TuiWindow'^M
(gdb) FAIL: gdb.python/python.exp: gdb.TuiWindow is registered
...
Fix this by skipping the test for --disable-tui.
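For reference, a Python script can probe for TUI support at run time
instead of assuming the attribute exists; a minimal sketch (run inside
GDB):
...
import gdb

# With --disable-tui the gdb module has no TuiWindow attribute, so probe
# with hasattr instead of touching it directly.
if hasattr(gdb, "TuiWindow"):
    print("TUI support available:", gdb.TuiWindow)
else:
    print("this GDB was built without TUI support")
...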
Tested on x86_64-linux.
|
|
The test gdb.cp/step-and-next-inline.exp creates a test binary called
step-and-next-inline-no-header. This test includes a function
`tree_check` which is inlined 3 times.
When testing with some older versions of gcc (I've tried 8.4.0, 9.3.1)
we see the following DWARF representing one of the inline instances of
tree_check:
<2><8d9>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
<8da> DW_AT_abstract_origin: <0x9ee>
<8de> DW_AT_entry_pc : 0x401165
<8e6> DW_AT_GNU_entry_view: 0
<8e7> DW_AT_ranges : 0x30
<8eb> DW_AT_call_file : 1
<8ec> DW_AT_call_line : 52
<8ed> DW_AT_call_column : 10
<8ee> DW_AT_sibling : <0x92d>
...
<1><9ee>: Abbrev Number: 46 (DW_TAG_subprogram)
<9ef> DW_AT_external : 1
<9ef> DW_AT_name : (indirect string, offset: 0xe8): tree_check
<9f3> DW_AT_decl_file : 1
<9f4> DW_AT_decl_line : 38
<9f5> DW_AT_decl_column : 1
<9f6> DW_AT_linkage_name: (indirect string, offset: 0x2f2): _Z10tree_checkP4treei
<9fa> DW_AT_type : <0x9e8>
<9fe> DW_AT_inline : 3 (declared as inline and inlined)
<9ff> DW_AT_sibling : <0xa22>
...
Contents of the .debug_ranges section:
Offset Begin End
...
00000030 0000000000401165 0000000000401165 (start == end)
00000030 0000000000401169 0000000000401173
00000030 0000000000401040 0000000000401045
00000030 <End of list>
...
Notice that one of the sub-ranges of tree_check is empty; this is the
line marked 'start == end'. As the end address is the first address
after the range, this range covers absolutely no code.
But notice too that the DW_AT_entry_pc for the inline instance points
at this empty range.
Further, notice that despite the ordering of the sub-ranges, the empty
range actually lies in the middle of the region spanning from the lowest
address to the highest address. The ordering is not a problem; the
DWARF spec doesn't require that ranges be in any particular order.
However, this empty range is causing issues with GDB's newly acquired
DW_AT_entry_pc support.
GDB already rejects empty sub-ranges, and has done for a long time;
after all, the DWARF spec is clear that such a range covers no code.
The recent DW_AT_entry_pc patch also had GDB reject an entry-pc which
was outside of the low/high bounds of a block.
But in this case, the entry-pc value is within the bounds of a block;
it's just not within any useful sub-range. As a consequence, GDB stores
the entry-pc value and makes use of it, but when GDB stops and tries to
work out which block the inferior is in, it fails to spot that the
inferior is within tree_check, and instead reports the function into
which tree_check was inlined.
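For illustration, this is roughly how the enclosing blocks can be
inspected from GDB's Python API once the inferior has stopped (just a
sketch, not part of the fix; with the bug, the inlined tree_check block
is missing from this walk):
...
import gdb

# Walk outward from the innermost block containing the current PC and
# print every enclosing (possibly inlined) function GDB knows about.
frame = gdb.selected_frame()
block = gdb.block_for_pc(frame.pc())
while block is not None:
    if block.function is not None:
        print("enclosing function:", block.function.print_name)
    block = block.superblock
...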
I've tested with newer versions of gcc (12.2.0 and 14.2.0); with these
versions gcc still generates the empty sub-range, but now the empty
sub-range is no longer the entry point. Here's the corresponding ranges
table from gcc 14.2.0:
Contents of the .debug_rnglists section:
Table at Offset: 0:
Length: 0x56
DWARF version: 5
Address size: 8
Segment size: 0
Offset entries: 0
Offset Begin End
...
00000021 0000000000401165 000000000040116f
0000002b 0000000000401040 (base address)
00000034 0000000000401040 0000000000401040 (start == end)
00000037 0000000000401041 0000000000401046
0000003a <End of list>
...
The DW_AT_entry_pc is 0x401165, but this is not the empty sub-range;
as a result, when GDB stops at the entry-pc, it will correctly spot
that the inferior is in the tree_check function.
The fix I propose here is, instead of rejecting entry-pc values that
are outside the block's low/high range, to reject entry-pc values that
are not inside any of the block's sub-ranges.
Now, GDB will ignore the prescribed entry-pc, and will instead select
a suitable default entry-pc based on either the block's low-pc value,
or the first address of the first range.
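To make the rule concrete, here is a small Python model of the check
described above (purely illustrative, not GDB's actual C++ code); the
addresses are taken from the gcc 8/9 ranges table shown earlier:
...
# Accept a DW_AT_entry_pc only if it falls inside one of the block's
# sub-ranges; an empty sub-range (start == end) can never contain it.
def entry_pc_is_usable(entry_pc, ranges):
    return any(start <= entry_pc < end for start, end in ranges)

ranges = [(0x401165, 0x401165),   # empty sub-range, start == end
          (0x401169, 0x401173),
          (0x401040, 0x401045)]

# 0x401165 is the prescribed entry-pc, but it only falls in the empty
# sub-range, so GDB now falls back to a default entry-pc instead.
print(entry_pc_is_usable(0x401165, ranges))  # False
print(entry_pc_is_usable(0x401169, ranges))  # True
...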
I have extended the gdb.cp/step-and-next-inline.exp test to check this
case, but this does depend on the compiler version being used (newer
compilers will always pass, even without the fix).
So I have also added a DWARF assembler test to cover this case.
Reviewed-By: Kevin Buettner <kevinb@redhat.com>
|
|
Add missing return statements in
* gdb.threads/process-exit-status-is-leader-exit-status.c
* gdb.threads/next-fork-exec-other-thread.c
to fix 'no return statement' compiler warnings, e.g.:
process-exit-status-is-leader-exit-status.c: In function ‘start’:
process-exit-status-is-leader-exit-status.c:46:1: warning: no return
statement in function returning non-void [-Wreturn-type]
46 | }
| ^
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Since 2020 it has been reported to clang[1] that the debug information
around OpenMP is insufficient. The OpenMP section is not declared
within the correct scope; instead clang marks it as if the section were
a function in the global scope. This causes several failures in the
test gdb.threads/omp-par-scope.exp when using clang to test GDB.
Since this isn't a true failure of GDB, and there is little expectation
that clang will be able to fix this soon, this commit disables the
aforementioned test when clang is being used.
[1] https://github.com/llvm/llvm-project/issues/44236
Approved-by: Kevin Buettner <kevinb@redhat.com>
|
|
Add a regression test for PR symtab/32225.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32225
|
|
Intel has EOL'ed the Nios II architecture, and it's time to remove support
from all toolchain components before it gets any more bit-rotten from
lack of maintenance or regular testing.
|
|
This converts gdb_bfd.c to use the new hash table for all_bfds.
This patch slightly changes the htab_t pretty-printer test, which was
relying on all_bfds. Note that with the new hash table, gdb-specific
printers aren't needed; the libstdc++ printers suffice -- in fact,
they are better, because the true types of the contents are available.
Change-Id: I48b7bd142085287b34bdef8b6db5587581f94280
Co-Authored-By: Tom Tromey <tom@tromey.com>
Approved-By: Tom Tromey <tom@tromey.com>
|