Age | Commit message (Collapse) | Author | Files | Lines |
|
On aarch64-linux with a gdb build without libexpat, I run into:
...
(gdb) PASS: gdb.base/catch-syscall.exp: determine pipe syscall: \
catch syscall 59
continue
Continuing.
Catchpoint 5 (call to syscall 59), 0x0000fffff7e04578 in pipe () from \
/lib64/libc.so.6
(gdb) FAIL: gdb.base/catch-syscall.exp: determine pipe syscall: continue
...
In the test-case, this pattern handles either the syscall name or number for
the pipe syscall:
...
-re -wrap "Catchpoint $decimal \\(call to syscall (pipe|$SYS_pipe)\\).*" {
...
but the pattern for the pipe2 syscall mistakenly uses SYS_pipe instead of
SYS_pipe2:
...
-re -wrap "Catchpoint $decimal \\(call to syscall (pipe2|$SYS_pipe)\\).*" {
...
and consequently doesn't handle the pipe2 syscall number.
Fix the typo by using SYS_pipe2 instead.
Tested on aarch64-linux.
|
|
When running test-case gdb.base/gdb-index-err.exp in a container as root user,
I run into:
...
FAIL: gdb.base/gdb-index-err.exp: flag=: \
try to write index to a non-writable directory
FAIL: gdb.base/gdb-index-err.exp: flag=-dwarf-5: \
try to write index to a non-writable directory
...
The test-case creates a directory without write permissions:
...
$ ls -ald private
dr-xr-xr-x 2 root root 4096 Dec 29 06:26 private/
...
but apparently the root user is still able to write in it.
Fix this by making the test unsupported for the root user.
Tested on x86_64-linux.
Reviewed-By: Lancelot SIX <lancelot.six@amd.com>
PR testsuite/31197
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31197
|
|
While making recent changes to 'save gdb-index' command I triggered
some errors -- of the kind a user might be expected to trigger if they
do something wrong -- and I didn't find GDB's output as helpful as it
might be.
For example:
$ gdb -q /tmp/hello.x
...
(gdb) save gdb-index /non_existing_dir
Error while writing index for `/tmp/hello': mkstemp: No such file or directory.
That the error message mentions '/tmp/hello', which does exist, but
doesn't mention '/non_existing_dir', which doesn't is, I think,
confusing.
Also, I find the 'mkstemp' in the error message confusing for a user
facing error. A user might not know what mkstemp means, and even if
they do, that it appears in the error message is an internal GDB
detail. The user doesn't care what function failed, but wants to know
what was wrong with their input, and what they should do to fix
things.
Similarly, for a directory that does exist, but can't be written to:
(gdb) save gdb-index /no_access_dir
Error while writing index for `/tmp/hello': mkstemp: Permission denied.
In this case, the 'Permission denied' might make the user thing there
is a permissions issue with '/tmp/hello', which is not the case.
After this patch, the new errors are:
(gdb) save gdb-index /non_existing_dir
Error while writing index for `/tmp/hello': `/non_existing_dir': No such file or directory.
and:
(gdb) save gdb-index /no_access_dir
Error while writing index for `/tmp/hello': `/no_access_dir': Permission denied.
we also have:
(gdb) save gdb-index /tmp/not_a_directory
Error while writing index for `/tmp/hello': `/tmp/not_a_directory': Is not a directory.
I think these do a better job of guiding the user towards fixing the
problem.
I've added a new test that exercises all of these cases, and also
checks the case where a user tries to use an executable that already
contains an index in order to generate an index. As part of the new
test I've factored out some code from ensure_gdb_index (lib/gdb.exp)
into a new proc (get_index_type), which I've then used in the new
test. I've confirmed that all the tests that use ensure_gdb_index
still pass.
During review it was pointed out that the testsuite proc
have_index (lib/gdb.exp) is similar to the new get_index_type proc, so
I've rewritten have_index to also use get_index_type, I've confirmed
that all the tests that use have_index still pass.
Nothing that worked correctly before this patch should give an error
after this patch; I've only changed the output when the user was going
to get an error anyway.
Reviewed-By: Tom de Vries <tdevries@suse.de>
Reviewed-By: Tom Tromey <tom@tromey.com>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
When using $_thread in info threads to showonly the current thread,
you get this error:
```
(gdb) info thread $_thread
Convenience variable must have integer value.
Args must be numbers or '$' variables.
```
It's because $_thread is a dynamically computed convenience
variable, which isn't supported yet by get_internalvar_integer.
Now the output looks like this:
```
(gdb) info threads $_thread
Id Target Id Frame
* 1 Thread 10640.0x2680 main () at C:/src/repos/binutils-gdb.git/gdb/testsuite/gdb.base/gdbvars.c:21
```
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17600
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Following on from the previous commit, I searched the testsuite for
places where we did:
set eol "<some pattern>"
in most cases the <some pattern> could be replaced with "\r\n" though
in the stabs test I've switched to using the multi_line proc as that
seemed like a better choice.
In gdb.ada/info_types.exp I did need to add an extra use of $eol as
the previous pattern would match multiple newlines, and in this one
place we were actually expecting to match multiple newlines. The
tighter pattern only matches a single newline, so we now need to be
explicit when multiple newlines are expected -- I think this is a good
thing.
All the tests are still passing for me after these changes.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
The test gdb.base/watchpoint.exp has a proc named 'test_stepping'
which claims to "Test stepping and other mundane operations with
watchpoints enabled". It sets a watchpoint on ival2, performs an
inferior function call (which is not at all mundane), and uses 'next',
'until', and, finally, does a 'step'.
However, that final 'step' command steps to (but not over/through) the
line at which the assignment to ival2 takes place. At no time while
performing these operations is a watchpoint hit.
This commit adds a test to see what happens when stepping over/through
the assignment to ival2. The watchpoint on ival2 should be triggered
during this step. I've added another 'step' to make sure that the
correct statement is reached after performing the watchpoint-hitting
step.
After running the 'test_stepping' proc, gdb.base/watchpoint.exp does
a clean_restart before doing further tests, so nothing depends upon
'test_stepping' to stop at the particular statement at which it had
been stopping.
I've examined all tests which set watchpoints and step. I haven't
been able to identify a(nother) test case which tests what happens
when stepping over/through a statement which triggers a watchpoint.
Therefore, adding these new 'step' tests is testing something which
hasn't being tested elsewhere.
Reviewed-By: John Baldwin <jhb@FreeBSD.org>
|
|
This commit enables the early initialization commands (92e4e97a9f5) to
modify the number of threads used by gdb's thread pool.
The motivation here is to prevent gdb from spawning a detrimental number
of threads on many-core systems under environments with restrictive
ulimits.
With gdb before this commit, the thread pool takes the following sizes:
1. Thread pool size is initialized to 0.
2. After the maintenance commands are defined, the thread pool size is
set to the number of system cores (if it has not already been set).
3. Using early initialization commands, the thread pool size can be
changed using "maint set worker-threads".
4. After the first prompt, the thread pool size can be changed as in the
previous step.
Therefore after step 2. gdb has potentially launched hundreds of threads
on a many-core system.
After this change, step 2 and 3 are reversed so there is an opportunity
to set the required number of threads without needing to default to the
number of system cores first.
There does exist a configure option (added in 261b07488b9) to disable
multithreading, but this does not allow for an already deployed gdb to
be configured.
Additionally, the default number of worker threads is clamped at eight
to control the number of worker threads spawned on many-core systems.
This value was chosen as testing recorded on bugzilla issue 29959
indicates that parallel efficiency drops past this point.
GDB built with GCC 13.
No test suite regressions detected. Compilers: GCC, ACfL, Intel, Intel
LLVM, NVHPC; Platforms: x86_64, aarch64.
The scenario that interests me the most involves preventing GDB from
spawning any worker threads at all. This was tested by counting the
number of clones observed by strace:
strace -e clone,clone3 gdb/gdb -q \
--early-init-eval-command="maint set worker-threads 0" \
-ex q ./gdb/gdb |& grep --count clone
The new test relies on "gdb: install CLI uiout while processing early
init files" developed by Andrew Burgess. This patch will need pushing
prior to this change.
The clamping was tested on machines with both 16 cores and a single
core. "maint show worker-threads" correctly reported eight and one
respectively.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
When using GDB on native linux, it can happen that, while attempting
to detach an inferior, the inferior may have been exited or have been
killed, yet still be in the list of lwps. Should that happen, the
assert in x86_linux_update_debug_registers in
gdb/nat/x86-linux-dregs.c will trigger. The line in question looks
like this:
gdb_assert (lwp_is_stopped (lwp));
For this case, the lwp isn't stopped - it's dead.
The bug which brought this problem to my attention is one in which the
pwntools library uses GDB to to debug a process; as the script is
shutting things down, it kills the process that GDB is debugging and
also sends GDB a SIGTERM signal, which causes GDB to detach all
inferiors prior to exiting. Here's a link to the bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2192169
The following shell command mimics part of what the pwntools
reproducer script does (with regard to shutting things down), but
reproduces the bug much less reliably. I have found it necessary to
run the command a bunch of times before seeing the bug. (I usually
see it within 5-10 repetitions.) If you choose to try this command,
make sure that you have no running "cat" or "gdb" processes first!
cat </dev/zero >/dev/null & \
(sleep 5; (kill -KILL `pgrep cat` & kill -TERM `pgrep gdb`)) & \
sleep 1 ; \
gdb -q -iex 'set debuginfod enabled off' -ex 'set height 0' \
-ex c /usr/bin/cat `pgrep cat`
So, basically, the idea here is to kill both gdb and cat at roughly
the same time. If we happen to attempt the detach before the process
lwp has been deleted from GDB's (linux native) LWP data structures,
then the assert will trigger. The relevant part of the backtrace
looks like this:
#8 0x00000000008a83ae in x86_linux_update_debug_registers (lwp=0x1873280)
at gdb/nat/x86-linux-dregs.c:146
#9 0x00000000008a862f in x86_linux_prepare_to_resume (lwp=0x1873280)
at gdb/nat/x86-linux.c:81
#10 0x000000000048ea42 in x86_linux_nat_target::low_prepare_to_resume (
this=0x121eee0 <the_amd64_linux_nat_target>, lwp=0x1873280)
at gdb/x86-linux-nat.h:70
#11 0x000000000081a452 in detach_one_lwp (lp=0x1873280, signo_p=0x7fff8ca3441c)
at gdb/linux-nat.c:1374
#12 0x000000000081a85f in linux_nat_target::detach (
this=0x121eee0 <the_amd64_linux_nat_target>, inf=0x16e8f70, from_tty=0)
at gdb/linux-nat.c:1450
#13 0x000000000083a23b in thread_db_target::detach (
this=0x1206ae0 <the_thread_db_target>, inf=0x16e8f70, from_tty=0)
at gdb/linux-thread-db.c:1385
#14 0x0000000000a66722 in target_detach (inf=0x16e8f70, from_tty=0)
at gdb/target.c:2526
#15 0x0000000000a8f0ad in kill_or_detach (inf=0x16e8f70, from_tty=0)
at gdb/top.c:1659
#16 0x0000000000a8f4fa in quit_force (exit_arg=0x0, from_tty=0)
at gdb/top.c:1762
#17 0x000000000070829c in async_sigterm_handler (arg=0x0)
at gdb/event-top.c:1141
My colleague, Andrew Burgess, has done some recent work on other
problems with detach. Upon hearing of this problem, he came up a test
case which reliably reproduces the problem and tests for a few other
problems as well. In addition to testing detach when the inferior has
terminated due to a signal, it also tests detach when the inferior has
exited normally. Andrew observed that the linux-native-only
"checkpoint" command would be affected too, so the test also tests
those cases when there's an active checkpoint.
For the LWP exit / termination case with no checkpoint, that's handled
via newly added checks of the waitstatus in detach_one_lwp in
linux-nat.c.
For the checkpoint detach problem, I chose to pass the lwp_info
to linux_fork_detach in linux-fork.c. With that in place, suitable
tests were added before attempting a PTRACE_DETACH operation.
I added a few asserts at the beginning of linux_fork_detach and
modified the caller code so that the newly added asserts shouldn't
trigger. (That's what the 'pid == inferior_ptid.pid' check is about
in gdb/linux-nat.c.)
Lastly, I'll note that the checkpoint code needs some work with regard
to background execution. This patch doesn't attempt to fix that
problem, but it doesn't make it any worse. It does slightly improve
the situation with detach because, due to the check noted above,
linux_fork_detach() won't be called for the wrong inferior when there
are multiple inferiors. (There are at least two other problems with
the checkpoint code when there are multiple inferiors. See:
https://sourceware.org/bugzilla/show_bug.cgi?id=31065)
This commit also adds a new test,
gdb.base/process-dies-while-detaching.exp. Andrew Burgess is the
primary author of this test case. Its design is similar to that of
gdb.threads/main-thread-exit-during-detach.exp, which was also written
by Andrew.
This test checks that GDB correctly handles several cases that can
occur when GDB attempts to detach an inferior process. The process
can exit or be terminated (e.g. via SIGKILL) prior to GDB's event
loop getting a chance to remove it from GDB's internal data
structures. To complicate things even more, detach works differently
when a checkpoint (created via GDB's "checkpoint" command) exists for
the inferior. This test checks all four possibilities: process exit
with no checkpoint, process termination with no checkpoint, process
exit with a checkpoint, and process termination with a checkpoint.
Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
On Linux, threads are treated much like separate processes by the
kernel. In particular, it's possible to ptrace just a single thread.
If gdb tries to attach to a multi-threaded inferior, where a non-main
thread is already being traced (e.g., by strace), then gdb will get
into an infinite loop attempting to attach.
This patch fixes this problem by having the attach fail if ptrace
fails to attach to any thread of the inferior.
|
|
On pinebook I ran into:
...
Running gdb.tui/tui-layout-asm-short-prog.exp ...
gdb compile failed, gdb.tui/tui-layout-asm-short-prog.S: Assembler messages:
gdb.tui/tui-layout-asm-short-prog.S:23: Error: \
junk at end of line, first unrecognized character is `,'
...
Fix this by using %progbits instead of @progbits for arm.
Approved-by: Luis Machado <luis.machado@arm.com>
Tested on x86_64-linux and pinebook.
|
|
The C++ type-printing code had its own variant of the accessibility
enum. This patch removes this and changes the code to use the new one
from gdbtypes.h.
This patch also changes the C++ code to recognize the default
accessibility of a class. This makes ptype a bit more C++-like, and
lets us remove a chunk of questionable code.
Acked-By: Simon Marchi <simon.marchi@efficios.com>
Reviewed-by: Keith Seitz <keiths@redhat.com>
|
|
Commit 59a561480d5 ("Fix spurious FAILs with examine-backward.exp") describes
the problem that:
...
The test case examine-backward.exp issues the command "x/-s" after the end
of the first string in TestStrings, but without making sure that this
string is preceded by a string terminator. Thus GDB may spuriously print
some random characters from before that string, and then the test fails.
...
The commit fixes the problem by adding a Barrier variable before the TestStrings
variable:
...
+const char Barrier[] = { 0x0 };
const char TestStrings[] = {
...
There is however no guarantee that Barrier is placed immediately before
TestStrings.
Before recent commit 169fe7ab54b ("Change gdb.base/examine-backwards.exp for
AIX.") on x86_64-linux, I see:
...
0000000000400660 R Barrier
0000000000400680 R TestStrings
...
So while the Barrier variable is the first before the TestStrings variable,
it's not immediately preceding TestStrings.
After commit 169fe7ab54b:
...
0000000000402259 B Barrier
0000000000402020 D TestStrings
...
they're not even in the same section anymore.
Fix this reliably by adding the zero in the array itself:
...
char TestStringsBase[] = {
0x0,
...
};
char *TestStrings = &TestStringsBase[1];
...
and do likewise for TestStringsH and TestStringsW.
Tested on x86_64-linux.
PR testsuite/31064
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31064
|
|
Now that gdb/19675 is fixed for both native and gdbserver GNU/Linux,
remove the gdb/19675 kfails.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I95c1c38ca370100675d303cd3c8995860bef465d
|
|
Add a new command completer function for the disassemble command.
There are two things that this completion function changes. First,
after the previous commit, the new function calls skip_over_slash_fmt,
which means that hitting tab after entering a /OPT flag now inserts a
space ready to start typing the address to disassemble at:
(gdb) disassemble /r<TAB>
(gdb) disassemble /r <CURSOR>
But also, we now get symbol completion after a /OPT option set,
previously this would do nothing:
(gdb) disassemble /r mai<TAB>
But now:
(gdb) disassemble /r mai<TAB>
(gdb) disassemble /r main <CURSOR>
Which was my main motivation for working on this commit.
However, I have made a second change in the completion function.
Currently, the disassemble command calls the generic
location_completer function, however, the disassemble docs say:
Note that the 'disassemble' command's address arguments are specified
using expressions in your programming language (*note Expressions:
Expressions.), not location specs (*note Location Specifications::).
So, for example, if you want to disassemble function 'bar' in file
'foo.c', you must type 'disassemble 'foo.c'::bar' and not 'disassemble
foo.c:bar'.
And indeed, if I try:
(gdb) disassemble hello.c:main
No symbol "hello" in current context.
(gdb) disassemble hello.c::main
No symbol "hello" in current context.
(gdb) disassemble 'hello.c'::main
Dump of assembler code for function main:
... snip ...
But, if I do this:
(gdb) disassemble hell<TAB>
(gdb) disassemble hello.c:<CURSOR>
which is a consequence of using the location_completer function. So
in this commit, after calling skip_over_slash_fmt, I forward the bulk
of the disassemble command completion to expression_completer. Now
when I try this:
(gdb) disassemble hell<TAB>
gives nothing, which I think is an improvement. There is one slight
disappointment, if I do:
(gdb) disassemble 'hell<TAB>
I still get nothing. I had hoped that this would expand to:
'hello.c':: but I guess this is a limitation of the current
expression_completer implementation, however, I don't think this is a
regression, the previous expansion was just wrong. Fixing
expression_completer is out of scope for this commit.
I've added some disassembler command completion tests, and also a test
that disassembling using 'FILE'::FUNC syntax works, as I don't think
that is tested anywhere.
|
|
In AIX unused or constant variables are collected as garbage by the linker and in the dwarf dump
an address with all f's in hexadecimal are assigned. Hence the testcase fails with many failures stating
it cannot access memory.
This patch is a small change to get it working in AIX as well.
|
|
When printing a value, I think the history reference -- the "$1" in
the output -- should be styled using the "variable" style. This patch
implements this.
|
|
When running test-case gdb.base/jit-bfd-name.exp, I run into:
...
ERROR: tcl error sourcing gdb/testsuite/gdb.base/jit-bfd-name.exp.
ERROR: can't read "start": no such variable
...
The problem is that commit c96ceed9dce ("gdb: include the end address in
in-memory bfd filenames") introduced a use of variable start, but not a
definition.
Fix this by adding the missing definition.
Tested on x86_64-linux.
|
|
Commit
66984afd29e gdb: include the base address in in-memory bfd filenames
added the base address to in-memory bfd filenames. Also add the end
address to allow dumping the in-memory bfd using the 'dump memory'
command.
|
|
Since commit 1e5ccb9c5ff4fd8ade4a8694676f99f4abf2d679, we have an assertion in
displaced_step_buffers::copy_insn_closure_by_addr that makes sure a closure
is available whenever we have a match between the provided address argument and
the buffer address.
That is fine, but the report in PR30872 shows this assertion triggering when
it really shouldn't. After some investigation, here's what I found out.
The 32-bit Arm architecture is the only one that calls
gdbarch_displaced_step_copy_insn_closure_by_addr directly, and that's because
32-bit Arm needs to figure out the thumb state of the original instruction
that we displaced-stepped through the displaced-step buffer.
Before the assertion was put in place by commit
1e5ccb9c5ff4fd8ade4a8694676f99f4abf2d679, there was the possibility of
getting nullptr back, which meant we were not doing a displaced-stepping
operation.
Now, with the assertion in place, this is running into issues.
It looks like displaced_step_buffers::copy_insn_closure_by_addr is
being used to return a couple different answers depending on the
state we're in:
1 - If we are actively displaced-stepping, then copy_insn_closure_by_addr
is supposed to return a valid closure for us, so we can determine the
thumb mode.
2 - If we are not actively displaced-stepping, then copy_insn_closure_by_addr
should return nullptr to signal that there isn't any displaced-step buffers
in use, because we don't have a valid closure (but we should always have
this).
Since the displaced-step buffers are always allocated, but not always used,
that means the buffers will always contain data. In particular, the buffer
addr field cannot be used to determine if the buffer is active or not.
For instance, we cannot set the buffer addr field to 0x0, as that can be a
valid PC in some cases.
My understanding is that the current_thread field should be a good candidate
to signal that a particular displaced-step buffer is active or not. If it is
nullptr, we have no threads using that buffer to displaced-step. Otherwise,
it is an active buffer in use by a particular thread.
The following fix modifies the displaced_step_buffers::copy_insn_closure_by_addr
function so we only attempt to return a closure if the buffer has an assigned
current_thread and if the buffer address matches the address argument.
Alternatively, I think we could use a function to answer the question of
whether we're actively displaced-stepping (so we have an active buffer) or
not.
I've also added a testcase that exercises the problem. It should reproduce
reliably on Arm, as that is the only architecture that faces this problem
at the moment.
Regression-tested on Ubuntu 20.04. OK?
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30872
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This thread:
https://inbox.sourceware.org/gdb-patches/20231003195338.334948-1-thiago.bauermann@linaro.org/
pointed out that within gdb.base/maint.exp, some regexps within a
gdb_test_multiple were failing to match a complete line, while later
regexps within the gdb_test_multiple made use of the '^' anchor, and
so assumed that earlier lines had been completely matched and removed
from expect's buffer.
When testing with READ1 set this assumption was failing.
Fix this by extending the offending patterns with a trailing '\r\n'.
Tested-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
|
|
Now that the GDB 14 branch has been created,
this commit bumps the version number in gdb/version.in to
15.0.50.DATE-git
For the record, the GDB 14 branch was created
from commit 8f12a1a841cd0c447de7a5a0f134a0efece73588.
Also, as a result of the version bump, the following changes
have been made in gdb/testsuite:
* gdb.base/default.exp: Change $_gdb_major to 15.
|
|
The last few commits resolved the KFAILs in gdb.base/args.exp. With
those out of the way we can clean up this test script a little.
In this commit I have:
- Stopped passing 'nowarnings' flag when building the source file.
I see no reason why this source should issue a warning,
- Moved setup of GDBFLAGS into args_test proc, callers that passed a
newline needed a small tweak, and also the matching code needs
updating for newline handling, but I think this is nicer, the
argument lists are now given just once,
- Updated comment on args_test,
- Updated other comments.
There should be no change in what is tested after this commit.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Similarly to how single quotes were mishandled, which was fixed two
commits ago, this commit fixes handling of newlines in arguments
passed to gdbserver.
We already had a test that covered this, gdb.base/args.exp, which,
when run with the native-extended-gdbserver board contained several
KFAIL covering this situation.
In this commit I remove the unnecessary, attempt to quote incoming
newlines within arguments, and do some minimal cleanup of the related
code. There is additional cleanup that can be done, but I'm leaving
that for the next commit.
Then I've removed the KFAIL from the test case, and performed some
minimal cleanup there too.
After this commit the gdb.base/args.exp is 100% passing with the
native-extended-gdbserver board file.
During review I was pointed to this older series:
https://inbox.sourceware.org/gdb-patches/20211022071933.3478427-1-m.weghorn@posteo.de/
which also includes this fix as part of a larger set of changes. I'm
giving a Co-Authored-By credit to the author of that original series.
I believe this smaller fix brings some benefits on its own, though the
original series does offer additional improvements. Once this is
merged I'll take a look at rebasing and resubmitting the original series.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27989
Co-Authored-By: Michael Weghorn <m.weghorn@posteo.de>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
When I posted the previous patch for review Andreas Schwab pointed out
that passing a trailing empty argument also doesn't work.
The fix for this is in the same area of code as the previous patch,
but is sufficiently different that I felt it deserved a patch of its
own.
I noticed that passing arguments containing single quotes to gdbserver
didn't work correctly:
gdb -ex 'set sysroot' --args /tmp/show-args
Reading symbols from /tmp/show-args...
(gdb) target extended-remote | gdbserver --once --multi - /tmp/show-args
Remote debugging using | gdbserver --once --multi - /tmp/show-args
stdin/stdout redirected
Process /tmp/show-args created; pid = 176054
Remote debugging using stdio
Reading symbols from /lib64/ld-linux-x86-64.so.2...
(No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
0x00007ffff7fd3110 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) set args abc ""
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /tmp/show-args \'
stdin/stdout redirected
Process /tmp/show-args created; pid = 176088
2 args are:
/tmp/show-args
abc
Done.
[Inferior 1 (process 176088) exited normally]
(gdb) target native
Done. Use the "run" command to start a process.
(gdb) run
Starting program: /tmp/show-args \'
2 args are:
/tmp/show-args
abc
Done.
[Inferior 1 (process 176095) exited normally]
(gdb) q
The 'shows-args' program used here just prints the arguments passed to
the inferior.
Notice that when starting the inferior using the extended-remote
target there is only a single argument 'abc', while when using the
native target there is a second argument, the blank line, representing
the empty argument.
The problem here is that the vRun packet coming from GDB looks like
this (I've removing the trailing checksum):
$vRun;PROGRAM_NAME;616263;
If we compare this to a packet with only a single argument and no
trailing empty argument:
$vRun;PROGRAM_NAME;616263
Notice the lack of the trailing ';' character here.
The problem is that gdbserver processes this string in a loop. At
each point we maintain a pointer to the character just after a ';',
and then we process everything up to either the next ';' character, or
to the end of the string.
We break out of this loop when the character we start with (in that
loop iteration) is the null-character. This means in the trailing
empty argument case, we abort the loop before doing anything with the
empty argument.
In this commit I've updated the loop, we now break out using a 'break'
statement at the end of the loop if the (sub-)string we just processed
was empty, with this change we now notice the trailing empty
argument.
I've updated the test case to cover this issue.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
I noticed that passing arguments containing single quotes to gdbserver
didn't work correctly:
gdb -ex 'set sysroot' --args /tmp/show-args
Reading symbols from /tmp/show-args...
(gdb) target extended-remote | gdbserver --once --multi - /tmp/show-args
Remote debugging using | gdbserver --once --multi - /tmp/show-args
stdin/stdout redirected
Process /tmp/show-args created; pid = 176054
Remote debugging using stdio
Reading symbols from /lib64/ld-linux-x86-64.so.2...
(No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
0x00007ffff7fd3110 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) set args \'
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /tmp/show-args \'
stdin/stdout redirected
Process /tmp/show-args created; pid = 176088
2 args are:
/tmp/show-args
\'
Done.
[Inferior 1 (process 176088) exited normally]
(gdb) target native
Done. Use the "run" command to start a process.
(gdb) run
Starting program: /tmp/show-args \'
2 args are:
/tmp/show-args
'
Done.
[Inferior 1 (process 176095) exited normally]
(gdb) q
The 'shows-args' program used here just prints the arguments passed to
the inferior.
Notice that when starting the inferior using the extended-remote
target the second argument is "\'", while when running using native
target the argument is "'". The second of these is correct, the \'
used with the "set args" command is just to show GDB that the single
quote is not opening an argument string.
It turns out that the extra backslash is injected on the gdbserver
side when gdbserver processes the arguments that GDB passes it, the
code that does this was added as part of this much larger commit:
commit 2090129c36c7e582943b7d300968d19b46160d84
Date: Thu Dec 22 21:11:11 2016 -0500
Share fork_inferior et al with gdbserver
In this commit I propose removing the specific code that adds what I
believe is a stray backslash. I've extended an existing test to cover
this case, and I now see identical behaviour when using an
extended-remote target as with the native target.
This partially fixes PR gdb/27989, though there are still some issues
with newline handling which I'll address in a later commit.
During review I was pointed to this older series:
https://inbox.sourceware.org/gdb-patches/20211022071933.3478427-1-m.weghorn@posteo.de/
which also includes this fix as part of a larger set of changes. I'm
giving a Co-Authored-By credit to the author of that original series.
I believe this smaller fix brings some benefits on its own, though the
original series does offer additional improvements. Once this is
merged I'll take a look at rebasing and resubmitting the original series.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27989
Co-Authored-By: Michael Weghorn <m.weghorn@posteo.de>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This started with me running into this comment in symfile.c:
/* FIXME, should use print_sys_errmsg but it's not filtered. */
gdb_printf (_("`%ps' has disappeared; keeping its symbols.\n"),
styled_string (file_name_style.style (), filename));
In this particular case I think I disagree with the comment; I think
the output should be a warning rather than just a message printed to
gdb_stdout, I think when the executable, or some other objfile that is
currently being debugged, disappears from disk, this is likely an
unexpected situation, and worth warning the user about.
So, in theory, I could just call print_sys_errmsg and remove the
comment, but that would mean loosing the filename styling in the
output... so in the end I remove the comment and updated the code to
call warning.
But that got me looking at print_sys_errmsg and how it's used.
Currently the function takes a string and an errno, and prints, to
stderr, the string followed by the result of calling strerror on the
errno.
In some places the string passed to print_sys_errmsg is just a
filename, and this is used when something goes wrong. In these cases,
I think calling warning rather than gdb_printf to gdb_stderr, would be
better, and in fact, in a couple of places we manually print a
"warning" prefix, and then call print_sys_errmsg. And so, for these
users I have added a new function warning_filename_and_errno, which
takes a filename, which is printed with styling, and an errno, which
is passed through strerror and the resulting string printed. This new
function calls warning to print its output. I then updated some of
the print_sys_errmsg users to use this new function.
Some other users of print_sys_errmsg are also emitting what is clearly
a warning, however, the string being passed in is more than just a
filename, so the new warning_filename_and_errno function can't be
used, it would style the whole string. For these users I have
switched to calling warning directly, this allows me to style the
warning message correctly.
Finally, in inflow.c there is one last call to print_sys_errmsg, in
this case I just inlined the definition of print_sys_errmsg. This is
a really weird case, as after printing this message GDB just does a
hard exit. This is pretty old code, dating back to the initial GDB
import, I guess it should be updated to call error() maybe, but I'm
reluctant to make this change as part of this commit, just in case
there's some reason why we can't throw an error at this point.
With that done there are now no users of print_sys_errmsg, and so the
old function can be removed.
While I was doing all of the above I added some additional filename
styling in soure.c, this is in an else block where the if contained
the print_sys_errmsg call, so these felt related.
And finally, while I was updating the uses of print_sys_errmsg in
procfs.c, I noticed that we used a static errmsg buffer to format some
error strings. As the above changes got rid of one of the users of
errmsg I also removed the other two users, and the static buffer.
There were a couple of tests that depended on the existing output
message format that needed updating. In one case we gained an extra
'warning: ' prefix, and in the other 'Warning: ' becomes 'warning: ',
I think in both cases the new output is an improvement.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Some gdb.base/fileio.exp tests expect the inferior to not have write
access to some files. If the test is being run as root, this is never
possible. This commit adds a way to identify if the user is root and
xfails the tests that expect no write access.
Approved-By: Tom de Vries <tdevries@suse.de>
|
|
gdb.base/jit-reader-simple.exp regex
I see this failure:
FAIL: gdb.base/jit-reader-simple.exp: standalone: change addr: initial run: maint info breakpoints shows jit breakpoint
The jit breakpoint expected by the test is there, it's just that the
number of spaces doesn't match what the test expects, after "jit
events":
-2 jit events keep y 0x0000555555555119 <__jit_debug_register_code> inf 1
Fix that by relaxing the regex a bit.
Change-Id: Ia3b04e6d5978399d940fd1a590f95f15275ca7ac
|
|
I ran across this site:
https://no-color.org/
... which lobbies for tools to recognize the NO_COLOR environment
variable and disable any terminal styling when it is seen.
This patch implements this for gdb.
Regression tested on x86-64 Fedora 38.
Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
Reviewed-by: Kevin Buettner <kevinb@redhat.com>
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
When running test-case gdb.base/unwind-on-each-insn-amd64-2.exp with target
board unix/-fPIE/-pie, I run into:
...
gdb compile failed, ld: unwind-on-each-insn-amd64-21.o: relocation \
R_X86_64_32S against `.text' can not be used when making a PIE object; \
recompile with -fPIE
ld: failed to set dynamic section sizes: bad value
...
Fix this by hardcoding nopie in the test-case, and for good measure in the
other test-cases that source unwind-on-each-insn.exp.tcl and use a .s file.
Tested on x86_64-linux.
Approved-by: Kevin Buettner <kevinb@redhat.com>
|
|
Make sure to unlink the related breakpoint when the watchpoint instance
is deleted. This prevents having a wp-related breakpoint that is
linked to a NULL watchpoint (e.g. the watchpoint instance is being
deleted when the 'watch' command fails). With the below scenario,
having such a left out breakpoint will lead to a GDB hang, and this
is due to an infinite loop when deleting all inferior breakpoints.
Scenario:
(gdb) set can-use-hw-watchpoints 0
(gdb) awatch <SCOPE VAR>
Can't set read/access watchpoint when hardware watchpoints are disabled.
(gdb) rwatch <SCOPE VAR>
Can't set read/access watchpoint when hardware watchpoints are disabled.
(gdb) <continue the program until the end>
>> HANG <<
Signed-off-by: Mohamed Bouhaouel <mohamed.bouhaouel@intel.com>
Reviewed-by: Bruno Larsen <blarsen@redhat.com>
|
|
After the series that added this command was pushed, Pedro mentioned
that the news description could easily be misinterpreted, as well as
some code and test improvements that should be made.
While fixing the test, I realized that code repetition wasn't
happening as it should, so I took care of that too.
Approved-By: Andrew Burgess <aburgess@redhat.com>
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
|
|
I've been running the test-suite on an i686-linux laptop with 1GB of memory,
and 1 GB of swap, and noticed problems after running gdb.base/huge.exp: gdb
not being able to spawn for a large number of test-cases afterwards.
So I investigated the memory usage, on my usual x86_64-linux development
platform.
The test-case is compiled with -DCRASH_GDB=2097152, so this:
...
static int a[CRASH_GDB], b[CRASH_GDB];
...
with sizeof (int) == 4 represents two arrays of 8MB each.
Say we add a loop around the "print a" command and print space usage
statistics:
...
gdb_test "maint set per-command space on"
for {set i 0} {$i < 100} {incr i} {
gdb_test "print a"
}
...
This gets us:
...
(gdb) print a^M
$1 = {0 <repeats 2097152 times>}^M
Space used: 478248960 (+469356544 for this command)^M
(gdb) print a^M
$2 = {0 <repeats 2097152 times>}^M
Space used: 486629376 (+8380416 for this command)^M
(gdb) print a^M
$3 = {0 <repeats 2097152 times>}^M
Space used: 495009792 (+8380416 for this command)^M
...
(gdb) print a^M
$100 = {0 <repeats 2097152 times>}^M
Space used: 1308721152 (+8380416 for this command)^M
...
In other words, we start out at 8MB, and the first print costs us about 469MB,
and subsequent prints 8MB, which accumulates to 1.3 GB usage. [ On the
i686-linux laptop, the first print costs us 335MB. ]
The subsequent 8MBs are consistent with the values being saved into the value
history, but the usage for the initial print seems somewhat excessive.
There is a PR open about needing sparse representation of large arrays
(PR8819), but this memory usage points to an independent problem.
The function value_print_array_elements contains a scoped_value_mark to free
allocated values in the outer loop, but it doesn't prevent the inner loop from
allocating a lot of values.
Fix this by adding a scoped_value_mark in the inner loop, after which we have:
...
(gdb) print a^M
$1 = {0 <repeats 2097152 times>}^M
Space used: 8892416 (+0 for this command)^M
(gdb) print a^M
$2 = {0 <repeats 2097152 times>}^M
Space used: 8892416 (+0 for this command)^M
(gdb) print a^M
$3 = {0 <repeats 2097152 times>}^M
Space used: 8892416 (+0 for this command)^M
...
(gdb) print a^M
$100 = {0 <repeats 2097152 times>}^M
Space used: 8892416 (+0 for this command)^M
...
Note that the +0 here just means that the mallocs did not trigger an sbrk.
This is dependent on malloc (which can use either mmap or sbrk or some
pre-allocated memory) and will likely vary between different tunings, versions
and implementations, so this does not give us a reliable way detect the
problem in a minimal way.
A more reliable way of detecting the problem is:
...
void
value_free_to_mark (const struct value *mark)
{
+ size_t before = all_values.size ();
auto iter = std::find (all_values.begin (), all_values.end (), mark);
if (iter == all_values.end ())
all_values.clear ();
else
all_values.erase (iter + 1, all_values.end ());
+ size_t after = all_values.size ();
+ if (before - after >= 1024)
+ fprintf (stderr, "value_free_to_mark freed %zu items\n", before - after);
...
which without the fix tells us:
...
+print a
value_free_to_mark freed 2097152 items
$1 = {0 <repeats 2097152 times>}
...
Fix a similar problem for Fortran:
...
+print array1
value_free_to_mark freed 4194303 items
$1 = (0, <repeats 2097152 times>)
...
in fortran_array_printer_impl::process_element.
The problem also exists for Ada:
...
+print Arr
value_free_to_mark freed 2097152 items
$1 = (0 <repeats 2097152 times>)
...
but is fixed by the fix for C.
Add Fortran and Ada variants of the test-case. The *.exp files are similar
enough to the original to keep the copyright years range.
While writing the Fortran test-case, I ran into needing an additional print
setting to print the entire array in repeat form, filed as PR exp/30817.
I managed to apply the compilation loop for the Ada variant as well, but with
a cumbersome repetition style. I noticed no other test-case uses gnateD, so
perhaps there's a better way of implementing this.
The regression test included in the patch is formulated in its weakest
form, to avoid false positive FAILs, which also means that smaller regressions
may not get detected.
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Rewrite test-case gdb.base/huge.exp:
- use build_executable rather than gdb_compile,
- use save_vars,
- factor out hardcoded loop limits min and max,
- handle compilation failure using require, and
- avoid using . in regexp to match $, {} and <>.
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
When a GDB stub is run via "target remote |", it sometimes produces
extra output that ends up mixed with GDB's own output. For example,
QEMU's built-in GDB stub responds to the vKill packet by printing
nios2-elf-qemu-system: QEMU: Terminated via GDBstub
before exiting.
This patch fixes the regexp in gdb.base/hook-stop.exp to allow such
messages between GDB's "continuing" and "Inferior killed" messages.
Reviewed-By: Tom Tromey <tom@tromey.com>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
These testcases assume host==build or that the remote host has a Posix
shell to run commands in. Don't try to run them if that's not the case.
Reviewed-By: Tom Tromey <tom@tromey.com>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This patch fixes some testcases that formerly had patterns with
hardwired "/" pathname separators in them, which broke when testing on
(remote) Windows host.
Reviewed-By: Tom Tromey <tom@tromey.com>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Some embedded targets don't have full support for argc/argv. argv
may print as "0x0" or as an address with a symbol name following.
This causes problems for the regexps in the style.exp line-wrapping
tests that assume it always prints as an ordinary address in backtrace
output.
This patch generalizes the regexps to handle these additional forms
and reworks some of the line-wrapping tests to account for the argv
address string being shorter or longer than a regular address.
Reviewed-By: Tom Tromey <tom@tromey.com>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
When running test-case gdb.base/add-symbol-file-attach.exp with target board
unix/-m32, we run into:
...
(gdb) attach 3955^M
Attaching to process 3955^M
Load new symbol table from "add-symbol-file-attach"? (y or n) y^M
Reading symbols from add-symbol-file-attach/add-symbol-file-attach...^M
Reading symbols from /lib/libm.so.6...^M
Reading symbols from /usr/lib/debug/lib/libm-2.31.so-i386.debug...^M
Reading symbols from /lib/libc.so.6...^M
Reading symbols from /usr/lib/debug/lib/libc-2.31.so-i386.debug...^M
Reading symbols from /lib/ld-linux.so.2...^M
Reading symbols from /usr/lib/debug/lib/ld-2.31.so-i386.debug...^M
0xf7f53549 in __kernel_vsyscall ()^M
(gdb) FAIL: gdb.base/add-symbol-file-attach.exp: attach
...
The test fails because this regexp is used:
...
-re ".*in \[_A-Za-z0-9\]*pause.*$gdb_prompt $" {
...
The regexp attempts to detect that the exec is somewhere in pause ():
...
int
main (int argc, char **argv)
{
pause ();
return 0;
}
...
but when the exec is blocked in pause, the backtrace is:
...
(gdb) bt
#0 0xf7fd2549 in __kernel_vsyscall ()
#1 0xf7d84966 in __libc_pause () at ../sysdeps/unix/sysv/linux/pause.c:29
#2 0x0804844c in main (argc=1, argv=0xffffce84)
at /data/vries/gdb/src/gdb/testsuite/gdb.base/add-symbol-file-attach.c:26
...
We could simply extend the regexp to also match __kernel_vsyscall, but the
more fundamental problem is that the test is racy.
The attach can happen before the exec is blocked in pause (), somewhere in the
dynamic linker resolving the call to pause, in main or even earlier.
Note that for the test-case to be effective, the exec is not required to be in
pause (). I added a "while (1);" loop at the start of main, reverted the
patch fixing the corresponding PR and reproduced the problem it's supposed to
detect.
Fix this by simply matching the "Reading symbols from" line, similar to what
an earlier test is doing.
While we're at it, rewrite the earlier test to also use the -wrap idiom.
Tested on x86_64-linux.
|
|
This commit adds a new test case for bug 27831. See the contents
of the .exp file for a description of what it's about.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27831
Approved-By: Tom Tromey <tom@tromey.com>
|
|
The author of 'mold' pointed out that with a certain shared library,
gdb would fail to find the shared library's name in 'bt'.
The function in question appeared at the end of the .so's .text
segment and ended with a call to 'abort'.
This turned out to be a classic case of calling get_frame_pc when
get_frame_address_in_block is needed -- the former will be off-by-one
for purposes of finding the enclosing function or shared library.
The included test fails without the patch on my system. However, I
imagine it can't be assumed to reliably fail. Nevertheless it seemed
worth doing.
Regression tested on x86-64 Fedora 38.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29074
Reviewed-by: Kevin Buettner <kevinb@redhat.com>
|
|
This recent commit:
commit b1c0ab20809a502b2d2224fecb0dca3ada2e9b22
Date: Wed Jul 12 21:56:50 2023 +0100
gdb: avoid double stop after failed breakpoint condition check
Addressed a problem where two MI stopped events would be reported if a
breakpoint condition failed due to a signal, this commit was a
replacement for this commit:
commit 2e411b8c68eb2b035b31d5b00d940d4be1a0928b
Date: Fri Oct 14 14:53:15 2022 +0100
gdb: don't always print breakpoint location after failed condition check
which solved the two stop problem, but only for the CLI. Before both
of these commits, if a b/p condition failed due to a signal then the
user would see two stops reported, the first at the location where the
signal occurred, and the second at the location of the breakpoint.
By default GDB remains at the location where the signal occurred, so
the second reported stop can be confusing, this is the problem that
commit 2e411b8c68eb tried to solve (for the CLI) and b1c0ab20809a
extended also address the issue for MI too.
However, while working on another patch I realised that there was a
problem with GDB after the above commits. Neither of the above
commits considered 'set unwindonsignal on'. With this setting on,
when an inferior function call fails with a signal GDB will unwind the
stack back to the location where the inferior function call started.
In the b/p case we're looking at, the stop should be reported at the
location of the breakpoint, not at the location where the signal
occurred, and this isn't what happens.
This commit fixes this by ensuring that when unwindonsignal is 'on',
GDB reports a single stop event at the location of the breakpoint,
this fixes things for both CLI and MI.
The function call_thread_fsm::should_notify_stop is called when the
inferior function call completes and GDB is figuring out if the user
should be notified about this stop event by calling normal_stop from
fetch_inferior_event in infrun.c. If normal_stop is called, then this
notification will be for the location where the inferior call stopped,
which will be the location at which the signal occurred.
Prior to this commit, the only time that normal_stop was not called,
was if the inferior function call completed successfully, this was
controlled by ::should_notify_stop, which only turns false when the
inferior function call has completed successfully.
In this commit I have extended the logic in ::should_notify_stop. Now
there are three cases in which ::should_notify_stop will return false,
and we will not announce the first stop (by calling normal_stop).
These three reasons are:
1. If the inferior function call completes successfully, this is
unchanged behaviour,
2. If the inferior function call stopped due to a signal and 'set
unwindonsignal on' is in effect, and
3. If the inferior function call stopped due to an uncaught C++
exception, and 'set unwind-on-terminating-exception on' is in
effect.
However, if we don't call normal_stop then we need to call
async_enable_stdin in call_thread_fsm::should_stop. Prior to this
commit this was only done for the case where the inferior function
call completed successfully.
In this commit I now call ::should_notify_stop and use this to
determine if we need to call async_enable_stdin. With this done we
now call async_enable_stdin for each of the three cases listed above,
which means that GDB will exit wait_sync_command_done correctly (see
run_inferior_call in infcall.c).
With these two changes the problem is mostly resolved. However, the
solution isn't ideal, we've still lost some information.
Here is how GDB 13.1 behaves, this is before commits b1c0ab20809a and
2e411b8c68eb:
$ gdb -q /tmp/mi-condbreak-fail \
-ex 'set unwindonsignal on' \
-ex 'break 30 if (cond_fail())' \
-ex 'run'
Reading symbols from /tmp/mi-condbreak-fail...
Breakpoint 1 at 0x40111e: file /tmp/mi-condbreak-fail.c, line 30.
Starting program: /tmp/mi-condbreak-fail
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401116 in cond_fail () at /tmp/mi-condbreak-fail.c:24
24 return *p; /* Crash here. */
Error in testing breakpoint condition:
The program being debugged was signaled while in a function called from GDB.
GDB has restored the context to what it was before the call.
To change this behavior use "set unwindonsignal off".
Evaluation of the expression containing the function
(cond_fail) will be abandoned.
Breakpoint 1, foo () at /tmp/mi-condbreak-fail.c:30
30 global_counter += 1; /* Set breakpoint here. */
(gdb)
In this state we see two stop notifications, the first is where the
signal occurred, while the second is where the breakpoint is located.
As GDB has unwound the stack (thanks to unwindonsignal) the second
stop notification reflects where the inferior is actually located.
Then after commits b1c0ab20809a and 2e411b8c68eb the behaviour changed
to this:
$ gdb -q /tmp/mi-condbreak-fail \
-ex 'set unwindonsignal on' \
-ex 'break 30 if (cond_fail())' \
-ex 'run'
Reading symbols from /tmp/mi-condbreak-fail...
Breakpoint 1 at 0x40111e: file /tmp/mi-condbreak-fail.c, line 30.
Starting program: /tmp/mi-condbreak-fail
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401116 in cond_fail () at /tmp/mi-condbreak-fail.c:24
24 return *p; /* Crash here. */
Error in testing condition for breakpoint 1:
The program being debugged was signaled while in a function called from GDB.
GDB has restored the context to what it was before the call.
To change this behavior use "set unwindonsignal off".
Evaluation of the expression containing the function
(cond_fail) will be abandoned.
(gdb) bt 1
#0 foo () at /tmp/mi-condbreak-fail.c:30
(More stack frames follow...)
(gdb)
This is the broken state. GDB is reports the SIGSEGV location, but
not the unwound breakpoint location. The final 'bt 1' shows that the
inferior is not located in cond_fail, which is the only location GDB
reported, so this is clearly wrong.
After implementing the fixes described above we now get this
behaviour:
$ gdb -q /tmp/mi-condbreak-fail \
-ex 'set unwindonsignal on' \
-ex 'break 30 if (cond_fail())' \
-ex 'run'
Reading symbols from /tmp/mi-condbreak-fail...
Breakpoint 1 at 0x40111e: file /tmp/mi-condbreak-fail.c, line 30.
Starting program: /tmp/mi-condbreak-fail
Error in testing breakpoint condition for breakpoint 1:
The program being debugged was signaled while in a function called from GDB.
GDB has restored the context to what it was before the call.
To change this behavior use "set unwindonsignal off".
Evaluation of the expression containing the function
(cond_fail) will be abandoned.
Breakpoint 1, foo () at /tmp/mi-condbreak-fail.c:30
30 global_counter += 1; /* Set breakpoint here. */
(gdb)
This is better. GDB now reports a single stop at the location of the
breakpoint, which is where the inferior is actually located. However,
by removing the first stop notification we have lost some potentially
useful information about which signal caused the inferior to stop.
To address this I've reworked the message that is printed to include
the signal information. GDB now reports this:
$ gdb -q /tmp/mi-condbreak-fail \
-ex 'set unwindonsignal on' \
-ex 'break 30 if (cond_fail())' \
-ex 'run'
Reading symbols from /tmp/mi-condbreak-fail...
Breakpoint 1 at 0x40111e: file /tmp/mi-condbreak-fail.c, line 30.
Starting program: /tmp/mi-condbreak-fail
Error in testing condition for breakpoint 1:
The program being debugged received signal SIGSEGV, Segmentation fault
while in a function called from GDB. GDB has restored the context
to what it was before the call. To change this behavior use
"set unwindonsignal off". Evaluation of the expression containing
the function (cond_fail) will be abandoned.
Breakpoint 1, foo () at /tmp/mi-condbreak-fail.c:30
30 global_counter += 1; /* Set breakpoint here. */
(gdb)
This is better, the user now sees a single stop notification at the
correct location, and the error message describes which signal caused
the inferior function call to stop.
However, we have lost the information about where the signal
occurred. I did consider trying to include this information in the
error message, but, in the end, I opted not too. I wasn't sure it was
worth the effort. If the user has selected to unwind on signal, then
surely this implies they maybe aren't interested in debugging failed
inferior calls, so, hopefully, just knowing the signal name will be
enough. I figure we can always add this information in later if
there's a demand for it.
|
|
When running test-case gdb.base/vfork-follow-parent.exp, I run into:
...
ERROR: tcl error sourcing gdb/testsuite/gdb.base/vfork-follow-parent.exp.
ERROR: error copying "vforked-prog": no such file or directory
while executing
"file copy -force $fromfile $tofile"
(procedure "gdb_remote_download" line 29)
invoked from within
"gdb_remote_download target $binfile3"
...
Fix this by:
- making the copy-to-remote conditional on is_remote target, and
- allowing gdb_remote_download to find $binfile3 by using
standard_output_file.
Also remove unused variable remote_exec_prog.
Tested on x86_64-linux.
|
|
It was pointed out on the mailing list[1] that after this commit:
commit b1e0126ec56e099d753c20e91a9f8623aabd6b46
Date: Wed Jun 21 14:18:54 2023 +0100
gdb: don't resume vfork parent while child is still running
the test gdb.base/vfork-follow-parent.exp now has some failures when
run with the native-gdbserver or native-extended-gdbserver boards:
FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to end of inferior 2 (timeout)
FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: inferior 1 (timeout)
FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: print unblock_parent = 1 (timeout)
FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to break_parent (timeout)
The reason that these failures don't show up when run on the standard
unix board is that the test is only run in the default operating mode,
so for Linux this will be all-stop on top of non-stop.
If we adjust the test script so that it runs in the default mode and
with target-non-stop turned off, then we see the same failures on the
unix board. This commit includes this change.
The way that the test is written means that it is not (currently)
possible to turn on non-stop mode and have the test still work, so
this commit does not do that.
I have also updated the test script so that the vfork child performs
an exec as well as the current exit. Exec and exit are the two ways
in which a vfork child can release the vfork parent, so testing both
of these cases is useful I think.
In this test the inferior performs a vfork and the vfork-child
immediately exits. The vfork-parent will wait for the vfork-child and
then blocks waiting for gdb. Once gdb has released the vfork-parent,
the vfork-parent also exits.
In the test that fails, GDB sets 'detach-on-fork off' and then runs to
the vfork. At this point the test tries to just "continue", but this
fails as the vfork-parent is still selected, and the parent can't
continue until the vfork-child completes. As the vfork-child is
stopped by GDB the parent will never stop once resumed, so GDB refuses
to resume it.
The test script then sets 'schedule-multiple on' and once again
continues. This time GDB, in theory, resumes both the parent and the
child, the parent will be held blocked by the kernel, but the child
will run until it exits, and which point GDB stops again, this time
with inferior 2, the newly exited vfork-child, selected.
What happens after this in the test script is irrelevant as far as
this failure is concerned.
To understand why the test started failing we should consider the
behaviour of four different cases:
1. All-stop-on-non-stop before commit b1e0126ec56e,
2. All-stop-on-non-stop after commit b1e0126ec56e,
3. All-stop-on-all-stop before commit b1e0126ec56e, and
4. All-stop-on-all-stop after commit b1e0126ec56e.
Only case #4 is failing after commit b1e0126ec56e, but I think the
other cases are interesting because, (a) they inform how we might fix
the regression, and (b) it turns out the behaviour of #2 changed too
with the commit, but the change was harmless.
For #1 All-stop-on-non-stop before commit b1e0126ec56e, what happens
is:
1. GDB calls proceed with the vfork-parent selected, as schedule
multiple is on user_visible_resume_ptid returns -1 (everything)
as the resume_ptid (see proceed function),
2. As this is all-stop-on-non-stop, every thread is resumed
individually, so GDB tries to resume both the vfork-parent and the
vfork-child, both of which succeed,
3. The vfork-parent is held stopped by the kernel,
4. The vfork-child completes (exits) at which point the GDB sees the
EXITED event for the vfork-child and the VFORK_DONE event for the
vfork-parent,
5. At this point we might take two paths depending on which event
GDB handles first, if GDB handles the VFORK_DONE first then:
(a) As GDB is controlling both parent and child the VFORK_DONE is
ignored (see handle_vfork_done), the vfork-parent will be
resumed,
(b) GDB processes the EXITED event, selects the (now defunct)
vfork-child, and stops, returning control to the user.
Alternatively, if GDB selects the EXITED event first then:
(c) GDB processes the EXITED event, selects the (now defunct)
vfork-child, and stops, returning control to the user.
(d) At some future time the user resumes the vfork-parent, at
which point the VFORK_DONE is reported to GDB, however, GDB
is ignoring the VFORK_DONE (see handle_vfork_done), so the
parent is resumed.
For case #2, all-stop-on-non-stop after commit b1e0126ec56e, the
important difference is in step (2) above, now, instead of resuming
both the vfork-parent and the vfork-child, only the vfork-child is
resumed. As such, when we get to step (5), only a single event, the
EXITED event is reported.
GDB handles the EXITED just as in (5)(c), then, later, when the user
resumes the vfork-parent, the VFORKED_DONE is immediately delivered
from the kernel, but this is ignored just as in (5)(d), and so,
though the pattern of when the vfork-parent is resumed changes, the
overall pattern of which events are reported and when, doesn't
actually change. In fact, by not resuming the vfork-parent, the order
of events (in this test) is now deterministic, which (maybe?) is a
good thing.
If we now consider case #3, all-stop-on-all-stop before commit
b1e0126ec56e, then what happens is:
1. GDB calls proceed with the vfork-parent selected, as schedule
multiple is on user_visible_resume_ptid returns -1 (everything)
as the resume_ptid (see proceed function),
2. As this is all-stop-on-all-stop, the resume is passed down to the
linux-nat target, the vfork-parent is the event thread, while the
vfork-child is a sibling of the event thread,
3. In linux_nat_target::resume, GDB calls linux_nat_resume_callback
for all threads, this causes the vfork-child to be resumed. Then
in linux_nat_target::resume, the event thread, the vfork-parent,
is also resumed.
4. The vfork-parent is held stopped by the kernel,
5. The vfork-child completes (exits) at which point the GDB sees the
EXITED event for the vfork-child and the VFORK_DONE event for the
vfork-parent,
6. We are now in a situation identical to step (5) as for
all-stop-on-non-stop above, GDB selects one of the events to
handle, and whichever we select the user sees the correct
behaviour.
And so, finally, we can consider #4, all-stop-on-all-stop after commit
b1e0126ec56e, this is the case that started failing.
We start out just like above, in proceed, the resume_ptid is
-1 (resume everything), due to schedule multiple being on. And just
like above, due to the target being all-stop, we call
proceed_resume_thread_checked just once, for the current thread,
which, remember, is the vfork-parent thread.
The change in commit b1e0126ec56e was to avoid resuming a vfork-parent
thread, read the commit message for the justification for this change.
However, this means that GDB now rejects resuming the vfork-parent in
this case, which means that nothing gets resumed! Obviously, if
nothing resumes, then nothing will ever stop, and so GDB appears to
hang.
I considered a couple of solutions which, in the end, I didn't go
with, these were:
1. Move the vfork-parent check out of proceed_resume_thread_checked,
and place it in proceed, but only on the all-stop-on-non-stop
path, this should still address the issue seen in b1e0126ec56e,
but would avoid the issue seen here. I rejected this just
because it didn't feel great to split the checks that exist in
proceed_resume_thread_checked like this,
2. Extend the condition in proceed_resume_thread_checked by adding a
target_is_non_stop_p check. This would have the same effect as
idea 1, but leaves all the checks in the same place, which I
think would be better, but this still just didn't feel right to
me, and so,
What I noticed was that for the all-stop-on-non-stop, after commit
b1e0126ec56e, we only resumed the vfork-child, and this seems fine.
The vfork-parent isn't going to run anyway (the kernel will hold it
back), so if feels like we there's no harm in just waiting for the
child to complete, and then resuming the parent.
So then I started looking at follow_fork, which is called from the top
of proceed. This function already has the task of switching between
the parent and child based on which the user wishes to follow. So, I
wondered, could we use this to switch to the vfork-child in the case
that we are attached to both?
Turns out this is pretty simple to do.
Having done that, now the process is for all-stop-on-all-stop after
commit b1e0126ec56e, and with this new fix is:
1. GDB calls proceed with the vfork-parent selected, but,
2. In follow_fork, and follow_fork_inferior, GDB switches the
selected thread to be that of the vfork-child,
3. Back in proceed user_visible_resume_ptid returns -1 (everything)
as the resume_ptid still, but now,
4. When GDB calls proceed_resume_thread_checked, the vfork-child is
the current selected thread, this is not a vfork-parent, and so
GDB allows the proceed to continue to the linux-nat target,
5. In linux_nat_target::resume, GDB calls linux_nat_resume_callback
for all threads, this does not resume the vfork-parent (because
it is a vfork-parent), and then the vfork-child is resumed as
this is the event thread,
At this point we are back in the same situation as for
all-stop-on-non-stop after commit b1e0126ec56e, that is, the
vfork-child is resumed, while the vfork-parent is held stopped by
GDB.
Eventually the vfork-child will exit or exec, at which point the
vfork-parent will be resumed.
[1] https://inbox.sourceware.org/gdb-patches/3e1e1db0-13d9-dd32-b4bb-051149ae6e76@simark.ca/
|
|
Fixes the issue where jump failed if multiple inferiors run the same
source.
See the below example
$ gdb -q ./simple
Reading symbols from ./simple...
(gdb) break 2
Breakpoint 1 at 0x114e: file simple.c, line 2.
(gdb) run
Starting program: /temp/simple
Breakpoint 1, main () at simple.c:2
2 int a = 42;
(gdb) add-inferior
[New inferior 2]
Added inferior 2 on connection 1 (native)
(gdb) inferior 2
[Switching to inferior 2 [<null>] (<noexec>)]
(gdb) info inferiors
Num Description Connection Executable
1 process 6250 1 (native) /temp/simple
* 2 <null> 1 (native)
(gdb) file ./simple
Reading symbols from ./simple...
(gdb) run
Starting program: /temp/simple
Thread 2.1 "simple" hit Breakpoint 1, main () at simple.c:2
2 int a = 42;
(gdb) info inferiors
Num Description Connection Executable
1 process 6250 1 (native) /temp/simple
* 2 process 6705 1 (native) /temp/simple
(gdb) jump 3
Unreasonable jump request
(gdb)
In this example, jump fails because the debugger finds two different
locations, one for each inferior.
Solution is to limit the search to the current program space.
This is done by having the jump_command function use
decode_line_with_current_source rather than
decode_line_with_last_displayed, which makes sense,
the *_current_source function always looks up a location based on the
current thread's location -- if a user is asking the current thread to
jump, then surely their destination should be relative to where the
current thread is located.
Then, inside decode_line_with_current_source, the call to
decode_line_1 is updated to pass through the current program_space,
which will limit the returned locations to those in the current
program space.
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
When loading an executable using "file a.out", the language is set according
to a.out, which can involve looking up the language of symbol "main", which
will cause the symtab expansion for the containing CU.
Expansion of lto debug info can be slow, so in commit d3214198119 ("[gdb] Use
partial symbol table to find language for main") a feature was added to avoid
the symtab expansion.
This feature stopped working after commit 7f4307436fd ("Fix "start" for D,
Rust, etc").
[ The commit addresses problems related to command start, which requires finding
the main function:
- for language D, "main" was found instead of "D main", and
- for Rust, the correct function was found, but attributed the wrong name
(not fully qualified). ]
Reimplement the feature by adding
cooked_index_functions::lookup_global_symbol_language.
Tested on x86_64-linux.
PR symtab/30661
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30661
|
|
Add lookup of a non-existing symbol to test-case gdb.base/index-cache.exp.
This serves as regression test for PR symtab/30718.
PR symtab/30718
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30718
|
|
gdb.base/index-cache.exp
In test-case gdb.base/index-cache.exp proc run_test_with_flags contains:
...
clean_restart ${testfile}
# The tests generally want to check the cache, so make sure it
# has completed its work.
gdb_test_no_output "maintenance wait-for-index-cache"
...
This however hides data races between:
- index-cache writing (due to file $exec), and
- symbol lookups (due to subsequent ptype commands).
Fix this by:
- moving the "maintenance wait-for-index-cache" to proc check_cache_stats, and
- moving all calls to proc check_cache_stats ALAP.
Tested on x86_64-linux.
|
|
The test-case gdb.base/index-cache.exp uses only one source file, which
contains main.
While doing "file $exec", in set_initial_language a symbol lookup of "main" is
done, causing the symtab containing main to be expanded.
Handling of main is special, and a future optimization may skip the lookup and
expansion.
Reliably exercise:
- the lookup of main, expanding the symtab containing main, by doing
"ptype main", and
- the lookup of another symbol, expanding a symtab not containing main, by:
- adding another source file containing function foo, and
- doing "ptype foo".
This triggered a segfault with target board native-extended-gdbserver, filed
as PR symtab/30712, but that seems to be fixed by a previous commit in this
series.
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30712
|
|
When running test-case gdb.base/step-over-syscall.exp without glibc debuginfo
installed, I get:
...
(gdb) continue^M
Continuing.^M
^M
Breakpoint 2, 0x00007ffff7d4405e in vfork () from /lib64/libc.so.6^M
(gdb) PASS: gdb.base/step-over-syscall.exp: vfork: displaced=off: \
continue to vfork (1st time)
...
but with glibc debuginfo installed I get instead:
...
(gdb) continue^M
Continuing.^M
^M
Breakpoint 2, 0x00007ffff7d4405e in __libc_vfork () at \
../sysdeps/unix/sysv/linux/x86_64/vfork.S:44^M
44 ENTRY (__vfork)^M
(gdb) FAIL: gdb.base/step-over-syscall.exp: vfork: displaced=off: \
continue to vfork (1st time)
...
The FAIL is due to a mismatch with regexp:
...
"Breakpoint \[0-9\]+, (.* in |__libc_|)$syscall \\(\\).*"
...
because it cannot match both ".* in " and the __libc_ prefix.
Fix this by using instead the regexp:
...
"Breakpoint \[0-9\]+, (.* in )?(__libc_)?$syscall \\(\\).*"
...
Tested on x86_64-linux.
|