Age | Commit message (Collapse) | Author | Files | Lines |
|
In bug PR gdb/29036, another failure was reported for the test
gdb.mi/mi-multi-commands.exp. This test sends two commands to GDB as
a single write, and then checks that both commands are executed.
The problem that was encountered here is that the output of the first
command, which looks like this:
^done,value="\"FIRST COMMAND\""
Is actually produced in parts, first the '^done' is printed, then the
',value="\"FIRST COMMAND\"" is printed.
What was happening is that some characters from the second command
were being echoed after the '^done' had been printed, but before the
value part had been printed. To avoid this issue I've relaxed the
pattern that checks for the first command a little. With this fix in
place the occasional failure in this test is no longer showing up.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29036
|
|
- Add a script syscalls/gen-header.py, based on syscalls/arm-linux.py.
- Add a script syscalls/update-linux.sh (alongside update-freebsd.sh and
update-netbsd.sh).
- Use syscalls/update-linux.sh to update syscalls/{amd64,i386}-linux.xml.in.
- Regenerate syscalls/{amd64,i386}-linux.xml using syscalls/Makefile.
In gdb/syscalls/i386-linux.xml.in, updating has the following notable effect:
...
- <syscall name="madvise1" number="220"/>
- <syscall name="getdents64" number="221"/>
- <syscall name="fcntl64" number="222"/>
+ <syscall name="getdents64" number="220"/>
+ <syscall name="fcntl64" number="221"/>
...
I've verified in ./arch/x86/entry/syscalls/syscall_32.tbl that the numbers are
correct.
Tested on x86_64-linux.
|
|
Add a Makefile in gdb/syscalls that can be used to translate
gdb/syscalls/*.xml.in into gdb/syscalls/*.xml.
Calling make reveals that bfin-linux.xml is missing, so add it.
Tested on x86_64-linux.
|
|
When execute the following command on LoongArch:
make check-gdb TESTS="gdb.base/async.exp"
there exist the following failed testcases:
FAIL: gdb.base/async.exp: finish& (timeout)
FAIL: gdb.base/async.exp: jump& (timeout)
FAIL: gdb.base/async.exp: until& (timeout)
FAIL: gdb.base/async.exp: set exec-done-display off (GDB internal error)
we can see the following messages in gdb/testsuite/gdb.log:
finish&
Run till exit from #0 foo () at /home/loongson/gdb.git/gdb/testsuite/gdb.base/async.c:9
(gdb) /home/loongson/gdb.git/gdb/gdbarch.c:2646: internal-error: gdbarch_return_value: Assertion `gdbarch->return_value != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
In order to fix the above failed testcases, implement the return_value
gdbarch method on LoongArch.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
|
|
This fix relates to PR gdb/29032, this makes the test more stable by
ensuring that the Ctrl-D is only sent once the prompt has been
displayed. This issue was also discussed on the mailing list here:
https://sourceware.org/pipermail/gdb-patches/2022-April/187670.html
The problem identified in the bug report is that sometimes the Ctrl-D
(that the test sends to GDB) arrives while GDB is processing a
command. When this happens the Ctrl-D is handled differently than if
the Ctrl-D is sent while GDB is waiting for input at a prompt.
The original intent of the test was that the Ctrl-D be sent while GDB
was waiting at a prompt, and that is the case the occurs most often,
but, when the Ctrl-D arrives during command processing, then GDB will
ignore the Ctrl-D, and the test will fail.
This commit ensures the Ctrl-D is always sent while GDB is waiting at
a prompt, which makes this test stable.
But, that still leaves an open question, what should happen when the
Ctrl-D arrives while GDB is processing a command? This commit doesn't
attempt to answer that question, which is while bug PR gdb/29032 will
not be closed once this commit is merged.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29032
|
|
Do:
...
-record btrace <name> <email>
+record
+ btrace <name> <email>
...
to clarify that the listed maintainer is only maintainer of the btrace part of
record.
|
|
With test-case gdb.base/catch-syscall.exp and target board unix/-m32, we run
into:
...
(gdb) catch syscall pipe2^M
Unknown syscall name 'pipe2'.^M
(gdb) FAIL: gdb.base/catch-syscall.exp: determine pipe syscall: catch syscall pipe2
...
Fix this by:
- adding a pipe2 entry in gdb/syscalls/i386-linux.xml.in, and
- regenerating gdb/syscalls/i386-linux.xml using
"xsltproc --output i386-linux.xml apply-defaults.xsl i386-linux.xml.in".
Tested on x86_64-linux with native and unix/-m32.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29056
|
|
When running test-case gdb.reverse/pipe-reverse.exp on openSUSE Tumbleweed,
I run into:
...
(gdb) continue^M
Continuing.^M
^M
Catchpoint 2 (returned from syscall pipe2), in pipe () from /lib64/libc.so.6^M
(gdb) FAIL: gdb.base/catch-syscall.exp: without arguments: \
syscall pipe has returned
...
The current glibc on Tumbleweed is 2.35, which contains commit
"linux: Implement pipe in terms of __NR_pipe2", and consequently syscall pipe2
is used instead of syscall pipe.
Fix this by detecting whether syscall pipe or pipe2 is used before running the
tests.
Tested on x86_64-linux, specifically on:
- openSUSE Tumbleweed (with glibc 2.35), and
- openSUSE Leap 15.3 (with glibc 2.31).
On openSUSE Tumbleweed + target board unix/-m32, this exposes:
...
(gdb) catch syscall pipe2^M
Unknown syscall name 'pipe2'.^M
...
which will be fixed in a folllow-up patch.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29056
|
|
When running test-case gdb.reverse/pipe-reverse.exp on openSUSE Tumbleweed,
I run into:
...
(gdb) continue^M
Continuing.^M
Process record and replay target doesn't support syscall number 293^M
Process record: failed to record execution log.^M
^M
Program stopped.^M
0x00007ffff7daabdb in pipe () from /lib64/libc.so.6^M
(gdb) FAIL: gdb.reverse/pipe-reverse.exp: continue to breakpoint: marker2
...
The current glibc on Tumbleweed is 2.35, which contains commit
"linux: Implement pipe in terms of __NR_pipe2", and consequently syscall pipe2
is used in stead of syscall pipe.
There is already support added for syscall pipe2 for aarch64 (which only has
syscall pipe2, not syscall pipe), so enable the same for amd64, by:
- adding amd64_sys_pipe2 in enum amd64_syscall
- translating amd64_sys_pipe2 to gdb_sys_pipe2 in amd64_canonicalize_syscall
Tested on x86_64-linux, specifically on:
- openSUSE Tumbleweed (with glibc 2.35), and
- openSUSE Leap 15.3 (with glibc 2.31).
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29056
|
|
When running test-case gdb.tui/scroll.exp, I get:
...
Box Dump (80 x 8) @ (0, 0):
0 $17 = 16
1 (gdb) p 17
2 $18 = 17
3 (gdb) p 18
4 $19 = 18
5 (gdb) p 19
6 $20 = 19
7 (gdb)
PASS: gdb.tui/scroll.exp: check cmd window in flip layout
...
but with check-read1 I get instead:
...
Box Dump (80 x 8) @ (0, 0):
0 (gdb) 15
1 (gdb) p 16
2 $17 = 16
3 (gdb) p 17
4 $18 = 17
5 (gdb) p 18
6 $19 = 18
7 (gdb) p 19
FAIL: gdb.tui/scroll.exp: check cmd window in flip layout
...
The "p 19" command is handled by Term::command, which sends the command and then
does Term::wait_for "^$gdb_prompt [string_to_regexp $cmd]", which:
- matches the line with "(gdb) p 19", and
- tries to match the following prompt "(gdb) "
The problem is that scrolling results in reissuing output before the "(gdb) p
19", and the second matching triggers on that. Consequently, wait_for no
longer translates gdb output into screen actions, and the screen does not
reflect the result of "p 19".
Fix this by using a new proc wait_for_region_contents, which in contrast to
wait_for can handle a multi-line regexp.
Tested on x86_64-linux with make targets check and check-read1.
|
|
When running test-case gdb.cp/casts.exp with target board unix/-m32, I run
into:
...
(gdb) print (unsigned long long) &gd == gd_value^M
$31 = false^M
(gdb) FAIL: gdb.cp/casts.exp: print (unsigned long long) &gd == gd_value
...
With some additional printing, we can see in more detail why the comparison
fails:
...
(gdb) print /x &gd^M
$31 = 0xffffc5c8^M
(gdb) PASS: gdb.cp/casts.exp: print /x &gd
print /x (unsigned long long)&gd^M
$32 = 0xffffc5c8^M
(gdb) PASS: gdb.cp/casts.exp: print /x (unsigned long long)&gd
print /x gd_value^M
$33 = 0xffffffffffffc5c8^M
(gdb) PASS: gdb.cp/casts.exp: print /x gd_value
print (unsigned long long) &gd == gd_value^M
$34 = false^M
(gdb) FAIL: gdb.cp/casts.exp: print (unsigned long long) &gd == gd_value
...
The gd_value is set by this assignment:
...
unsigned long long gd_value = (unsigned long long) &gd;
...
The problem here is directly casting from a pointer to a non-pointer-sized
integer.
Fix this by adding an intermediate cast to std::uintptr_t.
Tested on x86_64-linux with native and target board unix/-m32.
|
|
In OBS, on aarch64-linux, with a gdb 11.1 based package, I run into:
...
(gdb) builtin_spawn -pty^M
new-ui mi /dev/pts/5^M
New UI allocated^M
(gdb) =thread-group-added,id="i1"^M
(gdb) ERROR: MI channel failed
warning: Error detected on fd 11^M
thread 1.1^M
Unknown thread 1.1.^M
(gdb) UNRESOLVED: gdb.mi/user-selected-context-sync.exp: mode=non-stop: \
test_cli_inferior: reset selection to thread 1.1
...
with many more UNRESOLVED following.
The ERROR is a common problem, filed as
https://sourceware.org/bugzilla/show_bug.cgi?id=28561 .
But the many UNRESOLVEDs are due to not checking whether the setup as done in
the test_setup function succeeds or not.
Fix this by:
- making test_setup return an error upon failure
- handling test_setup error at the call site
- adding a "setup done" pass/fail to be turned into an unresolved
in case of error during setup.
Tested on x86_64-linux, by manually triggering the error in
mi_gdb_start_separate_mi_tty.
|
|
When running test-case gdb.ada/catch_ex_std.exp on target board
remote-gdbserver-on-localhost, I run into:
...
(gdb) continue^M
Continuing.^M
[Inferior 1 (process 15656) exited with code 0177]^M
(gdb) FAIL: gdb.ada/catch_ex_std.exp: runto: run to main
Remote debugging from host ::1, port 49780^M
/home/vries/foo: error while loading shared libraries: libsome_package.so: \
cannot open shared object file: No such file or directory^M
...
Fix this by adding the usual shared-library + remote-target helper
"gdb_load_shlib $sofile".
Tested on x86_64-linux with native and target board
remote-gdbserver-on-localhost.
|
|
When running test-case gdb.threads/fork-plus-threads.exp with check-readmore,
I run into:
...
[Inferior 11 (process 7029) exited normally]^M
[Inferior 1 (process 6956) exited normally]^M
FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
inferior 1 exited (timeout)
...
The problem is that the regexp consuming the "Inferior exited normally"
messages:
- consumes more than one of those messages at a time, but
- counts only one of those messages.
Fix this by adopting a line-by-line approach, which deals with those messages
one at a time.
Tested on x86_64-linux with native, check-read1 and check-readmore.
|
|
Simon pointed out that some recent patches of mine broke "catch
syscall". Apparently I forgot to finish the conversion of this code
when removing init_catchpoint. This patch completes the conversion
and fixes the bug.
|
|
After these two commits:
commit 4fb7bc4b147fd30b781ea2dad533956d0362295a
Date: Mon Mar 7 13:49:21 2022 +0000
readline: back-port changes needed to properly detect EOF
commit 91395d97d905c31ac38513e4aaedecb3b25e818f
Date: Tue Feb 15 17:28:03 2022 +0000
gdb: handle bracketed-paste-mode and EOF correctly
It was observed that, if a previous command is selected at the
readline prompt using the up arrow key, then when the command is
accepted (by pressing return) an unexpected 'quit' message will be
printed by GDB. Here's an example session:
(gdb) p 123
$1 = 123
(gdb) p 123
quit
$2 = 123
(gdb)
In this session the second 'p 123' was entered not by typing 'p 123',
but by pressing the up arrow key to select the previous command. It
is important that the up arrow key is used, typing Ctrl-p will not
trigger the bug.
The problem here appears to be readline's EOF detection when handling
multi-character input sequences. I have raised this issue on the
readline mailing list here:
https://lists.gnu.org/archive/html/bug-readline/2022-04/msg00012.html
a solution has been proposed here:
https://lists.gnu.org/archive/html/bug-readline/2022-04/msg00016.html
This patch includes a test for this issue as well as a back-port of
(the important bits of) readline commit:
commit 2ef9cec8c48ab1ae3a16b1874a49bd1f58eaaca1
Date: Wed May 4 11:18:04 2022 -0400
fix for setting RL_STATE_EOF in callback mode
That commit also includes some updates to the readline documentation
and tests that I have not included in this commit.
With this commit in place the unexpected 'quit' messages are resolved.
|
|
On PowerPC, the stop in the printf function is of the form:
Breakpoint 2, 0x00007ffff7c6ab08 in printf@@GLIBC_2.17 () from /lib64/libc.so.6
On other architectures the output looks like:
Breakpoint 2, 0x0000007fb7ea29ac in printf () from /lib/aarch64-linux-gnu/libc.so.6
The following patch modifies the printf test by matchine any character
starting immediately after the printf. The test now works for PowerPC
output as well as the output from other architectures.
The test has been run on a Power 10 system and and Intel x86_64 system.
|
|
This introduces a catchpoint class that is used as the base class for
all catchpoints. init_catchpoint is rewritten to be a constructor
instead.
This changes the hierarchy a little -- some catchpoints now inherit
from base_breakpoint whereas previously they did not. This isn't a
problem, as long as re_set is redefined in catchpoint.
|
|
This adds some initializers to tracepoint. I think right now these
may not be needed, due to obscure rules about zero initialization.
However, this will change in the next patch, and anyway it is clearer
to be explicit.
|
|
This removes init_raw_breakpoint_without_location, replacing it with a
constructor on 'breakpoint' itself. The subclasses and callers are
all updated.
|
|
It seems to me that breakpoint should use DISABLE_COPY_AND_ASSIGN.
This patch does this.
|
|
This adds a constructor to exception_catchpoint and simplifies the
caller.
|
|
This adds a constructor to syscall_catchpoint and simplifies the
caller.
|
|
This adds a constructor to signal_catchpoint and simplifies the
caller.
|
|
This adds a constructor to solib_catchpoint and simplifies the caller.
|
|
This adds a constructor to fork_catchpoint and simplifies the caller.
|
|
catch_exec_command_1 clears the new catchpoint's "exec_pathname"
field, but this is already done by virtue of calling "new".
|
|
This constifies breakpoint::print_recreate.
|
|
This constifies breakpoint::print_mention.
|
|
This constifies breakpoint::print_one.
|
|
This constifies breakpoint::print_it. Doing this pointed out some
code in ada-lang.c that can be simplified a little as well.
|
|
works_in_software_mode is only useful for watchpoints. This patch
moves it from breakpoint to watchpoint, and changes it to return bool.
|
|
This changes breakpoint::explains_signal to return bool.
|
|
The breakpoint::ops field is set but never used. This removes it.
|
|
This changes print_recreate_thread to be a method on breakpoint. This
function is only used as a helper by print_recreate methods, so I
thought this transformation made sense.
|
|
The break point after the stepi on Intel is the entry point of the user
signal handler function test_signal_handler. The code at the break point
looks like:
0x<hex address> <test_signal_handler>: endbr64
On PowerPC with a Linux 5.9 kernel or latter, the address where gdb stops
after the stepi is in the vdso code inserted by the kernel. The code at the
breakpoint looks like:
0x<hex address> <__kernel_start_sigtramp_rt64>: bctrl
This is different from other architectures. As discussed below, recent
kernel changes involving the vdso for PowerPC have been made changes to the
signal handler code flow. PowerPC is now stopping in function
__kernel_start_sigtramp_rt64. PowerPC now requires an additional stepi to
reach the user signal handler unlike other architectures.
The bp-permanent.exp and kill-after-signal tests run fine on PowerPC with an
kernel that is older than Linux 5.9.
The PowerPC 64 signal handler was updated by the Linux kernel 5.9-rc1:
commit id: 0138ba5783ae0dcc799ad401a1e8ac8333790df9
powerpc/64/signal: Balance return predictor stack in signal trampoline
An additional change to the PowerPC 64 signal handler was made in Linux
kernel version 5.11-rc7 :
commit id: 24321ac668e452a4942598533d267805f291fdc9
powerpc/64/signal: Fix regression in __kernel_sigtramp_rt64() semantics
The first kernel change, puts code into the user space signal handler (in
the vdso) as a performance optimization to prevent the call/return stack
from getting out of balance. The patch ensures that the entire
user/kernel/vdso cycle is balanced with the addition of the "brctl"
instruction.
The second patch, fixes the semantics of __kernel_sigtramp_rt64(). A new
symbol is introduced to serve as the jump target from the kernel to the
trampoline which now consists of two parts.
The above changes for PowerPC signal handler, causes gdb to stop in the
kernel code not the user signal handler as expected. The kernel dispatches
to the vdso code which in turn calls into the signal handler. PowerPC is
special in that the kernel is using a vdso instruction (bctrl) to enter the
signal handler.
I do not have access to a system with the first patch but not the second. I did
test on Power 9 with the Linux 5.15.0-27-generic kernel. Both tests fail on
this Power 9 system. The two tests also fail on Power 10 with the Linux
5.14.0-70.9.1.el9_0.ppc64le kernel.
The following patch fixes the issue by checking if gdb stopped at "signal
handler called". If gdb stopped there, the tests verifies gdb is in the kernel
function __kernel_start_sigtramp_rt64 then does an additional stepi to reach the
user signal handler. With the patch below, the tests run without errors on both
the Power 9 and Power 10 systems with out any failures.
|
|
When running test-case gdb.dwarf2/locexpr-data-member-location.exp with
target board unix/-fno-PIE/-no-pie/-m32 I run into:
...
(gdb) step^M
26 return 0;^M
(gdb) FAIL: gdb.dwarf2/locexpr-data-member-location.exp: step into foo
...
The problem is that the test-case tries to mimic some gdb_compile_shlib
behaviour using:
...
set flags {additional_flags=-fpic debug}
get_func_info foo $flags
...
but this doesn't work with the target board setting, because we end up doing:
...
gcc locexpr-data-member-location-lib.c -fpic -g -lm -fno-PIE -no-pie -m32 \
-o func_addr23029.x
...
while gdb_compile_shlib properly filters out the -fno-PIE -no-pie.
Consequently, the address for foo determined by get_func_info doesn't match
the actual address of foo.
Fix this by printing the address of foo using the result of gdb_compile_shlib.
|
|
gdbarch_iterate_over_objfiles_in_search_order callback
A rather straightforward patch to change an instance of callback +
void pointer to gdb::function_view, allowing pasing lambdas that
capture, and eliminating the need for the untyped pointer.
Change-Id: I73ed644e7849945265a2c763f79f5456695b0037
|
|
native-extended-gdbserver board
Running
$ make check TESTS="gdb.gdb/unittest.exp" RUNTESTFLAGS="--target_board=native-extended-gdbserver"
I get some failures:
Running selftest regcache::cooked_write_test::i386.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i386:intel.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i386:x64-32.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i386:x64-32:intel.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i386:x86-64.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i386:x86-64:intel.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i8086.^M
Self test failed: target already pushed^M
This is because the native-extended-gdbserver automatically connects GDB
to a GDBserver on startup, and therefore pushes a remote target on the
initial inferior. cooked_write_test is currently written in a way that
errors out if the current inferior has a process_stratum_target pushed.
Rewrite it to use scoped_mock_context, so it doesn't depend on the
current inferior (the current one upon entering the function).
Change-Id: I0357f989eacbdecc4bf88b043754451b476052ad
|
|
My patches yesterday to unify the DWARF index base classes had a bug
-- namely, I did the wholesale dynamic_cast-to-static_cast too hastily
and introduced a crash. This can be seen by trying to add an index to
a file that has an index, or by running a test like gdb-index-cxx.exp
using the cc-with-debug-names.exp target board.
This patch fixes the crash by introducing a new virtual method and
removing some of the static casts.
|
|
For some reason g++ 11.2.1 on s390x produces a spurious warning for
stringop-overread in debuginfod_is_enabled for url_view. Add a new
DIAGNOSTIC_IGNORE_STRINGOP_OVERREAD macro to suppress this warning.
include/ChangeLog:
* diagnostics.h (DIAGNOSTIC_IGNORE_STRINGOP_OVERREAD): New
macro.
gdb/ChangeLog:
* debuginfod-support.c (debuginfod_is_enabled): Use
DIAGNOSTIC_IGNORE_STRINGOP_OVERREAD on s390x.
|
|
start_remote_1 calls remote_check_symbols after things are set up to
give the remote side a chance to look up symbols. One call to
remote_check_symbols sets the "general thread", if needed, and sends one
qSymbol packet. However, a remote target could have more than one
process on initial connection, and this would send a qSymbol for only
one of these processes.
Change it to iterate on all the target's inferiors and send a qSymbol
packet for each one.
I tested this by changing gdbserver to spawn two processes on startup:
diff --git a/gdbserver/server.cc b/gdbserver/server.cc
index 33c42714e72..9b682e9f85f 100644
--- a/gdbserver/server.cc
+++ b/gdbserver/server.cc
@@ -3939,6 +3939,7 @@ captured_main (int argc, char *argv[])
/* Wait till we are at first instruction in program. */
target_create_inferior (program_path.get (), program_args);
+ target_create_inferior (program_path.get (), program_args);
/* We are now (hopefully) stopped at the first instruction of
the target process. This assumes that the target process was
Instead of hacking GDBserver, it should also be possible to test this by
starting manually two inferiors on an "extended-remote" connection,
disconnecting from GDBserver (with the disconnect command), and
re-connecting.
I was able to see qSymbol being sent for each inferior:
[remote] Sending packet: $Hgp828dc.828dc#1f
[remote] Packet received: OK
[remote] Sending packet: $qSymbol::#5b
[remote] Packet received: qSymbol:6764625f6167656e745f6764625f74705f686561705f627566666572
[remote] Sending packet: $qSymbol::6764625f6167656e745f6764625f74705f686561705f627566666572#1e
[remote] Packet received: qSymbol:6e70746c5f76657273696f6e
[remote] Sending packet: $qSymbol::6e70746c5f76657273696f6e#4d
[remote] Packet received: OK
[remote] Sending packet: $Hgp828dd.828dd#21
[remote] Packet received: OK
[remote] Sending packet: $qSymbol::#5b
[remote] Packet received: qSymbol:6764625f6167656e745f6764625f74705f686561705f627566666572
[remote] Sending packet: $qSymbol::6764625f6167656e745f6764625f74705f686561705f627566666572#1e
[remote] Packet received: qSymbol:6e70746c5f76657273696f6e
[remote] Sending packet: $qSymbol::6e70746c5f76657273696f6e#4d
[remote] Packet received: OK
Note that there would probably be more work to be done to fully support
this scenario, more things that need to be done for each discovered
inferior instead of just for one.
Change-Id: I21c4ecf6367391e2e389b560f0b4bd906cf6472f
|
|
Commit 152a17495663 ("gdb: prune inferiors at end of
fetch_inferior_event, fix intermittent failure of
gdb.threads/fork-plus-threads.exp") broke some tests with the
native-gdbserver board, such as:
(gdb) PASS: gdb.base/step-over-syscall.exp: detach-on-fork=off: follow-fork=child: break cond on target : vfork: break marker
continue^M
Continuing.^M
terminate called after throwing an instance of 'gdb_exception_error'^M
I can manually reproduce the issue by running (just the commands that
the test does as a one liner):
$ ./gdb -q --data-directory=data-directory \
testsuite/outputs/gdb.base/step-over-syscall/step-over-vfork \
-ex "tar rem | ../gdbserver/gdbserver - testsuite/outputs/gdb.base/step-over-syscall/step-over-vfork" \
-ex "b main" \
-ex c \
-ex "d 1" \
-ex "set displaced-stepping off" \
-ex "b *0x7ffff7d7ac5a if main == 0" \
-ex "set detach-on-fork off" \
-ex "set follow-fork-mode child" \
-ex c \
-ex "inferior 1" \
-ex "b marker" \
-ex c
... where 0x7ffff7d7ac5a is the exact address of the vfork syscall
(which can be found by looking at gdb.log).
The important part of the above is that a vfork syscall creates inferior
2, then inferior 2 executes until exit, then we switch back to inferior
1 and try to resume it.
The uncaught exception happens here:
#4 0x00005596969d81a9 in error (fmt=0x559692da9e40 "Cannot execute this command while the target is running.\nUse the \"interrupt\" command to stop the target\nand then try again.")
at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:43
#5 0x0000559695af6f66 in remote_target::putpkt_binary (this=0x617000038080, buf=0x559692da4380 "qSymbol::", cnt=9) at /home/simark/src/binutils-gdb/gdb/remote.c:9560
#6 0x0000559695af6aaf in remote_target::putpkt (this=0x617000038080, buf=0x559692da4380 "qSymbol::") at /home/simark/src/binutils-gdb/gdb/remote.c:9518
#7 0x0000559695ab50dc in remote_target::remote_check_symbols (this=0x617000038080) at /home/simark/src/binutils-gdb/gdb/remote.c:5141
#8 0x0000559695b3cccf in remote_new_objfile (objfile=0x0) at /home/simark/src/binutils-gdb/gdb/remote.c:14600
#9 0x0000559693bc52a9 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x61b0000167f8: 0x559695b3cb1d <remote_new_objfile(objfile*)>) at /usr/include/c++/11.2.0/bits/invoke.h:61
#10 0x0000559693bb2848 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x61b0000167f8: 0x559695b3cb1d <remote_new_objfile(objfile*)>) at /usr/include/c++/11.2.0/bits/invoke.h:111
#11 0x0000559693b8dddf in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7ffe0bae0590: 0x0) at /usr/include/c++/11.2.0/bits/std_function.h:291
#12 0x00005596956374b2 in std::function<void (objfile*)>::operator()(objfile*) const (this=0x61b0000167f8, __args#0=0x0) at /usr/include/c++/11.2.0/bits/std_function.h:560
#13 0x0000559695633c64 in gdb::observers::observable<objfile*>::notify (this=0x55969ef5c480 <gdb::observers::new_objfile>, args#0=0x0) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/observable.h:150
#14 0x0000559695df6cc2 in clear_symtab_users (add_flags=...) at /home/simark/src/binutils-gdb/gdb/symfile.c:2873
#15 0x000055969574c263 in program_space::~program_space (this=0x6120000c8a40, __in_chrg=<optimized out>) at /home/simark/src/binutils-gdb/gdb/progspace.c:154
#16 0x0000559694fc086b in delete_inferior (inf=0x61700003bf80) at /home/simark/src/binutils-gdb/gdb/inferior.c:205
#17 0x0000559694fc341f in prune_inferiors () at /home/simark/src/binutils-gdb/gdb/inferior.c:390
#18 0x0000559695017ada in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4293
#19 0x0000559694f629e6 in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:41
#20 0x0000559695b3b0e3 in remote_async_serial_handler (scb=0x6250001ef100, context=0x6170000380a8) at /home/simark/src/binutils-gdb/gdb/remote.c:14466
#21 0x0000559695c59eb7 in run_async_handler_and_reschedule (scb=0x6250001ef100) at /home/simark/src/binutils-gdb/gdb/ser-base.c:138
#22 0x0000559695c5a42a in fd_event (error=0, context=0x6250001ef100) at /home/simark/src/binutils-gdb/gdb/ser-base.c:189
#23 0x00005596969d9ebf in handle_file_event (file_ptr=0x60700005af40, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:574
#24 0x00005596969da7fa in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:700
#25 0x00005596969d8539 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
If I enable "set debug infrun" just before the last continue, we see:
(gdb) continue
Continuing.
[infrun] clear_proceed_status_thread: 965604.965604.0
[infrun] proceed: enter
[infrun] proceed: addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT
[infrun] scoped_disable_commit_resumed: reason=proceeding
[infrun] start_step_over: enter
[infrun] start_step_over: stealing global queue of threads to step, length = 0
[infrun] operator(): step-over queue now empty
[infrun] start_step_over: exit
[infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [965604.965604.0] at 0x7ffff7d7ac5c
[infrun] do_target_resume: resume_ptid=965604.0.0, step=0, sig=GDB_SIGNAL_0
[infrun] prepare_to_wait: prepare_to_wait
[infrun] reset: reason=proceeding
[infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target remote
[infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target remote
[infrun] proceed: exit
[infrun] fetch_inferior_event: enter
[infrun] scoped_disable_commit_resumed: reason=handling event
[infrun] do_target_wait: Found 2 inferiors, starting at #1
[infrun] random_pending_event_thread: None found.
[infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
[infrun] print_target_wait_results: 965604.965604.0 [Thread 965604.965604],
[infrun] print_target_wait_results: status->kind = VFORK_DONE
[infrun] handle_inferior_event: status->kind = VFORK_DONE
[infrun] context_switch: Switching context from 0.0.0 to 965604.965604.0
[infrun] handle_vfork_done: not waiting for a vfork-done event
[infrun] start_step_over: enter
[infrun] start_step_over: stealing global queue of threads to step, length = 0
[infrun] operator(): step-over queue now empty
[infrun] start_step_over: exit
[infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [965604.965604.0] at 0x7ffff7d7ac5c
[infrun] do_target_resume: resume_ptid=965604.0.0, step=0, sig=GDB_SIGNAL_0
[infrun] prepare_to_wait: prepare_to_wait
[infrun] reset: reason=handling event
[infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target remote
[infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target remote
terminate called after throwing an instance of 'gdb_exception_error'
What happens is:
- After doing the "continue" on inferior 1, the remote target gives us
a VFORK_DONE event. The core ignores it and resumes inferior 1.
- Since prune_inferiors is now called after each handled event, in
fetch_inferior_event, it is called after we handled that VFORK_DONE
event and resumed inferior 1.
- Inferior 2 is pruned, which (see backtrace above) causes its program
space to be deleted, which clears the symtabs for that program space,
which calls the new_objfile observable and remote_new_objfile
observer (with a nullptr objfile, to indicate that the previously
loaded symbols have been discarded), which calls
remote_check_symbols.
remote_check_symbols is the function that sends the qSymbol packet, to
let the remote side ask for symbol addresses. The problem is that the
remote target is working in all-stop / sync mode and is currently
resumed. It has sent a vCont packet to resume the target and is waiting
for a stop reply. It can't send any packets in the mean time. That
causes the exception to be thrown.
This wasn't a problem before, when prune_inferiors was called in
normal_stop, because it was always called at a time the target was not
resumed.
An important observation here is that the new_objfile observable is
invoked for a change in inferior 2's program space (inferior 2's program
space is the current program space). Inferior 2 isn't bound to any
process on the remote side (it has exited, that's why it's being
pruned). It doesn't make sense to try to send a qSymbol packet for a
process that doesn't exist on the remote side. remote_check_symbols
actually attempts to avoid that:
/* The remote side has no concept of inferiors that aren't running
yet, it only knows about running processes. If we're connected
but our current inferior is not running, we should not invite the
remote target to request symbol lookups related to its
(unrelated) current process. */
if (!target_has_execution ())
return;
The problem here is that while inferior 2's program space is the current
program space, inferior 1 is the current inferior. So the check above
passes, since inferior has execution. We therefore try to send a
qSymbol packet for inferior 1 in reaction to a change in inferior 2's
program space, that's wrong.
This exposes a conceptual flaw in remote_new_objfile. The "new_objfile"
event concerns a specific program space, which can concern multiple
inferiors, as inferiors can share a program space. We shouldn't
consider the current inferior at all, but instead all inferiors bound to
the affected program space. Especially since the current inferior can
be unrelated to the current program space at that point.
To be clear, we are in this state because ~program_space sets itself as
the current program space, but there is no more inferior having that
program space to switch to, inferior 2 has already been unlinked.
To fix this, make remote_new_objfile iterate on all inferiors bound to
the affected program space. Remove the target_has_execution check from
remote_check_symbols, replace it with an assert. All callers must
ensure that the current inferior has execution before calling it.
Change-Id: Ica643145bcc03115248290fd310cadab8ec8371c
|
|
|
|
|
|
|
|
This permits resolving TLS variables.
|
|
Derive the pointer to the DTV array from the tpidr register.
|
|
|
|
|