Age | Commit message (Collapse) | Author | Files | Lines |
|
this patch fixes a crash in gdbserver whenever a process is detached.
the crash is caused by `detach` calling `remove_process` before `win32_clear_inferiors`
error message:
Detaching from process 184
../../gdbserver/inferiors.cc:160: A problem internal to GDBserver has been detec
ted.
remove_process: Assertion `find_thread_process (process) == NULL' failed.
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Building on the last commit, which added a general --debug=COMPONENT
option to the gdbserver command line, this commit updates the monitor
command to allow for general:
(gdb) monitor set debug COMPONENT off|on
style commands. Just like with the previous commit, the COMPONENT can
be any one of all, threads, remote, event-loop, and correspond to the
same set of global debug flags.
While on the command line it is possible to do:
--debug=remote,event-loop,threads
the components have to be entered one at a time with the monitor
command. I guess there's no reason why we couldn't allow component
grouping within the monitor command, but (to me) what I have here
seemed more in the spirit of GDB's existing 'set debug ...' commands.
If people want it then we can always add component grouping later.
Notice in the above that I use 'off' and 'on' instead of '0' and '1',
which is what the 'monitor set debug' command used to use. The 0/1
can still be used, but I now advertise off/on in all the docs and help
text, again, this feels more inline with GDB's existing boolean
settings.
I have removed the two existing monitor commands:
monitor set remote-debug 0|1
monitor set event-loop-debug 0|1
These are replaced by:
monitor set debug remote off|on
monitor set debug event-loop off|on
respectively.
Approved-By: Tom Tromey <tom@tromey.com>
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
|
|
Currently, gdbserver has the following command line options related to
debugging output:
--debug
--remote-debug
--event-loop-debug
This doesn't scale well. If I want an extra debug component I need to
add another command line flag.
This commit changes --debug to take a list of components.
The currently supported components are: all, threads, remote, and
event-loop. The 'threads' component represents the debug we currently
get from the --debug option. And if --debug is used without a
component list then the threads component is assumed as the default.
Currently the threads component actually includes a lot of output that
is not really threads related. In the future I'd like to split this
up into some new, separate components. But that is not part of this
commit, or even this series.
The special component 'all' does what you'd expect: enables debug
output from all supported components.
The component list is parsed left to write, and you can prefix a
component with '-' to disable that component, so I can write:
target> gdbserver --debug=all,-event-loop
to get debug for all components except the event-loop component.
I've removed the existing --remote-debug and --event-loop-debug
command line options, these are equivalent to --debug=remote and
--debug=event-loop respectively, or --debug=remote,event-loop to
enable both components.
In this commit I've only update the command line options, in the next
commit I'll update the monitor commands to support a similar
interface.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
I noticed that gdbserver/win32-low.cc has a fall-through comment that
should have been converted to use the annotation instead.
This patch makes the change.
|
|
On Linux, threads are treated much like separate processes by the
kernel. In particular, it's possible to ptrace just a single thread.
If gdb tries to attach to a multi-threaded inferior, where a non-main
thread is already being traced (e.g., by strace), then gdb will get
into an infinite loop attempting to attach.
This patch fixes this problem by having the attach fail if ptrace
fails to attach to any thread of the inferior.
|
|
C++17 makes the second parameter to static_assert optional, so we can
remove gdb_static_assert now.
|
|
This changes the various gdb-related directories to use
-Wimplicit-fallthrough=5, meaning that only the fallthrough attribute
can be used in switches -- special 'fallthrough' comments will no
longer be usable.
Approved-By: Pedro Alves <pedro@palves.net>
|
|
This changes gdb to use the C++17 [[fallthrough]] attribute rather
than special comments.
This was mostly done by script, but I neglected a few spellings and so
also fixed it up by hand.
I suspect this fixes the bug mentioned below, by switching to a
standard approach that, presumably, clang supports.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=23159
Approved-By: John Baldwin <jhb@FreeBSD.org>
Approved-By: Luis Machado <luis.machado@arm.com>
Approved-By: Pedro Alves <pedro@palves.net>
|
|
This introduces throw_winerror_with_name, a Windows analog of
perror_with_name, and changes various places in gdb to call it.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30770
|
|
After this commit:
commit 0b04e52316079981b2b77124198a405d826a05cd
Date: Sun Jan 19 14:33:37 2014 -0700
link gdbserver against libiberty
we can cleanup how the help text is generated in monitor_show_help.
This doesn't change the output that the user will see -- it just folds
multiple monitor_output calls into one.
There should be no user visible change after this commit.
|
|
Since GDB now requires C++17, we don't need the internally maintained
gdb::optional implementation. This patch does the following replacing:
- gdb::optional -> std::optional
- gdb::in_place -> std::in_place
- #include "gdbsupport/gdb_optional.h" -> #include <optional>
This change has mostly been done automatically. One exception is
gdbsupport/thread-pool.* which did not use the gdb:: prefix as it
already lives in the gdb namespace.
Change-Id: I19a92fa03e89637bab136c72e34fd351524f65e9
Approved-By: Tom Tromey <tom@tromey.com>
Approved-By: Pedro Alves <pedro@palves.net>
|
|
Doing this on behalf of Arsen as obvious.
* gdb: Regenerate.
* gdbserver: Regenerate.
* gprofng: Regenerate.
|
|
On Linux, a thread can only be 16 bytes (including the trailing \0).
A user sent in a test case where this causes a truncated UTF-8
sequence, causing gdbserver to create invalid XML.
I went back and forth about different ways to solve this, and in the
end decided to fix it in gdbserver, with the reason being that it
seems important to generate correct XML for the <thread> response.
I am not totally sure whether the call to setlocale could have
unplanned consequences. This is needed, though, for nl_langinfo to
return the correct result.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30618
|
|
If GDB sets the GDB_THREAD_OPTION_EXIT option on a thread, then if the
thread disappears from the thread list, GDB expects to shortly see a
thread exit event for it. See e.g., here, in
remote_target::update_thread_list():
/* Do not remove the thread if we've requested to be
notified of its exit. For example, the thread may be
displaced stepping, infrun will need to handle the
exit event, and displaced stepping info is recorded
in the thread object. If we deleted the thread now,
we'd lose that info. */
if ((tp->thread_options () & GDB_THREAD_OPTION_EXIT) != 0)
continue;
There's one scenario that is deleting a thread from the
remote/gdbserver thread list without ever reporting a corresponding
thread exit event though -- check_zombie_leaders. This can lead to
GDB getting confused. For example, with a following patch that
enables GDB_THREAD_OPTION_EXIT whenever schedlock is enabled, we'd see
this regression:
$ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver" TESTS="gdb.threads/no-unwaited-for-left.exp"
...
Running src/gdb/testsuite/gdb.threads/no-unwaited-for-left.exp ...
FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
... some more cascading FAILs ...
gdb.log shows:
(gdb) continue
Continuing.
FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
A passing run would have resulted in:
(gdb) continue
Continuing.
No unwaited-for children left.
(gdb) PASS: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits
note how the leader thread is not listed in the remote-reported XML
thread list below:
(gdb) set debug remote 1
(gdb) set debug infrun 1
(gdb) info threads
Id Target Id Frame
* 1 Thread 1163850.1163850 "no-unwaited-for" main () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:65
3 Thread 1163850.1164130 "no-unwaited-for" [remote] Sending packet: $Hgp11c24a.11c362#39
(gdb) c
Continuing.
[infrun] clear_proceed_status_thread: 1163850.1163850.0
...
[infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [1163850.1163850.0] at 0x55555555534f
[remote] Sending packet: $QPassSignals:#f3
[remote] Packet received: OK
[remote] Sending packet: $QThreadOptions;3:p11c24a.11c24a#f3
[remote] Packet received: OK
...
[infrun] target_set_thread_options: [options for Thread 1163850.1163850 are now 0x3]
...
[infrun] do_target_resume: resume_ptid=1163850.1163850.0, step=0, sig=GDB_SIGNAL_0
[remote] Sending packet: $vCont;c:p11c24a.11c24a#98
[infrun] prepare_to_wait: prepare_to_wait
[infrun] reset: reason=handling event
[infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote
[infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target extended-remote
[infrun] fetch_inferior_event: exit
[infrun] fetch_inferior_event: enter
[infrun] scoped_disable_commit_resumed: reason=handling event
[infrun] random_pending_event_thread: None found.
[remote] wait: enter
[remote] Packet received: N
[remote] wait: exit
[infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
[infrun] print_target_wait_results: -1.0.0 [process -1],
[infrun] print_target_wait_results: status->kind = NO_RESUMED
[infrun] handle_inferior_event: status->kind = NO_RESUMED
[remote] Sending packet: $Hgp0.0#ad
[remote] Packet received: OK
[remote] Sending packet: $qXfer:threads:read::0,1000#92
[remote] Packet received: l<threads>\n<thread id="p11c24a.11c362" core="0" name="no-unwaited-for" handle="0097d8f7ff7f0000"/>\n</threads>\n
[infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed)
...
... however, infrun decided there was a resumed thread still, so
ignored the TARGET_WAITKIND_NO_RESUMED event. Debugging GDB, we see
that the "found resumed" thread that GDB finds, is the leader thread.
Even though that thread is not on the remote-reported thread list, it
is still on the GDB thread list, due to the special case in remote.c
mentioned above.
This commit addresses the issue by fixing GDBserver to report a thread
exit event for the zombie leader too, i.e., making GDBserver respect
the "if thread has GDB_THREAD_OPTION_EXIT set, report a thread exit"
invariant. To do that, it takes a bit more code than one would
imagine off hand, due to the fact that we currently always report LWP
exit pending events as TARGET_WAITKIND_EXITED, and then decide whether
to convert it to TARGET_WAITKIND_THREAD_EXITED just before reporting
the event to GDBserver core. For the zombie leader scenario
described, we need to record early on that we want to report a
THREAD_EXITED event, and then make sure that decision isn't lost along
the way to reporting the event to GDBserver core.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I1e68fccdbc9534434dee07163d3fd19744c8403b
|
|
Normally, if the last resumed thread on the target exits, the server
sends a no-resumed event to GDB. If however, GDB enables the
GDB_THREAD_OPTION_EXIT option on a thread, and, that thread exits, the
server sends a thread exit event for that thread instead.
In all-stop RSP mode, since events can only be forwarded to GDB one at
a time, and the whole target stops whenever an event is reported, GDB
resumes the target again after getting a THREAD_EXITED event, and then
the server finally reports back a no-resumed event if/when
appropriate.
For non-stop RSP though, events are asynchronous, and if the server
sends a thread-exit event for the last resumed thread, the no-resumed
event is never sent. This patch makes sure that in non-stop mode, the
server queues a no-resumed event after the thread-exit event if it was
the last resumed thread that exited.
Without this, we'd see failures in step-over-thread-exit testcases
added later in the series, like so:
continue
Continuing.
- No unwaited-for children left.
- (gdb) PASS: gdb.threads/step-over-thread-exit.exp: displaced-stepping=off: non-stop=on: target-non-stop=on: schedlock=off: ns_stop_all=1: continue stops when thread exits
+ FAIL: gdb.threads/step-over-thread-exit.exp: displaced-stepping=off: non-stop=on: target-non-stop=on: schedlock=off: ns_stop_all=1: continue stops when thread exits (timeout)
(and other similar ones)
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I927d78b30f88236dbd5634b051a716f72420e7c7
|
|
This implements support for the new GDB_THREAD_OPTION_EXIT thread
option for Linux GDBserver.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I96b719fdf7fee94709e98bb3a90751d8134f3a38
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
|
|
gdbserver's linux_process_target::wait loops if:
- called in sync mode, and,
- wait_1 returns TARGET_WAITKIND_IGNORE, _and_,
- wait_1 also returns null_ptid.
The null_ptid check fails however when this path is taken:
ptid_t
linux_process_target::filter_exit_event (lwp_info *event_child,
target_waitstatus *ourstatus)
{
...
if (!is_leader (thread))
{
if (report_exit_events_for (thread))
ourstatus->set_thread_exited (0);
else
ourstatus->set_ignore (); <<<<<<<
delete_lwp (event_child);
}
return ptid;
}
This makes linux_process_target::wait return TARGET_WAITKIND_IGNORE in
sync mode, which is unexpected by the core and fails an assertion.
This commit fixes it by just making linux_process_target::wait loop if
it got a TARGET_WAITKIND_IGNORE, irrespective of event_ptid.
Change-Id: I39776908a6c75cbd68aa04139ffcf7be334868cf
|
|
Currently, GDB does not understand the THREAD_EXITED stop reply in
remote all-stop mode. There's no good reason for this, it just
happened that THREAD_EXITED was only ever reported in non-stop mode so
far. This patch teaches GDB to parse that event in all-stop RSP too.
There is no need to add a qSupported feature for this, because the
server won't send a THREAD_EXITED event unless GDB explicitly asks for
it, with QThreadEvents, or with the GDB_THREAD_OPTION_EXIT
QThreadOptions option added in the next patch.
Change-Id: Ide5d12391adf432779fe4c79526801c4a5630966
|
|
This commit extends the logic added by these two commits from a while
ago:
#1 7b961964f866 (gdbserver: hide fork child threads from GDB),
#2 df5ad102009c (gdb, gdbserver: detach fork child when detaching from fork parent)
... to handle thread clone events, which are very similar to (v)fork
events.
For #1, we want to hide clone children as well, so just update the
comments.
For #2, unlike (v)fork children, pending clone children aren't full
processes, they're just threads, so don't detach them in
handle_detach. linux-low.cc will take care of detaching them along
with all other threads of the process, there's nothing special that
needs to be done.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I7f5901d07efda576a2522d03e183994e071b8ffc
|
|
This patch teaches the Linux GDBserver backend to report clone events
to GDB, when GDB has requested them with the GDB_THREAD_OPTION_CLONE
thread option, via the new QThreadOptions packet.
This shuffles code in linux_process_target::handle_extended_wait
around to a more logical order when we now have to handle and
potentially report all of fork/vfork/clone.
Raname lwp_info::fork_relative -> lwp_info::relative as the field is
no longer only about (v)fork.
With this, gdb.threads/stepi-over-clone.exp now cleanly passes against
GDBserver, so remove the native-target-only requirement from that
testcase.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I3a19bc98801ec31e5c6fdbe1ebe17df855142bb2
|
|
A previous patch taught GDB about a new TARGET_WAITKIND_THREAD_CLONED
event kind, and made the Linux target report clone events.
A following patch will teach Linux GDBserver to do the same thing.
However, for remote debugging, it wouldn't be ideal for GDBserver to
report every clone event to GDB, when GDB only cares about such events
in some specific situations. Reporting clone events all the time
would be potentially chatty. We don't enable thread create/exit
events all the time for the same reason. Instead we have the
QThreadEvents packet. QThreadEvents is target-wide, though.
This patch makes GDB instead explicitly request that the target
reports clone events or not, on a per-thread basis.
In order to be able to do that with GDBserver, we need a new remote
protocol feature. Since a following patch will want to enable thread
exit events on per-thread basis too, the packet introduced here is
more generic than just for clone events. It lets you enable/disable a
set of options at once, modelled on Linux ptrace's PTRACE_SETOPTIONS.
IOW, this commit introduces a new QThreadOptions packet, that lets you
specify a set of per-thread event options you want to enable. The
packet accepts a list of options/thread-id pairs, similarly to vCont,
processed left to right, with the options field being a number
interpreted as a bit mask of options. The only option defined in this
commit is GDB_THREAD_OPTION_CLONE (0x1), which ask the remote target
to report clone events. Another patch later in the series will
introduce another option.
For example, this packet sets option "1" (clone events) on thread
p1000.2345:
QThreadOptions;1:p1000.2345
and this clears options for all threads of process 1000, and then sets
option "1" (clone events) on thread p1000.2345:
QThreadOptions;0:p1000.-1;1:p1000.2345
This clears options of all threads of all processes:
QThreadOptions;0
The target reports the set of supported options by including
"QThreadOptions=<supported options>" in its qSupported response.
infrun is then tweaked to enable GDB_THREAD_OPTION_CLONE when stepping
over a breakpoint.
Unlike PTRACE_SETOPTIONS, fork/vfork/clone children do NOT inherit
their parent's thread options. This is so that GDB can send e.g.,
"QThreadOptions;0;1:TID" without worrying about threads it doesn't
know about yet.
Documentation for this new remote protocol feature is included in a
documentation patch later in the series.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Ie41e5093b2573f14cf6ac41b0b5804eba75be37e
|
|
The previous patch taught GDB about a new
TARGET_WAITKIND_THREAD_CLONED event kind, and made the Linux target
report clone events.
A following patch will teach Linux GDBserver to do the same thing.
But before we get there, we need to teach the remote protocol about
TARGET_WAITKIND_THREAD_CLONED. That's what this patch does. Clone is
very similar to vfork and fork, and the new stop reply is likewise
handled similarly. The stub reports "T05clone:...".
GDBserver core is taught to handle TARGET_WAITKIND_THREAD_CLONED and
forward it to GDB in this patch, but no backend actually emits it yet.
That will be done in a following patch.
Documentation for this new remote protocol feature is included in a
documentation patch later in the series.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: If271f20320d864f074d8ac0d531cc1a323da847f
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
|
|
common-defs.h has a few defines that I suspect were used during the
transition to C++. These aren't needed any more, so remove them.
Tested by rebuilding.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
This patch proposes to require a C++17 compiler to build gdb /
gdbsupport / gdbserver. Before this patch, GDB required a C++11
compiler.
The general policy regarding bumping C++ language requirement in GDB (as
stated in [1]) is:
Our general policy is to wait until the oldest compiler that
supports C++NN is at least 3 years old.
Rationale: We want to ensure reasonably widespread compiler
availability, to lower barrier of entry to GDB contributions, and to
make it easy for users to easily build new GDB on currently
supported stable distributions themselves. 3 years should be
sufficient for latest stable releases of distributions to include a
compiler for the standard, and/or for new compilers to appear as
easily installable optional packages. Requiring everyone to build a
compiler first before building GDB, which would happen if we
required a too-new compiler, would cause too much inconvenience.
See the policy proposal and discussion
[here](https://sourceware.org/ml/gdb-patches/2016-10/msg00616.html).
The first GCC release which with full C++17 support is GCC-9[2],
released in 2019[3], which is over 4 years ago. Clang has had C++17
support since Clang-5[4] released in 2018[5].
A discussions with many distros showed that a C++17-able compiler is
always available, meaning that this no hard requirement preventing us to
require it going forward.
[1] https://sourceware.org/gdb/wiki/Internals%20GDB-C-Coding-Standards#When_is_GDB_going_to_start_requiring_C.2B-.2B-NN_.3F
[2] https://gcc.gnu.org/projects/cxx-status.html#cxx17
[3] https://gcc.gnu.org/gcc-9/
[4] https://clang.llvm.org/cxx_status.html
[5] https://releases.llvm.org/
Change-Id: Id596f5db17ea346e8a978668825787b3a9a443fd
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>
Approved-By: Pedro Alves <pedro@palves.net>
|
|
This patch upgrades gdb/ax_cxx_compile_stdcxx.m4 to follow changes
available in [1] and regenerates the configure script.
[1] https://www.gnu.org/software/autoconf-archive/ax_cxx_compile_stdcxx.html
Change-Id: I5b16adc65c9e48a13ad65202d58ab7a9d487214e
Approved-By: Tom Tromey <tom@tromey.com>
Approved-By: Pedro Alves <pedro@palves.net>
|
|
Overview
========
Consider the following situation, GDB is in non-stop mode, the main
thread is running while a second thread is stopped. The user has the
second thread selected as the current thread and asks GDB to detach.
At the exact moment of detach the main thread exits.
This situation currently causes crashes, assertion failures, and
unexpected errors to be reported from GDB for both native and remote
targets.
This commit addresses this situation for native and remote targets.
There are a number of different fixes, but all are required in order
to get this functionality working correct for native and remote
targets.
Native Linux Target
===================
For the native Linux target, detaching is handled in the function
linux_nat_target::detach. In here we call stop_wait_callback for each
thread, and it is this callback that will spot that the main thread
has exited.
GDB then detaches from everything except the main thread by calling
detach_callback.
After this the first problem is this assert:
/* Only the initial process should be left right now. */
gdb_assert (num_lwps (pid) == 1);
The num_lwps call will return 0 as the main thread has exited and all
of the other threads have now been detached. I fix this by changing
the assert to allow for 0 or 1 lwps at this point. As the 0 case can
only happen in non-stop mode, the assert becomes:
gdb_assert (num_lwps (pid) == 1
|| (target_is_non_stop_p () && num_lwps (pid) == 0));
The next problem is that we do:
main_lwp = find_lwp_pid (ptid_t (pid));
and then proceed assuming that main_lwp is not nullptr. In the case
that the main thread has exited though, main_lwp will be nullptr.
However, we only need main_lwp so that GDB can detach from the
thread. If the main thread has exited, and GDB has already detached
from every other thread, then GDB has finished detaching, GDB can skip
the calls that try to detach from the main thread, and then tell the
user that the detach was a success.
For Remote Targets
==================
On remote targets there are two problems.
First is that when the exit occurs during the early phase of the
detach, we see the stop notification arrive while GDB is removing the
breakpoints ahead of the detach. The 'set debug remote on' trace
looks like this:
[remote] Sending packet: $z0,7f1648fe0241,1#35
[remote] Notification received: Stop:W0;process:2a0ac8
# At this point an unpatched gdbserver segfaults, and the connection
# is broken. A patched gdbserver continues as below...
[remote] Packet received: E01
[remote] Sending packet: $z0,7f1648ff00a8,1#68
[remote] Packet received: E01
[remote] Sending packet: $z0,7f1648ff132f,1#6b
[remote] Packet received: E01
[remote] Sending packet: $D;2a0ac8#3e
[remote] Packet received: E01
I was originally running into Segmentation Faults, from within
gdbserver/mem-break.cc, in the function find_gdb_breakpoint. This
function calls current_process() and then dereferences the result to
find the breakpoint list.
However, in our case, the current process has already exited, and so
the current_process() call returns nullptr. At the point of failure,
the gdbserver backtrace looks like this:
#0 0x00000000004190e4 in find_gdb_breakpoint (z_type=48 '0', addr=4198762, kind=1) at ../../src/gdbserver/mem-break.cc:982
#1 0x000000000041930d in delete_gdb_breakpoint (z_type=48 '0', addr=4198762, kind=1) at ../../src/gdbserver/mem-break.cc:1093
#2 0x000000000042d8db in process_serial_event () at ../../src/gdbserver/server.cc:4372
#3 0x000000000042dcab in handle_serial_event (err=0, client_data=0x0) at ../../src/gdbserver/server.cc:4498
...
The problem is that, as a result non-stop being on, the process
exiting is only reported back to GDB after the request to remove a
breakpoint has been sent. Clearly gdbserver can't actually remove
this breakpoint -- the process has already exited -- so I think the
best solution is for gdbserver just to report an error, which is what
I've done.
The second problem I ran into was on the gdb side, as the process has
already exited, but GDB has not yet acknowledged the exit event, the
detach -- the 'D' packet in the above trace -- fails. This was being
reported to the user with a 'Can't detach process' error. As the test
actually calls detach from Python code, this error was then becoming a
Python exception.
Though clearly the detach has returned an error, and so, maybe, having
GDB throw an error would be fine, I think in this case, there's a good
argument that the remote error can be ignored -- if GDB tries to
detach and gets back an error, and if there's a pending exit event for
the pid we tried to detach, then just ignore the error and pretend the
detach worked fine.
We could possibly check for a pending exit event before sending the
detach packet, however, I believe that it might be possible (in
non-stop mode) for the stop notification to arrive after the detach is
sent, but before gdbserver has started processing the detach. In this
case we would still need to check for pending stop events after seeing
the detach fail, so I figure there's no point having two checks -- we
just send the detach request, and if it fails, check to see if the
process has already exited.
Testing
=======
In order to test this issue I needed to ensure that the exit event
arrives at the same time as the detach call. The window of
opportunity for getting the exit to arrive is so small I've never
managed to trigger this in real use -- I originally spotted this issue
while working on another patch, which did manage to trigger this
issue.
However, if we trigger both the exit and the detach from a single
Python function then we never return to GDB's event loop, as such GDB
never processes the exit event, and so the first time GDB gets a
chance to see the exit is during the detach call. And so that is the
approach I've taken for testing this patch.
Tested-By: Kevin Buettner <kevinb@redhat.com>
Approved-By: Kevin Buettner <kevinb@redhat.com>
|
|
I noticed that in handle_v_run (gdbserver/server.cc) we leak
new_program_name (a string) each time GDB starts an inferior, in the
case where GDB passes a program name to gdbserver.
This bug was introduced with this commit:
commit 7ab2607f97e5deaeae91018edf3ef5b4255a842c
Date: Wed Apr 13 17:31:02 2022 -0400
gdbsupport: make gdb_abspath return an std::string
When gdbserver receives a program name from GDB, this is first placed
into a malloc'd buffer within handle_v_run, and this buffer is then
used in this call:
program_path.set (new_program_name);
Prior to the above commit this call took ownership of the buffer
passed to it, but now this call uses the buffer to initialise a
std::string, which copies the buffer contents, leaving ownership with
the caller. So now, after this call (in handle_v_run)
new_program_name still owns a buffer.
At no point in handle_v_run do we free new_program_name, as a result
we are leaking the program name each time GDB starts a remote
inferior.
I could solve this by adding a 'free' call into handle_v_run, but I'd
rather automate the memory management.
So, to this end, I have added a new function in gdbserver/server.cc,
decode_v_run_arg. This function takes care of allocating the memory
buffer and decoding the vRun packet into the buffer, but returns a
gdb::unique_xmalloc_ptr<char> (or nullptr on error).
Back in handle_v_run I have converted new_program_name to also be a
gdb::unique_xmalloc_ptr<char>.
Now, after we call program_path.set(), the allocated buffer will be
automatically released when it is no longer needed.
It is worth highlighting that within the new decode_v_run_arg
function, I have wrapped the call to hex2bin in a try/catch block.
The hex2bin function can throw an exception if it encounters an
invalid (non-hex) character. Back in handle_v_run, we have a local
argument new_argv, which is of type std::vector<char *>. Each
'char *' in this vector is a malloc'd buffer. If we allow
hex2bin to throw an exception and don't catch it in either
decode_v_run_arg or handle_v_run then we are going to leak memory from
new_argv.
I chose to catch the exception in decode_v_run_arg, this seemed
cleanest, but I'm not sure it really matters, so long as the exception
is caught before we leave handle_v_run. I am working on a patch that
changes new_argv to automatically manage its memory, but that isn't
ready for posting yet. I think what I have here would be fine if my
follow on patch never arrives.
Additionally, within the handle_v_run loop I have changed an
assignment of nullptr to new_program_name into an assert. Previously,
the assignment could only trigger on the first iteration of the loop,
if we had no new program name to assign. However, new_program_name
always starts as nullptr, so, on the first loop iteration, if we have
nothing to assign to new_program_name, its value must already be
nullptr.
There should be no user visible changes after this commit.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
A user pointed out that the -lsocket check in gdb should also apply to
gdbserver -- otherwise it can't find the Solaris socketpair. This
patch makes the change. It also removes a couple of redundant
function checks from gdb's configure.ac.
This was tested by the person who reported the bug.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30927
Approved-By: Pedro Alves <pedro@palves.net>
|
|
After this commit:
commit 6a65998a8a94abaaae7ca4ff0ab9c3f25dc2e766
Date: Mon Sep 11 12:42:00 2023 +0100
Convert tdesc's expedite_regs to a string vector
The risc-v, loongarch, and csky gdbserver builds were broken. A use
of target_desc::expedite_regs (for each architecture) was not updated
to take account of the type change.
I've tested that this fixes the risc-v build. I haven't tested the
other architectures, but they should be fine.
|
|
After the previous commit there is now a redundant string copy in
handle_v_run, this commit cleans that up.
There should be no functional change after this commit.
During review I was pointed to this older series:
https://inbox.sourceware.org/gdb-patches/20211022071933.3478427-1-m.weghorn@posteo.de/
which also includes this fix as part of a larger set of changes. I'm
giving a Co-Authored-By credit to the author of that original series.
I believe this smaller fix brings some benefits on its own, though the
original series does offer additional improvements. Once this is
merged I'll take a look at rebasing and resubmitting the original series.
Co-Authored-By: Michael Weghorn <m.weghorn@posteo.de>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Similarly to how single quotes were mishandled, which was fixed two
commits ago, this commit fixes handling of newlines in arguments
passed to gdbserver.
We already had a test that covered this, gdb.base/args.exp, which,
when run with the native-extended-gdbserver board contained several
KFAIL covering this situation.
In this commit I remove the unnecessary, attempt to quote incoming
newlines within arguments, and do some minimal cleanup of the related
code. There is additional cleanup that can be done, but I'm leaving
that for the next commit.
Then I've removed the KFAIL from the test case, and performed some
minimal cleanup there too.
After this commit the gdb.base/args.exp is 100% passing with the
native-extended-gdbserver board file.
During review I was pointed to this older series:
https://inbox.sourceware.org/gdb-patches/20211022071933.3478427-1-m.weghorn@posteo.de/
which also includes this fix as part of a larger set of changes. I'm
giving a Co-Authored-By credit to the author of that original series.
I believe this smaller fix brings some benefits on its own, though the
original series does offer additional improvements. Once this is
merged I'll take a look at rebasing and resubmitting the original series.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27989
Co-Authored-By: Michael Weghorn <m.weghorn@posteo.de>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
When I posted the previous patch for review Andreas Schwab pointed out
that passing a trailing empty argument also doesn't work.
The fix for this is in the same area of code as the previous patch,
but is sufficiently different that I felt it deserved a patch of its
own.
I noticed that passing arguments containing single quotes to gdbserver
didn't work correctly:
gdb -ex 'set sysroot' --args /tmp/show-args
Reading symbols from /tmp/show-args...
(gdb) target extended-remote | gdbserver --once --multi - /tmp/show-args
Remote debugging using | gdbserver --once --multi - /tmp/show-args
stdin/stdout redirected
Process /tmp/show-args created; pid = 176054
Remote debugging using stdio
Reading symbols from /lib64/ld-linux-x86-64.so.2...
(No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
0x00007ffff7fd3110 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) set args abc ""
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /tmp/show-args \'
stdin/stdout redirected
Process /tmp/show-args created; pid = 176088
2 args are:
/tmp/show-args
abc
Done.
[Inferior 1 (process 176088) exited normally]
(gdb) target native
Done. Use the "run" command to start a process.
(gdb) run
Starting program: /tmp/show-args \'
2 args are:
/tmp/show-args
abc
Done.
[Inferior 1 (process 176095) exited normally]
(gdb) q
The 'shows-args' program used here just prints the arguments passed to
the inferior.
Notice that when starting the inferior using the extended-remote
target there is only a single argument 'abc', while when using the
native target there is a second argument, the blank line, representing
the empty argument.
The problem here is that the vRun packet coming from GDB looks like
this (I've removing the trailing checksum):
$vRun;PROGRAM_NAME;616263;
If we compare this to a packet with only a single argument and no
trailing empty argument:
$vRun;PROGRAM_NAME;616263
Notice the lack of the trailing ';' character here.
The problem is that gdbserver processes this string in a loop. At
each point we maintain a pointer to the character just after a ';',
and then we process everything up to either the next ';' character, or
to the end of the string.
We break out of this loop when the character we start with (in that
loop iteration) is the null-character. This means in the trailing
empty argument case, we abort the loop before doing anything with the
empty argument.
In this commit I've updated the loop, we now break out using a 'break'
statement at the end of the loop if the (sub-)string we just processed
was empty, with this change we now notice the trailing empty
argument.
I've updated the test case to cover this issue.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
I noticed that passing arguments containing single quotes to gdbserver
didn't work correctly:
gdb -ex 'set sysroot' --args /tmp/show-args
Reading symbols from /tmp/show-args...
(gdb) target extended-remote | gdbserver --once --multi - /tmp/show-args
Remote debugging using | gdbserver --once --multi - /tmp/show-args
stdin/stdout redirected
Process /tmp/show-args created; pid = 176054
Remote debugging using stdio
Reading symbols from /lib64/ld-linux-x86-64.so.2...
(No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
0x00007ffff7fd3110 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) set args \'
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /tmp/show-args \'
stdin/stdout redirected
Process /tmp/show-args created; pid = 176088
2 args are:
/tmp/show-args
\'
Done.
[Inferior 1 (process 176088) exited normally]
(gdb) target native
Done. Use the "run" command to start a process.
(gdb) run
Starting program: /tmp/show-args \'
2 args are:
/tmp/show-args
'
Done.
[Inferior 1 (process 176095) exited normally]
(gdb) q
The 'shows-args' program used here just prints the arguments passed to
the inferior.
Notice that when starting the inferior using the extended-remote
target the second argument is "\'", while when running using native
target the argument is "'". The second of these is correct, the \'
used with the "set args" command is just to show GDB that the single
quote is not opening an argument string.
It turns out that the extra backslash is injected on the gdbserver
side when gdbserver processes the arguments that GDB passes it, the
code that does this was added as part of this much larger commit:
commit 2090129c36c7e582943b7d300968d19b46160d84
Date: Thu Dec 22 21:11:11 2016 -0500
Share fork_inferior et al with gdbserver
In this commit I propose removing the specific code that adds what I
believe is a stray backslash. I've extended an existing test to cover
this case, and I now see identical behaviour when using an
extended-remote target as with the native target.
This partially fixes PR gdb/27989, though there are still some issues
with newline handling which I'll address in a later commit.
During review I was pointed to this older series:
https://inbox.sourceware.org/gdb-patches/20211022071933.3478427-1-m.weghorn@posteo.de/
which also includes this fix as part of a larger set of changes. I'm
giving a Co-Authored-By credit to the author of that original series.
I believe this smaller fix brings some benefits on its own, though the
original series does offer additional improvements. Once this is
merged I'll take a look at rebasing and resubmitting the original series.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27989
Co-Authored-By: Michael Weghorn <m.weghorn@posteo.de>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This patch teaches gdbserver about the SME2 and the ZT0 register.
Since most of the code used by gdbserver for SME2 is shared with gdb, this
is a rather small patch that reuses most of the code put in place for native
AArch64 Linux.
Validated under Fast Models.
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
|
|
Enable SME support in gdbserver by adjusting the usual fields. There is
not much to this patch because the code is either in gdb or it is shared
between gdbserver and gdb. One exception is the bump to gdbserver's
PBUFSIZ from 18432 to 131104.
Since the ZA register can be quite big (256 * 256 bytes), the g/G remote
packet will also become quite big
From gdbserver/tdesc.cc:init_target_desc, I estimated the new size should
be at least (2 * 256 * 256 + 32), which yields 131104.
It is also unlikely we will find a process starting up with SVL set to 256.
Ideally we'd adjust the packet size dynamically based on what we need, but
for now this should do.
Please note we have the same limitation for SME that we have for SVE, and
that is the fact gdbserver cannot communicate vector length changes to gdb
via the remote protocol.
Thiago is working on this improvement, which hopefully will be able to be
adapted to SME in an easy way.
Co-Authored-By: Ezra Sitorus <ezra.sitorus@arm.com>
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
|
|
Instead of using static arrays, build the list of expedited registers
dynamically using a std::vector.
This refactor shouldn't cause any user-visible changes.
Regression-tested for aarch64-linux Ubuntu 22.04/20.04.
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
|
|
Right now the list of expedited registers is stored as an array of char *,
with a nullptr element at the end to signal its last element.
Convert expedite_regs to a std::vector of std::string so it is easier to
manage the elements and the storage is handled automatically.
Eventually we might want to convert all the target functions so they pass a
std::vector of std::string as well. Or maybe expose an interface that target can
use to add expedited registers on-by-one depending on the target description
discovery needs, as opposed to just a static list of char *.
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
The SME (Scalable Matrix Extension) [1] exposes a new matrix register ZA with
variable sizes. It also exposes a new mode called streaming mode.
Similarly to SVE, the ZA register size is dictated by a vector length, but the
SME vector length is called streaming vetor length. The total size for
ZA in a given moment is svl x svl.
In streaming mode, the SVE registers have their sizes based on svl rather than
the regular vector length (vl).
The feature detection is controlled by the HWCAP2_SME bit, but actual support
should be validated by attempting a ptrace call for one of the new register
sets: NT_ARM_ZA and NT_ARM_SSVE.
Due to its large size, the ZA register is exposed as a vector of bytes, but we
introduce a number of pseudo-registers that gives various different views
into the ZA contents. These can be arranged in a couple categories: tiles and
tile slices.
Tiles are matrices the same size or smaller than ZA. Tile slices are vectors
which map to ZA's rows/columns in different ways.
A new dynamic target description is provided containing the ZA register, the SVG
register and the SVCR register. The size of ZA, like the SVE vector registers,
is based on the vector length register SVG (VG for SVE).
This patch enables SME register support for gdb.
[1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/scalable-matrix-extension-armv9-a-architecture
Co-Authored-By: Ezra Sitorus <ezra.sitorus@arm.com>
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
|
|
This is a patch in preparation to upcoming patches enabling SME support. It
attempts to simplify the gdb/gdbserver shared interface used to read/write
SVE registers.
Where the current code makes use of unique_ptr, allocating a new buffer by
hand and passing a buffer around, this patch makes that code use
gdb::byte_vector and passes a reference to this byte vector to the functions,
allowing the functions to have ready access to the size of the buffer.
It also shares a bit more code between gdb and gdbserver, in particular around
handling of ptrace get/set requests for SVE.
I think gdbserver could be refactored to handle register reads/writes more
like gdb's native layer as opposed to letting the generic linux-low layer do
the ptrace calls. This is not very flexible and assumes one size for the
responses. If you have something like NT_ARM_SVE, where you can have either
FPSIMD or SVE contents, it doesn't work that well.
I didn't want to change that interface right now as it is a bit too much work
and touches all the targets, some of which I can't easily test.
Hence the reason why the buffer the generic linux-now passes down to
linux-aarch64-low is unused or ignored.
No user-visible changes should happen as part of this refactor other than a
slightly reworded warning message.
While doing the refactor, I also noticed what seems to be a mistake in checking
if the register cache contains active (non-zero) SVE data.
For instance, the original code did something like this in
aarch64_sve_regs_copy_from_reg_buf:
has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_Z0_REGNUM + i
reg, sizeof (__int128_t));
"reg" is a zeroed-out buffer that we compare the Z register contents
past the first 128 bits. The problem here is that raw_compare returns
1 if the contents compare the same, which means has_sve_state will be
true. But if we compared the Z register contents to 0, it means we
*do not* have SVE state, and therefore has_sve_state should be false.
The consequence of this mistake is that we convert the initial
FPSIMD-formatted data we get from ptrace for the NT_ARM_SVE register
set to a SVE-formatted one.
In the end, this doesn't cause user-visible differences because the
values of both the Z and V registers will still be the same. But the
logic is not correct.
I used the opportunity to fix this, and it gets tested later on by
the additional SME tests.
I do plan on submitting some SVE-specific tests to make sure we have
a bit more coverage in GDB's testsuite.
Regression-tested on aarch64-linux Ubuntu 22.04/20.04.
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
|
|
In preparation to the SME support patches, rename the SVE-specific files to
something a bit more meaningful that can be shared with the SME code.
In this case, I've renamed the "sve" in the names to "scalable".
No functional changes.
Regression-tested on aarch64-linux Ubuntu 22.04/20.04.
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
|
|
I noticed a comment by an include and remembered that I think these
don't really provide much value -- sometimes they are just editorial,
and sometimes they are obsolete. I think it's better to just remove
them. Tested by rebuilding.
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
On a machine with AVX512 support (AMD EPYC 9634), I see these failures:
$ make check TESTS="gdb.arch/i386-avx512.exp" RUNTESTFLAGS="--target_board=native-gdbserver"
...
FAIL: gdb.arch/i386-avx512.exp: check contents of zmm_data[16] after writing ZMM regs
FAIL: gdb.arch/i386-avx512.exp: check contents of zmm_data[17] after writing ZMM regs
FAIL: gdb.arch/i386-avx512.exp: check contents of zmm_data[18] after writing ZMM regs
...
The problem can be reduced to:
(gdb) print $zmm16.v8_int64
$1 = {0, 0, 0, 0, 0, 0, 0, 0}
(gdb) print $zmm16.v8_int64 = {11,22,33,44,55,66,77,88}
$2 = {11, 22, 33, 44, 55, 66, 77, 88}
(gdb) print $zmm16.v8_int64
$3 = {11, 22, 33, 44, 55, 66, 77, 88}
(gdb) step
5 ++x;
(gdb) print $zmm16.v8_int64
$4 = {11, 22, 77, 88, 0, 0, 0, 0}
Writing to the local regcache in GDB works fine, but the writeback to
gdbserver (which happens when resuming / stepping) doesn't work (the
code being stepped doesn't touch AVX registers, so we don't expect the
value of zmm16 to change when stepping).
The problem is on the gdbserver side, the zmmh and ymmh portions of the
zmm register are not memcpied at the right place in the xsave buffer. Fix
that. Note now how the two modified memcpy calls match the memcmp calls
just above them.
With this patch, gdb.arch/i386-avx512.exp passes completely for me.
Change-Id: I22c417e0f5e88d4bc635a0f08f8817a031c76433
Reviewed-by: John Baldwin <jhb@FreeBSD.org>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30818
|
|
This safeguards a couple of places that may theoretically return NULL but
must not in this specific context. These were found by a static analysis tool.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
|
|
- Reuse num_xmm_registers directly for the count of ZMM0-15 registers
as is already done for the YMM registers for AVX rather than using
a new variable that is always the same.
- Replace 3 identical variables for the count of upper ZMM16-31
registers with a single variable. Make use of this to merge
various loops working on the ZMM XSAVE region so that all of the
handling for the various sub-registers in this region are always
handled in a single loop.
- While here, fix some bugs in i387_cache_to_xsave where if
X86_XSTATE_ZMM was set on i386 (e.g. a 32-bit process on a 64-bit
kernel), the -1 register nums would wrap around and store the value
of GPRs in the XSAVE area. This should be harmless, but is
definitely odd. Instead, check num_zmm_high_registers directly when
checking X86_XSTATE_ZMM and skip the ZMM region handling entirely if
the register count is 0.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Replace the extended state area fields of i387_xsave with methods which
return an offset into the XSAVE buffer.
The two changed functions are called within all tests which runs
gdbserver.
Signed-off-by: Aleksandar Paunovic <aleksandar.paunovic@intel.com>
Co-authored-by: John Baldwin <jhb@FreeBSD.org>
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Legacy fields of the XSAVE area are already defined within fx_save
struct. Use class inheritance to remove code duplication.
The two changed functions are called within all tests which run
gdbserver.
Signed-off-by: Aleksandar Paunovic <aleksandar.paunovic@intel.com>
Co-authored-by: John Baldwin <jhb@FreeBSD.org>
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Make x86_xcr0 private to i387-fp.cc and use i387_set_xsave_mask to set
the value instead. Add a static global instance of x86_xsave_layout
and initialize it in the new function as well to be used in a future
commit to parse XSAVE extended state regions.
Update the Linux port to use this function rather than setting
x86_xcr0 directly. In the case that XML is not supported, don't
bother setting x86_xcr0 to the default value but just omit the call to
i387_set_xsave_mask as i387-fp.cc defaults to the SSE case used for
non-XML.
In addition, use x86_xsave_length to determine the size of the XSAVE
register set via CPUID.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
resume_stopped_resumed_lwps
At the moment, while performing a software single-step, gdbserver fails
to reinsert software single-step breakpoints for a LWP when
interrupted by a signal in another thread. This commit fixes this
problem by reinstalling software single-step breakpoints in
linux_process_target::resume_stopped_resumed_lwps in
gdbserver/linux-low.cc.
This bug was discovered due to a failing assert in maybe_hw_step()
in gdbserver/linux-low.cc. Looking at the backtrace revealed
that the caller was linux_process_target::resume_stopped_resumed_lwps.
I was uncertain whether the assert should still be valid when called
from that method, so I tried hoisting the assert from maybe_hw_step
to all callers except resume_stopped_resumed_lwps. But running the
new test case, described below, showed that merely eliminating the
assert for this case was NOT a good fix - a study of the log file for
the test showed that the single-step operation failed to occur.
Instead GDB (via gdbserver) stopped at the next breakpoint that was
hit.
Zhiyong Yan had proposed a fix which resinserted software single-step
breakpoints, albeit at a different location in linux-low.cc. Testing
revealed that, while running gdb.threads/pending-fork-event-detach,
the executable associated with that test would die due to a SIGTRAP
after the test program was detached. Examination of the core file(s)
showed that a breakpoint instruction had been left in program memory.
Test results were otherwise very good, so Zhiyong was definitely on
the right track!
This commit causes software single-step breakpoint(s) to be inserted
before the call to maybe_hw_step in resume_stopped_resumed_lwps. This
will cause 'has_single_step_breakpoints (thread)' to be true, so that
the assert in maybe_hw_step...
/* GDBserver must insert single-step breakpoint for software
single step. */
gdb_assert (has_single_step_breakpoints (thread));
...will no longer fail. And better still, the single-step breakpoints
are reinstalled, so that stepping will actually work, even when
interrupted.
The C code for the test case was loosely adapted from the reproducer
provided in Zhiyong's bug report for this problem. The .exp file was
copied from next-fork-other-thread.exp and then tweaked slightly. As
noted in a comment in next-fork-exec-other-thread.exp, I had to remove
"on" from the loop for non-stop as it was failing on all architectures
(including x86-64) that I tested. I have a feeling that it ought to
work, but this can be investigated separately and (re)enabled once it
works. I also increased the number of iterations for the loop running
the "next" commands. I've had some test runs which don't show the bug
until the loop counter exceeded 100 iterations. The C file for the
new test uses shorter delays than next-fork-other-thread.c though, so
it doesn't take overly long (IMO) to run this new test.
Running the new test on a Raspberry Pi w/ a 32-bit (Arm) kernel and
userland using a gdbserver build without the fix in this commit shows
the following results:
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=auto: i=12: next to other line
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=on: i=9: next to other line
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=off: i=18: next to other line
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=auto: i=3: next to other line
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=on: i=11: next to other line
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=off: i=1: next to other line
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=auto: non-stop=off: displaced-stepping=auto: i=1: next to break here
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=auto: non-stop=off: displaced-stepping=on: i=3: next to break here
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=auto: non-stop=off: displaced-stepping=off: i=1: next to break here
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=auto: i=47: next to other line
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=on: i=57: next to other line
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=off: non-stop=off: displaced-stepping=auto: i=1: next to break here
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=off: non-stop=off: displaced-stepping=on: i=10: next to break here
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=off: non-stop=off: displaced-stepping=off: i=1: next to break here
=== gdb Summary ===
# of unexpected core files 12
# of expected passes 3011
# of unexpected failures 14
Each of the 12 core files were caused by the failed assertion in
maybe_hw_step in linux-low.c. These correspond to 12 of the
unexpected failures.
When the tests are run using a gdbserver build which includes the fix
in this commit, the results are significantly better, but not perfect:
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=auto: i=143: next to other line
FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=on: i=25: next to other line
=== gdb Summary ===
# of expected passes 10178
# of unexpected failures 2
I think that the two remaining failures are due to some different
problem. They are also racy - I've seen runs with no failures or only
one failure, but never more than two. Also, those runs were conducted
with the loop count in next-fork-exec-other-thread.exp set to 200.
During his testing of this fix and the new test case, Luis Machado
found that this test was taking a long time and asked about ways to
speed it up. I then conducted additional tests in which I gradually
reduced the loop count, timing each one, also noting the number of
failures. With the loop count set to 30, I found that I could still
reliably reproduce the failures that Zhiyong reported (in which, with
the proper settings, core files are created). But, with the loop
count set to 30, the other failures noted above were much less likely
to show up. Anyone wishing to investigate those other failures should
set the loop count back up to 200.
Running the new test on x86-64 and aarch64, both native and
native-gdbserver shows no failures.
Also, I see no regressions when running the entire test suite for
armv7l-unknown-linux-gnueabihf (i.e. the Raspberry Pi w/ 32-bit
kernel+userland) with --target_board=native-gdbserver. Additionally,
using --target_board=native-gdbserver, I also see no regressions for
the entire test suite for x86-64 and aarch64 running Fedora 38.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30387
Co-Authored-By: Zhiyong Yan <zhiyong.yan@windriver.com>
Tested-By: Zhiyong Yan <zhiyong.yan@windriver.com>
Tested-By: Luis Machado <luis.machado@arm.com>
|
|
It was pointed out[1] that after this commit:
commit 3812b38d8de5804ad3eadd6c7a5d532402ddabab
Date: Thu Oct 20 11:14:33 2022 +0100
gdbserver: allow agent expressions to fail with invalid memory access
Now that agent expressions might fail with the error
expr_eval_invalid_memory_access, we might overflow the
eval_result_names array in tracepoint.cc. This is because the
eval_result_names array does not include a string for either
expr_eval_invalid_goto or expr_eval_invalid_memory_access.
I don't know if having expr_eval_invalid_goto missing is also a
problem, but it feels like eval_result_names should just include a
string for every possible error.
I could just add two more strings into the array, but I figure that a
more robust solution will be to move all of the error types, and their
associated strings, into a new ax-result-types.def file, and to then
include this file in both ax.h and tracepoint.cc in order to build
the enum eval_result_type and the eval_result_names string array.
Doing this means it is impossible to have a missing error string in
the future.
[1] https://inbox.sourceware.org/gdb-patches/01059f8a-0e59-55b5-f530-190c26df5ba3@palves.net/
Approved-By: Pedro Alves <pedro@palves.net>
|