Age | Commit message (Collapse) | Author | Files | Lines |
|
Currently, every internal_error call must be passed __FILE__/__LINE__
explicitly, like:
internal_error (__FILE__, __LINE__, "foo %d", var);
The need to pass in explicit __FILE__/__LINE__ is there probably
because the function predates widespread and portable variadic macros
availability. We can use variadic macros nowadays, and in fact, we
already use them in several places, including the related
gdb_assert_not_reached.
So this patch renames the internal_error function to something else,
and then reimplements internal_error as a variadic macro that expands
__FILE__/__LINE__ itself.
The result is that we now should call internal_error like so:
internal_error ("foo %d", var);
Likewise for internal_warning.
The patch adjusts all calls sites. 99% of the adjustments were done
with a perl/sed script.
The non-mechanical changes are in gdbsupport/errors.h,
gdbsupport/gdb_assert.h, and gdb/gdbarch.py.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Change-Id: Ia6f372c11550ca876829e8fd85048f4502bdcf06
|
|
I looked through all the uses of static_cast<... *> in gdb and
converted many of them to checked_static_cast.
I couldn't test a few of these changes.
|
|
gdb_bfd.c and remote.c contain identical implementations of a
fileio_error -> errno function. Factor that out to
gdbsupport/fileio.{h,cc}.
Rename it fileio_error_to_host, for symmetry with host_to_fileio_error.
Change-Id: Ib9b8807683de2f809c94a5303e708acc2251a0df
|
|
Converting from free-form macros to an enum gives a bit of type-safety.
This caught places where we would assign host error numbers to what
should contain a target fileio error number, for instance in
target_fileio_pread.
I added the FILEIO_SUCCESS enumerator, because
remote.c:remote_hostio_parse_result initializes the remote_errno output
variable to 0. It seems better to have an explicit enumerator than to
assign a value for which there is no enumerator. I considered
initializing this variable to FILEIO_EUNKNOWN instead, such that if the
remote side replies with an error and omits the errno value, we'll get
an errno that represents an error instead of 0 (which reprensents no
error). But it's not clear what the consequences of that change would
be, so I prefer to err on the side of caution and just keep the existing
behavior (there is no intended change in behavior with this patch).
Note that remote_hostio_parse_resul still reads blindly what the remote
side sends as a target errno into this variable, so we can still end up
with a nonsensical value here. It's not good, but out of the scope of
this patch.
Convert host_to_fileio_error and fileio_errno_to_host to return / accept
a fileio_error instead of an int, and cascade the change in the whole
chain that uses that.
Change-Id: I454b0e3fcf0732447bc872252fa8e57d138b0e03
|
|
I don't see why include/gdb/fileio.h is placed there. It's not
installed by "make install", and it's not included by anything outside
of gdb/gdbserver/gdbsupport.
Move its content back to gdbsupport/fileio.h. I have omitted the bits
inside an `#if 0`, since it's obviously not used, as well as the
"limits" constants, which are also unused.
Change-Id: I6fbc2ea10fbe4cfcf15f9f76006b31b99c20e5a9
|
|
gdbarch implements its own registry-like approach. This patch changes
it to instead use registry.h. It's a rather large patch but largely
uninteresting -- it's mostly a straightforward conversion from the old
approach to the new one.
The main benefit of this change is that it introduces type safety to
the gdbarch registry. It also removes a bunch of code.
One possible drawback is that, previously, the gdbarch registry
differentiated between pre- and post-initialization setup. This
doesn't seem very important to me, though.
|
|
This changes struct objfile to use a gdb_bfd_ref_ptr. In addition to
removing some manual memory management, this fixes a use-after-free
that was introduced by the registry rewrite series. The issue there
was that, in some cases, registry shutdown could refer to memory that
had already been freed. This help fix the bug by delaying the
destruction of the BFD reference (and thus the per-bfd object) until
after the registry has been shut down.
|
|
These two macros print either a 16 digit hex number or an 8 digit
hex number. Unfortunately they depend on both target and host, which
means that the output for 32-bit targets may be either 8 or 16 hex
digits.
Replace them in most cases with code that prints a bfd_vma using
PRIx64. In some cases, deliberately lose the leading zeros.
This change some output, notably in base/offset fields of m68k
disassembly which I think looks better that way, and in error
messages. I've kept leading zeros in symbol dumps (objdump -t)
and in PE header dumps.
bfd/
* bfd-in.h (fprintf_vma, sprintf_vma, printf_vma): Delete.
* bfd-in2.h: Regenerate.
* bfd.c (bfd_sprintf_vma): Don't use sprintf_vma.
(bfd_fprintf_vma): Don't use fprintf_vma.
* coff-rs6000.c (xcoff_reloc_type_tls): Don't use sprintf_vma.
Instead use PRIx64 to print bfd_vma values.
(xcoff_ppc_relocate_section): Likewise.
* cofflink.c (_bfd_coff_write_global_sym): Likewise.
* mmo.c (mmo_write_symbols_and_terminator): Likewise.
* srec.c (srec_write_symbols): Likewise.
* elf32-xtensa.c (print_r_reloc): Similarly for fprintf_vma.
* pei-x86_64.c (pex64_dump_xdata): Likewise.
(pex64_bfd_print_pdata_section): Likewise.
* som.c (som_print_symbol): Likewise.
* ecoff.c (_bfd_ecoff_print_symbol): Use bfd_fprintf_vma.
opcodes/
* dis-buf.c (perror_memory, generic_print_address): Don't use
sprintf_vma. Instead use PRIx64 to print bfd_vma values.
* i386-dis.c (print_operand_value, print_displacement): Likewise.
* m68k-dis.c (print_base, print_indexed): Likewise.
* ns32k-dis.c (print_insn_arg): Likewise.
* ia64-gen.c (_opcode_int64_low, _opcode_int64_high): Delete.
(opcode_fprintf_vma): Delete.
(print_main_table): Use PRIx64 to print opcode.
binutils/
* od-macho.c: Replace all uses of printf_vma with bfd_printf_vma.
* objcopy.c (copy_object): Don't use sprintf_vma. Instead use
PRIx64 to print bfd_vma values.
(copy_main): Likewise.
* readelf.c (CHECK_ENTSIZE_VALUES): Likewise.
(dynamic_section_mips_val): Likewise.
(print_vma): Don't use printf_vma. Instead use PRIx64 to print
bfd_vma values.
(dump_ia64_vms_dynamic_fixups): Likewise.
(process_version_sections): Likewise.
* rddbg.c (stab_context): Likewise.
gas/
* config/tc-i386.c (offset_in_range): Don't use sprintf_vma.
Instead use PRIx64 to print bfd_vma values.
(md_assemble): Likewise.
* config/tc-mips.c (load_register, macro): Likewise.
* messages.c (as_internal_value_out_of_range): Likewise.
* read.c (emit_expr_with_reloc): Likewise.
* config/tc-ia64.c (note_register_values): Don't use fprintf_vma.
Instead use PRIx64 to print bfd_vma values.
(print_dependency): Likewise.
* listing.c (list_symbol_table): Use bfd_sprintf_vma.
* symbols.c (print_symbol_value_1): Use %p to print pointers.
(print_binary): Likewise.
(print_expr_1): Use PRIx64 to print bfd_vma values.
* write.c (print_fixup): Use %p to print pointers. Don't use
fprintf_vma.
* testsuite/gas/all/overflow.l: Update expected output.
* testsuite/gas/m68k/mcf-mov3q.d: Likewise.
* testsuite/gas/m68k/operands.d: Likewise.
* testsuite/gas/s12z/truncated.d: Likewise.
ld/
* deffilep.y (def_file_print): Don't use fprintf_vma. Instead
use PRIx64 to print bfd_vma values.
* emultempl/armelf.em (gld${EMULATION_NAME}_finish): Don't use
sprintf_vma. Instead use PRIx64 to print bfd_vma values.
* emultempl/pe.em (gld${EMULATION_NAME}_finish): Likewise.
* ldlang.c (lang_map): Use %V to print region origin.
(lang_one_common): Don't use sprintf_vma.
* ldmisc.c (vfinfo): Don't use fprintf_vma or sprintf_vma.
* pe-dll.c (pe_dll_generate_def_file): Likewise.
gdb/
* remote.c (remote_target::trace_set_readonly_regions): Replace
uses of sprintf_vma with bfd_sprintf_vma.
|
|
This rewrites registry.h, removing all the macros and replacing it
with relatively ordinary template classes. The result is less code
than the previous setup. It replaces large macros with a relatively
straightforward C++ class, and now manages its own cleanup.
The existing type-safe "key" class is replaced with the equivalent
template class. This approach ended up requiring relatively few
changes to the users of the registry code in gdb -- code using the key
system just required a small change to the key's declaration.
All existing users of the old C-like API are now converted to use the
type-safe API. This mostly involved changing explicit deletion
functions to be an operator() in a deleter class.
The old "save/free" two-phase process is removed, and replaced with a
single "free" phase. No existing code used both phases.
The old "free" callbacks took a parameter for the enclosing container
object. However, this wasn't truly needed and is removed here as
well.
|
|
This changes the parameter of target_ops::async from int to bool.
Regression tested on x86-64 Fedora 34.
|
|
This converts location_spec_to_string to a method of location_spec,
simplifying the code using it, as it no longer has to use
std::unique_ptr::get().
Change-Id: I621bdad8ea084470a2724163f614578caf8f2dd5
|
|
Currently, GDB internally uses the term "location" for both the
location specification the user input (linespec, explicit location, or
an address location), and for actual resolved locations, like the
breakpoint locations, or the result of decoding a location spec to
SaLs. This is expecially confusing in the breakpoints module, as
struct breakpoint has these two fields:
breakpoint::location;
breakpoint::loc;
"location" is the location spec, and "loc" is the resolved locations.
And then, we have a method called "locations()", which returns the
resolved locations as range...
The location spec type is presently called event_location:
/* Location we used to set the breakpoint. */
event_location_up location;
and it is described like this:
/* The base class for all an event locations used to set a stop event
in the inferior. */
struct event_location
{
and even that is incorrect... Location specs are used for finding
actual locations in the program in scenarios that have nothing to do
with stop events. E.g., "list" works with location specs.
To clean all this confusion up, this patch renames "event_location" to
"location_spec" throughout, and then all the variables that hold a
location spec, they are renamed to include "spec" in their name, like
e.g., "location" -> "locspec". Similarly, functions that work with
location specs, and currently have just "location" in their name are
renamed to include "spec" in their name too.
Change-Id: I5814124798aa2b2003e79496e78f95c74e5eddca
|
|
This changes target_pid_to_exec_file and target_ops::pid_to_exec_file
to return a "const char *". I couldn't build many of these targets,
but did examine the code by hand -- also, as this only affects the
return type, it's normally pretty safe. This brings gdb and gdbserver
a bit closer, and allows for the removal of a const_cast as well.
|
|
start_remote_1 calls remote_check_symbols after things are set up to
give the remote side a chance to look up symbols. One call to
remote_check_symbols sets the "general thread", if needed, and sends one
qSymbol packet. However, a remote target could have more than one
process on initial connection, and this would send a qSymbol for only
one of these processes.
Change it to iterate on all the target's inferiors and send a qSymbol
packet for each one.
I tested this by changing gdbserver to spawn two processes on startup:
diff --git a/gdbserver/server.cc b/gdbserver/server.cc
index 33c42714e72..9b682e9f85f 100644
--- a/gdbserver/server.cc
+++ b/gdbserver/server.cc
@@ -3939,6 +3939,7 @@ captured_main (int argc, char *argv[])
/* Wait till we are at first instruction in program. */
target_create_inferior (program_path.get (), program_args);
+ target_create_inferior (program_path.get (), program_args);
/* We are now (hopefully) stopped at the first instruction of
the target process. This assumes that the target process was
Instead of hacking GDBserver, it should also be possible to test this by
starting manually two inferiors on an "extended-remote" connection,
disconnecting from GDBserver (with the disconnect command), and
re-connecting.
I was able to see qSymbol being sent for each inferior:
[remote] Sending packet: $Hgp828dc.828dc#1f
[remote] Packet received: OK
[remote] Sending packet: $qSymbol::#5b
[remote] Packet received: qSymbol:6764625f6167656e745f6764625f74705f686561705f627566666572
[remote] Sending packet: $qSymbol::6764625f6167656e745f6764625f74705f686561705f627566666572#1e
[remote] Packet received: qSymbol:6e70746c5f76657273696f6e
[remote] Sending packet: $qSymbol::6e70746c5f76657273696f6e#4d
[remote] Packet received: OK
[remote] Sending packet: $Hgp828dd.828dd#21
[remote] Packet received: OK
[remote] Sending packet: $qSymbol::#5b
[remote] Packet received: qSymbol:6764625f6167656e745f6764625f74705f686561705f627566666572
[remote] Sending packet: $qSymbol::6764625f6167656e745f6764625f74705f686561705f627566666572#1e
[remote] Packet received: qSymbol:6e70746c5f76657273696f6e
[remote] Sending packet: $qSymbol::6e70746c5f76657273696f6e#4d
[remote] Packet received: OK
Note that there would probably be more work to be done to fully support
this scenario, more things that need to be done for each discovered
inferior instead of just for one.
Change-Id: I21c4ecf6367391e2e389b560f0b4bd906cf6472f
|
|
Commit 152a17495663 ("gdb: prune inferiors at end of
fetch_inferior_event, fix intermittent failure of
gdb.threads/fork-plus-threads.exp") broke some tests with the
native-gdbserver board, such as:
(gdb) PASS: gdb.base/step-over-syscall.exp: detach-on-fork=off: follow-fork=child: break cond on target : vfork: break marker
continue^M
Continuing.^M
terminate called after throwing an instance of 'gdb_exception_error'^M
I can manually reproduce the issue by running (just the commands that
the test does as a one liner):
$ ./gdb -q --data-directory=data-directory \
testsuite/outputs/gdb.base/step-over-syscall/step-over-vfork \
-ex "tar rem | ../gdbserver/gdbserver - testsuite/outputs/gdb.base/step-over-syscall/step-over-vfork" \
-ex "b main" \
-ex c \
-ex "d 1" \
-ex "set displaced-stepping off" \
-ex "b *0x7ffff7d7ac5a if main == 0" \
-ex "set detach-on-fork off" \
-ex "set follow-fork-mode child" \
-ex c \
-ex "inferior 1" \
-ex "b marker" \
-ex c
... where 0x7ffff7d7ac5a is the exact address of the vfork syscall
(which can be found by looking at gdb.log).
The important part of the above is that a vfork syscall creates inferior
2, then inferior 2 executes until exit, then we switch back to inferior
1 and try to resume it.
The uncaught exception happens here:
#4 0x00005596969d81a9 in error (fmt=0x559692da9e40 "Cannot execute this command while the target is running.\nUse the \"interrupt\" command to stop the target\nand then try again.")
at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:43
#5 0x0000559695af6f66 in remote_target::putpkt_binary (this=0x617000038080, buf=0x559692da4380 "qSymbol::", cnt=9) at /home/simark/src/binutils-gdb/gdb/remote.c:9560
#6 0x0000559695af6aaf in remote_target::putpkt (this=0x617000038080, buf=0x559692da4380 "qSymbol::") at /home/simark/src/binutils-gdb/gdb/remote.c:9518
#7 0x0000559695ab50dc in remote_target::remote_check_symbols (this=0x617000038080) at /home/simark/src/binutils-gdb/gdb/remote.c:5141
#8 0x0000559695b3cccf in remote_new_objfile (objfile=0x0) at /home/simark/src/binutils-gdb/gdb/remote.c:14600
#9 0x0000559693bc52a9 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x61b0000167f8: 0x559695b3cb1d <remote_new_objfile(objfile*)>) at /usr/include/c++/11.2.0/bits/invoke.h:61
#10 0x0000559693bb2848 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x61b0000167f8: 0x559695b3cb1d <remote_new_objfile(objfile*)>) at /usr/include/c++/11.2.0/bits/invoke.h:111
#11 0x0000559693b8dddf in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7ffe0bae0590: 0x0) at /usr/include/c++/11.2.0/bits/std_function.h:291
#12 0x00005596956374b2 in std::function<void (objfile*)>::operator()(objfile*) const (this=0x61b0000167f8, __args#0=0x0) at /usr/include/c++/11.2.0/bits/std_function.h:560
#13 0x0000559695633c64 in gdb::observers::observable<objfile*>::notify (this=0x55969ef5c480 <gdb::observers::new_objfile>, args#0=0x0) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/observable.h:150
#14 0x0000559695df6cc2 in clear_symtab_users (add_flags=...) at /home/simark/src/binutils-gdb/gdb/symfile.c:2873
#15 0x000055969574c263 in program_space::~program_space (this=0x6120000c8a40, __in_chrg=<optimized out>) at /home/simark/src/binutils-gdb/gdb/progspace.c:154
#16 0x0000559694fc086b in delete_inferior (inf=0x61700003bf80) at /home/simark/src/binutils-gdb/gdb/inferior.c:205
#17 0x0000559694fc341f in prune_inferiors () at /home/simark/src/binutils-gdb/gdb/inferior.c:390
#18 0x0000559695017ada in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4293
#19 0x0000559694f629e6 in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:41
#20 0x0000559695b3b0e3 in remote_async_serial_handler (scb=0x6250001ef100, context=0x6170000380a8) at /home/simark/src/binutils-gdb/gdb/remote.c:14466
#21 0x0000559695c59eb7 in run_async_handler_and_reschedule (scb=0x6250001ef100) at /home/simark/src/binutils-gdb/gdb/ser-base.c:138
#22 0x0000559695c5a42a in fd_event (error=0, context=0x6250001ef100) at /home/simark/src/binutils-gdb/gdb/ser-base.c:189
#23 0x00005596969d9ebf in handle_file_event (file_ptr=0x60700005af40, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:574
#24 0x00005596969da7fa in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:700
#25 0x00005596969d8539 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
If I enable "set debug infrun" just before the last continue, we see:
(gdb) continue
Continuing.
[infrun] clear_proceed_status_thread: 965604.965604.0
[infrun] proceed: enter
[infrun] proceed: addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT
[infrun] scoped_disable_commit_resumed: reason=proceeding
[infrun] start_step_over: enter
[infrun] start_step_over: stealing global queue of threads to step, length = 0
[infrun] operator(): step-over queue now empty
[infrun] start_step_over: exit
[infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [965604.965604.0] at 0x7ffff7d7ac5c
[infrun] do_target_resume: resume_ptid=965604.0.0, step=0, sig=GDB_SIGNAL_0
[infrun] prepare_to_wait: prepare_to_wait
[infrun] reset: reason=proceeding
[infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target remote
[infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target remote
[infrun] proceed: exit
[infrun] fetch_inferior_event: enter
[infrun] scoped_disable_commit_resumed: reason=handling event
[infrun] do_target_wait: Found 2 inferiors, starting at #1
[infrun] random_pending_event_thread: None found.
[infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
[infrun] print_target_wait_results: 965604.965604.0 [Thread 965604.965604],
[infrun] print_target_wait_results: status->kind = VFORK_DONE
[infrun] handle_inferior_event: status->kind = VFORK_DONE
[infrun] context_switch: Switching context from 0.0.0 to 965604.965604.0
[infrun] handle_vfork_done: not waiting for a vfork-done event
[infrun] start_step_over: enter
[infrun] start_step_over: stealing global queue of threads to step, length = 0
[infrun] operator(): step-over queue now empty
[infrun] start_step_over: exit
[infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [965604.965604.0] at 0x7ffff7d7ac5c
[infrun] do_target_resume: resume_ptid=965604.0.0, step=0, sig=GDB_SIGNAL_0
[infrun] prepare_to_wait: prepare_to_wait
[infrun] reset: reason=handling event
[infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target remote
[infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target remote
terminate called after throwing an instance of 'gdb_exception_error'
What happens is:
- After doing the "continue" on inferior 1, the remote target gives us
a VFORK_DONE event. The core ignores it and resumes inferior 1.
- Since prune_inferiors is now called after each handled event, in
fetch_inferior_event, it is called after we handled that VFORK_DONE
event and resumed inferior 1.
- Inferior 2 is pruned, which (see backtrace above) causes its program
space to be deleted, which clears the symtabs for that program space,
which calls the new_objfile observable and remote_new_objfile
observer (with a nullptr objfile, to indicate that the previously
loaded symbols have been discarded), which calls
remote_check_symbols.
remote_check_symbols is the function that sends the qSymbol packet, to
let the remote side ask for symbol addresses. The problem is that the
remote target is working in all-stop / sync mode and is currently
resumed. It has sent a vCont packet to resume the target and is waiting
for a stop reply. It can't send any packets in the mean time. That
causes the exception to be thrown.
This wasn't a problem before, when prune_inferiors was called in
normal_stop, because it was always called at a time the target was not
resumed.
An important observation here is that the new_objfile observable is
invoked for a change in inferior 2's program space (inferior 2's program
space is the current program space). Inferior 2 isn't bound to any
process on the remote side (it has exited, that's why it's being
pruned). It doesn't make sense to try to send a qSymbol packet for a
process that doesn't exist on the remote side. remote_check_symbols
actually attempts to avoid that:
/* The remote side has no concept of inferiors that aren't running
yet, it only knows about running processes. If we're connected
but our current inferior is not running, we should not invite the
remote target to request symbol lookups related to its
(unrelated) current process. */
if (!target_has_execution ())
return;
The problem here is that while inferior 2's program space is the current
program space, inferior 1 is the current inferior. So the check above
passes, since inferior has execution. We therefore try to send a
qSymbol packet for inferior 1 in reaction to a change in inferior 2's
program space, that's wrong.
This exposes a conceptual flaw in remote_new_objfile. The "new_objfile"
event concerns a specific program space, which can concern multiple
inferiors, as inferiors can share a program space. We shouldn't
consider the current inferior at all, but instead all inferiors bound to
the affected program space. Especially since the current inferior can
be unrelated to the current program space at that point.
To be clear, we are in this state because ~program_space sets itself as
the current program space, but there is no more inferior having that
program space to switch to, inferior 2 has already been unlinked.
To fix this, make remote_new_objfile iterate on all inferiors bound to
the affected program space. Remove the target_has_execution check from
remote_check_symbols, replace it with an assert. All callers must
ensure that the current inferior has execution before calling it.
Change-Id: Ica643145bcc03115248290fd310cadab8ec8371c
|
|
Because the actual construction of a breakpoint is buried deep in
create_breakpoint, at present it's necessary to have a new bp_
enumerator constant any time a new subclass is needed. Static marker
tracepoints are one such case, so this patch introduces
bp_static_marker_tracepoint and updates various spots to recognize it.
|
|
The current target_resume interface is a bit odd & non-intuitive.
I've found myself explaining it a couple times the recent past, while
reviewing patches that assumed STEP/SIGNAL always applied to the
passed in PTID. It goes like this today:
- if the passed in PTID is a thread, then the step/signal request is
for that thread.
- otherwise, if PTID is a wildcard (all threads or all threads of
process), the step/signal request is for inferior_ptid, and PTID
indicates which set of threads run free.
Because GDB always switches the current thread to "leader" thread
being resumed/stepped/signalled, we can simplify this a bit to:
- step/signal are always for inferior_ptid.
- PTID indicates the set of threads that run free.
Still not ideal, but it's a minimal change and at least there are no
special cases this way.
That's what this patch does. It renames the PTID parameter to
SCOPE_PTID, adds some assertions to target_resume, and tweaks
target_resume's description. In addition, it also renames PTID to
SCOPE_PTID in the remote and linux-nat targets, and simplifies their
implementation a little bit. Other targets could do the same, but
they don't have to.
Change-Id: I02a2ec2ab3a3e9b191de1e9a84f55c17cab7daaf
|
|
The following behaviour was observed in GDB:
(gdb) show remote X-packet
Support for the `p' packet is auto-detected, currently unknown.
Note the message mentions the 'p' packet. This is a regression since
this commit:
commit 8579fd136a614985bd27f20539c7bb7c5a51287d
Date: Mon Nov 8 14:58:46 2021 +0000
gdb/gdbsupport: make xstrprintf and xstrvprintf return a unique_ptr
Before this commit the behaviour was:
(gdb) show remote X-packet
Support for the `X' packet is auto-detected, currently unknown.
The problem was caused by a failed attempt to ensure that some
allocated strings were deleted when GDB exits. The code in the above
commit attempted to make use of 'static' to solve this problem,
however, the solution was just wrong.
In this new commit I instead allocate a static vector into which all
the allocated strings are stored, this ensures the strings are
released when GDB exits (which makes output from tools like valgrind
cleaner), but each string within the vector can be unique, which fixes
the regression.
|
|
Remove all macros related to getting and setting some symbol value:
#define SYMBOL_VALUE(symbol) (symbol)->value.ivalue
#define SYMBOL_VALUE_ADDRESS(symbol) \
#define SET_SYMBOL_VALUE_ADDRESS(symbol, new_value) \
#define SYMBOL_VALUE_BYTES(symbol) (symbol)->value.bytes
#define SYMBOL_VALUE_COMMON_BLOCK(symbol) (symbol)->value.common_block
#define SYMBOL_BLOCK_VALUE(symbol) (symbol)->value.block
#define SYMBOL_VALUE_CHAIN(symbol) (symbol)->value.chain
#define MSYMBOL_VALUE(symbol) (symbol)->value.ivalue
#define MSYMBOL_VALUE_RAW_ADDRESS(symbol) ((symbol)->value.address + 0)
#define MSYMBOL_VALUE_ADDRESS(objfile, symbol) \
#define BMSYMBOL_VALUE_ADDRESS(symbol) \
#define SET_MSYMBOL_VALUE_ADDRESS(symbol, new_value) \
#define MSYMBOL_VALUE_BYTES(symbol) (symbol)->value.bytes
#define MSYMBOL_BLOCK_VALUE(symbol) (symbol)->value.block
Replace them with equivalent methods on the appropriate objects.
Change-Id: Iafdab3b8eefc6dc2fd895aa955bf64fafc59ed50
|
|
Someone at IRC spotted a bug in qRcmd handling. This looks like an oversight
or it is that way for historical reasons.
The code in gdb/remote.c:remote_target::rcmd uses isdigit instead of
isxdigit. One could argue that we are expecting decimal numbers, but further
below we use fromhex ().
Update the function to use isxdigit instead and also update the documentation.
I see there are lots of other cases of undocumented number format for error
messages, mostly described as NN instead of nn. For now I'll just update
this particular function.
|
|
target_waitstatus::child_ptid if kind == TARGET_WAITKIND_THREAD_EXITED
Following the previous patch, running
gdb.threads/forking-threads-plus-breakpoints.exp continuously eventually
gives me an internal error.
gdb/target/waitstatus.h:372: internal-error: child_ptid: Assertion `m_kind == TARGET_WAITKIND_FORKED || m_kind == TARGET_WAITKIND_VFORKED' failed.^M
FAIL: gdb.threads/forking-threads-plus-breakpoint.exp: cond_bp_target=0: detach_on_fork=on: displaced=off: inferior 1 exited (GDB internal error)
The backtrace is:
0x55925b679c85 internal_error(char const*, int, char const*, ...)
/home/simark/src/binutils-gdb/gdbsupport/errors.cc:55
0x559258deadd2 target_waitstatus::child_ptid() const
/home/simark/src/binutils-gdb/gdb/target/waitstatus.h:372
0x55925a7cbac9 remote_target::remove_new_fork_children(threads_listing_context*)
/home/simark/src/binutils-gdb/gdb/remote.c:7311
0x55925a79dfdb remote_target::update_thread_list()
/home/simark/src/binutils-gdb/gdb/remote.c:3981
0x55925ad79b83 target_update_thread_list()
/home/simark/src/binutils-gdb/gdb/target.c:3793
0x55925addbb15 update_thread_list()
/home/simark/src/binutils-gdb/gdb/thread.c:2031
0x559259d64838 stop_all_threads(char const*, inferior*)
/home/simark/src/binutils-gdb/gdb/infrun.c:5104
0x559259d88b45 keep_going_pass_signal
/home/simark/src/binutils-gdb/gdb/infrun.c:8215
0x559259d8951b keep_going
/home/simark/src/binutils-gdb/gdb/infrun.c:8251
0x559259d78835 process_event_stop_test
/home/simark/src/binutils-gdb/gdb/infrun.c:6858
0x559259d750e9 handle_signal_stop
/home/simark/src/binutils-gdb/gdb/infrun.c:6580
0x559259d6c07b handle_inferior_event
/home/simark/src/binutils-gdb/gdb/infrun.c:5832
0x559259d57db8 fetch_inferior_event()
/home/simark/src/binutils-gdb/gdb/infrun.c:4222
Indeed, the code accesses target_waitstatus::child_ptid when the kind
is TARGET_WAITKIND_THREAD_EXITED, which is not right. A
TARGET_WAITKIND_THREAD_EXITED event does not have a child_ptid value
associated, it has an exit status, which we are not interested in. The
intent is to remove from the thread list the thread that has exited.
Its ptid is found in the stop reply event, get it from there.
Change-Id: Icb298cbb80b8779fdf0c660dde9a5314d5591535
|
|
Add a "reason" parameter, only used to show in debug messages what is
the reason for stopping all threads. This helped me understand the
debug logs while adding some new uses of stop_all_threads, so I am
proposing to merge it.
Change-Id: I66c8c335ebf41836a7bc3d5fe1db92c195f65e55
|
|
Now that filtered and unfiltered output can be treated identically, we
can unify the printf family of functions. This is done under the name
"gdb_printf". Most of this patch was written by script.
|
|
Now that filtered and unfiltered output can be treated identically, we
can unify the putc family of functions. This is done under the name
"gdb_putc". Most of this patch was written by script.
|
|
Now that filtered and unfiltered output can be treated identically, we
can unify the puts family of functions. This is done under the name
"gdb_puts". Most of this patch was written by script.
|
|
A number of spots call printf_unfiltered only because they are in code
that should not be interrupted by the pager. However, I believe these
cases are all handled by infrun's blanket ban on paging, and so can be
converted to the default (_filtered) API.
After this patch, I think all the remaining _unfiltered calls are ones
that really ought to be. A few -- namely in complete_command -- could
be replaced by a scoped assignment to pagination_enabled, but for the
remainder, the code seems simple enough like this.
|
|
When parsing the ptid out of a reply package, if the multi-process
extensions are not supported, use current_inferior's pid as the pid of
the reported thread, instead of inferior_ptid. This is needed because
the inferior_ptid may be null_ptid although a legit context exists,
due to a prior context switch via switch_to_inferior_no_thread.
Below is a scenario that illustrates what could go wrong. First,
setup a multi-target scenario. This is needed, because in a
multi-target setting, the inferior_ptid is cleared out before waiting
on targets. The second inferior below sits on top of a remote target.
Multi-process packets are disabled.
$ # First, spawn a process with PID 26253 to attach to later.
$ gdb-up a.out
Reading symbols from a.out...
(gdb) maint set target-non-stop on
(gdb) set remote multiprocess-feature-packet off
(gdb) start
...
(gdb) add-inferior -no-connection
[New inferior 2]
Added inferior 2
(gdb) inferior 2
[Switching to inferior 2 [<null>] (<noexec>)]
(gdb) target extended-remote | gdbserver --multi -
Remote debugging using | gdbserver --multi -
Remote debugging using stdio
(gdb) attach 26253
Attaching to Remote target
Attached; pid = 26253
[New Thread 26253]
[New inferior 3]
Reading /tmp/a.out from remote target...
...
[New Thread 26253]
...
Reading /usr/local/lib/debug/....debug from remote target...
>>> GDB seems to hang here.
After attaching to a process and reading some library files, GDB
seems to hang. One interesting thing to note is that
[New Thread 26253]
appears twice. We also see
[New inferior 3]
Running the same scenario with "debug infrun on" reveals more details.
...
(gdb) attach 26253
[infrun] scoped_disable_commit_resumed: reason=attaching
Attaching to Remote target
Attached; pid = 26253
[New Thread 26253]
[infrun] infrun_async: enable=1
[infrun] attach_command: immediately after attach:
[infrun] attach_command: thread 26253.26253.0, executing = 1, resumed = 0, state = RUNNING
[infrun] clear_proceed_status_thread: 26253.26253.0
[infrun] reset: reason=attaching
[infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target native, no resumed threads
[infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote
[infrun] fetch_inferior_event: enter
[infrun] scoped_disable_commit_resumed: reason=handling event
[infrun] do_target_wait: Found 2 inferiors, starting at #1
[infrun] random_pending_event_thread: None found.
[infrun] print_target_wait_results: target_wait (-1.0.0 [Thread 0], status) =
[infrun] print_target_wait_results: 26253.26253.0 [Thread 26253],
[infrun] print_target_wait_results: status->kind = STOPPED, sig = GDB_SIGNAL_0
[infrun] handle_inferior_event: status->kind = STOPPED, sig = GDB_SIGNAL_0
[infrun] start_step_over: enter
[infrun] start_step_over: stealing global queue of threads to step, length = 0
[infrun] operator(): step-over queue now empty
[infrun] start_step_over: exit
[infrun] context_switch: Switching context from 0.0.0 to 26253.26253.0
[infrun] handle_signal_stop: stop_pc=0x7f849d8cf151
[infrun] stop_waiting: stop_waiting
[infrun] stop_all_threads: starting
[infrun] stop_all_threads: pass=0, iterations=0
[New inferior 3]
Reading /tmp/a.out from remote target...
warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
Reading /tmp/a.out from remote target...
Reading symbols from target:/tmp/a.out...
[New Thread 26253]
[infrun] stop_all_threads: 4723.4723.0 not executing
[infrun] stop_all_threads: 26253.26253.0 not executing
[infrun] stop_all_threads: 42000.26253.0 executing, need stop
[infrun] print_target_wait_results: target_wait (-1.0.0 [Thread 0], status) =
[infrun] print_target_wait_results: -1.0.0 [Thread 0],
[infrun] print_target_wait_results: status->kind = IGNORE
[infrun] print_target_wait_results: target_wait (-1.0.0 [Thread 0], status) =
[infrun] print_target_wait_results: -1.0.0 [Thread 0],
[infrun] print_target_wait_results: status->kind = IGNORE
GDB tried to stop Thread 42000.26253.0, which does not exist, and we
are waiting for a stop event that will never happen. The PID in
'42000.26253.0', namely 42000, is the PID of magic_null_ptid.
It comes from gdb/remote.c:read_ptid:
/* Since the stub is not sending a process id, then default to
what's in inferior_ptid, unless it's null at this point. If so,
then since there's no way to know the pid of the reported
threads, use the magic number. */
if (inferior_ptid == null_ptid)
pid = magic_null_ptid.pid ();
else
pid = inferior_ptid.pid ();
if (obuf)
*obuf = pp;
return ptid_t (pid, tid);
Because multi-process was turned off, GDB did not parse an explicitly
specified PID. Furthermore, inferior_ptid == null_ptid, and
eventually GDB picked the PID from magic_null_ptid.
If target-non-stop is not turned on at the beginning, the same bug
reveals itself as a duplicated thread as shown below.
# Same setup as above, without 'maint set target-non-stop on'.
...
(gdb) attach 26253
Attaching to Remote target
Attached; pid = 26253
[New inferior 3]
...
[New Thread 26253]
...
(gdb) info threads
Id Target Id Frame
1.1 process 13517 "a.out" main () at test.c:3
* 2.1 Thread 26253 "a.out" 0x00007f12750c5151 in read () from target:/lib/x86_64-linux-gnu/libc.so.6
3.1 Thread 26253 "a.out" Remote 'g' packet reply is too long (expected 560 bytes, got 2496 bytes): 00feffffffffffff000a3a75127f000051510c75127f0000000400000000000060d24ef6af5500000000000000000000680d000000000000b85b31e3fc7f0000c0283a75127f000000e55b75127f000010d04ef6af550000460200000000000060c73975127f0000a0d23975127f0000000a3a75127f0000000000000000000051510c75127f000046020000330000002b0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f03000000000000ffff0000000000000000000000000000000000000000000080143a75127f000080143a75127f000040143a75127f000040143a75127f00007d0000007e0000007f00000080000000300c3a75127f0000300c3a75127f00000e000000000000000e0000000000000000000000000000000000000000000000ffffffffffffffffffffffffffffffff0400000004000000040000000400000020143a75127f000020143a75127f000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000000000000000000000e55b75127f0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
(gdb)
Fix the problem by preferring current_inferior()'s pid instead of
magic_null_ptid.
Regression-tested on X86-64 Linux.
Co-authored-by: Aleksandar Paunovic <aleksandar.paunovic@intel.com>
|
|
If the attach target supports async mode, enable it after the
attach target's ::attach method returns.
|
|
Now that target_resume always enables async mode after target::resume
returns, these calls are redundant.
The other place that target resume methods are invoked outside of
target_resume are as the beneath target in record_full_wait_1. In
this case, async mode should already be enabled when supported by the
target before the resume method is invoked due to the following:
In general, targets which support async mode run as async until
::wait returns TARGET_WAITKIND_NO_RESUMED to indicate that there are
no unwaited for children (either they have exited or are stopped).
When that occurs, the loop in wait_one disables async mode. Later
if a stopped child is resumed, async mode is re-enabled in
do_target_resume before waiting for the next event.
In the case of record_full_wait_1, this function is invoked from the
::wait target method when fetching an event. If the underlying
target supports async mode, then an earlier call to do_target_resume
to resume the child reporting an event in the loop in
record_full_wait_1 would have already enabled async mode before
::wait was invoked. In addition, nothing in the code executed in
the loop in record_full_wait_1 disables async mode. Async mode is
only disabled higher in the call stack in wait_one after ::wait
returns.
It is also true that async mode can be disabled by an
INF_EXEC_COMPLETE event passed to inferior_event_handle, but all of
the places that invoke that are in the gdb core which is "above" a
target ::wait method.
Note that there is an earlier call to enable async mode in
linux_nat_target::resume. That call also marks the async event pipe
to report an existing event after enabling async mode, so it needs to
stay.
|
|
In remote_target::remote_btrace_maybe_reopen, we switch to the currently
iterated thread in order to set inferior_ptid for a subsequent xfer.
Move the switch_to_thread call directly before the target_read_stralloc
call to clarify why we need to switch threads.
|
|
The enable_btrace target method takes a ptid_t to identify the thread on
which tracing shall be enabled.
Change this to thread_info * to avoid translating back and forth between
the two. This will be used in a subsequent patch.
|
|
In remote_btrace_maybe_reopen() we iterate over threads and use
set_general_thread() to set the thread from which to transfer the btrace
configuration.
This sets the remote general thread but does not affect inferior_ptid. On
the xfer request later on, remote_target::xfer_partial() again sets the
remote general thread to inferior_ptid, overwriting what
remote_btrace_maybe_reopen() had done.
In one case, this led to inferior_ptid being null_ptid when we tried to
enable tracing on a newly created thread inside a newly created process
during attach.
This, in turn, led to find_inferior_pid() asserting when we iterated over
threads in record_btrace_is_replaying(), which was called from
record_btrace_target::xfer_partial() when reading the btrace configuration
of the new thread to check whether it was already being recorded.
The bug was exposed by
0618ae41497 gdb: optimize all_matching_threads_iterator
and found by
FAIL: gdb.btrace/enable-new-thread.exp: ... (GDB internal error)
Use switch_to_thread() in remote_btrace_maybe_reopen().
|
|
A common pattern for string_file is to want to move out the internal
string buffer, because it is the result of the computation that we want
to return. It is the reason why string_file::string returns a non-const
reference, as explained in the comment. I think it would make sense to
have a dedicated method for that instead and make string_file::string
return a const reference.
This allows removing the explicit std::move in the typical case. Note
that compile_program::compute was missing a move, meaning that the
resulting string was copied. With the new version, it's not possible to
forget to move.
Change-Id: Ieaefa35b73daa7930b2f3a26988b6e3b4121bb79
|
|
Same idea as 0fab79556484 ("gdb: use ptid_t::to_string in infrun debug
messages"), but throughout GDB.
Change-Id: I62ba36eaef29935316d7187b9b13d7b88491acc1
|
|
PR remote/9177 points out that "info files" mentions "serial" a couple
of times:
Remote serial target in gdb-specific protocol:
Debugging a target over a serial line.
However, often the remote target isn't really a serial connection.
It seems to me that this text could be a bit clearer; and furthermore
since "info files" prints the target's long description,
remote_target::files_info doesn't really add much and can simply be
removed.
Regression tested on x86-64 Fedora 34.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=9177
|
|
This introduces target_announce_attach, by analog with
target_announce_detach. Then it converts existing targets to use
this, rather than emitting their own output by hand.
|
|
This commit brings all the changes made by running gdb/copyright.py
as per GDB's Start of New Year Procedure.
For the avoidance of doubt, all changes in this commits were
performed by the script.
|
|
I happened to notice that one "show" callback was printing to
gdb_stdout rather than to the passed-in ui_file parameter. I went
through all such callbacks and fixed them to consistently use the
ui_file.
Regression tested on x86-64 Fedora 34.
|
|
Bug PR gdb/28405 reports a regression when using attach with an
extended-remote target. In this case the target is not including a
thread-id in the stop packet it sends back after the attach.
The regression was introduced with this commit:
commit 8f66807b98f7634c43149ea62e454ea8f877691d
Date: Wed Jan 13 20:26:58 2021 -0500
gdb: better handling of 'S' packets
The problem is that when GDB processes the stop packet, it sees that
there is no thread-id and so has to "guess" which thread the stop
should apply to.
In this case the target only has one thread, so really, there's no
guessing needed, but GDB still runs through the same process, this
shouldn't cause us any problems.
However, after the above commit, GDB now expects itself to be more
internally consistent, specifically, only a thread that GDB thinks is
resumed, can be a candidate for having stopped.
It turns out that, when GDB attaches to a process through an
extended-remote target, the threads of the process being attached too,
are not, initially, marked as resumed.
And so, when GDB tries to figure out which thread the stop might apply
too, it finds no threads in the processes marked resumed, and so an
assert triggers.
In extended_remote_target::attach we create a new thread with a call
to add_thread_silent, rather than remote_target::remote_add_thread,
the reason is that calling the latter will result in a call to
'add_thread' rather than 'add_thread_silent'. However,
remote_target::remote_add_thread includes additional
actions (i.e. calling remote_thread_info::set_resumed and set_running)
which are missing from extended_remote_target::attach. These missing
calls are what would serve to mark the new thread as resumed.
In this commit I propose that we add an extra parameter to
remote_target::remote_add_thread. This new parameter will force the
new thread to be added with a call to add_thread_silent. We can now
call remote_add_thread from the ::attach method, the extra
actions (listed above) will now be performed, and the thread will be
left in the correct state.
Additionally, in PR gdb/28405, a segfault is reported. This segfault
triggers when 'set debug remote 1' is used before trying to reproduce
the original assertion failure. The cause of this is in
remote_target::select_thread_for_ambiguous_stop_reply, where we do
this:
remote_debug_printf ("first resumed thread is %s",
pid_to_str (first_resumed_thread->ptid).c_str ());
remote_debug_printf ("is this guess ambiguous? = %d", ambiguous);
gdb_assert (first_resumed_thread != nullptr);
Notice that when debug printing is on we dereference
first_resumed_thread before we assert that the pointer is not
nullptr. This is the cause of the segfault, and is resolved by moving
the assert before the debug printing code.
I've extended an existing test, ext-attach.exp, so that the original
test is run multiple times; we run in the original mode, as normal,
but also, we now run with different packets disabled in gdbserver. In
particular, disabling Tthread would trigger the assertion as it was
reported in the original bug. I also run the test in all-stop and
non-stop modes now for extra coverage, we also run the tests with
target-async enabled, and disabled.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28405
|
|
While working on another patch I ended up in a situation where I had
async mode disabled (with 'maint set target-async off'), but the async
event token got marked anyway.
In this situation GDB was continually calling into
remote_target::wait, however, the async token would never become
unmarked as the unmarking is guarded by target_is_async_p.
We could just unconditionally unmark the token, but that would feel
like just ignoring a bug, so, instead, lets assert that if
!target_is_async_p, then the async token should not be marked.
This assertion would have caught my earlier mistake.
There should be no user visible changes with this commit.
|
|
While working on another patch relating to remote targets, I wanted to
test with 'maint set target-async off' in place. Unfortunately I ran
into some problems. This commit is an attempt to fix one of the
issues I hit.
In my particular case I was actually running with:
maint set target-async off
maint set target-non-stop off
that is, we're telling GDB to force the targets to operate in
non-async mode, and in all-stop mode. Here's my GDB session showing
the problem:
(gdb) maintenance set target-async off
(gdb) maintenance set target-non-stop off
(gdb) target extended-remote :54321
Remote debugging using :54321
(gdb) attach 2365960
Attaching to process 2365960
No unwaited-for children left.
(gdb)
Notice the 'No unwaited-for children left.' error, this is the
problem. There's no reason why GDB should not be able to attach to
the process.
The problem is this:
1. The user runs 'attach PID' and this sends GDB into attach_command
in infcmd.c. From here we call the ::attach method on the attach
target, which will be the extended_remote_target.
2. In extended_remote_target::attach, we attach to the remote target
and get the first reply (which is a stop packet). We put off
processing the stop packet until the end of ::attach. We setup the
inferior and thread to represent the process we attached to, and
download the target description. Finally, we process the initial
stop packet.
If '!target_is_non_stop_p ()' and '!target_can_async_p ()', which is
the case for us given the maintenance commands we used, we cache the
stop packet within the remote_state::buf for later processing.
3. Back in attach_command, if 'target_is_non_stop_p ()' then we
request that the target stops. This will either process any cached
stop replies, or request that the target stops, and process the stop
replies. However, this code is not what we use due to non-stop mode
being disabled. So, we skip to the next step which is to call
validate_exec_file.
4. Calling validate_exec_file can cause packets to be sent to the
remote target, and replies received, the first path I hit is the
call to target_pid_to_exec_file, which calls
remote_target::pid_to_exec_file, which can then try to read the
executable from the remote. Sending an receiving packets will make
use of the remote_state::buf object.
5. The attempt to attach continues, but the damage is already done...
So, the problem is that, in step #2 we cache a stop reply in the
remote_state::buf, and then in step #4 we reuse the remote_state::buf
object, discarding any cached stop reply. As a result, the initial
stop, which is sent when GDB first attaches to the target, is lost.
This problem can clearly be seen, I feel, by looking at the
remote_state::cached_wait_status flag. This flag tells GDB if there
is a wait status cached in remote_state::buf. However, in
remote_target::putpkt_binary and remote_target::getpkt_or_notif_sane_1
this flag is just set back to 0, doing this immediately discards any
cached data.
I don't know if this scheme ever made sense, looking at commit
2d717e4f8a54, where the cached_wait_status flag was added, it appears
that there was nothing between where the stop was cached, and where
the stop was consumed, so, I suspect, there never was a situation
where we ended up in putpkt_binary or getpkt_or_notif_sane_1 and
needed to clear to the flag, maybe the clearing was added "just in
case". Whatever the history, I claim that this clearing this flag is
no longer a good idea.
So, my first step toward fixing this issue was to replace the two
instances of 'rs->cached_wait_status = 0;' in ::putpkt_binary and
::getpkt_or_notif_sane_1 with 'gdb_assert (rs->cached_wait_status ==
0);', this, at least would show me when GDB was doing something
dangerous, and indeed, this assert is now hit in my test case above.
I did play with using some kind of scoped restore to backup, and
restore the remote_state::buf object in all the places within remote.c
that I was hitting where the ::buf was being corrupted. The first
problem with this is that, where the ::cached_wait_status flag is
reset is _not_ where ::buf is corrupted. For the ::putpkt_binary
case, by the time we get to the method the buffer has already been
corrupted in many cases, so we end up needing to add the scoped
save/restore within the callers, which means we need the save/restore
in _lots_ of places.
Plus, using this save/restore model feels like the wrong solution. I
don't think that it's obvious that the buffer might be holding cached
data, and I think it would be too easy for new corruptions of the
buffer to be introduced, which could easily go unnoticed for a long
time.
So, I really wanted a solution that didn't require us to cache data in
the ::buf object.
Luckily, I think we already have such a solution in place, the
remote_state::stop_reply_queue, it seems like this does exactly the
same task, just in a slightly different way. With the
::stop_reply_queue, the stop packets are processed upon receipt and
the stop_reply object is added to the queue. With the ::buf cache
solution, the unprocessed stop reply is cached in the ::buf, and
processed later.
So, finally, in this commit, I propose to remove the
remote_state::cached_wait_status flag and to stop using the ::buf to
cache stop replies. Instead, stop replies will now always be stored
in the ::stop_reply_queue.
There are two places where we use the ::buf to hold a cached stop
reply, the first is in the ::attach method, and the second is in
remote_target::start_remote, however, the second of these cases is far
less problematic, as after caching the stop reply in ::buf we call the
global start_remote function, which does very little work before
calling normal_stop, which processes the cached stop reply. However,
my plan is to switch both users over to using ::stop_reply_queue so
that the old (unsafe) ::cached_wait_status mechanism can be completely
removed.
The next problem is that the ::stop_reply_queue is currently only used
for async-mode, and so, in remote_target::push_stop_reply, where we
push stop_reply objects into the ::stop_reply_queue, we currently also
mark the async event token. I've modified this so we only mark the
async event token if 'target_is_async_p ()' - note, _is_, not _can_
here. The ::push_stop_reply method is called in places where async
mode has been temporarily disabled, but, when async mode is switched
back on (see remote_target::async) we will mark the event token if
there are events in the queue.
Another change of interest is in remote_target::remote_interrupt_as.
Previously this code checked ::cached_wait_status, but didn't check
for events in the ::stop_reply_queue. Now that ::cached_wait_status
has been removed we now check the queue length instead, which should
have the same result.
Finally, in remote_target::wait_as, I've tried to merge the processing
of the ::stop_reply_queue with how we used to handle the
::cached_wait_status flag.
Currently, when processing the ::stop_reply_queue we call
process_stop_reply and immediately return. However, when handling
::cached_wait_status we run through the whole of ::wait_as, and return
at the end of the function.
If we consider a standard stop packet, the two differences I see are:
1. Resetting of the remote_state::waiting_for_stop_reply, flag; this
is not currently done when processing a stop from the
::stop_reply_queue.
2. The final return value has the possibility of being adjusted at
the end of ::wait_as, as well as there being calls to
record_currthread, non of which are done if we process a stop from
the ::stop_reply_queue.
After discussion on the mailing list:
https://sourceware.org/pipermail/gdb-patches/2021-December/184535.html
it was suggested that, when an event is pushed into the
::stop_reply_queue, the ::waiting_for_stop_reply flag is never going
to be set. As a result, we don't need to worry about the first
difference. I have however, added a gdb_assert to validate the
assumption that the flag is never going to be set. If in future the
situation ever changes, then we should find out pretty quickly.
As for the second difference, I have resolved this by having all stop
packets taken from the ::stop_reply_queue, pass through the return
value adjustment code at the end of ::wait_as.
An example of a test that reveals the benefits of this commit is:
make check-gdb \
RUNTESTFLAGS="--target_board=native-extended-gdbserver \
GDBFLAGS='-ex maint\ set\ target-async\ off \
-ex maint\ set\ target-non-stop\ off' \
gdb.base/attach.exp"
For testing I've been running test on x86-64/GNU Linux, and run with
target boards unix, native-gdbserver, and native-extended-gdbserver.
For each board I've run with the default GDBFLAGS, as well as with:
GDBFLAGS='-ex maint\ set\ target-async\ off \
-ex maint\ set\ target-non-stop\ off' \
Though running with the above GDBFLAGS is clearly a lot more unstable
both before and after my patch, I'm not seeing any consistent new
failures with my patch, except, with the native-extended-gdbserver
board, where I am seeing new failures, but only because more tests are
now running. For that configuration alone I see the number of
unresolved go down by 49, the number of passes goes up by 446, and the
number of failures also increases by 144. All of the failures are new
tests as far as I can tell.
|
|
While working with pending fork events, I wondered what would happen if
the user detached an inferior while a thread of that inferior had a
pending fork event. What happens with the fork child, which is
ptrace-attached by the GDB process (or by GDBserver), but not known to
the core? Sure enough, neither the core of GDB or the target detach the
child process, so GDB (or GDBserver) just stays ptrace-attached to the
process. The result is that the fork child process is stuck, while you
would expect it to be detached and run.
Make GDBserver detach of fork children it knows about. That is done in
the generic handle_detach function. Since a process_info already exists
for the child, we can simply call detach_inferior on it.
GDB-side, make the linux-nat and remote targets detach of fork children
known because of pending fork events. These pending fork events can be
stored in:
- thread_info::pending_waitstatus, if the core has consumed the event
but then saved it for later (for example, because it got the event
while stopping all threads, to present an all-stop stop on top of a
non-stop target)
- thread_info::pending_follow: if we ran to a "catch fork" and we
detach at that moment
Additionally, pending fork events can be in target-specific fields:
- For linux-nat, they can be in lwp_info::status and
lwp_info::waitstatus.
- For the remote target, they could be stored as pending stop replies,
saved in `remote_state::notif_state::pending_event`, if not
acknowledged yet, or in `remote_state::stop_reply_queue`, if
acknowledged. I followed the model of remove_new_fork_children for
this: call remote_notif_get_pending_events to process /
acknowledge any unacknowledged notification, then look through
stop_reply_queue.
Update the gdb.threads/pending-fork-event.exp test (and rename it to
gdb.threads/pending-fork-event-detach.exp) to try to detach the process
while it is stopped with a pending fork event. In order to verify that
the fork child process is correctly detached and resumes execution
outside of GDB's control, make that process create a file in the test
output directory, and make the test wait $timeout seconds for that file
to appear (it happens instantly if everything goes well).
This test catches a bug in linux-nat.c, also reported as PR 28512
("waitstatus.h:300: internal-error: gdb_signal target_waitstatus::sig()
const: Assertion `m_kind == TARGET_WAITKIND_STOPPED || m_kind ==
TARGET_WAITKIND_SIGNALLED' failed.). When detaching a thread with a
pending event, get_detach_signal unconditionally fetches the signal
stored in the waitstatus (`tp->pending_waitstatus ().sig ()`). However,
that is only valid if the pending event is of type
TARGET_WAITKIND_STOPPED, and this is now enforced using assertions (iit
would also be valid for TARGET_WAITKIND_SIGNALLED, but that would mean
the thread does not exist anymore, so we wouldn't be detaching it). Add
a condition in get_detach_signal to access the signal number only if the
wait status is of kind TARGET_WAITKIND_STOPPED, and use GDB_SIGNAL_0
instead (since the thread was not stopped with a signal to begin with).
Add another test, gdb.threads/pending-fork-event-ns.exp, specifically to
verify that we consider events in pending stop replies in the remote
target. This test has many threads constantly forking, and we detach
from the program while the program is executing. That gives us some
chance that we detach while a fork stop reply is stored in the remote
target. To verify that we correctly detach all fork children, we ask
the parent to exit by sending it a SIGUSR1 signal and have it write a
file to the filesystem before exiting. Because the parent's main thread
joins the forking threads, and the forking threads wait for their fork
children to exit, if some fork child is not detach by GDB, the parent
will not write the file, and the test will time out. If I remove the
new remote_detach_pid calls in remote.c, the test fails eventually if I
run it in a loop.
There is a known limitation: we don't remove breakpoints from the
children before detaching it. So the children, could hit a trap
instruction after being detached and crash. I know this is wrong, and
it should be fixed, but I would like to handle that later. The current
patch doesn't fix everything, but it's a step in the right direction.
Change-Id: I6d811a56f520e3cb92d5ea563ad38976f92e93dd
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28512
|
|
In preparation for a following patch, refactor a few things that I did
find a bit awkward, and to make them a bit more reusable.
- Pass an inferior to kill_new_fork_children instead of a pid. That
allows iterating on only this inferior's threads and avoid further
filtering on the thread's pid.
- Change thread_pending_fork_status to return a non-nullptr value only
if the thread does have a pending fork status.
- Remove is_pending_fork_parent_thread, as one can just use
thread_pending_fork_status and check for nullptr.
- Replace is_pending_fork_parent with is_fork_status, which just
returns if the given target_waitkind if a fork or a vfork. Push
filtering on the pid to the callers, when it is necessary.
Change-Id: I0764ccc684d40f054e39df6fa5458cc4c5d1cd7b
|
|
Move the stop_reply and a few functions up. Some code above them in the
file will need to use them in a following patch. No behavior changes
expected here.
Change-Id: I3ca57d0e3ec253f56e1ba401289d9d167de14ad2
|
|
When running test-case gdb.base/cached-source-file.exp with target board
readnow, we run into:
...
FAIL: gdb.base/cached-source-file.exp: rerun program (the program exited)
...
The problem is that when rereading, the readnow is ignored.
Fix this by copying the readnow handling code from symbol_file_add_with_addrs
to reread_symbols.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26800
|
|
This commit:
commit 288712bbaca36bff6578bc839ebcdc3707662f81
Date: Mon Nov 22 15:16:27 2021 +0000
gdb/remote: use scoped_restore to control starting_up flag
introduced a use after free bug. The scoped restore added in the
above commit resets a flag within a remote_target's remote_state
object.
However, in some situations, the remote_target can be unpushed before
the error is thrown. If the only reference to the target is the one
in the target stack, then unpushing the target will cause the
remote_target to be deleted, which, in turn, will delete the
remote_state object. The scoped restore will then try to reset the
flag within a deleted object.
This problem was caught in the gdb.server/server-connect.exp test,
which, when run with the address sanitizer enabled, highlights the
write after free bug described above.
This commit resolves this issue by adding a new class specifically for
the purpose of managing the starting_up flag. As well as setting, and
then clearing the starting_up flag, this new class increments, and
then decrements the reference count on the remote_target object. This
prevents the remote_target from being deleted until after the flag has
been reset.
The gdb.server/server-connect.exp now runs cleanly with the address
sanitizer enabled.
|
|
This commit makes use of a scoped_restore object to control the
remote_state::starting_up flag within the remote_target::start_remote
method.
Ideally I would have liked to create the scoped_restore inside
start_remote and just leave the restore in place until the end of the
scope, however, I'm worried that doing this would change the behaviour
of GDB. Specifically, in start_remote, the following code is executed
once the starting_up flag has been restored to its previous value:
if (breakpoints_should_be_inserted_now ())
insert_breakpoints ();
I think (but am not 100% sure) that calling install_breakpoints could
end up back inside remote_target::can_download_tracepoint, which does
check the value of remote_state::starting_up. And so, I'm concerned
that leaving the scoped_restore in place until the end of start_remote
will cause a possible change in behaviour.
To avoid this, and to leave things as close to the current behaviour
as possible, I've split remote_target::start_remote into two, there's
the main function body which moves into remote_target::start_remote_1,
this function uses the scoped_restore to change the ::starting_up
flag, then there's the old remote_target::start_remote, which now just
calls ::start_remote_1, and then does the insert_breakpoints call.
There should be no user visible changes after this commit, unless
there's a situation where the ::starting_up flag could previously have
been left set, if this was the case, then this situation should no
longer be possible.
|
|
This commits adds a new sub-class of gdb.TargetConnection,
gdb.RemoteTargetConnection. This sub-class is created for all
'remote' and 'extended-remote' targets.
This new sub-class has one additional method over its base class,
'send_packet'. This new method is equivalent to the 'maint
packet' CLI command, it allows a custom packet to be sent to a remote
target.
The outgoing packet can either be a bytes object, or a Unicode string,
so long as the Unicode string contains only ASCII characters.
The result of calling RemoteTargetConnection.send_packet is a bytes
object containing the reply that came from the remote.
|
|
In a later commit I will add a Python API to access the 'maint packet'
functionality, that is, sending a user specified packet to the target.
To make implementing this easier, this commit refactors how this
command is currently implemented so that the packet_command function
is now global.
The new global send_remote_packet function takes an object that is an
implementation of an abstract interface. Two functions within this
interface are then called, one just before a packet is sent to the
remote target, and one when the reply has been received from the
remote target. Using an interface object in this way allows (1) for
the error checking to be done before the first callback is made, this
means we only print out what packet it being sent once we know we are
going to actually send it, and (2) we don't need to make a copy of the
reply if all we want to do is print it.
One user visible changes after this commit are the error
messages, which I've changed to be less 'maint packet' command
focused, this will make them (I hope) better for when
send_remote_packet can be called from Python code.
So: "command can only be used with remote target"
Becomes: "packets can only be sent to a remote target"
And: "remote-packet command requires packet text as argument"
Becomes: "a remote packet must not be empty"
Additionally, in this commit, I've added support for packet replies
that contain binary data. Before this commit, the code that printed
the reply treated the reply as a C string, it assumed that the string
only contained printable characters, and had a null character only at
the end.
One way to show the problem with this is if we try to read the auxv
data from a remote target, the auxv data is binary, so, before this
commit:
(gdb) target remote :54321
...
(gdb) maint packet qXfer:auxv:read::0,1000
sending: "qXfer:auxv:read::0,1000"
received: "l!"
(gdb)
And after this commit:
(gdb) target remote :54321
...
(gdb) maint packet qXfer:auxv:read::0,1000
sending: "qXfer:auxv:read::0,1000"
received: "l!\x00\x00\x00\x00\x00\x00\x00\x00\xf0\xfc\xf7\xff\x7f\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\xff\xf>
(gdb)
The binary contents of the reply are now printed as escaped hex.
|
|
This commit introduced a test failure in gdb.server/attach-flag.exp.
I didn't spot this failure originally as the problem is fixed by this,
as yet unpushed patch:
https://sourceware.org/pipermail/gdb-patches/2021-November/183768.html
I unfortunately didn't test each patch in the original series
independently. I'll repost this patch after the above patch has been
merged.
This reverts commit 32b1f5e8d6b8ddd3be6e471c26dd85a1dac31dda.
|