Age | Commit message (Collapse) | Author | Files | Lines |
|
SUMMARY
The '--simple-values' argument to '-stack-list-arguments' and similar
GDB/MI commands does not take reference types into account, so that
references to arbitrarily large structures are considered "simple" and
printed. This means that the '--simple-values' argument cannot be used
by IDEs when tracing the stack due to the time taken to print large
structures passed by reference.
DETAILS
Various GDB/MI commands ('-stack-list-arguments', '-stack-list-locals',
'-stack-list-variables' and so on) take a PRINT-VALUES argument which
may be '--no-values' (0), '--all-values' (1) or '--simple-values' (2).
In the '--simple-values' case, the command is supposed to print the
name, type, and value of variables with simple types, and print only the
name and type of variables with compound types.
The '--simple-values' argument ought to be suitable for IDEs that need
to update their user interface with the program's call stack every time
the program stops. However, it does not take C++ reference types into
account, and this makes the argument unsuitable for this purpose.
For example, consider the following C++ program:
struct s {
int v[10];
};
int
sum(const struct s &s)
{
int total = 0;
for (int i = 0; i < 10; ++i) total += s.v[i];
return total;
}
int
main(void)
{
struct s s = { { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 } };
return sum(s);
}
If we start GDB in MI mode and continue to 'sum', the behaviour of
'-stack-list-arguments' is as follows:
(gdb)
-stack-list-arguments --simple-values
^done,stack-args=[frame={level="0",args=[{name="s",type="const s &",value="@0x7fffffffe310: {v = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}}"}]},frame={level="1",args=[]}]
Note that the value of the argument 's' was printed, even though 's' is
a reference to a structure, which is not a simple value.
See https://github.com/microsoft/MIEngine/pull/673 for a case where this
behaviour caused Microsoft to avoid the use of '--simple-values' in
their MIEngine debug adapter, because it caused Visual Studio Code to
take too long to refresh the call stack in the user interface.
SOLUTIONS
There are two ways we could fix this problem, depending on whether we
consider the current behaviour to be a bug.
1. If the current behaviour is a bug, then we can update the behaviour
of '--simple-values' so that it takes reference types into account:
that is, a value is simple if it is neither an array, struct, or
union, nor a reference to an array, struct or union.
In this case we must add a feature to the '-list-features' command so
that IDEs can detect that it is safe to use the '--simple-values'
argument when refreshing the call stack.
2. If the current behaviour is not a bug, then we can add a new option
for the PRINT-VALUES argument, for example, '--scalar-values' (3),
that would be suitable for use by IDEs.
In this case we must add a feature to the '-list-features' command
so that IDEs can detect that the '--scalar-values' argument is
available for use when refreshing the call stack.
PATCH
This patch implements solution (1) as I think the current behaviour of
not printing structures, but printing references to structures, is
contrary to reasonable expectation.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29554
|
|
I'd like to move some things so they become methods on struct ui. But
first, I think that struct ui and the related things are big enough to
deserve their own file, instead of being scattered through top.{c,h} and
event-top.c.
Change-Id: I15594269ace61fd76ef80a7b58f51ff3ab6979bc
|
|
This changes field_is_static to be a method on struct field, and
updates all the callers. Most of this patch was written by script.
Regression tested on x86-64 Fedora 36.
|
|
This changes apply_ext_lang_type_printers to use unique_xmalloc_ptr,
removing some manual memory management. Regression tested on x86-64
Fedora 36.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This commit allows Frame.read_var to accept named arguments, and also
improves (I think) some of the error messages emitted when values of
the wrong type are passed to this function.
The read_var method takes two arguments, one a variable, which is
either a gdb.Symbol or a string, while the second, optional, argument
is always a gdb.Block.
I'm now using 'O!' as the format specifier for the second argument,
which allows the argument type to be checked early on. Currently, if
the second argument is of the wrong type then we get this error:
(gdb) python print(gdb.selected_frame().read_var("a1", "xxx"))
Traceback (most recent call last):
File "<string>", line 1, in <module>
RuntimeError: Second argument must be block.
Error while executing Python code.
(gdb)
After this commit, we now get an error like this:
(gdb) python print(gdb.selected_frame().read_var("a1", "xxx"))
Traceback (most recent call last):
File "<string>", line 1, in <module>
TypeError: argument 2 must be gdb.Block, not str
Error while executing Python code.
(gdb)
Changes are:
1. Exception type is TypeError not RuntimeError, this is unfortunate
as user code _could_ be relying on this, but I think the improvement
is worth the risk, user code relying on the exact exception type is
likely to be pretty rare,
2. New error message gives argument position and expected argument
type, as well as the type that was passed.
If the first argument, the variable, has the wrong type then the
previous exception was already a TypeError, however, I've updated the
text of the exception to more closely match the "standard" error
message we see above. If the first argument has the wrong type then
before this commit we saw this:
(gdb) python print(gdb.selected_frame().read_var(123))
Traceback (most recent call last):
File "<string>", line 1, in <module>
TypeError: Argument must be a symbol or string.
Error while executing Python code.
(gdb)
And after we see this:
(gdb) python print(gdb.selected_frame().read_var(123))
Traceback (most recent call last):
File "<string>", line 1, in <module>
TypeError: argument 1 must be gdb.Symbol or str, not int
Error while executing Python code.
(gdb)
For existing code that doesn't use named arguments and doesn't rely on
exceptions, there will be no changes after this commit.
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Following on from the previous commit, this updates
Frame.read_register to accept named arguments. As with the previous
commit there's no huge benefit for the users in accepting named
arguments here -- this function only takes a single argument after
all.
But I do think it is worth keeping Frame.read_register method in sync
with the PendingFrame.read_register method, this allows for the
possibility that the user has some code that can operate on either a
Frame or a Pending frame.
Minor update to allow for named arguments, and an extra test to check
the new functionality.
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Update the two gdb.PendingFrame methods gdb.PendingFrame.read_register
and gdb.PendingFrame.create_unwind_info to accept keyword arguments.
There's no huge benefit for making this change, both of these methods
only take a single argument, so it is (maybe) less likely that a user
will take advantage of the keyword arguments in these cases, but I
think it's nice to be consistent, and I don't see any particular draw
backs to making this change.
For PendingFrame.read_register I've changed the argument name from
'reg' to 'register' in the documentation and used 'register' as the
argument name in GDB. My preference for APIs is to use full words
where possible, and given we didn't support named arguments before
this change should not break any existing code.
There should be no user visible changes (for existing code) after this
commit.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Update gdb.UnwindInfo.add_saved_register to accept named keyword
arguments.
As part of this update we now use gdb_PyArg_ParseTupleAndKeywords
instead of PyArg_UnpackTuple to parse the function arguments.
By switching to gdb_PyArg_ParseTupleAndKeywords, we can now use 'O!'
as the argument format for the function's value argument. This means
that we can check the argument type (is gdb.Value) as part of the
argument processing rather than manually performing the check later in
the function. One result of this is that we now get a better error
message (at least, I think so). Previously we would get something
like:
ValueError: Bad register value
Now we get:
TypeError: argument 2 must be gdb.Value, not XXXX
It's unfortunate that the exception type changed, but I think the new
exception type actually makes more sense.
My preference for argument names is to use full words where that's not
too excessive. As such, I've updated the name of the argument from
'reg' to 'register' in the documentation, which is the argument name
I've made GDB look for here.
For existing unwinder code that doesn't throw any exceptions nothing
should change with this commit. It is possible that a user has some
code that throws and catches the ValueError, and this code will break
after this commit, but I think this is going to be sufficiently rare
that we can take the risk here.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Make find_thread_ptid (the overload that takes an inferior) a method of
struct inferior.
Change-Id: Ie5b9fa623ff35aa7ddb45e2805254fc8e83c9cd4
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
This adds the DAP readMemory and writeMemory requests. A small change
to the evaluation code is needed in order to test this -- this is one
of the few ways for a client to actually acquire a memory reference.
|
|
When writing an unwinder it is necessary to create a new class to act
as a frame-id. This new class is almost certainly just going to set a
'sp' and 'pc' attribute within the instance.
This commit adds a little helper class gdb.unwinder.FrameId that does
this job. Users can make use of this to avoid having to write out
standard boilerplate code any time they write an unwinder.
Of course, if the user wants their FrameId class to be more
complicated in some way, then they can still write their own class,
just like they could before.
I've simplified the example code in the documentation to now use the
new helper class, and I've also made use of this helper within the
testsuite.
Any existing user code will continue to work just as it did before
after this change.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Currently when creating a gdb.UnwindInfo object a user must call
gdb.PendingFrame.create_unwind_info and pass a frame-id object.
The frame-id object should have at least a 'sp' attribute, and
probably a 'pc' attribute too (it can also, in some cases have a
'special' attribute).
Currently all of these frame-id attributes need to be gdb.Value
objects, but the only reason for that requirement is that we have some
code in py-unwind.c that only handles gdb.Value objects.
If instead we switch to using get_addr_from_python in py-utils.c then
we will support both gdb.Value objects and also raw numbers, which
might make things simpler in some cases.
So, I started rewriting pyuw_object_attribute_to_pointer (in
py-unwind.c) to use get_addr_from_python. However, while looking at
the code I noticed a problem.
The pyuw_object_attribute_to_pointer function returns a boolean flag,
if everything goes OK we return true, but we return false in two
cases, (1) when the attribute is not present, which might be
acceptable, or might be an error, and (2) when we get an error trying
to extract the attribute value, in which case a Python error will have
been set.
Now in pending_framepy_create_unwind_info we have this code:
if (!pyuw_object_attribute_to_pointer (pyo_frame_id, "sp", &sp))
{
PyErr_SetString (PyExc_ValueError,
_("frame_id should have 'sp' attribute."));
return NULL;
}
Notice how we always set an error. This will override any error that
is already set.
So, if you create a frame-id object that has an 'sp' attribute, but
the attribute is not a gdb.Value, then currently we fail to extract
the attribute value (it's not a gdb.Value) and set this error in
pyuw_object_attribute_to_pointer:
rc = pyuw_value_obj_to_pointer (pyo_value.get (), addr);
if (!rc)
PyErr_Format (
PyExc_ValueError,
_("The value of the '%s' attribute is not a pointer."),
attr_name);
Then we return to pending_framepy_create_unwind_info and immediately
override this error with the error about 'sp' being missing.
This all feels very confused.
Here's my proposed solution: pyuw_object_attribute_to_pointer will now
return a tri-state enum, with states OK, MISSING, or ERROR. The
meanings of these states are:
OK - Attribute exists and was extracted fine,
MISSING - Attribute doesn't exist, no Python error was set.
ERROR - Attribute does exist, but there was an error while
extracting it, a Python error was set.
We need to update pending_framepy_create_unwind_info, the only user of
pyuw_object_attribute_to_pointer, but now I think things are much
clearer. Errors from lower levels are not blindly overridden with the
generic meaningless error message, but we still get the "missing 'sp'
attribute" error when appropriate.
This change also includes the switch to get_addr_from_python which was
what started this whole journey.
For well behaving user code there should be no visible changes after
this commit.
For user code that hits an error, hopefully the new errors should be
more helpful in figuring out what's gone wrong.
Additionally, users can now use integers for the 'sp' and 'pc'
attributes in their frame-id objects if that is useful.
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
It is not currently possible to directly create gdb.UnwindInfo
instances, they need to be created by calling
gdb.PendingFrame.create_unwind_info so that the newly created
UnwindInfo can be linked to the pending frame.
As such there's no tp_init method defined for UnwindInfo.
A consequence of all this is that it doesn't really make sense to
allow sub-classing of gdb.UnwindInfo. Any sub-class can't call the
parents __init__ method to correctly link up the PendingFrame
object (there is no parent __init__ method). And any instances that
sub-classes UnwindInfo but doesn't call the parent __init__ is going
to be invalid for use in GDB.
This commit removes the Py_TPFLAGS_BASETYPE flag from the UnwindInfo
class, which prevents the class being sub-classed. Then I've added a
test to check that this is indeed prevented.
Any functional user code will not have any issues with this change.
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Having a useful __repr__ method can make debugging Python code that
little bit easier. This commit adds __repr__ for gdb.PendingFrame and
gdb.UnwindInfo classes, along with some tests.
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
The gdb.Frame class has far more methods than gdb.PendingFrame. Given
that a PendingFrame hasn't yet been claimed by an unwinder, there is a
limit to which methods we can add to it, but many of the methods that
the Frame class has, the PendingFrame class could also support.
In this commit I've added those methods to PendingFrame that I believe
are safe.
In terms of implementation: if I was starting from scratch then I
would implement many of these (or most of these) as attributes rather
than methods. However, given both Frame and PendingFrame are just
different representation of a frame, I think there is value in keeping
the interface for the two classes the same. For this reason
everything here is a method -- that's what the Frame class does.
The new methods I've added are:
- gdb.PendingFrame.is_valid: Return True if the pending frame
object is valid.
- gdb.PendingFrame.name: Return the name for the frame's function,
or None.
- gdb.PendingFrame.pc: Return the $pc register value for this
frame.
- gdb.PendingFrame.language: Return a string containing the
language for this frame, or None.
- gdb.PendingFrame.find_sal: Return a gdb.Symtab_and_line object
for the current location within the pending frame, or None.
- gdb.PendingFrame.block: Return a gdb.Block for the current
pending frame, or None.
- gdb.PendingFrame.function: Return a gdb.Symbol for the current
pending frame, or None.
In every case I've just copied the implementation over from gdb.Frame
and cleaned the code slightly e.g. NULL to nullptr. Additionally each
function required a small update to reflect the PendingFrame type, but
that's pretty minor.
There are tests for all the new methods.
For more extensive testing, I added the following code to the file
gdb/python/lib/command/unwinders.py:
from gdb.unwinder import Unwinder
class TestUnwinder(Unwinder):
def __init__(self):
super().__init__("XXX_TestUnwinder_XXX")
def __call__(self,pending_frame):
lang = pending_frame.language()
try:
block = pending_frame.block()
assert isinstance(block, gdb.Block)
except RuntimeError as rte:
assert str(rte) == "Cannot locate block for frame."
function = pending_frame.function()
arch = pending_frame.architecture()
assert arch is None or isinstance(arch, gdb.Architecture)
name = pending_frame.name()
assert name is None or isinstance(name, str)
valid = pending_frame.is_valid()
pc = pending_frame.pc()
sal = pending_frame.find_sal()
assert sal is None or isinstance(sal, gdb.Symtab_and_line)
return None
gdb.unwinder.register_unwinder(None, TestUnwinder())
This registers a global unwinder that calls each of the new
PendingFrame methods and checks the result is of an acceptable type.
The unwinder never claims any frames though, so shouldn't change how
GDB actually behaves.
I then ran the testsuite. There was only a single regression, a test
that uses 'disable unwinder' and expects a single unwinder to be
disabled -- the extra unwinder is now disabled too, which changes the
test output. So I'm reasonably confident that the new methods are not
going to crash GDB.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
This commit copies the pattern that is present in many other py-*.c
files: having a single macro to check that the Python object is still
valid.
This cleans up the code a little throughout the py-unwind.c file.
Some of the exception messages will change slightly with this commit,
though the type of the exceptions is still ValueError in all cases.
I started writing some tests for this change and immediately ran into
a problem: GDB would crash. It turns out that the PendingFrame
objects are not being marked as invalid!
In pyuw_sniffer where the pending frames are created, we make use of a
scoped_restore to invalidate the pending frame objects. However, this
only restores the pending_frame_object::frame_info field to its
previous value -- and it turns out we never actually give this field
an initial value, it's left undefined.
So, when the scoped_restore (called invalidate_frame) performs its
cleanup, it actually restores the frame_info field to an undefined
value. If this undefined value is not nullptr then any future
accesses to the PendingFrame object result in undefined behaviour and
most likely, a crash.
As part of this commit I now initialize the frame_info field, which
ensures all the new tests now pass.
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Spotted a redundant nullptr check in python/py-frame.c in the function
frapy_block. This was introduced in commit 57126e4a45e3000e when we
expanded an earlier check in return early if the pointer in question
is nullptr.
There should be no user visible changes after this commit.
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
This commit makes a few related changes to the gdb.unwinder.Unwinder
class attributes:
1. The 'name' attribute is now a read-only attribute. This prevents
user code from changing the name after registering the unwinder. It
seems very unlikely that any user is actually trying to do this in
the wild, so I'm not very worried that this will upset anyone,
2. We now validate that the name is a string in the
Unwinder.__init__ method, and throw an error if this is not the
case. Hopefully nobody was doing this in the wild. This should
make it easier to ensure the 'info unwinder' command shows sane
output (how to display a non-string name for an unwinder?),
3. The 'enabled' attribute is now implemented with a getter and
setter. In the setter we ensure that the new value is a boolean,
but the real important change is that we call
'gdb.invalidate_cached_frames()'. This means that the backtrace
will be updated if a user manually disables an unwinder (rather than
calling the 'disable unwinder' command). It is not unreasonable to
think that a user might register multiple unwinders (relating to
some project) and have one command that disables/enables all the
related unwinders. This command might operate by poking the enabled
attribute of each unwinder object directly, after this commit, this
would now work correctly.
There's tests for all the changes, and lots of documentation updates
that both cover the new changes, but also further improve (I think)
the general documentation for GDB's Unwinder API.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
The evaluate command supports a "context" parameter which tells the
adapter the context in which an evaluation occurs. One of the
supported values is "repl", which we took to mean evaluation of a gdb
command. That is what this patch implements.
Note that some gdb commands probably will not work correctly with the
rest of the protocol. For example if the user types "continue",
confusion may result.
This patch requires the earlier patch to fix up scopes in DAP.
|
|
Internal AdaCore DAP testing on Windows has had occasional failures
that show:
assert threading.current_thread() is _dap_thread
I think this is a race in DAP startup: the _dap_thread global is only
set on return from start_thread, but it seems possible that the thread
itself could already run and encounter a @in_dap_thread decorator.
This patch fixes the problem by setting the global before running any
of the code in the new thread. This also lets us remove a FIXME.
|
|
This input sequence is accepted by DAP:
...
{"seq": 4, "type": "request", "command": "configurationDone"}Content-Length: 84
...
This input sequence has the same effect:
...
{"seq": 4, "type": "request", "command": "configurationDone"}ignorethis
Content-Length: 84
...
but the 'ignorethis' part is silently ignored.
Log the ignored bit, such that we have:
...
READ: <<<{"seq": 4, "type": "request", "command": "configurationDone"}>>>
WROTE: <<<{"request_seq": 4, "type": "response", "command": "configurationDone"
, "success": true}>>>
+++ run
IGNORED: <<<b'ignorethis'>>>
...
|
|
According to black 23, gdb/printing.py was mis-formatted. This patch
fixes it.
|
|
The DAP code already claimed to implement "scopes" and "evaluate", but
this wasn't done completely correctly. This patch implements these
and also implements the "variables" request.
After this patch, variables and scopes correctly report their
sub-structure. This also interfaces with the gdb pretty-printer API,
so the output of pretty-printers is available.
|
|
Tom de Vries pointed out that one DAP test failed on Python 3.6
because gdb.Frame is not hashable.
This patch fixes the problem by using a list to hold the frames. This
is less efficient but there normally won't be that many frames.
Tested-by: Tom de Vries <tdevries@suse.de>
|
|
Linetables no longer change after they are created. This patch
applies const to them.
Note there is one hack to cast away const in mdebugread.c. This code
allocates a linetable using 'malloc', then later copies it to the
obstack. While this could be cleaned up, I chose not to do so because
I have no way of testing it.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes linetables to not add the text offset to the addresses
they contain. I did this in a few steps, necessarily combined
together in one patch: I renamed the 'pc' member to 'm_pc', added the
appropriate accessors, and then recompiled. Then I fixed all the
errors. Where possible I generally chose to use the raw_pc accessor,
as it is less expensive.
Note that this patch discounts the possibility that the text section
offset might cause wraparound in the addresses in the line table.
However, this was already discounted -- in particular,
objfile_relocate1 did not re-sort the table in this scenario. (There
was a bug open about this, but as far as I can tell this has never
happened, it's not even clear what inspired that bug.)
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Replace spaces with tabs in a bunch of places.
Change-Id: If0f87180f1d13028dc178e5a8af7882a067868b0
|
|
As of 2023-02-13 "gdb/python: deallocate tui window factories at Python
shut down" (9ae4519da90), a TUI-less build fails with:
$src/gdb/python/py-tui.c: In function ‘void gdbpy_finalize_tui()’:
$src/gdb/python/py-tui.c:621:3: error: ‘gdbpy_tui_window_maker’ has not been declared
621 | gdbpy_tui_window_maker::invalidate_all ();
| ^~~~~~~~~~~~~~~~~~~~~~
Since gdbpy_tui_window_maker is only defined under #ifdef TUI, add an
#ifdef guard in gdbpy_finalize_tui as well.
|
|
The DAP stackTrace implementation did not fully account for frames
without debuginfo. Attemping this would yield a result like:
{"request_seq": 5, "type": "response", "command": "stackTrace", "success": false, "message": "'NoneType' object has no attribute 'filename'", "seq": 11}
This patch fixes the problem by adding another check for None.
|
|
Small cleanup to use std::string::size instead of calling strlen on
the result of std::string::c_str.
Should be no user visible changes after this call.
|
|
The copyright years in the ROCm files (e.g. solib-rocm.c) are wrong,
they end in 2022 instead of 2023. I suppose because I posted (or at
least prepared) the patches in 2022 but merged them in 2023, and forgot
to update the year. I found a bunch of other files that are in the same
situation. Fix them all up.
Change-Id: Ia55f5b563606c2ba6a89046f22bc0bf1c0ff2e10
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Background
----------
When a thread-specific breakpoint is deleted as a result of the
specific thread exiting the function remove_threaded_breakpoints is
called which sets the disposition of the breakpoint to
disp_del_at_next_stop and sets the breakpoint number to 0. Setting
the breakpoint number to zero has the effect of hiding the breakpoint
from the user. We also print a message indicating that the breakpoint
has been deleted.
It was brought to my attention during a review of another patch[1]
that setting a breakpoints number to zero will suppress the MI
breakpoint-deleted notification for that breakpoint, and indeed, this
can be seen to be true, in delete_breakpoint, if the breakpoint number
is zero, then GDB will not notify the breakpoint_deleted observer.
It seems wrong that a user created, thread-specific breakpoint, will
have a =breakpoint-created notification, but will not have a
=breakpoint-deleted notification. I suspect that this is a bug.
[1] https://sourceware.org/pipermail/gdb-patches/2023-February/196560.html
The First Problem
-----------------
During my initial testing I wanted to see how GDB handled the
breakpoint after it's number was set to zero. To do this I created
the testcase gdb.threads/thread-bp-deleted.exp. This test creates a
worker thread, which immediately exits. After the worker thread has
exited the main thread spins in a loop.
In GDB I break once the worker thread has been created and place a
thread-specific breakpoint, then use 'continue&' to resume the
inferior in non-stop mode. The worker thread then exits, but the main
thread never stops - instead it sits in the spin. I then tried to use
'maint info breakpoints' to see what GDB thought of the
thread-specific breakpoint.
Unfortunately, GDB crashed like this:
(gdb) continue&
Continuing.
(gdb) [Thread 0x7ffff7c5d700 (LWP 1202458) exited]
Thread-specific breakpoint 3 deleted - thread 2 no longer in the thread list.
maint info breakpoints
... snip some output ...
Fatal signal: Segmentation fault
----- Backtrace -----
0x5ffb62 gdb_internal_backtrace_1
../../src/gdb/bt-utils.c:122
0x5ffc05 _Z22gdb_internal_backtracev
../../src/gdb/bt-utils.c:168
0x89965e handle_fatal_signal
../../src/gdb/event-top.c:964
0x8997ca handle_sigsegv
../../src/gdb/event-top.c:1037
0x7f96f5971b1f ???
/usr/src/debug/glibc-2.30-2-gd74461fa34/nptl/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0xe602b0 _Z15print_thread_idP11thread_info
../../src/gdb/thread.c:1439
0x5b3d05 print_one_breakpoint_location
../../src/gdb/breakpoint.c:6542
0x5b462e print_one_breakpoint
../../src/gdb/breakpoint.c:6702
0x5b5354 breakpoint_1
../../src/gdb/breakpoint.c:6924
0x5b58b8 maintenance_info_breakpoints
../../src/gdb/breakpoint.c:7009
... etc ...
As the thread-specific breakpoint is set to disp_del_at_next_stop, and
GDB hasn't stopped yet, then the breakpoint still exists in the global
breakpoint list.
The breakpoint will not show in 'info breakpoints' as its number is
zero, but it will show in 'maint info breakpoints'.
As GDB prints the breakpoint, the thread-id for the breakpoint is
printed as part of the 'stop only in thread ...' line. Printing the
thread-id involves calling find_thread_global_id to convert the global
thread-id into a thread_info*. Then calling print_thread_id to
convert the thread_info* into a string.
The problem is that find_thread_global_id returns nullptr as the
thread for the thread-specific breakpoint has exited. The
print_thread_id assumes it will be passed a non-nullptr. As a result
GDB crashes.
In this commit I've added an assert to print_thread_id (gdb/thread.c)
to check that the pointed passed in is not nullptr. This assert would
have triggered in the above case before GDB crashed.
MI Notifications: The Dangers Of Changing A Breakpoint's Number
---------------------------------------------------------------
Currently the delete_breakpoint function doesn't trigger the
breakpoint_deleted observer for any breakpoint with the number zero.
There is a comment explaining why this is the case in the code; it's
something about watchpoints. But I did consider just removing the 'is
the number zero' guard and always triggering the breakpoint_deleted
observer, figuring that I'd then fix the watchpoint issue some other
way.
But I realised this wasn't going to be good enough. When the MI
notification was delivered the number would be zero, so any frontend
parsing the notifications would not be able to match
=breakpoint-deleted notification to the earlier =breakpoint-created
notification.
What this means is that, at the point the breakpoint_deleted observer
is called, the breakpoint's number must be correct.
MI Notifications: The Dangers Of Delaying Deletion
--------------------------------------------------
The test I used to expose the above crash also brought another problem
to my attention. In the above test we used 'continue&' to resume,
after which a thread exited, but the inferior didn't stop. Recreating
the same test in the MI looks like this:
-break-insert -p 2 main
^done,bkpt={number="2",type="breakpoint",disp="keep",...<snip>...}
(gdb)
-exec-continue
^running
*running,thread-id="all"
(gdb)
~"[Thread 0x7ffff7c5d700 (LWP 987038) exited]\n"
=thread-exited,id="2",group-id="i1"
~"Thread-specific breakpoint 2 deleted - thread 2 no longer in the thread list.\n"
At this point the we have a single thread left, which is still
running:
-thread-info
^done,threads=[{id="1",target-id="Thread 0x7ffff7c5eb80 (LWP 987035)",name="thread-bp-delet",state="running",core="4"}],current-thread-id="1"
(gdb)
Notice that we got the =thread-exited notification from GDB as soon as
the thread exited. We also saw the CLI line from GDB, the line
explaining that breakpoint 2 was deleted. But, as expected, we didn't
see the =breakpoint-deleted notification.
I say "as expected" because the number was set to zero. But, even if
the number was not set to zero we still wouldn't see the
notification. The MI notification is driven by the breakpoint_deleted
observer, which is only called when we actually delete the breakpoint,
which is only done the next time GDB stops.
Now, maybe this is fine. The notification is delivered a little
late. But remember, by setting the number to zero the breakpoint will
be hidden from the user, for example, the breakpoint is removed from
the MI's -break-info command output.
This means that GDB is in a position where the breakpoint doesn't show
up in the breakpoint table, but a =breakpoint-deleted notification has
not yet been sent out. This doesn't seem right to me.
What this means is that, when the thread exits, we should immediately
be sending out the =breakpoint-deleted notification. We should not
wait for GDB to next stop before sending the notification.
The Solution
------------
My proposed solution is this; in remove_threaded_breakpoints, instead
of setting the disposition to disp_del_at_next_stop and setting the
number to zero, we now just call delete_breakpoint directly.
The notification will now be sent out immediately; as soon as the
thread exits.
As the number has not changed when delete_breakpoint is called, the
notification will have the correct number.
And as the breakpoint is immediately removed from the breakpoint list,
we no longer need to worry about 'maint info breakpoints' trying to
print the thread-id for an exited thread.
My only concern is that calling delete_breakpoint directly seems so
obvious that I wonder why the original patch (that added
remove_threaded_breakpoints) didn't take this approach. This code was
added in commit 49fa26b0411d, but the commit message offers no clues
to why this approach was taken, and the original email thread offers
no insights either[2]. There are no test regressions after making
this change, so I'm hopeful that this is going to be fine.
[2] https://sourceware.org/pipermail/gdb-patches/2013-September/106493.html
The Complication
----------------
Of course, it couldn't be that simple.
The script gdb.python/py-finish-breakpoint.exp had some regressions
during testing.
The problem was with the FinishBreakpoint.out_of_scope callback
implementation. This callback is supposed to trigger whenever the
FinishBreakpoint goes out of scope; and this includes when the thread
for the breakpoint exits.
The problem I ran into is the Python FinishBreakpoint implementation.
Specifically, after this change I was loosing some of the out_of_scope
calls.
The problem is that the out_of_scope call (of which I'm interested) is
triggered from the inferior_exit observer. Before my change the
observers were called in this order:
thread_exit
inferior_exit
breakpoint_deleted
The inferior_exit would trigger the out_of_scope call.
After my change the breakpoint_deleted notification (for
thread-specific breakpoints) occurs earlier, as soon as the
thread-exits, so now the order is:
thread_exit
breakpoint_deleted
inferior_exit
Currently, after the breakpoint_deleted call the Python object
associated with the breakpoint is released, so, when we get to the
inferior_exit observer, there's no longer a Python object to call the
out_of_scope method on.
My solution is to follow the model for how bpfinishpy_pre_stop_hook
and bpfinishpy_post_stop_hook are called, this is done from
gdbpy_breakpoint_cond_says_stop in py-breakpoint.c.
I've now added a new bpfinishpy_pre_delete_hook
gdbpy_breakpoint_deleted in py-breakpoint.c, and from this new hook
function I check and where needed call the out_of_scope method.
With this fix in place I now see the
gdb.python/py-finish-breakpoint.exp test fully passing again.
Testing
-------
Tested on x86-64/Linux with unix, native-gdbserver, and
native-extended-gdbserver boards.
New tests added to covers all the cases I've discussed above.
Approved-By: Pedro Alves <pedro@palves.net>
|
|
See the previous patches in this series for the motivation behind
these changes.
This commit contains updates to Python's QUIT handling. Ideally, we'd
like to throw gdb_exception_forced_quit through the extension
language; I made an attempt to do this for gdb_exception_quit in an
earlier version of this patch, but Pedro pointed out that it is
(almost certainly) not safe to do so.
Still, we definitely don't want to swallow the exception representing
a SIGTERM for GDB, nor do we want to force modules written in the
extension language to have to explicitly handle this case. Since the
idea is for GDB to cleanup and quit for this exception, we'll simply
call quit_force() just as if the gdb_exception_forced_quit propagation
had managed to make it back to the top level.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26761
Tested-by: Tom de Vries <tdevries@suse.de>
Approved-By: Pedro Alves <pedro@palves.net>
|
|
Hannes filed a bug showing a crash, where a pretty-printer written in
Python could cause a use-after-free. He sent a patch, but I thought a
different approach was needed.
In a much earlier patch (see bug #12533), we changed the Python code
to release new values from the value chain when constructing a
gdb.Value. The rationale for this is that if you write a command that
does a lot of computations in a loop, all the values will be kept live
by the value chain, resulting in gdb using a large amount of memory.
However, suppose a value is passed to Python from some code in gdb
that needs to use the value after the call into Python. In this
scenario, value_to_value_object will still release the value -- and
because gdb code doesn't generally keep strong references to values (a
consequence of the ancient decision to use the value chain to avoid
memory management), this will result in a use-after-free.
This scenario can happen, as it turns out, when a value is passed to
Python for pretty-printing. Now, normally this route boxes the value
via value_to_value_object_no_release, avoiding the problematic release
from the value chain. However, if you then call Value.cast, the
underlying value API might return the same value, when is then
released from the chain.
This patch fixes the problem by changing how value boxing is done.
value_to_value_object no longer removes a value from the chain.
Instead, every spot in gdb that might construct new values uses a
scoped_value_mark to ensure that the requirements of bug #12533 are
met. And, because incoming values aren't ever released from the chain
(the Value.cast one comes earlier on the chain than the
scoped_value_mark), the bug can no longer occur. (Note that many
spots in the Python layer already take this approach, so not many
places needed to be touched.)
In the future I think we should replace the use of raw "value *" with
value_ref_ptr pretty much everywhere. This will ensure lifetime
safety throughout gdb.
The test case in this patch comes from Hannes' original patch. I only
made a trivial ("require") change to it. However, while this fails
for him, I can't make it fail on this machine; nevertheless, he tried
my patch and reported the bug as being fixed.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30044
|
|
Change-Id: Ie8ec8870a16d71c5858f5d08958309d23c318302
Reviewed-By: Tom Tromey <tom@tromey.com>
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
|
|
Use nullptr instead of NULL and boolify two local variables in
execute_gdb_command.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This removes ALL_BLOCK_SYMBOLS_WITH_NAME in favor of foreach.
|
|
This converts block_static_block and block_global_block to be methods.
This was mostly written by script. It was simpler to convert them at
the same time because they're often used near each other.
|
|
This removes deprecated_lval_hack and the VALUE_LVAL macro, replacing
all uses with a call to value::lval.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
record_latest_value now access some internals of struct value, so turn
it into a method.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes preserve_one_value to be a method of value. Much of this
patch was written by script.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This turns many functions that are related to optimized-out or
availability-checking to be methods of value. The static function
value_entirely_covered_by_range_vector is also converted to be a
private method.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This turns value_copy into a method of value. Much of this was
written by script.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This turns the remaining value_contents functions -- value_contents,
value_contents_all, value_contents_for_printing, and
value_contents_for_printing_const -- into methods of value. It also
converts the static functions require_not_optimized_out and
require_available to be private methods.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes value_incref and value_decref to be methods of value.
Much of this patch was written by script.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes value_fetch_lazy to be a method of value. A few helper
functions are converted as well, to avoid problems in later patches
when the data members are all made private.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This turns value_zero into a static "constructor" of value.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This turns allocate_optimized_out_value into a static "constructor" of
value.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes allocate_value to be a static "constructor" of value.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes the value_address and set_value_address functions to be
methods of value.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|