aboutsummaryrefslogtreecommitdiff
path: root/gdb/thread.c
diff options
context:
space:
mode:
authorAndrew Burgess <aburgess@redhat.com>2023-02-16 13:06:29 +0000
committerAndrew Burgess <aburgess@redhat.com>2023-02-28 10:56:28 +0000
commit2968b79fca38cf18e8eef360c36de7a6e3846d3c (patch)
tree8554b75a28346a635b347669917446b9f602f519 /gdb/thread.c
parent05ac6365e558a207fcf7fa39c3fc8c7b1d0601aa (diff)
downloadgdb-2968b79fca38cf18e8eef360c36de7a6e3846d3c.zip
gdb-2968b79fca38cf18e8eef360c36de7a6e3846d3c.tar.gz
gdb-2968b79fca38cf18e8eef360c36de7a6e3846d3c.tar.bz2
gdb: fix mi breakpoint-deleted notifications for thread-specific b/p
Background ---------- When a thread-specific breakpoint is deleted as a result of the specific thread exiting the function remove_threaded_breakpoints is called which sets the disposition of the breakpoint to disp_del_at_next_stop and sets the breakpoint number to 0. Setting the breakpoint number to zero has the effect of hiding the breakpoint from the user. We also print a message indicating that the breakpoint has been deleted. It was brought to my attention during a review of another patch[1] that setting a breakpoints number to zero will suppress the MI breakpoint-deleted notification for that breakpoint, and indeed, this can be seen to be true, in delete_breakpoint, if the breakpoint number is zero, then GDB will not notify the breakpoint_deleted observer. It seems wrong that a user created, thread-specific breakpoint, will have a =breakpoint-created notification, but will not have a =breakpoint-deleted notification. I suspect that this is a bug. [1] https://sourceware.org/pipermail/gdb-patches/2023-February/196560.html The First Problem ----------------- During my initial testing I wanted to see how GDB handled the breakpoint after it's number was set to zero. To do this I created the testcase gdb.threads/thread-bp-deleted.exp. This test creates a worker thread, which immediately exits. After the worker thread has exited the main thread spins in a loop. In GDB I break once the worker thread has been created and place a thread-specific breakpoint, then use 'continue&' to resume the inferior in non-stop mode. The worker thread then exits, but the main thread never stops - instead it sits in the spin. I then tried to use 'maint info breakpoints' to see what GDB thought of the thread-specific breakpoint. Unfortunately, GDB crashed like this: (gdb) continue& Continuing. (gdb) [Thread 0x7ffff7c5d700 (LWP 1202458) exited] Thread-specific breakpoint 3 deleted - thread 2 no longer in the thread list. maint info breakpoints ... snip some output ... Fatal signal: Segmentation fault ----- Backtrace ----- 0x5ffb62 gdb_internal_backtrace_1 ../../src/gdb/bt-utils.c:122 0x5ffc05 _Z22gdb_internal_backtracev ../../src/gdb/bt-utils.c:168 0x89965e handle_fatal_signal ../../src/gdb/event-top.c:964 0x8997ca handle_sigsegv ../../src/gdb/event-top.c:1037 0x7f96f5971b1f ??? /usr/src/debug/glibc-2.30-2-gd74461fa34/nptl/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0 0xe602b0 _Z15print_thread_idP11thread_info ../../src/gdb/thread.c:1439 0x5b3d05 print_one_breakpoint_location ../../src/gdb/breakpoint.c:6542 0x5b462e print_one_breakpoint ../../src/gdb/breakpoint.c:6702 0x5b5354 breakpoint_1 ../../src/gdb/breakpoint.c:6924 0x5b58b8 maintenance_info_breakpoints ../../src/gdb/breakpoint.c:7009 ... etc ... As the thread-specific breakpoint is set to disp_del_at_next_stop, and GDB hasn't stopped yet, then the breakpoint still exists in the global breakpoint list. The breakpoint will not show in 'info breakpoints' as its number is zero, but it will show in 'maint info breakpoints'. As GDB prints the breakpoint, the thread-id for the breakpoint is printed as part of the 'stop only in thread ...' line. Printing the thread-id involves calling find_thread_global_id to convert the global thread-id into a thread_info*. Then calling print_thread_id to convert the thread_info* into a string. The problem is that find_thread_global_id returns nullptr as the thread for the thread-specific breakpoint has exited. The print_thread_id assumes it will be passed a non-nullptr. As a result GDB crashes. In this commit I've added an assert to print_thread_id (gdb/thread.c) to check that the pointed passed in is not nullptr. This assert would have triggered in the above case before GDB crashed. MI Notifications: The Dangers Of Changing A Breakpoint's Number --------------------------------------------------------------- Currently the delete_breakpoint function doesn't trigger the breakpoint_deleted observer for any breakpoint with the number zero. There is a comment explaining why this is the case in the code; it's something about watchpoints. But I did consider just removing the 'is the number zero' guard and always triggering the breakpoint_deleted observer, figuring that I'd then fix the watchpoint issue some other way. But I realised this wasn't going to be good enough. When the MI notification was delivered the number would be zero, so any frontend parsing the notifications would not be able to match =breakpoint-deleted notification to the earlier =breakpoint-created notification. What this means is that, at the point the breakpoint_deleted observer is called, the breakpoint's number must be correct. MI Notifications: The Dangers Of Delaying Deletion -------------------------------------------------- The test I used to expose the above crash also brought another problem to my attention. In the above test we used 'continue&' to resume, after which a thread exited, but the inferior didn't stop. Recreating the same test in the MI looks like this: -break-insert -p 2 main ^done,bkpt={number="2",type="breakpoint",disp="keep",...<snip>...} (gdb) -exec-continue ^running *running,thread-id="all" (gdb) ~"[Thread 0x7ffff7c5d700 (LWP 987038) exited]\n" =thread-exited,id="2",group-id="i1" ~"Thread-specific breakpoint 2 deleted - thread 2 no longer in the thread list.\n" At this point the we have a single thread left, which is still running: -thread-info ^done,threads=[{id="1",target-id="Thread 0x7ffff7c5eb80 (LWP 987035)",name="thread-bp-delet",state="running",core="4"}],current-thread-id="1" (gdb) Notice that we got the =thread-exited notification from GDB as soon as the thread exited. We also saw the CLI line from GDB, the line explaining that breakpoint 2 was deleted. But, as expected, we didn't see the =breakpoint-deleted notification. I say "as expected" because the number was set to zero. But, even if the number was not set to zero we still wouldn't see the notification. The MI notification is driven by the breakpoint_deleted observer, which is only called when we actually delete the breakpoint, which is only done the next time GDB stops. Now, maybe this is fine. The notification is delivered a little late. But remember, by setting the number to zero the breakpoint will be hidden from the user, for example, the breakpoint is removed from the MI's -break-info command output. This means that GDB is in a position where the breakpoint doesn't show up in the breakpoint table, but a =breakpoint-deleted notification has not yet been sent out. This doesn't seem right to me. What this means is that, when the thread exits, we should immediately be sending out the =breakpoint-deleted notification. We should not wait for GDB to next stop before sending the notification. The Solution ------------ My proposed solution is this; in remove_threaded_breakpoints, instead of setting the disposition to disp_del_at_next_stop and setting the number to zero, we now just call delete_breakpoint directly. The notification will now be sent out immediately; as soon as the thread exits. As the number has not changed when delete_breakpoint is called, the notification will have the correct number. And as the breakpoint is immediately removed from the breakpoint list, we no longer need to worry about 'maint info breakpoints' trying to print the thread-id for an exited thread. My only concern is that calling delete_breakpoint directly seems so obvious that I wonder why the original patch (that added remove_threaded_breakpoints) didn't take this approach. This code was added in commit 49fa26b0411d, but the commit message offers no clues to why this approach was taken, and the original email thread offers no insights either[2]. There are no test regressions after making this change, so I'm hopeful that this is going to be fine. [2] https://sourceware.org/pipermail/gdb-patches/2013-September/106493.html The Complication ---------------- Of course, it couldn't be that simple. The script gdb.python/py-finish-breakpoint.exp had some regressions during testing. The problem was with the FinishBreakpoint.out_of_scope callback implementation. This callback is supposed to trigger whenever the FinishBreakpoint goes out of scope; and this includes when the thread for the breakpoint exits. The problem I ran into is the Python FinishBreakpoint implementation. Specifically, after this change I was loosing some of the out_of_scope calls. The problem is that the out_of_scope call (of which I'm interested) is triggered from the inferior_exit observer. Before my change the observers were called in this order: thread_exit inferior_exit breakpoint_deleted The inferior_exit would trigger the out_of_scope call. After my change the breakpoint_deleted notification (for thread-specific breakpoints) occurs earlier, as soon as the thread-exits, so now the order is: thread_exit breakpoint_deleted inferior_exit Currently, after the breakpoint_deleted call the Python object associated with the breakpoint is released, so, when we get to the inferior_exit observer, there's no longer a Python object to call the out_of_scope method on. My solution is to follow the model for how bpfinishpy_pre_stop_hook and bpfinishpy_post_stop_hook are called, this is done from gdbpy_breakpoint_cond_says_stop in py-breakpoint.c. I've now added a new bpfinishpy_pre_delete_hook gdbpy_breakpoint_deleted in py-breakpoint.c, and from this new hook function I check and where needed call the out_of_scope method. With this fix in place I now see the gdb.python/py-finish-breakpoint.exp test fully passing again. Testing ------- Tested on x86-64/Linux with unix, native-gdbserver, and native-extended-gdbserver boards. New tests added to covers all the cases I've discussed above. Approved-By: Pedro Alves <pedro@palves.net>
Diffstat (limited to 'gdb/thread.c')
-rw-r--r--gdb/thread.c2
1 files changed, 2 insertions, 0 deletions
diff --git a/gdb/thread.c b/gdb/thread.c
index 1a852f7..5b47215 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -1431,6 +1431,8 @@ show_inferior_qualified_tids (void)
const char *
print_thread_id (struct thread_info *thr)
{
+ gdb_assert (thr != nullptr);
+
char *s = get_print_cell ();
if (show_inferior_qualified_tids ())