stepping is disturbed by setjmp/longjmp | try/catch in other threads

At https://sourceware.org/ml/gdb-patches/2015-08/msg00097.html, Joel observed that trying to next/step a program on GNU/Linux sometimes results in the following failed assertion: % gdb -q .obj/gprof/main (gdb) start (gdb) n (gdb) step [...]/infrun.c:2391: internal-error: resume: Assertion `sig != GDB_SIGNAL_0' failed. What happened is that, during the "next" operation, GDB hit a longjmp/exception/step-resume breakpoint but failed to see that this breakpoint was set for a different thread than the one being stepped. Joel's detailed analysis follows: More precisely, at the end of the "start" command, we are stopped at the start of function Main in main.adb; there are 4 threads in total, and we are in the main thread (which is thread 1): (gdb) info thread Id Target Id Frame 4 Thread 0xb7a56ba0 (LWP 28379) 0xffffe410 in __kernel_vsyscall () 3 Thread 0xb7c5aba0 (LWP 28378) 0xffffe410 in __kernel_vsyscall () 2 Thread 0xb7e5eba0 (LWP 28377) 0xffffe410 in __kernel_vsyscall () * 1 Thread 0xb7ea18c0 (LWP 28370) main () at /[...]/main.adb:57 All the logs below reference Thread ID/LWP, but it'll be easier to talk about the threads by GDB thread number. For instance, thread 1 is LWP 28370 while thread 3 is LWP 28378. So, the explanations below translate the LWPs into thread numbers. Back to what happens while we are trying to "next' our program: (gdb) n infrun: clear_proceed_status_thread (Thread 0xb7a56ba0 (LWP 28379)) infrun: clear_proceed_status_thread (Thread 0xb7c5aba0 (LWP 28378)) infrun: clear_proceed_status_thread (Thread 0xb7e5eba0 (LWP 28377)) infrun: clear_proceed_status_thread (Thread 0xb7ea18c0 (LWP 28370)) infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT) infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x805451e infrun: target_wait (-1.0.0, status) = infrun: 28370.28370.0 [Thread 0xb7ea18c0 (LWP 28370)], infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP infrun: TARGET_WAITKIND_STOPPED infrun: stop_pc = 0x8054523 We've resumed thread 1 (LWP 28370), and received in return a signal that the same thread stopped slightly further. It's still in the range of instructions for the line of source we started the "next" from, as evidenced by the following trace... infrun: stepping inside range [0x805451e-0x8054531] ... and thus, we decide to continue stepping the same thread: infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523 infrun: prepare_to_wait That's when we get an event from a different thread (thread 3)... infrun: target_wait (-1.0.0, status) = infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)], infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP infrun: TARGET_WAITKIND_STOPPED infrun: stop_pc = 0x80782d0 infrun: context switch infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378) ... which we find to be at the address where we set a breakpoint on "the unwinder debug hook" (namely "_Unwind_DebugHook"). But GDB fails to notice that the breakpoint was inserted for thread 1 only, and so decides to handle it as... infrun: BPSTAT_WHAT_SET_LONGJMP_RESUME ... and inserts a breakpoint at the corresponding resume address, as evidenced by this the next log: infrun: exception resume at 80542a2 That breakpoint seems innocent right now, but will play a role fairly quickly. But for now, GDB has inserted the exception-resume breakpoint, and needs to single-step thread 3 past the breakpoint it just hit. Thus, it temporarily disables the exception breakpoint, and requests a step of that thread: infrun: skipping breakpoint: stepping past insn at: 0x80782d0 infrun: skipping breakpoint: stepping past insn at: 0x80782d0 infrun: skipping breakpoint: stepping past insn at: 0x80782d0 infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 0xb7c5aba0 (LWP 28378)] at 0x80782d0 infrun: prepare_to_wait We then get a notification, still from thread 3, that it's now past that breakpoint... infrun: prepare_to_wait infrun: target_wait (-1.0.0, status) = infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)], infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP infrun: TARGET_WAITKIND_STOPPED infrun: stop_pc = 0x8078424 ... so we can resume what we were doing before, which is single-stepping thread 1 until we get to a new line of code: infrun: switching back to stepped thread infrun: Switching context from Thread 0xb7c5aba0 (LWP 28378) to Thread 0xb7ea18c0 (LWP 28370) infrun: expected thread still hasn't advanced infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523 The "resume" log above shows that we're resuming thread 1 from where we left off (0x8054523). We get one more stop at 0x8054529, which is still inside our stepping range so we go again. That's when we get the following event, from thread 3: infrun: prepare_to_wait infrun: target_wait (-1.0.0, status) = infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)], infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP infrun: TARGET_WAITKIND_STOPPED infrun: stop_pc = 0x80542a2 Now the stop_pc address is interesting, because it's the address of "exception resume" breakpoint... infrun: context switch infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378) infrun: BPSTAT_WHAT_CLEAR_LONGJMP_RESUME ... and since that location is at a different line of code, this is where it decides the "next" operation should stop: infrun: stop_waiting [Switching to Thread 0xb7c5aba0 (LWP 28378)] 0x080542a2 in inte_tache_rt.ttache_rt ( <_task>=0x80968ec <inte_tache_rt_inst.tache2>) at /[...]/inte_tache_rt.adb:54 54 end loop; However, what GDB should have noticed earlier that the exception breakpoint we hit was for a different thread, thus should have single-stepped that thread out of the breakpoint _without_ inserting the exception-return breakpoint, and then resumed the single-stepping of the initial thread (thread 1) until that thread stepped out of its stepping range. This is what this patch does, and after applying it, GDB now correctly stops on the next line of code. The patch adds a C++ test that exercises this, both for setjmp/longjmp and exception breakpoints. With an unpatched GDB it shows: (gdb) next [Switching to Thread 22445.22455] thread_try_catch (arg=0x0) at /home/pedro/gdb/mygit/build/../src/gdb/testsuite/gdb.threads/next-other-thr-longjmp.c:59 59 catch (...) (gdb) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 1 next /home/pedro/gdb/mygit/build/../src/gdb/infrun.c:4865: internal-error: process_event_stop_test: Assertion `ecs->event_thread->control.exception_resume_breakpoint != NULL' fa iled. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 2 (GDB internal error) Resyncing due to internal error. n Tested on x86_64-linux, no regressions. gdb/ChangeLog: 2015-08-05 Pedro Alves <palves@redhat.com> Joel Brobecker <brobecker@adacore.com> * breakpoint.c (bpstat_what) <bp_longjmp, bp_longjmp_call_dummy> <bp_exception, bp_longjmp_resume, bp_exception_resume>: Handle the case where BS->STOP is not set. gdb/testsuite/ChangeLog: 2015-08-05 Pedro Alves <palves@redhat.com> * gdb.threads/next-while-other-thread-longjmps.c: New file. * gdb.threads/next-while-other-thread-longjmps.exp: New file.
author: Pedro Alves <palves@redhat.com> 2015-08-05 20:01:42 +0100
committer: Pedro Alves <palves@redhat.com> 2015-08-05 20:01:42 +0100
commit: 0a39bb3218ec528236da4953a97d07f0da9313ce (patch)
tree: fb79fb7950d5dabd0f6e7a40aeae93b47480eddc /gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.exp
parent: 63b9bbb7d7bcdcb6e0f59dd8da9615d80c537b8d (diff)
download: gdb-0a39bb3218ec528236da4953a97d07f0da9313ce.zip
gdb-0a39bb3218ec528236da4953a97d07f0da9313ce.tar.gz
gdb-0a39bb3218ec528236da4953a97d07f0da9313ce.tar.bz2
1 files changed, 40 insertions, 0 deletions
diff --git a/gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.exp b/gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.exp
new file mode 100644
index 0000000..72a0617
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.exp
@@ -0,0 +1,40 @@
+# Copyright (C) 2015 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This test has the main thread step over a few lines, while a few
+# threads constantly do setjmp/long and others do try/catch.  The
+# "next" commands in the main thread should be able to complete
+# undisturbed.
+
+standard_testfile
+
+set linenum [gdb_get_line_number "set break here"]
+
+if {[prepare_for_testing "failed to prepare" \
+	 $testfile $srcfile {c++ debug pthreads}] == -1} {
+    return -1
+}
+
+if ![runto_main] then {
+    fail "Can't run to main"
+    return 0
+}
+
+gdb_breakpoint $linenum
+gdb_continue_to_breakpoint "start line"
+
+for {set i 1} {$i <= 10} {incr i} {
+    gdb_test "next" " line $i .*" "next to line $i"
+}
author	Pedro Alves <palves@redhat.com>	2015-08-05 20:01:42 +0100
committer	Pedro Alves <palves@redhat.com>	2015-08-05 20:01:42 +0100
commit	0a39bb3218ec528236da4953a97d07f0da9313ce (patch)
tree	fb79fb7950d5dabd0f6e7a40aeae93b47480eddc /gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.exp
parent	63b9bbb7d7bcdcb6e0f59dd8da9615d80c537b8d (diff)
download	gdb-0a39bb3218ec528236da4953a97d07f0da9313ce.zip gdb-0a39bb3218ec528236da4953a97d07f0da9313ce.tar.gz gdb-0a39bb3218ec528236da4953a97d07f0da9313ce.tar.bz2