diff options
| author | Andrew Burgess <aburgess@redhat.com> | 2025-08-04 15:44:04 +0100 |
|---|---|---|
| committer | Andrew Burgess <aburgess@redhat.com> | 2025-08-29 15:41:08 +0100 |
| commit | 7f15a94e6553deb8074b3aa1ed2d0157b00f7aec (patch) | |
| tree | 5b94dbb743522e2db3d5ac862a56561a71c78c4d /gdb/python/python.c | |
| parent | 81e90cf63a10ad11772c2437c8f2a88f1a00c739 (diff) | |
| download | binutils-7f15a94e6553deb8074b3aa1ed2d0157b00f7aec.zip binutils-7f15a94e6553deb8074b3aa1ed2d0157b00f7aec.tar.gz binutils-7f15a94e6553deb8074b3aa1ed2d0157b00f7aec.tar.bz2 | |
gdb: use kill() in gdbpy_interrupt for hosts with signal support
For background, see this thread:
https://inbox.sourceware.org/gdb-patches/20250612144607.27507-1-tdevries@suse.de
Tom describes the issue clearly in the above thread, here's what he
said:
Once in a while, when running test-case gdb.base/bp-cmds-continue-ctrl-c.exp,
I run into:
...
Breakpoint 2, foo () at bp-cmds-continue-ctrl-c.c:23^M
23 usleep (100);^M
^CFAIL: $exp: run: stop with control-c (unexpected) (timeout)
FAIL: $exp: run: stop with control-c
...
This is PR python/32167, observed both on x86_64-linux and powerpc64le-linux.
This is not a timeout due to accidental slowness, gdb actually hangs.
The backtrace at the hang is (on cfarm120 running AlmaLinux 9.6):
...
(gdb) bt
#0 0x00007fffbca9dd94 in __lll_lock_wait () from
/lib64/glibc-hwcaps/power10/libc.so.6
#1 0x00007fffbcaa6ddc in pthread_mutex_lock@@GLIBC_2.17 () from
/lib64/glibc-hwcaps/power10/libc.so.6
#2 0x000000001067aee8 in __gthread_mutex_lock ()
at /usr/include/c++/11/ppc64le-redhat-linux/bits/gthr-default.h:749
#3 0x000000001067afc8 in __gthread_recursive_mutex_lock ()
at /usr/include/c++/11/ppc64le-redhat-linux/bits/gthr-default.h:811
#4 0x000000001067b0d4 in std::recursive_mutex::lock ()
at /usr/include/c++/11/mutex:108
#5 0x000000001067b380 in std::lock_guard<std::recursive_mutex>::lock_guard ()
at /usr/include/c++/11/bits/std_mutex.h:229
#6 0x0000000010679d3c in set_quit_flag () at gdb/extension.c:865
#7 0x000000001066b6dc in handle_sigint () at gdb/event-top.c:1264
#8 0x00000000109e3b3c in handler_wrapper () at gdb/posix-hdep.c:70
#9 <signal handler called>
#10 0x00007fffbcaa6d14 in pthread_mutex_lock@@GLIBC_2.17 () from
/lib64/glibc-hwcaps/power10/libc.so.6
#11 0x000000001067aee8 in __gthread_mutex_lock ()
at /usr/include/c++/11/ppc64le-redhat-linux/bits/gthr-default.h:749
#12 0x000000001067afc8 in __gthread_recursive_mutex_lock ()
at /usr/include/c++/11/ppc64le-redhat-linux/bits/gthr-default.h:811
#13 0x000000001067b0d4 in std::recursive_mutex::lock ()
at /usr/include/c++/11/mutex:108
#14 0x000000001067b380 in std::lock_guard<std::recursive_mutex>::lock_guard ()
at /usr/include/c++/11/bits/std_mutex.h:229
#15 0x00000000106799cc in set_active_ext_lang ()
at gdb/extension.c:775
#16 0x0000000010b287ac in gdbpy_enter::gdbpy_enter ()
at gdb/python/python.c:232
#17 0x0000000010a8e3f8 in bpfinishpy_handle_stop ()
at gdb/python/py-finishbreakpoint.c:414
...
What happens here is the following:
- the gdbpy_enter constructor attempts to set the current extension language
to python using set_active_ext_lang
- set_active_ext_lang attempts to lock ext_lang_mutex
- while doing so, it is interrupted by sigint_wrapper (the SIGINT handler),
handling a SIGINT
- sigint_wrapper calls handle_sigint, which calls set_quit_flag, which also
tries to lock ext_lang_mutex
- since std::recursive_mutex::lock is not async-signal-safe, things go wrong,
resulting in a hang.
The hang bisects to commit 8bb8f834672 ("Fix gdb.interrupt race"), which
introduced the lock, making PR python/32167 a regression since gdb 15.1.
Commit 8bb8f834672 fixes PR dap/31263, a race reported by ThreadSanitizer:
...
WARNING: ThreadSanitizer: data race (pid=615372)
Read of size 1 at 0x00000328064c by thread T19:
#0 set_active_ext_lang(extension_language_defn const*) gdb/extension.c:755
#1 scoped_disable_cooperative_sigint_handling::scoped_disable_cooperative_sigint_handling()
gdb/extension.c:697
#2 gdbpy_interrupt gdb/python/python.c:1106
#3 cfunction_vectorcall_NOARGS <null>
Previous write of size 1 at 0x00000328064c by main thread:
#0 scoped_disable_cooperative_sigint_handling::scoped_disable_cooperative_sigint_handling()
gdb/extension.c:704
#1 fetch_inferior_event() gdb/infrun.c:4591
...
Location is global 'cooperative_sigint_handling_disabled' of size 1 at 0x00000328064c
...
SUMMARY: ThreadSanitizer: data race gdb/extension.c:755 in \
set_active_ext_lang(extension_language_defn const*)
...
The problem here is that gdb.interrupt is called from a worker thread, and its
implementation, gdbpy_interrupt races with the main thread on some variable.
The fix presented here is based on the fix that Tom proposed, but
fills in the missing Mingw support.
The problem is basically split into two: hosts that support unix like
signals, and Mingw, which doesn't support signals.
For signal supporting hosts, I've adopted the approach that Tom
suggests, gdbpy_interrupt uses kill() to send SIGINT to the GDB
process. This is then handled in the main thread as if the user had
pressed Ctrl+C. For these hosts no locking is required, so the
existing lock is removed. However, everywhere the lock currently
exists I've added an assert:
gdb_assert (is_main_thread ());
If this assert ever triggers then we're setting or reading the quit
flag on a worker thread, this will be a problem without the mutex.
For Mingw, the current mutex is retained. This is fine as there are
no signals, so no chance of the mutex acquisition being interrupted by
a signal, and so, deadlock shouldn't be an issue.
To manage the complexity of when we need an assert, and when we need
the mutex, I've created 'struct ext_lang_guard', which can be used as
a RAII object. This object either performs the assertion check, or
acquires the mutex, depending on the host.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32167
Co-Authored-By: Tom de Vries <tdevries@suse.de>
Approved-By: Tom Tromey <tom@tromey.com>
Diffstat (limited to 'gdb/python/python.c')
| -rw-r--r-- | gdb/python/python.c | 13 |
1 files changed, 10 insertions, 3 deletions
diff --git a/gdb/python/python.c b/gdb/python/python.c index 1af7896..3b5d069 100644 --- a/gdb/python/python.c +++ b/gdb/python/python.c @@ -1202,15 +1202,22 @@ gdbpy_post_event (PyObject *self, PyObject *args) static PyObject * gdbpy_interrupt (PyObject *self, PyObject *args) { +#ifdef __MINGW32__ { - /* Make sure the interrupt isn't delivered immediately somehow. - This probably is not truly needed, but at the same time it - seems more clear to be explicit about the intent. */ gdbpy_allow_threads temporarily_exit_python; scoped_disable_cooperative_sigint_handling no_python_sigint; set_quit_flag (); } +#else + { + /* For targets with support kill() just send SIGINT. This will be + handled as if the user hit Ctrl+C. This isn't exactly the same as + the above, which directly sets the quit flag. Consider, for + example, every place that install_sigint_handler is called. */ + kill (getpid (), SIGINT); + } +#endif Py_RETURN_NONE; } |
