aboutsummaryrefslogtreecommitdiff
path: root/gdb/testsuite/gdb.threads
AgeCommit message (Collapse)AuthorFilesLines
2023-11-15Fix gdb.threads/threads-after-exec.exp racePedro Alves1-3/+7
Simon noticed that gdb.threads/threads-after-exec.exp was racy. You can consistenly reproduce it (at git hash 319b460545dc79280e2904dcc280057cf71fb753), with: $ taskset -c 0 make check TESTS="gdb.threads/threads-after-exec.exp" gdb.log shows: (...) Thread 3 "threads-after-e" hit Catchpoint 2 (exec'd .../gdb.threads/threads-after-exec/threads-after-exec), 0x00007ffff7fe3290 in _start () from /lib64/ld-linux-x86-64.so.2 (gdb) PASS: gdb.threads/threads-after-exec.exp: continue until exec info threads Id Target Id Frame * 3 process 1443269 "threads-after-e" 0x00007ffff7fe3290 in _start () from /lib64/ld-linux-x86-64.so.2 (gdb) FAIL: gdb.threads/threads-after-exec.exp: info threads (...) maint info linux-lwps LWP Ptid Thread ID 1443269.1443269.0 1.3 (gdb) FAIL: gdb.threads/threads-after-exec.exp: maint info linux-lwps The FAILs happen because the .exp file expects that after the exec, the only thread has GDB thread number 1, but it has instead 3. This is yet another case of zombie leader detection making things a bit fuzzy. In the passing case, we have: continue Continuing. [New Thread 0x7ffff7bff640 (LWP 603183)] [Thread 0x7ffff7bff640 (LWP 603183) exited] process 603180 is executing new program: .../gdb.threads/threads-after-exec/threads-after-exec While in the failing case, we have (note remarks on the rhs): continue Continuing. [New Thread 0x7ffff7bff640 (LWP 600205)] [Thread 0x7ffff7f95740 (LWP 600202) exited] <<< gdb deletes leader thread, thread 1. [New LWP 600202] <<< gdb adds it back -- this is now thread 3. [Thread 0x7ffff7bff640 (LWP 600205) exited] process 600202 is executing new program: .../threads-after-exec/threads-after-exec The testcase only has two threads, yet GDB presented the exec for thread 3. This is GDB deleting the leader (the backend detected it was zombie, due to the exec), and then adding the leader back when it saw the exec event. I've recorded some thoughts about this in PR gdb/31069. For now, this commit just makes the testcase cope with the non-one thread number, as the number is not important for what this test is exercising. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31069 Change-Id: Id80b5c73f09c9e0005efeb494cca5d066ac3bbae
2023-11-14[gdb/testsuite] Fix gdb.threads/stepi-over-clone.exp regexpTom de Vries1-5/+9
I ran into the following FAIL: ... (gdb) PASS: gdb.threads/stepi-over-clone.exp: catch process syscalls continue^M Continuing.^M ^M Catchpoint 2 (call to syscall clone), clone () at \ ../sysdeps/unix/sysv/linux/x86_64/clone.S:78^M warning: 78 ../sysdeps/unix/sysv/linux/x86_64/clone.S: \ No such file or directory^M (gdb) FAIL: gdb.threads/stepi-over-clone.exp: continue ... All but one regexps in the .exp file use "clone\[23\]?" with "?" to also accept "clone", except the failing case. This commit fixes that case to also use "?". Furthermore, there are FAILs like this: ... (gdb) PASS: gdb.threads/stepi-over-clone.exp: third_thread=false: \ non-stop=on: displaced=off: i=0: continue stepi^M [New Thread 0x7ffff7ff8700 (LWP 15301)]^M Hello from the first thread.^M 78 in ../sysdeps/unix/sysv/linux/x86_64/clone.S^M (gdb) XXX: Consume the initial command XXX: Consume new thread line XXX: Consume first worker thread message FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: \ displaced=off: i=0: stepi ... because this output is expected instead: ... Hello from the first thread.^M 0x00000000004212cd in clone3 ()^M ... The root cause for the difference is the presence of .debug_line info for clone. Fix this by updating the relevant regexps. Tested on x86_64-linux, specifically: - openSUSE Leap 15.4 (where the FAILs where observed), and - openSUSE Tumbleweed (where the FAILs where not observed). Co-Authored-By: Pedro Alves <pedro@palves.net> Approved-By: Pedro Alves <pedro@palves.net> Change-Id: I74ca9e7d4cfe6af294fd50e8c509fcbad289b78c
2023-11-13Cancel execution command on thread exit, when stepping, nexting, etc.Pedro Alves1-26/+55
If your target has no support for TARGET_WAITKIND_NO_RESUMED events (and no way to support them, such as the yet-unsubmitted AMDGPU target), and you step over thread exit with scheduler-locking on, this is what you get: (gdb) n [Thread ... exited] *hang* Getting back the prompt by typing Ctrl-C may not even work, since no inferior thread is running to receive the SIGINT. Even if it works, it seems unnecessarily harsh. If you started an execution command for which there's a clear thread of interest (step, next, until, etc.), and that thread disappears, then I think it's more user friendly if GDB just detects the situation and aborts the command, giving back the prompt. That is what this commit implements. It does this by explicitly requesting the target to report thread exit events whenever the main resumed thread has a thread_fsm. Note that unlike stepping over a breakpoint, we don't need to enable clone events in this case. With this patch, we get: (gdb) n [Thread 0x7ffff7d89700 (LWP 3961883) exited] Command aborted, thread exited. (gdb) Reviewed-By: Andrew Burgess <aburgess@redhat.com> Change-Id: I901ab64c91d10830590b2dac217b5264635a2b95
2023-11-13Testcases for stepping over thread exit syscall (PR gdb/27338)Simon Marchi4-0/+324
Add new gdb.threads/step-over-thread-exit.exp and gdb.threads/step-over-thread-exit-while-stop-all-threads.exp testcases, exercising stepping over thread exit syscall. These make use of lib/my-syscalls.S to define the exit syscall. Co-authored-by: Pedro Alves <pedro@palves.net> Reviewed-By: Andrew Burgess <aburgess@redhat.com> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338 Change-Id: Ie8b2c5747db99b7023463a897a8390d9e814a9c9
2023-11-13Don't resume new threads if scheduler-locking is in effectPedro Alves2-0/+121
If scheduler-locking is in effect, e.g., with "set scheduler-locking on", and you step over a function that spawns a new thread, the new thread is allowed to run free, at least until some event is hit, at which point, whether the new thread is re-resumed depends on a number of seemingly random factors. E.g., if the target is all-stop, and the parent thread hits a breakpoint, and GDB decides the breakpoint isn't interesting to report to the user, then the parent thread is resumed, but the new thread is left stopped. I think that letting the new threads run with scheduler-locking enabled is a defect. This commit fixes that, making use of the new clone events on Linux, and of target_thread_events() on targets where new threads have no connection to the thread that spawned them. Testcase and documentation changes included. Approved-By: Eli Zaretskii <eliz@gnu.org> Reviewed-By: Andrew Burgess <aburgess@redhat.com> Change-Id: Ie12140138b37534b7fc1d904da34f0f174aa11ce
2023-11-13Thread options & clone events (Linux GDBserver)Pedro Alves1-6/+0
This patch teaches the Linux GDBserver backend to report clone events to GDB, when GDB has requested them with the GDB_THREAD_OPTION_CLONE thread option, via the new QThreadOptions packet. This shuffles code in linux_process_target::handle_extended_wait around to a more logical order when we now have to handle and potentially report all of fork/vfork/clone. Raname lwp_info::fork_relative -> lwp_info::relative as the field is no longer only about (v)fork. With this, gdb.threads/stepi-over-clone.exp now cleanly passes against GDBserver, so remove the native-target-only requirement from that testcase. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830 Reviewed-By: Andrew Burgess <aburgess@redhat.com> Change-Id: I3a19bc98801ec31e5c6fdbe1ebe17df855142bb2
2023-11-13Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONEDPedro Alves2-0/+485
(A good chunk of the problem statement in the commit log below is Andrew's, adjusted for a different solution, and for covering displaced stepping too. The testcase is mostly Andrew's too.) This commit addresses bugs gdb/19675 and gdb/27830, which are about stepping over a breakpoint set at a clone syscall instruction, one is about displaced stepping, and the other about in-line stepping. Currently, when a new thread is created through a clone syscall, GDB sets the new thread running. With 'continue' this makes sense (assuming no schedlock): - all-stop mode, user issues 'continue', all threads are set running, a newly created thread should also be set running. - non-stop mode, user issues 'continue', other pre-existing threads are not affected, but as the new thread is (sort-of) a child of the thread the user asked to run, it makes sense that the new threads should be created in the running state. Similarly, if we are stopped at the clone syscall, and there's no software breakpoint at this address, then the current behaviour is fine: - all-stop mode, user issues 'stepi', stepping will be done in place (as there's no breakpoint to step over). While stepping the thread of interest all the other threads will be allowed to continue. A newly created thread will be set running, and then stopped once the thread of interest has completed its step. - non-stop mode, user issues 'stepi', stepping will be done in place (as there's no breakpoint to step over). Other threads might be running or stopped, but as with the continue case above, the new thread will be created running. The only possible issue here is that the new thread will be left running after the initial thread has completed its stepi. The user would need to manually select the thread and interrupt it, this might not be what the user expects. However, this is not something this commit tries to change. The problem then is what happens when we try to step over a clone syscall if there is a breakpoint at the syscall address. - For both all-stop and non-stop modes, with in-line stepping: + user issues 'stepi', + [non-stop mode only] GDB stops all threads. In all-stop mode all threads are already stopped. + GDB removes s/w breakpoint at syscall address, + GDB single steps just the thread of interest, all other threads are left stopped, + New thread is created running, + Initial thread completes its step, + [non-stop mode only] GDB resumes all threads that it previously stopped. There are two problems in the in-line stepping scenario above: 1. The new thread might pass through the same code that the initial thread is in (i.e. the clone syscall code), in which case it will fail to hit the breakpoint in clone as this was removed so the first thread can single step, 2. The new thread might trigger some other stop event before the initial thread reports its step completion. If this happens we end up triggering an assertion as GDB assumes that only the thread being stepped should stop. The assert looks like this: infrun.c:5899: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed. - For both all-stop and non-stop modes, with displaced stepping: + user issues 'stepi', + GDB starts the displaced step, moves thread's PC to the out-of-line scratch pad, maybe adjusts registers, + GDB single steps the thread of interest, [non-stop mode only] all other threads are left as they were, either running or stopped. In all-stop, all other threads are left stopped. + New thread is created running, + Initial thread completes its step, GDB re-adjusts its PC, restores/releases scratchpad, + [non-stop mode only] GDB resumes the thread, now past its breakpoint. + [all-stop mode only] GDB resumes all threads. There is one problem with the displaced stepping scenario above: 3. When the parent thread completed its step, GDB adjusted its PC, but did not adjust the child's PC, thus that new child thread will continue execution in the scratch pad, invoking undefined behavior. If you're lucky, you see a crash. If unlucky, the inferior gets silently corrupted. What is needed is for GDB to have more control over whether the new thread is created running or not. Issue #1 above requires that the new thread not be allowed to run until the breakpoint has been reinserted. The only way to guarantee this is if the new thread is held in a stopped state until the single step has completed. Issue #3 above requires that GDB is informed of when a thread clones itself, and of what is the child's ptid, so that GDB can fixup both the parent and the child. When looking for solutions to this problem I considered how GDB handles fork/vfork as these have some of the same issues. The main difference between fork/vfork and clone is that the clone events are not reported back to core GDB. Instead, the clone event is handled automatically in the target code and the child thread is immediately set running. Note we have support for requesting thread creation events out of the target (TARGET_WAITKIND_THREAD_CREATED). However, those are reported for the new/child thread. That would be sufficient to address in-line stepping (issue #1), but not for displaced-stepping (issue #3). To handle displaced-stepping, we need an event that is reported to the _parent_ of the clone, as the information about the displaced step is associated with the clone parent. TARGET_WAITKIND_THREAD_CREATED includes no indication of which thread is the parent that spawned the new child. In fact, for some targets, like e.g., Windows, it would be impossible to know which thread that was, as thread creation there doesn't work by "cloning". The solution implemented here is to model clone on fork/vfork, and introduce a new TARGET_WAITKIND_THREAD_CLONED event. This event is similar to TARGET_WAITKIND_FORKED and TARGET_WAITKIND_VFORKED, except that we end up with a new thread in the same process, instead of a new thread of a new process. Like FORKED and VFORKED, THREAD_CLONED waitstatuses have a child_ptid property, and the child is held stopped until GDB explicitly resumes it. This addresses the in-line stepping case (issues #1 and #2). The infrun code that handles displaced stepping fixup for the child after a fork/vfork event is thus reused for THREAD_CLONE, with some minimal conditions added, addressing the displaced stepping case (issue #3). The native Linux backend is adjusted to unconditionally report TARGET_WAITKIND_THREAD_CLONED events to the core. Following the follow_fork model in core GDB, we introduce a target_follow_clone target method, which is responsible for making the new clone child visible to the rest of GDB. Subsequent patches will add clone events support to the remote protocol and gdbserver. displaced_step_in_progress_thread becomes unused with this patch, but a new use will reappear later in the series. To avoid deleting it and readding it back, this patch marks it with attribute unused, and the latter patch removes the attribute again. We need to do this because the function is static, and with no callers, the compiler would warn, (error with -Werror), breaking the build. This adds a new gdb.threads/stepi-over-clone.exp testcase, which exercises stepping over a clone syscall, with displaced stepping vs inline stepping, and all-stop vs non-stop. We already test stepping over clone syscalls with gdb.base/step-over-syscall.exp, but this test uses pthreads, while the other test uses raw clone, and this one is more thorough. The testcase passes on native GNU/Linux, but fails against GDBserver. GDBserver will be fixed by a later patch in the series. Co-authored-by: Andrew Burgess <aburgess@redhat.com> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830 Change-Id: I95c06024736384ae8542a67ed9fdf6534c325c8e Reviewed-By: Andrew Burgess <aburgess@redhat.com>
2023-11-13gdb/linux: Delete all other LWPs immediately on ptrace exec eventPedro Alves2-0/+113
I noticed that on an Ubuntu 20.04 system, after a following patch ("Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED"), the gdb.threads/step-over-exec.exp was passing cleanly, but still, we'd end up with four new unexpected GDB core dumps: === gdb Summary === # of unexpected core files 4 # of expected passes 48 That said patch is making the pre-existing gdb.threads/step-over-exec.exp testcase (almost silently) expose a latent problem in gdb/linux-nat.c, resulting in a GDB crash when: #1 - a non-leader thread execs #2 - the post-exec program stops somewhere #3 - you kill the inferior Instead of #3 directly, the testcase just returns, which ends up in gdb_exit, tearing down GDB, which kills the inferior, and is thus equivalent to #3 above. Vis (after said patch is applied): $ gdb --args ./gdb /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true ... (top-gdb) r ... (gdb) b main ... (gdb) r ... Breakpoint 1, main (argc=1, argv=0x7fffffffdb88) at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec.c:69 69 argv0 = argv[0]; (gdb) c Continuing. [New Thread 0x7ffff7d89700 (LWP 2506975)] Other going in exec. Exec-ing /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd process 2506769 is executing new program: /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd Thread 1 "step-over-exec-" hit Breakpoint 1, main () at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec-execd.c:28 28 foo (); (gdb) k ... Thread 1 "gdb" received signal SIGSEGV, Segmentation fault. 0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393 393 return m_suspend.waitstatus_pending_p; (top-gdb) bt #0 0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393 #1 0x0000555555a884d1 in get_pending_child_status (lp=0x5555579b8230, ws=0x7fffffffd130) at ../../src/gdb/linux-nat.c:1345 #2 0x0000555555a8e5e6 in kill_unfollowed_child_callback (lp=0x5555579b8230) at ../../src/gdb/linux-nat.c:3564 #3 0x0000555555a92a26 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::operator()(gdb::fv_detail::erased_callable, lwp_info*) const (this=0x0, ecall=..., args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:284 #4 0x0000555555a92a51 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::_FUN(gdb::fv_detail::erased_callable, lwp_info*) () at ../../src/gdb/../gdbsupport/function-view.h:278 #5 0x0000555555a91f84 in gdb::function_view<int (lwp_info*)>::operator()(lwp_info*) const (this=0x7fffffffd210, args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:247 #6 0x0000555555a87072 in iterate_over_lwps(ptid_t, gdb::function_view<int (lwp_info*)>) (filter=..., callback=...) at ../../src/gdb/linux-nat.c:864 #7 0x0000555555a8e732 in linux_nat_target::kill (this=0x55555653af40 <the_amd64_linux_nat_target>) at ../../src/gdb/linux-nat.c:3590 #8 0x0000555555cfdc11 in target_kill () at ../../src/gdb/target.c:911 ... The root of the problem is that when a non-leader LWP execs, it just changes its tid to the tgid, replacing the pre-exec leader thread, becoming the new leader. There's no thread exit event for the execing thread. It's as if the old pre-exec LWP vanishes without trace. The ptrace man page says: "PTRACE_O_TRACEEXEC (since Linux 2.5.46) Stop the tracee at the next execve(2). A waitpid(2) by the tracer will return a status value such that status>>8 == (SIGTRAP | (PTRACE_EVENT_EXEC<<8)) If the execing thread is not a thread group leader, the thread ID is reset to thread group leader's ID before this stop. Since Linux 3.0, the former thread ID can be retrieved with PTRACE_GETEVENTMSG." When the core of GDB processes an exec events, it deletes all the threads of the inferior. But, that is too late -- deleting the thread does not delete the corresponding LWP, so we end leaving the pre-exec non-leader LWP stale in the LWP list. That's what leads to the crash above -- linux_nat_target::kill iterates over all LWPs, and after the patch in question, that code will look for the corresponding thread_info for each LWP. For the pre-exec non-leader LWP still listed, won't find one. This patch fixes it, by deleting the pre-exec non-leader LWP (and thread) from the LWP/thread lists as soon as we get an exec event out of ptrace. GDBserver does not need an equivalent fix, because it is already doing this, as side effect of mourning the pre-exec process, in gdbserver/linux-low.cc: else if (event == PTRACE_EVENT_EXEC && cs.report_exec_events) { ... /* Delete the execing process and all its threads. */ mourn (proc); switch_to_thread (nullptr); The crash with gdb.threads/step-over-exec.exp is not observable on newer systems, which postdate the glibc change to move "libpthread.so" internals to "libc.so.6", because right after the exec, GDB traps a load event for "libc.so.6", which leads to GDB trying to open libthread_db for the post-exec inferior, and, on such systems that succeeds. When we load libthread_db, we call linux_stop_and_wait_all_lwps, which, as the name suggests, stops all lwps, and then waits to see their stops. While doing this, GDB detects that the pre-exec stale LWP is gone, and deletes it. If we use "catch exec" to stop right at the exec before the "libc.so.6" load event ever happens, and issue "kill" right there, then GDB crashes on newer systems as well. So instead of tweaking gdb.threads/step-over-exec.exp to cover the fix, add a new gdb.threads/threads-after-exec.exp testcase that uses "catch exec". The test also uses the new "maint info linux-lwps" command if testing on Linux native, which also exposes the stale LWP problem with an unfixed GDB. Also tweak a comment in infrun.c:follow_exec referring to how linux-nat.c used to behave, as it would become stale otherwise. Reviewed-By: Andrew Burgess <aburgess@redhat.com> Change-Id: I21ec18072c7750f3a972160ae6b9e46590376643
2023-11-08gdb: call update_thread_list after completing an inferior callAndrew Burgess2-0/+263
I noticed that if GDB is using a remote or extended-remote target, then, if an inferior call caused a new thread to appear, or for an existing thread to exit, then these events are not reported to the user. The problem is that for these targets GDB relies on a call to update_thread_list to learn about changes to the inferior's thread list. If GDB doesn't pass through the normal stop code then GDB will not call update_thread_list, and so will not report changes in the thread list. This commit adds an additional update_thread_list call, after which thread events are correctly reported.
2023-11-08gdb: call update_thread_list for $_inferior_thread_count functionAndrew Burgess2-0/+255
I noticed that sometimes the value returned by $_inferior_thread_count can become out of sync with the actual thread count of the inferior, and will disagree with the number of threads reported by 'info threads'. This commit fixes this issue. The cause of the problem is that 'info threads' includes a call to update_thread_list, this can be seen in print_thread_info_1 in thread.c, while $_inferior_thread_count doesn't include a similar call, see the function inferior_thread_count_make_value also in thread.c. Of course, this is only a problem when GDB is running on a target that relies on update_thread_list calls to learn about new threads, e.g. remote or extended-remote targets. Native targets generally learn about new threads as soon as they appear and will not have this problem. I ran into this issue when writing a test for the next commit which uses inferior function calls to add an remove threads from an inferior. But for testing I've made use of non-stop mode and asynchronous inferior execution; by reading the inferior state I can know when a new thread has been created, at which point I can print $_inferior_thread_count while the inferior is still running. This is important, if I stop the inferior then GDB will pass through an update_thread_list call in the normal stop code, which will synchronise the thread list, after which $_inferior_thread_count will report the correct value. With this change in place $_inferior_thread_count is now correct.
2023-10-26gdb: handle main thread exiting during detachAndrew Burgess2-0/+201
Overview ======== Consider the following situation, GDB is in non-stop mode, the main thread is running while a second thread is stopped. The user has the second thread selected as the current thread and asks GDB to detach. At the exact moment of detach the main thread exits. This situation currently causes crashes, assertion failures, and unexpected errors to be reported from GDB for both native and remote targets. This commit addresses this situation for native and remote targets. There are a number of different fixes, but all are required in order to get this functionality working correct for native and remote targets. Native Linux Target =================== For the native Linux target, detaching is handled in the function linux_nat_target::detach. In here we call stop_wait_callback for each thread, and it is this callback that will spot that the main thread has exited. GDB then detaches from everything except the main thread by calling detach_callback. After this the first problem is this assert: /* Only the initial process should be left right now. */ gdb_assert (num_lwps (pid) == 1); The num_lwps call will return 0 as the main thread has exited and all of the other threads have now been detached. I fix this by changing the assert to allow for 0 or 1 lwps at this point. As the 0 case can only happen in non-stop mode, the assert becomes: gdb_assert (num_lwps (pid) == 1 || (target_is_non_stop_p () && num_lwps (pid) == 0)); The next problem is that we do: main_lwp = find_lwp_pid (ptid_t (pid)); and then proceed assuming that main_lwp is not nullptr. In the case that the main thread has exited though, main_lwp will be nullptr. However, we only need main_lwp so that GDB can detach from the thread. If the main thread has exited, and GDB has already detached from every other thread, then GDB has finished detaching, GDB can skip the calls that try to detach from the main thread, and then tell the user that the detach was a success. For Remote Targets ================== On remote targets there are two problems. First is that when the exit occurs during the early phase of the detach, we see the stop notification arrive while GDB is removing the breakpoints ahead of the detach. The 'set debug remote on' trace looks like this: [remote] Sending packet: $z0,7f1648fe0241,1#35 [remote] Notification received: Stop:W0;process:2a0ac8 # At this point an unpatched gdbserver segfaults, and the connection # is broken. A patched gdbserver continues as below... [remote] Packet received: E01 [remote] Sending packet: $z0,7f1648ff00a8,1#68 [remote] Packet received: E01 [remote] Sending packet: $z0,7f1648ff132f,1#6b [remote] Packet received: E01 [remote] Sending packet: $D;2a0ac8#3e [remote] Packet received: E01 I was originally running into Segmentation Faults, from within gdbserver/mem-break.cc, in the function find_gdb_breakpoint. This function calls current_process() and then dereferences the result to find the breakpoint list. However, in our case, the current process has already exited, and so the current_process() call returns nullptr. At the point of failure, the gdbserver backtrace looks like this: #0 0x00000000004190e4 in find_gdb_breakpoint (z_type=48 '0', addr=4198762, kind=1) at ../../src/gdbserver/mem-break.cc:982 #1 0x000000000041930d in delete_gdb_breakpoint (z_type=48 '0', addr=4198762, kind=1) at ../../src/gdbserver/mem-break.cc:1093 #2 0x000000000042d8db in process_serial_event () at ../../src/gdbserver/server.cc:4372 #3 0x000000000042dcab in handle_serial_event (err=0, client_data=0x0) at ../../src/gdbserver/server.cc:4498 ... The problem is that, as a result non-stop being on, the process exiting is only reported back to GDB after the request to remove a breakpoint has been sent. Clearly gdbserver can't actually remove this breakpoint -- the process has already exited -- so I think the best solution is for gdbserver just to report an error, which is what I've done. The second problem I ran into was on the gdb side, as the process has already exited, but GDB has not yet acknowledged the exit event, the detach -- the 'D' packet in the above trace -- fails. This was being reported to the user with a 'Can't detach process' error. As the test actually calls detach from Python code, this error was then becoming a Python exception. Though clearly the detach has returned an error, and so, maybe, having GDB throw an error would be fine, I think in this case, there's a good argument that the remote error can be ignored -- if GDB tries to detach and gets back an error, and if there's a pending exit event for the pid we tried to detach, then just ignore the error and pretend the detach worked fine. We could possibly check for a pending exit event before sending the detach packet, however, I believe that it might be possible (in non-stop mode) for the stop notification to arrive after the detach is sent, but before gdbserver has started processing the detach. In this case we would still need to check for pending stop events after seeing the detach fail, so I figure there's no point having two checks -- we just send the detach request, and if it fails, check to see if the process has already exited. Testing ======= In order to test this issue I needed to ensure that the exit event arrives at the same time as the detach call. The window of opportunity for getting the exit to arrive is so small I've never managed to trigger this in real use -- I originally spotted this issue while working on another patch, which did manage to trigger this issue. However, if we trigger both the exit and the detach from a single Python function then we never return to GDB's event loop, as such GDB never processes the exit event, and so the first time GDB gets a chance to see the exit is during the detach call. And so that is the approach I've taken for testing this patch. Tested-By: Kevin Buettner <kevinb@redhat.com> Approved-By: Kevin Buettner <kevinb@redhat.com>
2023-10-06process-dies-while-detaching.exp: Exit early if GDB misses sync breakpointThiago Jung Bauermann1-5/+5
I'm seeing a lot of variability in the failures of gdb.threads/process-dies-while-detaching.exp on aarch64-linux. On this platform, a problem yet to be investigated causes GDB to miss the _exit breakpoint. What happens next is random because after missing that breakpoint, GDB is out of sync with the inferior. This causes the tests following that point in the testcase to fail in a random way. In this scenario it's better to exit the testcase early to avoid random results in the testsuite. We are relying on gdb_continue_to_breakpoint to return the result of gdb_test_multiple. This is already the case because in Tcl the return value of a function is the return value of the last command it runs. But change gdb_continue_to_breakpoint to explicitly return this value, to make it clear this is the intended behaviour. Tested on aarch64-linux. Tested-By: Guinevere Larsen <blarsen@redhat.com> Approved-By: Andrew Burgess <aburgess@redhat.com>
2023-09-28gdb/testsuite: Add relative versus absolute LD_LIBRARY_PATH testKevin Buettner3-0/+145
At one time, circa 2006, there was a bug, which was presumably fixed without adding a test case: If you provided some relative path to the shared library, such as with export LD_LIBRARY_PATH=. then gdb would fail to match the shared library name during the TLS lookup. I think there may have been a bit more to it than is provided by that explanation, since the test also takes care to split the debug info into a separate file. In any case, this commit is based on one of Red Hat's really old local patches. I've attempted to update it and remove a fair amount of cruft, hopefully without losing any critical elements from the test. Testing on Fedora 38 (correctly) shows 1 unsupported test for native-gdbserver and 5 PASSes for the native target as well as native-extended-gdbserver. In his review of v1 of this patch, Lancelot SIX observed that 'thread_local' could be used in place of '__thread' in the C source files. But it only became available via the standard in C11, so I used additional_flags=-std=c11 for compiling both the shared object and the main program. Also, while testing with CC_FOR_TARGET=clang, I found that 'additional_flags=-Wl,-soname=${binsharedbase}' caused clang to complain that this linker flag was unused when compiling the source file, so I moved this linker option to 'ldflags='. My testing for this v2 patch shows the same results as with v1, but I've done additional testing with CC_FOR_TARGET=clang as well. The results are the same as when gcc is used. Co-Authored-by: Jan Kratochvil <jan@jankratochvil.net> Reviewed-By: Lancelot Six <lancelot.six@amd.com>
2023-09-27Adjust gdb.thread/pthreads.exp for CygwinPedro Alves1-7/+7
The Cygwin runtime spawns a few extra threads, so using hardcoded thread numbers in tests rarely works correctly. Thankfully, this testcase already records the ids of the important threads in globals. It just so happens that they are not used in a few tests. This commit fixes that. With this, the test passes cleanly on Cygwin [1]. Still passes cleanly on x86-64 GNU/Linux. [1] - with system GDB. Upstream GDB is missing a couple patches Cygwin carries downstream. Approved-By: Tom Tromey <tom@tromey.com> Change-Id: I01bf71fcb44ceddea8bd16b933b10b964749a6af
2023-09-27In gdb.threads/pthreads.c, handle pthread_attr_setscope ENOTSUPPedro Alves1-1/+2
On Cygwin, I see: (gdb) PASS: gdb.threads/pthreads.exp: break thread1 continue Continuing. pthread_attr_setscope 1: Not supported (134) [Thread 3732.0x265c exited with code 1] [Thread 3732.0x2834 exited with code 1] [Thread 3732.0x2690 exited with code 1] Program terminated with signal SIGHUP, Hangup. The program no longer exists. (gdb) FAIL: gdb.threads/pthreads.exp: Continue to creation of first thread ... and then a set of cascading failures. Fix this by treating ENOTSUP the same way as if PTHREAD_SCOPE_SYSTEM were not defined. I.e., ignore ENOTSUP errors, and proceed with testing. Approved-By: Tom Tromey <tom@tromey.com> Change-Id: Iea68ff8b9937570726154f36610c48ef96101871
2023-09-27Fix gdb.threads/pthreads.exp error handling/printingPedro Alves1-8/+22
On Cygwin, I noticed: (gdb) PASS: gdb.threads/pthreads.exp: break thread1 continue Continuing. pthread_attr_setscope 1: No error [Thread 8732.0x28f8 exited with code 1] [Thread 8732.0xb50 exited with code 1] [Thread 8732.0x17f8 exited with code 1] Program terminated with signal SIGHUP, Hangup. The program no longer exists. (gdb) FAIL: gdb.threads/pthreads.exp: Continue to creation of first thread Note "No error" in "pthread_attr_setscope 1: No error". That is a bug in the test. It is using perror, but that prints errno, while the pthread functions return the error directly. Fix all cases of the same problem, by adding a new print_error function and using it. We now get: ... pthread_attr_setscope 1: Not supported (134) ... Approved-By: Tom Tromey <tom@tromey.com> Change-Id: I972ebc931b157bc0f9084e6ecd8916a5e39238f5
2023-09-27Fix gdb.threads/pthreads.c formattingPedro Alves1-18/+27
Just some GNU formatting fixes throughout. Approved-By: Tom Tromey <tom@tromey.com> Change-Id: Ie851e3815b839e57898263896db0ba8ddfefe09e
2023-09-27gdb.threads/pthreads.c, K&R -> ANSI function stylePedro Alves1-7/+3
gdb.threads/pthreads.c is declaring functions with old K&R style. This commit converts them to ANSI style. Approved-By: Tom Tromey <tom@tromey.com> Change-Id: I1ce007c67bb4ab1e49248c014c7881e46634f8f8
2023-09-26gdb/testsuite: add xfail for gdb/29965 in ↵Simon Marchi1-1/+18
gdb.threads/process-exit-status-is-leader-exit-status.exp Bug 29965 shows on a Linux kernel >= 6.1, that test fails consistently with: FAIL: gdb.threads/process-exit-status-is-leader-exit-status.exp: iteration=0: continue (the program exited) ... FAIL: gdb.threads/process-exit-status-is-leader-exit-status.exp: iteration=9: continue (the program exited) This is due to a change in Linux kernel behavior [1] that affects exactly what this test tests. That is, if multiple threads (including the leader) call SYS_exit, the exit status of the process should be the exit status of the leader. After that change in the kernel, it is no longer the case. Add an xfail in the test, based on the Linux kernel version. The goal is that if a regression is introduced in GDB regarding this feature, it should be caught if running on an older kernel where the behavior was consistent. [1] https://bugzilla.suse.com/show_bug.cgi?id=1206926 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29965 Change-Id: If6ab7171c92bfc1a3b961c7179e26611773969eb Approved-By: Tom de Vries <tdevries@suse.de>
2023-08-23gdb: centralize "[Thread ...exited]" notificationsPedro Alves1-11/+1
Currently, each target backend is responsible for printing "[Thread ...exited]" before deleting a thread. This leads to unnecessary differences between targets, like e.g. with the remote target, we never print such messages, even though we do print "[New Thread ...]". E.g., debugging the gdb.threads/attach-many-short-lived-threads.exp with gdbserver, letting it run for a bit, and then pressing Ctrl-C, we currently see: (gdb) c Continuing. ^C[New Thread 3850398.3887449] [New Thread 3850398.3887500] [New Thread 3850398.3887551] [New Thread 3850398.3887602] [New Thread 3850398.3887653] ... Thread 1 "attach-many-sho" received signal SIGINT, Interrupt. 0x00007ffff7e6a23f in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fffffffda80, rem=rem@entry=0x7fffffffda80) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78 78 in ../sysdeps/unix/sysv/linux/clock_nanosleep.c (gdb) Above, we only see "New Thread" notifications, even though threads were deleted. After this patch, we'll see: (gdb) c Continuing. ^C[Thread 3558643.3577053 exited] [Thread 3558643.3577104 exited] [Thread 3558643.3577155 exited] [Thread 3558643.3579603 exited] ... [New Thread 3558643.3597415] [New Thread 3558643.3600015] [New Thread 3558643.3599965] ... Thread 1 "attach-many-sho" received signal SIGINT, Interrupt. 0x00007ffff7e6a23f in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fffffffda80, rem=rem@entry=0x7fffffffda80) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78 78 in ../sysdeps/unix/sysv/linux/clock_nanosleep.c (gdb) q This commit fixes this by moving the thread exit printing to common code instead, triggered from within delete_thread (or rather, set_thread_exited). There's one wrinkle, though. While most targest want to print: [Thread ... exited] the Windows target wants to print: [Thread ... exited with code <exit_code>] ... and sometimes wants to suppress the notification for the main thread. To address that, this commits adds a delete_thread_with_code function, only used by that target (so far). This fix was originally posted as part of a larger series: https://inbox.sourceware.org/gdb-patches/20221212203101.1034916-1-pedro@palves.net/ But didn't really need to be part of that series. In order to get this fix merged sooner, I (Andrew Burgess) have rebased this commit outside of the original series. Any bugs introduced while splitting this patch out and rebasing, are entirely my own. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30129 Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
2023-08-11gdbserver: Reinstall software single-step breakpoints in ↵Kevin Buettner2-0/+213
resume_stopped_resumed_lwps At the moment, while performing a software single-step, gdbserver fails to reinsert software single-step breakpoints for a LWP when interrupted by a signal in another thread. This commit fixes this problem by reinstalling software single-step breakpoints in linux_process_target::resume_stopped_resumed_lwps in gdbserver/linux-low.cc. This bug was discovered due to a failing assert in maybe_hw_step() in gdbserver/linux-low.cc. Looking at the backtrace revealed that the caller was linux_process_target::resume_stopped_resumed_lwps. I was uncertain whether the assert should still be valid when called from that method, so I tried hoisting the assert from maybe_hw_step to all callers except resume_stopped_resumed_lwps. But running the new test case, described below, showed that merely eliminating the assert for this case was NOT a good fix - a study of the log file for the test showed that the single-step operation failed to occur. Instead GDB (via gdbserver) stopped at the next breakpoint that was hit. Zhiyong Yan had proposed a fix which resinserted software single-step breakpoints, albeit at a different location in linux-low.cc. Testing revealed that, while running gdb.threads/pending-fork-event-detach, the executable associated with that test would die due to a SIGTRAP after the test program was detached. Examination of the core file(s) showed that a breakpoint instruction had been left in program memory. Test results were otherwise very good, so Zhiyong was definitely on the right track! This commit causes software single-step breakpoint(s) to be inserted before the call to maybe_hw_step in resume_stopped_resumed_lwps. This will cause 'has_single_step_breakpoints (thread)' to be true, so that the assert in maybe_hw_step... /* GDBserver must insert single-step breakpoint for software single step. */ gdb_assert (has_single_step_breakpoints (thread)); ...will no longer fail. And better still, the single-step breakpoints are reinstalled, so that stepping will actually work, even when interrupted. The C code for the test case was loosely adapted from the reproducer provided in Zhiyong's bug report for this problem. The .exp file was copied from next-fork-other-thread.exp and then tweaked slightly. As noted in a comment in next-fork-exec-other-thread.exp, I had to remove "on" from the loop for non-stop as it was failing on all architectures (including x86-64) that I tested. I have a feeling that it ought to work, but this can be investigated separately and (re)enabled once it works. I also increased the number of iterations for the loop running the "next" commands. I've had some test runs which don't show the bug until the loop counter exceeded 100 iterations. The C file for the new test uses shorter delays than next-fork-other-thread.c though, so it doesn't take overly long (IMO) to run this new test. Running the new test on a Raspberry Pi w/ a 32-bit (Arm) kernel and userland using a gdbserver build without the fix in this commit shows the following results: FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=auto: i=12: next to other line FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=on: i=9: next to other line FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=off: i=18: next to other line FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=auto: i=3: next to other line FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=on: i=11: next to other line FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=off: i=1: next to other line FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=auto: non-stop=off: displaced-stepping=auto: i=1: next to break here FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=auto: non-stop=off: displaced-stepping=on: i=3: next to break here FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=auto: non-stop=off: displaced-stepping=off: i=1: next to break here FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=auto: i=47: next to other line FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=on: i=57: next to other line FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=off: non-stop=off: displaced-stepping=auto: i=1: next to break here FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=off: non-stop=off: displaced-stepping=on: i=10: next to break here FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=off: non-stop=off: displaced-stepping=off: i=1: next to break here === gdb Summary === # of unexpected core files 12 # of expected passes 3011 # of unexpected failures 14 Each of the 12 core files were caused by the failed assertion in maybe_hw_step in linux-low.c. These correspond to 12 of the unexpected failures. When the tests are run using a gdbserver build which includes the fix in this commit, the results are significantly better, but not perfect: FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=auto: i=143: next to other line FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=on: i=25: next to other line === gdb Summary === # of expected passes 10178 # of unexpected failures 2 I think that the two remaining failures are due to some different problem. They are also racy - I've seen runs with no failures or only one failure, but never more than two. Also, those runs were conducted with the loop count in next-fork-exec-other-thread.exp set to 200. During his testing of this fix and the new test case, Luis Machado found that this test was taking a long time and asked about ways to speed it up. I then conducted additional tests in which I gradually reduced the loop count, timing each one, also noting the number of failures. With the loop count set to 30, I found that I could still reliably reproduce the failures that Zhiyong reported (in which, with the proper settings, core files are created). But, with the loop count set to 30, the other failures noted above were much less likely to show up. Anyone wishing to investigate those other failures should set the loop count back up to 200. Running the new test on x86-64 and aarch64, both native and native-gdbserver shows no failures. Also, I see no regressions when running the entire test suite for armv7l-unknown-linux-gnueabihf (i.e. the Raspberry Pi w/ 32-bit kernel+userland) with --target_board=native-gdbserver. Additionally, using --target_board=native-gdbserver, I also see no regressions for the entire test suite for x86-64 and aarch64 running Fedora 38. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30387 Co-Authored-By: Zhiyong Yan <zhiyong.yan@windriver.com> Tested-By: Zhiyong Yan <zhiyong.yan@windriver.com> Tested-By: Luis Machado <luis.machado@arm.com>
2023-06-07[gdb/testsuite] Handle output after prompt in ↵Tom de Vries1-1/+1
gdb.threads/step-N-all-progress.exp Using "taskset -c 0" I run into this timeout: ... (gdb) PASS: gdb.threads/step-N-all-progress.exp: non-stop=on: \ target-non-stop=on: continue to breakpoint: break here next 3^M [New Thread 0x7ffff7dbd6c0 (LWP 10202)]^M 50 return 0;^M (gdb) [Thread 0x7ffff7dbd6c0 (LWP 10202) exited]^M FAIL: gdb.threads/step-N-all-progress.exp: non-stop=on: target-non-stop=on: \ next 3 (timeout) ... The problem is that this test: ... gdb_test "next 3" "return 0;" ... expects no output after the prompt. Fix this by using -no-prompt-anchor. Tested on x86_64-linux.
2023-06-03[gdb] Fix typosTom de Vries1-1/+1
Fix a few typos: - implemention -> implementation - convertion(s) -> conversion(s) - backlashes -> backslashes - signoring -> ignoring - (un)ambigious -> (un)ambiguous - occured -> occurred - hidding -> hiding - temporarilly -> temporarily - immediatelly -> immediately - sillyness -> silliness - similiar -> similar - porkuser -> pokeuser - thats -> that - alway -> always - supercede -> supersede - accomodate -> accommodate - aquire -> acquire - priveleged -> privileged - priviliged -> privileged - priviledges -> privileges - privilige -> privilege - recieve -> receive - (p)refered -> (p)referred - succesfully -> successfully - successfuly -> successfully - responsability -> responsibility - wether -> whether - wich -> which - disasbleable -> disableable - descriminant -> discriminant - construcstor -> constructor - underlaying -> underlying - underyling -> underlying - structureal -> structural - appearences -> appearances - terciarily -> tertiarily - resgisters -> registers - reacheable -> reachable - likelyhood -> likelihood - intepreter -> interpreter - disassemly -> disassembly - covnersion -> conversion - conviently -> conveniently - atttribute -> attribute - struction -> struct - resonable -> reasonable - popupated -> populated - namespaxe -> namespace - intialize -> initialize - identifer(s) -> identifier(s) - expection -> exception - exectuted -> executed - dungerous -> dangerous - dissapear -> disappear - completly -> completely - (inter)changable -> (inter)changeable - beakpoint -> breakpoint - automativ -> automatic - alocating -> allocating - agressive -> aggressive - writting -> writing - reguires -> requires - registed -> registered - recuding -> reducing - opeartor -> operator - ommitted -> omitted - modifing -> modifying - intances -> instances - imbedded -> embedded - gdbaarch -> gdbarch - exection -> execution - direcive -> directive - demanged -> demangled - decidely -> decidedly - argments -> arguments - agrument -> argument - amespace -> namespace - targtet -> target - supress(ed) -> suppress(ed) - startum -> stratum - squence -> sequence - prompty -> prompt - overlow -> overflow - memember -> member - languge -> language - geneate -> generate - funcion -> function - exising -> existing - dinking -> syncing - destroh -> destroy - clenaed -> cleaned - changep -> changedp (name of variable) - arround -> around - aproach -> approach - whould -> would - symobl -> symbol - recuse -> recurse - outter -> outer - freeds -> frees - contex -> context Tested on x86_64-linux. Reviewed-By: Tom Tromey <tom@tromey.com>
2023-04-27gdb/testsuite: change newline patterns used in gdb_testAndrew Burgess2-3/+3
This commit makes two changes to how we match newline characters in the gdb_test proc. First, for the newline pattern between the command output and the prompt, I propose changing from '[\r\n]+' to an explicit '\r\n'. The old pattern would spot multiple newlines, and so there are a few places where, as part of this commit, I've needed to add an extra trailing '\r\n' to the pattern in the main test file, where GDB's output actually includes a blank line. But I think this is a good thing. If a command produces a blank line then we should be checking for it, the current gdb_test doesn't do that. But also, with the current gdb_test, if a blank line suddenly appears in the output, this is going to be silently ignored, and I think this is wrong, the test should fail in that case. Additionally, the existing pattern will happily match a partial newline. There are a strangely large number of tests that end with a random '.' character. Not matching a literal period, but matching any single character, this is then matching half of the trailing newline sequence, while the \[\r\n\]+ in gdb_test is matching the other half of the sequence. I can think of no reason why this would be intentional, I suspect that the expected output at one time included a period, which has since been remove, but I haven't bothered to check on this. In this commit I've removed all these unneeded trailing '.' characters. The basic rule of gdb_test after this is that the expected pattern needs to match everything up to, but not including the newline sequence immediately before the GDB prompt. This is generally how the proc is used anyway, so in almost all cases, this commit represents no significant change. Second, while I was cleaning up newline matching in gdb_test, I've also removed the '[\r\n]*' that was added to the start of the pattern passed to gdb_test_multiple. The addition of this pattern adds no value. If the user pattern matches at the start of a line then this would match against the newline sequence. But, due to the '*', if the user pattern doesn't match at the start of a line then this group doesn't care, it'll happily match nothing. As such, there's no value to it, it just adds more complexity for no gain, so I'm removing it. No tests will need updating as a consequence of this part of the patch. Reviewed-By: Tom Tromey <tom@tromey.com>
2023-03-18[gdb/testsuite] Handle my-syscalls.h for remote hostTom de Vries2-1/+3
Handle $srcdir/lib/my-syscalls.h using lappend_include_dir. Tested on x86_64-linux.
2023-03-13[gdb/testsuite] Fix gdb.threads/step-bg-decr-pc-switch-thread.exp for ↵Tom de Vries1-2/+13
native-gdbserver With test-case gdb.threads/step-bg-decr-pc-switch-thread.exp and target board native-gdbserver, I run into: ... (gdb) UNSUPPORTED: gdb.threads/step-bg-decr-pc-switch-thread.exp: \ switch to main thread Remote debugging from host ::1, port 43914^M monitor exit^M Cannot execute this command while the target is running.^M Use the "interrupt" command to stop the target^M and then try again.^M (gdb) WARNING: Timed out waiting for EOF in server after monitor exit ... Fix this by following the advice and issuing an interrupt command, allowing the following monitor exit command to succeed. Tested on x86_64-linux.
2023-03-10More uses of require with istargetTom Tromey1-1/+2
I found a few more spots that check istarget that can be switched to use 'require'.
2023-03-10Use require with target_infoTom Tromey17-78/+28
This changes many tests to use 'require' when checking target_info. In a few spots, the require is hoisted to the top of the file, to avoid doing any extra work when the test is going to be skipped anyway.
2023-03-09[gdb/testsuite] Fix gdb.threads/pending-fork-event-detach.exp for remote targetTom de Vries1-1/+14
Fix test-case gdb.threads/pending-fork-event-detach.exp for target board remote-gdbserver-on-localhost using gdb_remote_download for $touch_file_bin. Then, fix the test-case for target board remote-stdio-gdbserver with REMOTE_TMPDIR=~/tmp.remote-stdio-gdbserver by creating $touch_file_path on target using remote_download, and using the resulting path. Tested on x86_64-linux.
2023-03-09[gdb/testsuite] Fix gdb.threads/multiple-successive-infcall.exp on ↵Tom de Vries1-0/+3
native-gdbserver With test-case gdb.threads/multiple-successive-infcall.exp and target board native-gdbserver I run into: ... (gdb) continue^M Continuing.^M [New Thread 758.759]^M ^M Thread 1 "multiple-succes" hit Breakpoint 2, main () at \ multiple-successive-infcall.c:97^M 97 thread_ids[tid] = tid + 2; /* prethreadcreationmarker */^M (gdb) FAIL: gdb.threads/multiple-successive-infcall.exp: thread=5: \ created new thread ... The problem is that the new thread message doesn't match the regexp, which expects something like this instead: ... [New Thread 0x7ffff746e700 (LWP 570)]^M ... Fix this by accepting this form of new thread message. Tested on x86_64-linux.
2023-03-09[gdb/testsuite] Fix gdb.threads/thread-specific-bp.exp on native-gdbserverTom de Vries1-1/+7
With test-case gdb.threads/thread-specific-bp.exp and target board native-gdbserver I run into: ... (gdb) PASS: gdb.threads/thread-specific-bp.exp: non_stop=off: thread 1 selected continue^M Continuing.^M Thread-specific breakpoint 3 deleted - thread 2 no longer in the thread list.^M ^M Thread 1 "thread-specific" hit Breakpoint 4, end () at \ thread-specific-bp.c:29^M 29 }^M (gdb) FAIL: gdb.threads/thread-specific-bp.exp: non_stop=off: \ continue to end (timeout) ... The problem is that the test-case tries to match the "[Thread ... exited]" message which we do see with native testing: ... Continuing.^M [Thread 0x7ffff746e700 (LWP 7047) exited]^M Thread-specific breakpoint 3 deleted - thread 2 no longer in the thread list.^M ... The fact that the message is missing was reported as PR remote/30129. We could add a KFAIL for this, but the functionality the test-case is trying to test has nothing to do with the message, so it should pass. I only added matching of the message in commit 2e5843d87c4 ("[gdb/testsuite] Fix gdb.threads/thread-specific-bp.exp") to handle a race, not realizing doing so broke testing on native-gdbserver. Fix this by matching the "Thread-specific breakpoint $decimal deleted" message instead. Tested on x86_64-linux.
2023-03-07[gdb/testsuite] Fix gdb.threads/execl.exp for remote targetTom de Vries1-0/+3
Fix test-case gdb.threads/execl.exp on target board remote-gdbserver-on-localhost using gdb_remote_download. Tested on x86_64-linux.
2023-03-06gdb.threads/next-bp-other-thread.c: Ensure child thread is started.John Baldwin1-0/+9
Use a pthread_barrier to ensure the child thread is started before the main thread gets to the first breakpoint.
2023-03-06gdb.threads/execl.c: Ensure all threads are started before execl.John Baldwin1-0/+8
Use a pthread_barrier to ensure all threads are started before proceeding to the breakpoint where info threads output is checked.
2023-03-06gdb.threads/multi-create: Double the existing stack size.John Baldwin1-2/+6
Setting the stack size to 2*PTHREAD_STACK_MIN actually lowered the stack on FreeBSD rather than raising it causing non-main threads in the test program to overflow their stack and crash. Double the existing stack size rather than assuming that the initial stack size is PTHREAD_STACK_MIN.
2023-03-03gdb/testsuite: use `kill -FOO` instead of `kill -SIGFOO`Simon Marchi1-1/+1
When running gdb.base/bg-exec-sigint-bp-cond.exp when SHELL is dash, rather than bash, I get: c&^M Continuing.^M (gdb) sh: 1: kill: Illegal option -S^M ^M Breakpoint 2, foo () at /home/jenkins/smarchi/binutils-gdb/build/gdb/testsuite/../../../gdb/testsuite/gdb.base/bg-exec-sigint-bp-cond.c:23^M 23 return 0;^M FAIL: gdb.base/bg-exec-sigint-bp-cond.exp: no force memory write: SIGINT does not interrupt background execution (timeout) This is because it uses the kill command built-in the dash shell, and using the SIG prefix with kill does not work with dash's kill. The difference is listed in the documentation for bash's POSIX-correct mode [1]: The kill builtin does not accept signal names with a ‘SIG’ prefix. Replace SIGINT with INT in that test. By grepping, I found two other instances (gdb.base/sigwinch-notty.exp and gdb.threads/detach-step-over.exp). Those were not problematic on my system though. Since they are done through remote_exec, they don't go through the shell and therefore invoke /bin/kill. On my Arch Linux, it's: $ /bin/kill --version kill from util-linux 2.38.1 (with: sigqueue, pidfd) and on my Ubuntu: $ /bin/kill --version kill from procps-ng 3.3.17 These two implementations accept "-SIGINT". But according to the POSIX spec [2], the kill utility should recognize the signal name without the SIG prefix (if it recognizes them with the SIG prefix, it's an extension): -s signal_name Specify the signal to send, using one of the symbolic names defined in the <signal.h> header. Values of signal_name shall be recognized in a case-independent fashion, without the SIG prefix. In addition, the symbolic name 0 shall be recognized, representing the signal value zero. The corresponding signal shall be sent instead of SIGTERM. -signal_name [XSI] [Option Start] Equivalent to -s signal_name. [Option End] So, just in case some /bin/kill implementation happens to not recognize the SIG prefixes, change these two other calls to remove the SIG prefix. [1] https://www.gnu.org/software/bash/manual/html_node/Bash-POSIX-Mode.html [2] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/kill.html Change-Id: I81ccedd6c9428ab63b9261813f1905a18941f8da Reviewed-By: Tom Tromey <tom@tromey.com>
2023-03-01gdb: update some copyright years (2022 -> 2023)Simon Marchi2-2/+2
The copyright years in the ROCm files (e.g. solib-rocm.c) are wrong, they end in 2022 instead of 2023. I suppose because I posted (or at least prepared) the patches in 2022 but merged them in 2023, and forgot to update the year. I found a bunch of other files that are in the same situation. Fix them all up. Change-Id: Ia55f5b563606c2ba6a89046f22bc0bf1c0ff2e10 Reviewed-By: Tom Tromey <tom@tromey.com>
2023-02-28gdb: fix mi breakpoint-deleted notifications for thread-specific b/pAndrew Burgess2-0/+298
Background ---------- When a thread-specific breakpoint is deleted as a result of the specific thread exiting the function remove_threaded_breakpoints is called which sets the disposition of the breakpoint to disp_del_at_next_stop and sets the breakpoint number to 0. Setting the breakpoint number to zero has the effect of hiding the breakpoint from the user. We also print a message indicating that the breakpoint has been deleted. It was brought to my attention during a review of another patch[1] that setting a breakpoints number to zero will suppress the MI breakpoint-deleted notification for that breakpoint, and indeed, this can be seen to be true, in delete_breakpoint, if the breakpoint number is zero, then GDB will not notify the breakpoint_deleted observer. It seems wrong that a user created, thread-specific breakpoint, will have a =breakpoint-created notification, but will not have a =breakpoint-deleted notification. I suspect that this is a bug. [1] https://sourceware.org/pipermail/gdb-patches/2023-February/196560.html The First Problem ----------------- During my initial testing I wanted to see how GDB handled the breakpoint after it's number was set to zero. To do this I created the testcase gdb.threads/thread-bp-deleted.exp. This test creates a worker thread, which immediately exits. After the worker thread has exited the main thread spins in a loop. In GDB I break once the worker thread has been created and place a thread-specific breakpoint, then use 'continue&' to resume the inferior in non-stop mode. The worker thread then exits, but the main thread never stops - instead it sits in the spin. I then tried to use 'maint info breakpoints' to see what GDB thought of the thread-specific breakpoint. Unfortunately, GDB crashed like this: (gdb) continue& Continuing. (gdb) [Thread 0x7ffff7c5d700 (LWP 1202458) exited] Thread-specific breakpoint 3 deleted - thread 2 no longer in the thread list. maint info breakpoints ... snip some output ... Fatal signal: Segmentation fault ----- Backtrace ----- 0x5ffb62 gdb_internal_backtrace_1 ../../src/gdb/bt-utils.c:122 0x5ffc05 _Z22gdb_internal_backtracev ../../src/gdb/bt-utils.c:168 0x89965e handle_fatal_signal ../../src/gdb/event-top.c:964 0x8997ca handle_sigsegv ../../src/gdb/event-top.c:1037 0x7f96f5971b1f ??? /usr/src/debug/glibc-2.30-2-gd74461fa34/nptl/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0 0xe602b0 _Z15print_thread_idP11thread_info ../../src/gdb/thread.c:1439 0x5b3d05 print_one_breakpoint_location ../../src/gdb/breakpoint.c:6542 0x5b462e print_one_breakpoint ../../src/gdb/breakpoint.c:6702 0x5b5354 breakpoint_1 ../../src/gdb/breakpoint.c:6924 0x5b58b8 maintenance_info_breakpoints ../../src/gdb/breakpoint.c:7009 ... etc ... As the thread-specific breakpoint is set to disp_del_at_next_stop, and GDB hasn't stopped yet, then the breakpoint still exists in the global breakpoint list. The breakpoint will not show in 'info breakpoints' as its number is zero, but it will show in 'maint info breakpoints'. As GDB prints the breakpoint, the thread-id for the breakpoint is printed as part of the 'stop only in thread ...' line. Printing the thread-id involves calling find_thread_global_id to convert the global thread-id into a thread_info*. Then calling print_thread_id to convert the thread_info* into a string. The problem is that find_thread_global_id returns nullptr as the thread for the thread-specific breakpoint has exited. The print_thread_id assumes it will be passed a non-nullptr. As a result GDB crashes. In this commit I've added an assert to print_thread_id (gdb/thread.c) to check that the pointed passed in is not nullptr. This assert would have triggered in the above case before GDB crashed. MI Notifications: The Dangers Of Changing A Breakpoint's Number --------------------------------------------------------------- Currently the delete_breakpoint function doesn't trigger the breakpoint_deleted observer for any breakpoint with the number zero. There is a comment explaining why this is the case in the code; it's something about watchpoints. But I did consider just removing the 'is the number zero' guard and always triggering the breakpoint_deleted observer, figuring that I'd then fix the watchpoint issue some other way. But I realised this wasn't going to be good enough. When the MI notification was delivered the number would be zero, so any frontend parsing the notifications would not be able to match =breakpoint-deleted notification to the earlier =breakpoint-created notification. What this means is that, at the point the breakpoint_deleted observer is called, the breakpoint's number must be correct. MI Notifications: The Dangers Of Delaying Deletion -------------------------------------------------- The test I used to expose the above crash also brought another problem to my attention. In the above test we used 'continue&' to resume, after which a thread exited, but the inferior didn't stop. Recreating the same test in the MI looks like this: -break-insert -p 2 main ^done,bkpt={number="2",type="breakpoint",disp="keep",...<snip>...} (gdb) -exec-continue ^running *running,thread-id="all" (gdb) ~"[Thread 0x7ffff7c5d700 (LWP 987038) exited]\n" =thread-exited,id="2",group-id="i1" ~"Thread-specific breakpoint 2 deleted - thread 2 no longer in the thread list.\n" At this point the we have a single thread left, which is still running: -thread-info ^done,threads=[{id="1",target-id="Thread 0x7ffff7c5eb80 (LWP 987035)",name="thread-bp-delet",state="running",core="4"}],current-thread-id="1" (gdb) Notice that we got the =thread-exited notification from GDB as soon as the thread exited. We also saw the CLI line from GDB, the line explaining that breakpoint 2 was deleted. But, as expected, we didn't see the =breakpoint-deleted notification. I say "as expected" because the number was set to zero. But, even if the number was not set to zero we still wouldn't see the notification. The MI notification is driven by the breakpoint_deleted observer, which is only called when we actually delete the breakpoint, which is only done the next time GDB stops. Now, maybe this is fine. The notification is delivered a little late. But remember, by setting the number to zero the breakpoint will be hidden from the user, for example, the breakpoint is removed from the MI's -break-info command output. This means that GDB is in a position where the breakpoint doesn't show up in the breakpoint table, but a =breakpoint-deleted notification has not yet been sent out. This doesn't seem right to me. What this means is that, when the thread exits, we should immediately be sending out the =breakpoint-deleted notification. We should not wait for GDB to next stop before sending the notification. The Solution ------------ My proposed solution is this; in remove_threaded_breakpoints, instead of setting the disposition to disp_del_at_next_stop and setting the number to zero, we now just call delete_breakpoint directly. The notification will now be sent out immediately; as soon as the thread exits. As the number has not changed when delete_breakpoint is called, the notification will have the correct number. And as the breakpoint is immediately removed from the breakpoint list, we no longer need to worry about 'maint info breakpoints' trying to print the thread-id for an exited thread. My only concern is that calling delete_breakpoint directly seems so obvious that I wonder why the original patch (that added remove_threaded_breakpoints) didn't take this approach. This code was added in commit 49fa26b0411d, but the commit message offers no clues to why this approach was taken, and the original email thread offers no insights either[2]. There are no test regressions after making this change, so I'm hopeful that this is going to be fine. [2] https://sourceware.org/pipermail/gdb-patches/2013-September/106493.html The Complication ---------------- Of course, it couldn't be that simple. The script gdb.python/py-finish-breakpoint.exp had some regressions during testing. The problem was with the FinishBreakpoint.out_of_scope callback implementation. This callback is supposed to trigger whenever the FinishBreakpoint goes out of scope; and this includes when the thread for the breakpoint exits. The problem I ran into is the Python FinishBreakpoint implementation. Specifically, after this change I was loosing some of the out_of_scope calls. The problem is that the out_of_scope call (of which I'm interested) is triggered from the inferior_exit observer. Before my change the observers were called in this order: thread_exit inferior_exit breakpoint_deleted The inferior_exit would trigger the out_of_scope call. After my change the breakpoint_deleted notification (for thread-specific breakpoints) occurs earlier, as soon as the thread-exits, so now the order is: thread_exit breakpoint_deleted inferior_exit Currently, after the breakpoint_deleted call the Python object associated with the breakpoint is released, so, when we get to the inferior_exit observer, there's no longer a Python object to call the out_of_scope method on. My solution is to follow the model for how bpfinishpy_pre_stop_hook and bpfinishpy_post_stop_hook are called, this is done from gdbpy_breakpoint_cond_says_stop in py-breakpoint.c. I've now added a new bpfinishpy_pre_delete_hook gdbpy_breakpoint_deleted in py-breakpoint.c, and from this new hook function I check and where needed call the out_of_scope method. With this fix in place I now see the gdb.python/py-finish-breakpoint.exp test fully passing again. Testing ------- Tested on x86-64/Linux with unix, native-gdbserver, and native-extended-gdbserver boards. New tests added to covers all the cases I've discussed above. Approved-By: Pedro Alves <pedro@palves.net>
2023-02-28gdb/testsuite: introduce is_target_non_stop helper procAndrew Burgess3-21/+8
I noticed that several tests included copy & pasted code to run the 'maint show target-non-stop' command, and then switch based on the result. In this commit I factor this code out into a helper proc in lib/gdb.exp, and update all the places I could find that used this pattern to make use of the helper proc. There should be no change in what is tested after this commit. Reviewed-By: Pedro Alves <pedro@palves.net>
2023-02-27all-stop "follow-fork parent" and selecting another threadPedro Alves2-0/+256
With: - catch a fork in thread 1 - select thread 2 - set follow-fork child - next ... follow_fork notices that thread 1 had last stopped for a fork which hasn't been followed yet, and because thread 1 is not the current thread, GDB aborts the execution command, presenting the stop in thread 1. That makes sense, as only the forking thread (thread 1) survives in the child, so better stop and let the user decide how to proceed. However, with: - catch a fork in thread 1 - select thread 2 - set follow-fork parent << note difference here - next ... GDB does the same: follow_fork notices that thread 1 had last stopped for a fork which hasn't been followed yet, and because thread 1 is not the current thread, GDB aborts the execution command, presenting the stop in thread 1. Aborting/stopping in this case doesn't make sense to me. As we're following the parent, thread 2 will still continue to exist in the parent. What the child does after we've followed the parent shouldn't matter -- it can go on running free, be detached, etc., depending on "set schedule-multiple", "set detach-on-fork", etc. That does not influence the execution command that the user issued for the parent thread. So this patch changes GDB in that direction -- in follow_fork, if following the parent, and we've switched threads meanwhile, switch back to the unfollowed thread, follow it (stay with the parent), and don't abort/stop. If we're following a fork (as opposed to vfork), then switch back again to the thread that the user was trying to resume. If following a vfork, however, stay with the vforking-thread selected, as we will need to see a vfork_done event first, before we can resume any other thread. As I was working on this, I managed to end up calling target_resume for a solo-thread resume (to collect the vfork_done event), with scope_ptid pointing at the vfork parent thread, and inferior_ptid pointing to the vfork child. For a solo-thread resume, the scope_ptid argument to target_resume must the same as inferior_ptid. The mistake was caught by the assertion in target_resume, like so: ... [infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [1722839.1722839.0] at 0x5555555553c3 [infrun] do_target_resume: resume_ptid=1722839.1722939.0, step=0, sig=GDB_SIGNAL_0 ../../src/gdb/target.c:2661: internal-error: target_resume: Assertion `inferior_ptid.matches (scope_ptid)' failed. ... but I think it doesn't hurt to catch such a mistake earlier, hence the change in internal_resume_ptid. Change-Id: I896705506a16d2488b1bfb4736315dd966f4e412
2023-02-20[gdb/testsuite] Fix gdb.threads/schedlock.exp for gcc 4.8.5Tom de Vries1-3/+2
Since commit 9af467b8240 ("[gdb/testsuite] Fix gdb.threads/schedlock.exp on fast cpu"), the test-case fails for gcc 4.8.5. The problem is that for gcc 4.8.5, the commit turned a two-line loop: ... (gdb) next 78 while (*myp > 0) (gdb) next 81 MAYBE_CALL_SOME_FUNCTION(); (*myp) ++; (gdb) next 78 while (*myp > 0) ... into a three-line loop: ... (gdb) next 83 MAYBE_CALL_SOME_FUNCTION(); (*myp) ++; (gdb) next 84 cnt++; (gdb) next 85 } (gdb) next 83 MAYBE_CALL_SOME_FUNCTION(); (*myp) ++; (gdb) ... and the test-case doesn't expect this. Fix this by reverting back to the original loop shape as much as possible by: - removing the cnt++ line - replacing "while (1)" with "while (one)", where one is a volatile variable set to 1. Tested on x86_64-linux, using compilers: - gcc 4.8.5, 7.5.0, 12.2.1 - clang 4.0.1, 13.0.1
2023-02-06[gdb/testsuite] Fix gdb.threads/schedlock.exp on fast cpuTom de Vries1-4/+7
Occasionally, I run into: ... (gdb) PASS: gdb.threads/schedlock.exp: schedlock=on: cmd=continue: \ set scheduler-locking on continue^M Continuing.^M PASS: gdb.threads/schedlock.exp: schedlock=on: cmd=continue: \ continue (with lock) [Thread 0x7ffff746e700 (LWP 1339) exited]^M No unwaited-for children left.^M (gdb) Quit^M (gdb) FAIL: gdb.threads/schedlock.exp: schedlock=on: cmd=continue: \ stop all threads (with lock) (timeout) ... What happens is that this loop which is supposed to run "just short of forever": ... /* Don't run forever. Run just short of it :) */ while (*myp > 0) { /* schedlock.exp: main loop. */ MAYBE_CALL_SOME_FUNCTION(); (*myp) ++; } ... finishes after 0x7fffffff iterations (when a signed wrap occurs), which on my system takes only about 1.5 seconds. Fix this by: - changing the pointed-at type of myp from signed to unsigned, which makes the wrap defined behaviour (and which also make the loop run twice as long, which is already enough to make it impossible for me to reproduce the FAIL. But let's try to solve this more structurally). - changing the pointed-at type of myp from int to long long, making the wrap unlikely. - making sure the loop runs forever, by setting the loop condition to 1. - making sure the loop still contains different lines (as far as debug info is concerned) by incrementing a volatile counter in the loop. - making sure the program doesn't run forever in case of trouble, by adding an "alarm (30)". Tested on x86_64-linux. PR testsuite/30074 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30074
2023-02-06gdb: error if 'thread' or 'task' keywords are overusedAndrew Burgess2-0/+7
When creating a breakpoint or watchpoint, the 'thread' and 'task' keywords can be used to create a thread or task specific breakpoint or watchpoint. Currently, a thread or task specific breakpoint can only apply for a single thread or task, if multiple threads or tasks are specified when creating the breakpoint (or watchpoint), then the last specified id will be used. The exception to the above is that when the 'thread' keyword is used during the creation of a watchpoint, GDB will give an error if 'thread' is given more than once. In this commit I propose making this behaviour consistent, if the 'thread' or 'task' keywords are used more than once when creating either a breakpoint or watchpoint, then GDB will give an error. I haven't updated the manual, we don't explicitly say that these keywords can be repeated, and (to me), given the keyword takes a single id, I don't think it makes much sense to repeat the keyword. As such, I see this more as adding a missing error to GDB, rather than making some big change. However, I have added an entry to the NEWS file as I guess it is possible that some people might hit this new error with an existing (I claim, badly written) GDB script. I've added some new tests to check for the new error. Just one test needed updating, gdb.linespec/keywords.exp, this test did use the 'thread' keyword twice, and expected the breakpoint to be created. Looking at what this test was for though, it was checking the use of '-force-condition', and I don't think that being able to repeat 'thread' was actually a critical part of this test. As such, I've updated this test to expect the error when 'thread' is repeated.
2023-02-04gdb/testsuite: don't try to set non-stop mode on a running targetAndrew Burgess1-71/+67
The test gdb.threads/thread-specific-bp.exp tries to set non-stop mode on a running target, something which the manual makes clear is not allowed. This commit restructures the test a little, we now set the non-stop mode as part of the GDBFLAGS, so the mode will be set before GDB connects to the target. As a consequence I'm able to move the with_test_prefix out of the check_thread_specific_breakpoint proc. The check_thread_specific_breakpoint proc is now called within a loop. After this commit the gdb.threads/thread-specific-bp.exp test still has some failures, this is because of an issue GDB currently has printing "Thread ... exited" messages. This problem should be addressed by this patch: https://sourceware.org/pipermail/gdb-patches/2022-December/194694.html when it is merged.
2023-01-30gdb: Make global feature array a per-remote target arrayChristina Schimpe1-3/+6
This patch applies the appropriate FIXME notes described in commit 5b6d1e4 "Multi-target support". "You'll notice that remote.c includes some FIXME notes. These refer to the fact that the global arrays that hold data for the remote packets supported are still globals. For example, if we connect to two different servers/stubs, then each might support different remote protocol features. They might even be different architectures, like e.g., one ARM baremetal stub, and a x86 gdbserver, to debug a host/controller scenario as a single program. That isn't going to work correctly today, because of said globals. I'm leaving fixing that for another pass, since it does not appear to be trivial, and I'd rather land the base work first. It's already useful to be able to debug multiple instances of the same server (e.g., a distributed cluster, where you have full control over the servers installed), so I think as is it's already reasonable incremental progress." Using this patch it is possible to configure per-remote targets' feature packets. Given the following setup for two gdbservers: ~~~~ gdbserver --multi :1234 gdbserver --disable-packet=vCont --multi :2345 ~~~~ Before this patch configuring of range-stepping was not possible for one of two connected remote targets with different support for the vCont packet. As one of the targets supports vCont, it should be possible to configure "set range-stepping". However, the output of GDB looks like: (gdb) target extended-remote :1234 Remote debugging using :1234 (gdb) add-inferior -no-connection [New inferior 2] Added inferior 2 (gdb) inferior 2 [Switching to inferior 2 [<null>] (<noexec>)] (gdb) target extended-remote :2345 Remote debugging using :2345 (gdb) set range-stepping on warning: Range stepping is not supported by the current target (gdb) inferior 1 [Switching to inferior 1 [<null>] (<noexec>)] (gdb) set range-stepping on warning: Range stepping is not supported by the current target ~~~~ Two warnings are shown. The warning for inferior 1 should not appear as it is connected to a target supporting the vCont package. ~~~~ (gdb) target extended-remote :1234 Remote debugging using :1234 (gdb) add-inferior -no-connection [New inferior 2] Added inferior 2 (gdb) inferior 2 [Switching to inferior 2 [<null>] (<noexec>)] (gdb) target extended-remote :2345 Remote debugging using :2345 (gdb) set range-stepping on warning: Range stepping is not supported by the current target (gdb) inferior 1 [Switching to inferior 1 [<null>] (<noexec>)] (gdb) set range-stepping on (gdb) ~~~~ Now only one warning is shown for inferior 2, which is connected to a target not supporting vCont. The per-remote target feature array is realized by a new class remote_features, which stores the per-remote target array and provides functions to determine supported features of the target. A remote_target object now has a new member of that class. Each time a new remote_target object is initialized, a new per-remote target array is constructed based on the global remote_protocol_packets array. The global array is initialized in the function _initialize_remote and can be configured using the command line. Before this patch the command line configuration affected current targets and future remote targets (due to the global feature array used by all remote targets). This behavior is different and the configuration applies as follows: - If a target is connected, the command line configuration affects the current connection. All other existing remote targets are not affected. - If not connected, the command line configuration affects future connections. The show command displays the current remote target's configuration. If no remote target is selected the default configuration for future connections is shown. If we have for instance the following setup with inferior 2 being selected: ~~~~ (gdb) info inferiors Num Description Connection Executable 1 <null> 1 (extended-remote :1234) * 2 <null> 2 (extended-remote :2345) ~~~~ Before this patch, if we run 'set remote multiprocess-feature-packet', the following configuration was set: The feature array of all remote targets (in this setup the two connected targets) and all future remote connections are affected. After this patch, it will be configured as follows: The feature array of target with port :2345 which is currently selected will be configured. All other existing remote targets are not affected. The show command 'show remote multiprocess-feature-packet' will display the configuration of target with port :2345. Due to this configuration change, it is required to adapt the test "gdb/testsuite/gdb.multi/multi-target-info-inferiors.exp" to configure the multiprocess-feature-packet before the connections are created. To inform the gdb user about the new behaviour of the 'show remote PACKET-NAME' commands and the new configuration impact for remote targets using the 'set remote PACKET-NAME' commands the commands' outputs are adapted. Due to this change it is required to adapt each test using the set/show remote 'PACKET-NAME' commands.
2023-01-26Use clean_restart in gdb.threadsTom Tromey4-23/+4
Change gdb.threads to use clean_restart more consistently.
2023-01-26Eliminate spurious returns from the test suiteTom Tromey9-18/+0
A number of tests end with "return". However, this is unnecessary. This patch removes all of these.
2023-01-25Use require with is_remoteTom Tromey3-9/+6
This changes some tests to use require with 'is_remote', rather than an explicit test. This adds uniformity helps clean up more spots where a test might exit early without any notification.
2023-01-25Add unsupported calls where neededTom Tromey1-0/+1
A number of tests will exit early without saying why. This patch adds "unsupported" at spots like this that I could readily find. There are definitely more of these; for example dw2-ranges.exp fails silently because a compilation fails. I didn't attempt to address these as that is a much larger task.
2023-01-13Rename to allow_hw_breakpoint_testsTom Tromey1-1/+1
This changes skip_hw_breakpoint_tests to invert the sense, and renames it to allow_hw_breakpoint_tests. This also converts some tests to use "require" -- I missed this particular check in the first series.