path: root/gdb/testsuite/gdb.threads
Age | Commit message | Author | Files | Lines
2022-06-04 | [gdb/testsuite] Fix gdb.threads/manythreads.exp with check-read1 | Tom de Vries | 1 | -15/+19
When running test-case gdb.threads/manythreads.exp with check-read1, I ran into this hard-to-reproduce FAIL:

    ...
    [New Thread 0x7ffff7318700 (LWP 31125)]^M
    [Thread 0x7ffff7321700 (LWP 31124) exited]^M
    [New T^C^M
    ^M
    Thread 769 "manythreads" received signal SIGINT, Interrupt.^M
    [Switching to Thread 0x7ffff6d66700 (LWP 31287)]^M
    0x00007ffff7586a81 in clone () from /lib64/libc.so.6^M
    (gdb) FAIL: gdb.threads/manythreads.exp: stop threads 1
    ...

The matching in the failing gdb_test_multiple is done in an intricate way, trying to pass on some order and fail on another order.

Fix this by rewriting the regexps to match one line at most, and detecting invalid order by setting and checking state variables.

Tested on x86_64-linux.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29177
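The state-variable approach can be sketched outside of Tcl. The Python model below is only an illustration and not the test's actual code: the function name `check_stop_output`, the simplified patterns, and the ordering rule (the SIGINT report must come before the "Switching to" notice and before the prompt) are all assumptions made for the example.

```python
import re

# Hypothetical, simplified patterns: each one matches at most a single line.
SIGINT_RE = re.compile(r'^Thread \d+ ".*" received signal SIGINT, Interrupt\.$')
SWITCH_RE = re.compile(r'^\[Switching to Thread .*\]$')
PROMPT_RE = re.compile(r'^\(gdb\)')


def check_stop_output(lines):
    """Scan the output one line at a time and return "PASS" or "FAIL".

    Ordering is tracked with a state variable instead of being encoded
    in one large multi-line regexp.
    """
    seen_sigint = False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if SIGINT_RE.match(line):
            seen_sigint = True
        elif SWITCH_RE.match(line):
            # Assumed rule for this sketch: the switch notice is only
            # valid once the SIGINT report has been seen.
            if not seen_sigint:
                return "FAIL"
        elif PROMPT_RE.match(line):
            return "PASS" if seen_sigint else "FAIL"
        # Any other line (thread creation/exit noise, a truncated
        # "[New T^C") is simply skipped.
    return "FAIL"  # the prompt never showed up
```

Because every pattern covers a single line, the result does not depend on how the output happens to be chunked when it is read, which is the property check-read1 exercises.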
2022-05-25 | gdb: Fix DUPLICATE and PATH regressions throughout | Pedro Alves | 3 | -9/+8
The previous patch to add -prompt/-lbl to gdb_test introduced a regression: Before, you could specify an explicit empty message to indicate you didn't want to PASS, like so:

    gdb_test COMMAND PATTERN ""

After said patch, gdb_test no longer distinguishes no-message-specified vs empty-message, so tests that previously would be silent on PASS, now started emitting PASS messages based on COMMAND. This in turn introduced a number of PATH/DUPLICATE violations in the testsuite.

This commit fixes all the regressions I could see.

This patch uses the new -nopass feature introduced in the previous commit, but tries to avoid it if possible. Most of the patch fixes DUPLICATE issues the usual way, of using with_test_prefix or explicit unique messages.

See previous commit's log for more info.

In addition to looking for DUPLICATEs, I also looked for cases where we would now end up with an empty message in gdb.sum, due to a gdb_test being passed both no message and empty command. E.g., this in gdb.ada/bp_reset.exp:

    gdb_run_cmd
    gdb_test "" "Breakpoint $decimal, foo\\.nested_sub \\(\\).*"

was resulting in this in gdb.sum:

    PASS: gdb.ada/bp_reset.exp:

I fixed such cases by passing an explicit message. We may want to make such cases error out.

Tested on x86_64 GNU/Linux, native and native-extended-gdbserver. I see zero PATH cases now. I get zero DUPLICATEs with native testing now. I still see some DUPLICATEs with native-extended-gdbserver, but those were preexisting, unrelated to the gdb_test change.

Change-Id: I5375f23f073493e0672190a0ec2e847938a580b2
2022-05-22 | Accept functions with DW_AT_linkage_name present | Alok Kumar Sharma | 2 | -0/+100
Currently GDB is not able to debug variables present in the shared/private clause of an OpenMP task construct when the binary is generated with Clang. Note that the LLVM debugger, LLDB, is able to debug them.

For OpenMP, compilers generate artificial functions which are not present in the actual program. This is done to apply parallelism to a block of code.

For non-artificial functions, the DW_AT_name attribute should contain the name exactly as present in the actual program. (Ref# http://wiki.dwarfstd.org/index.php?title=Best_Practices) Since artificial functions are not present in the actual program, it should be fine for them to lack DW_AT_name and have DW_AT_linkage_name instead.

Currently GDB invalidates any function that does not have DW_AT_name, which is why it is not able to debug OpenMP code built with Clang. It should be fair to fall back to checking DW_AT_linkage_name when DW_AT_name is absent.
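The fallback described above amounts to a two-step lookup. The sketch below models a function DIE as a plain Python dict; the helper name `function_name` and the dict representation are assumptions for illustration and do not mirror GDB's DWARF reader.

```python
def function_name(die):
    """Pick the name of a function DIE given as a dict of attributes.

    Prefer DW_AT_name; fall back to DW_AT_linkage_name when it is
    absent.  Return None only when neither attribute is present, which
    is the one case where the function would still be rejected.
    """
    name = die.get("DW_AT_name")
    if name is None:
        name = die.get("DW_AT_linkage_name")
    return name
```

For example, a DIE carrying only a linkage name (the string here is made up) resolves to that name instead of being dropped: `function_name({"DW_AT_linkage_name": "artificial_fn"})` returns `"artificial_fn"`.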
2022-05-16 | gdb/testsuite: fix "continue outside of loop" TCL errors | Bruno Larsen | 13 | -14/+14
Many test cases had a few lines in the beginning that look like:

    if { condition } {
      continue
    }

Where conditions varied, but were mostly in the form of ![runto_main] or [skip_*_tests], making it quite clear that this code block was supposed to finish the test if it entered the code block. This generates TCL errors, as most of these tests are not inside loops. All cases on which this was an obvious mistake are changed in this patch.
2022-05-08 | [gdb/testsuite] Fix gdb.threads/fork-plus-threads.exp with check-readmore | Tom de Vries | 1 | -11/+7
When running test-case gdb.threads/fork-plus-threads.exp with check-readmore, I run into:

    ...
    [Inferior 11 (process 7029) exited normally]^M
    [Inferior 1 (process 6956) exited normally]^M
    FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
      inferior 1 exited (timeout)
    ...

The problem is that the regexp consuming the "Inferior exited normally" messages:
- consumes more than one of those messages at a time, but
- counts only one of those messages.

Fix this by adopting a line-by-line approach, which deals with those messages one at a time.

Tested on x86_64-linux with native, check-read1 and check-readmore.
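The miscount can be reproduced in a small model. In the Python sketch below, both function names and the "one count per read chunk" simplification are assumptions chosen to illustrate the failure mode; the real test is written in Tcl/expect.

```python
import re

EXIT_RE = re.compile(r"\[Inferior \d+ \(process \d+\) exited normally\]")
EXIT_LINE_RE = re.compile(r"^\[Inferior \d+ \(process \d+\) exited normally\]$")


def count_exits_per_chunk(chunks):
    """Flawed model: one count per chunk, however many messages it holds."""
    return sum(1 for chunk in chunks if EXIT_RE.search(chunk))


def count_exits_per_line(chunks):
    """Line-by-line model: every message is consumed and counted once."""
    text = "".join(chunks)
    return sum(1 for line in text.splitlines() if EXIT_LINE_RE.match(line))
```

When a single read delivers two notifications, as check-readmore makes likely, the per-chunk count comes up one short and the test would keep waiting for an exit that has already been consumed.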
2022-05-03 | Fix gdb.threads/access-mem-running-thread-exit.exp w/ native-extended-gdbserver | Pedro Alves | 1 | -1/+29
When testing gdb.threads/access-mem-running-thread-exit.exp with --target_board=native-extended-gdbserver, we get:

    Running gdb.threads/access-mem-running-thread-exit.exp ...
    FAIL: gdb.threads/access-mem-running-thread-exit.exp: non-stop: second inferior: runto: run to main
    WARNING: Timed out waiting for EOF in server after monitor exit

    === gdb Summary ===

    # of expected passes 3
    # of unexpected failures 1
    # of unsupported tests 1

The problem is that the testcase spawns a second inferior with -no-connection, and then runto_main does "run", which fails like so:

    (gdb) run
    Don't know how to run. Try "help target".
    (gdb) FAIL: gdb.threads/access-mem-running-thread-exit.exp: non-stop: second inferior: runto: run to main

That "run" above failed because native-extended-gdbserver forces "set auto-connect-native-target off", to prevent testcases from mistakenly running programs with the native target, which would exactly be the case here.

Fix this by letting the second inferior share the first inferior's connection everywhere except on targets that do reload on run (e.g., --target_board=native-gdbserver).

Change-Id: Ib57105a238cbc69c57220e71261219fa55d329ed
2022-04-22 | gdb: prune inferiors at end of fetch_inferior_event, fix intermittent failure of gdb.threads/fork-plus-threads.exp | Simon Marchi | 1 | -2/+31
This test sometimes fails like this:

    info threads^M
      Id Target Id Frame ^M
      11.12 process 2270719 Couldn't get registers: No such process.^M
    (gdb) FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: no threads left
    [Inferior 11 (process 2270719) exited normally]^M
    info inferiors^M
      Num Description Connection Executable ^M
    * 1 <null> /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/fork-plus-threads/fork-plus-threads ^M
      11 <null> /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/fork-plus-threads/fork-plus-threads ^M
    (gdb) FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: only inferior 1 left (the program exited)

I can get it to fail quite reliably by pinning it to a core:

    $ taskset -c 5 make check TESTS="gdb.threads/fork-plus-threads.exp"

The previous attempt at fixing this was:

    https://sourceware.org/pipermail/gdb-patches/2021-October/182846.html

What we see is in part due to a possible unfortunate ordering of events given by the kernel, and in part due to what could be considered a bug in GDB.

The test program makes a number of forks, waits for them all, then exits. Most of the time, GDB will get and process the exit event for inferior 1 after the exit events of all the children. But this is not guaranteed. After the last child exits and is waited for by the parent, the parent can exit quickly, such that GDB collects from the kernel the exit events for the parent and that child at the same time. It then chooses one event at random, which can be the event for the parent. This will result in the parent appearing to exit before its child. There's not much we can do about it, so I think we have to adjust the test to cope.

After expect has seen the "exited normally" notification for inferior 1, it immediately does an "info thread" that it expects to come back empty. But at this point, GDB might not have processed inferior 11's (the last child) exit event, so it will look like there is still a thread. Of course that thread is dead, we just don't know it yet. But that makes the "no thread" test fail. If the test waited just a bit more for the "exited normally" notification for inferior 11, then the list of threads would be empty.

So, first change, make the test collect all the "exited normally" notifications for all inferiors before proceeding, that should ensure we see an empty thread list. That would fix the first FAIL above.

However, we would still have the second FAIL, as we expect inferior 11 to not be there, it should have been deleted automatically. Inferior 11 is normally deleted when prune_inferiors is called. That is called by normal_stop, which is called by fetch_inferior_event only if the event thread completed an execution command FSM (thread_fsm). But the FSM for the continue command completed when inferior 1 exited. At that point inferior 11 was not prunable, as it still had a thread. When inferior 11 exits, prune_inferiors is not called.

I think that can be considered a GDB bug. From the user point of view, there's no reason why in one case inferior 11 would be deleted and not in the other case.

This patch makes the somewhat naive change to call prune_inferiors in fetch_inferior_event, so that it is called in this case. It is placed at this particular point in the function so that it is called after the user inferior / thread selection is restored. If it was called before that, inferior 11 wouldn't be pruned, because it would still be the current inferior.

Change-Id: I48a15d118f30b1c72c528a9f805ed4974170484a
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26272
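The pruning problem can be modeled with a toy event loop. The Python sketch below is a deliberately simplified assumption-laden model, not GDB code: the `Inferior` class, the `removable` flag, the thread counter and `run_events` are all hypothetical stand-ins, and pruning is reduced to "removable, no threads left, not the current inferior".

```python
class Inferior:
    """Toy stand-in for an inferior (hypothetical, for illustration)."""

    def __init__(self, num, removable, threads):
        self.num = num
        self.removable = removable  # assumed: inferior 1 is not removable
        self.threads = threads      # number of live threads


def prune_inferiors(inferiors, current):
    """Drop removable, non-current inferiors that have no threads left."""
    return [inf for inf in inferiors
            if not (inf.removable and inf.threads == 0 and inf is not current)]


def run_events(inferiors, current, events, prune_on_every_event):
    """Process exit events; each event is (inferior number, completes_fsm).

    With prune_on_every_event=False, pruning happens only when an event
    completes a command FSM (the behaviour before the patch, as modeled
    here).  With True, pruning runs after every event.
    """
    for num, completes_fsm in events:
        for inf in inferiors:
            if inf.num == num:
                inf.threads = 0
        if completes_fsm or prune_on_every_event:
            inferiors = prune_inferiors(inferiors, current)
    return [inf.num for inf in inferiors]
```

Feeding the unlucky ordering, where inferior 1's exit (which completes the continue command) arrives before inferior 11's, leaves inferior 11 behind in the first mode and removes it in the second.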
2022-04-18 | gdb/testsuite: add text_segment option to gdb_compile | Vignesh Balasubramanian | 1 | -2/+2
LLVM's lld linker doesn't have the "-Ttext-segment" option, but "--image-base" can be used instead. To centralize the logic of checking which option is supported, add the text_segment option to gdb_compile. Change tests that are currently using -Ttext-segment to use that new option instead.

This patch fixes only compilation error, for example:

Before:

    $ make check TESTS="gdb.base/jit-elf.exp" RUNTESTFLAGS="CC_FOR_TARGET=clang LDFLAGS_FOR_TARGET=-fuse-ld=ld"
    Running /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/jit-elf.exp ...
    gdb compile failed, clang-13: warning: -Xlinker -Ttext-segment=0x7000000: 'linker' input unused [-Wunused-command-line-argument]

After:

    $ make check TESTS="gdb.base/jit-elf.exp" RUNTESTFLAGS="CC_FOR_TARGET=clang LDFLAGS_FOR_TARGET=-fuse-ld=ld"
    Running /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/jit-elf.exp ...
    FAIL: gdb.base/jit-elf.exp: one_jit_test-1: continue to breakpoint: break here 1
    FAIL: gdb.base/jit-elf.exp: one_jit_test-1: continue to breakpoint: break here 2
    FAIL: gdb.base/jit-elf.exp: one_jit_test-2: continue to breakpoint: break here 1
    FAIL: gdb.base/jit-elf.exp: one_jit_test-2: info function ^jit_function
    FAIL: gdb.base/jit-elf.exp: one_jit_test-2: continue to breakpoint: break here 2
    FAIL: gdb.base/jit-elf.exp: attach: one_jit_test-2: continue to breakpoint: break here 1
    FAIL: gdb.base/jit-elf.exp: attach: one_jit_test-2: break here 1: attach
    FAIL: gdb.base/jit-elf.exp: PIE: one_jit_test-1: continue to breakpoint: break here 1
    FAIL: gdb.base/jit-elf.exp: PIE: one_jit_test-1: continue to breakpoint: break here 2

    === gdb Summary ===

    # of expected passes 26
    # of unexpected failures 9

Change-Id: I3678c5c9bbfc2f80671698e28a038e6b3d14e635
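Centralizing the choice means callers state an address and one helper picks the spelling. The Python sketch below illustrates that idea only: `text_segment_flag`, its `linker` parameter and the `-Wl,` pass-through form are assumptions for the example, and how gdb_compile actually detects the supported option is not described in this log.

```python
def text_segment_flag(linker, address):
    """Return a compiler-driver flag placing the text segment at ADDRESS.

    LINKER is "lld" for LLVM's linker; any other value is treated as a
    linker that accepts -Ttext-segment (an assumption of this sketch).
    """
    if linker == "lld":
        return "-Wl,--image-base=0x%x" % address
    return "-Wl,-Ttext-segment=0x%x" % address
```

A test would then ask for `text_segment_flag(linker, 0x7000000)` rather than hard-coding either option.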
2022-04-04 | gdb: resume ongoing step after handling fork or vfork | Simon Marchi | 2 | -0/+208
The test introduced by this patch would fail in this configuration, with the native-gdbserver or native-extended-gdbserver boards:

    FAIL: gdb.threads/next-fork-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=auto: i=2: next to for loop

The problem is that the step operation is forgotten when handling the fork/vfork. With "debug infrun" and "debug remote", it looks like this (some lines omitted for brevity). We do the next:

    [infrun] proceed: enter
    [infrun] proceed: addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT
    [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=0, current thread [4154304.4154304.0] at 0x5555555553bf
    [infrun] do_target_resume: resume_ptid=4154304.0.0, step=1, sig=GDB_SIGNAL_0
    [remote] Sending packet: $vCont;r5555555553bf,5555555553c4:p3f63c0.3f63c0;c:p3f63c0.-1#cd
    [infrun] proceed: exit

We then handle a fork event:

    [infrun] fetch_inferior_event: enter
    [remote] wait: enter
    [remote] Packet received: T05fork:p3f63ee.3f63ee;06:0100000000000000;07:b08e59f6ff7f0000;10:bf60e8f7ff7f0000;thread:p3f63c0.3f63c6;core:17;
    [remote] wait: exit
    [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
    [infrun] print_target_wait_results: 4154304.4154310.0 [Thread 4154304.4154310],
    [infrun] print_target_wait_results: status->kind = FORKED, child_ptid = 4154350.4154350.0
    [infrun] handle_inferior_event: status->kind = FORKED, child_ptid = 4154350.4154350.0
    [remote] Sending packet: $D;3f63ee#4b
    [infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [4154304.4154310.0] at 0x7ffff7e860bf
    [infrun] do_target_resume: resume_ptid=4154304.0.0, step=0, sig=GDB_SIGNAL_0
    [remote] Sending packet: $vCont;c:p3f63c0.-1#73
    [infrun] fetch_inferior_event: exit

In the first snippet, we resume the stepping thread with the range-stepping (r) vCont command. But after handling the fork (detaching the fork child), we resumed the whole process freely. The stepping thread, which was paused by GDBserver while reporting the fork event, was therefore resumed freely, instead of confined to the addresses of the stepped line. Note that since this is a "next", it could be that we have entered a function, installed a step-resume breakpoint, and it's ok to continue freely the stepping thread, but that's not the case here. The two snippets shown above were next to each other in the logs.

For the fork case, we can resume stepping right after handling the event.

However, for the vfork case, where we are waiting for the external child process to exec or exit, we only resume the thread that called vfork, and keep the others stopped (see patch "gdb: fix handling of vfork by multi-threaded program" prior in this series). So we can't resume the stepping thread right now. Instead, do it after handling the vfork-done event.

Change-Id: I92539c970397ce880110e039fe92b87480f816bd
2022-04-04 | gdb: fix handling of vfork by multi-threaded program (follow-fork-mode=parent, detach-on-fork=on) | Simon Marchi | 2 | -0/+184
There is a problem with how GDB handles a vfork happening in a multi-threaded program. This problem was reported to me by somebody not using vfork directly, but using system(3) in a multi-threaded program, which may be implemented using vfork.

This patch only deals with the follow-fork-mode=parent, detach-on-fork=on case, because it would be too much to chew at once to fix the bugs in the other cases as well (I tried).

The problem
-----------

When a program vforks, the parent thread is suspended by the kernel until the child process exits or execs. Specifically, in a multi-threaded program, only the thread that called vfork is suspended, other threads keep running freely. This is documented in the vfork(2) man page ("Caveats" section).

Let's suppose GDB is handling a vfork and the user's desire is to detach from the child. Before detaching the child, GDB must remove the software breakpoints inserted in the shared parent/child address space, in case there's a breakpoint in the path the child is going to take before exec'ing or exit'ing (unlikely, but possible). Otherwise the child could hit a breakpoint instruction while running outside the control of GDB, which would make it crash. GDB must also avoid re-inserting breakpoints in the parent as long as it didn't receive the "vfork done" event (that is, when the child has exited or execed): since the address space is shared with the child, that would re-insert breakpoints in the child process also. So what GDB does is:

1. Receive "vfork" event for the parent
2. Remove breakpoints from the (shared) address space and set program_space::breakpoints_not_allowed to avoid re-inserting them
3. Detach from the child thread
4. Resume the parent
5. Wait for and receive "vfork done" event for the parent
6. Clean program_space::breakpoints_not_allowed and re-insert breakpoints
7. Resume the parent

Resuming the parent at step 4 is necessary in order for the kernel to report the "vfork done" event. The kernel won't report a ptrace event for a thread that is ptrace-stopped. But the theory behind this is that between steps 4 and 5, the parent won't actually make any progress even though it is ptrace-resumed, because the kernel keeps it suspended, waiting for the child to exec or exit. So it doesn't matter for that thread if breakpoints are not inserted.

The problem is when the program is multi-threaded. In step 4, GDB resumes all threads of the parent. The thread that did the vfork stays suspended by the kernel, so that's fine. But other threads are running freely while breakpoints are removed, which is a problem because they could miss a breakpoint that they should have hit.

The problem is present with all-stop and non-stop targets. The only difference is that with an all-stop target, the other threads are stopped by the target when it reports the vfork event and are resumed by the target when GDB resumes the parent. With a non-stop target, the other threads are simply never stopped.

The fix
-------

There are many combinations of settings to consider (all-stop/non-stop, target-non-stop on/off, follow-fork-mode parent/child, detach-on-fork on/off, schedule-multiple on/off), but for this patch I restrict the scope to follow-fork-mode=parent, detach-on-fork=on. That's the "default" case, where we detach the child and keep debugging the parent. I tried to fix them all, but it's just too much to do at once. The code paths and behaviors for when we don't detach the child are completely different.

The guiding principle for this patch is that all threads of the vforking inferior should be stopped as long as breakpoints are removed. This is similar to handling in-line step-overs, in a way.

For non-stop targets (the default on Linux native), this is what happens:

- In follow_fork, we call stop_all_threads to stop all threads of the inferior
- In follow_fork_inferior, we record the vfork parent thread in inferior::thread_waiting_for_vfork_done
- Back in handle_inferior_event, we call keep_going, which resumes only the event thread (this is already the case, with a non-stop target). This is the thread that will be waiting for vfork-done.
- When we get the vfork-done event, we go in the (new) handle_vfork_done function to restart the previously stopped threads.

In the same scenario, but with an all-stop target:

- In follow_fork, no need to stop all threads of the inferior, the target has stopped all threads of all its inferiors before returning the event.
- In follow_fork_inferior, we record the vfork parent thread in inferior::thread_waiting_for_vfork_done.
- Back in handle_inferior_event, we also call keep_going. However, we only want to resume the event thread here, not all inferior threads. In internal_resume_ptid (called by resume_1), we therefore now check whether one of the inferiors we are about to resume has thread_waiting_for_vfork_done set. If so, we only resume that thread.

  Note that when resuming multiple inferiors, one vforking and one non-vforking, we could resume the vforking thread from the vforking inferior plus all threads from the non-vforking inferior. However, this is not implemented, it would require more work.
- When we get the vfork-done event, the existing call to keep_going naturally resumes all threads.

Testing-wise, add a test that tries to make the main thread hit a breakpoint while a secondary thread calls vfork. Without the fix, the main thread keeps going while breakpoints are removed, resulting in a missed breakpoint and the program exiting.

Change-Id: I20eb78e17ca91f93c19c2b89a7e12c382ee814a1
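The all-stop resume rule described above can be sketched as a small decision function. The Python model below is an illustration under stated assumptions: inferiors are plain dicts, thread ids are integers, and `threads_to_resume` is a hypothetical name. It only borrows the field name thread_waiting_for_vfork_done from the text and does not reproduce internal_resume_ptid itself.

```python
def threads_to_resume(inferiors):
    """Choose which threads to resume, given the inferiors about to run.

    Each inferior is a dict with:
      "threads": list of thread ids
      "thread_waiting_for_vfork_done": a thread id, or None

    If any inferior has a thread waiting for vfork-done, resume only
    that thread; everything else stays stopped while breakpoints are
    removed.  Otherwise resume every thread.
    """
    for inf in inferiors:
        waiting = inf["thread_waiting_for_vfork_done"]
        if waiting is not None:
            return [waiting]
    return [tid for inf in inferiors for tid in inf["threads"]]
```

This also reflects the stated limitation: with one vforking and one non-vforking inferior, the non-vforking inferior's threads are not resumed either.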
2022-03-31 | gdb/linux-nat: remove check based on current_inferior in linux_handle_extended_wait | Simon Marchi | 3 | -0/+195
The check removed by this patch, using current_inferior, looks wrong. When debugging multiple inferiors with the Linux native target and linux_handle_extended_wait is called, there's no guarantee about which is the current inferior. The vfork-done event we receive could be for any inferior. If the vfork-done event is for a non-current inferior, we end up wrongfully ignoring it. As a result, the core never processes a TARGET_WAITKIND_VFORK_DONE event, program_space::breakpoints_not_allowed is never cleared, and breakpoints are never reinserted. However, because the Linux native target decided to ignore the event, it resumed the thread while breakpoints were out. And that's bad.

The proposed fix is to remove this check. Always report vfork-done events and let infrun's logic decide if it should be ignored. We don't save many cycles by filtering the event here.

Add a test that replicates the situation described above. See comments in the test for more details.

Change-Id: Ibe33c1716c3602e847be6c2093120696f2286fbf
2022-03-10 | Process exit status is leader exit status testcase | Lancelot SIX | 2 | -0/+110
This adds a multi-threaded testcase that has all threads in the process exit with a different exit code, and ensures that GDB reports the thread group leader's exit status as the whole-process exit status. Before this set of patches, this would randomly report the exit code of some other thread, and thus fail. Tested on Linux-x86_64, native and gdbserver. Co-Authored-By: Pedro Alves <pedro@palves.net> Change-Id: I30cba2ff4576fb01b5169cc72667f3268d919557
2022-03-10 | Fix gdb.threads/current-lwp-dead.exp race | Pedro Alves | 2 | -37/+87
If we make GDB report the process EXIT event for the leader thread, as will be done in a later patch of this series, then gdb.threads/current-lwp-dead.exp starts failing:

    (gdb) break fn_return
    Breakpoint 2 at 0x5555555551b5: file /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/current-lwp-dead.c, line 45.
    (gdb) continue
    Continuing.
    [New LWP 2138466]
    [Inferior 1 (process 2138459) exited normally]
    (gdb) FAIL: gdb.threads/current-lwp-dead.exp: continue to breakpoint: fn_return (the program exited)

The inferior exit reported is actually correct. The main thread has indeed exited, and that's the thread that has the right exit code to report to the user, as that's the exit code that is reported to the program's parent. In this case, GDB managed to collect the exit code for the leader thread before reaping the other thread, because in reality, the testcase isn't creating standard threads, it is using raw clone, and the new clones are put in their own thread group.

Fix it by making the main "thread" not exit until the scenario we're exercising plays out. Also, run the program to completion for completeness.

The original program really wanted the leader thread to exit before the fn_return function was reached -- it was important that the current thread as pointed to by inferior_ptid was gone when infrun got the breakpoint event. I've tweaked the testcase to ensure that that condition still holds, though it is no longer the main thread that exits. This required a bit of synchronization between the threads, which required using CLONE_VM unconditionally. The #ifdef guards were added as a fix for https://sourceware.org/bugzilla/show_bug.cgi?id=11214, though I don't think they were necessary because the program is not using TLS. If it turns out they were necessary, we can link the testcase with "-z now" instead, which was mentioned as an alternative workaround in that Bugzilla.

Change-Id: I7be2f0da4c2fe8f80a60bdde5e6c623d8bd5a0aa
2022-03-10 | Fix gdb.threads/clone-new-thread-event.exp race | Pedro Alves | 2 | -1/+17
If we make GDB report the process EXIT event for the leader thread, instead of whatever is the last thread in the LWP list, as will be done in a later patch of this series, then gdb.threads/clone-new-thread-event.exp starts failing:

    (gdb) FAIL: gdb.threads/clone-new-thread-event.exp: catch SIGUSR1 (the program exited)

This is a testcase race -- the main thread does not wait for the spawned clone "thread" to finish before exiting, so the main program may exit before the second thread is scheduled and reports its SIGUSR1. With the change to make GDB report the EXIT for the leader, the race is 100% reproducible by adding a sleep(), like so:

    --- c/gdb/testsuite/gdb.threads/clone-new-thread-event.c
    +++ w/gdb/testsuite/gdb.threads/clone-new-thread-event.c
    @@ -51,6 +51,7 @@ local_gettid (void)
     static int
     fn (void *unused)
     {
    +  sleep (1);
       tkill (local_gettid (), SIGUSR1);
       return 0;
     }

Resulting in:

    Breakpoint 1, main (argc=1, argv=0x7fffffffd418) at gdb.threads/clone-new-thread-event.c:65
    65 stack = malloc (STACK_SIZE);
    (gdb) continue
    Continuing.
    [New LWP 3715562]
    [Inferior 1 (process 3715555) exited normally]
    (gdb) FAIL: gdb.threads/clone-new-thread-event.exp: catch SIGUSR1 (the program exited)

That inferior exit reported is actually correct. The main thread has indeed exited, and that's the thread that has the right exit code to report to the user, as that's the exit code that is reported to the program's parent. In this case, GDB managed to collect the exit code for the leader thread before reaping the other thread, because in reality, the testcase isn't creating standard threads, it is using raw clone, and the new clones are put in their own thread group.

Fix it by making the main thread wait for the child to exit. Also, run the program to completion for completeness.

Change-Id: I315cd3dc2b9e860395dcab9658341ea868d7a6bf
2022-02-03 | testsuite: fix failure in gdb.threads/killed-outside.exp | Tankut Baris Aktemur | 1 | -2/+2
Starting with commit

    commit 1da5d0e664e362857153af8682321a89ebafb7f6
    Date: Tue Jan 4 08:02:24 2022 -0700

      Change how Python architecture and language are handled

we see a failure in gdb.threads/killed-outside.exp:

    ...
    Executing on target: kill -9 16622 (timeout = 300)
    builtin_spawn -ignore SIGHUP kill -9 16622
    continue
    Continuing.
    Couldn't get registers: No such process.
    (gdb) [Thread 0x7ffff77c2700 (LWP 16626) exited]

    Program terminated with signal SIGKILL, Killed.
    The program no longer exists.
    FAIL: gdb.threads/killed-outside.exp: prompt after first continue (timeout)

This is not a regression but a failure due to a change in GDB's output. Prior to the aforementioned commit, GDB has been printing the "Couldn't get registers: No such process." message twice. The second one came from

    (top-gdb) bt
    #0 amd64_linux_nat_target::fetch_registers (this=0x555557f31440 <the_amd64_linux_nat_target>, regcache=0x555558805ce0, regnum=16) at /gdb-up/gdb/amd64-linux-nat.c:225
    #1 0x000055555640ac5f in target_ops::fetch_registers (this=0x555557d636d0 <the_thread_db_target>, arg0=0x555558805ce0, arg1=16) at /gdb-up/gdb/target-delegates.c:502
    #2 0x000055555641a647 in target_fetch_registers (regcache=0x555558805ce0, regno=16) at /gdb-up/gdb/target.c:3945
    #3 0x0000555556278e68 in regcache::raw_update (this=0x555558805ce0, regnum=16) at /gdb-up/gdb/regcache.c:587
    #4 0x0000555556278f14 in readable_regcache::raw_read (this=0x555558805ce0, regnum=16, buf=0x555558881950 "") at /gdb-up/gdb/regcache.c:601
    #5 0x00005555562792aa in readable_regcache::cooked_read (this=0x555558805ce0, regnum=16, buf=0x555558881950 "") at /gdb-up/gdb/regcache.c:690
    #6 0x000055555627965e in readable_regcache::cooked_read_value (this=0x555558805ce0, regnum=16) at /gdb-up/gdb/regcache.c:748
    #7 0x0000555556352a37 in sentinel_frame_prev_register (this_frame=0x555558181090, this_prologue_cache=0x5555581810a8, regnum=16) at /gdb-up/gdb/sentinel-frame.c:53
    #8 0x0000555555fa4773 in frame_unwind_register_value (next_frame=0x555558181090, regnum=16) at /gdb-up/gdb/frame.c:1235
    #9 0x0000555555fa420d in frame_register_unwind (next_frame=0x555558181090, regnum=16, optimizedp=0x7fffffffd570, unavailablep=0x7fffffffd574, lvalp=0x7fffffffd57c, addrp=0x7fffffffd580, realnump=0x7fffffffd578, bufferp=0x7fffffffd5b0 "") at /gdb-up/gdb/frame.c:1143
    #10 0x0000555555fa455f in frame_unwind_register (next_frame=0x555558181090, regnum=16, buf=0x7fffffffd5b0 "") at /gdb-up/gdb/frame.c:1199
    #11 0x00005555560178e2 in i386_unwind_pc (gdbarch=0x5555587c4a70, next_frame=0x555558181090) at /gdb-up/gdb/i386-tdep.c:1972
    #12 0x0000555555cd2b9d in gdbarch_unwind_pc (gdbarch=0x5555587c4a70, next_frame=0x555558181090) at /gdb-up/gdb/gdbarch.c:3007
    #13 0x0000555555fa3a5b in frame_unwind_pc (this_frame=0x555558181090) at /gdb-up/gdb/frame.c:948
    #14 0x0000555555fa7621 in get_frame_pc (frame=0x555558181160) at /gdb-up/gdb/frame.c:2572
    #15 0x0000555555fa7706 in get_frame_address_in_block (this_frame=0x555558181160) at /gdb-up/gdb/frame.c:2602
    #16 0x0000555555fa77d0 in get_frame_address_in_block_if_available (this_frame=0x555558181160, pc=0x7fffffffd708) at /gdb-up/gdb/frame.c:2665
    #17 0x0000555555fa5f8d in select_frame (fi=0x555558181160) at /gdb-up/gdb/frame.c:1890
    #18 0x0000555555fa5bab in lookup_selected_frame (a_frame_id=..., frame_level=-1) at /gdb-up/gdb/frame.c:1720
    #19 0x0000555555fa5e47 in get_selected_frame (message=0x0) at /gdb-up/gdb/frame.c:1810
    #20 0x0000555555cc9c6e in get_current_arch () at /gdb-up/gdb/arch-utils.c:848
    #21 0x000055555625b239 in gdbpy_before_prompt_hook (extlang=0x555557451f20 <extension_language_python>, current_gdb_prompt=0x555557f4d890 <top_prompt+16> "(gdb) ") at /gdb-up/gdb/python/python.c:1063
    #22 0x0000555555f7cfbb in ext_lang_before_prompt (current_gdb_prompt=0x555557f4d890 <top_prompt+16> "(gdb) ") at /gdb-up/gdb/extension.c:922
    #23 0x0000555555f7d442 in std::_Function_handler<void (char const*), void (*)(char const*)>::_M_invoke(std::_Any_data const&, char const*&&) (__functor=..., __args#0=@0x7fffffffd900: 0x555557f4d890 <top_prompt+16> "(gdb) ") at /usr/include/c++/7/bits/std_function.h:316
    #24 0x0000555555f752dd in std::function<void (char const*)>::operator()(char const*) const (this=0x55555817d838, __args#0=0x555557f4d890 <top_prompt+16> "(gdb) ") at /usr/include/c++/7/bits/std_function.h:706
    #25 0x0000555555f75100 in gdb::observers::observable<char const*>::notify (this=0x555557f49060 <gdb::observers::before_prompt>, args#0=0x555557f4d890 <top_prompt+16> "(gdb) ") at /gdb-up/gdb/../gdbsupport/observable.h:150
    #26 0x0000555555f736dc in top_level_prompt () at /gdb-up/gdb/event-top.c:444
    #27 0x0000555555f735ba in display_gdb_prompt (new_prompt=0x0) at /gdb-up/gdb/event-top.c:411
    #28 0x00005555564611a7 in tui_on_command_error () at /gdb-up/gdb/tui/tui-interp.c:205
    #29 0x0000555555c2173f in std::_Function_handler<void (), void (*)()>::_M_invoke(std::_Any_data const&) (__functor=...) at /usr/include/c++/7/bits/std_function.h:316
    #30 0x0000555555e10c20 in std::function<void ()>::operator()() const (this=0x5555580f9028) at /usr/include/c++/7/bits/std_function.h:706
    #31 0x0000555555e10973 in gdb::observers::observable<>::notify() const (this=0x555557f48d20 <gdb::observers::command_error>) at /gdb-up/gdb/../gdbsupport/observable.h:150
    #32 0x00005555560e9b3f in start_event_loop () at /gdb-up/gdb/main.c:438
    #33 0x00005555560e9bcc in captured_command_loop () at /gdb-up/gdb/main.c:481
    #34 0x00005555560eb616 in captured_main (data=0x7fffffffddd0) at /gdb-up/gdb/main.c:1348
    #35 0x00005555560eb67c in gdb_main (args=0x7fffffffddd0) at /gdb-up/gdb/main.c:1363
    #36 0x0000555555c1b6b3 in main (argc=12, argv=0x7fffffffded8) at /gdb-up/gdb/gdb.c:32

Commit 1da5d0e664 eliminated the call to 'get_current_arch' in 'gdbpy_before_prompt_hook'. Hence, the second instance of "Couldn't get registers: No such process." does not appear anymore.

Fix the failure by updating the regular expression in the test.
2022-01-12 | gdb: rename lin-lwp to linux-nat in set/show debug | Andrew Burgess | 1 | -1/+1
Rename 'set debug lin-lwp' to 'set debug linux-nat' and 'show debug lin-lwp' to 'show debug linux-nat'. I've updated the documentation and help text to match, as well as making it clear that the debug that is coming out relates to all aspects of Linux native inferior support, not just the LWP aspect of it. The boundary between general "native" target debug, and the lwp specific part of that debug was always a little blurry, but the actual debug variable inside GDB is debug_linux_nat, and the print routine linux_nat_debug_printf, is used throughout the linux-nat.c file, not just for lwp related debug, so the new name seems to make more sense.
2022-01-07  gdb/testsuite: Remove duplicates from gdb.threads/staticthreads.ex  Lancelot SIX  1  -1/+1
When running the testsuite, I have:

    Running .../gdb/testsuite/gdb.threads/staticthreads.exp ...
    DUPLICATE: gdb.threads/staticthreads.exp: couldn't compile staticthreads.c: unrecognized error

Fix by using foreach_with_prefix instead of foreach when preparing the test case.

Tested on x86_64-linux both in a setup where the test fails to prepare and in a setup where the test fails to set up.
2022-01-03  [gdb/testsuite] Handle for loop initial decl with gcc 4.8.5  Tom de Vries  2  -2/+3
When running test-case gdb.threads/schedlock-thread-exit.exp on a system with system compiler gcc 4.8.5, I run into:
...
src/gdb/testsuite/gdb.threads/schedlock-thread-exit.c:33:3: error: \
  'for' loop initial declarations are only allowed in C99 mode
...

Fix this by:
- using -std=c99, or
- using -std=gnu99, in case that's required, or
- in the case of the jit test-cases, rewriting the for loops.

Tested on x86_64-linux, both with gcc 4.8.5 and gcc 7.5.0.
2022-01-01  Automatic Copyright Year update after running gdb/copyright.py  Joel Brobecker  217  -217/+217
This commit brings all the changes made by running gdb/copyright.py as per GDB's Start of New Year Procedure.

For the avoidance of doubt, all changes in this commit were performed by the script.
2021-12-08  gdb, gdbserver: detach fork child when detaching from fork parent  Simon Marchi  6  -82/+407
While working with pending fork events, I wondered what would happen if the user detached an inferior while a thread of that inferior had a pending fork event. What happens with the fork child, which is ptrace-attached by the GDB process (or by GDBserver), but not known to the core? Sure enough, neither the core of GDB nor the target detaches the child process, so GDB (or GDBserver) just stays ptrace-attached to the process. The result is that the fork child process is stuck, while you would expect it to be detached and run.

Make GDBserver detach fork children it knows about. That is done in the generic handle_detach function. Since a process_info already exists for the child, we can simply call detach_inferior on it.

GDB-side, make the linux-nat and remote targets detach fork children known because of pending fork events. These pending fork events can be stored in:

- thread_info::pending_waitstatus, if the core has consumed the event but then saved it for later (for example, because it got the event while stopping all threads, to present an all-stop stop on top of a non-stop target)
- thread_info::pending_follow, if we ran to a "catch fork" and we detach at that moment

Additionally, pending fork events can be in target-specific fields:

- For linux-nat, they can be in lwp_info::status and lwp_info::waitstatus.
- For the remote target, they could be stored as pending stop replies, saved in `remote_state::notif_state::pending_event`, if not acknowledged yet, or in `remote_state::stop_reply_queue`, if acknowledged. I followed the model of remove_new_fork_children for this: call remote_notif_get_pending_events to process / acknowledge any unacknowledged notification, then look through stop_reply_queue.

Update the gdb.threads/pending-fork-event.exp test (and rename it to gdb.threads/pending-fork-event-detach.exp) to try to detach the process while it is stopped with a pending fork event.
In order to verify that the fork child process is correctly detached and resumes execution outside of GDB's control, make that process create a file in the test output directory, and make the test wait up to $timeout seconds for that file to appear (it happens instantly if everything goes well).

This test catches a bug in linux-nat.c, also reported as PR 28512 ("waitstatus.h:300: internal-error: gdb_signal target_waitstatus::sig() const: Assertion `m_kind == TARGET_WAITKIND_STOPPED || m_kind == TARGET_WAITKIND_SIGNALLED' failed."). When detaching a thread with a pending event, get_detach_signal unconditionally fetches the signal stored in the waitstatus (`tp->pending_waitstatus ().sig ()`). However, that is only valid if the pending event is of type TARGET_WAITKIND_STOPPED, and this is now enforced using assertions (it would also be valid for TARGET_WAITKIND_SIGNALLED, but that would mean the thread does not exist anymore, so we wouldn't be detaching it). Add a condition in get_detach_signal to access the signal number only if the wait status is of kind TARGET_WAITKIND_STOPPED, and use GDB_SIGNAL_0 otherwise (since the thread was not stopped with a signal to begin with).

Add another test, gdb.threads/pending-fork-event-ns.exp, specifically to verify that we consider events in pending stop replies in the remote target. This test has many threads constantly forking, and we detach from the program while the program is executing. That gives us some chance that we detach while a fork stop reply is stored in the remote target. To verify that we correctly detach all fork children, we ask the parent to exit by sending it a SIGUSR1 signal and have it write a file to the filesystem before exiting. Because the parent's main thread joins the forking threads, and the forking threads wait for their fork children to exit, if some fork child is not detached by GDB, the parent will not write the file, and the test will time out.
If I remove the new remote_detach_pid calls in remote.c, the test fails eventually if I run it in a loop.

There is a known limitation: we don't remove breakpoints from the children before detaching them. So the children could hit a trap instruction after being detached and crash. I know this is wrong, and it should be fixed, but I would like to handle that later. The current patch doesn't fix everything, but it's a step in the right direction.

Change-Id: I6d811a56f520e3cb92d5ea563ad38976f92e93dd
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28512
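The get_detach_signal change described in this commit can be modelled in a few lines. This is a simplified Python sketch, not GDB's C++ code: `WaitKind`, `WaitStatus` and `detach_signal` are hypothetical stand-ins for target_waitkind, target_waitstatus and get_detach_signal, and the signal numbers are illustrative only.

```python
from dataclasses import dataclass
from enum import Enum, auto

GDB_SIGNAL_0 = 0  # stand-in for GDB's "no signal" value (assumption)


class WaitKind(Enum):
    STOPPED = auto()
    SIGNALLED = auto()
    FORKED = auto()


@dataclass
class WaitStatus:
    kind: WaitKind
    signal: int = GDB_SIGNAL_0

    def sig(self) -> int:
        # Mirrors the assertion quoted in the commit message: the signal
        # is only meaningful for STOPPED and SIGNALLED events.
        assert self.kind in (WaitKind.STOPPED, WaitKind.SIGNALLED)
        return self.signal


def detach_signal(pending: WaitStatus) -> int:
    """Signal to deliver to a thread when detaching it."""
    if pending.kind is WaitKind.STOPPED:
        return pending.sig()
    # For example a pending fork event: the thread was not stopped by a
    # signal, so there is nothing to pass on.
    return GDB_SIGNAL_0


print(detach_signal(WaitStatus(WaitKind.STOPPED, signal=2)))
print(detach_signal(WaitStatus(WaitKind.FORKED)))
```

Calling `sig()` unconditionally on the `FORKED` status would trip the assertion, which is the shape of the failure reported in PR 28512.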
2021-12-08  gdbserver: hide fork child threads from GDB  Simon Marchi  2  -0/+161
This patch aims at fixing a bug where an inferior is unexpectedly created when a fork happens at the same time as another event, and that other event is reported to GDB first (and the fork event stays pending in GDBserver). This happens for example when we step a thread and another thread forks at the same time. The bug looks like this (if I reproduce the included test by hand):

    (gdb) show detach-on-fork
    Whether gdb will detach the child of a fork is on.
    (gdb) show follow-fork-mode
    Debugger response to a program call of fork or vfork is "parent".
    (gdb) si
    [New inferior 2]
    Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
    Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
    Reading symbols from target:/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread...
    [New Thread 965190.965190]
    [Switching to Thread 965190.965190]
    Remote 'g' packet reply is too long (expected 560 bytes, got 816 bytes):
    ... <long series of bytes>

The sequence of events leading to the problem is:

- We are using the all-stop user-visible mode as well as the synchronous / all-stop variant of the remote protocol
- We have two threads, thread A that we single-step and thread B that calls fork at the same time
- GDBserver's linux_process_target::wait pulls the "single step complete SIGTRAP" and the "fork" events from the kernel. It arbitrarily chooses one event to report, it happens to be the single-step SIGTRAP. The fork stays pending in the thread_info.
- GDBserver sends that SIGTRAP as a stop reply to GDB
- While in stop_all_threads, GDB calls update_thread_list, which ends up querying the remote thread list using qXfer:threads:read.
- In the reply, GDBserver includes the fork child created as a result of thread B's fork.
- GDB-side, the remote target sees the new PID, calls remote_notice_new_inferior, which ends up unexpectedly creating a new inferior, and things go downhill from there.

The problem here is that as long as GDB has not processed the fork event, it should pretend the fork child does not exist. Ultimately, this event will be reported, we'll go through follow_fork, and that process will be detached.

The remote target (GDB-side) has some code to remove from the reported thread list the threads that are the result of forks not processed by GDB yet. But that only works for fork events that have made their way to the remote target (GDB-side), but haven't been consumed by the core yet, so are still lingering as pending stop replies in the remote target (see remove_new_fork_children in remote.c). But in our case, the fork event hasn't made its way to the GDB-side remote target. We need to implement the same kind of logic GDBserver-side: if there exists a thread / inferior that is the result of a fork event GDBserver hasn't reported yet, it should exclude that thread / inferior from the reported thread list.

This was actually discussed a while ago, but not implemented AFAIK:

    https://pi.simark.ca/gdb-patches/1ad9f5a8-d00e-9a26-b0c9-3f4066af5142@redhat.com/#t
    https://sourceware.org/pipermail/gdb-patches/2016-June/133906.html

Implementation details-wise, the fix for this is all in GDBserver. The Linux layer of GDBserver already tracks unreported fork parent / child relationships using the lwp_info::fork_relative, in order to avoid wildcard actions resuming fork children unknown to GDB. This information needs to be made available to the handle_qxfer_threads_worker function, so it can filter the reported threads. Add a new thread_pending_parent target function that allows the Linux target to return the parent of an eventual fork child.
Testing-wise, the test replicates pretty-much the sequence of events shown above. The setup of the test makes it such that the main thread is about to fork. We stepi the other thread, so that the step completes very quickly, in a single event. Meanwhile, the main thread is resumed, so very likely has time to call fork. This means that the bug may not reproduce every time (if the main thread does not have time to call fork), but it will reproduce more often than not. The test fails without the fix applied on the native-gdbserver and native-extended-gdbserver boards. At some point I suspected that which thread called fork and which thread did the step influenced the order in which the events were reported, and therefore the reproducibility of the bug. So I made the test try both combinations: main thread forks while other thread steps, and vice versa. I'm not sure this is still necessary, but I left it there anyway. It doesn't hurt to test a few more combinations. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28288 Change-Id: I2158d5732fc7d7ca06b0eb01f88cf27bf527b990
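The filtering idea in this commit can be sketched as a small model. This is illustrative Python, not GDBserver code: the `Thread` class, its `pending_parent` field and `reported_threads` are hypothetical names, loosely inspired by the thread_pending_parent target function mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Thread:
    pid: int
    tid: int
    # Thread whose fork created this one, set only while that fork event
    # has not been reported to GDB yet (hypothetical field).
    pending_parent: Optional["Thread"] = None


def reported_threads(threads: List[Thread]) -> List[Thread]:
    """Threads that may appear in the thread list reply."""
    return [t for t in threads if t.pending_parent is None]


stepping = Thread(pid=100, tid=100)
forking = Thread(pid=100, tid=101)
child = Thread(pid=200, tid=200, pending_parent=forking)
all_threads = [stepping, forking, child]

# The fork event is still pending, so the child stays hidden.
print([t.tid for t in reported_threads(all_threads)])

# Once the fork event has been reported, the child becomes visible.
child.pending_parent = None
print([t.tid for t in reported_threads(all_threads)])
```

In this model the first listing omits the child, so the GDB side never sees a PID it has no fork event for.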
2021-11-19  [gdb/testsuite] Fix gdb.threads/thread-specific-bp.exp  Tom de Vries  1  -3/+18
On OBS I ran into a failure in test-case gdb.threads/thread-specific-bp.exp:
...
(gdb) PASS: gdb.threads/thread-specific-bp.exp: non-stop: continue to end
info breakpoint^M
Num Type Disp Enb Address What^M
1 breakpoint keep y 0x0000555555555167 in main at $src:36^M
        breakpoint already hit 1 time^M
2 breakpoint keep y 0x0000555555555151 in start at $src:23^M
        breakpoint already hit 1 time^M
3 breakpoint keep y 0x0000555555555167 in main at $src:36 thread 2^M
        stop only in thread 2^M
4 breakpoint keep y 0x000055555555515c in end at $src:29^M
        breakpoint already hit 1 time^M
(gdb) [Thread 0x7ffff7db1640 (LWP 19984) exited]^M
Thread-specific breakpoint 3 deleted - thread 2 no longer in the thread list.^M
FAIL: gdb.threads/thread-specific-bp.exp: non-stop: \
  thread-specific breakpoint was deleted (timeout)
...

Fix this by waiting for the "[Thread 0x7ffff7db1640 (LWP 19984) exited]" message before issuing the "info breakpoint" command.

Tested on x86_64-linux.
2021-11-11  gdb: fix "set scheduler-locking" thread exit hang  Simon Marchi  2  -0/+88
GDB hangs when doing this:

- launch inferior with multiple threads
- multiple threads hit some breakpoint(s)
- one breakpoint hit is presented as a stop, the rest are saved as pending wait statuses
- "set scheduler-locking on"
- resume the currently selected thread (because of scheduler-locking, it's the only one resumed), let it execute until exit
- GDB hangs, not showing the prompt, impossible to interrupt with ^C

When the resumed thread exits, we expect the target to return a TARGET_WAITKIND_NO_RESUMED event, and that's what we see:

    [infrun] fetch_inferior_event: enter
    [infrun] scoped_disable_commit_resumed: reason=handling event
    [infrun] random_pending_event_thread: None found.
    [Thread 0x7ffff7d9c700 (LWP 309357) exited]
    [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
    [infrun] print_target_wait_results: -1.0.0 [process -1],
    [infrun] print_target_wait_results: status->kind = no-resumed
    [infrun] handle_inferior_event: status->kind = no-resumed
    [infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed)
    [infrun] prepare_to_wait: prepare_to_wait
    [infrun] reset: reason=handling event
    [infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target native, no resumed threads
    [infrun] fetch_inferior_event: exit

The problem is in handle_no_resumed: we check if some other thread is actually resumed, to see if we should ignore that event (see comments in that function for more info). If this condition is true:

    (thread->executing () || thread->has_pending_waitstatus ())

... then we ignore the event. The problem is that there are some non-resumed threads with a pending event, which makes us ignore the event. But these threads are not resumed, so we end up waiting while nothing executes, hence waiting forever.

My first fix was to change the condition to:

    (thread->executing () || (thread->resumed () && thread->has_pending_waitstatus ()))

... but then it occurred to me that we could simply check for:

    (thread->resumed ())

Since "executing" implies "resumed", checking simply for "resumed" covers threads that are resumed and executing, as well as threads that are resumed with a pending status, which is what we want.

Change-Id: Ie796290f8ae7f34c026ca3a8fcef7397414f4780
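The difference between the old and new handle_no_resumed conditions can be checked with a tiny model. This is a Python sketch under stated assumptions, not GDB's implementation: the `Thread` class and the two `ignore_no_resumed_*` helpers are hypothetical names that only mirror the three predicates quoted in the commit message.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Thread:
    resumed: bool = False
    executing: bool = False
    has_pending_waitstatus: bool = False


def ignore_no_resumed_old(threads: List[Thread]) -> bool:
    # Old condition: executing () || has_pending_waitstatus ()
    return any(t.executing or t.has_pending_waitstatus for t in threads)


def ignore_no_resumed_new(threads: List[Thread]) -> bool:
    # New condition: resumed ()
    return any(t.resumed for t in threads)


# scheduler-locking on: the only resumed thread has exited.  The threads
# left over were never resumed but still hold a pending breakpoint stop.
leftover = [
    Thread(has_pending_waitstatus=True),
    Thread(has_pending_waitstatus=True),
]

print("old condition ignores the event:", ignore_no_resumed_old(leftover))
print("new condition ignores the event:", ignore_no_resumed_new(leftover))
```

With the old condition the no-resumed event is ignored even though nothing can run, which corresponds to the hang. With the new one the event is handled. A thread that is resumed with a pending status still causes the event to be ignored under both conditions.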
2021-11-05  Avoid /proc/pid/mem races (PR 28065)  Pedro Alves  1  -6/+8
PR 28065 (gdb.threads/access-mem-running-thread-exit.exp intermittent failure) shows that GDB can hit an unexpected scenario -- it can happen that the kernel manages to open a /proc/PID/task/LWP/mem file, but then reading from the file returns 0/EOF, even though the process hasn't exited or execed. "0" out of read/write is normally what you get when the address space of the process the file was open for is gone, because the process execed or exited. So when GDB gets the 0, it returns memory access failure. In the bad case in question, the process hasn't execed or exited, so GDB fails a memory access when the access should have worked. GDB has code in place to gracefully handle the case of opening the /proc/PID/task/LWP/mem just while the LWP is exiting -- most often the open fails with EACCES or ENOENT. When it happens, GDB just tries opening the file for a different thread of the process. The testcase is written such that it stresses GDB's logic of closing/reopening the /proc/PID/task/LWP/mem file, by constantly spawning short lived threads. However, there's a window where the kernel manages to find the thread, but the thread exits just after and clears its address space pointer. In this case, the kernel creates a file successfully, but the file ends up with no address space associated, so a subsequent read/write returns 0/EOF too, just like if the whole process had execed or exited. This is the case in question that GDB does not handle. Oleg Nesterov gave this suggestion as workaround for that race: gdb can open(/proc/pid/mem) and then read (say) /proc/pid/statm. If statm reports something non-zero, then open() was "successfull". I think that might work. However, I didn't try it, because I realized we have another nasty race that that wouldn't fix. 
The other race is that, because we close/reopen the /proc/PID/task/LWP/mem file when GDB switches to a different inferior, it can happen that GDB reopens /proc/PID/task/LWP/mem just after a thread execs, and before GDB has seen the corresponding exec event. I.e., we can open a /proc/PID/task/LWP/mem file accessing the post-exec address space thinking we're accessing the pre-exec address space.

A few months back, Simon, Oleg and I discussed a similar race:

    [Bug gdb/26754] Race condition when resuming threads and one does an exec
    https://sourceware.org/bugzilla/show_bug.cgi?id=26754

The solution back then was to make the kernel fail any ptrace operation until the exec event is consumed, with this kernel commit:

    commit dbb5afad100a828c97e012c6106566d99f041db6
    Author: Oleg Nesterov <oleg@redhat.com>
    AuthorDate: Wed May 12 15:33:08 2021 +0200
    Commit: Linus Torvalds <torvalds@linux-foundation.org>
    CommitDate: Wed May 12 10:45:22 2021 -0700

        ptrace: make ptrace() fail if the tracee changed its pid unexpectedly

This, however, only applies to ptrace, not to the /proc/pid/mem file opening case. Also, even if it did apply to the file open case, we would want to support current kernels until such a fix is more widespread anyhow.

So all in all, this commit gives up on the idea of only ever keeping one /proc/pid/mem file descriptor open. Instead, make GDB open a /proc/pid/mem per inferior, and keep it open until the inferior exits, is detached or execs. Make GDB open the file right after the inferior is created or is attached to or forks, at which point we know the inferior is stable and stopped and isn't thus going to exec, or have a thread exit, and so the file open won't fail (unless the whole process is SIGKILLed from outside GDB, at which point it doesn't matter whether we open the file).

This way, we avoid both races described above, at the expense of using more file descriptors (one per inferior).
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28065 Change-Id: Iff943b95126d0f98a7973a07e989e4f020c29419
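The "one mem file per inferior" policy described in this commit can be sketched as follows. This is a minimal Python model, not GDB's code: `ProcMemFiles`, `MemoryAccessError`, `open_proc_mem` and `fake_opener` are hypothetical names, and the demo substitutes a temporary file for /proc/PID/mem so that it runs without a traced process.

```python
import os
import tempfile


class MemoryAccessError(Exception):
    """A read returned 0/EOF: the address space behind the file is gone."""


def open_proc_mem(pid):
    # Real opener (Linux only); needs permission to access the process.
    return os.open(f"/proc/{pid}/mem", os.O_RDONLY)


class ProcMemFiles:
    """Keep one memory file descriptor open per inferior."""

    def __init__(self, opener=open_proc_mem):
        self._opener = opener
        self._fds = {}

    def inferior_created(self, pid):
        # Open while the inferior is known to be stopped and stable.
        self._fds[pid] = self._opener(pid)

    def inferior_gone(self, pid):
        # Call on exit or detach.  On exec, call this and then
        # inferior_created again to reach the new address space.
        os.close(self._fds.pop(pid))

    def read(self, pid, addr, length):
        data = os.pread(self._fds[pid], length, addr)
        if not data:
            raise MemoryAccessError(f"cannot access memory at {addr:#x}")
        return data


def fake_opener(pid):
    # Stand-in for /proc/PID/mem, used only to make the sketch runnable.
    fd, path = tempfile.mkstemp()
    os.unlink(path)
    os.write(fd, b"inferior %d memory" % pid)
    return fd


files = ProcMemFiles(opener=fake_opener)
files.inferior_created(1)
files.inferior_created(2)
print(files.read(1, 0, 8))
print(files.read(2, 9, 1))
files.inferior_gone(1)
files.inferior_gone(2)
```

Because each descriptor is opened once, at a point where the inferior cannot exec or lose the thread being opened, later reads never go through the open-time races described above. The cost is one descriptor per inferior.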
2021-10-22  [gdb/testsuite] Fix gdb.threads/linux-dp.exp  Tom de Vries  1  -1/+1
On openSUSE Tumbleweed with glibc-debuginfo installed I get: ... (gdb) PASS: gdb.threads/linux-dp.exp: continue to breakpoint: thread 5's print where^M #0 print_philosopher (n=3, left=33 '!', right=33 '!') at linux-dp.c:105^M #1 0x0000000000401628 in philosopher (data=0x40537c) at linux-dp.c:148^M #2 0x00007ffff7d56b37 in start_thread (arg=<optimized out>) \ at pthread_create.c:435^M #3 0x00007ffff7ddb640 in clone3 () \ at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81^M (gdb) PASS: gdb.threads/linux-dp.exp: first thread-specific breakpoint hit ... while without debuginfo installed I get instead: ... (gdb) PASS: gdb.threads/linux-dp.exp: continue to breakpoint: thread 5's print where^M #0 print_philosopher (n=3, left=33 '!', right=33 '!') at linux-dp.c:105^M #1 0x0000000000401628 in philosopher (data=0x40537c) at linux-dp.c:148^M #2 0x00007ffff7d56b37 in start_thread () from /lib64/libc.so.6^M #3 0x00007ffff7ddb640 in clone3 () from /lib64/libc.so.6^M (gdb) FAIL: gdb.threads/linux-dp.exp: first thread-specific breakpoint hit ... The problem is that the regexp used: ... "\(from .*libpthread\|at pthread_create\|in pthread_create\)" ... expects the 'from' part to match libpthread, but in glibc 2.34 libpthread has been merged into libc. Fix this by updating the regexp. Tested on x86_64-linux.
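The mismatch can be reproduced outside the testsuite with the two frame lines quoted above. The sketch below uses Python's `re` rather than Tcl. `OLD` is the commit's regexp with the Tcl quoting removed, while `NEW` is only one plausible update and an assumption, since the commit message does not show the final pattern.

```python
import re

# The original alternatives, without Tcl string escaping.
OLD = re.compile(r"(from .*libpthread|at pthread_create|in pthread_create)")

# Hypothetical update: also accept frames that resolve to libc.
NEW = re.compile(
    r"(from .*libpthread|from .*libc\.|at pthread_create|in pthread_create)"
)

with_debuginfo = (
    "#2 0x00007ffff7d56b37 in start_thread (arg=<optimized out>) "
    "at pthread_create.c:435"
)
without_debuginfo = (
    "#2 0x00007ffff7d56b37 in start_thread () from /lib64/libc.so.6"
)

for name, pattern in (("old", OLD), ("new", NEW)):
    print(name,
          bool(pattern.search(with_debuginfo)),
          bool(pattern.search(without_debuginfo)))
```

The old pattern only matches the debuginfo variant (through "at pthread_create"). The extra alternative lets the libc.so.6 variant match as well.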
2021-10-07  [gdb/testsuite] Fix gdb.threads/check-libthread-db.exp with glibc 2.34  Tom de Vries  1  -1/+3
When running test-case gdb.threads/check-libthread-db.exp on openSUSE Tumbleweed (with glibc 2.34) I get:
...
(gdb) continue^M
Continuing.^M
[Thread debugging using libthread_db enabled]^M
Using host libthread_db library "/lib64/libthread_db.so.1".^M
Stopped due to shared library event:^M
  Inferior loaded /lib64/libm.so.6^M
    /lib64/libc.so.6^M
(gdb) FAIL: gdb.threads/check-libthread-db.exp: user-initiated check: continue
...

The check expects the inferior to load libpthread, but since glibc 2.34 libpthread has been integrated into glibc, and consequently it's no longer a dependency:
...
$ ldd outputs/gdb.threads/check-libthread-db/check-libthread-db
        linux-vdso.so.1 (0x00007ffe4cae4000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f167c77c000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f167c572000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f167c86e000)
...

Fix this by updating the regexp to expect libpthread or libc.

Tested on x86_64-linux.
2021-09-30  gdb/testsuite: make runto_main not pass no-message to runto  Simon Marchi  46  -53/+0
As follow-up to this discussion:

    https://sourceware.org/pipermail/gdb-patches/2020-August/171385.html

... make runto_main not pass no-message to runto. This means that if we fail to run to main, for some reason, we'll emit a FAIL. This is the behavior we want the majority of (if not all) the time.

Without this, we rely on tests logging a failure themselves if runto_main fails. They do so in a very inconsistent manner, sometimes using "fail", "unsupported" or "untested". The messages also vary widely. This patch removes all these messages as well.

Also, remove a few "fail" where we call runto (and not runto_main). By default (without an explicit no-message argument), runto prints a failure already. In two places, gdb.multi/multi-re-run.exp and gdb.python/py-pp-registration.exp, remove "message" passed to runto. This removes a few PASSes that we don't care about (but FAILs will still be printed if we fail to run to where we want to). This aligns their behavior with the rest of the testsuite.

Change-Id: Ib763c98c5f4fb6898886b635210d7c34bd4b9023
2021-09-27  [gdb/testsuite] Test sw watchpoint in gdb.threads/process-dies-while-detaching.exp  Tom de Vries  1  -11/+30
The test-case gdb.threads/process-dies-while-detaching.exp takes about 20s when using hw watchpoints, but when forcing sw watchpoints (using the patch mentioned in PR28375#c0), the test-case takes 3m14s instead.

Also, it shows a FAIL:
...
(gdb) continue^M
Continuing.^M
Cannot find user-level thread for LWP 10324: generic error^M
(gdb) FAIL: gdb.threads/process-dies-while-detaching.exp: single-process: continue: watchpoint: continue
...
for which PR28375 was filed.

Modify the test-case to:
- add the hw/sw axis to the watchpoint testing, to ensure that we observe the sw watchpoint behaviour also on can-use-hw-watchpoints architectures.
- skip the hw breakpoint testing if not supported
- set the sw watchpoint later to avoid making the test too slow. This still triggers the same PR, but now takes just 24s.

This patch adds a KFAIL for PR28375.

Tested on x86_64-linux.
2021-09-25  [gdb/testsuite] Minimize gdb restarts  Tom de Vries  8  -13/+8
Minimize gdb restarts, applying the following rules: - don't use prepare_for_testing unless necessary - don't use clean_restart unless necessary Also, if possible, replace build_for_executable + clean_restart with prepare_for_testing for brevity. Touches 68 test-cases. Tested on x86_64-linux.
2021-09-23  [gdb/testsuite] Use pie instead of -fPIE -pie  Tom de Vries  1  -1/+1
Replace {additional_flags=-fPIE ldflags=-pie} with {pie}. This makes sure that the test-cases properly error out when using target board unix/-fno-PIE/-no-pie. Tested on x86_64-linux.
2021-09-16  [gdb/testsuite] Fix interrupted sleep in multi-threaded test-cases  Tom de Vries  1  -1/+5
When running test-case gdb.threads/continue-pending-status.exp with native, I have:
...
(gdb) continue^M
Continuing.^M
PASS: gdb.threads/continue-pending-status.exp: attempt 0: continue for ctrl-c
^C^M
Thread 1 "continue-pendin" received signal SIGINT, Interrupt.^M
[Switching to Thread 0x7ffff7fc4740 (LWP 1276)]^M
0x00007ffff758e4c0 in __GI___nanosleep () at nanosleep.c:27^M
27 return SYSCALL_CANCEL (nanosleep, requested_time, remaining);^M
(gdb) PASS: gdb.threads/continue-pending-status.exp: attempt 0: caught interrupt
...
but with target board unix/-m32, I run into:
...
(gdb) continue^M
Continuing.^M
PASS: gdb.threads/continue-pending-status.exp: attempt 0: continue for ctrl-c
[Thread 0xf74aeb40 (LWP 31957) exited]^M
[Thread 0xf7cafb40 (LWP 31956) exited]^M
[Inferior 1 (process 31952) exited normally]^M
(gdb) Quit^M
...

The problem is that the sleep (300) call at the end of main is interrupted, which causes the inferior to exit before the ctrl-c can be sent.

This problem is described at "Interrupted System Calls" in the docs, and the suggested solution (using a sleep loop) indeed fixes the problem.

Fix this instead using the more prevalent:
...
  alarm (300);
  ...
  while (1) sleep (1);
...
which is roughly equivalent because the sleep is called at the end of main, but slightly better because it guards against hangs from the start rather than from the end of main.

Likewise in gdb.base/watch_thread_num.exp.

Likewise in gdb.btrace/enable-running.exp, but use the sleep loop there, because the sleep is not called at the end of main.

Tested on x86_64-linux.
2021-07-13  [gdb/testsuite] Fix check-libthread-db.exp FAILs with glibc 2.33  Tom de Vries  1  -16/+21
When running test-case gdb.threads/check-libthread-db.exp on openSUSE Tumbleweed with glibc 2.33, I get:
...
(gdb) maint check libthread-db^M
Running libthread_db integrity checks:^M
  Got thread 0x7ffff7c79b80 => 9354 => 0x7ffff7c79b80; errno = 0 ... OK^M
libthread_db integrity checks passed.^M
(gdb) FAIL: gdb.threads/check-libthread-db.exp: user-initiated check: \
  libpthread.so not initialized (pattern 2)
...

The test-case expects instead:
...
  Got thread 0x0 => 9354 => 0x0 ... OK^M
...
which is what I get on openSUSE Leap 15.2 with glibc 2.26, and what is described in the test-case like this:
...
    # libthread_db should fake a single thread with th_unique == NULL.
...

Using a breakpoint on check_thread_db_callback we can compare the two scenarios, and find that in the latter case we hit this code in glibc function iterate_thread_list in nptl_db/td_ta_thr_iter.c:
...
  if (next == 0 && fake_empty)
    {
      /* __pthread_initialize_minimal has not run.  There is just the main
         thread to return.  We cannot rely on its thread register.  They
         sometimes contain garbage that would confuse us, left by the
         kernel at exec.  So if it looks like initialization is incomplete,
         we only fake a special descriptor for the initial thread.  */
      td_thrhandle_t th = { ta, 0 };
      return callback (&th, cbdata_p) != 0 ? TD_DBERR : TD_OK;
    }
...
while in the former case we don't because this preceding statement doesn't result in next == 0:
...
  err = DB_GET_FIELD (next, ta, head, list_t, next, 0);
...

Note that the comment mentions __pthread_initialize_minimal, but in both cases it has already run before we hit the callback, so it's possible the comment is no longer accurate.

The change in behaviour bisects to glibc commit 1daccf403b "nptl: Move stack list variables into _rtld_global", which moves the initialization of stack list variables such as __stack_user to an earlier moment, which explains well enough the observed difference.
Fix this by updating the regexp patterns to agree with what libthread-db is telling us. Tested on x86_64-linux, both with glibc 2.33 and 2.26. gdb/testsuite/ChangeLog: 2021-07-07 Tom de Vries <tdevries@suse.de> PR testsuite/27690 * gdb.threads/check-libthread-db.exp: Update patterns for glibc 2.33.
2021-07-01  Linux: Access memory even if threads are running  Pedro Alves  2  -0/+289
Currently, on GNU/Linux, if you try to access memory and you have a running thread selected, GDB fails the memory accesses, like: (gdb) c& Continuing. (gdb) p global_var Cannot access memory at address 0x555555558010 Or: (gdb) b main Breakpoint 2 at 0x55555555524d: file access-mem-running.c, line 59. Warning: Cannot insert breakpoint 2. Cannot access memory at address 0x55555555524d This patch removes this limitation. It teaches the native Linux target to read/write memory even if the target is running. And it does this without temporarily stopping threads. We now get: (gdb) c& Continuing. (gdb) p global_var $1 = 123 (gdb) b main Breakpoint 2 at 0x555555555259: file access-mem-running.c, line 62. (The scenarios above work correctly with current GDBserver, because GDBserver temporarily stops all threads in the process whenever GDB wants to access memory (see prepare_to_access_memory / done_accessing_memory). Freezing the whole process makes sense when we need to be sure that we have a consistent view of memory and don't race with the inferior changing it at the same time as GDB is accessing it. But I think that's a too-heavy hammer for the default behavior. I think that ideally, whether to stop all threads or not should be policy decided by gdb core, probably best implemented by exposing something like gdbserver's prepare_to_access_memory / done_accessing_memory to gdb core.) Currently, if we're accessing (reading/writing) just a few bytes, then the Linux native backend does not try accessing memory via /proc/<pid>/mem and goes straight to ptrace PTRACE_PEEKTEXT/PTRACE_POKETEXT. However, ptrace always fails when the ptracee is running. So the first step is to prefer /proc/<pid>/mem even for small accesses. Without further changes however, that may cause a performance regression, due to constantly opening and closing /proc/<pid>/mem for each memory access. So the next step is to keep the /proc/<pid>/mem file open across memory accesses. 
If we have this, then it doesn't make sense anymore to even have the ptrace fallback, so the patch disables it.

I've made it such that GDB only ever has one /proc/<pid>/mem file open at any time. As long as a memory access hits the same inferior process as the previous access, we reuse the previously open file. If, however, we access memory of a different process, then we close the previous file and open a new one for the new process.

If we wanted, we could keep one /proc/<pid>/mem file open per inferior, and never close them (unless the inferior exits or execs). However, having seen bfd patches recently about hitting too many open file descriptors, I kept the logic to have only one file open tops. Also, we need to handle memory accesses for processes for which we don't have an inferior object, for when we need to detach a fork-child, and we'd probably want to handle caching the open file for that scenario (no inferior for process) too, which would probably end up meaning caching for last non-inferior process, which is very much what I'm proposing anyhow. So always having one file open likely ends up a smaller patch.

The next step is handling the case of GDB reading/writing memory through a thread that is running and exits. The access should not result in a user-visible failure if the inferior/process is still alive.

Once we manage to open a /proc/<lwpid>/mem file, then that file is usable for memory accesses even if the corresponding lwp exits and is reaped. I double checked that trying to open the same /proc/<lwpid>/mem path again fails because the lwp is really gone so there's no /proc/<lwpid>/ entry on the filesystem anymore, but the previously open file remains usable. It's only when the whole process execs that we need to reopen a new file.

When the kernel destroys the whole address space, i.e., when the process exits or execs, the reads/writes fail with 0 aka EOF, in which case there's nothing else to do than returning a memory access failure.
Note this means that when we get an exec event, we need to reopen the file, to access the process's new address space.

If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP we're opening it for exits before we open it and before we reap the LWP (i.e., the LWP is a zombie), the open fails with EACCES. The patch handles this by just looking for another thread until it finds one that we can open a /proc/<pid>/mem successfully for.

If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP we're opening has exited and we already reaped it, which is the case if the selected thread is in THREAD_EXIT state, the open fails with ENOENT. The patch handles this the same way as a zombie race (EACCES), instead of checking upfront whether we're accessing a known-exited thread, because that would result in more complicated code, because we also need to handle accessing lwps that are not listed in the core thread list, and it's the core thread list that records the THREAD_EXIT state.

The patch includes two testcases:

#1 - gdb.base/access-mem-running.exp

This is conceptually the simplest - it is single-threaded, and has GDB read and write memory while the program is running. It also tests setting a breakpoint while the program is running, and checks that the breakpoint is hit immediately.

#2 - gdb.threads/access-mem-running-thread-exit.exp

This one is more elaborate, as it continuously spawns short-lived threads in order to exercise accessing memory just while threads are exiting. It also spawns two different processes and alternates accessing memory between the two processes to exercise reopening the /proc file frequently. This also ends up exercising GDB reading from an exited thread frequently. I confirmed by putting abort() calls in the EACCES/ENOENT paths added by the patch that we do hit all of them frequently with the testcase.
It also exits the process's main thread (i.e., the main thread becomes zombie), to make sure accessing memory in such a corner-case scenario works now and in the future. The tests fail on GNU/Linux native before the code changes, and pass after. They pass against current GDBserver, again because GDBserver supports memory access even if all threads are running, by transparently pausing the whole process. gdb/ChangeLog: yyyy-mm-dd Pedro Alves <pedro@palves.net> PR mi/15729 PR gdb/13463 * linux-nat.c (linux_nat_target::detach): Close the /proc/<pid>/mem file if it was open for this process. (linux_handle_extended_wait) <PTRACE_EVENT_EXEC>: Close the /proc/<pid>/mem file if it was open for this process. (linux_nat_target::mourn_inferior): Close the /proc/<pid>/mem file if it was open for this process. (linux_nat_target::xfer_partial): Adjust. Do not fall back to inf_ptrace_target::xfer_partial for memory accesses. (last_proc_mem_file): New. (maybe_close_proc_mem_file): New. (linux_proc_xfer_memory_partial_pid): New, with bits factored out from linux_proc_xfer_partial. (linux_proc_xfer_partial): Delete. (linux_proc_xfer_memory_partial): New. gdb/testsuite/ChangeLog yyyy-mm-dd Pedro Alves <pedro@palves.net> PR mi/15729 PR gdb/13463 * gdb.base/access-mem-running.c: New. * gdb.base/access-mem-running.exp: New. * gdb.threads/access-mem-running-thread-exit.c: New. * gdb.threads/access-mem-running-thread-exit.exp: New. Change-Id: Ib3c082528872662a3fc0ca9b31c34d4876c874c9
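The caching policy described in this commit message (one /proc/<pid>/mem file open at a time, reopened when the target process changes, with a fallback to another thread when the open fails with EACCES or ENOENT) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in linux-nat.c: the class name proc_mem_cache, its members, and the idea of passing in a list of candidate LWP ids are all hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical single-entry cache for /proc/<pid>/mem.  At most one file
// descriptor is open at any time.  Not GDB's actual implementation.
class proc_mem_cache
{
public:
  proc_mem_cache () = default;
  proc_mem_cache (const proc_mem_cache &) = delete;
  proc_mem_cache &operator= (const proc_mem_cache &) = delete;
  ~proc_mem_cache () { close_file (); }

  // Read LEN bytes at ADDR from process PID.  LWPS lists thread ids of PID
  // to try when a file has to be (re)opened.  Returns the byte count, 0 on
  // EOF (the address space is gone), or -1 on error.
  ssize_t read (pid_t pid, const std::vector<pid_t> &lwps,
                std::uintptr_t addr, void *buf, std::size_t len)
  {
    assert (buf != nullptr);
    if (m_fd == -1 || m_pid != pid)
      {
        // Different process than the previous access: drop the old file.
        close_file ();
        m_fd = open_any (lwps);
        if (m_fd == -1)
          return -1;
        m_pid = pid;
        ++m_open_count;
      }
    return ::pread (m_fd, buf, len, static_cast<off_t> (addr));
  }

  // To be called on exec, exit or detach so the next access reopens.
  void close_file ()
  {
    if (m_fd != -1)
      ::close (m_fd);
    m_fd = -1;
    m_pid = -1;
  }

  int open_count () const { return m_open_count; }

private:
  // Try each LWP in turn.  A zombie LWP yields EACCES and a reaped one
  // yields ENOENT; in both cases simply move on to the next candidate.
  static int open_any (const std::vector<pid_t> &lwps)
  {
    for (pid_t lwp : lwps)
      {
        std::string path = "/proc/" + std::to_string (lwp) + "/mem";
        int fd = ::open (path.c_str (), O_RDONLY | O_CLOEXEC);
        if (fd != -1)
          return fd;
        if (errno != EACCES && errno != ENOENT)
          break;
      }
    return -1;
  }

  int m_fd = -1;
  pid_t m_pid = -1;
  int m_open_count = 0;
};
```

A caller would keep one such object around and pass the current thread list on every access; repeated reads from the same process then reuse the same descriptor, which is the property the commit message relies on to survive exiting threads.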
2021-06-08[gdb/testsuite] Fix gdb.threads/multi-create-ns-info-thr.expTom de Vries1-1/+1
With a testsuite setup modified to make expect wait a little bit longer for gdb output (see PR27957), I reliably run into: ... PASS: gdb.threads/multi-create-ns-info-thr.exp: continue to breakpoint 1 FAIL: gdb.threads/multi-create-ns-info-thr.exp: continue to breakpoint 2 \ (timeout) ... This is due to this regexp: ... -re "Breakpoint $decimal,.*$srcfile:$bp_location1" { ... consuming several lines using the ".*" part, while it's intended to match one line looking like this: ... Thread 1 "multi-create-ns" hit Breakpoint 2, create_function () \ at multi-create.c:45^M ... Fix this by limiting the regexp to one line. Tested on x86_64-linux. gdb/testsuite/ChangeLog: 2021-06-08 Tom de Vries <tdevries@suse.de> * gdb.threads/multi-create-ns-info-thr.exp: Limit breakpoint regexp to one line.
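The effect of ".*" spanning several lines can be reproduced outside the testsuite. The sketch below uses POSIX extended regexps from C++ as a stand-in for the Tcl regexps used by expect (an assumption made for illustration; the two engines differ in details, but in both "." matches a newline by default). The helper match_length and the two pattern constants are hypothetical names.

```cpp
#include <cassert>
#include <string>

#include <regex.h>

// Length of the leftmost match of the POSIX extended regexp PATTERN in
// TEXT, or -1 if the pattern does not compile or does not match.
static long match_length (const char *pattern, const std::string &text)
{
  assert (pattern != nullptr);
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;
  regmatch_t m;
  long len = -1;
  if (regexec (&re, text.c_str (), 1, &m, 0) == 0)
    len = static_cast<long> (m.rm_eo - m.rm_so);
  regfree (&re);
  return len;
}

// ".*" may run across line ends and swallow later output.
static const char greedy_re[]
  = "Breakpoint [0-9]+,.*multi-create\\.c:45";

// "[^\r\n]*" stops at the end of the line, so at most one line matches.
static const char one_line_re[]
  = "Breakpoint [0-9]+,[^\r\n]*multi-create\\.c:45";
```

With two breakpoint stops in the buffer, greedy_re consumes everything up to the last "multi-create.c:45", while one_line_re consumes only the first stop and leaves the second one for the next test.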
2021-06-02Fix threadapply testCarl Love1-7/+5
The current test case leaves detached processes running at the end of the test. This patch changes the test to use a barrier wait to ensure all processes exit cleanly at the end of the tests. gdb/testsuite/ChangeLog: 2021-06-02 Carl Love <cel@us.ibm.com> * gdb.threads/threadapply.c: Add global mybarrier. (main): Add pthread_barrier_init. (thread_function): Replace while loop with myp increment and pthread_barrier_wait.
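A minimal sketch of the barrier pattern the ChangeLog entry describes: each thread bumps its own counter and then waits at a shared barrier instead of spinning forever, so every thread can terminate. Only the names mybarrier, thread_function and myp come from the ChangeLog; the thread count, the counters array and run_threads are assumptions made for this example, and the real threadapply.c may differ.

```cpp
#include <cassert>

#include <pthread.h>

constexpr int NUM_THREADS = 3;

static pthread_barrier_t mybarrier;
static int counters[NUM_THREADS];

// Increment this thread's counter once, then wait for everyone else
// rather than looping forever.
static void *thread_function (void *arg)
{
  int *myp = static_cast<int *> (arg);
  (*myp)++;
  pthread_barrier_wait (&mybarrier);
  return nullptr;
}

// Start the threads, meet them at the barrier, join them, and return the
// sum of the counters.
static int run_threads ()
{
  pthread_t threads[NUM_THREADS];

  // NUM_THREADS workers plus the calling thread take part in the barrier.
  int rc = pthread_barrier_init (&mybarrier, nullptr, NUM_THREADS + 1);
  assert (rc == 0);

  for (int i = 0; i < NUM_THREADS; i++)
    {
      counters[i] = 0;
      rc = pthread_create (&threads[i], nullptr, thread_function,
                           &counters[i]);
      assert (rc == 0);
    }
  (void) rc;

  pthread_barrier_wait (&mybarrier);

  int total = 0;
  for (int i = 0; i < NUM_THREADS; i++)
    {
      pthread_join (threads[i], nullptr);
      total += counters[i];
    }
  pthread_barrier_destroy (&mybarrier);
  return total;
}
```

Because no thread is left in an endless loop, nothing keeps running after the debugger detaches or the test ends.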
2021-05-05[gdb/testsuite] Fix timeout in gdb.threads/detach-step-over.exp with readnowTom de Vries1-17/+32
When running test-case gdb.threads/detach-step-over.exp with target board readnow, I run into: ... Reading symbols from /lib64/libc.so.6...^M Reading symbols from \ /usr/lib/debug/lib64/libc-2.26.so-2.26-lp152.26.6.1.x86_64.debug...^M Expanding full symbols from \ /usr/lib/debug/lib64/libc-2.26.so-2.26-lp152.26.6.1.x86_64.debug...^M FAIL: gdb.threads/detach-step-over.exp: \ breakpoint-condition-evaluation=host: target-non-stop=on: non-stop=on: \ displaced=off: iter 2: attach (timeout) ... Fix this by doing exp_continue when encountering the "Reading symbols" or "Expanding full symbols" lines. This is still fragile and times out under higher load, simulated for instance by stress -c 5. Fix that by using a timeout factor of 2. Tested on x86_64-linux. gdb/testsuite/ChangeLog: 2021-05-05 Tom de Vries <tdevries@suse.de> * gdb.threads/detach-step-over.exp: Do exp_continue when encountering "Reading symbols" or "Expanding full symbols" lines. Use a timeout factor of 2 for attach.
2021-05-05[gdb/testsuite] Fix gdb.threads/fork-plus-threads.exp with readnowTom de Vries1-2/+2
When running test-case gdb.threads/fork-plus-threads.exp with target board readnow, I run into: ... [LWP 9362 exited]^M [New LWP 9365]^M [New LWP 9363]^M [New LWP 9364]^M FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \ inferior 1 exited (timeout) ... There is code in the test-case to prevent timeouts with readnow: ... -re "Thread \[^\r\n\]+ exited" { # Avoid timeout with check-read1 exp_continue } -re "New Thread \[^\r\n\]+" { # Avoid timeout with check-read1 exp_continue } ... but this doesn't trigger because we get LWP rather than Thread. Fix this by making these regexps accept LWP as well. Tested on x86_64-linux. gdb/testsuite/ChangeLog: 2021-05-05 Tom de Vries <tdevries@suse.de> * gdb.threads/fork-plus-threads.exp: Handle "New LWP <n>" and "LWP <n> exited" messages.
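As an illustration of the widened patterns, the following sketch accepts both the "Thread" and the "LWP" spelling of thread creation and exit notifications. It uses std::regex rather than Tcl, and the function names are hypothetical; the exact regexps in fork-plus-threads.exp may be written differently.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True if LINE is a thread creation or exit notification that the test
// should skip over (the equivalent of exp_continue), whichever of the two
// spellings GDB printed.
static bool is_thread_noise (const std::string &line)
{
  static const std::regex created ("New (Thread|LWP) [^\r\n]+");
  static const std::regex exited ("(Thread|LWP) [^\r\n]+ exited");
  return std::regex_search (line, created)
         || std::regex_search (line, exited);
}

// Usage example on lines taken from the logs quoted above.
static void check_thread_noise_examples ()
{
  assert (is_thread_noise ("[New LWP 9365]"));
  assert (is_thread_noise ("[LWP 9362 exited]"));
  assert (is_thread_noise ("[New Thread 0x7ffff7318700 (LWP 31125)]"));
  assert (!is_thread_noise ("(gdb) "));
}
```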
2021-04-15gdb/testsuite: use foreach_with_prefix in gdb.threads/fork-plus-threads.expSimon Marchi1-6/+4
I noticed that using foreach_with_prefix could make things a bit less verbose. No changes in behavior expected. gdb/testsuite/ChangeLog: * gdb.threads/fork-plus-threads.exp: Use foreach_with_prefix. Change-Id: I06aa6e3d10a9cfb6ada11547aefe8c70b636ac81
2021-04-06[gdb/testsuite] Fix xfail handling in gdb.threads/gcore-thread.expTom de Vries1-4/+16
When running test-case gdb.threads/gcore-thread.exp on openSUSE Tumbleweed, I run into these XFAILs: ... XFAIL: gdb.threads/gcore-thread.exp: clear __stack_user.next XFAIL: gdb.threads/gcore-thread.exp: clear stack_used.next ... Apart from the xfail, the test-case also sets core0file to "": ... -re "No symbol \"${symbol}\" in current context\\.\r\n$gdb_prompt $" { xfail $test # Do not do the verification. set core0file "" } ... After which we run into this FAIL, because gdb_core_cmd fails to load a core file called "": ... (gdb) core ^M No core file now.^M (gdb) FAIL: gdb.threads/gcore-thread.exp: core0file: \ re-load generated corefile ... Fix this FAIL by skipping gdb_core_cmd if the core file is "". Tested on x86_64-linux. gdb/testsuite/ChangeLog: 2021-04-06 Tom de Vries <tdevries@suse.de> PR testsuite/27691 * gdb.threads/gcore-thread.exp: Don't call gdb_core_cmd with core file "".
2021-03-16gdb/testsuite: squash duplicate test names in gdb.threads/*.expAndrew Burgess4-43/+48
Resolve all of the duplicate test names in the gdb.threads/*.exp set of tests (that I see). Nothing very exciting here, mostly either giving tests explicit testnames, or adding with_test_prefix. The only interesting one is gdb.threads/execl.exp; I believe the duplicate test name was caused by an actual duplicate test. I've removed the simpler form of the test. I don't believe we've lost any test coverage with this change. gdb/testsuite/ChangeLog: * gdb.threads/execl.exp: Remove duplicate 'info threads' test. Make use of $gdb_test_name instead of creating a separate $test variable. * gdb.threads/print-threads.exp: Add a with_test_prefix instead of adding a '($name)' at the end of each test. This also catches the one place where '($name)' was missing, and so caused a duplicate test name. * gdb.threads/queue-signal.exp: Give tests unique names to avoid duplicate test names based on the command being tested. * gdb.threads/signal-command-multiple-signals-pending.exp: Likewise. * lib/gdb.exp (gdb_compile_shlib_pthreads): Tweak test name to avoid duplicate testnames when a test script uses this proc and also gdb_compile_pthreads. * lib/prelink-support.exp (build_executable_own_libs): Use with_test_prefix to avoid duplicate test names when we call build_executable twice.
2021-02-09[testsuite] Don't use 'testfile' before 'standard_testfile'.Hafiz Abid Qadeer4-8/+8
While running tests on arm-none-eabi, I noticed the following errors in some gdb.threads tests. ERROR: can't read "testfile": no such variable These were being caused by ${testfile} being used before 'standard_testfile', which sets it. This patch just moves standard_testfile before the use. 2021-02-09 Abid Qadeer <abidh@codesourcery.com> gdb/testsuite/ChangeLog: * gdb.threads/signal-command-handle-nopass.exp: Call 'standard_testfile' before using 'testfile'. * gdb.threads/signal-command-multiple-signals-pending.exp: Likewise. * gdb.threads/signal-delivered-right-thread.exp: Likewise. * gdb.threads/signal-sigtrap.exp: Likewise.
2021-02-03Testcase for detaching while stepping over breakpointPedro Alves2-0/+402
This adds a testcase that exercises detaching while GDB is stepping over a breakpoint, in all combinations of: - maint target non-stop off/on - set non-stop on/off - displaced stepping on/off This exercises the bugs fixed in the previous 8 patches. gdb/testsuite/ChangeLog: * gdb.threads/detach-step-over.c: New file. * gdb.threads/detach-step-over.exp: New file.
2021-02-03Testcase for attaching in non-stop modePedro Alves2-0/+206
This adds a testcase exercising attaching to a multi-threaded process, in all combinations of: - set non-stop on/off - maint target non-stop off/on - "attach" vs "attach &" This exercises the bugs fixed in the two previous patches. gdb/testsuite/ChangeLog: * gdb.threads/attach-non-stop.c: New file. * gdb.threads/attach-non-stop.exp: New file.
2021-01-26[gdb/testsuite] Fix gdb.threads/killed-outside.exp with -m32Tom de Vries1-0/+3
When running test-case gdb.threads/killed-outside.exp with target board unix/-m32, we run into: ... (gdb) PASS: gdb.threads/killed-outside.exp: get pid of inferior Executing on target: kill -9 10969 (timeout = 300) spawn -ignore SIGHUP kill -9 10969^M continue^M Continuing.^M [Thread 0xf7cb4b40 (LWP 10973) exited]^M ^M Program terminated with signal SIGKILL, Killed.^M The program no longer exists.^M (gdb) FAIL: gdb.threads/killed-outside.exp: prompt after first continue ... Fix this by allowing this output. Tested on x86_64-linux. gdb/testsuite/ChangeLog: 2021-01-26 Tom de Vries <tdevries@suse.de> * gdb.threads/killed-outside.exp: Allow regular output.
2021-01-06gdb/testsuite: fix race in gdb.threads/signal-while-stepping-over-bp-other-thread.expSimon Marchi1-1/+1
Commit 3ec3145c5dd6 ("gdb: introduce scoped debug prints") updated some tests using "set debug infrun" to handle the fact that a debug print is now shown after the prompt, after an inferior stop. The same issue happens in gdb.threads/signal-while-stepping-over-bp-other-thread.exp. If I run it in a loop, it eventually fails like these other tests. The problem is that the testsuite expects to see $gdb_prompt followed by the end of the buffer. It happens that expect reads $gdb_prompt and the debug print at the same time, in which case the regexp never matches and we get a timeout. The fix is the same as was done in 3ec3145c5dd6, make the testsuite believe that the prompt is the standard GDB prompt followed by that debug print. Since that test uses gdb_test_sequence, and the expected prompt is in gdb_test_sequence, add a -prompt switch to gdb_test_sequence to override the prompt used for that call. gdb/testsuite/ChangeLog: * lib/gdb.exp (gdb_test_sequence): Accept -prompt switch. * gdb.threads/signal-while-stepping-over-bp-other-thread.exp: Pass prompt containing debug print to gdb_test_sequence. Change-Id: I33161c53ddab45cdfeadfd50b964f8dc3caa9729
2021-01-04gdb: introduce scoped debug printsSimon Marchi3-9/+19
I spent a lot of time reading infrun debug logs recently, and I think they could be made much more readable by being indented, to clearly see what operation is done as part of what other operation. In the current format, there are no visual cues to tell where things start and end, it's just a big flat list. It's also difficult to understand what caused a given operation (e.g. a call to resume_1) to be done. To help with this, I propose to add the new scoped_debug_start_end structure, along with a bunch of macros to make it convenient to use. The idea of scoped_debug_start_end is simply to print a start and end message at construction and destruction. It also increments/decrements a depth counter in order to make debug statements printed during this range use some indentation. Some care is taken to handle the fact that debug can be turned on or off in the middle of such a range. For example, a "set debug foo 1" command in a breakpoint command, or a superior GDB manually changing the debug_foo variable. Two macros are added in gdbsupport/common-debug.h, which are helpers to define module-specific macros: - scoped_debug_start_end: takes a message that is printed both at construction / destruction, with "start: " and "end: " prefixes. - scoped_debug_enter_exit: prints hard-coded "enter" and "exit" messages, to denote the entry and exit of a function. I added some examples in the infrun module to give an idea of how it can be used and what the result looks like. The macros are in capital letters (INFRUN_SCOPED_DEBUG_START_END and INFRUN_SCOPED_DEBUG_ENTER_EXIT) to mimic the existing SCOPE_EXIT, but that can be changed if you prefer something else. 
Here's an excerpt of the debug statements printed when doing "continue", where a displaced step is started:

[infrun] proceed: enter
[infrun] proceed: addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT
[infrun] global_thread_step_over_chain_enqueue: enqueueing thread Thread 0x7ffff75a5640 (LWP 2289301) in global step over chain
[infrun] start_step_over: enter
[infrun] start_step_over: stealing global queue of threads to step, length = 1
[infrun] start_step_over: resuming [Thread 0x7ffff75a5640 (LWP 2289301)] for step-over
[infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [Thread 0x7ffff75a5640 (LWP 2289301)] at 0x5555555551bd
[displaced] displaced_step_prepare_throw: displaced-stepping Thread 0x7ffff75a5640 (LWP 2289301) now
[displaced] prepare: selected buffer at 0x5555555550c2
[displaced] prepare: saved 0x5555555550c2: 1e fa 31 ed 49 89 d1 5e 48 89 e2 48 83 e4 f0 50
[displaced] amd64_displaced_step_copy_insn: copy 0x5555555551bd->0x5555555550c2: c7 45 fc 00 00 00 00 eb 13 8b 05 d4 2e 00 00 83
[displaced] displaced_step_prepare_throw: prepared successfully thread=Thread 0x7ffff75a5640 (LWP 2289301), original_pc=0x5555555551bd, displaced_pc=0x5555555550c2
[displaced] resume_1: run 0x5555555550c2: c7 45 fc 00
[infrun] infrun_async: enable=1
[infrun] prepare_to_wait: prepare_to_wait
[infrun] start_step_over: [Thread 0x7ffff75a5640 (LWP 2289301)] was resumed.
[infrun] operator(): step-over queue now empty
[infrun] start_step_over: exit
[infrun] proceed: start: resuming threads, all-stop-on-top-of-non-stop
[infrun] proceed: resuming Thread 0x7ffff7da7740 (LWP 2289296)
[infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [Thread 0x7ffff7da7740 (LWP 2289296)] at 0x7ffff7f7d9b7
[infrun] prepare_to_wait: prepare_to_wait
[infrun] proceed: resuming Thread 0x7ffff7da6640 (LWP 2289300)
[infrun] resume_1: thread Thread 0x7ffff7da6640 (LWP 2289300) has pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP (currently_stepping=0).
[infrun] prepare_to_wait: prepare_to_wait
[infrun] proceed: [Thread 0x7ffff75a5640 (LWP 2289301)] resumed
[infrun] proceed: resuming Thread 0x7ffff6da4640 (LWP 2289302)
[infrun] resume_1: thread Thread 0x7ffff6da4640 (LWP 2289302) has pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP (currently_stepping=0).
[infrun] prepare_to_wait: prepare_to_wait
[infrun] proceed: end: resuming threads, all-stop-on-top-of-non-stop
[infrun] proceed: exit

We can easily see where the call to `proceed` starts and ends. We can also see why there are a bunch of resume_1 calls: it's because we are resuming threads, emulating all-stop on top of a non-stop target. We also see that debug statements nest well with other modules that have been migrated to use the "new" debug statement helpers (because they all use debug_prefixed_vprintf in the end). I think this is desirable, for example we could see the debug statements about reading the DWARF info of a library nested under the debug statements about loading that library. Of course, modules that haven't been migrated to use the "new" helpers will still print without indentations. This will be one good reason to migrate them. I think the runtime cost (when debug statements are disabled) of this is reasonable, given the improvement in readability.
There is the cost of the conditionals (like standard debug statements), one more condition (if (m_must_decrement_print_depth)) and the cost of constructing a stack object, which means copying a few pointers. Adding the print in fetch_inferior_event breaks some tests that use "set debug infrun", because it prints a debug statement after the prompt. I adapted these tests to cope with it, by using the "-prompt" switch of gdb_test_multiple to act as if this debug statement is part of the expected prompt. It's unfortunate that we have to do this, but I think the debug print is useful, and I don't want a few tests to get in the way of adding good debug output. gdbsupport/ChangeLog: * common-debug.h (debug_print_depth): New. (struct scoped_debug_start_end): New. (scoped_debug_start_end): New. (scoped_debug_enter_exit): New. * common-debug.cc (debug_prefixed_vprintf): Print indentation. gdb/ChangeLog: * debug.c (debug_print_depth): New. * infrun.h (INFRUN_SCOPED_DEBUG_START_END): New. (INFRUN_SCOPED_DEBUG_ENTER_EXIT): New. * infrun.c (start_step_over): Use INFRUN_SCOPED_DEBUG_ENTER_EXIT. (proceed): Use INFRUN_SCOPED_DEBUG_ENTER_EXIT and INFRUN_SCOPED_DEBUG_START_END. (fetch_inferior_event): Use INFRUN_SCOPED_DEBUG_ENTER_EXIT. gdbserver/ChangeLog: * debug.cc (debug_print_depth): New. gdb/testsuite/ChangeLog: * gdb.base/ui-redirect.exp: Expect infrun debug print after prompt. * gdb.threads/ia64-sigill.exp: Likewise. * gdb.threads/watchthreads-reorder.exp: Likewise. Change-Id: I7c3805e6487807aa63a1bae318876a0c69dce949
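The scoped start/end idea described in this commit message can be illustrated with a small RAII class. This is a sketch under assumptions, not the code in gdbsupport/common-debug.h: the names scoped_enter_exit_sketch, infrun_debug_line, debug_lines and proceed_sketch are hypothetical, output is collected in a vector instead of being printed, and the two-space indentation placed before the module tag is a guess at the layout. Only debug_print_depth and m_must_decrement_print_depth are names taken from the text above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for GDB's debug state and output channel.
static int debug_print_depth = 0;
static bool debug_infrun = true;
static std::vector<std::string> debug_lines;

// Record one debug statement, indented according to the current depth.
static void infrun_debug_line (const std::string &func,
                               const std::string &msg)
{
  if (debug_infrun)
    debug_lines.push_back (std::string (debug_print_depth * 2, ' ')
                           + "[infrun] " + func + ": " + msg);
}

// Emits "enter" on construction and "exit" on destruction, and indents
// everything printed in between.
class scoped_enter_exit_sketch
{
public:
  explicit scoped_enter_exit_sketch (std::string func)
    : m_func (std::move (func))
  {
    if (debug_infrun)
      {
        infrun_debug_line (m_func, "enter");
        ++debug_print_depth;
        m_must_decrement_print_depth = true;
      }
  }

  ~scoped_enter_exit_sketch ()
  {
    // Only undo the increment if the constructor performed it; debug
    // output may have been switched on or off in the middle of the scope.
    if (m_must_decrement_print_depth)
      {
        assert (debug_print_depth > 0);
        --debug_print_depth;
      }
    infrun_debug_line (m_func, "exit");
  }

  scoped_enter_exit_sketch (const scoped_enter_exit_sketch &) = delete;
  scoped_enter_exit_sketch &operator= (const scoped_enter_exit_sketch &)
    = delete;

private:
  std::string m_func;
  bool m_must_decrement_print_depth = false;
};

// Usage example: everything logged inside the scope is nested.
static void proceed_sketch ()
{
  scoped_enter_exit_sketch scope ("proceed");
  infrun_debug_line ("proceed", "resuming threads");
}
```

The separate flag is what keeps the depth counter balanced when debugging is toggled while a scope is active, which is the corner case the commit message mentions.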
2021-01-01Update copyright year range in all GDB filesJoel Brobecker204-204/+204
This commits the result of running gdb/copyright.py as per our Start of New Year procedure... gdb/ChangeLog Update copyright year range in copyright header of all GDB files.
2020-12-04gdb: move displaced stepping logic to gdbarch, allow starting concurrent displaced stepsSimon Marchi2-2/+2
Today, GDB only allows a single displaced stepping operation to happen per inferior at a time. There is a single displaced stepping buffer per inferior, whose address is fixed (obtained with gdbarch_displaced_step_location), managed by infrun.c. In the case of the AMD ROCm target [1] (in the context of which this work has been done), it is typical to have thousands of threads (or waves, in SMT terminology) executing the same code, hitting the same breakpoint (possibly conditional) and needing to displaced step it at the same time. The limitation of only one displaced step executing at any given time becomes a real bottleneck. To fix this bottleneck, we want to make it possible for threads of the same inferior to execute multiple displaced steps in parallel. This patch builds the foundation for that. In essence, this patch moves the task of preparing a displaced step and cleaning up after to gdbarch functions. This allows using different schemes for allocating and managing displaced stepping buffers for different platforms. The gdbarch decides how to assign a buffer to a thread that needs to execute a displaced step. On the ROCm target, we are able to allocate one displaced stepping buffer per thread, so a thread will never have to wait to execute a displaced step. On Linux, the entry point of the executable is used as the displaced stepping buffer, since we assume that this code won't get used after startup. From what I saw (I checked with a binary generated against glibc and musl), on AMD64 we have enough space there to fit two displaced stepping buffers. A subsequent patch makes AMD64/Linux use two buffers. In addition to having multiple displaced stepping buffers, there is also the idea of sharing displaced stepping buffers between threads. Two threads doing displaced steps for the same PC could use the same buffer at the same time.
Two threads stepping over the same instruction (same opcode) at two different PCs may also be able to share a displaced stepping buffer. This is an idea for future patches, but the architecture built by this patch is made to allow this. Now, the implementation details. The main part of this patch is moving the responsibility of preparing and finishing a displaced step to the gdbarch. Before this patch, preparing a displaced step is driven by the displaced_step_prepare_throw function. It does some calls to the gdbarch to do some low-level operations, but the high-level logic is there. The steps are roughly: - Ask the gdbarch for the displaced step buffer location - Save the existing bytes in the displaced step buffer - Ask the gdbarch to copy the instruction into the displaced step buffer - Set the pc of the thread to the beginning of the displaced step buffer Similarly, the "fixup" phase, executed after the instruction was successfully single-stepped, is driven by the infrun code (function displaced_step_finish). The steps are roughly: - Restore the original bytes in the displaced stepping buffer - Ask the gdbarch to fixup the instruction result (adjust the target's registers or memory to do as if the instruction had been executed in its original location) The displaced_step_inferior_state::step_thread field indicates which thread (if any) is currently using the displaced stepping buffer, so it is used by displaced_step_prepare_throw to check if the displaced stepping buffer is free to use or not. This patch defers the whole task of preparing and cleaning up after a displaced step to the gdbarch. Two new main gdbarch methods are added, with the following semantics: - gdbarch_displaced_step_prepare: Prepare for the given thread to execute a displaced step of the instruction located at its current PC. 
Upon return, everything should be ready for GDB to resume the thread (with either a single step or continue, as indicated by gdbarch_displaced_step_hw_singlestep) to make it displaced step the instruction. - gdbarch_displaced_step_finish: Called when the thread stopped after having started a displaced step. Verify if the instruction was executed, if so apply any fixup required to compensate for the fact that the instruction was executed at a different place than its original pc. Release any resources that were allocated for this displaced step. Upon return, everything should be ready for GDB to resume the thread in its "normal" code path. The displaced_step_prepare_throw function now pretty much just offloads to gdbarch_displaced_step_prepare and the displaced_step_finish function offloads to gdbarch_displaced_step_finish. The gdbarch_displaced_step_location method is now unnecessary, so is removed. Indeed, the core of GDB doesn't know how many displaced step buffers there are nor where they are. To keep the existing behavior for existing architectures, the logic that was previously implemented in infrun.c for preparing and finishing a displaced step is moved to displaced-stepping.c, to the displaced_step_buffer class. Architectures are modified to implement the new gdbarch methods using this class. The behavior is not expected to change. The other important change (which arises from the above) is that the core of GDB no longer prevents concurrent displaced steps. Before this patch, start_step_over walks the global step over chain and tries to initiate a step over (whether it is in-line or displaced). It follows these rules: - if an in-line step is in progress (in any inferior), don't start any other step over - if a displaced step is in progress for an inferior, don't start another displaced step for that inferior After starting a displaced step for a given inferior, it won't start another displaced step for that inferior. 
In the new code, start_step_over simply tries to initiate step overs for all the threads in the list. But because threads may be added back to the global list as it iterates the global list, trying to initiate step overs, start_step_over now starts by stealing the global queue into a local queue and iterates on the local queue. In the typical case, each thread will either:

- have initiated a displaced step and be resumed
- have been added back to the global step over queue by displaced_step_prepare_throw, because the gdbarch will have returned that there aren't enough resources (i.e. buffers) to initiate a displaced step for that thread

Lastly, if start_step_over initiates an in-line step, it stops iterating, and moves back whatever remaining threads it had in its local step over queue to the global step over queue.

Two other gdbarch methods are added, to handle some slightly annoying corner cases. They feel awkwardly specific to these cases, but I don't see any way around them:

- gdbarch_displaced_step_copy_insn_closure_by_addr: in arm_pc_is_thumb, arm-tdep.c wants to get the closure for a given buffer address.
- gdbarch_displaced_step_restore_all_in_ptid: when a process forks (at least on Linux), the address space is copied. If some displaced step buffers were in use at the time of the fork, we need to restore the original bytes in the child's address space.

These two adjustments are also made in infrun.c:

- prepare_for_detach: there may be multiple threads doing displaced steps when we detach, so wait until all of them are done
- handle_inferior_event: when we handle a fork event for a given thread, it's possible that other threads are doing a displaced step at the same time. Make sure to restore the displaced step buffer contents in the child for them.

[1] https://github.com/ROCm-Developer-Tools/ROCgdb

gdb/ChangeLog: * displaced-stepping.h (struct displaced_step_copy_insn_closure): Adjust comments.
(struct displaced_step_inferior_state) <step_thread, step_gdbarch, step_closure, step_original, step_copy, step_saved_copy>: Remove fields. (struct displaced_step_thread_state): New. (struct displaced_step_buffer): New. * displaced-stepping.c (displaced_step_buffer::prepare): New. (write_memory_ptid): Move from infrun.c. (displaced_step_instruction_executed_successfully): New, factored out of displaced_step_finish. (displaced_step_buffer::finish): New. (displaced_step_buffer::copy_insn_closure_by_addr): New. (displaced_step_buffer::restore_in_ptid): New. * gdbarch.sh (displaced_step_location): Remove. (displaced_step_prepare, displaced_step_finish, displaced_step_copy_insn_closure_by_addr, displaced_step_restore_all_in_ptid): New. * gdbarch.c: Re-generate. * gdbarch.h: Re-generate. * gdbthread.h (class thread_info) <displaced_step_state>: New field. (thread_step_over_chain_remove): New declaration. (thread_step_over_chain_next): New declaration. (thread_step_over_chain_length): New declaration. * thread.c (thread_step_over_chain_remove): Make non-static. (thread_step_over_chain_next): New. (global_thread_step_over_chain_next): Use thread_step_over_chain_next. (thread_step_over_chain_length): New. (global_thread_step_over_chain_enqueue): Add debug print. (global_thread_step_over_chain_remove): Add debug print. * infrun.h (get_displaced_step_copy_insn_closure_by_addr): Remove. * infrun.c (get_displaced_stepping_state): New. (displaced_step_in_progress_any_inferior): Remove. (displaced_step_in_progress_thread): Adjust. (displaced_step_in_progress): Adjust. (displaced_step_in_progress_any_thread): New. (get_displaced_step_copy_insn_closure_by_addr): Remove. (gdbarch_supports_displaced_stepping): Use gdbarch_displaced_step_prepare_p. (displaced_step_reset): Change parameter from inferior to thread. (displaced_step_prepare_throw): Implement using gdbarch_displaced_step_prepare. (write_memory_ptid): Move to displaced-step.c. (displaced_step_restore): Remove. 
(displaced_step_finish): Implement using gdbarch_displaced_step_finish. (start_step_over): Allow starting more than one displaced step. (prepare_for_detach): Handle possibly multiple threads doing displaced steps. (handle_inferior_event): Handle possibility that fork event happens while another thread displaced steps. * linux-tdep.h (linux_displaced_step_prepare): New. (linux_displaced_step_finish): New. (linux_displaced_step_copy_insn_closure_by_addr): New. (linux_displaced_step_restore_all_in_ptid): New. (linux_init_abi): Add supports_displaced_step parameter. * linux-tdep.c (struct linux_info) <disp_step_buf>: New field. (linux_displaced_step_prepare): New. (linux_displaced_step_finish): New. (linux_displaced_step_copy_insn_closure_by_addr): New. (linux_displaced_step_restore_all_in_ptid): New. (linux_init_abi): Add supports_displaced_step parameter, register displaced step methods if true. (_initialize_linux_tdep): Register inferior_execd observer. * amd64-linux-tdep.c (amd64_linux_init_abi_common): Add supports_displaced_step parameter, adjust call to linux_init_abi. Remove call to set_gdbarch_displaced_step_location. (amd64_linux_init_abi): Adjust call to amd64_linux_init_abi_common. (amd64_x32_linux_init_abi): Likewise. * aarch64-linux-tdep.c (aarch64_linux_init_abi): Adjust call to linux_init_abi. Remove call to set_gdbarch_displaced_step_location. * arm-linux-tdep.c (arm_linux_init_abi): Likewise. * i386-linux-tdep.c (i386_linux_init_abi): Likewise. * alpha-linux-tdep.c (alpha_linux_init_abi): Adjust call to linux_init_abi. * arc-linux-tdep.c (arc_linux_init_osabi): Likewise. * bfin-linux-tdep.c (bfin_linux_init_abi): Likewise. * cris-linux-tdep.c (cris_linux_init_abi): Likewise. * csky-linux-tdep.c (csky_linux_init_abi): Likewise. * frv-linux-tdep.c (frv_linux_init_abi): Likewise. * hppa-linux-tdep.c (hppa_linux_init_abi): Likewise. * ia64-linux-tdep.c (ia64_linux_init_abi): Likewise. * m32r-linux-tdep.c (m32r_linux_init_abi): Likewise. 
* m68k-linux-tdep.c (m68k_linux_init_abi): Likewise. * microblaze-linux-tdep.c (microblaze_linux_init_abi): Likewise. * mips-linux-tdep.c (mips_linux_init_abi): Likewise. * mn10300-linux-tdep.c (am33_linux_init_osabi): Likewise. * nios2-linux-tdep.c (nios2_linux_init_abi): Likewise. * or1k-linux-tdep.c (or1k_linux_init_abi): Likewise. * riscv-linux-tdep.c (riscv_linux_init_abi): Likewise. * s390-linux-tdep.c (s390_linux_init_abi_any): Likewise. * sh-linux-tdep.c (sh_linux_init_abi): Likewise. * sparc-linux-tdep.c (sparc32_linux_init_abi): Likewise. * sparc64-linux-tdep.c (sparc64_linux_init_abi): Likewise. * tic6x-linux-tdep.c (tic6x_uclinux_init_abi): Likewise. * tilegx-linux-tdep.c (tilegx_linux_init_abi): Likewise. * xtensa-linux-tdep.c (xtensa_linux_init_abi): Likewise. * ppc-linux-tdep.c (ppc_linux_init_abi): Adjust call to linux_init_abi. Remove call to set_gdbarch_displaced_step_location. * arm-tdep.c (arm_pc_is_thumb): Call gdbarch_displaced_step_copy_insn_closure_by_addr instead of get_displaced_step_copy_insn_closure_by_addr. * rs6000-aix-tdep.c (rs6000_aix_init_osabi): Adjust calls to clear gdbarch methods. * rs6000-tdep.c (struct ppc_inferior_data): New structure. (get_ppc_per_inferior): New function. (ppc_displaced_step_prepare): New function. (ppc_displaced_step_finish): New function. (ppc_displaced_step_restore_all_in_ptid): New function. (rs6000_gdbarch_init): Register new gdbarch methods. * s390-tdep.c (s390_gdbarch_init): Don't call set_gdbarch_displaced_step_location, set new gdbarch methods. gdb/testsuite/ChangeLog: * gdb.arch/amd64-disp-step-avx.exp: Adjust pattern. * gdb.threads/forking-threads-plus-breakpoint.exp: Likewise. * gdb.threads/non-stop-fair-events.exp: Likewise. Change-Id: I387cd235a442d0620ec43608fd3dc0097fcbf8c8
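The buffer-assignment scheme and the "steal the global queue" loop described in this commit message can be sketched as below. This is an illustration only: displaced_buffers_sketch, prepare_status_sketch and start_step_over_sketch are hypothetical names, threads are plain integers, and the real gdbarch methods do much more (saving bytes, copying the instruction, fixing up registers).

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

enum class prepare_status_sketch { ok, unavailable };

constexpr int no_thread = -1;

// A fixed set of displaced stepping buffers, each owned by at most one
// thread at a time.
class displaced_buffers_sketch
{
public:
  explicit displaced_buffers_sketch (const std::vector<std::uint64_t> &addrs)
  {
    for (std::uint64_t addr : addrs)
      m_buffers.push_back ({addr, no_thread});
  }

  // Assign a free buffer to THREAD_ID, or report that none is available.
  prepare_status_sketch prepare (int thread_id, std::uint64_t &displaced_pc)
  {
    assert (thread_id != no_thread);
    for (buffer &buf : m_buffers)
      if (buf.owner == no_thread)
        {
          buf.owner = thread_id;
          displaced_pc = buf.addr;
          return prepare_status_sketch::ok;
        }
    return prepare_status_sketch::unavailable;
  }

  // Release whatever buffer THREAD_ID was using.
  void finish (int thread_id)
  {
    for (buffer &buf : m_buffers)
      if (buf.owner == thread_id)
        buf.owner = no_thread;
  }

private:
  struct buffer
  {
    std::uint64_t addr;
    int owner;
  };

  std::vector<buffer> m_buffers;
};

// Steal the global queue, try to start a displaced step for each thread,
// and put threads that could not get a buffer back on the global queue.
// Returns the threads whose displaced step was started.
static std::vector<int>
start_step_over_sketch (std::deque<int> &global_queue,
                        displaced_buffers_sketch &buffers)
{
  std::deque<int> local_queue;
  local_queue.swap (global_queue);

  std::vector<int> started;
  for (int thread_id : local_queue)
    {
      std::uint64_t displaced_pc = 0;
      if (buffers.prepare (thread_id, displaced_pc)
          == prepare_status_sketch::ok)
        started.push_back (thread_id);
      else
        global_queue.push_back (thread_id);
    }
  return started;
}
```

Iterating over the stolen local queue is what makes it safe to re-enqueue threads on the global queue during the same pass.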
2020-12-04gdb: clear inferior displaced stepping state and in-line step-over info on execSimon Marchi3-0/+233
When a process does an exec, all its program space is replaced with the newly loaded executable. All non-main threads disappear and the main thread starts executing at the entry point of the new executable. Things can go wrong if a displaced step operation is in progress while we process the exec event. If the main thread is the one executing the displaced step: when that thread (now executing in the new executable) stops somewhere (say, at a breakpoint), displaced_step_fixup will run and clear up the state. We will execute the "fixup" phase for the instruction we single-stepped in the old program space. We are now in a completely different context, so doing the fixup may corrupt the state. If it is a non-main thread that is doing the displaced step: while handling the exec event, GDB deletes the thread_info representing that thread (since the thread doesn't exist in the inferior after the exec). But inferior::displaced_step_state::step_thread will still point to it. When handling events later, this condition, in displaced_step_fixup, will likely never be true: /* Was this event for the thread we displaced? */ if (displaced->step_thread != event_thread) return 0; ... since displaced->step_thread points to a deleted thread (unless that storage gets re-used for a new thread_info, but that wouldn't be good either). This effectively makes the displaced stepping buffer occupied forever. When a thread in the new program space wants to do a displaced step, it will wait forever. I think we simply need to reset the displaced stepping state of the inferior on exec. Everything execution-related that existed before the exec is now gone. Similarly, if a thread does an in-line step over an exec syscall instruction, nothing clears the in-line step over info when the event is handled. So the in-line step over info stays there indefinitely, and things hang because we can never start another step over.
To fix this, I added a call to clear_step_over_info in infrun_inferior_execd. Add a test with a program with two threads that does an exec. The test includes the following axes: - whether it's the leader thread or the other thread that does the exec. - whether the exec'r and exec'd program have different text segment addresses. This is to hopefully catch cases where the displaced stepping info doesn't get reset, and GDB later tries to restore bytes of the old address space in the new address space. If the mapped addresses are different, we should get some memory error. This happens without the patch applied: $ ./gdb -q -nx --data-directory=data-directory testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-leader-diff-text-segs-true -ex "b main" -ex r -ex "b my_execve_syscall if 0" -ex "set displaced-stepping on" ... Breakpoint 1, main (argc=1, argv=0x7fffffffde38) at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.threads/step-over-exec.c:69 69 argv0 = argv[0]; Breakpoint 2 at 0x60133a: file /home/simark/src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S, line 34. (gdb) c Continuing. [New Thread 0x7ffff7c62640 (LWP 1455423)] Leader going in exec. Exec-ing /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-leader-diff-text-segs-true-execd [Thread 0x7ffff7c62640 (LWP 1455423) exited] process 1455418 is executing new program: /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-leader-diff-text-segs-true-execd Error in re-setting breakpoint 2: Function "my_execve_syscall" not defined. No unwaited-for children left. (gdb) n Single stepping until exit from function _start, which has no line number information. 
Cannot access memory at address 0x6010d2 (gdb) - Whether displaced stepping is allowed or not, so that we end up testing both displaced stepping and in-line stepping on arches that do support displaced stepping (otherwise, it just tests in-line stepping twice I suppose) To be able to precisely put a breakpoint on the syscall instruction, I added a small assembly file (lib/my-syscalls.S) that contains minimal Linux syscall wrappers. I prefer that to the strategy used in gdb.base/step-over-syscall.exp, which is to stepi into the glibc wrapper until we find something that looks like a syscall instruction, I find that more predictable. gdb/ChangeLog: * infrun.c (infrun_inferior_execd): New function. (_initialize_infrun): Attach inferior_execd observer. gdb/testsuite/ChangeLog: * gdb.threads/step-over-exec.exp: New. * gdb.threads/step-over-exec.c: New. * gdb.threads/step-over-exec-execd.c: New. * lib/my-syscalls.S: New. * lib/my-syscalls.h: New. Change-Id: I1bbc8538e683f53af5b980091849086f4fec5ff9
2020-12-01gdb/testsuite: fix indentation in gdb.threads/non-ldr-exc-1.expSimon Marchi1-2/+2
gdb/testsuite/ChangeLog:

    * gdb.threads/non-ldr-exc-1.exp: Fix indentation.

Change-Id: I02ba8a518aae9cb67106d09bef92968a7078e91e