aboutsummaryrefslogtreecommitdiff
path: root/gdb/infrun.c
AgeCommit message (Collapse)AuthorFilesLines
2022-07-28Change address_space to use new and deleteTom Tromey1-3/+3
This changes address_space to use new and delete, and makes some other small C++-ification changes as well, like changing address_space_num to be a method. This patch was needed for the subsequent patch to rewrite the registry system.
2022-07-22Change target_ops::async to accept boolTom Tromey1-2/+2
This changes the parameter of target_ops::async from int to bool. Regression tested on x86-64 Fedora 34.
2022-07-20Don't stop all threads prematurely after first step of "step N"Pedro Alves1-5/+14
In all-stop mode, when the target is itself in non-stop mode (like GNU/Linux), if you use the "step N" (or "stepi/next/nexti N") to step a thread a number of times: (gdb) help step step, s Step program until it reaches a different source line. Usage: step [N] Argument N means step N times (or till program stops for another reason). ... GDB prematurely stops all threads after the first step, and doesn't re-resume them for the subsequent N-1 steps. It's as if for the 2nd and subsequent steps, the command was running with scheduler-locking enabled. This can be observed with the testcase added by this commit, which looks like this: static pthread_barrier_t barrier; static void * thread_func (void *arg) { pthread_barrier_wait (&barrier); return NULL; } int main () { pthread_t thread; int ret; pthread_barrier_init (&barrier, NULL, 2); /* We run to this line below, and then issue "next 3". That should step over the 3 lines below and land on the return statement. If GDB prematurely stops the thread_func thread after the first of the 3 nexts (and never resumes it again), then the join won't ever return. */ pthread_create (&thread, NULL, thread_func, NULL); /* set break here */ pthread_barrier_wait (&barrier); pthread_join (thread, NULL); return 0; } The test hangs and times out without the GDB fix: (gdb) next 3 [New Thread 0x7ffff7d89700 (LWP 525772)] FAIL: gdb.threads/step-N-all-progress.exp: non-stop=off: target-non-stop=on: next 3 (timeout) The problem is a core gdb bug. When you do "step/stepi/next/nexti N", GDB internally creates a thread_fsm object and associates it with the stepping thread. For the stepping commands, the FSM's class is step_command_fsm. That object is what keeps track of how many steps are left to make. When one step finishes, handle_inferior_event calls stop_waiting and returns, and then fetch_inferior_event calls the "should_stop" method of the event thread's FSM. The implementation of that method decrements the steps-left counter. If the counter is 0, it returns true and we proceed to presenting the stop to the user. If it isn't 0 yet, then the method returns false, indicating to fetch_inferior_event to "keep going". Focusing now on when the first step finishes -- we're in "all-stop" mode, with the target in non-stop mode. When a step finishes, handle_inferior_event calls stop_waiting, which itself calls stop_all_threads to stop everything. I.e., after the first step completes, all threads are stopped, before handle_inferior_event returns. And after that, now in fetch_inferior_event, we consult the thread's thread_fsm::should_stop, which as we've seen, for the first step returns false -- i.e., we need to keep_going for another step. However, since the target is in non-stop mode, keep_going resumes _only_ the current thread. All the other threads remain stopped, inadvertently. If the target is in non-stop mode, we don't actually need to stop all threads right after each step finishes, and then re-resume them again. We can instead defer stopping all threads until all the steps are completed. So fix this by delaying the stopping of all threads until after we called the FSM's "should_stop" method. I.e., move it from stop_waiting, to handle_inferior_events's callers, fetch_inferior_event and wait_for_inferior. New test included. Tested on x86-64 GNU/Linux native and gdbserver. Change-Id: Iaad50dcfea4464c84bdbac853a89df92ade6ae01
2022-07-18Remove ui_register_input_event_handlerTom Tromey1-1/+1
This patch removes ui_register_input_event_handler and ui_unregister_input_event_handler, replacing them with methods on 'ui'. It also changes gdb to use these methods everywhere, rather than sometimes reaching in to the ui to manage the file descriptor directly.
2022-05-03gdb: add some additional thread status debug outputAndrew Burgess1-0/+3
While working on this patch: https://sourceware.org/pipermail/gdb-patches/2022-January/185109.html I found it really useful to print the executing/resumed status of all threads (or all threads in a particular inferior) at various places (e.g. when a new inferior is started, when GDB attaches, etc). This debug was originally part of the above patch, but I wanted to rewrite this as a separate patch and move the code into a new function in infrun.h, which is what this patch does. Unless 'set debug infrun on' is in effect, then there should be no user visible changes after this commit.
2022-04-29gdb/infrun: make fetch_inferior_event restore thread if exited or signalledSimon Marchi1-3/+7
Commit 152a1749566 ("gdb: prune inferiors at end of fetch_inferior_event, fix intermittent failure of gdb.threads/fork-plus-threads.exp") introduced some follow-fork-related test failures, such as: info inferiors^M Num Description Connection Executable ^M * 1 process 634972 1 (native) /home/simark/build/binutils-gdb-one-target/gdb/testsuite/outputs/gdb.base/foll-fork/foll-fork ^M 2 process 634975 1 (native) /home/simark/build/binutils-gdb-one-target/gdb/testsuite/outputs/gdb.base/foll-fork/foll-fork ^M (gdb) PASS: gdb.base/foll-fork.exp: follow-fork-mode=parent: detach-on-fork=off: cmd=next 2: test_follow_fork: info inferiors inferior 2^M [Switching to inferior 2 [process 634975] (/home/simark/build/binutils-gdb-one-target/gdb/testsuite/outputs/gdb.base/foll-fork/foll-fork)]^M [Switching to thread 2.1 (Thread 0x7ffff7c9a740 (LWP 634975))]^M #0 0x00007ffff7d7abf7 in _Fork () from /usr/lib/libc.so.6^M (gdb) PASS: gdb.base/foll-fork.exp: follow-fork-mode=parent: detach-on-fork=off: cmd=next 2: test_follow_fork: inferior 2 continue^M Continuing.^M [Inferior 2 (process 634975) exited normally]^M [Switching to Thread 0x7ffff7c9a740 (LWP 634972)]^M (gdb) PASS: gdb.base/foll-fork.exp: follow-fork-mode=parent: detach-on-fork=off: cmd=next 2: test_follow_fork: continue until exit at continue unfollowed inferior to end break callee^M Breakpoint 2 at 0x555555555160: file /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/foll-fork.c, line 9.^M (gdb) FAIL: gdb.base/foll-fork.exp: follow-fork-mode=parent: detach-on-fork=off: cmd=next 2: test_follow_fork: break callee What happens here is: - inferior 2 is selected - we continue, leading to inferior 2's exit - we set breakpoint, expect 2 locations, but only one location is resolved Reading between the lines, we understand that inferior 2 got pruned, when it shouldn't have been. The issue can be reproduced by hand with: $ ./gdb -q --data-directory=data-directory testsuite/outputs/gdb.base/foll-fork/foll-fork -ex "set detach-on-fork off" -ex start -ex "next 2" -ex "inferior 2" -ex "set debug infrun" ... Temporary breakpoint 1, main () at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/foll-fork.c:14 14 int v = 5; [New inferior 2 (process 637627)] [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1". 17 if (pid == 0) /* set breakpoint here */ [Switching to inferior 2 [process 637627] (/home/simark/build/binutils-gdb-one-target/gdb/testsuite/outputs/gdb.base/foll-fork/foll-fork)] [Switching to thread 2.1 (Thread 0x7ffff7c9a740 (LWP 637627))] #0 0x00007ffff7d7abf7 in _Fork () from /usr/lib/libc.so.6 (gdb) continue Continuing. [infrun] clear_proceed_status_thread: 637627.637627.0 [infrun] proceed: enter [infrun] proceed: addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT [infrun] scoped_disable_commit_resumed: reason=proceeding [infrun] start_step_over: enter [infrun] start_step_over: stealing global queue of threads to step, length = 0 [infrun] operator(): step-over queue now empty [infrun] start_step_over: exit [infrun] proceed: start: resuming threads, all-stop-on-top-of-non-stop [infrun] proceed: resuming 637627.637627.0 [infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [637627.637627.0] at 0x7ffff7d7abf7 [infrun] do_target_resume: resume_ptid=637627.637627.0, step=0, sig=GDB_SIGNAL_0 [infrun] infrun_async: enable=1 [infrun] prepare_to_wait: prepare_to_wait [infrun] proceed: end: resuming threads, all-stop-on-top-of-non-stop [infrun] reset: reason=proceeding [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target native [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target native [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target native [infrun] proceed: exit [infrun] fetch_inferior_event: enter [infrun] scoped_disable_commit_resumed: reason=handling event [infrun] do_target_wait: Found 2 inferiors, starting at #1 [infrun] random_pending_event_thread: None found. [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) = [infrun] print_target_wait_results: 637627.637627.0 [process 637627], [infrun] print_target_wait_results: status->kind = EXITED, exit_status = 0 [infrun] handle_inferior_event: status->kind = EXITED, exit_status = 0 [Inferior 2 (process 637627) exited normally] [infrun] stop_waiting: stop_waiting [infrun] stop_all_threads: start: reason=presenting stop to user in all-stop, inf=-1 [infrun] stop_all_threads: pass=0, iterations=0 [infrun] stop_all_threads: 637624.637624.0 not executing [infrun] stop_all_threads: pass=1, iterations=1 [infrun] stop_all_threads: 637624.637624.0 not executing [infrun] stop_all_threads: done [infrun] stop_all_threads: end: reason=presenting stop to user in all-stop, inf=-1 [Switching to Thread 0x7ffff7c9a740 (LWP 637624)] [infrun] infrun_async: enable=0 [infrun] reset: reason=handling event [infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target native, no resumed threads (gdb) [infrun] fetch_inferior_event: exit (gdb) info inferiors Num Description Connection Executable * 1 process 637624 1 (native) /home/simark/build/binutils-gdb-one-target/gdb/testsuite/outputs/gdb.base/foll-fork/foll-fork (gdb) i th Id Target Id Frame * 1 Thread 0x7ffff7c9a740 (LWP 637624) "foll-fork" main () at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/foll-fork.c:17 After handling the EXITED event for inferior 2, inferior 2 should have stayed the current inferior, which should have prevented it from getting pruned. When debugging, we find that when getting at the prune_inferiors call, the current inferior is inferior 1. Further debugging shows that prior to the call to clean_up_just_stopped_threads_fsms, the current inferior is inferior 2, and after, it's inferior 1. Then, back in fetch_inferior_event, the restore_thread object is disabled, due to: /* If we got a TARGET_WAITKIND_NO_RESUMED event, then the previously selected thread is gone. We have two choices - switch to no thread selected, or restore the previously selected thread (now exited). We chose the later, just because that's what GDB used to do. After this, "info threads" says "The current thread <Thread ID 2> has terminated." instead of "No thread selected.". */ if (!non_stop && cmd_done && ecs->ws.kind () != TARGET_WAITKIND_NO_RESUMED) restore_thread.dont_restore (); So in the end, inferior 1 stays current, and inferior 2 gets wrongfully pruned. I'd say clean_up_just_stopped_threads_fsms is the culprit here. It actually attempts to restore the event_thread to be current at the end, after the loop (I presume the current thread on entry is always supposed to be the event thread). But in this case, the event is of kind EXITED, and ecs->event_thread is not set, so the current inferior isn't restored. Fix that by using scoped_restore_current_thread. If there is no current thread, scoped_restore_current_thread will still restore the current inferior, and that's what we want. Random note: the thread_info object for inferior 2's thread is never freed. It is held (by refcount) by the restore_thread object in fetch_inferior_event, while the inferior's thread list gets cleared, in the exit event processing. When the refcount reaches 0 (when the restore_thread object is destroyed), there's nothing that actually deletes the thread_info object. And I think that nothing in GDB points to it anymore, so it leaks. I don't want to fix that in this patch, but thought it would be good to mention it, in case somebody has an idea for how to fix that. Change-Id: Ibc7df543e2c46aad5f3b9250b28c3fb5912be4e8
2022-04-27gdb: remove BLOCK_ENTRY_PC macroSimon Marchi1-2/+2
Replace with equivalent method. Change-Id: I0e033095e7358799930775e61028b48246971a7d
2022-04-25gdb/infrun: assert !step_over_info_valid_p in restart_threadsLancelot SIX1-1/+7
While working in gdb/infrun.c:restart_threads, I was wondering what are the preconditions to call the function. It seems to me that !step_over_info_valid_p should be a precondition (i.e. if we are doing an inline step over breakpoint, we do not want to resume non stepping threads as one of them might miss the breakpoint which is temporally disabled). To convince myself that this is true, I have added an assertion to enforce this, and got one regression in the testsuite: FAIL: gdb.base/step-over-syscall.exp: vfork: displaced=off: single step over vfork (GDB internal error) This call to restart_threads originates from handle_vfork_done which does not check if a step over is active when restarting other threads: if (target_is_non_stop_p ()) { scoped_restore_current_thread restore_thread; insert_breakpoints (); restart_threads (event_thread, event_thread->inf); start_step_over (); } In this patch, I propose to: - Call start_step_over before restart_threads. If a step over is already in progress (as it is the case in the failing testcase), start_step_over return immediately, and there is no point in restarting all threads just to stop them right away for a step over breakpoint. - Only call restart_threads if no step over is in progress at this point. In this patch, I also propose to keep the assertion in restart_threads to help enforce this precondition, and state it explicitly. I have also checked all other places which call restart_threads, and they all seem to check that there is no step over currently active before doing the call. As for infrun-related things, I am never completely sure I did not miss something. So as usual, all feedback and thoughts are very welcome. Tested on x86_64-linux-gnu. Change-Id: If5f5f98ec4cf9aaeaabb5e3aa88ae6ffd70d4f37
2022-04-22gdb: handle_no_resumed: only update thread list of event targetSimon Marchi1-6/+1
When running: $ make check TESTS="gdb.multi/multi-re-run.exp" RUNTESTFLAGS="--target_board=native-gdbserver" I get: target remote localhost:2347^M Remote debugging using localhost:2347^M Reading symbols from /lib64/ld-linux-x86-64.so.2...^M Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.31.so...^M 0x00007ffff7fd0100 in _start () from /lib64/ld-linux-x86-64.so.2^M (gdb) continue^M Continuing.^M Cannot execute this command while the target is running.^M Use the "interrupt" command to stop the target^M and then try again.^M (gdb) FAIL: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=1: runto: run to all_started The test does: - Connect to a remote target with inferior 2, continue and stop on the all_started function - Connect to a separate remote target / GDBserver instance with inferior 1, continue and (expect to) stop on the all_started function The failure seen above happens when trying to continue inferior 1. What happens is: - GDB tells inferior 1's remote target to continue - We go into fetch_inferior_event, try to get an event at random from the targets - do_target_wait happens to pick inferior 2's target - That target return TARGET_WAITKIND_NO_RESUMED, which makes sense: inferior 2 is stopped, that target has no resumed thread - handle_no_resumed tries to update the thread list of all targets: for (auto *target : all_non_exited_process_targets ()) { switch_to_target_no_thread (target); update_thread_list (); } - When trying to update the thread list of inferior 1's target, it hits the "Cannot execute this command while the target is running" error. This target is working in "remote all-stop" mode, and it is currently waiting for a stop reply, so it can't send packets to update the thread list at this time. To handle the problem described in the comment in handle_no_resumed, I don't think it is necessary to update the thread list of all targets, but only the event target. That comment describes a kind of race condition where some target reports a breakpoint hit for a thread and then its last remaining resumed thread exits, so sends a "no resumed" event. If we ended up resuming the thread that hit a breakpoint, we want to ignore the "no resumed" and carry on. But I don't really see why we need to update the thread list on the other targets. I can't really articulate this, it's more a gut feeling, maybe I just fail to imagine the situation where this is needed. But here is the patch anyway, so we can discuss it. The patch changes handle_no_resumed to only update the thread list of the event target. This fixes the test run shown above. The way I originally tried to fix this was to make remote_target::update_thread_list return early if the target is currently awaiting a stop reply, since there's no way it can update the thread list at that time. But that felt more like papering over the problem. I then thought that we probably shouldn't be asking the target to update the thread list unnecessarily. Change-Id: Ide3df22b4f556478e155ad1c67ad4b4fe7c26a58
2022-04-22gdb: prune inferiors at end of fetch_inferior_event, fix intermittent ↵Simon Marchi1-5/+12
failure of gdb.threads/fork-plus-threads.exp This test sometimes fail like this: info threads^M Id Target Id Frame ^M 11.12 process 2270719 Couldn't get registers: No such process.^M (gdb) FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: no threads left [Inferior 11 (process 2270719) exited normally]^M info inferiors^M Num Description Connection Executable ^M * 1 <null> /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/fork-plus-threads/fork-plus-threads ^M 11 <null> /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/fork-plus-threads/fork-plus-threads ^M (gdb) FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: only inferior 1 left (the program exited) I can get it to fail quite reliably by pinning it to a core: $ taskset -c 5 make check TESTS="gdb.threads/fork-plus-threads.exp" The previous attempt at fixing this was: https://sourceware.org/pipermail/gdb-patches/2021-October/182846.html What we see is part due to a possible unfortunate ordering of events given by the kernel, and what could be considered a bug in GDB. The test program makes a number of forks, waits them all, then exits. Most of the time, GDB will get and process the exit event for inferior 1 after the exit events of all the children. But this is not guaranteed. After the last child exits and is waited by the parent, the parent can exit quickly, such that GDB collects from the kernel the exit events for the parent and that child at the same time. It then chooses one event at random, which can be the event for the parent. This will result in the parent appearing to exit before its child. There's not much we can do about it, so I think we have to adjust the test to cope. After expect has seen the "exited normally" notification for inferior 1, it immediately does an "info thread" that it expects to come back empty. But at this point, GDB might not have processed inferior 11's (the last child) exit event, so it will look like there is still a thread. Of course that thread is dead, we just don't know it yet. But that makes the "no thread" test fail. If the test waited just a bit more for the "exited normally" notification for inferior 11, then the list of threads would be empty. So, first change, make the test collect all the "exited normally" notifications for all inferiors before proceeding, that should ensure we see an empty thread list. That would fix the first FAIL above. However, we would still have the second FAIL, as we expect inferior 11 to not be there, it should have been deleted automatically. Inferior 11 is normally deleted when prune_inferiors is called. That is called by normal_stop, which is only called by fetch_inferior_event only if the event thread completed an execution command FSM (thread_fsm). But the FSM for the continue command completed when inferior 1 exited. At that point inferior 11 was not prunable, as it still had a thread. When inferior 11 exits, prune_inferiors is not called. I think that can be considered a GDB bug. From the user point of view, there's no reason why in one case inferior 11 would be deleted and not in the other case. This patch makes the somewhat naive change to call prune_inferiors in fetch_inferior_event, so that it is called in this case. It is placed at this particular point in the function so that it is called after the user inferior / thread selection is restored. If it was called before that, inferior 11 wouldn't be pruned, because it would still be the current inferior. Change-Id: I48a15d118f30b1c72c528a9f805ed4974170484a Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26272
2022-04-11gdb: remove symbol value macrosSimon Marchi1-1/+1
Remove all macros related to getting and setting some symbol value: #define SYMBOL_VALUE(symbol) (symbol)->value.ivalue #define SYMBOL_VALUE_ADDRESS(symbol) \ #define SET_SYMBOL_VALUE_ADDRESS(symbol, new_value) \ #define SYMBOL_VALUE_BYTES(symbol) (symbol)->value.bytes #define SYMBOL_VALUE_COMMON_BLOCK(symbol) (symbol)->value.common_block #define SYMBOL_BLOCK_VALUE(symbol) (symbol)->value.block #define SYMBOL_VALUE_CHAIN(symbol) (symbol)->value.chain #define MSYMBOL_VALUE(symbol) (symbol)->value.ivalue #define MSYMBOL_VALUE_RAW_ADDRESS(symbol) ((symbol)->value.address + 0) #define MSYMBOL_VALUE_ADDRESS(objfile, symbol) \ #define BMSYMBOL_VALUE_ADDRESS(symbol) \ #define SET_MSYMBOL_VALUE_ADDRESS(symbol, new_value) \ #define MSYMBOL_VALUE_BYTES(symbol) (symbol)->value.bytes #define MSYMBOL_BLOCK_VALUE(symbol) (symbol)->value.block Replace them with equivalent methods on the appropriate objects. Change-Id: Iafdab3b8eefc6dc2fd895aa955bf64fafc59ed50
2022-04-04gdb: resume ongoing step after handling fork or vforkSimon Marchi1-4/+19
The test introduced by this patch would fail in this configuration, with the native-gdbserver or native-extended-gdbserver boards: FAIL: gdb.threads/next-fork-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=auto: i=2: next to for loop The problem is that the step operation is forgotten when handling the fork/vfork. With "debug infrun" and "debug remote", it looks like this (some lines omitted for brevity). We do the next: [infrun] proceed: enter [infrun] proceed: addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=0, current thread [4154304.4154304.0] at 0x5555555553bf [infrun] do_target_resume: resume_ptid=4154304.0.0, step=1, sig=GDB_SIGNAL_0 [remote] Sending packet: $vCont;r5555555553bf,5555555553c4:p3f63c0.3f63c0;c:p3f63c0.-1#cd [infrun] proceed: exit We then handle a fork event: [infrun] fetch_inferior_event: enter [remote] wait: enter [remote] Packet received: T05fork:p3f63ee.3f63ee;06:0100000000000000;07:b08e59f6ff7f0000;10:bf60e8f7ff7f0000;thread:p3f63c0.3f63c6;core:17; [remote] wait: exit [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) = [infrun] print_target_wait_results: 4154304.4154310.0 [Thread 4154304.4154310], [infrun] print_target_wait_results: status->kind = FORKED, child_ptid = 4154350.4154350.0 [infrun] handle_inferior_event: status->kind = FORKED, child_ptid = 4154350.4154350.0 [remote] Sending packet: $D;3f63ee#4b [infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [4154304.4154310.0] at 0x7ffff7e860bf [infrun] do_target_resume: resume_ptid=4154304.0.0, step=0, sig=GDB_SIGNAL_0 [remote] Sending packet: $vCont;c:p3f63c0.-1#73 [infrun] fetch_inferior_event: exit In the first snippet, we resume the stepping thread with the range-stepping (r) vCont command. But after handling the fork (detaching the fork child), we resumed the whole process freely. The stepping thread, which was paused by GDBserver while reporting the fork event, was therefore resumed freely, instead of confined to the addresses of the stepped line. Note that since this is a "next", it could be that we have entered a function, installed a step-resume breakpoint, and it's ok to continue freely the stepping thread, but that's not the case here. The two snippets shown above were next to each other in the logs. For the fork case, we can resume stepping right after handling the event. However, for the vfork case, where we are waiting for the external child process to exec or exit, we only resume the thread that called vfork, and keep the others stopped (see patch "gdb: fix handling of vfork by multi-threaded program" prior in this series). So we can't resume the stepping thread right now. Instead, do it after handling the vfork-done event. Change-Id: I92539c970397ce880110e039fe92b87480f816bd
2022-04-04gdb: fix handling of vfork by multi-threaded program ↵Simon Marchi1-8/+127
(follow-fork-mode=parent, detach-on-fork=on) There is a problem with how GDB handles a vfork happening in a multi-threaded program. This problem was reported to me by somebody not using vfork directly, but using system(3) in a multi-threaded program, which may be implemented using vfork. This patch only deals about the follow-fork-mode=parent, detach-on-fork=on case, because it would be too much to chew at once to fix the bugs in the other cases as well (I tried). The problem ----------- When a program vforks, the parent thread is suspended by the kernel until the child process exits or execs. Specifically, in a multi-threaded program, only the thread that called vfork is suspended, other threads keep running freely. This is documented in the vfork(2) man page ("Caveats" section). Let's suppose GDB is handling a vfork and the user's desire is to detach from the child. Before detaching the child, GDB must remove the software breakpoints inserted in the shared parent/child address space, in case there's a breakpoint in the path the child is going to take before exec'ing or exit'ing (unlikely, but possible). Otherwise the child could hit a breakpoint instruction while running outside the control of GDB, which would make it crash. GDB must also avoid re-inserting breakpoints in the parent as long as it didn't receive the "vfork done" event (that is, when the child has exited or execed): since the address space is shared with the child, that would re-insert breakpoints in the child process also. So what GDB does is: 1. Receive "vfork" event for the parent 2. Remove breakpoints from the (shared) address space and set program_space::breakpoints_not_allowed to avoid re-inserting them 3. Detach from the child thread 4. Resume the parent 5. Wait for and receive "vfork done" event for the parent 6. Clean program_space::breakpoints_not_allowed and re-insert breakpoints 7. Resume the parent Resuming the parent at step 4 is necessary in order for the kernel to report the "vfork done" event. The kernel won't report a ptrace event for a thread that is ptrace-stopped. But the theory behind this is that between steps 4 and 5, the parent won't actually do any progress even though it is ptrace-resumed, because the kernel keeps it suspended, waiting for the child to exec or exit. So it doesn't matter for that thread if breakpoints are not inserted. The problem is when the program is multi-threaded. In step 4, GDB resumes all threads of the parent. The thread that did the vfork stays suspended by the kernel, so that's fine. But other threads are running freely while breakpoints are removed, which is a problem because they could miss a breakpoint that they should have hit. The problem is present with all-stop and non-stop targets. The only difference is that with an all-stop targets, the other threads are stopped by the target when it reports the vfork event and are resumed by the target when GDB resumes the parent. With a non-stop target, the other threads are simply never stopped. The fix ------- There many combinations of settings to consider (all-stop/non-stop, target-non-stop on/off, follow-fork-mode parent/child, detach-on-fork on/off, schedule-multiple on/off), but for this patch I restrict the scope to follow-fork-mode=parent, detach-on-fork=on. That's the "default" case, where we detach the child and keep debugging the parent. I tried to fix them all, but it's just too much to do at once. The code paths and behaviors for when we don't detach the child are completely different. The guiding principle for this patch is that all threads of the vforking inferior should be stopped as long as breakpoints are removed. This is similar to handling in-line step-overs, in a way. For non-stop targets (the default on Linux native), this is what happens: - In follow_fork, we call stop_all_threads to stop all threads of the inferior - In follow_fork_inferior, we record the vfork parent thread in inferior::thread_waiting_for_vfork_done - Back in handle_inferior_event, we call keep_going, which resumes only the event thread (this is already the case, with a non-stop target). This is the thread that will be waiting for vfork-done. - When we get the vfork-done event, we go in the (new) handle_vfork_done function to restart the previously stopped threads. In the same scenario, but with an all-stop target: - In follow_fork, no need to stop all threads of the inferior, the target has stopped all threads of all its inferiors before returning the event. - In follow_fork_inferior, we record the vfork parent thread in inferior::thread_waiting_for_vfork_done. - Back in handle_inferior_event, we also call keep_going. However, we only want to resume the event thread here, not all inferior threads. In internal_resume_ptid (called by resume_1), we therefore now check whether one of the inferiors we are about to resume has thread_waiting_for_vfork_done set. If so, we only resume that thread. Note that when resuming multiple inferiors, one vforking and one not non-vforking, we could resume the vforking thread from the vforking inferior plus all threads from the non-vforking inferior. However, this is not implemented, it would require more work. - When we get the vfork-done event, the existing call to keep_going naturally resumes all threads. Testing-wise, add a test that tries to make the main thread hit a breakpoint while a secondary thread calls vfork. Without the fix, the main thread keeps going while breakpoints are removed, resulting in a missed breakpoint and the program exiting. Change-Id: I20eb78e17ca91f93c19c2b89a7e12c382ee814a1
2022-04-04gdb/infrun: add logging statement to do_target_resumeSimon Marchi1-0/+4
This helped me, it shows which ptid we actually call target_resume with. Change-Id: I2dfd771e83df8c25f39371a13e3e91dc7882b73d
2022-04-04gdb/infrun: add inferior parameters to stop_all_threads and restart_threadsSimon Marchi1-8/+31
A following patch will want to stop all threads of a given inferior (as opposed to all threads of all inferiors) while handling a vfork, and restart them after. To help with this, add inferior parameters to stop_all_threads and restart_threads. This is done as a separate patch to make sure this doesn't cause regressions on its own, and to keep the following patches more concise. No visible changes are expected here, since all calls sites pass nullptr, which should keep the existing behavior. Change-Id: I4d9ba886ce842042075b4e346cfa64bbe2580dbf
2022-04-04gdb: replace inferior::waiting_for_vfork_done with ↵Simon Marchi1-5/+9
inferior::thread_waiting_for_vfork_done The inferior::waiting_for_vfork_done flag indicates that some thread in that inferior is waiting for a vfork-done event. Subsequent patches will need to know which thread precisely is waiting for that event. Replace the boolean flag (waiting_for_vfork_done) with a thread_info pointer (thread_waiting_for_vfork_done). I think there is a latent buglet in that waiting_for_vfork_done is currently not reset on inferior exec or exit. I could imagine that if a thread in the parent process calls exec or exit while another thread of the parent process is waiting for its vfork child to exec or exit, we could end up with inferior::waiting_for_vfork_done without a thread actually waiting for a vfork-done event anymore. And since that flag is checked in resume_1, things could misbehave there. Since the new field points to a thread_info object, and those are destroyed on exec or exit, it could be worse now since we could try to access freed memory, if thread_waiting_for_vfork_done were to point to a stale thread_info. To avoid this, clear the field in infrun_inferior_exit and infrun_inferior_execd. Change-Id: I31b847278613a49ba03fc4915f74d9ceb228fdce
2022-03-31Remove dbx modeTom Tromey1-23/+19
This patch removes gdb's dbx mode. Regression tested on x86-64 Fedora 34.
2022-03-31gdb/infrun: add reason parameter to stop_all_threadsSimon Marchi1-5/+5
Add a "reason" parameter, only used to show in debug messages what is the reason for stopping all threads. This helped me understand the debug logs while adding some new uses of stop_all_threads, so I am proposing to merge it. Change-Id: I66c8c335ebf41836a7bc3d5fe1db92c195f65e55
2022-03-29Unify gdb printf functionsTom Tromey1-73/+73
Now that filtered and unfiltered output can be treated identically, we can unify the printf family of functions. This is done under the name "gdb_printf". Most of this patch was written by script.
2022-03-29Unify gdb puts functionsTom Tromey1-3/+3
Now that filtered and unfiltered output can be treated identically, we can unify the puts family of functions. This is done under the name "gdb_puts". Most of this patch was written by script.
2022-03-29Remove some uses of printf_unfilteredTom Tromey1-5/+5
A number of spots call printf_unfiltered only because they are in code that should not be interrupted by the pager. However, I believe these cases are all handled by infrun's blanket ban on paging, and so can be converted to the default (_filtered) API. After this patch, I think all the remaining _unfiltered calls are ones that really ought to be. A few -- namely in complete_command -- could be replaced by a scoped assignment to pagination_enabled, but for the remainder, the code seems simple enough like this.
2022-03-21Watchpoint followed by catchpoint misreports watchpoint (PR gdb/28621)Pedro Alves1-12/+12
If GDB reports a watchpoint hit, and then the next event is not TARGET_WAITKIND_STOPPED, but instead some event for which there's a catchpoint, such that GDB calls bpstat_stop_status, GDB mistakenly thinks the watchpoint triggered. Vis, using foll-fork.c: (gdb) awatch v Hardware access (read/write) watchpoint 2: v (gdb) catch fork Catchpoint 3 (fork) (gdb) c Continuing. Hardware access (read/write) watchpoint 2: v Old value = 0 New value = 5 main () at gdb.base/foll-fork.c:16 16 pid = fork (); (gdb) Continuing. Hardware access (read/write) watchpoint 2: v <<<< <<<< these lines are spurious Value = 5 <<<< Catchpoint 3 (forked process 1712369), arch_fork (ctid=0x7ffff7fa4810) at arch-fork.h:49 49 arch-fork.h: No such file or directory. (gdb) The problem is that when we handle the fork event, nothing called watchpoints_triggered before calling bpstat_stop_status. Thus, each watchpoint's watchpoint_triggered field was still set to watch_triggered_yes from the previous (real) watchpoint stop. watchpoint_triggered is only current called in the handle_signal_stop path, when handling TARGET_WAITKIND_STOPPED. This fixes it by adding watchpoint_triggered calls in the other events paths that call bpstat_stop_status. But instead of adding them explicitly, it adds a new function bpstat_stop_status_nowatch that wraps bpstat_stop_status and calls watchpoint_triggered, and then replaces most calls to bpstat_stop_status with calls to bpstat_stop_status_nowatch. This required constifying watchpoints_triggered. New test included, which fails without the fix. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28621 Change-Id: I282b38c2eee428d25319af3bc842f9feafed461c
2022-03-18Fix crash with stepi, no debug info, and "set debug infrun 1"Pedro Alves1-1/+2
A stepi in a function without debug info with "set debug infrun 1" crashes GDB since commit c8353d682f69 (gdb/infrun: some extra infrun debug print statements), due to a reference to "tp->current_symtab->filename" when tp->current_symtab is null. This commit adds the missing null check. The output in this case becomes: [infrun] set_step_info: symtab = <null>, line = 0, step_frame_id = {stack=0x7fffffffd980,code=0x0000000000456c30,!special}, step_stack_frame_id = {stack=0x7fffffffd980,code=0x0000000000456c30,!special} Change-Id: I5171a9d222bc7e15b9dba2feaba7d55c7acd99f8
2022-03-14Replace deprecated_target_wait_hook by observersPatrick Monnerat1-12/+3
Commit b60cea7 (Make target_wait options use enum flags) broke deprecated_target_wait_hook usage: there's a commit comment telling this hook has not been converted. Rather than trying to mend it, this patch replaces the hook by two target_wait observers: target_pre_wait (ptid_t ptid) target_post_wait (ptid_t event_ptid) Upon target_wait entry, target_pre_wait is notified with the ptid passed to target_wait. Upon exit, target_post_wait is notified with the event ptid returned by target_wait. Should an exception occur, event_ptid is null_ptid. This change benefits to Insight (out-of-tree): there's no real use of the late hook in gdb itself.
2022-03-07Remove unnecessary inferior lookup in infrun:handle_onePedro Alves1-2/+1
infrun.c:handle_one calls find_inferior_ptid unnecessarily, since we already have a thread pointer handy, and the thread has a pointer to the inferior. This commit removes the unnecessary lookup. Change-Id: I2ae18601dd75346c6c91068e9a4f9a6484fb3339
2022-03-06gdb: remove internalvar_funcs::destroySimon Marchi1-1/+0
No kind of internal var uses it remove it. This makes the transition to using a variant easier, since we don't need to think about where this should be called (in a destructor or not), if it can throw, etc. Change-Id: Iebbc867d1ce6716480450d9790410d6684cbe4dd
2022-02-24Revert "do_target_wait_1: Clear TARGET_WNOHANG if the target isn't async."John Baldwin1-1/+1
Commit 14b3360508b1 ("do_target_wait_1: Clear TARGET_WNOHANG if the target isn't async.") broke some multi-target tests, such as gdb.multi/multi-target-info-inferiors.exp. The symptom is that execution just hangs at some point. What happens is: 1. One remote inferior is started, and now sits stopped at a breakpoint. It is not "async" at this point (but it "can async"). 2. We run a native inferior, the event loop gets woken up by the native target's fd. 3. In do_target_wait, we randomly choose an inferior to call target_wait on first, it happens to be the remote inferior. 4. Because the target is currently not "async", we clear TARGET_WNOHANG, resulting in synchronous wait. We therefore block here: #0 0x00007fe9540dbb4d in select () from /usr/lib/libc.so.6 #1 0x000055fc7e821da7 in gdb_select (n=15, readfds=0x7ffdb77c1fb0, writefds=0x0, exceptfds=0x7ffdb77c2050, timeout=0x7ffdb77c1f90) at /home/simark/src/binutils-gdb/gdb/posix-hdep.c:31 #2 0x000055fc7ddef905 in interruptible_select (n=15, readfds=0x7ffdb77c1fb0, writefds=0x0, exceptfds=0x7ffdb77c2050, timeout=0x7ffdb77c1f90) at /home/simark/src/binutils-gdb/gdb/event-top.c:1134 #3 0x000055fc7eda58e4 in ser_base_wait_for (scb=0x6250002e4100, timeout=1) at /home/simark/src/binutils-gdb/gdb/ser-base.c:240 #4 0x000055fc7eda66ba in do_ser_base_readchar (scb=0x6250002e4100, timeout=-1) at /home/simark/src/binutils-gdb/gdb/ser-base.c:365 #5 0x000055fc7eda6ff6 in generic_readchar (scb=0x6250002e4100, timeout=-1, do_readchar=0x55fc7eda663c <do_ser_base_readchar(serial*, int)>) at /home/simark/src/binutils-gdb/gdb/ser-base.c:444 #6 0x000055fc7eda718a in ser_base_readchar (scb=0x6250002e4100, timeout=-1) at /home/simark/src/binutils-gdb/gdb/ser-base.c:471 #7 0x000055fc7edb1ecd in serial_readchar (scb=0x6250002e4100, timeout=-1) at /home/simark/src/binutils-gdb/gdb/serial.c:393 #8 0x000055fc7ec48b8f in remote_target::readchar (this=0x617000038780, timeout=-1) at /home/simark/src/binutils-gdb/gdb/remote.c:9446 #9 0x000055fc7ec4da82 in remote_target::getpkt_or_notif_sane_1 (this=0x617000038780, buf=0x6170000387a8, forever=1, expecting_notif=1, is_notif=0x7ffdb77c24f0) at /home/simark/src/binutils-gdb/gdb/remote.c:9928 #10 0x000055fc7ec4f045 in remote_target::getpkt_or_notif_sane (this=0x617000038780, buf=0x6170000387a8, forever=1, is_notif=0x7ffdb77c24f0) at /home/simark/src/binutils-gdb/gdb/remote.c:10037 #11 0x000055fc7ec354d4 in remote_target::wait_ns (this=0x617000038780, ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/remote.c:8147 #12 0x000055fc7ec38aa1 in remote_target::wait (this=0x617000038780, ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/remote.c:8337 #13 0x000055fc7f1409ce in target_wait (ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/target.c:2612 #14 0x000055fc7e19da98 in do_target_wait_1 (inf=0x617000038080, ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3636 #15 0x000055fc7e19e26b in operator() (__closure=0x7ffdb77c2f90, inf=0x617000038080) at /home/simark/src/binutils-gdb/gdb/infrun.c:3697 #16 0x000055fc7e19f0c4 in do_target_wait (ecs=0x7ffdb77c33a0, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3716 #17 0x000055fc7e1a31f7 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4061 Before the aforementioned commit, we would not have cleared TARGET_WNOHANG, the remote target's wait would have returned nothing, and we would have consumed the native target's event. After applying this revert, the testsuite state looks as good as before for me on Ubuntu 20.04 amd64. Change-Id: Ic17a1642935cabcc16c25cb6899d52e12c2f5c3f
2022-02-22do_target_wait_1: Clear TARGET_WNOHANG if the target isn't async.John Baldwin1-1/+1
Previously, TARGET_WNOHANG was cleared if a target supported async mode even if async mode wasn't currently enabled. This change only permits TARGET_WNOHANG if async mode is enabled.
2022-02-22Enable async mode on supported targets in target_resume.John Baldwin1-3/+0
Enabling async mode above the target layer removes duplicate code in ::resume methods of async-capable targets. Commit 5b6d1e4fa4f ("Multi-target support") enabled async mode in do_target_resume after target_resume returns which is a step in this direction. However, other callers of target_resume such as target_continue do not enable async mode. Rather than enabling async mode in each of the callers after target_resume returns, enable async mode at the end of target_resume.
2022-02-10gdb/infrun: some extra infrun debug print statementsAndrew Burgess1-0/+11
While reviewing a different patch I wanted to know more about what was going on during GDB's stepping. I added some extra infrun debug print calls, and I thought these might be useful to others.
2022-02-07gdb: make thread_info::m_thread_fsm a std::unique_ptrLancelot SIX1-25/+19
While working on function calls, I realized that the thread_fsm member of struct thread_info is a raw pointer to a resource it owns. This commit changes the type of the thread_fsm member to a std::unique_ptr in order to signify this ownership relationship and slightly ease resource management (no need to manually call delete). To ensure consistent use, the field is made a private member (m_thread_fsm). The setter method (set_thread_fsm) can then check that it is incorrect to associate a FSM to a thread_info object if another one is already in place. This is ensured by an assertion. The function run_inferior_call takes an argument as a pointer to a call_thread_fsm and installs it in it in a thread_info instance. Also change this function's signature to accept a unique_ptr in order to signify that the ownership of the call_thread_fsm is transferred during the call. No user visible change expected after this commit. Tested on x86_64-linux with no regression observed. Change-Id: Ia1224f72a4afa247801ce6650ce82f90224a9ae8
2022-02-06gdb: remove SYMBOL_IS_ARGUMENT macroSimon Marchi1-1/+1
Add a getter and a setter for whether a symbol is an argument. Remove the corresponding macro and adjust all callers. Change-Id: I71b4f0465f3dfd2ed8b9e140bd3f7d5eb8d9ee81
2022-01-18Move gdb_argv to gdbsupportTom Tromey1-0/+1
This moves the gdb_argv class to a new header in gdbsupport.
2022-01-17gdb/infrun: rename variable and move to more specific scopeSimon Marchi1-4/+2
Move the "started" variable to the scope it's needed, and rename it to "step_over_started". Change-Id: I56f3384dbd328f55198063bb855edda10f1492a3
2022-01-01Automatic Copyright Year update after running gdb/copyright.pyJoel Brobecker1-1/+1
This commit brings all the changes made by running gdb/copyright.py as per GDB's Start of New Year Procedure. For the avoidance of doubt, all changes in this commits were performed by the script.
2021-12-08gdb: move clearing of tp->pending_follow to follow_fork_inferiorSimon Marchi1-10/+18
A following patch will change targets so that when they detach an inferior, they also detach any pending fork children this inferior may have. While doing this, I hit a case where we couldn't differentiate two cases, where in one we should detach the fork detach but not in the other. Suppose we continue past a fork with "follow-fork-mode == child" && "detach-on-fork on". follow_fork_inferior calls target_detach to detach the parent. In that case the target should not detach the fork child, as we'll continue debugging the child. As of now, the tp->pending_follow field of the thread who called fork still contains the details about the fork. Then, suppose we run to a fork catchpoint and the user types "detach". In that case, the target should detach the fork child in addition to the parent. In that case as well, the tp->pending_follow field contains the details about the fork. To allow targets to differentiate the two cases, clear tp->pending_follow a bit earlier, when following a fork. Targets will then see that tp->pending_follow contains TARGET_WAITKIND_SPURIOUS, and won't detach the fork child. As of this patch, no behavior changes are expected. Change-Id: I537741859ed712cb531baaefc78bb934e2a28153
2021-11-22gdb: pass more const target_waitstatus by referenceSimon Marchi1-31/+31
While working on target_waitstatus changes, I noticed a few places where const target_waitstatus objects could be passed by reference instead of by pointers. And in some cases, places where a target_waitstatus could be passed as const, but was not. Convert them as much as possible. Change-Id: Ied552d464be5d5b87489913b95f9720a5ad50c5a
2021-11-22gdb: rename target_waitstatus_to_string to target_waitstatus::to_stringSimon Marchi1-9/+8
Make target_waitstatus_to_string a "to_string" method of target_waitstatus, a bit like we have ptid_t::to_string already. This will save a bit of typing. Change-Id: Id261b7a09fa9fa3c738abac131c191a6f9c13905
2021-11-18gdbsupport: make gdb_assert_not_reached accept a format stringSimon Marchi1-2/+2
Change gdb_assert_not_reached to accept a format string plus corresponding arguments. This allows giving more precise messages. Because the format string passed by the caller is prepended with a "%s:" to add the function name, the callers can no longer pass a translated string (`_(...)`). Make the gdb_assert_not_reached include the _(), just like the gdb_assert_fail macro just above. Change-Id: Id0cfda5a57979df6cdaacaba0d55dd91ae9efee7
2021-11-15[gdb] Don't use gdb_stdlog for inferior-eventsTom de Vries1-19/+14
The test-case gdb.base/foll-vfork.exp contains: ... if [gdb_debug_enabled] { untested "debug is enabled" return 0 } ... To understand what it does, I disabled this bit and ran with GDB_DEBUG=infrun, like so: ... $ cd $build/gdb/testsuite $ make check GDB_DEBUG=infrun RUNTESTFLAGS=gdb.base/foll-vfork.exp ... and ran into: ... (gdb) PASS: gdb.base/foll-vfork.exp: exec: \ vfork parent follow, through step: set follow-fork parent next^M 33 if (pid == 0) {^M (gdb) FAIL: gdb.base/foll-vfork.exp: exec: \ vfork parent follow, through step: step ... The problem is that the test-case expects: ... (gdb) PASS: gdb.base/foll-vfork.exp: exec: \ vfork parent follow, through step: set follow-fork parent next^M [Detaching after vfork from child process 28169]^M 33 if (pid == 0) {^M (gdb) PASS: gdb.base/foll-vfork.exp: exec: \ vfork parent follow, through step: step ... but the "Detaching" line has been redirected to $outputs/gdb.base/foll-vfork/gdb.debug. I looked at the documentation of "set logging debugredirect [on|off]": ... By default, GDB debug output will go to both the terminal and the logfile. Set debugredirect if you want debug output to go only to the log file. ... and my interpretation of it was that "debug output" did not match the "messages" description of inferior-events: ... The set print inferior-events command allows you to enable or disable printing of messages when GDB notices that new inferiors have started or that inferiors have exited or have been detached. ... Fix the discrepancy by not using gdb_stdlog for inferior-events. Update the gdb.base/foll-vfork.exp test-case to not require gdb_debug_enabled == 0. Tested on x86_64-linux. Tested test-case gdb.base/foll-vfork.exp with and without GDB_DEBUG=infrun.
2021-11-11gdb: fix "set scheduler-locking" thread exit hangSimon Marchi1-2/+1
GDB hangs when doing this: - launch inferior with multiple threads - multiple threads hit some breakpoint(s) - one breakpoint hit is presented as a stop, the rest are saved as pending wait statuses - "set scheduler-locking on" - resume the currently selected thread (because of scheduler-locking, it's the only one resumed), let it execute until exit - GDB hangs, not showing the prompt, impossible to interrupt with ^C When the resumed thread exits, we expect the target to return a TARGET_WAITKIND_NO_RESUMED event, and that's what we see: [infrun] fetch_inferior_event: enter [infrun] scoped_disable_commit_resumed: reason=handling event [infrun] random_pending_event_thread: None found. [Thread 0x7ffff7d9c700 (LWP 309357) exited] [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) = [infrun] print_target_wait_results: -1.0.0 [process -1], [infrun] print_target_wait_results: status->kind = no-resumed [infrun] handle_inferior_event: status->kind = no-resumed [infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed) [infrun] prepare_to_wait: prepare_to_wait [infrun] reset: reason=handling event [infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target native, no resumed threads [infrun] fetch_inferior_event: exit The problem is in handle_no_resumed: we check if some other thread is actually resumed, to see if we should ignore that event (see comments in that function for more info). If this condition is true: (thread->executing () || thread->has_pending_waitstatus ()) ... then we ignore the event. The problem is that there are some non-resumed threads with a pending event, which makes us ignore the event. But these threads are not resumed, so we end up waiting while nothing executes, hence waiting for ever. My first fix was to change the condition to: (thread->executing () || (thread->resumed () && thread->has_pending_waitstatus ())) ... but then it occured to me that we could simply check for: (thread->resumed ()) Since "executing" implies "resumed", checking simply for "resumed" covers threads that are resumed and executing, as well as threads that are resumed with a pending status, which is what we want. Change-Id: Ie796290f8ae7f34c026ca3a8fcef7397414f4780
2021-11-08gdb: remove bpstat typedef, rename bpstats to bpstatSimon Marchi1-1/+1
I don't find that the bpstat typedef, which hides a pointer, is particularly useful. In fact, it confused me many times, and I just see it as something to remember that adds cognitive load. Also, with C++, we might want to be able to pass bpstats objects by const-reference, not necessarily by pointer. So, remove the bpstat typedef and rename struct bpstats to bpstat (since it represents one bpstat, it makes sense that it is singular). Change-Id: I52e763b6e54ee666a9e045785f686d37b4f5f849
2021-10-28gdb: use ptid_t::to_string in infrun debug messagesSimon Marchi1-59/+57
In debug messages, I think it would be more helpful to print ptid using the simple "pid.lwp.tid" notation in infrun debug messages. I am currently debugging some fork issues, and find the pid_to_str output not so useful, as it doesn't tell which process a thread belongs to. It currently shows up like this: [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=0, current thread [Thread 0x7ffff7d95740 (LWP 892942)] at 0x55555555521f With the patch, it shows up like this: [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [894072.894077.0] at 0x5555555551d9 Change-Id: I130796d7dfb0d8e763b8358d8a6002701d80c4ea
2021-10-25gdb: change functions returning value contents to use gdb::array_viewSimon Marchi1-2/+2
The bug fixed by this [1] patch was caused by an out-of-bounds access to a value's content. The code gets the value's content (just a pointer) and then indexes it with a non-sensical index. This made me think of changing functions that return value contents to return array_views instead of a plain pointer. This has the advantage that when GDB is built with _GLIBCXX_DEBUG, accesses to the array_view are checked, making bugs more apparent / easier to find. This patch changes the return types of these functions, and updates callers to call .data() on the result, meaning it's not changing anything in practice. Additional work will be needed (which can be done little by little) to make callers propagate the use of array_view and reap the benefits. [1] https://sourceware.org/pipermail/gdb-patches/2021-September/182306.html Change-Id: I5151f888f169e1c36abe2cbc57620110673816f3
2021-10-21gdb, gdbserver: make target_waitstatus safeSimon Marchi1-90/+99
I stumbled on a bug caused by the fact that a code path read target_waitstatus::value::sig (expecting it to contain a gdb_signal value) while target_waitstatus::kind was TARGET_WAITKIND_FORKED. This meant that the active union field was in fact target_waitstatus::value::related_pid, and contained a ptid. The read signal value was therefore garbage, and that caused GDB to crash soon after. Or, since that GDB was built with ubsan, this nice error message: /home/simark/src/binutils-gdb/gdb/linux-nat.c:1271:12: runtime error: load of value 2686365, which is not a valid value for type 'gdb_signal' Despite being a large-ish change, I think it would be nice to make target_waitstatus safe against that kind of bug. As already done elsewhere (e.g. dynamic_prop), validate that the type of value read from the union matches what is supposed to be the active field. - Make the kind and value of target_waitstatus private. - Make the kind initialized to TARGET_WAITKIND_IGNORE on target_waitstatus construction. This is what most users appear to do explicitly. - Add setters, one for each kind. Each setter takes as a parameter the data associated to that kind, if any. This makes it impossible to forget to attach the associated data. - Add getters, one for each associated data type. Each getter validates that the data type fetched by the user matches the wait status kind. - Change "integer" to "exit_status", "related_pid" to "child_ptid", just because that's more precise terminology. - Fix all users. That last point is semi-mechanical. There are a lot of obvious changes, but some less obvious ones. For example, it's not possible to set the kind at some point and the associated data later, as some users did. But in any case, the intent of the code should not change in this patch. This was tested on x86-64 Linux (unix, native-gdbserver and native-extended-gdbserver boards). It was built-tested on x86-64 FreeBSD, NetBSD, MinGW and macOS. The rest of the changes to native files was done as a best effort. If I forgot any place to update in these files, it should be easy to fix (unless the change happens to reveal an actual bug). Change-Id: I0ae967df1ff6e28de78abbe3ac9b4b2ff4ad03b7
2021-09-30Fix Windows crash from stop_pc changeTom Tromey1-0/+1
The "make thread_suspend_state::stop_pc optional" patch caused a regression on Windows when using shared libraries. I tracked this down to an unguarded use of stop_pc() in the TARGET_WAITKIND_LOADED case of handle_inferior_event. This patch fixes the bug by ensuring that the stop PC is set at this point.
2021-09-27gdb: don't share aspace/pspace on fork with "detach-on-fork on" and ↵Simon Marchi1-9/+30
"follow-fork-mode child" We found that when handling forks, two inferiors can unexpectedly share their program space and address space. To reproduce: 1. Using a test program that forks... 2. "set follow-fork-mode child" 3. "set detach-on-fork on" (the default) 4. run to a breakpoint somewhere after the fork Step 4 should have created a new inferior: (gdb) info inferiors Num Description Connection Executable 1 <null> /home/smarchi/build/wt/amd/gdb/fork * 2 process 251425 1 (native) /home/smarchi/build/wt/amd/gdb/fork By inspecting the state of GDB, we can see that the two inferiors now share one program space and one address space: Inferior 1: (top-gdb) p inferior_list.m_front.num $2 = 1 (top-gdb) p inferior_list.m_front.aspace $3 = (struct address_space *) 0x5595e2520400 (top-gdb) p inferior_list.m_front.pspace $4 = (struct program_space *) 0x5595e2520440 Inferior 2: (top-gdb) p inferior_list.m_front.next.num $5 = 2 (top-gdb) p inferior_list.m_front.next.aspace $6 = (struct address_space *) 0x5595e2520400 (top-gdb) p inferior_list.m_front.next.pspace $7 = (struct program_space *) 0x5595e2520440 You can then run inferior 1 again and the two inferiors will still erroneously share their spaces, but already at this point this is wrong. The cause of the bad {a,p}space sharing is in follow_fork_inferior. When following the child and detaching from the parent, we just re-use the parent's spaces, rather than cloning them. When we switch back to inferior 1 and run again, we find ourselves with two unrelated inferiors sharing spaces. Fix that by creating new spaces for the parent after having moved them to the child. My initial implementation created new spaces for the child instead. Doing this breaks doing "next" over fork(). When "next" start, we record the symtab of the starting location. When the program stops, we compare that symtab with the symtab the program has stopped at. If the symtab or the line number has changed, we conclude the "next" is done. If we create a new program space for the child and copy the parent's program space to it with clone_program_space, it creates new symtabs for the child as well. When the child stop, but still on the fork() line, GDB thinks the "next" is done because the symtab pointers no longer match. In reality they are two symtab instances that represent the same file. But moving the spaces to the child and creating new spaces for the parent, we avoid this problem. Note that the problem described above happens today with "detach-on-fork off" and "follow-fork-mode child", because we create new spaces for the child. This will have to be addressed later. Test-wise, improve gdb.base/foll-fork.exp to set a breakpoint that is expected to have a location in each inferiors. Without the fix, when the two inferiors erroneously share a program space, GDB reports a single location. Change-Id: Ifea76e14f87b9f7321fc3a766217061190e71c6e
2021-09-24gdb: change thread_info::name to unique_xmalloc_ptr, add helper functionSimon Marchi1-3/+1
This started out as changing thread_info::name to a unique_xmalloc_ptr. That showed that almost all users of that field had the same logic to get a thread's name: use thread_info::name if non-nullptr, else ask the target. Factor out this logic in a new thread_name free function. Make the field private (rename to m_name) and add some accessors. Change-Id: Iebdd95f4cd21fbefc505249bd1d05befc466a2fc
2021-09-23Change ptid_t::tid to ULONGESTTom Tromey1-6/+4
The ptid_t 'tid' member is normally used as an address in gdb -- both bsd-uthread and ravenscar-thread use it this way. However, because the type is 'long', this can cause problems with sign extension. This patch changes the type to ULONGEST to ensure that sign extension does not occur.
2021-09-08gdb: make thread_suspend_state::stop_pc optionalAndrew Burgess1-1/+2
Currently the stop_pc field of thread_suspect_state is a CORE_ADDR and when we want to indicate that there is no stop_pc available we set this field back to a special value. There are actually two special values used, in post_create_inferior the stop_pc is set to 0. This is a little unfortunate, there are plenty of embedded targets where 0 is a valid pc value. The more common special value for stop_pc though, is set in thread_info::set_executing, where the value (~(CORE_ADDR) 0) is used. This commit changes things so that the stop_pc is instead a gdb::optional. We can now explicitly reset the field to an uninitialised state, we also have asserts that we don't read the stop_pc when its in an uninitialised state (both in gdbsupport/gdb_optional.h, when compiling with _GLIBCXX_DEBUG defined, and in thread_info::stop_pc). One situation where a thread will not have a stop_pc value is when the thread is stopped as a consequence of GDB being in all stop mode, and some other thread stopped at an interesting event. When GDB brings all the other threads to a stop those other threads will not have a stop_pc set (thus avoiding an unnecessary read of the pc register). Previously, when GDB passed through handle_one (in infrun.c) the threads executing flag was set to false and the stop_pc field was left unchanged, i.e. it would (previous) have been left as ~0. Now, handle_one leaves the stop_pc with no value. This caused a problem when we later try to set these threads running again, in proceed() we compare the current pc with the cached stop_pc. If the thread was stopped via handle_one then the stop_pc would have been left as ~0, and the compare (in proceed) would (likely) fail. Now however, this compare tries to read the stop_pc when it has no value and this would trigger an assert. To resolve this I've added thread_info::stop_pc_p() which returns true if the thread has a cached stop_pc. We should only ever call thread_info::stop_pc() if we know that there is a cached stop_pc, however, this doesn't mean that every call to thread_info::stop_pc() needs to be guarded with a call to thread_info::stop_pc_p(), in most cases we know that the thread we are looking at stopped due to some interesting event in that thread, and so, we know that the stop_pc is valid. After running the testsuite I've seen no other situations where stop_pc is read uninitialised. There should be no user visible changes after this commit.