author | Pedro Alves <palves@redhat.com> | 2015-02-04 19:13:28 +0100 |
---|---|---|
committer | Pedro Alves <palves@redhat.com> | 2015-02-04 19:13:28 +0100 |
commit | 20ba1ce66d31b9dd16ed8c648f46ce32aa3a03e0 (patch) | |
tree | 2f33431e75790cfddfac5f6ef30f175f826f9af1 /gdb/linux-nat.c | |
parent | 3c537f7fdb11f02f7082749f3f21dfdd2c2025e8 (diff) | |
Linux: don't resume new LWPs until we've pulled all events out of the kernel
Since the starvation avoidance series
(https://sourceware.org/ml/gdb-patches/2014-12/msg00631.html), both
GDB and GDBserver pull all events out of ptrace before deciding which
event to process.
There's one problem with that, though. Because we resume new threads
immediately when we see a PTRACE_EVENT_CLONE event, if the program
constantly spawns threads fast enough, new threads can spawn threads
faster than we can pull events out of the kernel, and thus we'd get
stuck in an infinite loop, never returning any event to the core to
process.
I occasionally see this happen with the
attach-many-short-lived-threads.exp test against gdbserver.
The fix is to delay resuming new threads until we've pulled all
events out of the kernel.
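
To make the starvation argument concrete, here is a minimal toy model in C. It is not GDB code: `toy_kernel`, `toy_drain` and `toy_resume_children` are hypothetical names, and the model assumes that every pulled event is a clone event and that a running child immediately queues one more clone event. Under those assumptions, resuming eagerly never drains the queue, while deferring the resume drains it in a bounded number of pulls.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GDB code.  Assumption: every pulled event is a clone
   event, and a child that is running immediately queues one more.  */
struct toy_kernel
{
  int pending_events;    /* Events waiting to be pulled.  */
  int stopped_children;  /* New children that were not resumed yet.  */
};

/* Pull events until the queue is empty or MAX_PULLS is reached.
   Return the number of pulls, or -1 if the queue never drained.  */
static int
toy_drain (struct toy_kernel *k, bool resume_eagerly, int max_pulls)
{
  int pulls = 0;

  assert (max_pulls > 0);

  while (k->pending_events > 0)
    {
      if (pulls == max_pulls)
        return -1;

      k->pending_events--;
      pulls++;

      if (resume_eagerly)
        k->pending_events++;    /* Child runs at once and clones again.  */
      else
        k->stopped_children++;  /* Child stays stopped for now.  */
    }

  return pulls;
}

/* Resume the children left stopped by toy_drain.  Each resumed child
   queues one new event.  Return how many children were resumed.  */
static int
toy_resume_children (struct toy_kernel *k)
{
  int n = k->stopped_children;

  k->stopped_children = 0;
  k->pending_events += n;
  return n;
}
```

With three initial events, the eager variant hits any pull limit without draining, whereas the deferred variant drains after exactly three pulls and only then lets the children run.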
On native, we already have the resume_stopped_resumed_lwps function
that knows to resume LWPs that are stopped with no event to report to
the core. So the patch just adds another use. GDBserver didn't have
the equivalent yet, so the patch adds one.
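
The two-pass shape this relies on can be sketched as follows. This is a simplified illustration, not the real implementation: `toy_lwp` is a hypothetical stand-in for `struct lwp_info`, the iterator only mimics the callback convention that the diff's use of `iterate_over_lwps` suggests (stop at the first LWP for which the callback returns nonzero), and clearing `stopped` stands in for the actual ptrace resume.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for GDB's struct lwp_info.  */
struct toy_lwp
{
  int stopped;   /* Nonzero if the LWP is currently stopped.  */
  int resumed;   /* Nonzero if the core wants the LWP running.  */
  int status;    /* Pending wait status, 0 if none.  */
  struct toy_lwp *next;
};

typedef int (toy_lwp_callback) (struct toy_lwp *lp, void *data);

/* Call CB on each LWP in LIST.  Stop at and return the first LWP for
   which CB returns nonzero, or NULL if there is none.  */
static struct toy_lwp *
toy_iterate_over_lwps (struct toy_lwp *list, toy_lwp_callback *cb, void *data)
{
  struct toy_lwp *lp;

  for (lp = list; lp != NULL; lp = lp->next)
    if (cb (lp, data) != 0)
      return lp;

  return NULL;
}

/* First pass: resume LWPs that are stopped, that the core wants
   running, and that have no event to report.  DATA points to a
   counter of resumed LWPs.  Always return 0 to visit every LWP.  */
static int
toy_resume_stopped_resumed (struct toy_lwp *lp, void *data)
{
  int *count = data;

  assert (count != NULL);

  if (lp->stopped && lp->resumed && lp->status == 0)
    {
      lp->stopped = 0;  /* Stands in for the real ptrace resume.  */
      (*count)++;
    }

  return 0;
}

/* Second pass: match an LWP that has a status to report.  */
static int
toy_status_callback (struct toy_lwp *lp, void *data)
{
  (void) data;
  return lp->resumed && lp->status != 0;
}
```

Running the first callback over the whole list and then the second mirrors the order of the two `iterate_over_lwps` calls in `linux_nat_wait_1`: LWPs with nothing to report are set running again, and an LWP with a pending status is then selected for the core.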
Tested on x86_64 Fedora 20, native and gdbserver (remote and
extended-remote).
gdb/gdbserver/ChangeLog:
2015-02-04 Pedro Alves <palves@redhat.com>
* linux-low.c (handle_extended_wait): Don't resume LWPs here.
(resume_stopped_resumed_lwps): New function.
(linux_wait_for_event_filtered): Use it.
gdb/ChangeLog:
2015-02-04 Pedro Alves <palves@redhat.com>
* linux-nat.c (handle_extended_wait): Don't resume LWPs here.
(wait_lwp): Don't call wait_lwp if linux_handle_extended_wait
returns true.
(resume_stopped_resumed_lwps): Don't check whether the thread is
marked as executing.
(linux_nat_wait_1): Use resume_stopped_resumed_lwps.
Diffstat (limited to 'gdb/linux-nat.c')
-rw-r--r-- | gdb/linux-nat.c | 42 |
1 files changed, 12 insertions, 30 deletions
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index b4893d44..169188a 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -691,6 +691,7 @@ linux_nat_pass_signals (struct target_ops *self,
 static int stop_wait_callback (struct lwp_info *lp, void *data);
 static int linux_thread_alive (ptid_t ptid);
 static char *linux_child_pid_to_exec_file (struct target_ops *self, int pid);
+static int resume_stopped_resumed_lwps (struct lwp_info *lp, void *data);
@@ -2033,28 +2034,7 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
               new_lp->status = status;
             }

-          /* Note the need to use the low target ops to resume, to
-             handle resuming with PT_SYSCALL if we have syscall
-             catchpoints.  */
-          if (!stopping)
-            {
-              new_lp->resumed = 1;
-
-              if (status == 0)
-                {
-                  gdb_assert (new_lp->last_resume_kind == resume_continue);
-                  if (debug_linux_nat)
-                    fprintf_unfiltered (gdb_stdlog,
-                                        "LHEW: resuming new LWP %ld\n",
-                                        ptid_get_lwp (new_lp->ptid));
-                  linux_resume_one_lwp (new_lp, 0, GDB_SIGNAL_0);
-                }
-            }
-
-          if (debug_linux_nat)
-            fprintf_unfiltered (gdb_stdlog,
-                                "LHEW: resuming parent LWP %d\n", pid);
-          linux_resume_one_lwp (lp, 0, GDB_SIGNAL_0);
+          new_lp->resumed = !stopping;
           return 1;
         }

@@ -2096,9 +2076,8 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
           if (debug_linux_nat)
             fprintf_unfiltered (gdb_stdlog,
                                 "LHEW: Got PTRACE_EVENT_VFORK_DONE "
-                                "from LWP %ld: resuming\n",
+                                "from LWP %ld: ignoring\n",
                                 ptid_get_lwp (lp->ptid));
-          ptrace (PTRACE_CONT, ptid_get_lwp (lp->ptid), 0, 0);
           return 1;
         }

@@ -2245,8 +2224,8 @@ wait_lwp (struct lwp_info *lp)
         fprintf_unfiltered (gdb_stdlog,
                             "WL: Handling extended status 0x%06x\n",
                             status);
-      if (linux_handle_extended_wait (lp, status, 1))
-        return wait_lwp (lp);
+      linux_handle_extended_wait (lp, status, 1);
+      return 0;
     }

   return status;
@@ -3274,8 +3253,13 @@ linux_nat_wait_1 (struct target_ops *ops,
           continue;
         }

-      /* Now that we've pulled all events out of the kernel, check if
-         there's any LWP with a status to report to the core.  */
+      /* Now that we've pulled all events out of the kernel, resume
+         LWPs that don't have an interesting event to report.  */
+      iterate_over_lwps (minus_one_ptid,
+                         resume_stopped_resumed_lwps, &minus_one_ptid);
+
+      /* ... and find an LWP with a status to report to the core, if
+         any.  */
       lp = iterate_over_lwps (ptid, status_callback, NULL);
       if (lp != NULL)
         break;
@@ -3436,8 +3420,6 @@ resume_stopped_resumed_lwps (struct lwp_info *lp, void *data)
       struct gdbarch *gdbarch = get_regcache_arch (regcache);
       CORE_ADDR pc = regcache_read_pc (regcache);

-      gdb_assert (is_executing (lp->ptid));
-
       /* Don't bother if there's a breakpoint at PC that we'd hit
          immediately, and we're not waiting for this LWP.  */
       if (!ptid_match (lp->ptid, *wait_ptid_p))