Age | Commit message (Collapse) | Author | Files | Lines |
|
After the previous commit, exit_inferior_1 no longer makes use of the
silent parameter. This commit removes this parameter and cleans up
the callers.
After doing this exit_inferior_1, exit_inferior, and
exit_inferior_silent are all equivalent, so rename exit_inferior_1 to
exit_inferior and delete exit_inferior_silent, update all the callers.
Also I spotted the declaration exit_inferior_num_silent in inferior.h,
but this function is not defined anywhere, so I deleted the
declaration.
There should be no user visible changes after this commit.
|
|
It was pointed out on the mailing list[1] that after this commit:
commit b1e0126ec56e099d753c20e91a9f8623aabd6b46
Date: Wed Jun 21 14:18:54 2023 +0100
gdb: don't resume vfork parent while child is still running
the test gdb.base/vfork-follow-parent.exp now has some failures when
run with the native-gdbserver or native-extended-gdbserver boards:
FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to end of inferior 2 (timeout)
FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: inferior 1 (timeout)
FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: print unblock_parent = 1 (timeout)
FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to break_parent (timeout)
The reason that these failures don't show up when run on the standard
unix board is that the test is only run in the default operating mode,
so for Linux this will be all-stop on top of non-stop.
If we adjust the test script so that it runs in the default mode and
with target-non-stop turned off, then we see the same failures on the
unix board. This commit includes this change.
The way that the test is written means that it is not (currently)
possible to turn on non-stop mode and have the test still work, so
this commit does not do that.
I have also updated the test script so that the vfork child performs
an exec as well as the current exit. Exec and exit are the two ways
in which a vfork child can release the vfork parent, so testing both
of these cases is useful I think.
In this test the inferior performs a vfork and the vfork-child
immediately exits. The vfork-parent will wait for the vfork-child and
then blocks waiting for gdb. Once gdb has released the vfork-parent,
the vfork-parent also exits.
In the test that fails, GDB sets 'detach-on-fork off' and then runs to
the vfork. At this point the test tries to just "continue", but this
fails as the vfork-parent is still selected, and the parent can't
continue until the vfork-child completes. As the vfork-child is
stopped by GDB the parent will never stop once resumed, so GDB refuses
to resume it.
The test script then sets 'schedule-multiple on' and once again
continues. This time GDB, in theory, resumes both the parent and the
child, the parent will be held blocked by the kernel, but the child
will run until it exits, and which point GDB stops again, this time
with inferior 2, the newly exited vfork-child, selected.
What happens after this in the test script is irrelevant as far as
this failure is concerned.
To understand why the test started failing we should consider the
behaviour of four different cases:
1. All-stop-on-non-stop before commit b1e0126ec56e,
2. All-stop-on-non-stop after commit b1e0126ec56e,
3. All-stop-on-all-stop before commit b1e0126ec56e, and
4. All-stop-on-all-stop after commit b1e0126ec56e.
Only case #4 is failing after commit b1e0126ec56e, but I think the
other cases are interesting because, (a) they inform how we might fix
the regression, and (b) it turns out the behaviour of #2 changed too
with the commit, but the change was harmless.
For #1 All-stop-on-non-stop before commit b1e0126ec56e, what happens
is:
1. GDB calls proceed with the vfork-parent selected, as schedule
multiple is on user_visible_resume_ptid returns -1 (everything)
as the resume_ptid (see proceed function),
2. As this is all-stop-on-non-stop, every thread is resumed
individually, so GDB tries to resume both the vfork-parent and the
vfork-child, both of which succeed,
3. The vfork-parent is held stopped by the kernel,
4. The vfork-child completes (exits) at which point the GDB sees the
EXITED event for the vfork-child and the VFORK_DONE event for the
vfork-parent,
5. At this point we might take two paths depending on which event
GDB handles first, if GDB handles the VFORK_DONE first then:
(a) As GDB is controlling both parent and child the VFORK_DONE is
ignored (see handle_vfork_done), the vfork-parent will be
resumed,
(b) GDB processes the EXITED event, selects the (now defunct)
vfork-child, and stops, returning control to the user.
Alternatively, if GDB selects the EXITED event first then:
(c) GDB processes the EXITED event, selects the (now defunct)
vfork-child, and stops, returning control to the user.
(d) At some future time the user resumes the vfork-parent, at
which point the VFORK_DONE is reported to GDB, however, GDB
is ignoring the VFORK_DONE (see handle_vfork_done), so the
parent is resumed.
For case #2, all-stop-on-non-stop after commit b1e0126ec56e, the
important difference is in step (2) above, now, instead of resuming
both the vfork-parent and the vfork-child, only the vfork-child is
resumed. As such, when we get to step (5), only a single event, the
EXITED event is reported.
GDB handles the EXITED just as in (5)(c), then, later, when the user
resumes the vfork-parent, the VFORKED_DONE is immediately delivered
from the kernel, but this is ignored just as in (5)(d), and so,
though the pattern of when the vfork-parent is resumed changes, the
overall pattern of which events are reported and when, doesn't
actually change. In fact, by not resuming the vfork-parent, the order
of events (in this test) is now deterministic, which (maybe?) is a
good thing.
If we now consider case #3, all-stop-on-all-stop before commit
b1e0126ec56e, then what happens is:
1. GDB calls proceed with the vfork-parent selected, as schedule
multiple is on user_visible_resume_ptid returns -1 (everything)
as the resume_ptid (see proceed function),
2. As this is all-stop-on-all-stop, the resume is passed down to the
linux-nat target, the vfork-parent is the event thread, while the
vfork-child is a sibling of the event thread,
3. In linux_nat_target::resume, GDB calls linux_nat_resume_callback
for all threads, this causes the vfork-child to be resumed. Then
in linux_nat_target::resume, the event thread, the vfork-parent,
is also resumed.
4. The vfork-parent is held stopped by the kernel,
5. The vfork-child completes (exits) at which point the GDB sees the
EXITED event for the vfork-child and the VFORK_DONE event for the
vfork-parent,
6. We are now in a situation identical to step (5) as for
all-stop-on-non-stop above, GDB selects one of the events to
handle, and whichever we select the user sees the correct
behaviour.
And so, finally, we can consider #4, all-stop-on-all-stop after commit
b1e0126ec56e, this is the case that started failing.
We start out just like above, in proceed, the resume_ptid is
-1 (resume everything), due to schedule multiple being on. And just
like above, due to the target being all-stop, we call
proceed_resume_thread_checked just once, for the current thread,
which, remember, is the vfork-parent thread.
The change in commit b1e0126ec56e was to avoid resuming a vfork-parent
thread, read the commit message for the justification for this change.
However, this means that GDB now rejects resuming the vfork-parent in
this case, which means that nothing gets resumed! Obviously, if
nothing resumes, then nothing will ever stop, and so GDB appears to
hang.
I considered a couple of solutions which, in the end, I didn't go
with, these were:
1. Move the vfork-parent check out of proceed_resume_thread_checked,
and place it in proceed, but only on the all-stop-on-non-stop
path, this should still address the issue seen in b1e0126ec56e,
but would avoid the issue seen here. I rejected this just
because it didn't feel great to split the checks that exist in
proceed_resume_thread_checked like this,
2. Extend the condition in proceed_resume_thread_checked by adding a
target_is_non_stop_p check. This would have the same effect as
idea 1, but leaves all the checks in the same place, which I
think would be better, but this still just didn't feel right to
me, and so,
What I noticed was that for the all-stop-on-non-stop, after commit
b1e0126ec56e, we only resumed the vfork-child, and this seems fine.
The vfork-parent isn't going to run anyway (the kernel will hold it
back), so if feels like we there's no harm in just waiting for the
child to complete, and then resuming the parent.
So then I started looking at follow_fork, which is called from the top
of proceed. This function already has the task of switching between
the parent and child based on which the user wishes to follow. So, I
wondered, could we use this to switch to the vfork-child in the case
that we are attached to both?
Turns out this is pretty simple to do.
Having done that, now the process is for all-stop-on-all-stop after
commit b1e0126ec56e, and with this new fix is:
1. GDB calls proceed with the vfork-parent selected, but,
2. In follow_fork, and follow_fork_inferior, GDB switches the
selected thread to be that of the vfork-child,
3. Back in proceed user_visible_resume_ptid returns -1 (everything)
as the resume_ptid still, but now,
4. When GDB calls proceed_resume_thread_checked, the vfork-child is
the current selected thread, this is not a vfork-parent, and so
GDB allows the proceed to continue to the linux-nat target,
5. In linux_nat_target::resume, GDB calls linux_nat_resume_callback
for all threads, this does not resume the vfork-parent (because
it is a vfork-parent), and then the vfork-child is resumed as
this is the event thread,
At this point we are back in the same situation as for
all-stop-on-non-stop after commit b1e0126ec56e, that is, the
vfork-child is resumed, while the vfork-parent is held stopped by
GDB.
Eventually the vfork-child will exit or exec, at which point the
vfork-parent will be resumed.
[1] https://inbox.sourceware.org/gdb-patches/3e1e1db0-13d9-dd32-b4bb-051149ae6e76@simark.ca/
|
|
This commit replaces this earlier commit:
commit 2e411b8c68eb2b035b31d5b00d940d4be1a0928b
Date: Fri Oct 14 14:53:15 2022 +0100
gdb: don't always print breakpoint location after failed condition check
and is a result of feedback received here[1].
The original commit addressed a problem where, if a breakpoint
condition included an inferior function call, and if the inferior
function call failed, then GDB would announce the stop twice. Here's
an example of GDB's output before the above commit that shows the
problem being addressed:
(gdb) break foo if (some_func ())
Breakpoint 1 at 0x40111e: file bpcond.c, line 11.
(gdb) r
Starting program: /tmp/bpcond
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401116 in some_func () at bpcond.c:5
5 return *p;
Error in testing condition for breakpoint 1:
The program being debugged stopped while in a function called from GDB.
Evaluation of the expression containing the function
(some_func) will be abandoned.
When the function is done executing, GDB will silently stop.
Breakpoint 1, 0x0000000000401116 in some_func () at bpcond.c:5
5 return *p;
(gdb)
The original commit addressed this issue in breakpoint.c, by spotting
that the $pc had changed while evaluating the breakpoint condition,
and inferring from this that GDB must have stopped elsewhere.
However, the way in which the original commit suppressed the second
stop announcement was to set bpstat::print to true -- this tells GDB
not to print the frame during the stop announcement, and for the CLI
this is fine, however, it was pointed out that for the MI this still
isn't really enough. Below is an example from an MI session after the
above commit was applied, this shows the problem with the above
commit:
-break-insert -c "cond_fail()" foo
^done,bkpt={number="1",type="breakpoint",disp="keep",enabled="y",addr="0x000000000040111e",func="foo",file="/tmp/mi-condbreak-fail.c",line="30",thread-groups=["i1"],cond="cond_fail()",times="0",original-location="foo"}
(gdb)
-exec-run
=thread-group-started,id="i1",pid="2636270"
=thread-created,id="1",group-id="i1"
=library-loaded,id="/lib64/ld-linux-x86-64.so.2",target-name="/lib64/ld-linux-x86-64.so.2",host-name="/lib64/ld-linux-x86-64.so.2",symbols-loaded="0",thread-group="i1",ranges=[{from="0x00007ffff7fd3110",to="0x00007ffff7ff2bb4"}]
^running
*running,thread-id="all"
(gdb)
=library-loaded,id="/lib64/libm.so.6",target-name="/lib64/libm.so.6",host-name="/lib64/libm.so.6",symbols-loaded="0",thread-group="i1",ranges=[{from="0x00007ffff7e59390",to="0x00007ffff7ef4f98"}]
=library-loaded,id="/lib64/libc.so.6",target-name="/lib64/libc.so.6",host-name="/lib64/libc.so.6",symbols-loaded="0",thread-group="i1",ranges=[{from="0x00007ffff7ca66b0",to="0x00007ffff7df3c5f"}]
~"\nProgram"
~" received signal SIGSEGV, Segmentation fault.\n"
~"0x0000000000401116 in cond_fail () at /tmp/mi-condbreak-fail.c:24\n"
~"24\t return *p;\t\t\t/* Crash here. */\n"
*stopped,reason="signal-received",signal-name="SIGSEGV",signal-meaning="Segmentation fault",frame={addr="0x0000000000401116",func="cond_fail",args=[],file="/tmp/mi-condbreak-fail.c",fullname="/tmp/mi-condbreak-fail.c",line="24",arch="i386:x86-64"},thread-id="1",stopped-threads="all",core="9"
&"Error in testing condition for breakpoint 1:\n"
&"The program being debugged was signaled while in a function called from GDB.\n"
&"GDB remains in the frame where the signal was received.\n"
&"To change this behavior use \"set unwindonsignal on\".\n"
&"Evaluation of the expression containing the function\n"
&"(cond_fail) will be abandoned.\n"
&"When the function is done executing, GDB will silently stop.\n"
=breakpoint-modified,bkpt={number="1",type="breakpoint",disp="keep",enabled="y",addr="0x000000000040111e",func="foo",file="/tmp/mi-condbreak-fail.c",fullname="/tmp/mi-condbreak-fail.c",line="30",thread-groups=["i1"],cond="cond_fail()",times="1",original-location="foo"}
*stopped
(gdb)
Notice that we still see two '*stopped' lines, the first includes the
full frame information, while the second has no frame information,
this is a result of bpstat::print having been set. Ideally, the
second '*stopped' line should not be present.
By setting bpstat::print I was addressing the problem too late, this
flag really only changes how interp::on_normal_stop prints the stop
event, and interp::on_normal_stop is called (indirectly) from the
normal_stop function in infrun.c. A better solution is to avoid
calling normal_stop at all for the stops which should not be reported
to the user, and this is what I do in this commit.
This commit has 3 parts:
1. In breakpoint.c, revert the above commit,
2. In fetch_inferior_event (infrun.c), capture the stop-id before
calling handle_inferior_event. If, after calling
handle_inferior_event, the stop-id has changed, then this indicates
that somewhere within handle_inferior_event, a stop was announced to
the user. If this is the case then GDB should not call normal_stop,
and we should rely on whoever announced the stop to ensure that we
are in a PROMPT_NEEDED state, which means the prompt will be
displayed once fetch_inferior_event returns. And,
3. In infcall.c, do two things:
(a) In run_inferior_call, after making the inferior call, ensure
that either async_disable_stdin or async_enable_stdin is called
to put the prompt state, and stdin handling into the correct
state based on whether the inferior call completed successfully
or not, and
(b) In call_thread_fsm::should_stop, call async_enable_stdin
rather than changing the prompt state directly. This isn't
strictly necessary, but helped me understand this code more.
This async_enable_stdin call is only reached if normal_stop is
not going to be called, and replaces the async_enable_stdin call
that exists in normal_stop. Though we could just adjust the
prompt state if felt (to me) much easier to understand when I
could see this call and the corresponding call in normal_stop.
With these changes in place now, when the inferior call (from the
breakpoint condition) fails, infcall.c leaves the prompt state as
PROMPT_NEEDED, and leaves stdin registered with the event loop.
Back in fetch_inferior_event GDB notices that the stop-id has changed
and so avoids calling normal_stop.
And on return from fetch_inferior_event GDB will display the prompt
and handle input from stdin.
As normal_stop is not called the MI problem is solved, and the test
added in the earlier mentioned commit still passes just fine, so the
CLI has not regressed.
[1] https://inbox.sourceware.org/gdb-patches/6fd4aa13-6003-2563-5841-e80d5a55d59e@palves.net/
|
|
While working on some of the recent patches relating to vfork handling
I found this debug output very helpful, I'd like to propose adding
this into GDB.
With debug turned off there should be no user visible changes after
this commit.
|
|
Like the last few commit, this fixes yet another vfork related issue.
Like the commit titled:
gdb: don't restart vfork parent while waiting for child to finish
which addressed a case in linux-nat where we would try to resume a
vfork parent, this commit addresses a very similar case, but this time
occurring in infrun.c. Just like with that previous commit, this bug
results in the assert:
x86-linux-dregs.c:146: internal-error: x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.
In this case the issue occurs when target-non-stop is on, but non-stop
is off, and again, schedule-multiple is on. As with the previous
commit, GDB is in follow-fork-mode child. If you have not done so, it
is worth reading the earlier commit as many of the problems leading to
the failure are the same, they just appear in a different part of GDB.
Here are the steps leading to the assertion failure:
1. The user performs a 'next' over a vfork, GDB stop in the vfork
child,
2. As we are planning to follow the child GDB sets the vfork_parent
and vfork_child member variables in the two inferiors, the
thread_waiting_for_vfork_done member is left as nullptr, that member
is only used when GDB is planning to follow the parent inferior,
3. The user does 'continue', our expectation is that the vfork child
should resume, and once that process has exited or execd, GDB should
detach from the vfork parent. As a result of the 'continue' GDB
eventually enters the proceed function,
4. In proceed we selected a ptid_t to resume, because
schedule-multiple is on we select minus_one_ptid (see
user_visible_resume_ptid),
5. As GDB is running in all-stop on top of non-stop mode, in the
proceed function we iterate over all threads that match the resume
ptid, which turns out to be all threads, and call
proceed_resume_thread_checked. One of the threads we iterate over
is the vfork parent thread,
6. As the thread passed to proceed_resume_thread_checked doesn't
match any of the early return conditions, GDB will set the thread
resumed,
7. As we are resuming one thread at a time, this thread is seen by
the lower layers (e.g. linux-nat) as the "event thread", which means
we don't apply any of the checks, e.g. is this thread a
vfork parent, instead we assume that GDB core knows what it's doing,
and linux-nat will resume the thread, we have now incorrectly set
running the vfork parent thread when this thread should be waiting
for the vfork child to complete,
8. Back in the proceed function GDB continues to iterate over all
threads, and now (correctly) resumes the vfork child thread,
8. As the vfork child is still alive the kernel holds the vfork
parent stopped,
9. Eventually the child performs its exec and GDB is sent and EXECD
event. However, because the parent is resumed, as soon as the child
performs its exec the vfork parent also sends a VFORK_DONE event to
GDB,
10. Depending on timing both of these events might seem to arrive in
GDB at the same time. Normally GDB expects to see the EXECD or
EXITED/SIGNALED event from the vfork child before getting the
VFORK_DONE in the parent. We know this because it is as a result of
the EXECD/EXITED/SIGNALED that GDB detaches from the parent (see
handle_vfork_child_exec_or_exit for details). Further the comment
in target/waitstatus.h on TARGET_WAITKIND_VFORK_DONE indicates that
when we remain attached to the child (not the parent) we should not
expect to see a VFORK_DONE,
11. If both events arrive at the same time then GDB will randomly
choose one event to handle first, in some cases this will be the
VFORK_DONE. As described above, upon seeing a VFORK_DONE GDB
expects that (a) the vfork child has finished, however, in this case
this is not completely true, the child has finished, but GDB has not
processed the event associated with the completion yet, and (b) upon
seeing a VFORK_DONE GDB assumes we are remaining attached to the
parent, and so resumes the parent process,
12. GDB now handles the EXECD event. In our case we are detaching
from the parent, so GDB calls target_detach (see
handle_vfork_child_exec_or_exit),
13. While this has been going on the vfork parent is executing, and
might even exit,
14. In linux_nat_target::detach the first thing we do is stop all
threads in the process we're detaching from, the result of the stop
request will be cached on the lwp_info object,
15. In our case the vfork parent has exited though, so when GDB
waits for the thread, instead of a stop due to signal, we instead
get a thread exited status,
16. Later in the detach process we try to resume the threads just
prior to making the ptrace call to actually detach (see
detach_one_lwp), as part of the process to resume a thread we try to
touch some registers within the thread, and before doing this GDB
asserts that the thread is stopped,
17. An exited thread is not classified as stopped, and so the assert
triggers!
Just like with the earlier commit, the fix is to spot the vfork parent
status of the thread, and not resume such threads. Where the earlier
commit fixed this in linux-nat, in this case I think the fix should
live in infrun.c, in proceed_resume_thread_checked. This function
already has a similar check to not resume the vfork parent in the case
where we are planning to follow the vfork parent, I propose adding a
similar case that checks for the vfork parent when we plan to follow
the vfork child.
This new check will mean that at step #6 above GDB doesn't try to
resume the vfork parent thread, which prevents the VFORK_DONE from
ever arriving. Once GDB sees the EXECD/EXITED/SIGNALLED event from
the vfork child GDB will detach from the parent.
There's no test included in this commit. In a subsequent commit I
will expand gdb.base/foll-vfork.exp which is when this bug would be
exposed.
If you do want to reproduce this failure then you will for certainly
need to run the gdb.base/foll-vfork.exp test in a loop as the failures
are all very timing sensitive. I've found that running multiple
copies in parallel makes the failure more likely to appear, I usually
run ~6 copies in parallel and expect to see a failure after within
10mins.
|
|
Split the thread resuming code from proceed into new function
proceed_resume_thread_checked.
Co-Authored-By: Christina Schimpe <christina.schimpe@intel.com>
|
|
This commit fixes a bug introduced by this commit:
commit d8bbae6ea080249c05ca90b1f8640fde48a18301
Date: Fri Jan 14 15:40:59 2022 -0500
gdb: fix handling of vfork by multi-threaded program (follow-fork-mode=parent, detach-on-fork=on)
The problem can be seen in this GDB session:
$ gdb -q
(gdb) set non-stop on
(gdb) file ./gdb/testsuite/outputs/gdb.base/foll-vfork/foll-vfork
Reading symbols from ./gdb/testsuite/outputs/gdb.base/foll-vfork/foll-vfork...
(gdb) tcatch vfork
Catchpoint 1 (vfork)
(gdb) run
Starting program: /tmp/gdb/testsuite/outputs/gdb.base/foll-vfork/foll-vfork
Temporary catchpoint 1 (vforked process 1375914), 0x00007ffff7d5043c in vfork () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff7d5043c in vfork () from /lib64/libc.so.6
#1 0x00000000004011af in main (argc=1, argv=0x7fffffffad88) at .../gdb/testsuite/gdb.base/foll-vfork.c:32
(gdb) finish
Run till exit from #0 0x00007ffff7d5043c in vfork () from /lib64/libc.so.6
[Detaching after vfork from child process 1375914]
No unwaited-for children left.
(gdb)
Notice the "No unwaited-for children left." error. This is incorrect,
given where we are stopped there's no reason why we shouldn't be able
to use "finish" to return to the main frame.
When the inferior is stopped as a result of the 'tcatch vfork', the
inferior is in the process of performing the vfork, that is, GDB has
seen the VFORKED event, but has not yet attached to the new child
process, nor has the child process been resumed.
However, GDB has seen the VFORKED, and, as we are going to follow the
parent process, the inferior for the vfork parent will have its
thread_waiting_for_vfork_done member variable set, this will point to
the one and only thread that makes up the vfork parent process.
When the "finish" command is used GDB eventually ends up in the
proceed function (in infrun.c), in here we pass through all the
function until we eventually encounter this 'else if' condition:
else if (!cur_thr->resumed ()
&& !thread_is_in_step_over_chain (cur_thr)
/* In non-stop, forbid resuming a thread if some other thread of
that inferior is waiting for a vfork-done event (this means
breakpoints are out for this inferior). */
&& !(non_stop
&& cur_thr->inf->thread_waiting_for_vfork_done != nullptr))
{
The first two of these conditions will both be true, the thread is not
already resumed, and is not in the step-over chain, however, the third
condition, this one:
&& !(non_stop
&& cur_thr->inf->thread_waiting_for_vfork_done != nullptr))
is false, and this prevents the thread we are trying to finish from
being resumed. This condition is false because (a) non_stop is true,
and (b) cur_thr->inf->thread_waiting_for_vfork_done is not
nullptr (see above for why).
Now, if we check the comment embedded within the condition it says:
/* In non-stop, forbid resuming a thread if some other thread of
that inferior is waiting for a vfork-done event (this means
breakpoints are out for this inferior). */
And this makes sense, if we have a vfork parent with two thread, and
one thread has performed a vfork, then we shouldn't try to resume the
second thread.
However, if we are trying to resume the thread that actually performed
a vfork, then this is fine. If we never resume the vfork parent then
we'll never get a VFORK_DONE event, and so the vfork will never
complete.
Thus, the condition should actually be:
&& !(non_stop
&& cur_thr->inf->thread_waiting_for_vfork_done != nullptr
&& cur_thr->inf->thread_waiting_for_vfork_done != cur_thr))
This extra check will allow the vfork parent thread to resume, but
prevent any other thread in the vfork parent process from resuming.
This is the same condition that already exists in the all-stop on a
non-stop-target block earlier in the proceed function.
My actual fix is slightly different to the above, first, I've chosen
to use a nested 'if' check instead of extending the original 'else if'
check, this makes it easier to write a longer comment explaining
what's going on, and second, instead of checking 'non_stop' I've
switched to checking 'target_is_non_stop_p'. In this context this is
effectively the same thing, a previous 'else if' block in proceed
already handles '!non_stop && target_is_non_stop_p ()', so by the time
we get here, if 'target_is_non_stop_p ()' then we must be running in
non_stop mode.
Both of these tweaks will make the next patch easier, which is a
refactor to merge two parts of the proceed function, so this nested
'if' block is not going to exist for long.
For testing, there is no test included with this commit. The test was
exposed when using a modified version of the gdb.base/foll-vfork.exp
test script, however, there are other bugs that are exposed when using
the modified test script. These bugs will be addressed in subsequent
commits, and then I'll add the updated gdb.base/foll-vfork.exp.
If you wish to reproduce this failure then grab the updates to
gdb.base/foll-vfork.exp from the later commit and run this test, the
failure is always reproducible.
|
|
Fix some more typos:
- distinquish -> distinguish
- actualy -> actually
- singe -> single
- frash -> frame
- chid -> child
- dissassembler -> disassembler
- uninitalized -> uninitialized
- precontidion -> precondition
- regsiters -> registers
- marge -> merge
- sate -> state
- garanteed -> guaranteed
- explictly -> explicitly
- prefices (nonstandard plural) -> prefixes
- bondary -> boundary
- formated -> formatted
- ithe -> the
- arrav -> array
- coresponding -> corresponding
- owend -> owned
- fials -> fails
- diasm -> disasm
- ture -> true
- tpye -> type
There's one code change, the name of macro SIG_CODE_BONDARY_FAULT changed to
SIG_CODE_BOUNDARY_FAULT.
Tested on x86_64-linux.
|
|
Same idea as previous patches, but for about_to_proceed. We only need
(and want, as far as the mi_interp implementation is concerned) to
notify the interpreter that caused the proceed.
Change-Id: Id259bca10dbc3d43d46607ff7b95243a9cbe2f89
|
|
Same as previous patches, but for user_selected_context_changed.
Change-Id: I40de15be897671227d4bcf3e747f0fd595f0d5be
|
|
Same as previous patches, but for sync_execution_done. Except that
here, we only want to notify the interpreter that is executing the
command, not all interpreters.
Change-Id: I729c719447b5c5f29af65dbf6fed9132e2cd308b
|
|
Same as previous patches, but for no_history.
Change-Id: I06930fe7cb4082138c6c5496c5118fe4951c10da
|
|
Same as previous patch, but for exited. Remove the exited observable,
since nothing uses it anymore, and we don't have anything coming that
will use it.
Change-Id: I358cbea0159af56752dfee7510d6a86191e722bb
|
|
Same as previous patch, but for signal_exited. Remove the signal_exited
observable, since nothing uses it anymore, and we don't have anything
coming that will use it.
Change-Id: I0dca1eab76338bf27be755786e3dad3241698b10
|
|
Same idea as the previous patch, but for the normal_stop event.
Change-Id: I4fc8ca8a51c63829dea390a2b6ce30b77f9fb863
|
|
Instead of having the interpreter code registering observers for the
signal_received observable, add a "signal_received" virtual method to
struct interp. Add a interps_notify_signal_received function that loops
over all UIs and calls the signal_received method on the interpreter.
Finally, add a notify_signal_received function that calls
interps_notify_signal_received and then notifies the observers. Replace
all existing notifications to the signal_received observers with calls
to notify_signal_received.
Before this patch, the CLI and MI code both register a signal_received
observer. These observer go over all UIs, and, for those that have a
interpreter of the right kind, print the stop notifiation.
After this patch, we have just one "loop over all UIs", inside
interps_notify_signal_received. Since the interp::on_signal_received
method gets called once for each interpreter, the implementations only
need to deal with the current interpreter (the "this" pointer).
The motivation for this patch comes from a future patch, that makes the
amdgpu code register an observer to print a warning after the CLI's
signal stop message. Since the amdgpu and the CLI code both use
observers, the order of the two messages is not stable, unless we define
the priority using the observer dependency system. However, the
approach of using virtual methods on the interpreters seems like a good
change anyway, I think it's more straightforward and simple to
understand than the current solution that uses observers. We are sure
that the amdgpu message gets printed after the CLI message, since
observers are notified after interpreters.
Keep the signal_received, even if nothing uses if, because we will be
using it in the upcoming amdgpu patch implementing the warning described
above.
Change-Id: I4d8614bb8f6e0717f4bfc2a59abded3702f23ac4
|
|
Add convenience first_loc methods to struct breakpoint (const and
non-const overloads). A subsequent patch changes the list of locations
to be an intrusive_list and makes the actual list private, so these
spots would need to change from:
b->loc
to something ugly like:
*b->locations ().begin ()
That would make the code much heavier and not readable. There is a
surprisingly big number of places that access the first location of
breakpoints. Whether this is correct, or these spots fail to consider
the possibility of multi-location breakpoints, I don't know. But
anyhow, I think that using this instead:
b->first_loc ()
conveys the intention better than the other two forms.
Change-Id: Ibbefe3e4ca6cdfe570351fe7e2725f2ce11d1e95
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
|
|
I'd like to move some things so they become methods on struct ui. But
first, I think that struct ui and the related things are big enough to
deserve their own file, instead of being scattered through top.{c,h} and
event-top.c.
Change-Id: I15594269ace61fd76ef80a7b58f51ff3ab6979bc
|
|
I noticed that this observable was never notified, which means we can
probably safely remove it. The notification was removed in:
commit 243a925328f8e3184b2356bee497181049c0174f
Author: Pedro Alves <palves@redhat.com>
Date: Wed Sep 9 18:23:24 2015 +0100
Replace "struct continuation" mechanism by something more extensible
print_end_stepping_range_reason in turn becomes unused, so remote it as
well.
Change-Id: If5da5149276c282d2540097c8c4327ce0f70431a
|
|
The problem explained and fixed in the previous patch could have also
been fixed by this patch. But I think it's good change anyhow, that
could prevent future bugs, so here it is.
fetch_inferior_event switches to an arbitrary (in practice, the first) inferior
of the process target of the inferior used to fetch the event. The idea is
that the event handling code will need to do some target calls, so we want to
switch to an inferior that has target target.
However, you can have two inferiors that share a process target, but with one
inferior having an additional target on top:
inf 1 inf 2
----- -----
another target
process target process target
exec exec
Let's say inferior 2 is selected by do_target_wait and returns an event that is
really synthetized by "another target". This "another target" could be a
thread or record stratum target (in the case explained by the previous patch,
it was the arch stratum target, but it's because the amd-dbgapi abuses the arch
layer). fetch_inferior_event will then switch to the first inferior with
"process target", so inferior 1. handle_signal_stop then tries to fetch the
thread's registers:
ecs->event_thread->set_stop_pc
(regcache_read_pc (get_thread_regcache (ecs->event_thread)));
This will try to get the thread's register by calling into the current target
stack, the stack of inferior 1. This is problematic because "another target"
might have a special fetch_registers implementation.
I think it would be a good idea to switch to the inferior for which the
even was reported, not just some inferior of the same process target.
This will ensure that any target call done before we eventually call
context_switch will be done on the full target stack that reported the
event.
Not all events are associated to an inferior though. For instance,
TARGET_WAITKIND_NO_RESUMED. In those cases, some targets return
null_ptid, some return minus_one_ptid (ideally the expected return value
should be clearly defined / documented). So, if the ptid returned is
either of these, switch to an arbitrary inferior with that process
target, as before.
Change-Id: I1ffc8c1095125ab591d0dc79ea40025b1d7454af
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
With the following patch, which teaches the amd-dbgapi target to handle
inferiors that fork, we end up with target stacks in the following
state, when an inferior that does not use the GPU forks an inferior that
eventually uses the GPU.
inf 1 inf 2
----- -----
amd-dbgapi
linux-nat linux-nat
exec exec
When a GPU thread from inferior 2 hits a breakpoint, the following
sequence of events would happen, if it was not for the current patch.
- we start with inferior 1 as current
- do_target_wait_1 makes inferior 2 current, does a target_wait, which
returns a stop event for an amd-dbgapi wave (thread).
- do_target_wait's scoped_restore_current_thread restores inferior 1 as
current
- fetch_inferior_event calls switch_to_target_no_thread with linux-nat
as the process target, since linux-nat is officially the process
target of inferior 2. This makes inferior 1 the current inferior, as
it's the first inferior with that target.
- In handle_signal_stop, we have:
ecs->event_thread->suspend.stop_pc
= regcache_read_pc (get_thread_regcache (ecs->event_thread));
context_switch (ecs);
regcache_read_pc executes while inferior 1 is still the current one
(because it's before the `context_switch`). This is a problem,
because the regcache is for a ptid managed by the amd-dbgapi target
(e.g. (12345, 1, 1)), a ptid that does not make sense for the
linux-nat target. The fetch_registers target call goes directly
to the linux-nat target, which gets confused.
- We would then get an error like:
Couldn't get extended state status: No such process.
... since linux-nat tries to do a ptrace call on tid 1.
GDB should switch to the inferior the ptid belongs to before doing the
target call to fetch registers, to make sure the call hits the right
target stack (it should be handled by the amd-dbgapi target in this
case). In fact the following patch does this change, and it would be
enough to fix this specific problem.
However, I propose to change regcache to make it switch to the right
inferior, if needed, before doing target calls. That makes the
interface as a whole more independent of the global context.
My first attempt at doing this was to find an inferior using the process
stratum target and the ptid that regcache already knows about:
gdb::optional<scoped_restore_current_thread> restore_thread;
inferior *inf = find_inferior_ptid (this->target (), this->ptid ());
if (inf != current_inferior ())
{
restore_thread.emplace ();
switch_to_inferior_no_thread (inf);
}
However, this caused some failures in fork-related tests and gdbserver
boards. When we detach a fork child, we may create a regcache for the
child, but there is no corresponding inferior. For instance, to restore
the PC after a displaced step over the fork syscall. So
find_inferior_ptid would return nullptr, and
switch_to_inferior_no_thread would hit a failed assertion.
So, this patch adds to regcache the information "the inferior to switch
to to makes target calls". In typical cases, it will be the inferior
that matches the regcache's ptid. But in some cases, like the detached
fork child one, it will be another inferior (in this example, it will be
the fork parent inferior).
The problem that we witnessed was in regcache::raw_update specifically,
but I looked for other regcache methods doing target calls, and added
the same inferior switching code to raw_write too.
In the regcache constructor and in get_thread_arch_aspace_regcache,
"inf_for_target_calls" replaces the process_stratum_target parameter.
We suppose that the process stratum target that would be passed
otherwise is the same that is in inf_for_target_calls's target stack, so
we don't need to pass both in parallel. The process stratum target is
still used as a key in the `target_pid_ptid_regcache_map` map, but
that's it.
There is one spot that needs to be updated outside of the regcache code,
which is the path that handles the "restore PC after a displaced step in
a fork child we're about to detach" case mentioned above.
regcache_test_data needs to be changed to include full-fledged mock
contexts (because there now needs to be inferiors, not just targets).
Change-Id: Id088569ce106e1f194d9ae7240ff436f11c5e123
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
In the upcoming patch to support fork in the amd-dbgapi target, the
amd-dbgapi target will need to be notified of fork events through an
observer, to attach itself (attach in the amd-dbgapi sense, not ptrace
sense) to the new inferior / process.
The reason that this can't be done through target_ops::follow_fork is
that the amd-dbgapi target isn't pushed on the inferior's target stack
right away. It attaches itself to the process and only pushes itself on
its target stack if and when the inferior initializes the ROCm runtime.
If an inferior that is not using the ROCm runtime forks, we want to be
notified of it, so we can attach to the child, and catch if the child
starts using the ROCm runtime.
So, add a new observable and notify it in follow_fork_inferior. It will
be used later in this series.
Change-Id: I67fced5a9cba6d5da72b9c7ea1c8397644ca1d54
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
The upcoming patch to support exec in the amd-dbgapi target needs to
detach amd-dbgapi from the inferior doing the exec and attach amd-dbgapi
to the inferior continuing the execution. They may or may not be the
same, depending on the `set follow-exec-mode` setting. But even if they
are the same, we need to do the detach / attach dance.
With the current observable signature, the observers only receive the
inferior in which execution continues (the "following" inferior).
Change the signature to pass both inferiors, and update all existing
observers.
Change-Id: I259d1ea09f70f43be739378d6023796f2fce2659
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
Make find_thread_ptid (the overload that takes a process_stratum_target)
a method of process_stratum_target.
Change-Id: Ib190a925a83c6b93e9c585dc7c6ab65efbdd8629
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Make find_thread_ptid (the overload that takes an inferior) a method of
struct inferior.
Change-Id: Ie5b9fa623ff35aa7ddb45e2805254fc8e83c9cd4
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
I noticed a couple of places in infrun.c where we call
set_momentary_breakpoint_at_pc, and then set the newly created
breakpoint's thread field, these are in:
insert_exception_resume_breakpoint
insert_exception_resume_from_probe
Function set_momentary_breakpoint_at_pc calls
set_momentary_breakpoint, which always creates the breakpoint as
thread-specific for the current inferior_thread().
The two insert_* functions mentioned above take an arbitrary
thread_info* as an argument and set the breakpoint::thread to hold the
thread number of that arbitrary thread.
However, the insert_* functions store the breakpoint pointer within
the current inferior_thread(), so we know that the thread being passed
in must be the currently selected thread.
What this means is that we can:
1. Assert that the thread being passed in is the currently selected
thread, and
2. No longer adjust the breakpoint::thread field, this will already
have been set correctly be calling set_momentary_breakpoint_at_pc.
There should be no user visible changes after this commit.
|
|
It was pointed out during review of another patch that the function
displaced_step_dump_bytes really isn't specific to displaced stepping,
and should really get a more generic name and move into gdbsupport/.
This commit does just that. The function is renamed to
bytes_to_string and is moved into gdbsupport/common-utils.{cc,h}. The
function implementation doesn't really change. Much...
... I have updated the function to take an array view, which makes it
slightly easier to call in a couple of places where we already have a
gdb::bytes_vector. I've then added an inline wrapper to convert a raw
pointer and length into an array view, which is used in places where
we don't easily have a gdb::bytes_vector (or similar).
Updated all users of displaced_step_dump_bytes.
There should be no user visible changes after this commit.
Finally, I ended up having to add an include of gdb_assert.h into
array-view.h. When I include array-view.h into common-utils.h I ran
into build problems because array-view.h calls gdb_assert.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
While investigating a displaced stepping issue I wanted an easy way to
see what GDB thought the original instruction was, and what
instruction GDB replaced that with when performing the displaced step.
We do print out the address that is being stepped, so I can track down
the original instruction, I just need to go find the information
myself.
And we do print out the bytes of the new instruction, so I can figure
out what the replacement instruction was, but it's not really easy.
Also, the code that prints the bytes of the replacement instruction
only prints 4 bytes, which clearly isn't always going to be correct.
In this commit I remove the existing code that prints the bytes of the
replacement instruction, and add two new blocks of code to
displaced_step_prepare_throw. This new code prints the original
instruction, and the replacement instruction. In each case we print
both the bytes that make up the instruction and the completely
disassembled instruction.
Here's an example of what the output looks like on x86-64 (this is
with 'set debug displaced on'). The two interesting lines contain the
strings 'original insn' and 'replacement insn':
(gdb) step
[displaced] displaced_step_prepare_throw: displaced-stepping 2892655.2892655.0 now
[displaced] displaced_step_prepare_throw: original insn 0x401030: ff 25 e2 2f 00 00 jmp *0x2fe2(%rip) # 0x404018 <puts@got.plt>
[displaced] prepare: selected buffer at 0x401052
[displaced] prepare: saved 0x401052: 1e fa 31 ed 49 89 d1 5e 48 89 e2 48 83 e4 f0 50
[displaced] fixup_riprel: %rip-relative addressing used.
[displaced] fixup_riprel: using temp reg 2, old value 0x7ffff7f8a578, new value 0x401036
[displaced] amd64_displaced_step_copy_insn: copy 0x401030->0x401052: ff a1 e2 2f 00 00 68 00 00 00 00 e9 e0 ff ff ff
[displaced] displaced_step_prepare_throw: prepared successfully thread=2892655.2892655.0, original_pc=0x401030, displaced_pc=0x401052
[displaced] displaced_step_prepare_throw: replacement insn 0x401052: ff a1 e2 2f 00 00 jmp *0x2fe2(%rcx)
[displaced] finish: restored 2892655.2892655.0 0x401052
[displaced] amd64_displaced_step_fixup: fixup (0x401030, 0x401052), insn = 0xff 0xa1 ...
[displaced] amd64_displaced_step_fixup: restoring reg 2 to 0x7ffff7f8a578
0x00007ffff7e402c0 in puts () from /lib64/libc.so.6
(gdb)
One final note. For many targets that support displaced stepping (in
fact all targets except ARM) the replacement instruction is always a
single instruction. But on ARM the replacement could actually be a
series of instructions.
The debug code tries to handle this by disassembling the entire
displaced stepping buffer. Obviously this might actually print more
than is necessary, but there's (currently) no easy way to know how
many instructions to disassemble; that knowledge is all locked in the
architecture specific code. Still I don't think it really hurts, if
someone is looking at this debug then hopefully they known what to
expect.
Obviously we can imagine schemes where the architecture specific
displaced stepping code could communicate back how many bytes its
replacement sequence was, and then our debug print code could use this
to limit the disassembly. But this seems like a lot of effort just to
save printing a few additional instructions in some debug output.
I'm not proposing to do anything about this issue for now.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This commit tweaks displaced_step_finish & friends to pass down a
target_waitstatus instead of a gdb_signal. This is needed because a
patch later in the step-over-{thread-exit,clone] series will want to
make displaced_step_buffers::finish handle
TARGET_WAITKIND_THREAD_EXITED. It also helps with the
TARGET_WAITKIND_THREAD_CLONED patch later in that same series.
It's also a bit more logical this way, as we don't have to pass down
signals when the thread didn't actually stop for a signal. So we can
also think of it as a clean up.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
Change-Id: I4c5d338647b028071bc498c4e47063795a2db4c0
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
gdb.reverse/finish-reverse.exp
PPC64 multiple entry points, a normal entry point and an alternate entry
point. The alternate entry point is to setup the Table of Contents (TOC)
register before continuing at the normal entry point. When the TOC is
already valid, the normal entry point is used, this is typically the case.
The alternate entry point is typically referred to as the global entry
point (GEP) in IBM. The normal entry point is typically referred to as
the local entry point (LEP).
When GDB is executing the finish command in reverse, the function
finish_backward currently sets the break point at the alternate entry point.
This issue is if the function, when executing in the forward direction,
entered the function via the normal entry point, execution in the reverse
direction will never sees the break point at the alternate entry point. In
this case, the reverse execution continues until the next break point is
encountered thus stopping at the wrong place.
This patch adds a new address to struct execution_control_state to hold the
address of the alternate entry point (GEP). The finish_backwards function
is updated, if the stopping point is between the normal entry point (LEP)
and the end of the function, a breakpoint is set at the normal entry point.
If the stopping point is between the entry points, a breakpoint is set at
the alternate entry point. This ensures that GDB will always stop at the
normal entry point. If the function did enter via the alternate entry
point, GDB will detect that and continue to execute backwards in the
function until the alternate entry point is reached.
The patch fixes the behavior of the reverse-finish command on PowerPC to
match the behavior of the command on other platforms, specifically X86.
The patch does not change the behavior of the command on X86.
A new test is added to verify the reverse-finish command on PowerPC
correctly stops at the instruction where the function call is made.
The patch fixes 11 regression errors in test gdb.reverse/finish-precsave.exp
and 11 regression errors in test gdb.reverse/finish-reverse.exp.
The patch has been tested on Power 10 and X86 processor with no new
regression failures.
|
|
Replace spaces with tabs in a bunch of places.
Change-Id: If0f87180f1d13028dc178e5a8af7882a067868b0
|
|
As described in the previous commit for this series, I became
concerned that there might be instances in which a QUIT (due to either
a SIGINT or SIGTERM) might not cause execution to return to the top
level. In some (though very few) instances, it is okay to not
propagate the exception for a Ctrl-C / SIGINT, but I don't think that
it is ever okay to swallow the exception caused by a SIGTERM.
Allowing that to happen would definitely be a deviation from the
current behavior in which GDB exits upon receipt of a SIGTERM.
I looked at all cases where an exception handler catches a
gdb_exception. Handlers which did NOT need modification were those
which satisifed one or more of the following conditions:
1) There is no call path to maybe_quit() in the try block. I used a
static analysis tool to help make this determination. In
instances where the tool didn't provide an answer of "yes, this
call path can result in maybe_quit() being called", I reviewed it
by hand.
2) The catch block contains a throw for conditions that it
doesn't want to handle; these "not handled" conditions
must include the quit exception and the new "forced quit" exception.
3) There was (also) a catch for gdb_exception_quit.
Any try/catch blocks not meeting the above conditions could
potentially swallow a QUIT exception.
My first thought was to add catch blocks for gdb_exception_quit and
then rethrow the exception. But Pedro pointed out that this can be
handled without adding additional code by simply catching
gdb_exception_error instead. That's what this patch series does.
There are some oddball cases which needed to be handled differently,
plus the extension languages, but those are handled in later patches.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26761
Tested-by: Tom de Vries <tdevries@suse.de>
Approved-by: Pedro Alves <pedro@palves.net>
|
|
After the previous patches, I believe this observer isn't necessary
anymore for anything. Remove it.
Change-Id: Idb33fb6b6f55589c8c523a92169b3ca95a23d0b9
|
|
With:
- catch a fork in thread 1
- select thread 2
- set follow-fork child
- next
... follow_fork notices that thread 1 had last stopped for a fork
which hasn't been followed yet, and because thread 1 is not the
current thread, GDB aborts the execution command, presenting the stop
in thread 1.
That makes sense, as only the forking thread (thread 1) survives in
the child, so better stop and let the user decide how to proceed.
However, with:
- catch a fork in thread 1
- select thread 2
- set follow-fork parent << note difference here
- next
... GDB does the same: follow_fork notices that thread 1 had last
stopped for a fork which hasn't been followed yet, and because thread
1 is not the current thread, GDB aborts the execution command,
presenting the stop in thread 1.
Aborting/stopping in this case doesn't make sense to me. As we're
following the parent, thread 2 will still continue to exist in the
parent. What the child does after we've followed the parent shouldn't
matter -- it can go on running free, be detached, etc., depending on
"set schedule-multiple", "set detach-on-fork", etc. That does not
influence the execution command that the user issued for the parent
thread.
So this patch changes GDB in that direction -- in follow_fork, if
following the parent, and we've switched threads meanwhile, switch
back to the unfollowed thread, follow it (stay with the parent), and
don't abort/stop. If we're following a fork (as opposed to vfork),
then switch back again to the thread that the user was trying to
resume. If following a vfork, however, stay with the vforking-thread
selected, as we will need to see a vfork_done event first, before we
can resume any other thread.
As I was working on this, I managed to end up calling target_resume
for a solo-thread resume (to collect the vfork_done event), with
scope_ptid pointing at the vfork parent thread, and inferior_ptid
pointing to the vfork child. For a solo-thread resume, the scope_ptid
argument to target_resume must the same as inferior_ptid. The mistake
was caught by the assertion in target_resume, like so:
...
[infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [1722839.1722839.0] at 0x5555555553c3
[infrun] do_target_resume: resume_ptid=1722839.1722939.0, step=0, sig=GDB_SIGNAL_0
../../src/gdb/target.c:2661: internal-error: target_resume: Assertion `inferior_ptid.matches (scope_ptid)' failed.
...
but I think it doesn't hurt to catch such a mistake earlier, hence the
change in internal_resume_ptid.
Change-Id: I896705506a16d2488b1bfb4736315dd966f4e412
|
|
Currently, if
- you're in all-stop mode,
- the inferior last stopped because of a fork catchpoint,
when you next resume the program, gdb checks whether it had last
stopped for a fork/vfork, and if so,
a) if the current thread is the one that forked, gdb follows the
parent/child, depending on "set follow-fork" mode.
b) if the current thread is some other thread (because you switched
threads meanwhile), gdb switches back to that thread, gdb follows
the parent/child, and stops the resumption command.
There's a problem in b), however -- if you have "set schedule-multiple
off", which is the default, or "set scheduler-locking on", gdb will
still switch back to the forking thread, even if you didn't want to
resume it. For example, with:
(gdb) catch fork
(gdb) c
* thread 1 stops for fork
(gdb) thread 2
(gdb) set scheduler-locking on
(gdb) c
gdb switches back to thread 1, and follows the fork.
Or with:
(gdb) add-inferior -exec prog
(gdb) inferior 2
(gdb) start
(gdb) inferior 1
(gdb) catch fork
(gdb) c
* thread 1.1 stops for fork
(gdb) inferior 2
(gdb) set schedule-multiple off # this is the default
(gdb) c
gdb switches back to thread 1.1, and follows the fork.
Another issue is that, because follow_fork relies on
get_last_target_status to find the thread that has a pending fork, it
is possible to confuse it. For example, "run" or "start" call
init_wait_for_inferior, which clears the last target status, so this:
(gdb) catch fork
(gdb) c
* thread 1 stops for fork
(gdb) add-inferior -exec prog
(gdb) inferior 2
(gdb) start
(gdb) set follow-fork child
(gdb) inferior 1
(gdb) n
... does not follow to the fork child of inferior 1, because the
get_last_target_status call in follow_fork doesn't return a
TARGET_WAITKIND_FORKED. Thanks to Simon for this example.
All of the above are fixed by this patch. It changes follow_fork to
not look at get_last_target_status, but to instead iterate over the
set of threads that the user is resuming, and find the one that has a
pending_follow kind of fork/vfork.
gdb.base/foll-fork.exp is augmented to exercise the last "start"
scenario described above. The other cases will be exercised in the
testcase added by the following patch.
Change-Id: Ifcca77e7b2456277387f40660ef06cec2b93b97e
|
|
With gdb.base/catch-follow-exec.exp, we currently see:
~~~~~~~~~~~~~~~
(gdb)
continue
Continuing.
process 693251 is executing new program: /usr/bin/ls
[New inferior 2]
[New process 693251]
[Switching to process 693251]
Thread 2.1 "ls" hit Catchpoint 2 (exec'd /usr/bin/ls), 0x00007ffff7fd0100 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb)
info prog
No selected thread.
~~~~~~~~~~~~~~~
Note the "No selected thread" output. That is totally bogus, because
there _is_ a selected thread. What GDB really means, is that it can't
find the thread that had the latest (user-visible) stop. And that
happens because "info program" gets that info from
get_last_target_status, and the last target status has been cleared.
However, GDB also checks if there is a selected thread, here:
if (ptid == null_ptid || ptid == minus_one_ptid)
error (_("No selected thread."));
.. the null_ptid part. That is also bogus, because what matters is
the thread that last reported a stop, not the current thread:
- in all-stop mode, "info program" displays info about the last stop.
That may have happened on a thread different from the selected
thread.
- in non-stop mode, because all threads are controlled individually,
"info program" shows info about the last stop of the selected
thread.
The current code already behaves this way, though in a poor way. This
patch reimplements it, such that the all-stop version now finds the
thread that last reported an event via the 'previous_thread' strong
reference. Being a strong reference means that if that thread has
exited since the event was reported, 'previous_thread' will still
point to it, so we can say that the thread exited meanwhile.
The patch also extends "info program" output a little, to let the user
know which thread we are printing info for. For example, for the
gdb.base/catch-follow-exec.exp case we shown above, we now get:
(gdb) info prog
Last stopped for thread 2.1 (process 710867).
Using the running image of child process 710867.
Program stopped at 0x7ffff7fd0100.
It stopped at breakpoint 2.
Type "info stack" or "info registers" for more information.
(gdb)
while in non-stop mode, we get:
(gdb) info prog
Selected thread 2.1 (process 710867).
Using the running image of child process 710867.
Program stopped at 0x7ffff7fd0100.
It stopped at breakpoint 2.
Type "info stack" or "info registers" for more information.
(gdb)
In both cases, the first line of output is new.
The existing code considered these running/exited cases as an error,
but I think that that's incorrect, since this is IMO just plain
execution info as well. So the patch makes those cases regular
prints, not errors.
If the thread is running, we get, in non-stop mode:
(gdb) info prog
Selected thread 2.1 (process 710867).
Selected thread is running.
... and in all-stop:
(gdb) info prog
Last stopped for thread 2.1 (process 710867).
Thread is now running.
If the thread has exited, we get, in non-stop mode:
(gdb) info prog
Selected thread 2.1 (process 710867).
Selected thread has exited.
... and in all-stop:
(gdb) info prog
Last stopped for thread 2.1 (process 710867).
Thread has since exited.
The gdb.base/info-program.exp testcase was much extended to test
all-stop/non-stop and single-threaded/multi-threaded.
Change-Id: I51d9d445f772d872af3eead3449ad4aa445781b1
|
|
I originally wrote this patch, because while working on some other
patch, I spotted a regression in the
gdb.multi/multi-target-no-resumed.exp.exp testcase. Debugging the
issue, I realized that the problem was related to how I was using
previous_inferior_ptid to look up the thread the user had last
selected. The problem is that previous_inferior_ptid alone doesn't
tell you which target that ptid is from, and I was just always using
the current target, which was incorrect. Two different targets may
have threads with the same ptid.
I decided to fix this by replacing previous_inferior_ptid with a
strong reference to the thread, called previous_thread.
I have since found a new motivation for this change -- I would like to
tweak "info program" to not rely on get_last_target_status returning a
ptid that still exists in the thread list. With both the follow_fork
changes later in this series, and the step-over-thread-exit changes,
that can happen, as we'll delete threads and not clear the last
waitstatus.
A new update_previous_thread function is added that can be used to
update previous_thread from inferior_ptid. This must be called in
several places that really want to get rid of previous_thread thread,
and reset the thread id counter, otherwise we get regressions like
these:
(gdb) info threads -gid
Id GId Target Id Frame
- * 1 1 Thread 2974541.2974541 "tids-gid-reset" main () at src/gdb/testsuite/gdb.multi/tids-gid-reset.c:21
- (gdb) PASS: gdb.multi/tids-gid-reset.exp: single-inferior: after restart: info threads -gid
+ * 1 2 Thread 2958361.2958361 "tids-gid-reset" main () at src/gdb/testsuite/gdb.multi/tids-gid-reset.c:21
+ (gdb) FAIL: gdb.multi/tids-gid-reset.exp: single-inferior: after restart: info threads -gid
and:
Core was generated by `build/gdb/testsuite/outputs/gdb.reverse/sigall-precsave/si'.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0 gen_ABRT () at src/gdb/testsuite/gdb.reverse/sigall-reverse.c:398
398 kill (getpid (), SIGABRT);
+[Current thread is 1 (LWP 2662066)]
Restored records from core file build/gdb/testsuite/outputs/gdb.reverse/sigall-precsave/sigall.precsave.
#0 gen_ABRT () at src/gdb/testsuite/gdb.reverse/sigall-reverse.c:398
398 kill (getpid (), SIGABRT);
continue
Continuing.
-Program received signal SIGABRT, Aborted.
+Thread 1 received signal SIGABRT, Aborted.
0x00007ffff7dfd55b in kill () at ../sysdeps/unix/syscall-template.S:78
78 ../sysdeps/unix/syscall-template.S: No such file or directory.
-(gdb) PASS: gdb.reverse/sigall-precsave.exp: sig-test-1: get signal ABRT
+(gdb) FAIL: gdb.reverse/sigall-precsave.exp: sig-test-1: get signal ABRT
I.e., GDB was failing to restart the thread counter back to 1, because
the previous_thread thread was being help due to the strong reference.
Tested on GNU/Linux native, gdbserver and gdbserver + "maint set
target-non-stop on".
gdb/ChangeLog:
yyyy-mm-dd Pedro Alves <pedro@palves.net>
* infcmd.c (kill_command, detach_command, disconnect_command):
Call update_previous_thread.
* infrun.c (previous_inferior_ptid): Delete.
(previous_thread): New.
(update_previous_thread): New.
(proceed, init_wait_for_inferior): Call update_previous_thread.
(normal_stop): Adjust to compare previous_thread and
inferior_thread. Call update_previous_thread.
* infrun.h (update_previous_thread): Declare.
* target.c (target_pre_inferior, target_preopen): Call
update_previous_thread.
Change-Id: I42779a1ee51a996fa1e8f6e1525c6605dbfd42c7
|
|
This removes ALL_BLOCK_SYMBOLS in favor of foreach.
|
|
I noticed that if Ctrl-C was typed just while GDB is evaluating a
breakpoint condition in the background, and GDB ends up reaching out
to the Python interpreter, then the breakpoint condition would still
fail, like:
c&
Continuing.
(gdb) Error in testing breakpoint condition:
Quit
That happens because while evaluating the breakpoint condition, we
enter Python, and end up calling PyErr_SetInterrupt (it's called by
gdbpy_set_quit_flag, in frame #0):
(top-gdb) bt
#0 gdbpy_set_quit_flag (extlang=0x558c68f81900 <extension_language_python>) at ../../src/gdb/python/python.c:288
#1 0x0000558c6845f049 in set_quit_flag () at ../../src/gdb/extension.c:785
#2 0x0000558c6845ef98 in set_active_ext_lang (now_active=0x558c68f81900 <extension_language_python>) at ../../src/gdb/extension.c:743
#3 0x0000558c686d3e56 in gdbpy_enter::gdbpy_enter (this=0x7fff2b70bb90, gdbarch=0x558c6ab9eac0, language=0x0) at ../../src/gdb/python/python.c:212
#4 0x0000558c68695d49 in python_on_memory_change (inferior=0x558c6a830b00, addr=0x555555558014, len=4, data=0x558c6af8a610 "") at ../../src/gdb/python/py-inferior.c:146
#5 0x0000558c6823a071 in std::__invoke_impl<void, void (*&)(inferior*, unsigned long, long, unsigned char const*), inferior*, unsigned long, long, unsigned char const*> (__f=@0x558c6a8ecd98: 0x558c68695d01 <python_on_memory_change(inferior*, CORE_ADDR, ssize_t, bfd_byte const*)>) at /usr/include/c++/11/bits/invoke.h:61
#6 0x0000558c68237591 in std::__invoke_r<void, void (*&)(inferior*, unsigned long, long, unsigned char const*), inferior*, unsigned long, long, unsigned char const*> (__fn=@0x558c6a8ecd98: 0x558c68695d01 <python_on_memory_change(inferior*, CORE_ADDR, ssize_t, bfd_byte const*)>) at /usr/include/c++/11/bits/invoke.h:111
#7 0x0000558c68233e64 in std::_Function_handler<void (inferior*, unsigned long, long, unsigned char const*), void (*)(inferior*, unsigned long, long, unsigned char const*)>::_M_invoke(std::_Any_data const&, inferior*&&, unsigned long&&, long&&, unsigned char const*&&) (__functor=..., __args#0=@0x7fff2b70bd40: 0x558c6a830b00, __args#1=@0x7fff2b70bd38: 93824992247828, __args#2=@0x7fff2b70bd30: 4, __args#3=@0x7fff2b70bd28: 0x558c6af8a610 "") at /usr/include/c++/11/bits/std_function.h:290
#8 0x0000558c6830a96e in std::function<void (inferior*, unsigned long, long, unsigned char const*)>::operator()(inferior*, unsigned long, long, unsigned char const*) const (this=0x558c6a8ecd98, __args#0=0x558c6a830b00, __args#1=93824992247828, __args#2=4, __args#3=0x558c6af8a610 "") at /usr/include/c++/11/bits/std_function.h:590
#9 0x0000558c6830a620 in gdb::observers::observable<inferior*, unsigned long, long, unsigned char const*>::notify (this=0x558c690828c0 <gdb::observers::memory_changed>, args#0=0x558c6a830b00, args#1=93824992247828, args#2=4, args#3=0x558c6af8a610 "") at ../../src/gdb/../gdbsupport/observable.h:166
#10 0x0000558c68309d95 in write_memory_with_notification (memaddr=0x555555558014, myaddr=0x558c6af8a610 "", len=4) at ../../src/gdb/corefile.c:363
#11 0x0000558c68904224 in value_assign (toval=0x558c6afce910, fromval=0x558c6afba6c0) at ../../src/gdb/valops.c:1190
#12 0x0000558c681e3869 in expr::assign_operation::evaluate (this=0x558c6af8e150, expect_type=0x0, exp=0x558c6afcfe60, noside=EVAL_NORMAL) at ../../src/gdb/expop.h:1902
#13 0x0000558c68450c89 in expr::logical_or_operation::evaluate (this=0x558c6afab060, expect_type=0x0, exp=0x558c6afcfe60, noside=EVAL_NORMAL) at ../../src/gdb/eval.c:2330
#14 0x0000558c6844a896 in expression::evaluate (this=0x558c6afcfe60, expect_type=0x0, noside=EVAL_NORMAL) at ../../src/gdb/eval.c:110
#15 0x0000558c6844a95e in evaluate_expression (exp=0x558c6afcfe60, expect_type=0x0) at ../../src/gdb/eval.c:124
#16 0x0000558c682061ef in breakpoint_cond_eval (exp=0x558c6afcfe60) at ../../src/gdb/breakpoint.c:4971
...
The fix is to disable cooperative SIGINT handling while handling
inferior events, so that SIGINT is saved in the global quit flag, and
not in the extension language, while handling an event.
This commit augments the testcase added by the previous commit to test
this scenario as well.
Approved-By: Tom Tromey <tom@tromey.com>
Change-Id: Idf8ab815774ee6f4b45ca2d0caaf30c9b9f127bb
|
|
This implements what I suggested here:
https://inbox.sourceware.org/gdb-patches/ab97c553-f406-b094-cdf3-ba031fdea925@palves.net/
Here is the current default quit_handler, a function that ends up
called by the QUIT macro:
void
default_quit_handler (void)
{
if (check_quit_flag ())
{
if (target_terminal::is_ours ())
quit ();
else
target_pass_ctrlc ();
}
}
As we can see above, when the inferior is running in the foreground,
then a Ctrl-C is translated into a call to target_pass_ctrlc().
The target_terminal::is_ours() case above is there to handle the
scenario where GDB has the terminal, meaning it is handling some
command the user typed, like "list", or "p a + b" or some such.
However, when the inferior is running on the background, say with
"c&", GDB also has the terminal. Run control handling is now done in
the "background". The CLI is responsive to user commands. If users
type Ctrl-C, they're expecting it to interrupt whatever command they
next type in the CLI, which again, could be "list", "p a + b", etc.
It's as if background run control was handled by a separate thread,
and the Ctrl-C is meant to go to the main thread, handling the CLI.
However, when handling an event, inside fetch_inferior_event &
friends, a Ctrl-C _also_ results in a Quit exception, from the same
default_quit_handler function shown above. This quit aborts run
control handling, breakpoint condition evaluation, etc., and may even
leave run control in an inconsistent state.
The testcase added by this patch illustrates this. The test program
just loops a number of times calling the "foo" function.
The idea is to set a breakpoint in the "foo" function with a condition
that sends SIGINT to GDB, and then evaluates to false, which results
in the program being re-resumed in the background. The SIGINT-sending
emulates pressing Ctrl-C just while GDB was evaluating the breakpoint
condition, except, it's more deterministic.
It looks like this:
(gdb) p $counter = 0
$1 = 0
(gdb) b foo if $counter++ == 10 || $_shell("kill -SIGINT `pidof gdb`") != 0
Breakpoint 2 at 0x555555555131: file gdb.base/bg-exec-sigint-bp-cond.c, line 21.
(gdb) c&
Continuing.
(gdb)
After that background continue, the breakpoint should be hit 10 times,
and we should see 10 "Quit" being printed on the screen. As if the
user typed Ctrl-C on the prompt a number of times with no inferior
running:
(gdb) <<< Ctrl-C
(gdb) Quit <<< Ctrl-C
(gdb) Quit <<< Ctrl-C
(gdb)
However, here's what you see instead:
(gdb) c&
Continuing.
(gdb) Quit
(gdb)
Just one Quit, and nothing else. If we look at the thread's state, we see:
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7ffff7d6f740 (LWP 112192) "bg-exec-sigint-" foo () at gdb.base/bg-exec-sigint-bp-cond.c:21
So the thread stopped, but we didn't report a stop...
Issuing another continue shows the same immediate-and-silent-stop:
(gdb) c&
Continuing.
(gdb) Quit
(gdb) p $counter
$2 = 2
As mentioned, since the run control handling, and breakpoint and
watchpoint evaluation, etc. are running in the background from the
perspective of the CLI, when users type Ctrl-C in this situation,
they're thinking of aborting whatever other command they were typing
or running at the prompt, not the run control side, not the previous
"c&" command.
So I think that we should install a custom quit_handler while inside
fetch_inferior_event, where we already disable pagination and other
things for a similar reason. This custom quit handler does nothing if
GDB has the terminal, and forwards Ctrl-C to the inferior otherwise.
With the patch implementing that, and the same testcase, here's what
you see instead:
(gdb) p $counter = 0
$1 = 0
(gdb) b foo if $counter++ == 10 || $_shell("kill -SIGINT `pidof gdb`") != 0
Breakpoint 2 at 0x555555555131: file gdb.base/bg-exec-sigint-bp-cond.c, line 21.
(gdb) c&
Continuing.
(gdb) Quit
(gdb) Quit
(gdb) Quit
(gdb) Quit
(gdb) Quit
(gdb) Quit
(gdb) Quit
(gdb) Quit
(gdb) Quit
(gdb) Quit
(gdb)
Breakpoint 2, foo () at gdb.base/bg-exec-sigint-bp-cond.c:21
21 return 0;
Approved-By: Tom Tromey <tom@tromey.com>
Change-Id: I1f10d99496a7d67c94b258e45963e83e439e1778
|
|
This turns many functions that are related to optimized-out or
availability-checking to be methods of value. The static function
value_entirely_covered_by_range_vector is also converted to be a
private method.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This turns value_contents_raw, value_contents_writeable, and
value_contents_all_raw into methods on value. The remaining functions
will be changed later in the series; they were a bit trickier and so I
didn't include them in this patch.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This turns allocate_computed_value into a static "constructor" of
value.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes allocate_value to be a static "constructor" of value.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes value_offset to be a method of value. Much of this patch
was written by script.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes value_type to be a method of value. Much of this patch
was written by script.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This function is now trivial, we can just copy inferior::tdesc_info
where needed.
Change-Id: I25185e2cd4ba1ef24a822d9e0eebec6e611d54d6
|
|
This reverts commit b22548ddb30bfb167708e82d3bb932461c1b703a.
This patch is being reverted since the patch series is causing regressions.
|
|
gdb.reverse/finish-reverse.exp"
This reverts commit 92e07580db6a5572573d5177ca23933064158f89.
Reverting patch as the patch series is causing regressions.
|
|
gdb.reverse/finish-reverse.exp
PR record/29927 - reverse-finish requires two reverse next instructions to
reach previous source line
PowerPC uses two entry points called the local entry point (LEP) and the
global entry point (GEP). Normally the LEP is used when calling a
function. However, if the table of contents (TOC) value in register 2 is
not valid the GEP is called to setup the TOC before execution continues at
the LEP. When executing in reverse, the function finish_backward sets the
break point at the alternate entry point (GEP). However if the forward
execution enters via the normal entry point (LEP), the reverse execution
never sees the break point at the GEP of the function. Reverse execution
continues until the next break point is encountered or the end of the
recorded log is reached causing gdb to stop at the wrong place.
This patch adds a new address to struct execution_control_state to hold the
address of the alternate function start address, known as the GEP on
PowerPC. The finish_backwards function is updated. If the stopping point
is between the two entry points (the LEP and GEP on PowerPC), the stepping
range is set to execute back to the alternate entry point (GEP on PowerPC).
Otherwise, a breakpoint is inserted at the normal entry point (LEP on
PowerPC).
Function process_event_stop_test checks uses a stepping range to stop
execution in the caller at the first instruction of the source code line.
Note, on systems that only support one entry point, the address of the two
entry points are the same.
Test finish-reverse-next.exp is updated to include tests for the
reverse-finish command when the function is entered via the normal entry
point (i.e. the LEP) and the alternate entry point (i.e. the GEP).
The patch has been tested on X86 and PowerPC with no regressions.
|