Age | Commit message (Collapse) | Author | Files | Lines |
|
Prior to this patch, it's not possible for GDB to debug GPU code in fork
children or after an exec. The amd-dbgapi target attaches to processes
when an inferior appears due to a "run" or "attach" command, but not
after a fork or exec. This patch adds support for that, such that it's
possible for an inferior to fork and for GDB to debug the GPU code in
the child.
To achieve that, use the inferior_forked and inferior_execd observers.
In the case of fork, we have nothing to do if `child_inf` is nullptr,
meaning that GDB won't debug the child. We also don't attach if the
inferior has vforked. We are already attached to the parent's address
space, which is shared with the child, so trying to attach would cause
problems. In any case, the inferior can't do anything other than exec or
exit; it certainly won't start GPU kernels before exec'ing.
In the case of exec, we detach from the exec'ing inferior and attach to
the following inferior. This works regardless of whether they are the
same or not. If they are the same, meaning the execution continues in
the existing inferior, we need to do a detach/attach anyway, as
amd-dbgapi needs to be aware of the new address space created by the
exec.
Note that we use observers and not target_ops::follow_{fork,exec} here.
When the amd-dbgapi target is compiled in, it will attach (in the
amd_dbgapi_process_attach sense, not the ptrace sense) to native
inferiors when they appear, but won't push itself on the inferior's
target stack just yet. It only pushes itself if the inferior
initializes the ROCm runtime. So, if a non-GPU-using inferior calls
fork, an amd_dbgapi_target::follow_fork method would not get called.
Same for exec. A previous version of the code had the amd-dbgapi target
pushed all the time, in which case we could use the target methods. But
we prefer having the target pushed only when necessary, it's less
intrusive when doing native debugging that doesn't involve the GPU.
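As a rough illustration of the observer-based approach described above,
here is a minimal standalone sketch. Everything in it (the `observable`
template, `inferior`, `fork_kind`, `gpu_attach_tracker`) is a hypothetical
stand-in written for this example, not GDB's actual code; it only models
the "skip if no child, skip on vfork, detach/attach on exec" decisions.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not GDB's types.
struct inferior { int num; };
enum class fork_kind { fork, vfork };

// Minimal observable: observers are notified in registration order.
template <typename... Args>
class observable
{
public:
  void attach (std::function<void (Args...)> f)
  { m_observers.push_back (std::move (f)); }

  void notify (Args... args) const
  {
    for (const auto &f : m_observers)
      f (args...);
  }

private:
  std::vector<std::function<void (Args...)>> m_observers;
};

// Records which inferiors the (mock) GPU debug library is attached to.
struct gpu_attach_tracker
{
  std::set<int> attached;

  void on_fork (inferior *parent, inferior *child, fork_kind kind)
  {
    assert (parent != nullptr);
    if (child == nullptr)
      return;                   // The child is not going to be debugged.
    if (kind == fork_kind::vfork)
      return;                   // Address space is shared with the parent.
    attached.insert (child->num);
  }

  void on_exec (inferior *exec_inf, inferior *follow_inf)
  {
    assert (exec_inf != nullptr && follow_inf != nullptr);
    // Detach then attach, even if both are the same inferior, because
    // the exec created a new address space.
    attached.erase (exec_inf->num);
    attached.insert (follow_inf->num);
  }
};
```

Because the tracker is driven by observers rather than by target methods,
it is notified regardless of what is on the inferior's target stack.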
Change-Id: I5819c151c371120da8bab2fa9cbfa8769ba1d6f9
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
The problem explained and fixed in the previous patch could have also
been fixed by this patch. But I think it's a good change anyhow, one that
could prevent future bugs, so here it is.
fetch_inferior_event switches to an arbitrary (in practice, the first) inferior
of the process target of the inferior used to fetch the event. The idea is
that the event handling code will need to do some target calls, so we want to
switch to an inferior that has that target.
However, you can have two inferiors that share a process target, but with one
inferior having an additional target on top:
  inf 1            inf 2
  -----            -----
                   another target
  process target   process target
  exec             exec
Let's say inferior 2 is selected by do_target_wait and returns an event that is
really synthesized by "another target". This "another target" could be a
thread or record stratum target (in the case explained by the previous patch,
it was the arch stratum target, but it's because the amd-dbgapi abuses the arch
layer). fetch_inferior_event will then switch to the first inferior with
"process target", so inferior 1. handle_signal_stop then tries to fetch the
thread's registers:
  ecs->event_thread->set_stop_pc
    (regcache_read_pc (get_thread_regcache (ecs->event_thread)));
This will try to get the thread's register by calling into the current target
stack, the stack of inferior 1. This is problematic because "another target"
might have a special fetch_registers implementation.
I think it would be a good idea to switch to the inferior for which the
event was reported, not just some inferior of the same process target.
This will ensure that any target call done before we eventually call
context_switch will be done on the full target stack that reported the
event.
Not all events are associated to an inferior though. For instance,
TARGET_WAITKIND_NO_RESUMED. In those cases, some targets return
null_ptid, some return minus_one_ptid (ideally the expected return value
should be clearly defined / documented). So, if the ptid returned is
either of these, switch to an arbitrary inferior with that process
target, as before.
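The selection rule described above can be sketched roughly as follows.
This is a simplified model, not the actual infrun code: a ptid is reduced
to a plain pid, and `inferior`, `null_pid`, `minus_one_pid` and
`inferior_for_event` are names made up for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model: a ptid is just a pid, with 0 and -1
// standing in for null_ptid and minus_one_ptid.
constexpr int null_pid = 0;
constexpr int minus_one_pid = -1;

struct inferior
{
  int num;
  int pid;
  int process_target;   // Identifies the (possibly shared) process target.
};

// Return the inferior to make current before handling an event.
const inferior *
inferior_for_event (const std::vector<inferior> &infs, int process_target,
                    int event_pid)
{
  assert (!infs.empty ());
  const inferior *fallback = nullptr;

  for (const inferior &inf : infs)
    {
      if (inf.process_target != process_target)
        continue;

      if (fallback == nullptr)
        fallback = &inf;        // First inferior using that target.

      // Prefer the inferior the event was actually reported for.
      if (event_pid != null_pid && event_pid != minus_one_pid
          && inf.pid == event_pid)
        return &inf;
    }

  // Event not tied to an inferior: arbitrary inferior, as before.
  return fallback;
}
```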
Change-Id: I1ffc8c1095125ab591d0dc79ea40025b1d7454af
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
With the following patch, which teaches the amd-dbgapi target to handle
inferiors that fork, we end up with target stacks in the following
state, when an inferior that does not use the GPU forks an inferior that
eventually uses the GPU.
  inf 1            inf 2
  -----            -----
                   amd-dbgapi
  linux-nat        linux-nat
  exec             exec
When a GPU thread from inferior 2 hits a breakpoint, the following
sequence of events would happen, if it was not for the current patch.
- we start with inferior 1 as current
- do_target_wait_1 makes inferior 2 current, does a target_wait, which
returns a stop event for an amd-dbgapi wave (thread).
- do_target_wait's scoped_restore_current_thread restores inferior 1 as
current
- fetch_inferior_event calls switch_to_target_no_thread with linux-nat
as the process target, since linux-nat is officially the process
target of inferior 2. This makes inferior 1 the current inferior, as
it's the first inferior with that target.
- In handle_signal_stop, we have:
ecs->event_thread->suspend.stop_pc
= regcache_read_pc (get_thread_regcache (ecs->event_thread));
context_switch (ecs);
regcache_read_pc executes while inferior 1 is still the current one
(because it's before the `context_switch`). This is a problem,
because the regcache is for a ptid managed by the amd-dbgapi target
(e.g. (12345, 1, 1)), a ptid that does not make sense for the
linux-nat target. The fetch_registers target call goes directly
to the linux-nat target, which gets confused.
- We would then get an error like:
Couldn't get extended state status: No such process.
... since linux-nat tries to do a ptrace call on tid 1.
GDB should switch to the inferior the ptid belongs to before doing the
target call to fetch registers, to make sure the call hits the right
target stack (it should be handled by the amd-dbgapi target in this
case). In fact the following patch does this change, and it would be
enough to fix this specific problem.
However, I propose to change regcache to make it switch to the right
inferior, if needed, before doing target calls. That makes the
interface as a whole more independent of the global context.
My first attempt at doing this was to find an inferior using the process
stratum target and the ptid that regcache already knows about:
  gdb::optional<scoped_restore_current_thread> restore_thread;
  inferior *inf = find_inferior_ptid (this->target (), this->ptid ());
  if (inf != current_inferior ())
    {
      restore_thread.emplace ();
      switch_to_inferior_no_thread (inf);
    }
However, this caused some failures in fork-related tests and gdbserver
boards. When we detach a fork child, we may create a regcache for the
child, but there is no corresponding inferior. For instance, to restore
the PC after a displaced step over the fork syscall. So
find_inferior_ptid would return nullptr, and
switch_to_inferior_no_thread would hit a failed assertion.
So, this patch adds to regcache the information "the inferior to switch
to in order to make target calls". In typical cases, it will be the inferior
that matches the regcache's ptid. But in some cases, like the detached
fork child one, it will be another inferior (in this example, it will be
the fork parent inferior).
The problem that we witnessed was in regcache::raw_update specifically,
but I looked for other regcache methods doing target calls, and added
the same inferior switching code to raw_write too.
In the regcache constructor and in get_thread_arch_aspace_regcache,
"inf_for_target_calls" replaces the process_stratum_target parameter.
We suppose that the process stratum target that would be passed
otherwise is the same that is in inf_for_target_calls's target stack, so
we don't need to pass both in parallel. The process stratum target is
still used as a key in the `target_pid_ptid_regcache_map` map, but
that's it.
There is one spot that needs to be updated outside of the regcache code,
which is the path that handles the "restore PC after a displaced step in
a fork child we're about to detach" case mentioned above.
regcache_test_data needs to be changed to include full-fledged mock
contexts (because there now needs to be inferiors, not just targets).
Change-Id: Id088569ce106e1f194d9ae7240ff436f11c5e123
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
Add the maybe_switch_inferior function, which ensures that the given
inferior is the current one. Return an instantiated
scoped_restore_current_thread object only if we actually needed to switch
inferior.
Returning a scoped_restore_current_thread requires it to be
move-constructible, so give it a move constructor.
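To show why the move constructor matters, here is a small self-contained
sketch of the pattern. It uses std::optional and a made-up global
(`current_inferior_id`) and made-up names (`scoped_restore_current`,
`maybe_switch`) in place of GDB's real types, so it should be read as an
illustration of the idea only.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical global standing in for GDB's notion of the current inferior.
static int current_inferior_id = 1;

// RAII guard that restores the previously current inferior on destruction.
class scoped_restore_current
{
public:
  scoped_restore_current () : m_saved (current_inferior_id) {}

  // Move constructor: the moved-from guard must no longer restore.
  scoped_restore_current (scoped_restore_current &&other) noexcept
    : m_saved (other.m_saved),
      m_active (std::exchange (other.m_active, false))
  {}

  scoped_restore_current (const scoped_restore_current &) = delete;
  scoped_restore_current &operator= (const scoped_restore_current &) = delete;

  ~scoped_restore_current ()
  {
    if (m_active)
      current_inferior_id = m_saved;
  }

private:
  int m_saved;
  bool m_active = true;
};

// Make WANTED_ID current.  The returned optional holds a guard only if
// a switch actually happened.
std::optional<scoped_restore_current>
maybe_switch (int wanted_id)
{
  assert (wanted_id > 0);
  std::optional<scoped_restore_current> guard;

  if (wanted_id != current_inferior_id)
    {
      guard.emplace ();
      current_inferior_id = wanted_id;
    }

  return guard;
}
```

Returning the optional by value is what requires the guard type to be
move-constructible; without the move constructor, this function would not
compile.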
Change-Id: I1231037102ed6166f2530399e8257ad937fb0569
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
The regcache class takes a process_stratum_target and then exposes it
through regcache::target. But it doesn't use it itself, suggesting it
doesn't really make sense to put it there. The only user of
regcache::target is record_btrace_target::fetch_registers, but it might
as well just get it from the current target stack. This slightly
simplifies a patch later in this series.
Change-Id: I8878d875805681c77f469ac1a2bf3a508559a62d
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
In the upcoming patch to support fork in the amd-dbgapi target, the
amd-dbgapi target will need to be notified of fork events through an
observer, to attach itself (attach in the amd-dbgapi sense, not ptrace
sense) to the new inferior / process.
The reason that this can't be done through target_ops::follow_fork is
that the amd-dbgapi target isn't pushed on the inferior's target stack
right away. It attaches itself to the process and only pushes itself on
its target stack if and when the inferior initializes the ROCm runtime.
If an inferior that is not using the ROCm runtime forks, we want to be
notified of it, so we can attach to the child, and catch if the child
starts using the ROCm runtime.
So, add a new observable and notify it in follow_fork_inferior. It will
be used later in this series.
Change-Id: I67fced5a9cba6d5da72b9c7ea1c8397644ca1d54
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
The upcoming patch to support exec in the amd-dbgapi target needs to
detach amd-dbgapi from the inferior doing the exec and attach amd-dbgapi
to the inferior continuing the execution. They may or may not be the
same, depending on the `set follow-exec-mode` setting. But even if they
are the same, we need to do the detach / attach dance.
With the current observable signature, the observers only receive the
inferior in which execution continues (the "following" inferior).
Change the signature to pass both inferiors, and update all existing
observers.
Change-Id: I259d1ea09f70f43be739378d6023796f2fce2659
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
This adds support for 128-bit integers to the Ada parser.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30188
|
|
These helper functions in the Ada parser don't seem all that
worthwhile to me, so this patch removes them.
|
|
This adds an overload of fits_in_type that accepts a gdb_mpz. A
subsequent patch will use this.
|
|
This adds support for 128-bit integers to the Rust parser.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=21185
|
|
This changes long_const_operation to use gdb_mpz for its storage.
|
|
In preparation for adding more 128-bit support to gdb, a few additions
to gdb_mpz are needed.
First, this adds a new 'as_integer_truncate' method. This method
works like 'as_integer' but does not require the value to fit in the
target type -- it just truncates.
Second, gdb_mpz::export_bits is changed to handle the somewhat unusual
situation of zero-length types. This can happen for a Rust '()' type;
but I think other languages have zero-bit integer types as well.
Finally, this adds some operator== overloads.
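The difference between a checked conversion and a truncating one can be
illustrated with plain fixed-width integers. This sketch does not use GMP
or gdb_mpz; `checked_to` and `truncate_to` are hypothetical helpers that
only mirror the behaviour described above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <type_traits>

// Hypothetical helpers working on a plain 64-bit value instead of a
// GMP integer.

// Checked conversion: throw if V does not fit in T.
template <typename T>
T checked_to (std::int64_t v)
{
  static_assert (sizeof (T) < sizeof (std::int64_t),
                 "sketch only handles types narrower than 64 bits");
  if (v < std::numeric_limits<T>::min ()
      || v > std::numeric_limits<T>::max ())
    throw std::range_error ("value does not fit in target type");
  return static_cast<T> (v);
}

// Truncating conversion: keep only the low-order bits of V.
template <typename T>
T truncate_to (std::int64_t v)
{
  // Going through the unsigned type makes the wrap-around well defined.
  using U = std::make_unsigned_t<T>;
  return static_cast<T> (static_cast<U> (v));
}
```

For example, `truncate_to<std::uint8_t> (0x1234)` keeps only the low byte,
whereas `checked_to<std::uint8_t> (0x1234)` throws.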
|
|
With DWARF 5, it's possible to produce an empty file name in the File Name
Table of the .debug_line section:
...
The File Name Table (offset 0x112, lines 1, columns 2):
Entry Dir Name
0 1 (indirect line string, offset: 0x2d):
...
Currently, when gdb reads an exec containing such debug info, it segfaults:
...
Thread 1 "gdb" received signal SIGSEGV, Segmentation fault.
0x000000000072cd38 in dwarf2_start_subfile (cu=0x2badc50, fe=..., lh=...) at \
gdb/dwarf2/read.c:18716
18716 if (!IS_ABSOLUTE_PATH (filename) && dirname != NULL)
...
because read_direct_string transforms "" into a nullptr, and we end up
dereferencing the nullptr.
Note that the behaviour of read_direct_string has been present since repo
creation.
Fix this in read_formatted_entries, by transforming nullptr filenames into ""
filenames.
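The shape of the fix can be sketched as below. The two functions are
hypothetical stand-ins (they are not read_direct_string or
read_formatted_entries); they only show the "empty string becomes nullptr,
then gets mapped back to an empty string" round trip.

```cpp
#include <cassert>

// Hypothetical stand-in for a string reader that, like the one described
// above, returns nullptr for an empty string.
const char *
read_string_or_null (const char *raw)
{
  assert (raw != nullptr);
  return raw[0] == '\0' ? nullptr : raw;
}

// Map a missing name back to "" so that later code can safely
// dereference it.
const char *
normalize_file_name (const char *name)
{
  return name != nullptr ? name : "";
}
```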
Tested on x86_64-linux.
Reviewed-By: Tom Tromey <tom@tromey.com>
PR symtab/30357
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30357
|
|
This commit changes mi_make_breakpoint_pending to accept the 'script'
and 'times' arguments.
I've then added a new test that makes use of 'script' in
gdb.mi/mi-pending.exp and gdb.mi/mi-dprintf-pending.exp.
There is already a test in gdb.mi/mi-pending.exp that uses the 'times'
argument -- previously this argument was being ignored, but is now
used.
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Commit:
commit c569a946f6925d3f210c3eaf74dcda56843350ef
Date: Fri Mar 24 10:45:37 2023 +0100
[gdb/testsuite] Fix unbalanced quotes in mi_expect_stop argument
Introduced the use of {"} in mi-support.exp. There is absolutely
nothing wrong with this in any way. However, this is causing my
editor to get the syntax highlighting of this file wrong after this
point.
Maybe the real answer is to use a better editor, or fix my current
editor.... but I'm hoping I can instead take the lazy approach of just
changing {"} to "\"", which is handled fine, and means exactly the
same as far as I understand it.
There should be no change in what is tested after this commit.
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Older gdb's (9, 10, 11 and 12) have a bug that causes them to crash whenever
a target reports the pauth feature string in the target description and also
provides additional registers outside of gdb's known and expected feature
strings.
This was fixed in gdb 13 onwards, but that means we're stuck with gdb's out
there that will crash on connection to the above targets.
QEMU has postponed inclusion of the pauth feature string in version 8, and
instead we agreed to use a new feature name to prevent crashing those older
gdb's.
Initially there was a plan to backport a trivial fix all the way to gdb 9, but
given QEMU's choice, this is no longer needed.
This new feature string is org.gnu.gdb.aarch64.pauth_v2, and should be used
by all targets going forward, except native linux gdb and gdbserver, for
backwards compatibility with older gdb's/gdbserver's.
gdb/gdbserver will still emit the old feature string for Linux since it doesn't
report additional system registers and thus doesn't cause a crash of older
gdb's. We can revisit this in the future once the problematic gdb's are likely
no longer in use.
I've added some documentation to explain the situation.
|
|
The Arm Architecture Reference Manual defines debug version 0b1010 for
FEAT_Debugv8p8. This is used to identify valid hardware debug registers.
gdb currently only knows about versions up to FEAT_Debugv8p4. This patch
teaches gdb about this new version.
No visible changes should happen as consequence of this patch, but in the
future gdb will be able to identify debug registers in newer hardware.
Regression-tested on aarch64-linux Ubuntu 20.04/22.04.
|
|
(1) Description of problem
In the current code, the following test fails on LoongArch:
$make check-gdb TESTS="gdb.base/dump.exp"
```
FAIL: gdb.base/dump.exp: dump array as value, intel hex
FAIL: gdb.base/dump.exp: dump struct as value, intel hex
FAIL: gdb.base/dump.exp: dump array as memory, ihex
FAIL: gdb.base/dump.exp: dump struct as memory, ihex
```
These tests pass on X86_64.
(2) Root cause
On LoongArch, the address of the variable intarray (0x120008068) is out of
range for IHEX, so the ihex dump tests fail.
gdb.base/dump.exp has the following code to check for 64-bit addresses:
```
# Check the address of a variable. If it is bigger than 32-bit,
# assume our target has 64-bit addresses that are not supported by SREC,
# IHEX and TEKHEX. We skip those tests then.
set max_32bit_address "0xffffffff"
set data_address [get_hexadecimal_valueof "&intarray" 0x100000000]
if {${data_address} > ${max_32bit_address}} {
set is64bitonly "yes"
}
```
We check "&intarray" on the different targets as follows:
```
$gdb gdb/testsuite/outputs/gdb.base/dump/dump
...
(gdb) start
...
On X86_64:
(gdb) print /x &intarray
$1 = 0x404060
On LoongArch:
(gdb) print /x &intarray
$1 = 0x120008068
```
The difference in the variable addresses is due to the linker's
link script.
```
On X86_64:
$ld --verbose
...
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000));
. = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
On LoongArch:
$ld --verbose
...
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x120000000));
. = SEGMENT_START("text-segment", 0x120000000) + SIZEOF_HEADERS;
```
(3) How to fix
Because the 64-bit variable address is out of range for IHEX, this is not a
functional problem for LoongArch. Following the existing handling of 64-bit
targets in this testsuite, use the "is64bitonly" flag to skip those
tests for targets that have 64-bit addresses.
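The range check that the test script performs can be restated in a few
lines. The function name below is made up for this sketch; the two
addresses are the ones printed in the sessions above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if ADDR cannot be expressed in 32 bits, in
// which case the SREC, IHEX and TEKHEX tests are skipped.
constexpr bool
needs_64bit_address (std::uint64_t addr)
{
  return addr > 0xffffffffULL;
}

// Addresses of "intarray" as reported above.
static_assert (!needs_64bit_address (0x404060), "X86_64 address fits");
static_assert (needs_64bit_address (0x120008068ULL), "LoongArch address does not");
```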
Signed-off-by: Hui Li <lihui@loongson.cn>
Approved-By: Tom Tromey <tom@tromey.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
|
|
Add regression tests for PR30325, one for the asm window and one for the
source window.
Use maint set tui-left-margin-verbose to make the extent of the left margin
clear.
Tested on x86_64-linux.
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
PR gdb/29257 points out a possible double free when debuginfod is in
use. Aside from some ugly warts in the symbol code (an ongoing
issue), the underlying issue in this particular case is that elfread.c
seems to assume that symfile_bfd_open will return NULL on error,
whereas in reality it throws an exception. As this code isn't
prepared for an exception, bad things result.
This patch fixes the problem by introducing a non-throwing variant of
symfile_bfd_open and using it in the affected places.
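The general pattern of a non-throwing variant wrapping a throwing
function looks roughly like this. The names (`file_handle`,
`open_or_throw`, `open_or_null`) are invented for the sketch and do not
correspond to the real symfile_bfd_open code.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an opened file.
struct file_handle { std::string name; };

// Throwing opener: reports failure with an exception.
std::unique_ptr<file_handle>
open_or_throw (const std::string &name)
{
  if (name.empty ())
    throw std::runtime_error ("cannot open file");
  return std::make_unique<file_handle> (file_handle {name});
}

// Non-throwing variant: callers that expect a null result on error can
// use this instead of wrapping every call in a try/catch.
std::unique_ptr<file_handle>
open_or_null (const std::string &name) noexcept
{
  try
    {
      return open_or_throw (name);
    }
  catch (const std::exception &)
    {
      return nullptr;
    }
}
```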
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29257
|
|
The m_digits member of tui_source_window is documented as having semantics:
...
/* How many digits to use when formatting the line number. This
includes the trailing space. */
...
The commit 1b6d4bb2232 ("Redraw both spaces between line numbers and source
code") started printing two trailing spaces instead:
...
- xsnprintf (text, sizeof (text), "%*d ", m_digits - 1, lineno);
+ xsnprintf (text, sizeof (text), "%*d  ", m_digits - 1, lineno);
...
Now that PR30325 is fixed, this no longer has any effect.
Fix this by reverting to the original behaviour: print one trailing space
char.
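To make the width arithmetic concrete, here is a standalone sketch of the
formatting with one trailing space. It uses std::snprintf and a made-up
`format_lineno` function rather than the TUI code itself.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format LINENO in a field DIGITS characters wide, where DIGITS already
// includes the single trailing space.
std::string
format_lineno (int lineno, int digits)
{
  assert (digits >= 2 && digits < 32);
  char text[32];
  std::snprintf (text, sizeof (text), "%*d ", digits - 1, lineno);
  return text;
}
```

With `digits` set to 7, line 5 is formatted as five spaces, the digit
"5", and one trailing space, for a total of 7 characters.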
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
With a hello world a.out, and maint set tui-left-margin-verbose on, we have
this disassembly window:
...
┌───────────────────────────────────────────────────────────┐
│___ 0x555555555149 <main> endbr64 │
│___ 0x55555555514d <main+4> push %rbp │
│___ 0x55555555514e <main+5> mov %rsp,%rbp │
│B+> 0x555555555151 <main+8> lea 0xeac(%rip),%rax│
│___ 0x555555555158 <main+15> mov %rax,%rdi │
...
Note the space between "B+>" and 0x555555555151. The space shows that a bit
of the left margin is not written, which is a problem because that location is
showing a character previously written, which happens to be a space, but also
may be something else, for instance a '[' as reported in PR tui/30325.
The problem is caused by confusion about the meaning of:
...
#define TUI_EXECINFO_SIZE 4
...
There's the meaning of defining the size of this zero-terminated char array:
...
char element[TUI_EXECINFO_SIZE];
...
which is used to print the "B+>" bit, which is 3 chars wide.
And there's the meaning of defining part of the size of the left margin:
...
int left_margin () const
{ return 1 + TUI_EXECINFO_SIZE + extra_margin (); }
...
where it represents 4 chars.
The discrepancy between the two causes the space between "B+>" and
"0x555555555151".
Fix this by redefining TUI_EXECINFO_SIZE to 3, and using:
...
char element[TUI_EXECINFO_SIZE + 1];
...
such that we have:
...
|B+>0x555555555151 <main+8> lea 0xeac(%rip),%rax │
...
This changes the layout of the disassembly window back to what it was before
commit 9e820dec13e ("Use a curses pad for source and disassembly windows"),
the commit that introduced the PR30325 regression.
This also changes the source window from:
...
│___000005__{ |
...
to:
...
│___000005_{ |
...
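The distinction between the visible width and the array size can be
sketched as follows. `EXECINFO_SIZE`, `exec_info` and `make_exec_info` are
illustrative names for this example only, not the TUI definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Visible width of the execution-info field, in characters.
constexpr std::size_t EXECINFO_SIZE = 3;

struct exec_info
{
  // One extra byte for the terminating NUL.
  char element[EXECINFO_SIZE + 1];
};

// Build a field that is always exactly EXECINFO_SIZE characters wide,
// padding FLAGS with spaces on the right.
exec_info
make_exec_info (const char *flags)
{
  std::size_t len = std::strlen (flags);
  assert (len <= EXECINFO_SIZE);

  exec_info info;
  std::memset (info.element, ' ', EXECINFO_SIZE);
  std::memcpy (info.element, flags, len);
  info.element[EXECINFO_SIZE] = '\0';
  return info;
}
```

Keeping a single constant for the visible width, and adding 1 only where
the storage is declared, avoids the two meanings drifting apart.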
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30325
Approved-By: Tom Tromey <tom@tromey.com>
|
|
The TUI has two types of windows derived from tui_source_window_base:
- tui_source_window (the source window), and
- tui_disasm_window (the disassembly window).
The two windows share a common concept: the left margin.
With a hello world a.out, we can see the source window:
...
┌─/home/vries/hello.c───────────────────────────────────────┐
│ 5 { │
│B+> 6 printf ("hello\n"); │
│ 7 return 0; │
│ 8 } │
│ 9 │
│
...
where the left margin is the part holding "B+>" and the line number, and the
disassembly window:
...
┌───────────────────────────────────────────────────────────┐
│ 0x555555555149 <main> endbr64 │
│ 0x55555555514d <main+4> push %rbp │
│ 0x55555555514e <main+5> mov %rsp,%rbp │
│B+> 0x555555555151 <main+8> lea 0xeac(%rip),%rax│
│ 0x555555555158 <main+15> mov %rax,%rdi │
...
where the left margin is just the bit holding "B+>".
Because the left margin contains some spaces, it's not clear where it starts
and ends, making it harder to observe problems related to it.
Add a new maintenance command "maint set tui-left-margin-verbose", that when
set to on replaces the spaces in the left margin with either '_' or '0',
giving us this for the source window:
...
┌─/home/vries/hello.c───────────────────────────────────────┐
│___000005__{ │
│B+>000006__ printf ("hello\n"); │
│___000007__ return 0; │
│___000008__} │
...
and this for the disassembly window:
...
┌───────────────────────────────────────────────────────────┐
│___ 0x555555555149 <main> endbr64 │
│___ 0x55555555514d <main+4> push %rbp │
│___ 0x55555555514e <main+5> mov %rsp,%rbp │
│B+> 0x555555555151 <main+8> lea 0xeac(%rip),%rax│
│___ 0x555555555158 <main+15> mov %rax,%rdi │
...
Note the space between "B+>" and 0x555555555151. The space shows that a bit
of the left margin is not written, a problem reported as PR tui/30325.
Specifically, PR tui/30325 is about the fact that the '[' character from the
string "[ No Assembly Available ]" ends up in that same spot:
...
│B+>[0x555555555151 <main+8> lea 0xeac(%rip),%rax│
...
which only happens for certain window widths.
The new command allows us to spot the problem with any window width.
Likewise, when we revert the fix from commit 1b6d4bb2232 ("Redraw both spaces
between line numbers and source code"), we have:
...
┌─/home/vries/hello.c───────────────────────────────────────┐
│___000005_ { │
│B+>000006_ printf ("hello\n"); │
│___000007_ return 0; │
│___000008_ } │
...
showing a similar problem at the space between '_' and '{'.
Tested on x86_64-linux.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
I noticed a few unit tests are using gdb_assert. I think this was an
older style, before SELF_CHECK was added. This patch switches them
over.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
I found a couple of tests that check gnatmake_version_at_least using
"if" where "require" would be a little cleaner. This patch converts
these.
|
|
It said for 'info inferiors' and 'info connections' that the argument
could be 'a space separated list of inferior numbers' which is correct
but incomplete. In fact the arguments can be any space separated
combination of numbers and (ascending) ranges.
The beginning of the section now describes the ID list as a new keyword.
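As an illustration of what such an ID list looks like when expanded, here
is a small standalone parser. It is a sketch written for this note, not
GDB's number-or-range parser, and `parse_id_list` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Expand a space-separated list of numbers and ascending ranges, for
// example "1 3-5", into the individual IDs.
std::vector<int>
parse_id_list (const std::string &text)
{
  std::vector<int> ids;
  std::istringstream in (text);
  std::string tok;

  while (in >> tok)
    {
      std::size_t dash = tok.find ('-');
      if (dash == std::string::npos)
        {
          ids.push_back (std::stoi (tok));
          continue;
        }

      int lo = std::stoi (tok.substr (0, dash));
      int hi = std::stoi (tok.substr (dash + 1));
      if (lo > hi)
        throw std::invalid_argument ("range must be ascending");

      for (int i = lo; i <= hi; i++)
        ids.push_back (i);
    }

  return ids;
}
```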
Co-Authored-By: Christina Schimpe <christina.schimpe@intel.com>
|
|
Spotted some code in print_one_breakpoint_location that was not
indented correctly, this commit just changes the indentation.
There should be no user visible changes after this commit.
|
|
Spotted a small typo in gdb_breakpoint proc, we use $gdb_name_name
instead of $gdb_test_name in one place. Fixed in this commit.
|
|
On amd64 (at least) if a user sets a watchpoint before the inferior
has started then GDB will assume that a hardware watchpoint can be
created.
When the inferior starts there is a chance that the watchpoint can't
actually be created as a hardware watchpoint, in which case (currently)
GDB will silently convert the watchpoint to a software watchpoint.
Here's an example session:
(gdb) p sizeof var
$1 = 4000
(gdb) watch var
Hardware watchpoint 1: var
(gdb) info watchpoints
Num Type Disp Enb Address What
1 hw watchpoint keep y var
(gdb) starti
Starting program: /home/andrew/tmp/watch
Program stopped.
0x00007ffff7fd3110 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) info watchpoints
Num Type Disp Enb Address What
1 watchpoint keep y var
(gdb)
Notice that before the `starti` command the watchpoint is showing as a
hardware watchpoint, but afterwards it is showing as a software
watchpoint. Additionally, note that we clearly told the user we
created a hardware watchpoint:
(gdb) watch var
Hardware watchpoint 1: var
I think this is bad. I used `starti`, but if the user did `start` or
even `run` then the inferior is going to be _very_ slow, which will be
unexpected -- after all, we clearly told the user that we created a
hardware watchpoint, and the manual clearly says that hardware
watchpoints are fast (at least compared to s/w watchpoints).
In this patch I propose adding a new warning which will be emitted
when GDB downgrades a h/w watchpoint to s/w. The session now looks
like this:
(gdb) p sizeof var
$1 = 4000
(gdb) watch var
Hardware watchpoint 1: var
(gdb) info watchpoints
Num Type Disp Enb Address What
1 hw watchpoint keep y var
(gdb) starti
Starting program: /home/andrew/tmp/watch
warning: watchpoint 1 downgraded to software watchpoint
Program stopped.
0x00007ffff7fd3110 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) info watchpoints
Num Type Disp Enb Address What
1 watchpoint keep y var
(gdb)
The important line is:
warning: watchpoint 1 downgraded to software watchpoint
It's not much, but hopefully it will be enough to indicate to the user
that something unexpected has occurred, and hopefully, they will not
be surprised when the inferior runs much slower than they expected.
I've added an amd64 only test in gdb.arch/, I didn't want to try
adding this as a global test as other architectures might be able to
support the watchpoint request in h/w.
Also the test is skipped for extended-remote boards as there's a
different set of options for limiting hardware watchpoints on remote
targets, and this test isn't about them.
Reviewed-By: Lancelot Six <lancelot.six@amd.com>
|
|
I was seeing some failures in gdb.threads/omp-par-scope.exp when run
on a riscv64 target. It turns out the cause of the problem is that I
didn't have debug information installed for libgomp.so, which this
test makes use of. The test requires GDB to backtrace through a
libgomp function, and the riscv prologue unwinder was failing to
unwind this particular stack frame.
The reason for the failure to unwind was that the function prologue
includes a c.li (compressed load immediate) instruction, and the riscv
prologue scanning unwinder doesn't know what to do with this
instruction, though the unwinder does understand c.lui (compressed
load upper immediate).
This commit adds support for c.li. After this GDB is able to unwind
through libgomp, and I no longer see any unexpected failures in
gdb.threads/omp-par-scope.exp.
I've also included a new test in gdb.arch/ which specifically checks
for our c.li support.
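For reference, decoding c.li from a 16-bit instruction can be sketched as
below, following the encoding in the RISC-V "C" extension. This is an
independent illustration, not the code added to the riscv prologue
scanner; `c_li_insn` and `decode_c_li` are names invented here.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Decoded fields of a c.li instruction.
struct c_li_insn
{
  unsigned rd;
  std::int32_t imm;
};

// Decode INSN as c.li.  Encoding: bits [1:0] = 01, bits [15:13] = 010,
// rd in bits [11:7], imm[5] in bit 12 and imm[4:0] in bits [6:2].
std::optional<c_li_insn>
decode_c_li (std::uint16_t insn)
{
  if ((insn & 0x3) != 0x1 || ((insn >> 13) & 0x7) != 0x2)
    return std::nullopt;

  unsigned rd = (insn >> 7) & 0x1f;
  if (rd == 0)
    return std::nullopt;        // rd = x0 encodes a hint, not a load.

  std::int32_t imm = ((insn >> 2) & 0x1f) | (((insn >> 12) & 0x1) << 5);
  if (imm & 0x20)
    imm -= 0x40;                // Sign-extend the 6-bit immediate.

  return c_li_insn {rd, imm};
}
```

For instance, 0x4505 decodes as c.li with rd = 10 (a0) and an immediate
of 1, while 0x6505 (a c.lui encoding) is rejected.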
|
|
Unfortunately MinGW doesn't support std::future yet, so this causes the
build to fail. Use GDB's version which provides a fallback for this case.
Tested for regressions on native aarch64-linux.
Approved-By: Tom Tromey <tromey@adacore.com>
|
|
PR win32/30255 points out that a call to a NULL function pointer will
leave gdb unable to "bt" on Windows.
I tracked this down to the amd64 windows unwinder. If we treat this
scenario as if it were a leaf function, unwinding works fine.
I'm not completely sure this patch is the best way. I considered
having it check for 'pc==0' -- but then I figured this could affect
any inaccessible PC, not just the special 0 value.
No test case because I can't run dejagnu tests on Windows. I tested
this by hand using the test case in the bug.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30255
|
|
* gdb/doc/gdb.texinfo (Requirements): Fix typos.
|
|
In an experiment I'm trying, I needed Ada symbol cache entries to be
allocated with 'new'. This patch reimplements the symbol cache to use
the libiberty hash table and to use new and delete. A couple of other
minor cleanups are done.
|
|
I noticed there aren't any Ada test cases for setting a breakpoint
using a label. This patch adds one, adapted from the AdaCore test
suite.
|
|
This changes "maint info frame-unwinders" to use ui-out. This makes
the table slightly nicer. In general I think it's better to use
ui-out for tables.
|
|
Whenever we start gdb in the testsuite, we have the rather verbose:
...
$ gdb
GNU gdb (GDB) 14.0.50.20230405-git
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb)
...
This makes gdb.log longer than necessary and harder to read.
We do need to test that the output is produced, but that should be limited to
one or a few test-cases.
Fix this by adding -q to INTERNAL_GDBFLAGS, such that we simply have:
...
$ gdb -q
(gdb)
...
Tested on x86_64-linux.
|
|
gdb.arch/amd64-disp-step-self-call.exp
For test-case gdb.arch/amd64-disp-step-self-call.exp I get:
...
gdb compile failed, ld: warning: amd64-disp-step-self-call0.o: \
missing .note.GNU-stack section implies executable stack
ld: NOTE: This behaviour is deprecated and will be removed in a future \
version of the linker
...
Fix this by adding the missing .note.GNU-stack.
Likewise for gdb.arch/i386-disp-step-self-call.exp.
Tested on x86_64-linux.
|
|
This commit:
commit cf141dd8ccd36efe833aae3ccdb060b517cc1112
Date: Wed Feb 22 12:15:34 2023 +0000
gdb: fix reg corruption from displaced stepping on amd64
Added two test scripts gdb.arch/amd64-disp-step-self-call.exp and
gdb.arch/i386-disp-step-self-call.exp. These scripts contained a test
that included a stack address in the test name, this makes it harder
to compare results between runs.
This commit gives the tests proper names that don't include an
address.
Also in gdb.arch/i386-disp-step-self-call.exp I noticed that we were
writing 8-bytes rather than 4 in order to clear the return address
entry on the stack. This is also fixed in this commit.
|
|
This changes apply_ext_lang_type_printers to use unique_xmalloc_ptr,
removing some manual memory management. Regression tested on x86-64
Fedora 36.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Clang with LTO (clang -flto) garbage collects unused global variables.
Thus, gdb.base/align-c.exp and gdb.base/align-c++.exp fail with
hundreds of FAILs like so:
$ make check \
TESTS="gdb.*/align-*.exp" \
RUNTESTFLAGS="CC_FOR_TARGET='clang -flto' CXX_FOR_TARGET='clang++ -flto'"
...
FAIL: gdb.base/align-c.exp: get integer valueof "a_char"
FAIL: gdb.base/align-c.exp: print _Alignof(char)
FAIL: gdb.base/align-c.exp: get integer valueof "a_char_x_char"
FAIL: gdb.base/align-c.exp: print _Alignof(struct align_pair_char_x_char)
FAIL: gdb.base/align-c.exp: get integer valueof "a_char_x_unsigned_char"
...
AIX GCC has the same issue, and there the easier way of adding
__attribute__((used)) to globals does not help.
So add explicit uses of all globals to the generated code.
For the C++ test, that reveals that the static variable members of the
generated structs are not defined anywhere, leading to undefined
references. Fixed by emitting initialization for all static members.
Lastly, I noticed that CXX_FOR_TARGET was being ignored -- that's
because the align-c++.exp testcase is compiling with the C compiler
driver. Fixed by passing "c++" as option to prepare_for_testing.
Change-Id: I874b717afde7b6fb1e45e526912b518a20a12716
|
|
The following commit changed gdbarch_components.py but failed to
format it with black:
commit cf141dd8ccd36efe833aae3ccdb060b517cc1112
Date: Wed Feb 22 12:15:34 2023 +0000
gdb: fix reg corruption from displaced stepping on amd64
This commit just runs black on the file and commits the result.
The change is just the addition of an extra "," -- there will be no
change to the generated source files after this commit.
There will be no user visible changes after this commit.
|
|
This commit allows Frame.read_var to accept named arguments, and also
improves (I think) some of the error messages emitted when values of
the wrong type are passed to this function.
The read_var method takes two arguments, one a variable, which is
either a gdb.Symbol or a string, while the second, optional, argument
is always a gdb.Block.
I'm now using 'O!' as the format specifier for the second argument,
which allows the argument type to be checked early on. Currently, if
the second argument is of the wrong type then we get this error:
(gdb) python print(gdb.selected_frame().read_var("a1", "xxx"))
Traceback (most recent call last):
File "<string>", line 1, in <module>
RuntimeError: Second argument must be block.
Error while executing Python code.
(gdb)
After this commit, we now get an error like this:
(gdb) python print(gdb.selected_frame().read_var("a1", "xxx"))
Traceback (most recent call last):
File "<string>", line 1, in <module>
TypeError: argument 2 must be gdb.Block, not str
Error while executing Python code.
(gdb)
Changes are:
1. Exception type is TypeError not RuntimeError, this is unfortunate
as user code _could_ be relying on this, but I think the improvement
is worth the risk, user code relying on the exact exception type is
likely to be pretty rare,
2. New error message gives argument position and expected argument
type, as well as the type that was passed.
If the first argument, the variable, has the wrong type then the
previous exception was already a TypeError, however, I've updated the
text of the exception to more closely match the "standard" error
message we see above. If the first argument has the wrong type then
before this commit we saw this:
(gdb) python print(gdb.selected_frame().read_var(123))
Traceback (most recent call last):
File "<string>", line 1, in <module>
TypeError: Argument must be a symbol or string.
Error while executing Python code.
(gdb)
And after we see this:
(gdb) python print(gdb.selected_frame().read_var(123))
Traceback (most recent call last):
File "<string>", line 1, in <module>
TypeError: argument 1 must be gdb.Symbol or str, not int
Error while executing Python code.
(gdb)
For existing code that doesn't use named arguments and doesn't rely on
exceptions, there will be no changes after this commit.
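As a rough illustration of the behaviour described above, the helper
below calls read_var with named arguments and handles the TypeError now
raised for an argument of the wrong type. This is only a sketch:
read_local is a hypothetical helper, and the keyword names 'variable'
and 'block' are assumed to match the documented parameter names. It
needs a GDB that includes this commit.

```python
def read_local(frame, name, block=None):
    """Read variable NAME from FRAME, or return None on a bad argument type.

    Hypothetical helper.  The keyword names 'variable' and 'block' are
    assumptions based on the documented parameter names.
    """
    kwargs = {"variable": name}
    if block is not None:
        kwargs["block"] = block
    try:
        return frame.read_var(**kwargs)
    except TypeError:
        # After this commit a wrong-typed first or second argument
        # raises TypeError.
        return None
```

Inside GDB this might be called as
read_local(gdb.selected_frame(), "a1").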
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Following on from the previous commit, this updates
Frame.read_register to accept named arguments. As with the previous
commit there's no huge benefit for the users in accepting named
arguments here -- this function only takes a single argument after
all.
But I do think it is worth keeping Frame.read_register method in sync
with the PendingFrame.read_register method, this allows for the
possibility that the user has some code that can operate on either a
Frame or a PendingFrame.
Minor update to allow for named arguments, and an extra test to check
the new functionality.
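A minimal sketch of the kind of shared code this allows, assuming both
classes accept the keyword 'register' (read_register_from is a
hypothetical name, not part of GDB):

```python
def read_register_from(frame_like, name):
    """Read register NAME from a gdb.Frame or a gdb.PendingFrame.

    Hypothetical helper; assumes both classes accept the keyword
    'register'.
    """
    return frame_like.read_register(register=name)
```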
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Update the two gdb.PendingFrame methods gdb.PendingFrame.read_register
and gdb.PendingFrame.create_unwind_info to accept keyword arguments.
There's no huge benefit for making this change, both of these methods
only take a single argument, so it is (maybe) less likely that a user
will take advantage of the keyword arguments in these cases, but I
think it's nice to be consistent, and I don't see any particular
drawbacks to making this change.
For PendingFrame.read_register I've changed the argument name from
'reg' to 'register' in the documentation and used 'register' as the
argument name in GDB. My preference for APIs is to use full words
where possible, and given we didn't support named arguments before
this change should not break any existing code.
There should be no user visible changes (for existing code) after this
commit.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
Update gdb.UnwindInfo.add_saved_register to accept named keyword
arguments.
As part of this update we now use gdb_PyArg_ParseTupleAndKeywords
instead of PyArg_UnpackTuple to parse the function arguments.
By switching to gdb_PyArg_ParseTupleAndKeywords, we can now use 'O!'
as the argument format for the function's value argument. This means
that we can check the argument type (is gdb.Value) as part of the
argument processing rather than manually performing the check later in
the function. One result of this is that we now get a better error
message (at least, I think so). Previously we would get something
like:
ValueError: Bad register value
Now we get:
TypeError: argument 2 must be gdb.Value, not XXXX
It's unfortunate that the exception type changed, but I think the new
exception type actually makes more sense.
My preference for argument names is to use full words where that's not
too excessive. As such, I've updated the name of the argument from
'reg' to 'register' in the documentation, which is the argument name
I've made GDB look for here.
For existing unwinder code that doesn't throw any exceptions nothing
should change with this commit. It is possible that a user has some
code that throws and catches the ValueError, and this code will break
after this commit, but I think this is going to be sufficiently rare
that we can take the risk here.
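For illustration, unwinder code could use the keyword form roughly as
follows. This is a sketch under assumptions: save_registers is a
hypothetical helper, and the keyword names 'register' and 'value' are
taken from the documented parameter names.

```python
def save_registers(unwind_info, saved):
    """Record each (name, value) pair of SAVED in UNWIND_INFO.

    Hypothetical helper for use inside an unwinder.  The keyword names
    'register' and 'value' are assumptions based on the documentation.
    """
    for name, value in saved.items():
        # Each value must be a gdb.Value; after this commit anything
        # else raises TypeError rather than ValueError.
        unwind_info.add_saved_register(register=name, value=value)
```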
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
This commit aims to address a problem that exists with the current
approach to displaced stepping, and was identified in PR gdb/22921.
Displaced stepping is currently supported on AArch64, ARM, amd64,
i386, rs6000 (ppc), and s390. Of these, I believe there is a problem
with the current approach which will impact amd64 and ARM, and can
lead to random register corruption when the inferior makes use of
asynchronous signals and GDB is using displaced stepping.
The problem can be found in displaced_step_buffers::finish in
displaced-stepping.c, and is this; after GDB tries to perform a
displaced step, and the inferior stops, GDB classifies the stop into
one of two states, either the displaced step succeeded, or the
displaced step failed.
If the displaced step succeeded then gdbarch_displaced_step_fixup is
called, which has the job of fixing up the state of the current
inferior as if the step had not been performed in a displaced manner.
This all seems just fine.
However, if the displaced step is considered to have not completed
then GDB doesn't call gdbarch_displaced_step_fixup, instead GDB
remains in displaced_step_buffers::finish and just performs a minimal
fixup which involves adjusting the program counter back to its
original value.
The problem here is that for amd64 and ARM setting up for a displaced
step can involve changing the values in some temporary registers. If
the displaced step succeeds then this is fine; after the step the
temporary registers are restored to their original values in the
architecture specific code.
But if the displaced step does not succeed then the temporary
registers are never restored, and they retain their modified values.
In this context a temporary register is simply any register that is
not otherwise used by the instruction being stepped that the
architecture specific code considers safe to borrow for the lifetime
of the instruction being stepped.
In the bug PR gdb/22921, the amd64 instruction being stepped is
an rip-relative instruction like this:
jmp *0x2fe2(%rip)
When we displaced step this instruction we borrow a register, and
modify the instruction to something like:
jmp *0x2fe2(%rcx)
with %rcx having its value adjusted to contain the original %rip
value.
Now if the displaced step does not succeed, then %rcx will be left
with a corrupted value. Obviously corrupting any register is bad; in
the bug report this problem was spotted because %rcx is used as a
function argument register.
And finally, why might a displaced step not succeed? Asynchronous
signals provide one reason. GDB sets up for the displaced step and,
at that precise moment, the OS delivers a signal (SIGALRM in the bug
report), the signal stops the inferior at the address of the displaced
instruction. GDB cancels the displaced instruction, handles the
signal, and then tries again with the displaced step. But it is that
first cancellation of the displaced step that causes the problem; in
that case GDB (correctly) sees the displaced step as having not
completed, and so does not perform the architecture specific fixup,
leaving the register corrupted.
The reason why I think AArch64, rs6000, i386, and s390 are not affected
by this problem is that I don't believe these architectures make use
of any temporary registers, so when a displaced step is not completed
successfully, the minimal fix up is sufficient.
On amd64 we use at most one temporary register.
On ARM, looking at arm_displaced_step_copy_insn_closure, we could
modify up to 16 temporary registers, and the instruction being
displaced stepped could be expanded to multiple replacement
instructions, which increases the chances of this bug triggering.
This commit only aims to address the issue on amd64 for now, though I
believe that the approach I'm proposing here might be applicable for
ARM too.
What I propose is that we always call gdbarch_displaced_step_fixup.
We will now pass an extra argument to gdbarch_displaced_step_fixup,
this is a boolean that indicates whether GDB thinks the displaced step
completed successfully or not.
When this flag is false this indicates that the displaced step halted
for some "other" reason. On ARM GDB can potentially read the
inferior's program counter in order to figure out how far through the
sequence of replacement instructions we got, and from that GDB can
figure out what fixup needs to be performed.
On targets like amd64 the problem is slightly easier as displaced
stepping only uses a single replacement instruction. If the displaced
step didn't complete then GDB knows that the single instruction didn't
execute.
The point is that by always calling gdbarch_displaced_step_fixup, each
architecture can now ensure that the inferior state is fixed up
correctly in all cases, not just the success case.
On amd64 this ensures that we always restore the temporary register
value, and so bug PR gdb/22921 is resolved.
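The idea can be sketched with a toy model. This is not GDB's code:
the register dictionary, the Closure class, and the function names
prepare, minimal_rollback and fixup are all hypothetical, and the
stepped instruction is assumed to be a jump. The model only shows why
a fixup that is always called can restore the borrowed register, while
a pc-only rollback cannot.

```python
from dataclasses import dataclass


@dataclass
class Closure:
    """State remembered between preparing and finishing one displaced step."""
    original_pc: int
    tmp_reg: str
    tmp_saved: int


def prepare(regs, scratch_addr, insn_length, tmp_reg="rcx"):
    """Borrow TMP_REG and move the pc to the scratch copy of the instruction."""
    closure = Closure(regs["rip"], tmp_reg, regs[tmp_reg])
    # Model assumption: the borrowed register holds the value that %rip
    # would have had while executing the original instruction.
    regs[tmp_reg] = regs["rip"] + insn_length
    regs["rip"] = scratch_addr
    return closure


def minimal_rollback(regs, closure):
    """Old behaviour for an incomplete step: only the pc is put back."""
    regs["rip"] = closure.original_pc


def fixup(regs, closure, completed_p):
    """New behaviour: always called, told whether the step completed."""
    # The borrowed register is restored in both cases.
    regs[closure.tmp_reg] = closure.tmp_saved
    if not completed_p:
        # The single replacement instruction never ran, so resume at
        # the original location.
        regs["rip"] = closure.original_pc
    # When the step completed, this toy model assumes the instruction
    # was a jump and leaves the pc at the jump target.


if __name__ == "__main__":
    regs = {"rip": 0x1000, "rcx": 42}
    closure = prepare(regs, scratch_addr=0x8000, insn_length=6)
    minimal_rollback(regs, closure)
    print("pc-only rollback leaves rcx =", hex(regs["rcx"]))

    regs = {"rip": 0x1000, "rcx": 42}
    closure = prepare(regs, scratch_addr=0x8000, insn_length=6)
    fixup(regs, closure, completed_p=False)
    print("always-called fixup leaves rcx =", regs["rcx"])
```

With the pc-only rollback the borrowed register keeps its modified
value; with the always-called fixup it returns to its original value
whether or not the step completed.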
In order to move all architectures to this new API, I have moved the
minimal roll-back version of the code inside the architecture specific
fixup functions for AArch64, rs6000, s390, and ARM. For all of these
except ARM I think this is good enough; as no temporaries are used, all
that's needed is the program counter restore anyway.
For ARM the minimal code is no worse than what we had before, though I
do consider this architecture's displaced-stepping broken.
I've updated the gdb.arch/amd64-disp-step.exp test to cover the
'jmpq*' instruction that was causing problems in the original bug, and
also added support for testing the displaced step in the presence of
asynchronous signal delivery.
I've also added two new tests (for amd64 and i386) that check that GDB
can correctly handle displaced stepping over a single instruction that
branches to itself. I added these tests after a first version of this
patch relied too much on checking the program-counter value in order
to see if the displaced instruction had executed. This works fine in
almost all cases, but when an instruction branches to itself a pure
program counter check is not sufficient. The new tests expose this
problem.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=22921
Approved-By: Pedro Alves <pedro@palves.net>
|
|
The stabs debug format is obsolete and there's no reason to think that
toolchains still have good support for it. Therefore, if a specific debug
format wasn't set in asm-source.exp then leave it to the assembler to
decide which one to use.
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
After commit 9675da25357c ("Use unrelocated_addr in minimal symbols"),
aarch64-linux started failing gdb.asm/asm-source.exp:
Running /home/thiago.bauermann/src/binutils-gdb/gdb/testsuite/gdb.asm/asm-source.exp ...
PASS: gdb.asm/asm-source.exp: f at main
PASS: gdb.asm/asm-source.exp: n at main
PASS: gdb.asm/asm-source.exp: next over macro
FAIL: gdb.asm/asm-source.exp: step into foo2
PASS: gdb.asm/asm-source.exp: info target
PASS: gdb.asm/asm-source.exp: info symbol
PASS: gdb.asm/asm-source.exp: list
PASS: gdb.asm/asm-source.exp: search
FAIL: gdb.asm/asm-source.exp: f in foo2
FAIL: gdb.asm/asm-source.exp: n in foo2 (the program exited)
FAIL: gdb.asm/asm-source.exp: bt ALL in foo2
FAIL: gdb.asm/asm-source.exp: bt 2 in foo2
PASS: gdb.asm/asm-source.exp: s 2
PASS: gdb.asm/asm-source.exp: n 2
FAIL: gdb.asm/asm-source.exp: bt 3 in foo3
PASS: gdb.asm/asm-source.exp: info source asmsrc1.s
FAIL: gdb.asm/asm-source.exp: finish from foo3 (the program is no longer running)
FAIL: gdb.asm/asm-source.exp: info source asmsrc2.s
PASS: gdb.asm/asm-source.exp: info sources
FAIL: gdb.asm/asm-source.exp: info line
FAIL: gdb.asm/asm-source.exp: next over foo3 (the program is no longer running)
FAIL: gdb.asm/asm-source.exp: return from foo2
PASS: gdb.asm/asm-source.exp: look at global variable
PASS: gdb.asm/asm-source.exp: x/i &globalvar
PASS: gdb.asm/asm-source.exp: disassem &globalvar, (int *) &globalvar+1
PASS: gdb.asm/asm-source.exp: look at static variable
PASS: gdb.asm/asm-source.exp: x/i &staticvar
PASS: gdb.asm/asm-source.exp: disassem &staticvar, (int *) &staticvar+1
PASS: gdb.asm/asm-source.exp: look at static function
The problem is simple: a pair of parentheses was removed from the
expression calculating text_end and thus text_size was only added if
lowest_text_address wasn't equal to -1.
This patch restores the previous behaviour and fixes the testcase.
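The precedence pitfall can be shown with a small analogue. The real
code is C++ and uses the ?: operator; Python's conditional expression
binds similarly loosely, so the same mistake can be reproduced. The
function names, the fallback_addr parameter, and the exact shape of
the expression are assumptions made for illustration.

```python
NO_ADDRESS = -1  # sentinel meaning "no lowest text address was found"


def text_end_buggy(lowest_text_address, fallback_addr, text_size):
    # Without parentheses this parses as
    #   fallback_addr if ... else (lowest_text_address + text_size)
    # so text_size is added only when the address is not the sentinel.
    return (fallback_addr if lowest_text_address == NO_ADDRESS
            else lowest_text_address + text_size)


def text_end_fixed(lowest_text_address, fallback_addr, text_size):
    # Parenthesising the conditional makes text_size apply to whichever
    # start address was selected.
    return (fallback_addr if lowest_text_address == NO_ADDRESS
            else lowest_text_address) + text_size
```

The two versions agree whenever lowest_text_address is set and differ
only in the sentinel case, which is why the regression showed up on
some configurations and not others.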
Tested on native aarch64-linux.
Reviewed-By: Tom Tromey <tom@tromey.com>
|