Age | Commit message (Collapse) | Author | Files | Lines |
|
Most files including gdbcmd.h currently rely on it to access things
actually declared in cli/cli-cmds.h (setlist, showlist, etc). To make
things easy, replace all includes of gdbcmd.h with includes of
cli/cli-cmds.h. This might lead to some unused includes of
cli/cli-cmds.h, but it's harmless, and much faster than going through
the 170 or so files by hand.
Change-Id: I11f884d4d616c12c05f395c98bbc2892950fb00f
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Move the declarations out of defs.h, and the implementations out of
findvar.c.
I opted for a new file, because this functionality of converting
integers to bytes and vice-versa seems a bit to generic to live in
findvar.c.
Change-Id: I524858fca33901ee2150c582bac16042148d2251
Approved-By: John Baldwin <jhb@FreeBSD.org>
|
|
Rename the method to `register_debug_string`.
This makes it easier to introduce `target_debug_printf` in a subsequent
patch.
Change-Id: I5bb2d49476d17940d503e66f40762e3f1e3baabc
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Now that defs.h, server.h and common-defs.h are included via the
`-include` option, it is no longer necessary for source files to include
them. Remove all the inclusions of these files I could find. Update
the generation scripts where relevant.
Change-Id: Ia026cff269c1b7ae7386dd3619bc9bb6a5332837
Approved-By: Pedro Alves <pedro@palves.net>
|
|
This commit is the result of the following actions:
- Running gdb/copyright.py to update all of the copyright headers to
include 2024,
- Manually updating a few files the copyright.py script told me to
update, these files had copyright headers embedded within the
file,
- Regenerating gdbsupport/Makefile.in to refresh it's copyright
date,
- Using grep to find other files that still mentioned 2023. If
these files were updated last year from 2022 to 2023 then I've
updated them this year to 2024.
I'm sure I've probably missed some dates. Feel free to fix them up as
you spot them.
|
|
displaced_step_finish can be called with event_status.kind ==
TARGET_WAITKIND_THREAD_EXITED, and in that case it is not possible to
get at the already-exited thread's registers.
This patch moves the get_thread_regcache calls to branches that
actually need it, where we know the thread is still alive.
It also adds an assertion to get_thread_regcache, to help catching
these broken cases sooner.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I63b5eacb3e02a538fc5087c270d8025adfda88c3
|
|
Add a new variant of gdbarch_pseudo_register_write that takes a
frame_info in order to write raw registers. Use this new method when
available:
- in put_frame_register, when trying to write a pseudo register to a given frame
- in regcache::cooked_write
No implementation is migrated to use this new method (that will come in
subsequent patches), so no behavior change is expected here.
The objective is to fix writing pseudo registers to non-current
frames. See previous commit "gdb: read pseudo register through
frame" for a more detailed explanation.
Change-Id: Ie7fe364a15a4d86c2ecb09de2b4baa08c45555ac
Reviewed-By: John Baldwin <jhb@FreeBSD.org>
|
|
gdbarch_deprecated_pseudo_register_write
The next patch introduces a new variant of gdbarch_pseudo_register_write
that takes a frame instead of a regcache for implementations to write
raw registers. Rename to old one to make it clear it's deprecated.
Change-Id: If8872c89c6f8a1edfcab983eb064248fd5ff3115
Reviewed-By: John Baldwin <jhb@FreeBSD.org>
|
|
Change gdbarch_pseudo_register_read_value to take a frame instead of a
regcache. The frame (and formerly the regcache) is used to read raw
registers needed to make up the pseudo register value. The problem with
using the regcache is that it always provides raw register values for
the current frame (frame 0).
Let's say the user wants to read the ebx register on amd64. ebx is a pseudo
register, obtained by reading the bottom half (bottom 4 bytes) of the
rbx register, which is a raw register. If the currently selected frame
is frame 0, it works fine:
(gdb) frame 0
#0 break_here_asm () at /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.arch/amd64-pseudo-unwind-asm.S:36
36 in /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.arch/amd64-pseudo-unwind-asm.S
(gdb) p/x $ebx
$1 = 0x24252627
(gdb) p/x $rbx
$2 = 0x2021222324252627
But if the user is looking at another frame, and the raw register behind
the pseudo register has been saved at some point in the call stack, then
we get a wrong answer:
(gdb) frame 1
#1 0x000055555555517d in caller () at /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.arch/amd64-pseudo-unwind-asm.S:56
56 in /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.arch/amd64-pseudo-unwind-asm.S
(gdb) p/x $ebx
$3 = 0x24252627
(gdb) p/x $rbx
$4 = 0x1011121314151617
Here, the value of ebx was computed using the value of rbx in frame 0
(through the regcache), it should have been computed using the value of
rbx in frame 1.
In other to make this work properly, make the following changes:
- Make dwarf2_frame_prev_register return nullptr if it doesn't know how
to unwind a register and that register is a pseudo register.
Previously, it returned `frame_unwind_got_register`, meaning, in our
example, "the value of ebx in frame 1 is the same as the value of ebx
in frame 0", which is obviously false. Return nullptr as a way to
say "I don't know".
- In frame_unwind_register_value, when prev_register (for instance
dwarf2_frame_prev_register) returns nullptr, and we are trying to
read a pseudo register, try to get the register value through
gdbarch_pseudo_register_read_value or gdbarch_pseudo_register_read.
If using gdbarch_pseudo_register_read, the behavior is known to be
broken. Implementations should be migrated to use
gdbarch_pseudo_register_read_value to fix that.
- Change gdbarch_pseudo_register_read_value to take a frame_info
instead of a regcache, update implementations (aarch64, amd64, i386).
In i386-tdep.c, I made a copy of i386_mmx_regnum_to_fp_regnum that
uses a frame instead of a regcache. The version using the regcache
is still used by i386_pseudo_register_write. It will get removed in
a subsequent patch.
- Add some helpers in value.{c,h} to implement the common cases of
pseudo registers: taking part of a raw register and concatenating
multiple raw registers.
- Update readable_regcache::{cooked_read,cooked_read_value} to pass the
current frame to gdbarch_pseudo_register_read_value. Passing the
current frame will give the same behavior as before: for frame 0, raw
registers will be read from the current thread's regcache.
Notes:
- I do not plan on changing gdbarch_pseudo_register_read to receive a
frame instead of a regcache. That method is considered deprecated.
Instead, we should be working on migrating implementations to use
gdbarch_pseudo_register_read_value instead.
- In frame_unwind_register_value, we still ask the unwinder to try to
unwind pseudo register values. It's apparently possible for the
debug info to provide information about [1] pseudo registers, so we
want to try that first, before falling back to computing them
ourselves.
[1] https://inbox.sourceware.org/gdb-patches/20180528174715.A954AD804AD@oc3748833570.ibm.com/
Change-Id: Id6ef1c64e19090a183dec050e4034d8c2394e7ca
Reviewed-by: John Baldwin <jhb@FreeBSD.org>
|
|
Add value::allocate_register, to facilitate allocating a value
representing a register in a given frame (or rather, in the given
frame's previous frame). It will be used in a subsequent patch. I
changed one relatively obvious spot that could use it, to at least
exercise the code path.
Change-Id: Icd4960f5e471a74b657bb3596c88d89679ef3772
Reviewed-By: John Baldwin <jhb@FreeBSD.org>
|
|
Change most of regcache (and base classes) to use array_view when
possible, instead of raw pointers. By propagating the use of array_view
further, it enables having some runtime checks to make sure the what we
read from or write to regcaches has the expected length (such as the one
in the `copy(array_view, array_view)` function. It also integrates well
when connecting with other APIs already using gdb::array_view.
Add some overloads of the methods using raw pointers to avoid having to
change all call sites at once (which is both a lot of work and risky).
I tried to do this change in small increments, but since many of these
functions use each other, it ended up simpler to do it in one shot than
having a lot of intermediary / transient changes.
This change extends into gdbserver as well, because there is some part
of the regcache interface that is shared.
Changing the reg_buffer_common interface to use array_view caused some
build failures in nat/aarch64-scalable-linux-ptrace.c. That file
currently "takes advantage" of the fact that
reg_buffer_common::{raw_supply,raw_collect} operates on `void *`, which
IMO is dangerous. It uses raw_supply/raw_collect directly on
uint64_t's, which I guess is fine because it is expected that native
code will have the same endianness as the debugged process. To
accomodate that, add some overloads of raw_collect and raw_supply that
work on uint64_t.
This file also uses raw_collect and raw_supply on `char` pointers.
Change it to use `gdb_byte` pointers instead. Add overloads of
raw_collect and raw_supply that work on `gdb_byte *` and make an
array_view on the fly using the register's size. Those call sites could
be converted to use array_view with not much work, in which case these
overloads could be removed, but I didn't want to do it in this patch, to
avoid starting to dig in arch-specific code.
During development, I inadvertently changed reg_buffer::raw_compare's
behavior to not accept an offset equal to the register size. This
behavior (effectively comparing 0 bytes, returning true) change was
caught by the AArch64 SME core tests. Add a selftest to make sure that
this raw_compare behavior is preserved in the future.
Change-Id: I9005f04114543ddff738949e12d85a31855304c2
Reviewed-By: John Baldwin <jhb@FreeBSD.org>
|
|
Make a few simplifications in these functions.
1. When checking if we need to do nothing, if the length is 0, we don't
need to do anything, regardless of the value of offset. Remove the
offset check.
2. When check if transferring the whole register, if the length is equal
to the register size, then we transfer the whole register, no need to
check the offset. Remove the offset check.
3. In the gdb_asserts, it is unnecessary to check for:
offset <= reg_size
given that right after we check for:
len >= 0 && offset + len <= reg_size
If `offset + len` is <= reg_size and len is >= 0, then necessarily
offset is <= reg_size. Remove the `offset <= reg_size` check.
Change-Id: I30a73acdc7bf432c45a07f5f177224d1cdc298e8
Reviewed-By: John Baldwin <jhb@FreeBSD.org>
|
|
Right now, gdbsupport/common-regcache.h contains two abstractons for a
regcache. An opaque type `regcache` (gdb and gdbserver both have their
own regcache that is the concrete version of this) and an abstract base
class `reg_buffer_common`, that is the base of regcaches on both sides.
These abstractions allow code to be written for both gdb and gdbserver,
for instance in the gdb/arch sub-directory.
However, having two
different abstractions is impractical. If some common code has a regcache,
and wants to use an operation defined on reg_buffer_common, it can't.
It would be better to have just one. Change all instances of `regcache
*` in gdbsupport/common-regcache.h to be `reg_buffer_common *`, then fix
fallouts.
Implementations in gdb and gdbserver now need to down-cast (using
gdb::checked_static_cast) from reg_buffer_common to their concrete
regcache type. Some of them could be avoided by changing free functions
(like regcache_register_size) to be virtual methods on
reg_buffer_common. I tried it, it seems to work, but I did not include
it in this series to avoid adding unnecessary changes.
Change-Id: Ia5503adb6b5509a0f4604bd2a68b4642cc5283fd
Reviewed-by: John Baldwin <jhb@FreeBSD.org>
|
|
regcache::transfer_regset iterates over an array of regcache_map_entry,
transferring the registers (between regcache and buffer) described by
those entries. It stops either when it reaches the end of the
regcache_map_entry array (marked by a null entry) or (it seems like the
intent is) when it reaches the end of the buffer (in which case not all
described registers are transferred).
I said "seems like the intent is", because there appears to be a small
bug. transfer_regset is made of two loops:
foreach regcache_map_entry:
foreach register described by the regcache_map_entry:
if the register doesn't fit in the remainder of the buffer:
break
transfer register
When stopping because we have reached the end of the buffer, the break
only breaks out of the inner loop.
This problem causes some failures when I run tests such as
gdb.arch/aarch64-sme-core-3.exp (on AArch64 Linux, in qemu). This is
partly due to aarch64_linux_iterate_over_regset_sections failing to add
a null terminator in its regcache_map_entry array, but I think there is
still a problem in transfer_regset.
The sequence to the crash is:
- The `regcache_map_entry za_regmap` object built in
aarch64_linux_iterate_over_regset_sections does not have a null
terminator.
- When the target does not have a ZA register,
aarch64_linux_collect_za_regset calls `regcache->collect_regset` with
a size of 0 (it's actually pointless, but still it should work).
- transfer_regset gets called with a buffer size of 0.
- transfer_regset detects that the register to transfer wouldn't fit in
0 bytes, so it breaks out of the inner loop.
- The outer loop tries to go read the next regcache_map_entry, but
there isn't one, and we start reading garbage.
Obviously, this would get fixed by making
aarch64_linux_iterate_over_regset_sections use a null terminator (which
is what the following patch does). But I think that when detecting that
there is not enough buffer left for the current register,
transfer_regset should return, not only break out of the inner loop.
This is a kind of contrived scenario, but imagine we have these two
regcache_map_entry objects:
- 2 registers of 8 bytes
- 2 registers of 4 bytes
For some reason, the caller passes a buffer of 12 bytes.
transfer_regset will detect that the second 8 byte register does not
fit, and break out of the inner loop. However, it will then go try the
next regcache_map_entry. It will see that it can fit one 4 byte
register in the remaining buffer space, and transfer it from/to there.
This is very likely not an expected behavior, we wouldn't expect to
read/write this sequence of registers from/to the buffer.
In this example, whether passing a 12 bytes buffer makes sense or
whether it is a size computation bug in the caller, we don't know, but I
think that exiting as soon as a register doesn't fit is the sane thing
to do.
Change-Id: Ia349627d2e5d281822ade92a8e7a4dea4f839e07
Reviewed-By: John Baldwin <jhb@FreeBSD.org>
Reviewed-By: Luis Machado <luis.machado@arm.com>
|
|
Since GDB now requires C++17, we don't need the internally maintained
gdb::optional implementation. This patch does the following replacing:
- gdb::optional -> std::optional
- gdb::in_place -> std::in_place
- #include "gdbsupport/gdb_optional.h" -> #include <optional>
This change has mostly been done automatically. One exception is
gdbsupport/thread-pool.* which did not use the gdb:: prefix as it
already lives in the gdb namespace.
Change-Id: I19a92fa03e89637bab136c72e34fd351524f65e9
Approved-By: Tom Tromey <tom@tromey.com>
Approved-By: Pedro Alves <pedro@palves.net>
|
|
Remove get_current_regcache, inlining the call to get_thread_regcache in
callers. When possible, pass the right thread_info object known from
the local context. Otherwise, fall back to passing `inferior_thread ()`.
This makes the reference to global context bubble up one level, a small
step towards the long term goal of reducing the number of references to
global context (or rather, moving those references as close as possible
to the top of the call tree).
No behavior change expected.
Change-Id: Ifa6980c88825d803ea586546b6b4c633c33be8d6
|
|
While looking at the regcache code, I noticed that the address space
(passed to regcache when constructing it, and available through
regcache::aspace) wasn't relevant for the regcache itself. Callers of
regcache::aspace use that method because it appears to be a convenient
way of getting the address space for a thread, if you already have the
regcache. But there is always another way to get the address space, as
the callers pretty much always know which thread they are dealing with.
The regcache code itself doesn't use the address space.
This patch removes anything related to address_space from the regcache
code, and updates callers to get it from the thread in context. This
removes a bit of unnecessary complexity from the regcache code.
The current get_thread_arch_regcache function gets an address_space for
the given thread using the target_thread_address_space function (which
calls the target_ops::thread_address_space method). This suggest that
there might have been the intention of supporting per-thread address
spaces. But digging through the history, I did not find any such case.
Maybe this method was just added because we needed a way to get an
address space from a ptid (because constructing a regcache required an
address space), and this seemed like the right way to do it, I don't
know.
The only implementations of thread_address_space and
process_stratum_target::thread_address_space and
linux_nat_target::thread_address_space, which essentially just return
the inferior's address space. And thread_address_space is only used in
the current get_thread_arch_regcache, which gets removed. So, I think
that the thread_address_space target method can be removed, and we can
assume that it's fine to use the inferior's address space everywhere.
Callers of regcache::aspace are updated to get the address space from
the relevant inferior, either using some context they already know
about, or in last resort using the current global context.
So, to summarize:
- remove everything in regcache related to address spaces
- in particular, remove get_thread_arch_regcache, and rename
get_thread_arch_aspace_regcache to get_thread_arch_regcache
- remove target_ops::thread_address_space, and
target_thread_address_space
- adjust all users of regcache::aspace to get the address space another
way
Change-Id: I04fd41b22c83fe486522af7851c75bcfb31c88c7
|
|
Use the gdb::byte_vector typedef when possible.
Change-Id: Ib2199201c052496992011ea02979de023d4d8a9a
|
|
Make the inferior's gdbarch field private, and add getters and setters.
This helped me by allowing putting breakpoints on set_arch to know when
the inferior's arch was set. A subsequent patch in this series also
adds more things in set_arch.
Change-Id: I0005bd1ef4cd6b612af501201cec44e457998eec
Reviewed-By: John Baldwin <jhb@FreeBSD.org>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
This changes hash_ptid to instead be a specialization of std::hash.
This makes it a little easier to use with standard containers.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Fix some more typos:
- distinquish -> distinguish
- actualy -> actually
- singe -> single
- frash -> frame
- chid -> child
- dissassembler -> disassembler
- uninitalized -> uninitialized
- precontidion -> precondition
- regsiters -> registers
- marge -> merge
- sate -> state
- garanteed -> guaranteed
- explictly -> explicitly
- prefices (nonstandard plural) -> prefixes
- bondary -> boundary
- formated -> formatted
- ithe -> the
- arrav -> array
- coresponding -> corresponding
- owend -> owned
- fials -> fails
- diasm -> disasm
- ture -> true
- tpye -> type
There's one code change, the name of macro SIG_CODE_BONDARY_FAULT changed to
SIG_CODE_BOUNDARY_FAULT.
Tested on x86_64-linux.
|
|
With the following patch, which teaches the amd-dbgapi target to handle
inferiors that fork, we end up with target stacks in the following
state, when an inferior that does not use the GPU forks an inferior that
eventually uses the GPU.
inf 1 inf 2
----- -----
amd-dbgapi
linux-nat linux-nat
exec exec
When a GPU thread from inferior 2 hits a breakpoint, the following
sequence of events would happen, if it was not for the current patch.
- we start with inferior 1 as current
- do_target_wait_1 makes inferior 2 current, does a target_wait, which
returns a stop event for an amd-dbgapi wave (thread).
- do_target_wait's scoped_restore_current_thread restores inferior 1 as
current
- fetch_inferior_event calls switch_to_target_no_thread with linux-nat
as the process target, since linux-nat is officially the process
target of inferior 2. This makes inferior 1 the current inferior, as
it's the first inferior with that target.
- In handle_signal_stop, we have:
ecs->event_thread->suspend.stop_pc
= regcache_read_pc (get_thread_regcache (ecs->event_thread));
context_switch (ecs);
regcache_read_pc executes while inferior 1 is still the current one
(because it's before the `context_switch`). This is a problem,
because the regcache is for a ptid managed by the amd-dbgapi target
(e.g. (12345, 1, 1)), a ptid that does not make sense for the
linux-nat target. The fetch_registers target call goes directly
to the linux-nat target, which gets confused.
- We would then get an error like:
Couldn't get extended state status: No such process.
... since linux-nat tries to do a ptrace call on tid 1.
GDB should switch to the inferior the ptid belongs to before doing the
target call to fetch registers, to make sure the call hits the right
target stack (it should be handled by the amd-dbgapi target in this
case). In fact the following patch does this change, and it would be
enough to fix this specific problem.
However, I propose to change regcache to make it switch to the right
inferior, if needed, before doing target calls. That makes the
interface as a whole more independent of the global context.
My first attempt at doing this was to find an inferior using the process
stratum target and the ptid that regcache already knows about:
gdb::optional<scoped_restore_current_thread> restore_thread;
inferior *inf = find_inferior_ptid (this->target (), this->ptid ());
if (inf != current_inferior ())
{
restore_thread.emplace ();
switch_to_inferior_no_thread (inf);
}
However, this caused some failures in fork-related tests and gdbserver
boards. When we detach a fork child, we may create a regcache for the
child, but there is no corresponding inferior. For instance, to restore
the PC after a displaced step over the fork syscall. So
find_inferior_ptid would return nullptr, and
switch_to_inferior_no_thread would hit a failed assertion.
So, this patch adds to regcache the information "the inferior to switch
to to makes target calls". In typical cases, it will be the inferior
that matches the regcache's ptid. But in some cases, like the detached
fork child one, it will be another inferior (in this example, it will be
the fork parent inferior).
The problem that we witnessed was in regcache::raw_update specifically,
but I looked for other regcache methods doing target calls, and added
the same inferior switching code to raw_write too.
In the regcache constructor and in get_thread_arch_aspace_regcache,
"inf_for_target_calls" replaces the process_stratum_target parameter.
We suppose that the process stratum target that would be passed
otherwise is the same that is in inf_for_target_calls's target stack, so
we don't need to pass both in parallel. The process stratum target is
still used as a key in the `target_pid_ptid_regcache_map` map, but
that's it.
There is one spot that needs to be updated outside of the regcache code,
which is the path that handles the "restore PC after a displaced step in
a fork child we're about to detach" case mentioned above.
regcache_test_data needs to be changed to include full-fledged mock
contexts (because there now needs to be inferiors, not just targets).
Change-Id: Id088569ce106e1f194d9ae7240ff436f11c5e123
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
The regcache class takes a process_stratum_target and then exposes it
through regcache::target. But it doesn't use it itself, suggesting it
doesn't really make sense to put it there. The only user of
regcache::target is record_btrace_target::fetch_registers, but it might
as well just get it from the current target stack. This simplifies a
little bit a patch later in this series.
Change-Id: I8878d875805681c77f469ac1a2bf3a508559a62d
Reviewed-By: Pedro Alves <pedro@palves.net>
|
|
This introduces the set_lval method on value, one step toward removing
deprecated_lval_hack. Ultimately I think the goal should be for some
of these set_* methods to be replaced with constructors; but I haven't
done this, as the series is already too long. Other 'deprecated'
methods can probably be handled the same way.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This turns many functions that are related to optimized-out or
availability-checking to be methods of value. The static function
value_entirely_covered_by_range_vector is also converted to be a
private method.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This turns value_contents_raw, value_contents_writeable, and
value_contents_all_raw into methods on value. The remaining functions
will be changed later in the series; they were a bit trickier and so I
didn't include them in this patch.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes allocate_value to be a static "constructor" of value.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes value_type to be a method of value. Much of this patch
was written by script.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This patch adds the foundation for GDB to be able to debug programs
offloaded to AMD GPUs using the AMD ROCm platform [1]. The latest
public release of the ROCm release at the time of writing is 5.4, so
this is what this patch targets.
The ROCm platform allows host programs to schedule bits of code for
execution on GPUs or similar accelerators. The programs running on GPUs
are typically referred to as `kernels` (not related to operating system
kernels).
Programs offloaded with the AMD ROCm platform can be written in the HIP
language [2], OpenCL and OpenMP, but we're going to focus on HIP here.
The HIP language consists of a C++ Runtime API and kernel language.
Here's an example of a very simple HIP program:
#include "hip/hip_runtime.h"
#include <cassert>
__global__ void
do_an_addition (int a, int b, int *out)
{
*out = a + b;
}
int
main ()
{
int *result_ptr, result;
/* Allocate memory for the device to write the result to. */
hipError_t error = hipMalloc (&result_ptr, sizeof (int));
assert (error == hipSuccess);
/* Run `do_an_addition` on one workgroup containing one work item. */
do_an_addition<<<dim3(1), dim3(1), 0, 0>>> (1, 2, result_ptr);
/* Copy result from device to host. Note that this acts as a synchronization
point, waiting for the kernel dispatch to complete. */
error = hipMemcpyDtoH (&result, result_ptr, sizeof (int));
assert (error == hipSuccess);
printf ("result is %d\n", result);
assert (result == 3);
return 0;
}
This program can be compiled with:
$ hipcc simple.cpp -g -O0 -o simple
... where `hipcc` is the HIP compiler, shipped with ROCm releases. This
generates an ELF binary for the host architecture, containing another
ELF binary with the device code. The ELF for the device can be
inspected with:
$ roc-obj-ls simple
1 host-x86_64-unknown-linux file://simple#offset=8192&size=0
1 hipv4-amdgcn-amd-amdhsa--gfx906 file://simple#offset=8192&size=34216
$ roc-obj-extract 'file://simple#offset=8192&size=34216'
$ file simple-offset8192-size34216.co
simple-offset8192-size34216.co: ELF 64-bit LSB shared object, *unknown arch 0xe0* version 1, dynamically linked, with debug_info, not stripped
^
amcgcn architecture that my `file` doesn't know about ----´
Running the program gives the very unimpressive result:
$ ./simple
result is 3
While running, this host program has copied the device program into the
GPU's memory and spawned an execution thread on it. The goal of this
GDB port is to let the user debug host threads and these GPU threads
simultaneously. Here's a sample session using a GDB with this patch
applied:
$ ./gdb -q -nx --data-directory=data-directory ./simple
Reading symbols from ./simple...
(gdb) break do_an_addition
Function "do_an_addition" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (do_an_addition) pending.
(gdb) r
Starting program: /home/smarchi/build/binutils-gdb-amdgpu/gdb/simple
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff5db7640 (LWP 1082911)]
[New Thread 0x7ffef53ff640 (LWP 1082913)]
[Thread 0x7ffef53ff640 (LWP 1082913) exited]
[New Thread 0x7ffdecb53640 (LWP 1083185)]
[New Thread 0x7ffff54bf640 (LWP 1083186)]
[Thread 0x7ffdecb53640 (LWP 1083185) exited]
[Switching to AMDGPU Wave 2:2:1:1 (0,0,0)/0]
Thread 6 hit Breakpoint 1, do_an_addition (a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24
24 *out = a + b;
(gdb) info inferiors
Num Description Connection Executable
* 1 process 1082907 1 (native) /home/smarchi/build/binutils-gdb-amdgpu/gdb/simple
(gdb) info threads
Id Target Id Frame
1 Thread 0x7ffff5dc9240 (LWP 1082907) "simple" 0x00007ffff5e9410b in ?? () from /opt/rocm-5.4.0/lib/libhsa-runtime64.so.1
2 Thread 0x7ffff5db7640 (LWP 1082911) "simple" __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
5 Thread 0x7ffff54bf640 (LWP 1083186) "simple" __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
* 6 AMDGPU Wave 2:2:1:1 (0,0,0)/0 do_an_addition (
a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24
(gdb) bt
Python Exception <class 'gdb.error'>: Unhandled dwarf expression opcode 0xe1
#0 do_an_addition (a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24
(gdb) continue
Continuing.
result is 3
warning: Temporarily disabling breakpoints for unloaded shared library "file:///home/smarchi/build/binutils-gdb-amdgpu/gdb/simple#offset=8192&size=67208"
[Thread 0x7ffff54bf640 (LWP 1083186) exited]
[Thread 0x7ffff5db7640 (LWP 1082911) exited]
[Inferior 1 (process 1082907) exited normally]
One thing to notice is the host and GPU threads appearing under
the same inferior. This is a design goal for us, as programmers tend to
think of the threads running on the GPU as part of the same program as
the host threads, so showing them in the same inferior in GDB seems
natural. Also, the host and GPU threads share a global memory space,
which fits the inferior model.
Another thing to notice is the error messages when trying to read
variables or printing a backtrace. This is expected for the moment,
since the AMD GPU compiler produces some DWARF that uses some
non-standard extensions:
https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html
There were already some patches posted by Zoran Zaric earlier to make
GDB support these extensions:
https://inbox.sourceware.org/gdb-patches/20211105113849.118800-1-zoran.zaric@amd.com/
We think it's better to get the basic support for AMD GPU in first,
which will then give a better justification for GDB to support these
extensions.
GPU threads are named `AMDGPU Wave`: a wave is essentially a hardware
thread using the SIMT (single-instruction, multiple-threads) [3]
execution model.
GDB uses the amd-dbgapi library [4], included in the ROCm platform, for
a few things related to AMD GPU threads debugging. Different components
talk to the library, as show on the following diagram:
+---------------------------+ +-------------+ +------------------+
| GDB | amd-dbgapi target | <-> | AMD | | Linux kernel |
| +-------------------+ | Debugger | +--------+ |
| | amdgcn gdbarch | <-> | API | <=> | AMDGPU | |
| +-------------------+ | | | driver | |
| | solib-rocm | <-> | (dbgapi.so) | +--------+---------+
+---------------------------+ +-------------+
- The amd-dbgapi target is a target_ops implementation used to control
execution of GPU threads. While the debugging of host threads works
by using the ptrace / wait Linux kernel interface (as usual), control
of GPU threads is done through a special interface (dubbed `kfd`)
exposed by the `amdgpu` Linux kernel module. GDB doesn't interact
directly with `kfd`, but instead goes through the amd-dbgapi library
(AMD Debugger API on the diagram).
Since it provides execution control, the amd-dbgapi target should
normally be a process_stratum_target, not just a target_ops. More
on that later.
- The amdgcn gdbarch (describing the hardware architecture of the GPU
execution units) offloads some requests to the amd-dbgapi library,
so that knowledge about the various architectures doesn't need to be
duplicated and baked in GDB. This is for example for things like
the list of registers.
- The solib-rocm component is an solib provider that fetches the list of
code objects loaded on the device from the amd-dbgapi library, and
makes GDB read their symbols. This is very similar to other solib
providers that handle shared libraries, except that here the shared
libraries are the pieces of code loaded on the device.
Given that Linux host threads are managed by the linux-nat target, and
the GPU threads are managed by the amd-dbgapi target, having all threads
appear in the same inferior requires the two targets to be in that
inferior's target stack. However, there can only be one
process_stratum_target in a given target stack, since there can be only
one target per slot. To achieve it, we therefore resort the hack^W
solution of placing the amd-dbgapi target in the arch_stratum slot of
the target stack, on top of the linux-nat target. Doing so allows the
amd-dbgapi target to intercept target calls and handle them if they
concern GPU threads, and offload to beneath otherwise. See
amd_dbgapi_target::fetch_registers for a simple example:
void
amd_dbgapi_target::fetch_registers (struct regcache *regcache, int regno)
{
if (!ptid_is_gpu (regcache->ptid ()))
{
beneath ()->fetch_registers (regcache, regno);
return;
}
// handle it
}
ptids of GPU threads are crafted with the following pattern:
(pid, 1, wave id)
Where pid is the inferior's pid and "wave id" is the wave handle handed
to us by the amd-dbgapi library (in practice, a monotonically
incrementing integer). The idea is that on Linux systems, the
combination (pid != 1, lwp == 1) is not possible. lwp == 1 would always
belong to the init process, which would also have pid == 1 (and it's
improbable for the init process to offload work to the GPU and much less
for the user to debug it). We can therefore differentiate GPU and
non-GPU ptids this way. See ptid_is_gpu for more details.
Note that we believe that this scheme could break down in the context of
containers, where the initial process executed in a container has pid 1
(in its own pid namespace). For instance, if you were to execute a ROCm
program in a container, then spawn a GDB in that container and attach to
the process, it will likely not work. This is a known limitation. A
workaround for this is to have a dummy process (like a shell) fork and
execute the program of interest.
The amd-dbgapi target watches native inferiors, and "attaches" to them
using amd_dbgapi_process_attach, which gives it a notifier fd that is
registered in the event loop (see enable_amd_dbgapi). Note that this
isn't the same "attach" as in PTRACE_ATTACH, but being ptrace-attached
is a precondition for amd_dbgapi_process_attach to work. When the
debugged process enables the ROCm runtime, the amd-dbgapi target gets
notified through that fd, and pushes itself on the target stack of the
inferior. The amd-dbgapi target is then able to intercept target_ops
calls. If the debugged process disables the ROCm runtime, the
amd-dbgapi target unpushes itself from the target stack.
This way, the amd-dbgapi target's footprint stays minimal when debugging
a process that doesn't use the AMD ROCm platform, it does not intercept
target calls.
The amd-dbgapi library is found using pkg-config. Since enabling
support for the amdgpu architecture (amdgpu-tdep.c) depends on the
amd-dbgapi library being present, we have the following logic for
the interaction with --target and --enable-targets:
- if the user explicitly asks for amdgcn support with
--target=amdgcn-*-* or --enable-targets=amdgcn-*-*, we probe for
the amd-dbgapi and fail if not found
- if the user uses --enable-targets=all, we probe for amd-dbgapi,
enable amdgcn support if found, disable amdgcn support if not found
- if the user uses --enable-targets=all and --with-amd-dbgapi=yes,
we probe for amd-dbgapi, enable amdgcn if found and fail if not found
- if the user uses --enable-targets=all and --with-amd-dbgapi=no,
we do not probe for amd-dbgapi, disable amdgcn support
- otherwise, amd-dbgapi is not probed for and support for amdgcn is not
enabled
Finally, a simple test is included. It only tests hitting a breakpoint
in device code and resuming execution, pretty much like the example
shown above.
[1] https://docs.amd.com/category/ROCm_v5.4
[2] https://docs.amd.com/bundle/HIP-Programming-Guide-v5.4
[3] https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads
[4] https://docs.amd.com/bundle/ROCDebugger-API-Guide-v5.4
Change-Id: I591edca98b8927b1e49e4b0abe4e304765fed9ee
Co-Authored-By: Zoran Zaric <zoran.zaric@amd.com>
Co-Authored-By: Laurent Morichetti <laurent.morichetti@amd.com>
Co-Authored-By: Tony Tye <Tony.Tye@amd.com>
Co-Authored-By: Lancelot SIX <lancelot.six@amd.com>
Co-Authored-By: Pedro Alves <pedro@palves.net>
|
|
This commit is the result of running the gdb/copyright.py script,
which automated the update of the copyright year range for all
source files managed by the GDB project to be updated to include
year 2023.
|
|
Some register sets described by an array of regcache_map_entry
structures do not have fixed register numbers in their associated
architecture but do describe a block of registers whose numbers are at
fixed offsets relative to some base register value. An example of
this are the TLS register sets for the ARM and AArch64 architectures.
Currently OS-specific architectures create register maps and register
sets dynamically using the register base number. However, this
requires duplicating the code to create the register map and register
set. To reduce duplication, add variants of the collect_regset and
supply_regset regcache methods which accept a base register number.
For valid register map entries (i.e. not REGCACHE_MAP_SKIP), add this
base register number to the value from the map entry to determine the
final register number.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Currently, every internal_error call must be passed __FILE__/__LINE__
explicitly, like:
internal_error (__FILE__, __LINE__, "foo %d", var);
The need to pass in explicit __FILE__/__LINE__ is there probably
because the function predates widespread and portable variadic macros
availability. We can use variadic macros nowadays, and in fact, we
already use them in several places, including the related
gdb_assert_not_reached.
So this patch renames the internal_error function to something else,
and then reimplements internal_error as a variadic macro that expands
__FILE__/__LINE__ itself.
The result is that we now should call internal_error like so:
internal_error ("foo %d", var);
Likewise for internal_warning.
The patch adjusts all calls sites. 99% of the adjustments were done
with a perl/sed script.
The non-mechanical changes are in gdbsupport/errors.h,
gdbsupport/gdb_assert.h, and gdb/gdbarch.py.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Change-Id: Ia6f372c11550ca876829e8fd85048f4502bdcf06
|
|
I looked at all the spots using value_mark, and converted all the
straightforward ones to use scoped_value_mark instead.
Regression tested on x86-64 Fedora 34.
|
|
After the previous few commit, gdbarch_register_name no longer returns
nullptr. This commit audits all the calls to gdbarch_register_name
and removes any code that checks the result against nullptr.
There should be no visible change after this commit.
|
|
Remove the macro, replace all uses with calls to type::length.
Change-Id: Ib9bdc954576860b21190886534c99103d6a47afb
|
|
gdbarch implements its own registry-like approach. This patch changes
it to instead use registry.h. It's a rather large patch but largely
uninteresting -- it's mostly a straightforward conversion from the old
approach to the new one.
The main benefit of this change is that it introduces type safety to
the gdbarch registry. It also removes a bunch of code.
One possible drawback is that, previously, the gdbarch registry
differentiated between pre- and post-initialization setup. This
doesn't seem very important to me, though.
|
|
With --enable-targets=all we have:
...
$ gdb -q -batch -ex "maint selftest"
...
Running selftest regcache::cooked_read_test::m68hc11.
warning: No frame soft register found in the symbol table.
Stack backtrace will not work.
Running selftest regcache::cooked_read_test::m68hc12.
warning: No frame soft register found in the symbol table.
Stack backtrace will not work.
Running selftest regcache::cooked_read_test::m68hc12:HCS12.
warning: No frame soft register found in the symbol table.
Stack backtrace will not work.
...
Likewise for regcache::cooked_write_test.
The warning has no use in the selftest context.
Fix this by skipping the specific selftests.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29224
|
|
native-extended-gdbserver board
Running
$ make check TESTS="gdb.gdb/unittest.exp" RUNTESTFLAGS="--target_board=native-extended-gdbserver"
I get some failures:
Running selftest regcache::cooked_write_test::i386.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i386:intel.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i386:x64-32.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i386:x64-32:intel.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i386:x86-64.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i386:x86-64:intel.^M
Self test failed: target already pushed^M
Running selftest regcache::cooked_write_test::i8086.^M
Self test failed: target already pushed^M
This is because the native-extended-gdbserver automatically connects GDB
to a GDBserver on startup, and therefore pushes a remote target on the
initial inferior. cooked_write_test is currently written in a way that
errors out if the current inferior has a process_stratum_target pushed.
Rewrite it to use scoped_mock_context, so it doesn't depend on the
current inferior (the current one upon entering the function).
Change-Id: I0357f989eacbdecc4bf88b043754451b476052ad
|
|
Now that filtered and unfiltered output can be treated identically, we
can unify the printf family of functions. This is done under the name
"gdb_printf". Most of this patch was written by script.
|
|
When registers are supplied via regcache_supply_register from a
register block described by a register map, registers may be stored in
slots smaller than GDB's native register size (e.g. x86 segment
registers are 16 bits, but the GDB registers for those are 32-bits).
regcache_collect_regset is careful to zero-extend slots larger than a
register size, but regcache_supply_regset just used
regcache::raw_supply_part and did not initialize the upper bytes of a
register value.
trad_frame_set_reg_regmap assumes these semantics (zero-extending
short registers). Upcoming patches also require these semantics for
handling x86 segment register values stored in 16-bit slots on
FreeBSD. Note that architecturally x86 segment registers are 16 bits,
but the x86 gdb architectures treat these registers as 32 bits.
|
|
There are several commands that may optionally send their output to a
file -- they take an optional filename argument and open a file. This
patch changes these commands to use filtered output. The rationale
here is that, when printing to gdb_stdout, filtering is appropriate --
it is, and should be, the default for all commands. And, when writing
to a file, paging will not happen anyway (it only happens when the
stream==gdb_stdout), so using the _filtered form will not change
anything.
|
|
This commit brings all the changes made by running gdb/copyright.py
as per GDB's Start of New Year Procedure.
For the avoidance of doubt, all changes in this commits were
performed by the script.
|
|
I think it would make sense for extract_integer, extract_signed_integer
and extract_unsigned_integer to take an array_view. This way, when we
extract an integer, we can validate that we don't overflow the buffer
passed by the caller (e.g. ask to extract a 4-byte integer but pass a
2-byte buffer).
- Change extract_integer to take an array_view
- Add overloads of extract_signed_integer and extract_unsigned_integer
that take array_views. Keep the existing versions so we don't
need to change all callers, but make them call the array_view
versions.
This shortens some places like:
result = extract_unsigned_integer (value_contents (result_val).data (),
TYPE_LENGTH (value_type (result_val)),
byte_order);
into
result = extract_unsigned_integer (value_contents (result_val), byte_order);
value_contents returns an array view that is of length
`TYPE_LENGTH (value_type (result_val))` already, so the length is
implicitly communicated through the array view.
Change-Id: Ic1c1f98c88d5c17a8486393af316f982604d6c95
|
|
The bug fixed by this [1] patch was caused by an out-of-bounds access to
a value's content. The code gets the value's content (just a pointer)
and then indexes it with a non-sensical index.
This made me think of changing functions that return value contents to
return array_views instead of a plain pointer. This has the advantage
that when GDB is built with _GLIBCXX_DEBUG, accesses to the array_view
are checked, making bugs more apparent / easier to find.
This patch changes the return types of these functions, and updates
callers to call .data() on the result, meaning it's not changing
anything in practice. Additional work will be needed (which can be done
little by little) to make callers propagate the use of array_view and
reap the benefits.
[1] https://sourceware.org/pipermail/gdb-patches/2021-September/182306.html
Change-Id: I5151f888f169e1c36abe2cbc57620110673816f3
|
|
This helper can be used in the fetch_registers and store_registers
target methods to determine if a register set includes a specific
register.
|
|
When debugging a large number of threads (thousands), looking up a
thread by ptid_t using the inferior::thread_list linked list can add up.
Add inferior::thread_map, an std::unordered_map indexed by ptid_t, and
change the find_thread_ptid function to look up a thread using
std::unordered_map::find, instead of iterating on all of the
inferior's threads. This should make it faster to look up a thread
from its ptid.
Change-Id: I3a8da0a839e18dee5bb98b8b7dbeb7f3dfa8ae1c
Co-Authored-By: Pedro Alves <pedro@palves.net>
|
|
Change inferior_list, the global list of inferiors, to use
intrusive_list. I think most other changes are somewhat obvious
fallouts from this change.
There is a small change in behavior in scoped_mock_context. Before this
patch, constructing a scoped_mock_context would replace the whole
inferior list with only the new mock inferior. Tests using two
scoped_mock_contexts therefore needed to manually link the two inferiors
together, as the second scoped_mock_context would bump the first mock
inferior from the thread list. With this patch, a scoped_mock_context
adds its mock inferior to the inferior list on construction, and removes
it on destruction. This means that tests run with mock inferiors in the
inferior list in addition to any pre-existing inferiors (there is always
at least one). There is no possible pid clash problem, since each
scoped mock inferior uses its own process target, and pids are per
process target.
Co-Authored-By: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I7eb6a8f867d4dcf8b8cd2dcffd118f7270756018
|
|
I spotted some indentation issues where we had some spaces followed by
tabs at beginning of line, that I wanted to fix. So while at it, I did
a quick grep to find and fix all I could find.
gdb/ChangeLog:
* Fix tab after space indentation issues throughout.
Change-Id: I1acb414dd9c593b474ae2b8667496584df4316fd
|
|
The alias creation functions currently accept a name to specify the
target command. They pass this to add_alias_cmd, which needs to lookup
the target command by name.
Given that:
- We don't support creating an alias for a command before that command
exists.
- We always use add_info_alias just after creating that target command,
and therefore have access to the target command's cmd_list_element.
... change add_com_alias to accept the target command as a
cmd_list_element (other functions are done in subsequent patches). This
ensures we don't create the alias before the target command, because you
need to get the cmd_list_element from somewhere when you call the alias
creation function. And it avoids an unecessary command lookup. So it
seems better to me in every aspect.
gdb/ChangeLog:
* command.h (add_com_alias): Accept target as
cmd_list_element. Update callers.
Change-Id: I24bed7da57221cc77606034de3023fedac015150
|
|
The reg_buffer constructor zero-initializes (value-initializes, in C++
speak) the gdb_bytes of the m_registers array. This is not necessary,
as these bytes are only meaningful if the corresponding register_status
is REG_VALID. If the corresponding register_status is REG_VALID, then
they will have been overwritten with the actual register data when
reading the registers from the system into the reg_buffer.
Fix that by removing the empty parenthesis following the new expression,
meaning that the bytes will now be default-initialized, meaning they'll
be left uninitialized. For reference, this is explained here:
https://en.cppreference.com/w/cpp/language/new#Construction
These new expressions were added in 835dcf92618e ("Use std::unique_ptr
in reg_buffer"). As mentioned in that commit message, the use of
value-initialisation was done on purpose to keep existing behavior, but
now there is some data that suggest it would be beneficial not to do it,
which is why I suggest changing it.
This doesn't make a big difference on typical architectures where the
register buffer is not that big. However, on ROCm (AMD GPU), the
register buffer is about 65000 bytes big, so the reg_buffer constructor
shows up in profiling. If you want to make some tests and profile it on
a standard system, it's always possible to change:
- m_registers.reset (new gdb_byte[m_descr->sizeof_raw_registers] ());
+ m_registers.reset (new gdb_byte[65000] ());
and run a program that constantly hits a breakpoint with a false
condition. For example, by doing this change and running the following
program:
static void break_here () {}
int main ()
{
for (int i = 0; i < 100000; i++)
break_here ();
}
with the following GDB incantation:
/usr/bin/time ./gdb -nx --data-directory=data-directory -q test -ex "b break_here if 0" -ex r -batch
I get, for value-intializing:
11.75user 7.68system 0:18.54elapsed 104%CPU (0avgtext+0avgdata 56644maxresident)k
And for default-initializing:
6.83user 8.42system 0:14.12elapsed 108%CPU (0avgtext+0avgdata 56512maxresident)k
gdb/ChangeLog:
* regcache.c (reg_buffer::reg_buffer): Default-initialize
m_registers array.
Change-Id: I5071a4444dee0530ce1bc58ebe712024ddd2b158
|