aboutsummaryrefslogtreecommitdiff
path: root/gdb/amdgpu-tdep.c
AgeCommit message (Collapse)AuthorFilesLines
2024-07-25gdb/amdgpu: remove unused includesSimon Marchi1-2/+0
Remove two includes reported as unused by clangd. Change-Id: Idfe27a6c21186de5bd5f8e8f7fdc0fd8ab4d451e
2024-03-26gdb, gdbserver, gdbsupport: remove includes of early headersSimon Marchi1-1/+0
Now that defs.h, server.h and common-defs.h are included via the `-include` option, it is no longer necessary for source files to include them. Remove all the inclusions of these files I could find. Update the generation scripts where relevant. Change-Id: Ia026cff269c1b7ae7386dd3619bc9bb6a5332837 Approved-By: Pedro Alves <pedro@palves.net>
2024-02-20gdb: pass frames as `const frame_info_ptr &`Simon Marchi1-4/+4
We currently pass frames to function by value, as `frame_info_ptr`. This is somewhat expensive: - the size of `frame_info_ptr` is 64 bytes, which is a bit big to pass by value - the constructors and destructor link/unlink the object in the global `frame_info_ptr::frame_list` list. This is an `intrusive_list`, so it's not so bad: it's just assigning a few points, there's no memory allocation as if it was `std::list`, but still it's useless to do that over and over. As suggested by Tom Tromey, change many function signatures to accept `const frame_info_ptr &` instead of `frame_info_ptr`. Some functions reassign their `frame_info_ptr` parameter, like: void the_func (frame_info_ptr frame) { for (; frame != nullptr; frame = get_prev_frame (frame)) { ... } } I wondered what to do about them, do I leave them as-is or change them (and need to introduce a separate local variable that can be re-assigned). I opted for the later for consistency. It might not be clear why some functions take `const frame_info_ptr &` while others take `frame_info_ptr`. Also, if a function took a `frame_info_ptr` because it did re-assign its parameter, I doubt that we would think to change it to `const frame_info_ptr &` should the implementation change such that it doesn't need to take `frame_info_ptr` anymore. It seems better to have a simple rule and apply it everywhere. Change-Id: I59d10addef687d157f82ccf4d54f5dde9a963fd0 Approved-By: Andrew Burgess <aburgess@redhat.com>
2024-01-12Update copyright year range in header of all files managed by GDBAndrew Burgess1-1/+1
This commit is the result of the following actions: - Running gdb/copyright.py to update all of the copyright headers to include 2024, - Manually updating a few files the copyright.py script told me to update, these files had copyright headers embedded within the file, - Regenerating gdbsupport/Makefile.in to refresh it's copyright date, - Using grep to find other files that still mentioned 2023. If these files were updated last year from 2022 to 2023 then I've updated them this year to 2024. I'm sure I've probably missed some dates. Feel free to fix them up as you spot them.
2023-11-21gdb: Remove uses of gdb::to_string (const std::string_view &)Lancelot Six1-10/+9
This patch removes all uses of to_string(const std::string_view&) and use the std::string ctor or implicit conversion from std::string_view to std::string instead. A later patch will remove this gdb::to_string while removing gdbsupport/gdb_string_view.h. Change-Id: I877cde557a0727be7b0435107e3c7a2aac165895 Approved-By: Tom Tromey <tom@tromey.com> Approved-By: Pedro Alves <pedro@palves.net>
2023-11-21gdb: Use std::string_view instead of gdb::string_viewLancelot Six1-25/+26
Given that GDB now requires a C++17, replace all uses of gdb::string_view with std::string_view. This change has mostly been done automatically: - gdb::string_view -> std::string_view - #include "gdbsupport/gdb_string_view.h" -> #include <string_view> One things which got brought up during review is that gdb::stging_view does support being built from "nullptr" while std::sting_view does not. Two places are manually adjusted to account for this difference: gdb/tui/tui-io.c:tui_getc_1 and gdbsupport/format.h:format_piece::format_piece. The above automatic change transformed "gdb::to_string (const gdb::string_view &)" into "gdb::to_string (const std::string_view &)". The various direct users of this function are now explicitly including "gdbsupport/gdb_string_view.h". A later patch will remove the users of gdb::to_string. The implementation and tests of gdb::string_view are unchanged, they will be removed in a following patch. Change-Id: Ibb806a7e9c79eb16a55c87c6e41ad396fecf0207 Approved-By: Tom Tromey <tom@tromey.com> Approved-By: Pedro Alves <pedro@palves.net>
2023-08-31[gdb/symtab] Factor out type::{alloc_fields,copy_fields}Tom de Vries1-4/+1
After finding this code in buildsym_compunit::finish_block_internal: ... ftype->set_fields ((struct field *) TYPE_ALLOC (ftype, nparams * sizeof (struct field))); ... and fixing PR30810 by using TYPE_ZALLOC, I wondered if there were more locations that needed fixing. I decided to make things easier to spot by factoring out a new function alloc_fields: ... /* Allocate the fields array of this type, with NFIELDS elements. If INIT, zero-initialize the allocated memory. */ void type::alloc_fields (unsigned int nfields, bool init = true); ... where: - a regular use would be "alloc_fields (nfields)", and - an exceptional use that needed no initialization would be "alloc_fields (nfields, false)". Pretty soon I discovered that most of the latter cases are due to initialization by memcpy, so I added two variants of copy_fields as well. After this rewrite there are 8 uses of set_fields left: ... gdb/coffread.c: type->set_fields (nullptr); gdb/coffread.c: type->set_fields (nullptr); gdb/coffread.c: type->set_fields (nullptr); gdb/eval.c: type->set_fields gdb/gdbtypes.c: type->set_fields (args); gdb/gdbtypes.c: t->set_fields (XRESIZEVEC (struct field, t->fields (), gdb/dwarf2/read.c: type->set_fields (new_fields); gdb/dwarf2/read.c: sub_type->set_fields (sub_type->fields () + 1); ... These fall into the following categories: - set to nullptr (coffread.c), - type not owned by objfile or gdbarch (eval.c), and - modifying an existing fields array, like adding an element at the end or dropping an element at the start (the rest). Tested on x86_64-linux.
2023-05-24gdbsupport: add support for references to checked_static_castSimon Marchi1-9/+19
Add a checked_static_cast overload that works with references. A bad dynamic cast with references throws std::bad_cast, it would be possible to implement the new overload based on that, but it seemed simpler to just piggy back off the existing function. I found some potential uses of this new overload in amd-dbgapi-target.c, update them to illustrate the use of the new overload. To build amd-dbgapi-target.c, on needs the amd-dbgapi library, which I don't expect many people to have. But I have it, and it builds fine here. I did test the new overload by making a purposely bad cast and it did catch it. Change-Id: Id6b6a7db09fe3b4aa43cddb60575ff5f46761e96 Reviewed-By: Lancelot SIX <lsix@lancelotsix.com> Reviewed-By: Andrew Burgess <aburgess@redhat.com>
2023-03-18Remove arch_typeTom Tromey1-2/+3
This removes arch_type, replacing all uses with the new type allocator. Reviewed-By: Simon Marchi <simon.marchi@efficios.com>
2023-03-07gdb/amdgpu: provide dummy implementation of gdbarch_return_value_as_valueSimon Marchi1-0/+12
The AMD GPU support has been merged shortly after commit 4e1d2f5814b2 ("Add new overload of gdbarch_return_value"), which made it mandatory for architectures to provide either a return_value or return_value_as_value implementation. Because of my failure to test properly after rebasing and before pushing, we get this with the current master: $ gdb ./gdb -nx --data-directory=data-directory -q -ex "set arch amdgcn:gfx1010" -batch /home/simark/src/binutils-gdb/gdb/gdbarch.c:517: internal-error: verify_gdbarch: the following are invalid ... return_value_as_value I started trying to change GDB to not force architectures to provide a return_value or return_value_as_value implementation, but Andrew pointed out that any serious port will have an implementation one day or another, and it's easy to add a dummy implementation in the mean time. So it's better to not complicate the core of GDB to know how to deal with this. There is an implementation of return_value in the downstream ROCgdb port (which we'll need to convert to the new return_value_as_value), which we'll contribute soon-ish. In the mean time, add a dummy implementation of return_value_as_value to avoid the failed assertion. Change-Id: I26edf441b511170aa64068fd248ab6201158bb63 Reviewed-By: Lancelot SIX <lancelot.six@amd.com>
2023-03-01gdb: update some copyright years (2022 -> 2023)Simon Marchi1-1/+1
The copyright years in the ROCm files (e.g. solib-rocm.c) are wrong, they end in 2022 instead of 2023. I suppose because I posted (or at least prepared) the patches in 2022 but merged them in 2023, and forgot to update the year. I found a bunch of other files that are in the same situation. Fix them all up. Change-Id: Ia55f5b563606c2ba6a89046f22bc0bf1c0ff2e10 Reviewed-By: Tom Tromey <tom@tromey.com>
2023-02-02gdb: initial support for ROCm platform (AMDGPU) debuggingSimon Marchi1-0/+1367
This patch adds the foundation for GDB to be able to debug programs offloaded to AMD GPUs using the AMD ROCm platform [1]. The latest public release of the ROCm release at the time of writing is 5.4, so this is what this patch targets. The ROCm platform allows host programs to schedule bits of code for execution on GPUs or similar accelerators. The programs running on GPUs are typically referred to as `kernels` (not related to operating system kernels). Programs offloaded with the AMD ROCm platform can be written in the HIP language [2], OpenCL and OpenMP, but we're going to focus on HIP here. The HIP language consists of a C++ Runtime API and kernel language. Here's an example of a very simple HIP program: #include "hip/hip_runtime.h" #include <cassert> __global__ void do_an_addition (int a, int b, int *out) { *out = a + b; } int main () { int *result_ptr, result; /* Allocate memory for the device to write the result to. */ hipError_t error = hipMalloc (&result_ptr, sizeof (int)); assert (error == hipSuccess); /* Run `do_an_addition` on one workgroup containing one work item. */ do_an_addition<<<dim3(1), dim3(1), 0, 0>>> (1, 2, result_ptr); /* Copy result from device to host. Note that this acts as a synchronization point, waiting for the kernel dispatch to complete. */ error = hipMemcpyDtoH (&result, result_ptr, sizeof (int)); assert (error == hipSuccess); printf ("result is %d\n", result); assert (result == 3); return 0; } This program can be compiled with: $ hipcc simple.cpp -g -O0 -o simple ... where `hipcc` is the HIP compiler, shipped with ROCm releases. This generates an ELF binary for the host architecture, containing another ELF binary with the device code. The ELF for the device can be inspected with: $ roc-obj-ls simple 1 host-x86_64-unknown-linux file://simple#offset=8192&size=0 1 hipv4-amdgcn-amd-amdhsa--gfx906 file://simple#offset=8192&size=34216 $ roc-obj-extract 'file://simple#offset=8192&size=34216' $ file simple-offset8192-size34216.co simple-offset8192-size34216.co: ELF 64-bit LSB shared object, *unknown arch 0xe0* version 1, dynamically linked, with debug_info, not stripped ^ amcgcn architecture that my `file` doesn't know about ----ยด Running the program gives the very unimpressive result: $ ./simple result is 3 While running, this host program has copied the device program into the GPU's memory and spawned an execution thread on it. The goal of this GDB port is to let the user debug host threads and these GPU threads simultaneously. Here's a sample session using a GDB with this patch applied: $ ./gdb -q -nx --data-directory=data-directory ./simple Reading symbols from ./simple... (gdb) break do_an_addition Function "do_an_addition" not defined. Make breakpoint pending on future shared library load? (y or [n]) y Breakpoint 1 (do_an_addition) pending. (gdb) r Starting program: /home/smarchi/build/binutils-gdb-amdgpu/gdb/simple [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". [New Thread 0x7ffff5db7640 (LWP 1082911)] [New Thread 0x7ffef53ff640 (LWP 1082913)] [Thread 0x7ffef53ff640 (LWP 1082913) exited] [New Thread 0x7ffdecb53640 (LWP 1083185)] [New Thread 0x7ffff54bf640 (LWP 1083186)] [Thread 0x7ffdecb53640 (LWP 1083185) exited] [Switching to AMDGPU Wave 2:2:1:1 (0,0,0)/0] Thread 6 hit Breakpoint 1, do_an_addition (a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24 24 *out = a + b; (gdb) info inferiors Num Description Connection Executable * 1 process 1082907 1 (native) /home/smarchi/build/binutils-gdb-amdgpu/gdb/simple (gdb) info threads Id Target Id Frame 1 Thread 0x7ffff5dc9240 (LWP 1082907) "simple" 0x00007ffff5e9410b in ?? () from /opt/rocm-5.4.0/lib/libhsa-runtime64.so.1 2 Thread 0x7ffff5db7640 (LWP 1082911) "simple" __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36 5 Thread 0x7ffff54bf640 (LWP 1083186) "simple" __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36 * 6 AMDGPU Wave 2:2:1:1 (0,0,0)/0 do_an_addition ( a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24 (gdb) bt Python Exception <class 'gdb.error'>: Unhandled dwarf expression opcode 0xe1 #0 do_an_addition (a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24 (gdb) continue Continuing. result is 3 warning: Temporarily disabling breakpoints for unloaded shared library "file:///home/smarchi/build/binutils-gdb-amdgpu/gdb/simple#offset=8192&size=67208" [Thread 0x7ffff54bf640 (LWP 1083186) exited] [Thread 0x7ffff5db7640 (LWP 1082911) exited] [Inferior 1 (process 1082907) exited normally] One thing to notice is the host and GPU threads appearing under the same inferior. This is a design goal for us, as programmers tend to think of the threads running on the GPU as part of the same program as the host threads, so showing them in the same inferior in GDB seems natural. Also, the host and GPU threads share a global memory space, which fits the inferior model. Another thing to notice is the error messages when trying to read variables or printing a backtrace. This is expected for the moment, since the AMD GPU compiler produces some DWARF that uses some non-standard extensions: https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html There were already some patches posted by Zoran Zaric earlier to make GDB support these extensions: https://inbox.sourceware.org/gdb-patches/20211105113849.118800-1-zoran.zaric@amd.com/ We think it's better to get the basic support for AMD GPU in first, which will then give a better justification for GDB to support these extensions. GPU threads are named `AMDGPU Wave`: a wave is essentially a hardware thread using the SIMT (single-instruction, multiple-threads) [3] execution model. GDB uses the amd-dbgapi library [4], included in the ROCm platform, for a few things related to AMD GPU threads debugging. Different components talk to the library, as show on the following diagram: +---------------------------+ +-------------+ +------------------+ | GDB | amd-dbgapi target | <-> | AMD | | Linux kernel | | +-------------------+ | Debugger | +--------+ | | | amdgcn gdbarch | <-> | API | <=> | AMDGPU | | | +-------------------+ | | | driver | | | | solib-rocm | <-> | (dbgapi.so) | +--------+---------+ +---------------------------+ +-------------+ - The amd-dbgapi target is a target_ops implementation used to control execution of GPU threads. While the debugging of host threads works by using the ptrace / wait Linux kernel interface (as usual), control of GPU threads is done through a special interface (dubbed `kfd`) exposed by the `amdgpu` Linux kernel module. GDB doesn't interact directly with `kfd`, but instead goes through the amd-dbgapi library (AMD Debugger API on the diagram). Since it provides execution control, the amd-dbgapi target should normally be a process_stratum_target, not just a target_ops. More on that later. - The amdgcn gdbarch (describing the hardware architecture of the GPU execution units) offloads some requests to the amd-dbgapi library, so that knowledge about the various architectures doesn't need to be duplicated and baked in GDB. This is for example for things like the list of registers. - The solib-rocm component is an solib provider that fetches the list of code objects loaded on the device from the amd-dbgapi library, and makes GDB read their symbols. This is very similar to other solib providers that handle shared libraries, except that here the shared libraries are the pieces of code loaded on the device. Given that Linux host threads are managed by the linux-nat target, and the GPU threads are managed by the amd-dbgapi target, having all threads appear in the same inferior requires the two targets to be in that inferior's target stack. However, there can only be one process_stratum_target in a given target stack, since there can be only one target per slot. To achieve it, we therefore resort the hack^W solution of placing the amd-dbgapi target in the arch_stratum slot of the target stack, on top of the linux-nat target. Doing so allows the amd-dbgapi target to intercept target calls and handle them if they concern GPU threads, and offload to beneath otherwise. See amd_dbgapi_target::fetch_registers for a simple example: void amd_dbgapi_target::fetch_registers (struct regcache *regcache, int regno) { if (!ptid_is_gpu (regcache->ptid ())) { beneath ()->fetch_registers (regcache, regno); return; } // handle it } ptids of GPU threads are crafted with the following pattern: (pid, 1, wave id) Where pid is the inferior's pid and "wave id" is the wave handle handed to us by the amd-dbgapi library (in practice, a monotonically incrementing integer). The idea is that on Linux systems, the combination (pid != 1, lwp == 1) is not possible. lwp == 1 would always belong to the init process, which would also have pid == 1 (and it's improbable for the init process to offload work to the GPU and much less for the user to debug it). We can therefore differentiate GPU and non-GPU ptids this way. See ptid_is_gpu for more details. Note that we believe that this scheme could break down in the context of containers, where the initial process executed in a container has pid 1 (in its own pid namespace). For instance, if you were to execute a ROCm program in a container, then spawn a GDB in that container and attach to the process, it will likely not work. This is a known limitation. A workaround for this is to have a dummy process (like a shell) fork and execute the program of interest. The amd-dbgapi target watches native inferiors, and "attaches" to them using amd_dbgapi_process_attach, which gives it a notifier fd that is registered in the event loop (see enable_amd_dbgapi). Note that this isn't the same "attach" as in PTRACE_ATTACH, but being ptrace-attached is a precondition for amd_dbgapi_process_attach to work. When the debugged process enables the ROCm runtime, the amd-dbgapi target gets notified through that fd, and pushes itself on the target stack of the inferior. The amd-dbgapi target is then able to intercept target_ops calls. If the debugged process disables the ROCm runtime, the amd-dbgapi target unpushes itself from the target stack. This way, the amd-dbgapi target's footprint stays minimal when debugging a process that doesn't use the AMD ROCm platform, it does not intercept target calls. The amd-dbgapi library is found using pkg-config. Since enabling support for the amdgpu architecture (amdgpu-tdep.c) depends on the amd-dbgapi library being present, we have the following logic for the interaction with --target and --enable-targets: - if the user explicitly asks for amdgcn support with --target=amdgcn-*-* or --enable-targets=amdgcn-*-*, we probe for the amd-dbgapi and fail if not found - if the user uses --enable-targets=all, we probe for amd-dbgapi, enable amdgcn support if found, disable amdgcn support if not found - if the user uses --enable-targets=all and --with-amd-dbgapi=yes, we probe for amd-dbgapi, enable amdgcn if found and fail if not found - if the user uses --enable-targets=all and --with-amd-dbgapi=no, we do not probe for amd-dbgapi, disable amdgcn support - otherwise, amd-dbgapi is not probed for and support for amdgcn is not enabled Finally, a simple test is included. It only tests hitting a breakpoint in device code and resuming execution, pretty much like the example shown above. [1] https://docs.amd.com/category/ROCm_v5.4 [2] https://docs.amd.com/bundle/HIP-Programming-Guide-v5.4 [3] https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads [4] https://docs.amd.com/bundle/ROCDebugger-API-Guide-v5.4 Change-Id: I591edca98b8927b1e49e4b0abe4e304765fed9ee Co-Authored-By: Zoran Zaric <zoran.zaric@amd.com> Co-Authored-By: Laurent Morichetti <laurent.morichetti@amd.com> Co-Authored-By: Tony Tye <Tony.Tye@amd.com> Co-Authored-By: Lancelot SIX <lancelot.six@amd.com> Co-Authored-By: Pedro Alves <pedro@palves.net>