path: root/gdb
Age  Commit message  Author  Files  Lines
2025-08-27  gdb/testsuite: use libtool to launch selftests  Simon Marchi  1  -17/+87
When building GDB on Cygwin, gdb/gdb.exe is a libtool wrapper (which happens to be a PE executable). The real executable is at gdb/.libs/gdb.exe. The "does gdb have debug info" test that _selftest_setup does is bogus, because it loads the libtool wrapper (which doesn't have debug info), doesn't see any debug info, and thus the test is skipped.

The "correct" way to deal with libtool wrappers is to run the shell command you want to run under `libtool --mode=execute`. That will replace any path resembling a libtool wrapper with the real executable path. But it will also add to the environment the library paths necessary for this executable to find the libraries it needs.

Therefore, modify the `do_self_tests` proc to:

 - run the top-level GDB commands under `libtool --mode=execute`
 - pass the path to the inferior GDB on the command-line of the top-level, so that it gets replaced with the real executable's path

However, the "file" command was previously used to detect the presence of debug info in the GDB executable. It's not easy to implement this check when loading the executable directly on the command line. So, add a separate proc, _selftest_check_executable_debug_info, that spawns a temporary GDB and does the debug info check through the file command. This proc uses libtool to obtain the path to the real executable.

When building, we use the bundled libtool.m4 at the top of the tree. This means that the libtool system package, and therefore the libtool binary, might not be available. Check for the presence of the libtool binary first, and only do the conversion if it is found. If it is not found, the test should still work on platforms that don't require the conversion.

With this commit, the test runs on Cygwin, even though there are failures later.

Change-Id: Ie7b712cdc84671a5a017655a7e41687ff23f906c
Reviewed-By: Keith Seitz <keiths@redhat.com>
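The "wrap only if libtool is found" logic described above can be sketched as follows. This is a minimal illustration in Python rather than the Tcl actually used by the testsuite; the function name `wrap_with_libtool` and the `find_tool` parameter are hypothetical, and only the `libtool --mode=execute` invocation comes from the commit message.

```python
import shutil


def wrap_with_libtool(command, find_tool=shutil.which):
    """Prefix COMMAND with `libtool --mode=execute` when a libtool binary exists.

    FIND_TOOL is a hypothetical hook (defaulting to shutil.which) so the
    lookup can be replaced in tests.  When libtool is missing, the command
    is returned unchanged, which is fine on platforms without wrappers.
    """
    libtool = find_tool("libtool")
    if libtool is None:
        return list(command)
    return [libtool, "--mode=execute", *command]


if __name__ == "__main__":
    # The top-level GDB is given the inferior GDB's path on its command line,
    # so that libtool can rewrite both wrapper paths.
    print(wrap_with_libtool(["gdb/gdb", "-q", "gdb/gdb"]))
```

Passing the inferior path as an argument (instead of loading it later with "file") matters because libtool can only rewrite wrapper paths that appear on the command line it is asked to execute.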
2025-08-27  gdb/testsuite: do not copy gdb executable in self tests  Simon Marchi  1  -21/+2
In the ROCm-GDB testing process, we hit a problem that is a combination of these 3 factors:

1. In the downstream ROCm-GDB packages, the gdb executable is built with a relative RUNPATH:

       0x000000000000001d (RUNPATH)  Library runpath: [${ORIGIN}/../lib]

   This is done so that the installation is relocatable (the whole ROCm directory can be copied around) and things still work. For instance, the rocgdb executable needs to be able to find the libraries it needs, such as `librocm-dbgapi.so.0`. The relative runpath allows that.

2. For testing, we run the testsuite against the gdb executable installed from one of those packages. It is possible to ./configure the testsuite directory on its own, and then do:

       $ make check RUNTESTFLAGS="GDB=/opt/rocm/bin/rocgdb"

3. The selftests (such as gdb.gdb/selftest.exp) copy the GDB under test to the standard output directory, before trying to debug it.

The problem is that the gdb executable under test that has been copied can't find the libraries it needs.

With this patch, I propose that we don't copy the gdb executable, but debug it in place instead.

The comment removed in this patch says "in case this OS doesn't like to edit its own text space", and has been there since forever in some form. But it's not clear if there is a host OS (where we intend to run this test) that needs this nowadays. I would bet that there isn't. If there is in fact a GDB host OS (where we intend to run this test) that needs it, we can reinstate the copying, but as an opt-in operation.

Another situation where this change helps is on Windows, where gdb/gdb.exe is a libtool wrapper (the real executable is at gdb/.libs/gdb.exe). Copying gdb/gdb.exe doesn't accomplish anything useful. The next patch does further changes to account for the libtool wrapper case.

I tested on Linux and Cygwin, more testing would be welcome.

Change-Id: Id4148517d4fc4ecdd49f099c12003e3d16c6a93d
Reviewed-By: Keith Seitz <keiths@redhat.com>
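To see why copying the executable breaks a relative RUNPATH, here is a small sketch of how `${ORIGIN}` expands relative to the directory that holds the executable. The helper `resolve_runpath` and the copy destination path are made up for illustration, and the sketch expands the path purely lexically, whereas the real dynamic loader works on the resolved file location.

```python
import posixpath


def resolve_runpath(executable, runpath):
    """Expand ${ORIGIN} / $ORIGIN in RUNPATH relative to EXECUTABLE's directory.

    Hypothetical helper: a lexical approximation of what the dynamic
    loader does when it searches a relative RUNPATH entry.
    """
    origin = posixpath.dirname(executable)
    expanded = runpath.replace("${ORIGIN}", origin).replace("$ORIGIN", origin)
    return posixpath.normpath(expanded)


if __name__ == "__main__":
    runpath = "${ORIGIN}/../lib"
    # Installed location: the libraries are found next to bin/.
    print(resolve_runpath("/opt/rocm/bin/rocgdb", runpath))
    # Hypothetical copy in a test output directory: the search directory
    # moves with the copy, and the ROCm libraries are not there.
    print(resolve_runpath("/tmp/outputs/gdb.gdb/selftest/xgdb", runpath))
```

Debugging the executable in place keeps `${ORIGIN}` pointing at the installed tree, so no special handling of the library path is needed.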
2025-08-27  gdb/testsuite: remove function parameter from do_self_tests  Simon Marchi  3  -8/+7
The function to stop at is always main. Remove the parameter and hard-code main in _selftest_setup. Change-Id: Ibbbf598203b1658305eb6bc631d029652c10edac Reviewed-By: Keith Seitz <keiths@redhat.com>
2025-08-27  gdb/testsuite: namespace procs in lib/selftest-support.exp  Simon Marchi  1  -5/+5
Rename some procs in lib/selftest-support.exp that are only used internally, to make it a bit clearer that they are just internal helpers. Change-Id: Icd399ac42698209fbc8e798bf43a7d8464aa848c Reviewed-By: Keith Seitz <keiths@redhat.com>
2025-08-27  Fix formatting of gdbarch_components.py  Tom Tromey  1  -1/+1
pre-commit pointed out that gdbarch_components.py had a minor formatting issue, according to the official version of 'black'. This patch corrects the oversight.
2025-08-27  gdb/testsuite: work around empty substring bug in expect  Andrew Burgess  1  -11/+25
There is a bug in expect, see:

  https://sourceforge.net/p/expect/patches/26/

which causes empty substring matches from a regexp to instead return the complete input buffer. To reproduce this bug, try this command:

    expect -c 'spawn sh -c "echo -n -e \"abc\""; \
               expect -re "(a?)(a)(bc)"; \
               puts "\n"; \
               for { set i 1 } { $i < 4 } { incr i } { \
                 puts -nonewline "($i): \""; \
                 puts -nonewline $expect_out($i,string); \
                 puts "\"" \
               }'

For a working expect the output looks like:

    spawn sh -c echo -n -e "abc"
    abc
    (1): ""
    (2): "a"
    (3): "bc"

But for a broken expect the output looks like:

    spawn sh -c echo -n -e "abc"
    abc
    (1): "abc"
    (2): "a"
    (3): "bc"

Notice that (1) is now returning the complete input buffer rather than the empty string; this is wrong.

This is not the first time this bug has impacted GDB's testsuite; this commit seems to be working around the same problem:

    commit e579b537353cd91cb8fac1eaeb69901d4936766f
    Date:   Sat Aug 16 20:32:37 2025 +0200

        [gdb/testsuite] Fix TUI tests on freebsd

I recently pushed this commit:

    commit 3825c972a636852600b47c242826313f4b9963b8
    Date:   Wed Jun 18 15:02:29 2025 +0100

        gdb: allow gdb.Color to work correctly with pagination

which added gdb.python/py-color-pagination.exp. Bug PR gdb/33321 was then created as the test was failing on some hosts. It turns out this is the same expect bug.

The fix presented here is the same as for e579b537353cd91cb8: avoid using optional regexp substrings at the start of a regexp, and instead use two separate regexp patterns. With this change in place, the test now passes on all hosts.

There's no change in what is being tested after this commit.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33321
Approved-By: Tom de Vries <tdevries@suse.de>
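For comparison, the expected (non-buggy) group semantics and the shape of the workaround can be sketched with Python's `re` module. This is only an analogy: the real fix is in Tcl/expect patterns, and `match_with_optional_prefix` is a hypothetical helper that mirrors the idea of replacing one pattern with an optional leading group by two patterns without one.

```python
import re

# With correct regexp semantics, an optional group that matches nothing
# yields the empty string, not the whole input buffer.
single = re.match(r"(a?)(a)(bc)", "abc")


def match_with_optional_prefix(buffer):
    """Workaround shape: try "with prefix" and "without prefix" separately.

    Hypothetical helper; returns (prefix, middle, tail) or None.  No
    pattern here starts with an optional group, so an engine that
    mishandles empty sub-matches is never asked for one.
    """
    with_prefix = re.match(r"(a)(a)(bc)", buffer)
    if with_prefix:
        return with_prefix.group(1), with_prefix.group(2), with_prefix.group(3)
    without_prefix = re.match(r"(a)(bc)", buffer)
    if without_prefix:
        return "", without_prefix.group(1), without_prefix.group(2)
    return None


if __name__ == "__main__":
    print(single.groups())
    print(match_with_optional_prefix("abc"))
    print(match_with_optional_prefix("aabc"))
```

The two-pattern form tests the same inputs as the single pattern; it only moves the "prefix is absent" decision out of the regexp engine and into the caller.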
2025-08-27  [gdb/testsuite] Fix gdb.server/non-existing-program.exp on msys2-ucrt64  Tom de Vries  1  -1/+3
On msys2-ucrt64, with test-case gdb.server/non-existing-program.exp I get:
...
(gdb) quit^M
gdb_caching_proc allow_xml_test caused gdb_exit to be called
gdb_caching_proc allow_xml_test marked as called
gdb_caching_proc get_mount_point_map marked as called
builtin_spawn gdbserver stdio non-existing-program^M
Error creating process "non-existing-program " (error 2): \
  The system cannot find the file specified.^M^M
Exiting^M^M
FAIL: gdb.server/non-existing-program.exp: gdbserver exits cleanly
...

This happens because this regexp fails to match:
...
    # This is what we get on Windows.
    -re "Error creating process\r\n\r\nExiting\r\n" {
...

Fix this by updating the regexp.

Tested on x86_64-w64-mingw32 (msys2-ucrt64).
2025-08-27  [gdb/testsuite] Add have_startup_shell  Tom de Vries  6  -3/+54
Say we disable startup-with-shell, we get:
...
(gdb) run `echo 8`^M
Starting program: a2-run `echo 8`^M
[Thread debugging using libthread_db enabled]^M
Using host libthread_db library "/lib64/libthread_db.so.1".^M
usage: factorial <number>^M
[Inferior 1 (process 10787) exited with code 01]^M
(gdb) FAIL: gdb.base/a2-run.exp: run "a2-run" with shell (timeout)
...

Fix this by only doing this test if startup-with-shell is supported.

This fixes the test-case on msys2-ucrt64, where startup-with-shell is not supported.

Likewise in other test-cases.

Tested on x86_64-linux.
2025-08-27  [gdb/testsuite] Add missing require {!is_remote host}  Tom de Vries  6  -1/+8
I ran test-case gdb.python/py-color-pagination.exp with make-check-all.sh and noticed failures when using remote host. So I grepped to find all test-cases using with_ansi_styling_terminal and ran them with host/target board local-remote-host-native. Fix the failing test-cases using require {!is_remote host}. Tested on x86_64-linux.
2025-08-26  gdb/python: return gdbpy_ref<> from gdbpy_create_ptid_object  Andrew Burgess  3  -7/+12
Update gdbpy_create_ptid_object (python/py-infthread.c) to return a gdbpy_ref<> rather than a 'PyObject *'. This reduces the chances that a caller will leak an object. No such memory leaks are fixed in this commit; this is just a code improvement patch.

There should be no user visible changes after this commit.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-26  gdb: use existing argument more in rename_vmcore_idle_reg_sections  Andrew Burgess  1  -5/+8
In corelow.c, in the function rename_vmcore_idle_reg_sections, the argument ABFD holds the core file bfd pointer. When this function is called current_program_space->core_bfd() is passed as the argument value.

Within this function, we sometimes use the function argument, and sometimes access current_program_space->core_bfd() directly. This is confusing, and unnecessary. Let's not do that.

I've renamed the argument to cbfd (for Core file BFD), and then updated the function to make use of this argument throughout. This reduces the number of accesses to global state, which is, I think, a good thing.

There should be no user visible changes after this commit.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-26  gdb: more current_program_space->core_bfd() removal  Andrew Burgess  6  -20/+24
This commit changes the signature of the gdbarch_core_info_proc method so that it takes a 'struct bfd *' as an extra argument. This argument is used to pass through the core file bfd pointer.

Now, in corelow.c, when calling gdbarch_core_info_proc, we can pass through current_program_space->core_bfd() as the argument. Within the implementations (Linux and FreeBSD), we can use this argument rather than having to access the core file through current_program_space.

This reduces the use of global state, which I think is a good thing.

There should be no user visible changes after this commit.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-26  gdb: use current_program_space->core_bfd() a little less  Andrew Burgess  1  -11/+7
The function linux_read_core_file_mappings is passed an argument CBFD, which is the BFD for the core file. In core_target::build_file_mappings, where the function is called, we pass current_program_space->core_bfd() as the argument.

However, in linux_read_core_file_mappings, in some places we use the CBFD argument, and in other places we directly use current_program_space->core_bfd(). This is confusing, and unnecessary. Let's not do that.

Standardise on just using CBFD. This removes some references to global state in favour of passing the global state in as an argument; I think this is a good thing.

There should be no user visible changes after this commit.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-26  [gdb/tdep] Add XOP support in amd64_get_insn_details  Tom de Vries  1  -2/+30
Implement support for XOP instructions [1] in amd64_get_insn_details.

The encoding scheme is documented here [2]. Essentially it's a variant of the VEX3 encoding scheme, with:
- 0x8f as the first byte instead of 0xc4, and
- an opcode map >= 8.

The changes are roughly the same as the XOP part of an earlier submission [3], hence the tag.

The only real difference is that that patch proposed to implement xop_prefix_p using:
...
  return pfx[0] == 0x8f && (pfx[1] & 0x38);
...
which tries to resolve the conflict between the XOP prefix (starts with 0x8f) and the POP instruction (opcode 0x8f) by detecting that it's not a POP instruction.

Instead, use the way AMD has resolved this conflict in the specification, by checking for opcode map >= 8:
...
  gdb_byte m = pfx[1] & 0x1f;
  return pfx[0] == 0x8f && m >= 8;
...

Tested on x86_64-linux.

Co-Authored-By: Jan Beulich <jbeulich@suse.com>
Reviewed-By: Klaus Gerlicher<klaus.gerlicher.@intel.com>

[1] https://en.wikipedia.org/wiki/XOP_instruction_set
[2] https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/43479.pdf
[3] https://sourceware.org/pipermail/gdb-patches/2019-February/155347.html
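The opcode-map check quoted above can be tried out on a few byte pairs. The sketch below is a Python transcription of the two-line C snippet from the commit message, not the GDB source itself; the sample byte values are illustrative, chosen so that the second byte is either a plausible XOP map selector or a POP ModR/M byte (whose reg field is zero).

```python
def xop_prefix_p(pfx: bytes) -> bool:
    """Return True if PFX starts an XOP prefix rather than a POP instruction.

    Mirrors the check from the commit message: first byte 0x8f and the
    opcode map (low five bits of the second byte) at least 8.
    """
    if len(pfx) < 2:
        return False
    opcode_map = pfx[1] & 0x1F
    return pfx[0] == 0x8F and opcode_map >= 8


if __name__ == "__main__":
    # 0x8f 0xe8: low five bits are 8, so this is treated as XOP.
    print(xop_prefix_p(bytes([0x8F, 0xE8])))
    # 0x8f 0xc0: a POP with a register operand (ModR/M reg field is 0),
    # so the low five bits are below 8.
    print(xop_prefix_p(bytes([0x8F, 0xC0])))
```

Because a POP ModR/M byte always has bits 5:3 clear, its low five bits can only be 0 to 7, which is why "map >= 8" separates the two encodings.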
2025-08-26  gdb/python: fix an unlikely memory leak  Andrew Burgess  1  -16/+11
I noticed a possible memory leak in gdbpy_create_ptid_object, in py-infthread.c. We create a Tuple, and hold the reference in a 'PyObject*' local. If we then fail to create any of the tuple contents we perform an early exit, returning nullptr; this will leak the Tuple object.

Currently, we create the Tuple as the first action in the function, but we don't really need the tuple until the end of the function.

In this commit I have:

  1. Moved creation of the Tuple until the end of the function, just before we need it.

  2. Stored the Tuple reference in a gdbpy_ref<>. This is not strictly needed any more, but is (I think) good practice as future changes to the function will not need to worry about releasing the Tuple object.

  3. Taken the opportunity to replace a NULL with nullptr in this function.

  4. Inlined the local variable declarations to the point of first use.

There should be no user visible changes after this commit.

No tests as I have no idea how to make gdb_py_object_from_longest (and friends) fail, and so trigger the memory leak. I suspect we'd never actually see this leak in the real world, but it doesn't hurt to clean these things up.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-26  gdb: LoongArch: Improve loongarch_scan_prologue for correct backtrace  Hui Li  1  -0/+32
(1) Description of Problem:

When debugging the following code, the execution result of the backtrace command is incorrect.

    $ cat test.S
            .text
            .globl  fun1
            .type   fun1, @function
    fun1:
            or      $r12,$r0,$r0
            or      $r4,$r12,$r0
            jr      $r1
            .globl  fun
            .type   fun, @function
    fun:
            addi.d  $r3,$r3,-16
            st.d    $r1,$r3,8
            bl      fun1
            or      $r12,$r4,$r0
            or      $r4,$r12,$r0
            ld.d    $r1,$r3,8
            addi.d  $r3,$r3,16
            jr      $r1
            .globl  main
            .type   main, @function
    main:
            addi.d  $r3,$r3,-16
            st.d    $r1,$r3,8
            bl      fun
            nop
            ld.d    $r1,$r3,8
            addi.d  $r3,$r3,16
            jr      $r1

    $ gcc test.S -o test
    $ gdb test
    ...
    (gdb) b fun1
    Breakpoint 1 at 0x748
    (gdb) r
    Breakpoint 1, 0x0000555555554748 in fun1 ()
    (gdb) bt
    #0  0x0000555555554748 in fun1 ()
    #1  0x0000555555554758 in fun ()
    #2  0x0000555555554758 in fun ()
    #3  0x0000555555554758 in fun ()
    ....
    --Type <RET> for more, q to quit, c to continue without paging

(2) Root Cause Analysis:

The return address of fun() in r1(ra) is saved on the stack:

    addi.d  $r3,$r3,-16
    st.d    $r1,$r3,8

The bl instruction in fun() will call fun1() and save the value of pc+4 to r1(ra).

    bl      fun1
    or      $r12,$r4,$r0

Registers such as fp and ra that are saved on the stack of the sub-function are not recorded in the current code. When tracing back from fun() to main(), the pc of the previous frame is therefore read from the ra register instead of from the saved location on the stack. At this time, the value of the ra register in fun() is already the address of the next instruction after the bl, so it is impossible to trace back to main().

(3) Solution:

Record the location of ra, fp, s0 to s8 on the stack to ensure the correct execution of backtrace.

(4) Test:

    $ gdb test
    ...
    (gdb) b fun1
    Breakpoint 1 at 0x748
    (gdb) r
    Breakpoint 1, 0x0000555555554748 in fun1 ()
    (gdb) bt
    #0  0x0000555555554748 in fun1 ()
    #1  0x0000555555554758 in fun ()
    #2  0x0000555555554778 in main ()

Signed-off-by: Hui Li <lihui@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
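The root cause can be illustrated with a toy model of a prologue scanner. Everything here is an assumption made for illustration: instructions are represented as plain tuples rather than decoded machine words, the names `scan_prologue` and `caller_pc` are hypothetical, the addresses are shortened, and the model ignores that the real scanner stops at the current pc. It only shows the idea of recording where ra was saved and preferring that stack slot over the live register.

```python
RA, SP = 1, 3  # $r1 holds the return address, $r3 the stack pointer


def scan_prologue(insns):
    """Return (frame_size, saved), where saved maps a register number to
    its offset from the caller's stack pointer (the frame's CFA)."""
    frame_size = 0
    saved = {}
    for op, rd, rj, imm in insns:
        if op == "addi.d" and rd == SP and rj == SP and imm < 0:
            frame_size += -imm            # stack frame allocation
        elif op == "st.d" and rj == SP:
            saved[rd] = imm - frame_size  # register spilled to the frame
        else:
            break                         # first non-prologue instruction
    return frame_size, saved


def caller_pc(sp, ra_value, memory, insns):
    """Pick the caller's pc: the saved stack slot if ra was spilled,
    otherwise the live ra register (as in a leaf function)."""
    frame_size, saved = scan_prologue(insns)
    cfa = sp + frame_size
    if RA in saved:
        return memory[cfa + saved[RA]]
    return ra_value


FUN = [("addi.d", SP, SP, -16), ("st.d", RA, SP, 8), ("bl", None, None, "fun1")]
FUN1 = [("or", 12, 0, 0), ("or", 4, 12, 0), ("jr", None, RA, 0)]

if __name__ == "__main__":
    memory = {0x1008: 0x778}  # ra saved by fun at sp + 8
    print(hex(caller_pc(0x1000, 0x758, memory, FUN1)))  # frame #0 -> fun
    print(hex(caller_pc(0x1000, 0x758, memory, FUN)))   # frame #1 -> main
```

Without the `saved` lookup, the second call would also return the live ra value, which is exactly the repeated "in fun ()" frame seen in the broken backtrace.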
2025-08-26  gdb: LoongArch: Improve loongarch_scan_prologue to record stack information  Hui Li  1  -0/+42
(1) Description of Problem:

When debugging the following code, the execution result of the nexti command is incorrect.

    $ cat test.S
            .text
            .globl  fun
            .type   fun, @function
    fun:
            or      $r12,$r0,$r0
            or      $r4,$r12,$r0
            jr      $r1
            .globl  main
            .type   main, @function
    main:
            addi.d  $r3,$r3,-16
            st.d    $r1,$r3,8
            bl      fun
            or      $r12,$r4,$r0
            or      $r4,$r12,$r0
            ld.d    $r1,$r3,8
            addi.d  $r3,$r3,16
            jr      $r1

    $ gcc test.S -o test
    $ gdb test
    ...
    (gdb) set disassemble-next-line on
    (gdb) start
    ...
    Temporary breakpoint 1, 0x0000555555554754 in main ()
    => 0x0000555555554754 <main+8>:   57ffefff   bl   -20   # 0x555555554740 <fun>
    (gdb) ni
    0x0000555555554740 in fun ()
    => 0x0000555555554740 <fun+0>:    0015000c   move   $t0, $zero

(2) Root Cause Analysis:

In the internal execution flow of the ni command, a single-step is executed first. After that, execution enters process_event_stop_test (), which checks some conditions:

    if ((get_stack_frame_id (frame)
         != ecs->event_thread->control.step_stack_frame_id)
        && get_frame_type (frame) != SIGTRAMP_FRAME
        && ((frame_unwind_caller_id (frame)
             == ecs->event_thread->control.step_stack_frame_id)
            && ((ecs->event_thread->control.step_stack_frame_id
                 != outer_frame_id)
                || (ecs->event_thread->control.step_start_function
                    != find_pc_function (ecs->event_thread->stop_pc ())))))
      {
        ...
        if (ecs->event_thread->control.step_over_calls == STEP_OVER_ALL)
          ...
        else
          insert_step_resume_breakpoint_at_caller (frame);
      }

Here, GDB decides whether a sub-function has been called, based on whether the frame id before the single step differs from the current frame id and whether there is a calling relationship between the two. If a sub-function has been called and the current operation is nexti, GDB does not stop immediately. Instead, insert_step_resume_breakpoint_at_caller() is called to complete the execution of the sub-function and then stop.

In the above debugging example, the executable program being debugged is compiled from an asm source file that does not contain dwarf information. Therefore, the frame id of the function is calculated by loongarch_frame_unwind rather than dwarf2_frame_unwind. However, loongarch_scan_prologue() has not yet recorded stack information in loongarch_frame_cache; this causes problems in some operations that depend on the frame id information.

(3) Solution:

Improve loongarch_scan_prologue() to record the stack information in loongarch_frame_cache, and improve loongarch_frame_unwind_stop_reason() using the information recorded in loongarch_frame_cache.

(4) Test:

After this patch:

    $ gdb test
    (gdb) set disassemble-next-line on
    (gdb) start
    Temporary breakpoint 1, 0x0000555555554754 in main ()
    => 0x0000555555554754 <main+8>:   57ffefff   bl   -20   # 0x555555554740 <fun>
    (gdb) ni
    0x0000555555554758 in main ()
    => 0x0000555555554758 <main+12>:  0015008c   move   $t0, $a0
    (gdb) ni
    0x000055555555475c in main ()
    => 0x000055555555475c <main+16>:  00150184   move   $a0, $t0

Signed-off-by: Hui Li <lihui@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
2025-08-26  gdb: LoongArch: Refactor member functions of loongarch_frame_unwind  Hui Li  1  -17/+156
In the current code, loongarch_frame_unwind is a LoongArch prologue unwinder, it contains the required member functions, but they do not calculate a valid frame id through prologue of a function frame. Refactor these functions and use loongarch_frame_cache to record the information of the function frame. No functional change intended. Signed-off-by: Hui Li <lihui@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
2025-08-26  gdb: LoongArch: Add the definition of loongarch_frame_cache  Hui Li  1  -1/+28
Add the definition of loongarch_frame_cache for loongarch_frame_unwind, in preparation for later LoongArch patches.

Signed-off-by: Hui Li <lihui@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
2025-08-26  amd64-tdep: need_modrm = 1 for VEX/EVEX instructions, except vzeroall/vzeroupper  Klaus Gerlicher  1  -6/+32
VEX and EVEX-encoded instructions generally require a ModR/M byte, with the notable exception of vzeroall and vzeroupper (opcode 0x77), which do not use ModR/M.

This change sets need_modrm = 1 for VEX instructions, and adds an exception for instructions where *insn == 0x77, following Intel’s SDM. EVEX has no exceptions and thus always sets need_modrm to 1. Additionally, the legacy twobyte_has_modrm table cannot be used for VEX and EVEX instructions, as these encodings have different requirements and exceptions. The logic is now explicit for VEX/EVEX handling.

Add vpblendw to selftest amd64_insn_decode.

The Intel SDM says the following:

1. Intel® 64 and IA-32 Architectures Software Developer’s Manual

   Section 2.2.1.2 - Instruction Prefixes
   "The VEX prefix is a multi-byte prefix that replaces several legacy prefixes and opcode bytes. The VEX prefix is not an opcode; it is a prefix that modifies the instruction that follows."

   Section 2.2.1.3 - Opcode Bytes
   "The opcode byte(s) follow any instruction prefixes (including VEX). The opcode specifies the operation to be performed."

   Section 2.2.2 - Instruction Format
   "If a VEX prefix is present, it is processed as a single prefix, and the opcode bytes follow immediately after the VEX prefix."

   Source: Intel® SDM Vol. 2A, Section 2.2.1.2 and 2.2.2 (See Vol. 2A, PDF pages 2-4, 2-5, and 2-7)

2. ModRM Byte Requirement

   Intel® SDM Vol. 2A, Table 2-2 - VEX Prefix Encoding
   "Most VEX-encoded instructions require a ModRM byte, except for a few instructions such as VZEROALL and VZEROUPPER."

   Source: Intel® SDM Vol. 2A, Table 2-2 (See Vol. 2A, PDF page 2-13)

Approved-By: Tom de Vries <tdevries@suse.de>
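The rule described in this commit message is small enough to state as code. The following is an illustrative Python sketch, not the C++ logic in amd64-tdep.c; the function name `vex_evex_need_modrm` and the string-valued `encoding` argument are hypothetical, and only the 0x77 exception and the "EVEX always needs ModR/M" statement come from the text above.

```python
VZERO_OPCODE = 0x77  # vzeroall / vzeroupper, per the commit message


def vex_evex_need_modrm(encoding: str, opcode: int) -> bool:
    """Decide whether a ModR/M byte follows the opcode.

    ENCODING is "vex" or "evex".  VEX instructions need ModR/M except
    for opcode 0x77; EVEX instructions always need it.  Legacy encodings
    are out of scope here (they use the twobyte_has_modrm table).
    """
    if encoding == "evex":
        return True
    if encoding == "vex":
        return opcode != VZERO_OPCODE
    raise ValueError(f"unsupported encoding: {encoding!r}")


if __name__ == "__main__":
    print(vex_evex_need_modrm("vex", 0x77))   # vzeroupper: no ModR/M
    print(vex_evex_need_modrm("vex", 0x0E))   # some other VEX opcode
    print(vex_evex_need_modrm("evex", 0x77))  # EVEX: always ModR/M
```

Keeping the VEX/EVEX decision separate from the legacy lookup table avoids applying legacy exceptions to encodings that do not share them.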
2025-08-26  [gdb/testsuite] Fix require dwarf2_support check in some test-cases  Tom de Vries  6  -2/+10
On x86_64-freebsd, I ran into trouble with test-case gdb.dwarf2/macro-source-path-clang14-dw4.exp (and similar), and I managed to reproduce the problem on x86_64-linux by making dwarf2_support return 0.

The failure looks like:
...
UNSUPPORTED: $exp: require failed: dwarf2_support
UNRESOLVED: $exp: testcase aborted due to invalid command name: do_test
ERROR: tcl error sourcing $exp.
...

I fixed a similar problem in commit 3e488d8ccd0 ("[gdb/testsuite] Fix gdb.dwarf2/dw-form-strx-out-of-bounds.exp with make-check-all.sh").

Fix this by moving "require dwarf2_support" from gdb.dwarf2/macro-source-path.exp.tcl to the files including it.

Tested on x86_64-linux.
2025-08-25  Fix tekhex format related gdb.base/dump.exp failures  Kevin Buettner  1  -1/+1
On s390x, a big-endian machine, I'm seeing these test failures:

    FAIL: gdb.base/dump.exp: array as memory, tekhex; file restored ok
    FAIL: gdb.base/dump.exp: array as memory, tekhex; value restored ok
    FAIL: gdb.base/dump.exp: array as value, tekhex; file restored ok
    FAIL: gdb.base/dump.exp: array as value, tekhex; value restored ok
    FAIL: gdb.base/dump.exp: array copy, tekhex; file restored ok
    FAIL: gdb.base/dump.exp: array copy, tekhex; value restored ok
    FAIL: gdb.base/dump.exp: array partial, tekhex; file restored ok
    FAIL: gdb.base/dump.exp: array partial, tekhex; value restored ok
    FAIL: gdb.base/dump.exp: dump array as memory, tekhex
    FAIL: gdb.base/dump.exp: dump array as value, tekhex
    FAIL: gdb.base/dump.exp: dump struct as memory, tekhex
    FAIL: gdb.base/dump.exp: dump struct as value, tekhex
    FAIL: gdb.base/dump.exp: reload array as memory, tekhex; value restored ok
    FAIL: gdb.base/dump.exp: reload array as value, tekhex; value restored ok
    FAIL: gdb.base/dump.exp: reload struct as memory, tekhex; value restored ok
    FAIL: gdb.base/dump.exp: reload struct as value, tekhex; value restored ok
    FAIL: gdb.base/dump.exp: struct as memory, tekhex; file restored ok
    FAIL: gdb.base/dump.exp: struct as memory, tekhex; value restored ok
    FAIL: gdb.base/dump.exp: struct as value, tekhex; file restored ok
    FAIL: gdb.base/dump.exp: struct as value, tekhex; value restored ok
    FAIL: gdb.base/dump.exp: struct copy, tekhex; file restored ok
    FAIL: gdb.base/dump.exp: struct copy, tekhex; value restored ok

It turns out that there's a subtle bug in move_section_contents in bfd/tekhex.c. The bug is that when attempting to write a buffer that starts with a zero byte, the function will return false, an error condition, without writing anything.
But it also doesn't set bfd_error, so GDB ends up displaying whatever the last unrelated error was, e.g.:

    warning: writing dump file '.../intstr1.tekhex' (No such file or directory)

When I investigated this, the bfd error was set during failure to open a separate debug file for the test case, which is totally unrelated to this problem.

The reason this fails on big endian machines is that the test case writes out structs and arrays of int initialized to small values. On little endian machines, the non-zero low-order byte of the small integer comes first, so the error doesn't occur. On big endian machines, a zero byte occurs first, triggering the error.

On the GDB side of things, I've made a one line change to the test case to cause the error to also happen on little endian machines. I simply shift the value of the first field in the struct left by 16 bits. That leaves at least one zero byte on both sides of the non-zero part of the int. I shifted it by 16 because, for a moment, there was a question in my mind about what would happen with a second zero byte, but it turns out that it's not a problem.

On the bfd side of things, take a look at move_section_contents() and find_chunk() in tekhex.c. The scenario is this: we enter move_section_contents with locationp pointing at a character buffer whose first byte is zero. The 'get' parameter is false, i.e. we're writing, not reading. The other critical fact is that abfd->tdata.tekhex_data->data is NULL (0).

I'm going to go through the execution path pretty much line by line with commentary below the line(s) just executed.

    char *location = (char *) locationp;
    bfd_vma prev_number = 1;	/* Nothing can have this as a high bit.  */

I can't say that the comment provides the best explanation about what's happening, but the gist is this: later on, chunk_number will have its low bits masked away, therefore no matter what it is, it can't possibly be equal to prev_number when it's set to 1.
    struct data_struct *d = NULL;

    BFD_ASSERT (offset == 0);
    for (addr = section->vma; count != 0; count--, addr++)
      {

Set d to NULL and enter the loop.

        /* Get high bits of address.  */
        bfd_vma chunk_number = addr & ~(bfd_vma) CHUNK_MASK;
        bfd_vma low_bits = addr & CHUNK_MASK;

Use CHUNK_MASK, which is 0x1fff, to obtain the chunk number, i.e. whatever's left after masking off the low 13 bits of addr, and low_bits, which are the low 13 bits of addr. chunk_number matters for understanding this bug, low_bits does not. Remember that no matter what addr is, once you mask off the low 13 bits, it can't be equal to 1.

        bool must_write = !get && *location != 0;

!get is true, *location != 0 is false, therefore the conjunction is false, and furthermore must_write is false. I.e. even though we are writing, we don't transfer zero bytes to the chunk - this is why must_write is false. (The reason this works is that a chunk, once allocated, is zero'd as part of the allocation using bfd_zalloc. Therefore we can skip transferring zero bytes and, if enough of them are skipped one after another, chunk allocation simply doesn't happen. That's a good thing.)

        if (chunk_number != prev_number || (!d && must_write))

For the reason provided above, chunk_number != prev_number is true. The other part of the disjunction doesn't matter since the first part is true. This means that the if-block is entered.

            /* Different chunk, so move pointer. */
            d = find_chunk (abfd, chunk_number, must_write);

find_chunk is entered with must_write set to false. Now, remember where we left off here, because we're going to switch to find_chunk.

        static struct data_struct *
        find_chunk (bfd *abfd, bfd_vma vma, bool create)
        {

(Above 3 lines indented to distinguish code from commentary.)

When we enter find_chunk, create is false because must_write was false.

    struct data_struct *d = abfd->tdata.tekhex_data->data;

d is set to NULL since abfd->tdata.tekhex_data->data is NULL (one of the conditions for the scenario).
    vma &= ~CHUNK_MASK;
    while (d && (d->vma) != vma)
      d = d->next;

d is NULL, so the while loop doesn't execute.

    if (!d && create)
      ...

d is NULL so !d is true, but create is false, so the condition evaluates to false, meaning that the if-block is skipped.

    return d;

find_chunk returns NULL, since d is NULL.

Back in move_section_contents:

    if (!d)
      return false;

d is NULL (because that's what find_chunk returned), so move_section_contents returns false at this point.

Note that move_section_contents has allocated no memory, nor even tried to transfer any bytes beyond the first (zero) byte. This is a bug.

The key to understanding this bug is to observe that find_chunk can return NULL to indicate that no chunk was found. This is especially important for the read (get=true) case. But it can also be NULL to indicate a memory allocation error. I toyed around with the idea of using a different value to distinguish these cases, i.e. something like (struct data_struct *) -1, but although bfd contains plenty of code where -1 is used to indicate various interesting conditions for scalars, there's no prior art where this is done for a pointer. Therefore the idea was discarded in favor of modifying this statement:

    if (!d)
      return false;

to:

    if (!d && must_write)
      return false;

This works because, in find_chunk, the only way to return a NULL memory allocation error is for must_write / create to be true. When it is true, if bfd_zalloc successfully allocates a chunk, then that (non-NULL) chunk will be returned at the end of the function. When it fails, it'll return NULL early. The point is that when bfd_zalloc() fails and returns NULL, must_write (in move_section_contents) / create (in find_chunk) HAD to be true. That provides us with an easy test back in move_section_contents to distinguish a memory-allocation-NULL from a block-not-found-NULL.
The other NULL return case happens when the end of the function is reached when either searching for a chunk to read or attempting to find a chunk to write when abfd->tdata.tekhex_data->data is NULL. But for the latter case, must_write was false, which does not (now, with the above fix) trigger the early return of false.

(Alan Modra approved the bfd/tekhex.c change.)

Approved-By: Simon Marchi <simon.marchi@efficios.com> (GDB)
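The walkthrough above can be replayed with a small Python model of the write path. This is a simplified sketch under stated assumptions, not the bfd code: chunks live in a dictionary instead of a linked list, allocation never fails, and the class name `TekhexModel` and the `fixed` flag (which selects between `if (!d)` and `if (!d && must_write)`) are invented for the illustration.

```python
import struct

CHUNK_MASK = 0x1FFF


class TekhexModel:
    """Toy model of tekhex chunk storage (write path only)."""

    def __init__(self):
        self.chunks = {}  # chunk base address -> zero-filled bytearray

    def find_chunk(self, vma, create):
        vma &= ~CHUNK_MASK
        chunk = self.chunks.get(vma)
        if chunk is None and create:
            chunk = bytearray(CHUNK_MASK + 1)  # zeroed, like bfd_zalloc
            self.chunks[vma] = chunk
        return chunk

    def write(self, vma, data, fixed):
        prev_number = 1  # can never equal a masked chunk number
        d = None
        for addr, byte in enumerate(data, start=vma):
            chunk_number = addr & ~CHUNK_MASK
            low_bits = addr & CHUNK_MASK
            must_write = byte != 0  # zero bytes are never transferred
            if chunk_number != prev_number or (d is None and must_write):
                d = self.find_chunk(chunk_number, must_write)
                # Original check: `if (!d)`.  Fixed: `if (!d && must_write)`.
                if d is None and (must_write or not fixed):
                    return False
                prev_number = chunk_number
            if must_write:
                d[low_bits] = byte
        return True


if __name__ == "__main__":
    big_endian_one = struct.pack(">i", 1)     # starts with a zero byte
    little_endian_one = struct.pack("<i", 1)  # starts with a non-zero byte
    print(TekhexModel().write(0x2000, big_endian_one, fixed=False))
    print(TekhexModel().write(0x2000, little_endian_one, fixed=False))
    print(TekhexModel().write(0x2000, big_endian_one, fixed=True))
```

In this model the unfixed check rejects only the buffer that begins with a zero byte, which matches the big-endian-only failures, while the fixed check lets the leading zeros be skipped until the first non-zero byte forces a chunk to be created.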
2025-08-25  gdb: fix indentation in objfiles.c  Simon Marchi  1  -1/+1
Change-Id: I3d39ee767a3b2b743b3a90386fb30a6703e9733e
2025-08-25  gdb: LoongArch: Handle newly added llsc instructions  Xi Ruoyao  1  -2/+7
We can't put a breakpoint in the middle of a ll/sc atomic sequence. Handle the instructions sc.q, llacq.{w/d}, screl.{w/d} newly added in the LoongArch Reference Manual v1.10, so that a ll/sc atomic sequence using them won't loop forever when being debugged.

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
2025-08-24  gdb: allow gdb.Color to work correctly with pagination  Andrew Burgess  4  -12/+195
This commit allows gdb.Color objects to be used to style output from GDB commands written in Python, and the styled output should work correctly with pagination. There are two parts to fixing this: First, GDB needs to be able to track the currently applied style within the page_file class. This means that style changes need to be achieved with calls to pager_file::emit_style_escape. Now usually, GDB does this by calling something like fprintf_styled, which takes care to apply the style for us. However, that's not really an option here as a gdb.Color isn't a full style, and as the gdb.Color object is designed to be converted directly into escape sequences that can then be printed, we really need a solution that works with this approach. However pager_file::puts already has code in place to handle escape sequences. Right now all this code does is spot the escape sequence and append it to the m_wrap_buffer. But in this commit I propose that we go one step further, parse the escape sequence back into a ui_file_style object in pager_file::puts, and then we can call pager_file::emit_style_escape. If the parsing doesn't work then we can just add the escape sequence to m_wrap_buffer as we did before. But wait, how can this work if a gdb.Color isn't a full style? Turns out that's not a problem. We only ever emit the escape sequence for those parts of a style that need changing, so a full style that sets the foreground color will emit the same escape sequence as a gdb.Color for the foreground. When we convert the escape sequence back into a ui_file_style, then we get a style with everything set to default, except the foreground color. I had hoped that this would be all that was needed. But unfortunately this doesn't work because of the second problem... ... 
the implementation of the Python function gdb.write() calls gdb_printf(), which calls gdb_vprintf(), which calls ui_file::vprintf, which calls ui_out::vmessage, which calls ui_out::call_do_message, and finally we reach cli_ui_out::do_message. This final do_message function does this: ui_file *stream = m_streams.back (); stream->emit_style_escape (style); stream->puts (str.c_str ()); stream->emit_style_escape (ui_file_style ()); If we imagine the case where we are emitting a style, triggered from Python like this: gdb.write(gdb.Color('red').escape_sequence(True)) the STYLE in this case will be the default ui_file_style(), and STR will hold the escape sequence we are writing. After the first change, where pager_file::puts now calls pager_file::emit_style_escape, the current style of STREAM will have been updated. But this means that the final emit_style_escape will now restore the default style. The fix for this is to avoid using the high level gdb_printf from gdb.write(), and use gdb_puts instead. The gdb_puts function doesn't restore the default style, which means our style modification survives. There's a new test included. This test includes what appears to be a pointless extra loop (looping over a single value), but this makes sense given the origin of this patch. I've pulled this commit from a longer series: https://inbox.sourceware.org/gdb-patches/cover.1755080429.git.aburgess@redhat.com I want to get this bug fix merged before GDB 17 branches, but the longer series is not getting reviews, so for now I'm just merging this one fix. Once the rest of the series gets merged, I'll be extending the test, and the loop (mentioned above) will now loop over more values.
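The escape-sequence round trip described above can be modelled outside GDB. The following is only a sketch in plain Python, not GDB's C++ code: the names `parse_sgr` and `PagerModel` are hypothetical, the style is reduced to a foreground/background pair, and it assumes that a foreground red escape sequence looks like "\x1b[31m". It shows the idea of parsing a sequence back into a style and tracking it, with unparseable sequences falling back to the wrap buffer as before.

```python
import re

# Matches a complete SGR escape sequence such as "\x1b[31m".
_SGR = re.compile(r"\x1b\[([0-9;]*)m")


def parse_sgr(seq):
    """Parse an SGR sequence into a tiny style dict, or return None.

    Only the eight basic colours and the reset codes are understood;
    anything else is reported as unparseable (hypothetical model).
    """
    match = _SGR.fullmatch(seq)
    if match is None:
        return None
    style = {"foreground": None, "background": None}
    codes = [int(c) for c in match.group(1).split(";") if c] or [0]
    for code in codes:
        if 30 <= code <= 37:
            style["foreground"] = code - 30
        elif 40 <= code <= 47:
            style["background"] = code - 40
        elif code == 0:
            style = {"foreground": None, "background": None}
        elif code == 39:
            style["foreground"] = None
        elif code == 49:
            style["background"] = None
        else:
            return None
    return style


class PagerModel:
    """Stand-in for the pager: tracks the applied style and buffered text."""

    def __init__(self):
        self.current_style = {"foreground": None, "background": None}
        self.wrap_buffer = []

    def emit_style_escape(self, style):
        # The pager now knows which style is in effect.
        self.current_style = dict(style)

    def puts(self, text):
        style = parse_sgr(text) if text.startswith("\x1b") else None
        if style is not None:
            self.emit_style_escape(style)
        else:
            # Plain text, or a sequence we could not parse: buffer it.
            self.wrap_buffer.append(text)
```

With this model, `PagerModel().puts("\x1b[31m")` updates the tracked style instead of buffering raw bytes, which is the property that lets a pager re-apply the style after a pagination prompt.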
2025-08-23Update comment in rust-parse.cTom Tromey1-1/+1
I noticed an out-of-date comment in rust-parse.c.
2025-08-23[gdb/testsuite] Require cooked index in two test-casesTom de Vries2-0/+12
After running the testsuite with target board cc-with-gdb-index I found failures in test-cases: - gdb.dwarf2/backward-spec-inter-cu.exp - gdb.dwarf2/forward-spec-inter-cu.exp Fix this by requiring a cooked index. Tested on x86_64-linux.
2025-08-23[gdb/symtab] Turn complaints in create_addrmap_from_gdb_index into warningsTom de Vries1-6/+8
Rather than issuing a complaint, which is off by default, warn when returning false in create_addrmap_from_gdb_index, informing the user that the .gdb_index was ignored, and why. Tested on aarch64-linux.
2025-08-23[gdb/symtab] Detect overlapping ranges in create_addrmap_from_gdb_indexTom de Vries1-1/+9
In create_addrmap_from_gdb_index, use the return value of addrmap_mutable::set_empty to detect overlapping ranges. Tested on x86_64-linux. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-23[gdb] Make addrmap_mutable::set_empty return boolTom de Vries2-6/+18
Function addrmap_mutable::set_empty has the following behavior (shortened comment): ... /* In the mutable address map MAP, associate the addresses from START to END_INCLUSIVE that are currently associated with NULL with OBJ instead. Addresses mapped to an object other than NULL are left unchanged. */ void set_empty (CORE_ADDR start, CORE_ADDR end_inclusive, void *obj); ... Change the return type to bool, and return true if the full range [START, END_INCLUSIVE] is mapped to OBJ. Tested on x86_64-linux. Approved-By: Simon Marchi <simon.marchi@efficios.com>
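The new return value can be illustrated with a toy model. This is a sketch only: the `AddrMapModel` class is hypothetical, it stores one dictionary entry per address (GDB's addrmap uses a far more compact structure), and it is only practical for small ranges.

```python
class AddrMapModel:
    """Toy address map: one dict entry per address (illustration only)."""

    def __init__(self):
        self._map = {}

    def set_empty(self, start, end_inclusive, obj):
        """Map currently unmapped addresses in the range to OBJ.

        Addresses already mapped to another object are left unchanged.
        Return True if the full range ends up mapped to OBJ.
        """
        fully_mapped = True
        for addr in range(start, end_inclusive + 1):
            current = self._map.get(addr)
            if current is None:
                self._map[addr] = obj
            elif current != obj:
                fully_mapped = False
        return fully_mapped

    def find(self, addr):
        return self._map.get(addr)
```

A caller can treat a False result as "this range overlaps a range owned by something else", which is how the overlap detection in create_addrmap_from_gdb_index is described.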
2025-08-23[gdb/symtab] Bail out of create_addrmap_from_gdb_index on errorTom de Vries1-5/+9
Currently, in create_addrmap_from_gdb_index, when finding an incorrect entry in the address table of a .gdb_index section: - a (by default silent) complaint is made, - the entry is skipped, and - the rest of the entries are processed. This is the use-what-you-can approach, which makes sense in general. But in the case that the .gdb_index section is incorrect while the other debug info is correct, this approach prevents gdb from building a correct cooked index (assuming there's no bug in gdb that would cause an incorrect index to be generated). Instead, bail out of create_addrmap_from_gdb_index on finding errors in the address table. I wonder about the following potential drawback of this approach: in the case that the .gdb_index section is incorrect because the debug info is incorrect, this approach rejects the .gdb_index section and spends time rebuilding a likewise incorrect index. But I'm not sure if this is a real problem. Perhaps gdb will refuse to generate such an index, in which case this is a non-issue. Tested on aarch64-linux. Approved-By: Simon Marchi <simon.marchi@efficios.com>
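The difference between skipping a bad entry and bailing out can be sketched as follows. This is a hypothetical Python model, not the GDB function: the entry layout `(lo, hi, cu_index)` and the two validity checks (inverted range, out-of-range CU index) are assumptions made for illustration.

```python
def create_addrmap_model(entries, num_cus):
    """Build a list of (lo, hi_inclusive, cu_index) from address-table entries.

    Return None as soon as one entry is invalid, so that the caller can
    ignore the whole index and fall back to scanning the debug info.
    The checks below are illustrative assumptions.
    """
    addrmap = []
    for lo, hi, cu_index in entries:
        if lo > hi:
            return None  # invalid range: reject the index
        if cu_index >= num_cus:
            return None  # invalid CU number: reject the index
        addrmap.append((lo, hi - 1, cu_index))
    return addrmap
```

Returning None for the whole table, rather than a partial map, is what lets the caller rebuild an index from the (presumably correct) debug info instead of working with incomplete data.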
2025-08-22gdb/doc: qSearch:memory packets use escaped binary patternsAaron Griffith1-1/+2
The `qSearch:memory` packet uses hex encoding for the address and length arguments, but the search-pattern argument uses escaped binary. Approved-By: Eli Zaretskii <eliz@gnu.org>
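To make the two encodings concrete, here is a small sketch of how such a packet payload could be assembled. The helper names `escape_binary` and `qsearch_memory_payload` are hypothetical. The escaping rule used is the remote protocol's binary escaping for requests: the bytes 0x23 ('#'), 0x24 ('$') and 0x7d ('}') are sent as 0x7d followed by the original byte XORed with 0x20. Packet framing (the leading '$' and the trailing checksum) is deliberately left out.

```python
def escape_binary(data: bytes) -> bytes:
    """Apply remote-protocol binary escaping to DATA."""
    out = bytearray()
    for byte in data:
        if byte in (0x23, 0x24, 0x7D):  # '#', '$', '}'
            out.append(0x7D)
            out.append(byte ^ 0x20)
        else:
            out.append(byte)
    return bytes(out)


def qsearch_memory_payload(address: int, length: int, pattern: bytes) -> bytes:
    """Build the payload: hex address and length, escaped binary pattern."""
    return b"qSearch:memory:%x;%x;" % (address, length) + escape_binary(pattern)
```

For example, searching 0x100 bytes from 0x1000 for the two bytes 0x00 0x23 gives a payload whose pattern part is 0x00 0x7d 0x03, not the hex string "0023".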
2025-08-22gdb/testsuite: fix pattern in gdb.base/dlmopen-ns-ids.expSimon Marchi1-10/+10
I forgot one spot when updating the "info shared" header from NS to "Linker NS", fix that. This fixes the following failure: FAIL: gdb.base/dlmopen-ns-ids.exp: check no duplicates: info sharedlibrary At the same time, fix a couple of things I found when looking at this code again. One is bad indentation, the other is an unnecessary parameter. Change-Id: Ibbc2062699264dde08fd3ff7c503524265c73b0c
2025-08-22MSYS2+MinGW testing: Unix <-> Windows path conversionPedro Alves8-49/+263
On an MSYS2 system, I have: # which tclsh /mingw64/bin/tclsh # which tclsh86 /mingw64/bin/tclsh86 # which tclsh8.6 /usr/bin/tclsh8.6 # which expect /usr/bin/expect The ones under /usr/bin are MSYS2 programs (linked with msys-2.0.dll). I.e., they are really Cygwin (unix) ports of the programs. The ones under /mingw64 are native Windows programs (NOT linked with msys-2.0.dll). You can check that with CYGWIN/MSYS2 ldd. The MSYS2/Cygwin port of TCL (and thus expect) does not treat a file name that starts with a drive letter as an absolute file name, while the native/MinGW port does. Vis: # cat file-join.exp puts [file join c:/ d:/] # /mingw64/bin/tclsh.exe file-join.exp d:/ # /mingw64/bin/tclsh86.exe file-join.exp d:/ # /usr/bin/expect.exe file-join.exp c:/d: # /usr/bin/tclsh8.6.exe file-join.exp c:/d: When running the testsuite under MSYS2 to test mingw32 (Windows native) GDB, we use MSYS2 expect (there is no MinGW port of expect AFAIK). Any TCL file manipulation routine will thus not consider drive letters special, and just treats them as relative file names. This results in several cases of the testsuite passing to GDB broken file names, like: "C:/foo/C:/foo/bar" or: "/c/foo/C:/foo/bar" E.g., there is a "file join" in standard_output_file that results in this: (gdb) file C:/gdb/build/outputs/gdb.base/info_sources_2/C:/gdb/build/outputs/gdb.base/info_sources_2/info_sources_2 C:/gdb/build/outputs/gdb.base/info_sources_2/C:/gdb/build/outputs/gdb.base/info_sources_2/info_sources_2: No such file or directory. 
(gdb) ERROR: (info_sources_2) No such file or directory delete breakpoints The bad "file join" comes from clean_restart $binfile, where $binfile is an absolute host file name (thus has a drive letter), clean_restart doing: set binfile [standard_output_file ${executable}] return [gdb_load ${binfile}] and standard_output_file doing: # If running on MinGW, replace /c/foo with c:/foo if { [ishost *-*-mingw*] } { set dir [exec sh -c "cd ${dir} && pwd -W"] } return [file join $dir $basename] Here, BASENAME was already an absolute file name that starts with a drive letter, but "file join" treated it as a relative file name. Another spot where we mishandle Unix vs drive letter file names, is in the "dir" command that we issue when starting every testcase under GDB. We currently always pass the file name as seen from the build machine (i.e., from MSYS2), which is a Unix file name that native Windows GDB does not understand, resulting in: (gdb) dir /c/gdb/src/gdb/testsuite/gdb.rocm warning: /c/gdb/src/gdb/testsuite/gdb.rocm: No such file or directory Source directories searched: /c/gdb/src/gdb/testsuite/gdb.rocm;$cdir;$cwd This patch introduces a systematic approach to handle all this, by introducing the concepts of build file names (what DejaGnu sees) vs host file names (what GDB sees). This patch implements that in the following way: 1) - Keep standard_output_file's host-side semantics standard_output_file currently converts the file name to a Windows file name, using the "cd $dir; pwd -W" trick. standard_output_file is used pervasively, so I think it should keep the semantics that it returns a host file name. Note there is already a preexisting host_standard_output_file procedure. The difference to standard_output_file is that host_standard_output_file handles remote hosts, while standard_output_file assumes the build and host machines share a filesystem. The MSYS2 Unix path vs MinGW GDB drive letter case falls in the "shared filesystem" bucket. 
An NFS mount on the host at the same mount point as on the build machine falls in that bucket too. 2) - Introduce build_standard_output_file In some places, we are calling standard_output_file to find the build-side file name, most often just to find the standard output directory file name, and then immediately use that file name with TCL file manipulation procedures, to do some file manipulation on the build machine. clean_standard_output_dir is an example of such a case. That code path is responsible for this bogus 'rm -rf' in current MSYS2 testing: Running /c/gdb/src/gdb/testsuite/gdb.base/break.exp ... Executing on build: rm -rf /c/msys2/home/alves/gdb/build-testsuite/C:/msys2/home/alves/gdb/build-tests... For these cases, add a variant of standard_output_file called build_standard_output_file. The main difference to standard_output_file is that it doesn't do the "cd $dir; pwd -W" trick. I.e., it returns a path on the build machine. 3) Introduce host_file_sanitize In some cases, we read an absolute file name out of GDB's output, and then want to compare it against some other file name. The file name may originally come from the DWARF, and sometimes may have forward slashes, and other times, it may have backward slashes. Or the drive letter may be uppercase, or it may be lowercase. To make comparisons easier, add a new host_file_sanitize procedure, that normalizes slashes, and uppercases the drive letter. It does no other normalization. Particularly, it does not turn a relative file name into an absolute file name. It's arguable whether GDB itself should do this sanitization. I suspect it should. I personally dislike seeing backward slashes in e.g., "info shared" output, or worse, mixed backward and forward slashes. Still, I propose starting with a testsuite adjustment that moves us forward, and handle that separately. I won't be surprised if we need the new routine for some cases even if we adjust GDB. 
4) build_file_normalize / host_file_normalize In several places in the testsuite, we call "file normalize" on some file name. If we pass it a drive-letter file name, that TCL procedure treats the passed in file name as a relative file name, so produces something like /c/foo/C:/foo/bar.txt. If the context calls for a build file name, then the "file normalize" call should produce /c/foo/bar.txt. If OTOH we need a host file name, then it should produce "C:/foo/bar.txt". Handle this by adding two procedures that wrap "file normalize": - build_file_normalize - host_file_normalize Initially I implemented them in a very simple way, calling into cygpath: proc build_file_normalize {filename} { if { [ishost *-*-mingw*] } { return [exec cygpath -ua $filename] } else { return [file normalize $filename] } } proc host_file_normalize {filename} { if { [ishost *-*-mingw*] } { return [exec cygpath -ma $filename] } else { return [file normalize $filename] } } "cygpath" is a utility that comes OOTB with both Cygwin and MSYS2, that does Windows <-> Cygwin file name conversion. This works well, but because running the testsuite on Windows is so slow, I thought of trying to avoid or minimize the cost of calling an external utility ("cygpath"). On my system, calling into cygpath takes between 200ms and 350ms, and these smallish costs (OK, not so small!) can creep up and compound an already bad situation. Note that the current call to "cd $dir; pwd -W" has about the same cost as a "cygpath" call (though a little bit cheaper). So with this patch, we actually don't call cygpath at all, and no longer use the "cd $dir; pwd -W" trick. Instead we run the "mount" command once, and cache the mapping (via gdb_caching_proc) between Windows file names and Unix mount points, and then use that mapping in host_file_normalize and build_file_normalize, to do the Windows <=> Unix file name conversions ourselves. 
One other small advantage here is that this approach works the same for 'cygwin x mingw' testing [1], and 'msys x mingw' testing, while "pwd -W" only works on MSYS2. So I think the end result is that we should end up faster (or less slow) than the current state. (No, I don't have actual timings for the effect over a whole testsuite run.) 5) Introduce host_file_join For the "file join" call done from within standard_output_file (and probably in future other places), since that procedure works with host file names, add a new host_file_join procedure that is a wrapper around "file join" that is aware of Windows drive letters. ====== With the infrastructure described above in place, the "dir" case is fixed by simply calling host_file_normalize on the directory name, before passing it to GDB. That turns: (gdb) dir /c/gdb/src/gdb/testsuite/gdb.base warning: /c/gdb/src/gdb/testsuite/gdb.base: No such file or directory Source directories searched: /c/gdb/src/gdb/testsuite/gdb.base;$cdir;$cwd Into: (gdb) dir C:/gdb/src/gdb/testsuite/gdb.base Source directories searched: C:/gdb/src/gdb/testsuite/gdb.base;$cdir;$cwd Running the testsuite on GNU/Linux reveals that that change requires tweaks to gdb.guile/scm-parameter.exp and gdb.python/py-parameter.exp, to run the expected directory by host_file_normalize too, so that it matches the directory we initially pass GDB at startup time. Without that fix, there could be a mismatch if the GDB sources path has a symlink component, which now gets resolved by the host_file_normalize call. The theory is that most standard_output_file uses will not need to be adjusted. I grepped for "file normalize" and "file join", to find cases that might need adjustment, and fixed those that required fixing. The fixes are included in this patch, to make it easier to reason about the overall change. E.g., in gdb.base/fullname.exp, without the fix, we get: Running /c/gdb/src/gdb/testsuite/gdb.base/fullname.exp ... 
ERROR: tcl error sourcing /c/gdb/src/gdb/testsuite/gdb.base/fullname.exp. ERROR: tcl error code NONE ERROR: C:/msys2/home/alves/gdb/build-testsuite/outputs/gdb.base/fullname/tmp-fullname.c not a subdir of /c/msys2/home/alves/gdb/build-testsuite In gdb.base/source-dir.exp, we have several issues. E.g., we see the "/c/foo/c:/foo" problem there too: dir /c/msys2/home/alves/gdb/build-testsuite/C:/msys2/home/alves/gdb/build-testsuite/outputs/gdb.base/source-dir/C:/msys2/home/alves/gdb/build-testsuite/outputs warning: /c/msys2/home/alves/gdb/build-testsuite/C:/msys2/home/alves/gdb/build-testsuite/outputs/gdb.base/source-dir/C:/msys2/home/alves/gdb/build-testsuite/outputs: No such file or directory Source directories searched: /c/msys2/home/alves/gdb/build-testsuite/C:/msys2/home/alves/gdb/build-testsuite/outputs/gdb.base/source-dir/C:/msys2/home/alves/gdb/build-testsuite/outputs;$cdir;$cwd (gdb) PASS: gdb.base/source-dir.exp: setup source path search directory ... Executing on host: x86_64-w64-mingw32-gcc \ -fno-stack-protector \ /c/msys2/home/alves/gdb/build-testsuite/C:/msys2/home/alves/gdb/build-testsuite/outputs/gdb.base/macro-source-path/cwd/macro-source-path.c ... ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ... and we need to handle Unix file names that we pass to the compiler (on the build side), vs file names that GDB prints out (the host side). Similarly in the other testcases. I haven't yet tried to do a full testsuite run on MSYS2, and I'm quite confident there will be more places that will need similar adjustment, but I'd like to land the infrastructure early, so that the rest of the testsuite can be adjusted incrementally, and others can help. Change-Id: I664dbb86d0efa4fa8db405577bea2b4b4a96a613
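The helpers described in this commit are Tcl procedures in the testsuite. Their intended behavior can be sketched in Python as below. This is an illustration under assumptions, not a port: the function bodies are guesses at the described semantics, `build_to_host` is a hypothetical name for the mount-table lookup, and the mount table passed to it is made-up sample data rather than real "mount" output.

```python
import posixpath
import re

# A file name that starts with a drive letter, e.g. "C:/" or "c:\".
_DRIVE = re.compile(r"^[A-Za-z]:[/\\]")


def host_file_sanitize(path):
    """Normalize slashes and uppercase the drive letter; nothing else."""
    path = path.replace("\\", "/")
    if len(path) >= 2 and path[0].isalpha() and path[1] == ":":
        path = path[0].upper() + path[1:]
    return path


def host_file_join(*parts):
    """Join components, treating drive-letter names as absolute."""
    result = ""
    for part in parts:
        if _DRIVE.match(part) or part.startswith("/"):
            result = part  # an absolute component restarts the path
        elif result:
            result = posixpath.join(result, part)
        else:
            result = part
    return result


def build_to_host(path, mounts):
    """Convert a Unix-style build path to a Windows-style host path.

    MOUNTS is a list of (windows_dir, unix_mount_point) pairs, standing
    in for a cached mount table. The longest matching mount point wins.
    """
    best = None
    for win_dir, mount_point in mounts:
        prefix = mount_point.rstrip("/")
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best[1]):
                best = (win_dir, prefix)
    if best is None:
        return path
    win_dir, prefix = best
    return win_dir.rstrip("/") + path[len(prefix):]
```

With a sample table such as `[("C:/msys64", "/"), ("C:", "/c")]`, `build_to_host("/c/gdb/src", ...)` yields "C:/gdb/src", and `host_file_join("C:/out", "C:/out/prog")` yields "C:/out/prog" rather than the doubled name shown in the failures above.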
2025-08-22gdb/copyright.py: print notice about files that print copyright at runtimeSimon Marchi1-1/+10
During the last new year process, it seems that we forgot to update the copyright notices printed by the various programs (see 713b99a9398 "gdb, gdbserver: update copyright years in copyright notices"). Change gdb/copyright.py to print a message about this. For a procedure that happens once a year, this seems sufficient to me, but if someone wants to automate it I won't object. While at it, change the formatting of the previous message, to match the formatting of the first message (making it end with a colon). Change-Id: I330f566221d102bab0a953bc324127f2466dd5cf Approved-By: Tom Tromey <tom@tromey.com>
2025-08-22testsuite: Introduce gdb_watchdog (avoid unistd.h/alarm)Pedro Alves4-5/+93
There are a good number of testcases in the testsuite that use alarm() as a watchdog that aborts the test if something goes wrong. alarm()/SIGALRM do not exist on (native) Windows, so those tests fail to compile there. For example, testing with x86_64-w64-mingw32-gcc, we see: Running /c/rocgdb/src/gdb/testsuite/gdb.base/attach.exp ... gdb compile failed, C:/rocgdb/src/gdb/testsuite/gdb.base/attach.c: In function 'main': C:/rocgdb/src/gdb/testsuite/gdb.base/attach.c:17:3: error: implicit declaration of function 'alarm' [-Wimplicit-function-declaration] 17 | alarm (60); | ^~~~~ While testing with a clang configured to default to x86_64-pc-windows-msvc, which uses the C/C++ runtime headers from Visual Studio and has no unistd.h, we get: Running /c/rocgdb/src/gdb/testsuite/gdb.base/attach.exp ... gdb compile failed, C:/rocgdb/src/gdb/testsuite/gdb.base/attach.c:8:10: fatal error: 'unistd.h' file not found 8 | #include <unistd.h> | ^~~~~~~~~~ Handle this by adding a new testsuite/lib/gdb_watchdog.h header that defines a new gdb_watchdog function, which wraps alarm on Unix-like systems, and uses a timer on Windows. This patch adjusts gdb.base/attach.c as an example of usage. Testing gdb.base/attach.exp with clang/x86_64-pc-windows-msvc required a related portability tweak to can_spawn_for_attach, to not rely on unistd.h on Windows. gdb.rocm/mi-attach.cpp is another example adjusted, one which always runs with clang configured as x86_64-pc-windows-msvc on Windows (via hipcc). Approved-by: Kevin Buettner <kevinb@redhat.com> Change-Id: I3b07bcb60de039d34888ef3494a5000de4471951
2025-08-22Automatically handle includes in testsuite/lib/Pedro Alves2-5/+64
Instead of manually calling lappend_include_file in every testcase that needs to include a file in testsuite/lib/, handle testsuite/lib/ includes automatically in gdb_compile. As an example, gdb.base/backtrace.exp is adjusted to no longer explicitly call lappend_include_file for testsuite/lib/attributes.h. Tested on x86-64 GNU/Linux with both: $ make check RUNTESTFLAGS=" \ --host_board=local-remote-host-native \ --target_board=local-remote-host-native \ HOST_DIR=/tmp/foo/" \ TESTS="gdb.base/backtrace.exp" and: $ make check TESTS="gdb.base/backtrace.exp" and confirming that the testcase still compiles and passes cleanly. Also ran full testsuite on x86-64 GNU/Linux in normal mode. Approved-by: Kevin Buettner <kevinb@redhat.com> Change-Id: I5ca77426ea4a753a995c3ad125618c02cd952576
2025-08-22gdb/solib-svr4: fix wrong namespace id for dynamic linkerSimon Marchi4-34/+143
When running a program that uses multiple linker namespaces, I get something like: $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.base/dlmopen-ns-ids/dlmopen-ns-ids -ex "tb 50" -ex r -ex "info shared" -batch ... From To NS Syms Read Shared Object Library 0x00007ffff7fc6000 0x00007ffff7fff000 0 Yes /lib64/ld-linux-x86-64.so.2 0x00007ffff7e93000 0x00007ffff7f8b000 0 Yes /usr/lib/libm.so.6 0x00007ffff7ca3000 0x00007ffff7e93000 0 Yes /usr/lib/libc.so.6 0x00007ffff7fb7000 0x00007ffff7fbc000 1 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/dlmopen-ns-ids/dlmopen-lib.so 0x00007ffff7b77000 0x00007ffff7c6f000 1 Yes /usr/lib/libm.so.6 0x00007ffff7987000 0x00007ffff7b77000 1 Yes /usr/lib/libc.so.6 0x00007ffff7fc6000 0x00007ffff7fff000 1 Yes /usr/lib/ld-linux-x86-64.so.2 0x00007ffff7fb2000 0x00007ffff7fb7000 2 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/dlmopen-ns-ids/dlmopen-lib.so 0x00007ffff788f000 0x00007ffff7987000 2 Yes /usr/lib/libm.so.6 0x00007ffff769f000 0x00007ffff788f000 2 Yes /usr/lib/libc.so.6 0x00007ffff7fc6000 0x00007ffff7fff000 1! Yes /usr/lib/ld-linux-x86-64.so.2 0x00007ffff7fad000 0x00007ffff7fb2000 3 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/dlmopen-ns-ids/dlmopen-lib.so 0x00007ffff75a7000 0x00007ffff769f000 3 Yes /usr/lib/libm.so.6 0x00007ffff73b7000 0x00007ffff75a7000 3 Yes /usr/lib/libc.so.6 0x00007ffff7fc6000 0x00007ffff7fff000 1! Yes /usr/lib/ld-linux-x86-64.so.2 Some namespace IDs for the dynamic linker entries (ld-linux) are wrong (I placed a ! next to those that are wrong). The dynamic linker is special: it is loaded only once (notice how all ld-linux entries have the same addresses), but it is visible in all namespaces. It is therefore listed separately in all namespaces. 
The problem happens like this: - for each solib, print_solib_list_table calls solib_ops::find_solib_ns to get the namespace ID to print - svr4_solib_ops::find_solib_ns calls find_debug_base_for_solib - find_debug_base_for_solib iterates on the list of solibs in all namespaces, looking for a match for the given solib. For this, it uses svr4_same, which compares two SOs by name and low address. Because there are entries for the dynamic linker in all namespaces, with the same low address, find_debug_base_for_solib is unable to distinguish them, and sometimes returns the wrong namespace. To fix this, save in lm_info_svr4 the debug base address that this lm/solib comes from, as a way to distinguish two solibs that would be otherwise identical. The code changes are: - Add a constructor to lm_info_svr4 accepting the debug base. Update all callers, which sometimes requires passing down the debug base. - Modify find_debug_base_for_solib to return the debug base directly from lm_info_svr4. - Modify svr4_same to consider the debug base value of the two libraries before saying they are the same. While at it, move the address checks before the name check, since they are likely less expensive to do. - Modify svr4_solib_ops::default_debug_base to update the debug base of existing solibs when the default debug base becomes known. 
I found the last point to be necessary, because when running an inferior, we list the shared libraries very early (before the first instruction): #0 svr4_solib_ops::current_sos (this=0x7c1ff1e09710) #1 0x00005555643c774e in update_solib_list (from_tty=0) #2 0x00005555643ca377 in solib_add (pattern=0x0, from_tty=0, readsyms=1) #3 0x0000555564335585 in svr4_solib_ops::enable_break (this=0x7c1ff1e09710, info=0x7d2ff1de8c40, from_tty=0) #4 0x000055556433c85c in svr4_solib_ops::create_inferior_hook (this=0x7c1ff1e09710, from_tty=0) #5 0x00005555643d22cb in solib_create_inferior_hook (from_tty=0) #6 0x000055556337071b in post_create_inferior (from_tty=0, set_pspace_solib_ops=true) #7 0x00005555633726a2 in run_command_1 (args=0x0, from_tty=0, run_how=RUN_NORMAL) #8 0x0000555563372b35 in run_command (args=0x0, from_tty=0) At this point, the dynamic linker hasn't yet filled the DT_DEBUG slot, which normally points at the base of r_debug. Since we're unable to list shared libraries at this point, we go through svr4_solib_ops::default_sos, which creates an solib entry for the dynamic linker. At this point, we have no choice but to create it with a debug base of 0 (or some other value that indicates "unknown"). If we left it as-is, then it would later not be recognized to be part of any existing namespace and that would cause problems down the line. With this change, the namespaces of the dynamic linker become correct. I was not sure if the code in library_list_start_library was conflating debug base and lmid. The documentation says this about the "lmid" field in the response of a qxfer:libraries-svr4:read packet: lmid, which is an identifier for a linker namespace, such as the memory address of the r_debug object that contains this namespace’s load map or the namespace identifier returned by dlinfo (3). When I read "lmid", I typically think about "the namespace identifier returned by dlinfo (3)". 
In library_list_start_library, we use the value of the "lmid" attribute as the debug base address. This is the case even before this patch, since we do: solist = &list->solib_lists[lmid]; The key for the solib_lists map is documented as being the debug base address. In practice, GDBserver uses the debug base address for the "lmid" field, so we're good for now. If the remote side instead used "the namespace identifier returned by dlinfo (3)" (which in practice with glibc are sequential integers starting at 0), I think we would be mostly fine. If we use the qxfer packet to read the libraries, we normally won't use the namespace base address to do any memory reads, as all the information comes from the XML. There might be some problems however because we treat the namespace 0 specially, for instance in svr4_solib_ops::update_incremental. In that case, we might need a different way of indicating that the remote side does not give namespace information than using namespace 0. This is just a thought for the future. I improved the existing test gdb.base/dlmopen-ns-ids.exp to verify that "info sharedlibrary" does not show duplicate libraries, duplicate meaning same address range, namespace and name. Change-Id: I84467c6abf4e0109b1c53a86ef688b934e8eff99 Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
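The ambiguity and its fix can be reproduced with a toy model. This is a sketch in plain Python, not the GDB code: `SolibModel`, `same_old`, `same_new` and `find_namespace` are hypothetical names, and the debug base values are invented. Note also that the real find_debug_base_for_solib now reads the debug base straight from lm_info_svr4 instead of searching; the search is kept here only to show why the old comparison picked the wrong namespace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SolibModel:
    name: str
    low_addr: int
    debug_base: int  # which namespace's r_debug this entry came from


def same_old(a, b):
    """Old comparison: name and low address only."""
    return a.low_addr == b.low_addr and a.name == b.name


def same_new(a, b):
    """New comparison: cheap address checks first, then the name."""
    return (a.debug_base == b.debug_base
            and a.low_addr == b.low_addr
            and a.name == b.name)


def find_namespace(solib, namespaces, same):
    """Return the index of the first namespace holding a match for SOLIB."""
    for ns_id, libs in enumerate(namespaces):
        for candidate in libs:
            if same(candidate, solib):
                return ns_id
    return None


# The dynamic linker is loaded once but listed in every namespace.
LD = "/usr/lib/ld-linux-x86-64.so.2"
ld_ns1 = SolibModel(LD, 0x7FFFF7FC6000, debug_base=0x2000)
ld_ns2 = SolibModel(LD, 0x7FFFF7FC6000, debug_base=0x3000)
namespaces = [[], [ld_ns1], [ld_ns2]]
```

Looking up `ld_ns2` with `same_old` returns 1, which is the wrong entry marked with "!" in the table above, while `same_new` returns 2.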
2025-08-22gdb/solib-svr4: centralize fetching of default debug baseSimon Marchi2-37/+69
When running an inferior, solib_ops_svr4::current_sos is called very early, at a point where the default debug base is not yet accessible. The inferior is stopped at its entry point, before the dynamic linker had the time to fill the DT_DEBUG slot. It only becomes available a little bit later. In a following patch, I will want to do some action when the debug base becomes known (namely, update the debug base in the previously created lm_info_svr4 instances). For this reason, add the svr4_solib_ops::default_debug_base method to centralize where we fetch the default debug base. I will then be able to add my code there, when detecting the debug base changes. This patch brings the following behavior change: since all svr4_solib_ops entry points now use svr4_solib_ops::default_debug_base to get the debug base value, they will now all re-fetch the value from the inferior. Previously, this was not done consistently, only in two spots. It seems to me like it would be good to be consistent about that, because we can't really predict which methods will get called in which order in all scenarios. Some internal methods still access svr4_info::default_debug_base directly, because it is assumed that their caller would have used svr4_solib_ops::default_debug_base, updating the value in svr4_info::default_debug_base if necessary. Change-Id: Ie08da34bbb3ad6fd317c0e5802c5c94d8c7d1ce5 Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
2025-08-22gdb: make iterate_over_objfiles_in_search_order methods of program_space and solib_opsSimon Marchi20-164/+139
Change the "iterate over objfiles in search order" operation from a gdbarch method to methods on both program_space and solib_ops. The first motivation for this is that I want to encapsulate solib-svr4's data into svr4_solib_ops (in a subsequent series), instead of it being in a separate structure (svr4_info). It is awkward to do so as long as there are entry points that aren't the public solib_ops interface. The second motivation is my project of making it possible to have multiple solib_ops per program space (which should be the subject of said subsequent series), to better support heterogeneous systems (like ROCm, with CPU and GPU in the same inferior). When we have this, when stopped in GPU code, it won't make sense to ask the host's architecture to do the iteration, as the logic could be different for the GPU architecture. Instead, program_space::iterate_over_objfiles_in_search_order will be responsible to delegate to the various solib_ops using a logic that is yet to be determined. I included this patch in this series (rather than the following one) so that svr4_solib_ops::iterate_over_objfiles_in_search_order can access svr4_solib_ops::default_debug_base, introduced in a later patch in this series. default_iterate_over_objfiles_in_search_order becomes the default implementation of solib_ops::iterate_over_objfiles_in_search_order. As far as I know, all architectures using svr4_iterate_over_objfiles_in_search_order also use solib_ops_svr4, so I don't expect this patch to cause behavior changes. Change-Id: I71f8a800b8ce782ab973af2f2eb5fcfe4e06ec76 Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
2025-08-22gdb: rename svr4_same_1 -> svr4_same_nameSimon Marchi1-6/+6
This makes it a bit clearer that it compares shared libraries by name. While at it, change the return type to bool. Change-Id: Ib11a931a0cd2e00bf6ae35c5b6e0d620298d46cb Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
2025-08-22gdb/solib-svr4: add get_lm_info_svr4Simon Marchi1-35/+29
Add this function, as a shortcut for the more verbose: auto &li = gdb::checked_static_cast<lm_info_svr4 &> (*solib.lm_info); Change-Id: I0206b3a8b457bdb276f26b354115e8f44416dfcf Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
2025-08-22gdb/solib: save program space in solib_opsSimon Marchi26-57/+104
In some subsequent patches, solib_ops methods will need to access the program space they were created for. We currently access the program space using "current_program_space", but it would be better to remember the program space at construction time instead. Change-Id: Icf2809435a23c47ddeeb75e603863b201eff2e58 Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
2025-08-22gdb/solib: adjust info linker-namespaces/sharedlibrary formatSimon Marchi4-61/+56
I would like to propose some minor changes to the format of "info linker-namespaces" and "info sharedlibrary", to make it a bit tidier and less chatty. Here are the current formats (I replaced empty lines with dots, so that git doesn't collapse them): (gdb) info linker-namespaces There are 3 linker namespaces loaded There are 5 libraries loaded in linker namespace [[0]] Displaying libraries for linker namespace [[0]]: From To Syms Read Shared Object Library 0x00007ffff7fc6000 0x00007ffff7fff000 Yes /lib64/ld-linux-x86-64.so.2 0x00007ffff7e94000 0x00007ffff7f8c000 Yes /usr/lib/libm.so.6 0x00007ffff7ca4000 0x00007ffff7e94000 Yes /usr/lib/libc.so.6 0x00007ffff7fad000 0x00007ffff7fb2000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7fa8000 0x00007ffff7fad000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so . . There are 6 libraries loaded in linker namespace [[1]] Displaying libraries for linker namespace [[1]]: From To Syms Read Shared Object Library 0x00007ffff7fb7000 0x00007ffff7fbc000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7fb2000 0x00007ffff7fb7000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so 0x00007ffff7b79000 0x00007ffff7c71000 Yes /usr/lib/libm.so.6 0x00007ffff7989000 0x00007ffff7b79000 Yes /usr/lib/libc.so.6 0x00007ffff7fc6000 0x00007ffff7fff000 Yes /usr/lib/ld-linux-x86-64.so.2 0x00007ffff7f99000 0x00007ffff7f9e000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.2.so . . 
There are 5 libraries loaded in linker namespace [[2]] Displaying libraries for linker namespace [[2]]: From To Syms Read Shared Object Library 0x00007ffff7fc6000 0x00007ffff7fff000 Yes /usr/lib/ld-linux-x86-64.so.2 0x00007ffff7fa3000 0x00007ffff7fa8000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7f9e000 0x00007ffff7fa3000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so 0x00007ffff7891000 0x00007ffff7989000 Yes /usr/lib/libm.so.6 0x00007ffff76a1000 0x00007ffff7891000 Yes /usr/lib/libc.so.6 (gdb) info sharedlibrary From To NS Syms Read Shared Object Library 0x00007ffff7fc6000 0x00007ffff7fff000 [[0]] Yes /lib64/ld-linux-x86-64.so.2 0x00007ffff7e94000 0x00007ffff7f8c000 [[0]] Yes /usr/lib/libm.so.6 0x00007ffff7ca4000 0x00007ffff7e94000 [[0]] Yes /usr/lib/libc.so.6 0x00007ffff7fb7000 0x00007ffff7fbc000 [[1]] Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7fb2000 0x00007ffff7fb7000 [[1]] Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so 0x00007ffff7b79000 0x00007ffff7c71000 [[1]] Yes /usr/lib/libm.so.6 0x00007ffff7989000 0x00007ffff7b79000 [[1]] Yes /usr/lib/libc.so.6 0x00007ffff7fc6000 0x00007ffff7fff000 [[1]] Yes /usr/lib/ld-linux-x86-64.so.2 0x00007ffff7fad000 0x00007ffff7fb2000 [[0]] Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7fa8000 0x00007ffff7fad000 [[0]] Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so 0x00007ffff7fa3000 0x00007ffff7fa8000 [[2]] Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7f9e000 0x00007ffff7fa3000 [[2]] Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so 0x00007ffff7891000 0x00007ffff7989000 [[2]] Yes /usr/lib/libm.so.6 
0x00007ffff76a1000 0x00007ffff7891000 [[2]] Yes /usr/lib/libc.so.6 0x00007ffff7fc6000 0x00007ffff7fff000 [[1]] Yes /usr/lib/ld-linux-x86-64.so.2 0x00007ffff7f99000 0x00007ffff7f9e000 [[1]] Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.2.so Here is what I would change: - I find that the [[...]] notation used everywhere is heavy and noisy. I understand that this is the (proposed) notation for specifying a namespace id in an expression. But I don't think it's useful to print those brackets everywhere (when it's obvious from the context that the number is a namespace id). I would remove them from the messages and from the tables. - I find these lines a bit too verbose: There are X libraries loaded in linker namespace [[Y]] Displaying libraries for linker namespace [[Y]]: I think they can be condensed to a single line, without loss of information (I think that printing the number of libs in each namespace is not essential, but I don't really mind, so I left it there). - I would add an empty line after the "There are N linker namespaces loaded" message, to visually separate it from the first group. I would also finish that line with a period. - There are two empty lines between each group I think that one empty line is sufficient to do a visual separation. Here's how it looks with this patch: (gdb) info linker-namespaces There are 3 linker namespaces loaded. 
5 libraries loaded in linker namespace 0: From To Syms Read Shared Object Library 0x00007ffff7fc6000 0x00007ffff7fff000 Yes /lib64/ld-linux-x86-64.so.2 0x00007ffff7e94000 0x00007ffff7f8c000 Yes /usr/lib/libm.so.6 0x00007ffff7ca4000 0x00007ffff7e94000 Yes /usr/lib/libc.so.6 0x00007ffff7fad000 0x00007ffff7fb2000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7fa8000 0x00007ffff7fad000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so 6 libraries loaded in linker namespace 1: From To Syms Read Shared Object Library 0x00007ffff7fb7000 0x00007ffff7fbc000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7fb2000 0x00007ffff7fb7000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so 0x00007ffff7b79000 0x00007ffff7c71000 Yes /usr/lib/libm.so.6 0x00007ffff7989000 0x00007ffff7b79000 Yes /usr/lib/libc.so.6 0x00007ffff7fc6000 0x00007ffff7fff000 Yes /usr/lib/ld-linux-x86-64.so.2 0x00007ffff7f99000 0x00007ffff7f9e000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.2.so 5 libraries loaded in linker namespace 2: From To Syms Read Shared Object Library 0x00007ffff7fc6000 0x00007ffff7fff000 Yes /usr/lib/ld-linux-x86-64.so.2 0x00007ffff7fa3000 0x00007ffff7fa8000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7f9e000 0x00007ffff7fa3000 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so 0x00007ffff7891000 0x00007ffff7989000 Yes /usr/lib/libm.so.6 0x00007ffff76a1000 0x00007ffff7891000 Yes /usr/lib/libc.so.6 (gdb) info shared From To Linker NS Syms Read Shared Object Library 0x00007ffff7fc6000 0x00007ffff7fff000 0 Yes /lib64/ld-linux-x86-64.so.2 0x00007ffff7e94000 0x00007ffff7f8c000 0 Yes /usr/lib/libm.so.6 0x00007ffff7ca4000 0x00007ffff7e94000 0 
Yes /usr/lib/libc.so.6 0x00007ffff7fb7000 0x00007ffff7fbc000 1 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7fb2000 0x00007ffff7fb7000 1 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so 0x00007ffff7b79000 0x00007ffff7c71000 1 Yes /usr/lib/libm.so.6 0x00007ffff7989000 0x00007ffff7b79000 1 Yes /usr/lib/libc.so.6 0x00007ffff7fc6000 0x00007ffff7fff000 1 Yes /usr/lib/ld-linux-x86-64.so.2 0x00007ffff7fad000 0x00007ffff7fb2000 0 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7fa8000 0x00007ffff7fad000 0 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so 0x00007ffff7fa3000 0x00007ffff7fa8000 2 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.1.so 0x00007ffff7f9e000 0x00007ffff7fa3000 2 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib-dep.so 0x00007ffff7891000 0x00007ffff7989000 2 Yes /usr/lib/libm.so.6 0x00007ffff76a1000 0x00007ffff7891000 2 Yes /usr/lib/libc.so.6 0x00007ffff7fc6000 0x00007ffff7fff000 1 Yes /usr/lib/ld-linux-x86-64.so.2 0x00007ffff7f99000 0x00007ffff7f9e000 1 Yes /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.mi/mi-dlmopen/dlmopen-lib.2.so Change-Id: Iefad340f7f43a15cff24fc8e1301f91d3d7f0278 Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
2025-08-22  gdb/solib: don't check filename when checking for duplicate solib  Simon Marchi  1  -9/+6
On Arch Linux, I get:

    FAIL: gdb.base/dlmopen-ns-ids.exp: reopen a namespace

The symptom observed is that after stepping over the last dlmopen of
the test, "info sharedlibrary" does not show the library just opened.
After digging, I found that when stepping over that dlmopen call, the
shlib event breakpoint (that GDB inserts in glibc to get notified of
dynamic linker activity) does not get hit.  I then saw that after the
previous dlclose, the shlib event breakpoints were suddenly all marked
as pending:

    (gdb) maintenance info breakpoints
    Num     Type             Disp Enb Address            What
    -1      shlib events     keep n   <PENDING>
    -1.1                          y-  <PENDING>

The root cause of this problem is the fact that the dynamic linker path
specified in binaries contains a symlink:

    $ readelf --program-headers /bin/ls | grep "Requesting program interpreter"
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
    $ ls -l /lib64
    lrwxrwxrwx 1 root root 7 May  3 15:26 /lib64 -> usr/lib
    $ realpath /lib64/ld-linux-x86-64.so.2
    /usr/lib/ld-linux-x86-64.so.2

As a result, the instances of the dynamic linker in the non-base
namespace have the real path instead of the original path:

    (gdb) info sharedlibrary
    From                To                  NS    Syms Read   Shared Object Library
    0x00007ffff7fc6000  0x00007ffff7fff000  [[0]] Yes         /lib64/ld-linux-x86-64.so.2
    ...
    0x00007ffff7fc6000  0x00007ffff7fff000  [[1]] Yes         /usr/lib/ld-linux-x86-64.so.2
    ...
    0x00007ffff7fc6000  0x00007ffff7fff000  [[1]] Yes         /usr/lib/ld-linux-x86-64.so.2
    ...
    0x00007ffff7fc6000  0x00007ffff7fff000  [[1]] Yes         /usr/lib/ld-linux-x86-64.so.2

Notice that all instances of the dynamic loader have the same address
range.  This is expected: the dynamic loader is really loaded just once
in memory, it's just that it's visible in the various namespaces, so
listed multiple times.  Also, notice that the last three specify
namespace 1... seems like a separate bug to me (ignore it for now).

The fact that the paths differ between the first one and the subsequent
ones is not something we control: we receive those paths as-is from the
glibc link map.

Since these multiple solib entries are really the same mapping, we
would expect this code in solib_read_symbols to associate them with the
same objfile:

    /* Have we already loaded this shared object?  */
    so.objfile = nullptr;
    for (objfile *objfile : current_program_space->objfiles ())
      {
        if (filename_cmp (objfile_name (objfile), so.name.c_str ()) == 0
            && objfile->addr_low == so.addr_low)
          {
            so.objfile = objfile;
            break;
          }
      }

But because the filenames differ, we end up creating two different
objfiles with the same symbols, same address ranges, etc.  I would
guess that this is not a state we want.

When the dlclose call closes the last library from the non-base
namespace, the dynamic linker entry for that namespace is also removed.
From GDB's point of view, it just looks like an solib getting unloaded.
In update_solib_list, we have this code to check if the objfile behind
the solib is used by other solibs, and avoid deleting the objfile if
so:

    bool still_in_use
      = (gdb_iter->objfile != nullptr
         && solib_used (current_program_space, *gdb_iter));

    /* Notify any observer that the shared object has been
       unloaded before we remove it from GDB's tables.  */
    notify_solib_unloaded (current_program_space, *gdb_iter,
                           still_in_use, false);

    /* Unless the user loaded it explicitly, free SO's objfile.  */
    if (gdb_iter->objfile != nullptr
        && !(gdb_iter->objfile->flags & OBJF_USERLOADED)
        && !still_in_use)
      gdb_iter->objfile->unlink ();

Because this is the last solib to use that objfile instance, the
objfile is deleted.  In the process,
disable_breakpoints_in_unloaded_shlib (in breakpoint.c) is called.  The
breakpoint locations for the shlib event breakpoints get marked as
"shlib_disabled", which then causes them (I suppose) to not get
inserted and be marked as pending.  And then, when stepping over the
subsequent dlmopen call, GDB misses the load of the new library.

It seems clear to me that, at least, the duplicate objfile detection in
solib_read_symbols needs to be fixed.  Right now, to conclude that an
solib matches an existing objfile, it checks that:

 - the two have equivalent paths (filename_cmp)
 - the two have the same "low" address

In this patch, I remove the filename check.  This makes it such that
all the solibs for dynamic linker entries will share the same objfile.

This assumes that no two different solibs / objfiles will have the same
low address.  At first glance, it seems like a reasonable assumption to
make, but I don't know if there are some corner cases where this is not
true.

To fix my specific case, I could change the code to resolve the
symlinks and realize that these are all the same file.  But I don't
think it would work in a general way.  For example, if debugging
remotely and using the target: filesystem, we would need to resolve the
symlink on the target, and I don't think we can do that today (there is
no readlink/realpath operation in the target file I/O).

With this patch, gdb.base/dlmopen-ns-ids.exp passes cleanly:

    # of expected passes            44

Change-Id: I3b60051085fb9597b7a72f50122c1104c969908e
Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
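The change to the duplicate check can be sketched as follows. This is only an illustration under simplifying assumptions: objfile_sketch, solib_sketch and find_objfile_for_solib are hypothetical names with just the two fields the check cares about, not the real GDB types or the actual patch.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins for GDB's objfile and solib.
struct objfile_sketch
{
  std::string name;
  std::uint64_t addr_low;
};

struct solib_sketch
{
  std::string name;
  std::uint64_t addr_low;
};

// Return the already-loaded objfile that SO maps to, or nullptr if there
// is none.  Only the low address is compared; the file names are
// deliberately ignored, so two paths naming the same mapping (for
// example through a symlink) resolve to a single objfile.
const objfile_sketch *
find_objfile_for_solib (const std::vector<objfile_sketch> &objfiles,
                        const solib_sketch &so)
{
  for (const objfile_sketch &objf : objfiles)
    if (objf.addr_low == so.addr_low)
      return &objf;

  return nullptr;
}
```

With the addresses from the example above, an solib named /usr/lib/ld-linux-x86-64.so.2 at 0x00007ffff7fc6000 would match an objfile that was first loaded as /lib64/ld-linux-x86-64.so.2 at the same low address. The sketch inherits the assumption stated in the message: no two distinct objfiles share a low address.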
2025-08-22  gdb/solib-svr4: make "lmid" XML attribute optional  Simon Marchi  1  -1/+1
When connecting to a GDBserver 12, which doesn't have support for
non-default linker namespaces and the "lmid" attribute in the
qXfer:libraries-svr4:read response, I get:

    (gdb) c
    Continuing.
    ⚠️ warning: while parsing target library list (at line 1): Required attribute "lmid" of <library> not specified

Given the code in library_list_start_library, I understand that the
"lmid" attribute is meant to be optional.  Mark it as optional in the
attribute descriptions, to avoid this warning.

Change-Id: Ieb10ee16e36bf8a771f944006e7ada1c10f6fbdc
Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
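To illustrate why the required/optional flag matters, here is a small sketch of an attribute table check. Everything in it is hypothetical (attribute_spec, missing_required, lmid_or_default); it does not reproduce GDB's XML support code, and the fallback of 0 for a missing "lmid" is an assumption made for the example.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one XML attribute: its name and whether
// the parser may accept an element that lacks it.
struct attribute_spec
{
  std::string name;
  bool optional;
};

using attribute_map = std::map<std::string, std::string>;

// Return the names of the required attributes that ATTRS lacks.  A
// parser would warn once for each returned name.
std::vector<std::string>
missing_required (const std::vector<attribute_spec> &specs,
                  const attribute_map &attrs)
{
  std::vector<std::string> missing;

  for (const attribute_spec &spec : specs)
    if (!spec.optional && attrs.count (spec.name) == 0)
      missing.push_back (spec.name);

  return missing;
}

// Read "lmid", falling back to 0 when an older server omits it.  The
// default value of 0 is an assumption of this sketch.
unsigned long
lmid_or_default (const attribute_map &attrs)
{
  auto it = attrs.find ("lmid");
  return it == attrs.end () ? 0 : std::stoul (it->second, nullptr, 0);
}
```

When "lmid" is flagged optional in the table, a <library> element from an older server passes the check with no warning and the reader falls back to the default. When it is flagged required, the same element is reported as incomplete, which is the situation described above.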
2025-08-22  gdb/testsuite: handle dynamic linker path with symlink in dlmopen tests  Simon Marchi  2  -2/+22
On my Arch Linux system*, the dynamic linker path specified in ELF
binaries contains a symlink:

    $ readelf --program-headers /bin/ls | grep "Requesting program interpreter"
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
    $ ls -l /lib64
    lrwxrwxrwx 1 root root 7 May  3 15:26 /lib64 -> usr/lib
    $ realpath /lib64/ld-linux-x86-64.so.2
    /usr/lib/ld-linux-x86-64.so.2

Because of this, some dlmopen tests think that the dynamic linker
doesn't appear multiple times, when it in fact does (under two
different names), and some parts of the test are disabled:

    UNSUPPORTED: gdb.base/dlmopen.exp: test_solib_unmap_events: multiple copies of the dynamic linker not found

Make the tests compute the real path of the dynamic linker and accept
that as a valid path for the dynamic linker.

With this patch, I go from

    # of expected passes            92

to

    # of expected passes            98

* On my Ubuntu 24.04 system, the dynamic linker appears to be a symlink
  too, but the glibc is too old to show the dynamic linker in the
  non-default namespace.

Change-Id: I03867f40e5313816bd8a8401b65713ddef5d620e
Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
2025-08-21  gdb/python: check return value of PyObject_New in all cases  Andrew Burgess  3  -0/+8
I spotted a few cases where the return value of PyObject_New was not being checked against nullptr, but we were dereferencing the result. All fixed here. The fixed functions can now return NULL, so I checked all the callers, and I believe they will handle a return of NULL correctly. Assuming calls to PyObject_New never fail, there should be no user visible changes after this commit. No tests here as I don't know how we'd go about causing a Python object allocation to fail. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-20  gdb: rework _active_linker_namespaces variable  Guinevere Larsen  7  -21/+51
This commit reworks the _active_linker_namespaces convenience variable following Simon's feedback here: https://sourceware.org/pipermail/gdb-patches/2025-August/219938.html This patch implements the renaming to _linker_namespace_count (following the standard set by _inferior_thread_count) and makes the convenience variable more resilient in the multi-inferior case by providing a new function, solib_linker_namespace_count, which gets the count of namespaces using the solib_ops of the provided program_space. Approved-By: Simon Marchi <simon.marchi@efficios.com>
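A rough sketch of why taking the program space as a parameter helps in the multi-inferior case. The types and the function name here are hypothetical, and the counting strategy (distinct namespace ids among the loaded libraries) is a substitute chosen for this example; the message above only says the real function asks the program space's solib_ops.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-ins: a loaded library tagged with its linker
// namespace id, and a program space owning a list of such libraries.
struct solib_sketch
{
  int linker_namespace;
};

struct program_space_sketch
{
  std::vector<solib_sketch> solibs;
};

// Count the namespaces in use in PSPACE.  Taking the program space as a
// parameter, instead of reading a global "current" one, lets a caller
// ask about any inferior, not only the selected one.
std::size_t
linker_namespace_count_sketch (const program_space_sketch &pspace)
{
  std::set<int> ids;

  for (const solib_sketch &so : pspace.solibs)
    ids.insert (so.linker_namespace);

  return ids.size ();
}
```

Two program spaces can then be queried independently, and each call only reflects the libraries of the program space it was given.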
2025-08-20  gdb/amd-dbgapi: make get_amd_dbgapi_inferior_info return a reference  Simon Marchi  1  -107/+112
This function can't return a NULL pointer, so make it return a reference instead. Change-Id: I0970d6d0757181291b300bd840037a48330a7fbb
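The general pattern can be sketched like this, with hypothetical names (inferior_info_sketch, info_registry_sketch) that do not correspond to the amd-dbgapi code: when a lookup creates the entry on first access, it can never hand back "nothing", so a reference states the contract more precisely than a pointer.

```cpp
#include <map>

// Hypothetical per-inferior bookkeeping record.
struct inferior_info_sketch
{
  int process_id = 0;
  bool runtime_enabled = false;
};

// Hypothetical registry of per-inferior records.
class info_registry_sketch
{
public:
  // Never fails: the entry is created on first access, so returning a
  // reference documents that callers need no null check.
  inferior_info_sketch &get (int inferior_num)
  {
    return m_infos[inferior_num];
  }

private:
  std::map<int, inferior_info_sketch> m_infos;
};
```

Callers then write registry.get (1).runtime_enabled directly, and the compiler rules out the "forgot to check for NULL" class of mistakes at those call sites.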