path: root/gdb
Age | Commit message | Author | Files | Lines
2025-08-29 | gdb, gdbarch: Introduce gdbarch method to get the shadow stack pointer. | Christina Schimpe | 5 | -2/+72
This patch is required by the following commit "gdb: Enable displaced stepping with shadow stack on amd64 linux." Reviewed-By: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Approved-By: Luis Machado <luis.machado@arm.com> Approved-By: Andrew Burgess <aburgess@redhat.com>
2025-08-29 | gdb: Implement amd64 linux shadow stack support for inferior calls. | Christina Schimpe | 3 | -1/+152
This patch enables inferior calls to support Intel's Control-Flow Enforcement Technology (CET), which provides the shadow stack feature for the x86 architecture. Following the restriction of the linux kernel, enable inferior calls for amd64 only. Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Reviewed-By: Eli Zaretskii <eliz@gnu.org> Approved-By: Luis Machado <luis.machado@arm.com> Approved-By: Andrew Burgess <aburgess@redhat.com>
2025-08-29 | gdb, gdbarch: Enable inferior calls for shadow stack support. | Christina Schimpe | 4 | -4/+72
Inferior calls in GDB reset the current PC to the beginning of the function that is called. As no call instruction is executed the new return address needs to be pushed to the shadow stack and the shadow stack pointer needs to be updated. This commit adds a new gdbarch method to push an address on the shadow stack. The method is used to adapt the function 'call_function_by_hand_dummy' for inferior call shadow stack support. Reviewed-By: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Approved-By: Luis Machado <luis.machado@arm.com> Approved-By: Andrew Burgess <aburgess@redhat.com>
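The push described above can be modelled in a few lines. The following Python sketch is only an illustration of the idea and is not GDB's gdbarch method: the class name `ShadowStackModel`, the example addresses, and the 8-byte slot size are assumptions (8 bytes matches an amd64 return address), and the stack is assumed to grow toward lower addresses.

```python
class ShadowStackModel:
    """Toy model of a user-mode shadow stack (hypothetical, not GDB code)."""

    SLOT_SIZE = 8  # assumed: one 64-bit return address per slot

    def __init__(self, ssp):
        self.ssp = ssp      # current shadow stack pointer
        self.memory = {}    # address -> stored return address

    def push(self, return_address):
        # No call instruction runs for an inferior call, so the debugger
        # has to move the pointer down and store the return address itself.
        self.ssp -= self.SLOT_SIZE
        self.memory[self.ssp] = return_address
        return self.ssp


stack = ShadowStackModel(ssp=0x7ffff7a00000)   # example value only
new_ssp = stack.push(0x401136)                 # example return address
```

Both steps matter: storing the address without updating the pointer (or the reverse) would leave the shadow stack inconsistent with the regular stack when the called function returns.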
2025-08-29 | gdb: Handle shadow stack pointer register unwinding for amd64 linux. | Christina Schimpe | 5 | -0/+240
Unwind the $pl3_ssp register. We now have an updated value for the shadow stack pointer when moving up or down the frame level. Note that $pl3_ssp can become unavailable when moving to a frame before the shadow stack enablement. In the example below, shadow stack is enabled in the function 'call1'. Thus, when moving to a frame level above the function, $pl3_ssp will become unavailable. Following the restriction of the linux kernel, implement the unwinding for amd64 linux only.

Before this patch:
~~~
Breakpoint 1, call2 (j=3) at sample.c:44
44 return 42;
(gdb) p $pl3_ssp
$1 = (void *) 0x7ffff79ffff8
(gdb) up
55 call2 (3);
(gdb) p $pl3_ssp
$2 = (void *) 0x7ffff79ffff8
(gdb) up
68 call1 (43);
(gdb) p $pl3_ssp
$3 = (void *) 0x7ffff79ffff8
~~~

After this patch:
~~~
Breakpoint 1, call2 (j=3) at sample.c:44
44 return 42;
(gdb) p $pl3_ssp
$1 = (void *) 0x7ffff79ffff8
(gdb) up
55 call2 (3);
(gdb) p $pl3_ssp
$2 = (void *) 0x7ffff7a00000
(gdb) up
68 call1 (43);
(gdb) p $pl3_ssp
$3 = <unavailable>
~~~

As we now have an updated value for each selected frame, the return command is now enabled for shadow stack enabled programs, too. We therefore add a test for the return command and shadow stack support, and for an updated shadow stack pointer after a frame level change.

Reviewed-By: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Approved-By: Luis Machado <luis.machado@arm.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
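The numbers in the session above are consistent with a simple rule: each step up one frame moves the pointer by one 8-byte slot, and the value becomes unavailable once the pointer no longer refers to a slot inside the shadow stack. The sketch below models that rule in Python. It is an assumption made for illustration, not GDB's unwinder: the function name `unwind_ssp`, the region start address, and the use of `None` for "unavailable" are all hypothetical.

```python
SLOT_SIZE = 8  # assumed: one 64-bit return address per shadow stack slot


def unwind_ssp(ssp, region_start, region_end):
    """Return the caller's shadow stack pointer, or None if unavailable."""
    # A pointer outside [region_start, region_end) does not refer to a
    # stored return address, so there is nothing to unwind from.
    if ssp is None or not (region_start <= ssp < region_end):
        return None
    return ssp + SLOT_SIZE


# Region bounds are assumed; only the end matches the value in the session.
region = (0x7ffff7800000, 0x7ffff7a00000)

frame0 = 0x7ffff79ffff8                 # call2, innermost frame
frame1 = unwind_ssp(frame0, *region)    # call1
frame2 = unwind_ssp(frame1, *region)    # caller of call1
```

Under these assumptions `frame1` equals the end of the region (an empty shadow stack, the state when it was enabled in 'call1'), and the frame above it gets no value at all.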
2025-08-29 | gdb: amd64 linux coredump support with shadow stack. | Christina Schimpe | 3 | -4/+221
Intel's Control-Flow Enforcement Technology (CET) provides the shadow stack feature for the x86 architecture. This commit adds support to write and read the shadow-stack node in corefiles. This helps debugging return address violations post-mortem. The format is synced with the linux kernel commit "x86: Add PTRACE interface for shadow stack". As the linux kernel restricts shadow stack support to 64-bit, apply the fix for amd64 only. Co-Authored-By: Christina Schimpe <christina.schimpe@intel.com> Reviewed-By: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Approved-By: Luis Machado <luis.machado@arm.com> Approved-By: Andrew Burgess <aburgess@redhat.com> --- The code and testcase are lightly adapted from: [PATCH v3 5/9] GDB, gdbserver: aarch64-linux: Initial Guarded Control Stack support https://sourceware.org/pipermail/gdb-patches/2025-June/218892.html
2025-08-29 | gdb, gdbserver: Add support of Intel shadow stack pointer register. | Christina Schimpe | 28 | -140/+582
This patch adds the user mode register PL3_SSP which is part of the Intel(R) Control-Flow Enforcement Technology (CET) feature for support of shadow stack. For now, only native and remote debugging support for shadow stack userspace on amd64 linux are covered by this patch including 64 bit and x32 support. 32 bit support is not covered due to missing Linux kernel support.

This patch requires fixing the test gdb.base/inline-frame-cycle-unwind which is failing in case the shadow stack pointer is unavailable. Such a state is possible if shadow stack is disabled for the current thread but supported by HW. This test uses the Python unwinder inline-frame-cycle-unwind.py which fakes a stack cycle by reading the pending frame's registers and adding them to the unwinder:
~~~
for reg in pending_frame.architecture().registers("general"):
    val = pending_frame.read_register(reg)
    unwinder.add_saved_register(reg, val)
return unwinder
~~~
However, in case the python unwinder is used we add a register (pl3_ssp) that is unavailable. This leads to a NOT_AVAILABLE_ERROR caught in gdb/frame-unwind.c:frame_unwind_try_unwinder and it is continued with standard unwinders. This destroys the faked cyclic behavior and the stack is further unwound after frame 5. In the working scenario an error should be triggered:
~~~
bt
0 inline_func () at /tmp/gdb.base/inline-frame-cycle-unwind.c:49^M
1 normal_func () at /tmp/gdb.base/inline-frame-cycle-unwind.c:32^M
2 0x000055555555516e in inline_func () at /tmp/gdb.base/inline-frame-cycle-unwind.c:45^M
3 normal_func () at /tmp/gdb.base/inline-frame-cycle-unwind.c:32^M
4 0x000055555555516e in inline_func () at /tmp/gdb.base/inline-frame-cycle-unwind.c:45^M
5 normal_func () at /tmp/gdb.base/inline-frame-cycle-unwind.c:32^M
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) PASS: gdb.base/inline-frame-cycle-unwind.exp: cycle at level 5: backtrace when the unwind is broken at frame 5
~~~
To fix the Python unwinder, we simply skip the unavailable registers.

The new register also makes the test gdb.dap/scopes.exp fail. The shadow stack feature is disabled by default, so the pl3_ssp register which is added with my CET shadow stack series will be shown as unavailable and we see a TCL error:
~~~
>>> {"seq": 12, "type": "request", "command": "variables", "arguments": {"variablesReference": 2, "count": 85}}
Content-Length: 129^M
^M
{"request_seq": 12, "type": "response", "command": "variables", "success": false, "message": "value is not available", "seq": 25}FAIL: gdb.dap/scopes.exp: fetch all registers success
ERROR: tcl error sourcing /tmp/gdb/testsuite/gdb.dap/scopes.exp.
ERROR: tcl error code TCL LOOKUP DICT body
ERROR: key "body" not known in dictionary
    while executing
"dict get $val body variables"
    (file "/tmp/gdb/testsuite/gdb.dap/scopes.exp" line 152)
    invoked from within
"source /tmp/gdb/testsuite/gdb.dap/scopes.exp"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 source /tmp/gdb/testsuite/gdb.dap/scopes.exp"
    invoked from within
"catch "uplevel #0 source $test_file_name" msg"
UNRESOLVED: gdb.dap/scopes.exp: testcase '/tmp/gdb/testsuite/gdb.dap/scopes.exp' aborted due to Tcl error
~~~
I am fixing this by enabling the test for CET shadow stack, in case we detect that the HW supports it:
~~~
# If x86 shadow stack is supported we need to configure GLIBC_TUNABLES
# such that the feature is enabled and the register pl3_ssp is
# available. Otherwise the request to fetch all registers will fail
# with "message": "value is not available".
if { [allow_ssp_tests] } {
    append_environment GLIBC_TUNABLES "glibc.cpu.hwcaps" "SHSTK"
}
~~~

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Luis Machado <luis.machado@arm.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
2025-08-29 | gdb, gdbserver: Use xstate_bv for target description creation on x86. | Christina Schimpe | 17 | -123/+161
The XSAVE function set is organized in state components, which are a set of registers or parts of registers. So-called XSAVE-supported features are organized using state-component bitmaps, each bit corresponding to a single state component. The Intel Software Developer's Manual uses the term xstate_bv for a state-component bitmap, which is defined as XCR0 | IA32_XSS. The control register XCR0 only contains a state-component bitmap that specifies user state components, while IA32_XSS contains a state-component bitmap that specifies supervisor state components. Until now, XCR0 is used as input for target description creation in GDB. However, a following patch will add userspace support for the CET shadow stack feature by Intel. The CET state is configured in IA32_XSS and consists of 2 state components: - State component 11 used for the 2 MSRs controlling user-mode functionality for CET (CET_U state) - State component 12 used for the 3 MSRs containing shadow-stack pointers for privilege levels 0-2 (CET_S state). Reading the CET shadow stack pointer register on linux requires a separate ptrace call using NT_X86_SHSTK. To pass the CET shadow stack enablement state we would like to pass the xstate_bv value instead of xcr0 for target description creation. To prepare for that, we rename the xcr0 mask values for target description creation to xstate_bv. However, this patch doesn't add any functional changes in GDB. Future states specified in IA32_XSS such as CET will create a combined xstate_bv_mask including xcr0 register value and its corresponding bit in the state component bitmap. This combined mask will then be used to create the target descriptions. Reviewed-By: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Approved-By: Luis Machado <luis.machado@arm.com>
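The relationship between the two bitmaps can be written out directly. This Python sketch only restates the definition given above (xstate_bv = XCR0 | IA32_XSS, with CET state in components 11 and 12); the constant and function names are hypothetical and are not GDB identifiers, and the example XCR0 value is an assumption.

```python
# State-component bit positions taken from the description above.
CET_U_BIT = 1 << 11   # user-mode CET state (CET_U)
CET_S_BIT = 1 << 12   # supervisor shadow-stack pointers (CET_S)


def xstate_bv(xcr0, ia32_xss):
    """Combine the user and supervisor state-component bitmaps."""
    return xcr0 | ia32_xss


def has_cet_user_state(mask):
    """True if the combined mask includes the CET_U state component."""
    return bool(mask & CET_U_BIT)


xcr0 = 0b111                        # example: x87, SSE and AVX components
mask = xstate_bv(xcr0, CET_U_BIT)   # example: CET_U enabled in IA32_XSS
```

A mask built only from XCR0 can never have bit 11 set, which is why XCR0 alone cannot carry the shadow stack enablement state into target description creation.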
2025-08-29 | gdb: Sync up x86-gcc-cpuid.h with cpuid.h from gcc 14 branch. | Christina Schimpe | 1 | -31/+122
This is required for a later commit which requires "bit_SHSTK". Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Approved-By: Tom Tromey <tom@tromey.com> Approved-By: Luis Machado <luis.machado@arm.com>
2025-08-29 | gdb, testsuite: Extend core_find procedure to save program output. | Christina Schimpe | 1 | -2/+8
From: Thiago Jung Bauermann <thiago.bauermann@linaro.org> The change comes from ARM's GCS series: [PATCH v3 5/9] GDB, gdbserver: aarch64-linux: Initial Guarded Control Stack support. We need it for testing coredump files, too. So include it in this patch series. Abridged-by: Christina Schimpe <christina.schimpe@intel.com> Approved-By: Luis Machado <luis.machado@arm.com> Approved-By: Andrew Burgess <aburgess@redhat.com> --- This is the patch mentioned above: https://sourceware.org/pipermail/gdb-patches/2025-June/218892.html Minus everything except for the change in gdb.exp's core_find procedure.
2025-08-29 | gdb/objfiles: make objfile::sections yield references | Simon Marchi | 21 | -149/+149
I wrote this as a preparatory patch while attempting to make objfile::section_iterator use filtered_iterator. It turned out not so easy, so I have put it aside for now. But now I have this patch, so I thought I'd send it by itself. Since the `obj_section *` yielded by the iterator can't be nullptr, I think it makes sense for the iterator to yield references instead. Just like you would get if you iterated on an std::vector<obj_section>. Change-Id: I7bbee50ed52599e64c4f3b06bdbbde597feba9aa
2025-08-29 | [gdb/testsuite] Fix overlapping CUs in gdb.dwarf2/dw2-linkage-name-trust.exp | Tom de Vries | 1 | -1/+1
When running test-case gdb.dwarf2/dw2-linkage-name-trust.exp with target board cc-with-gdb-index, I get: ... (gdb) file dw2-linkage-name-trust^M Reading symbols from dw2-linkage-name-trust...^M warning: .gdb_index address table has a range (0x4006ac - 0x4006cc) that \ overlaps with an earlier range, ignoring .gdb_index^M (gdb) delete breakpoints^M ... Fix this by compiling with nodebug. Tested on aarch64-linux. Approved-By: Tom Tromey <tom@tromey.com> PR testsuite/33315 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33315
2025-08-29 | [gdb/testsuite] Fix overlapping CUs in gdb.dwarf2/dw2-entry-points.exp | Tom de Vries | 2 | -3/+21
When running test-case gdb.dwarf2/dw2-entry-points.exp with target board cc-with-gdb-index, I get: ... (gdb) file dw2-entry-points^M Reading symbols from dw2-entry-points...^M warning: .gdb_index address table has a range (0x40066c - 0x4006e4) that \ overlaps with an earlier range, ignoring .gdb_index^M (gdb) delete breakpoints^M ... Fix this by copying function bar_helper to barso_helper, and using it where appropriate. Tested on aarch64-linux. Approved-By: Tom Tromey <tom@tromey.com> PR testsuite/33315 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33315
2025-08-29 | gdb: use kill() in gdbpy_interrupt for hosts with signal support | Andrew Burgess | 2 | -24/+68
For background, see this thread: https://inbox.sourceware.org/gdb-patches/20250612144607.27507-1-tdevries@suse.de Tom describes the issue clearly in the above thread, here's what he said: Once in a while, when running test-case gdb.base/bp-cmds-continue-ctrl-c.exp, I run into: ... Breakpoint 2, foo () at bp-cmds-continue-ctrl-c.c:23^M 23 usleep (100);^M ^CFAIL: $exp: run: stop with control-c (unexpected) (timeout) FAIL: $exp: run: stop with control-c ... This is PR python/32167, observed both on x86_64-linux and powerpc64le-linux. This is not a timeout due to accidental slowness, gdb actually hangs. The backtrace at the hang is (on cfarm120 running AlmaLinux 9.6): ... (gdb) bt #0 0x00007fffbca9dd94 in __lll_lock_wait () from /lib64/glibc-hwcaps/power10/libc.so.6 #1 0x00007fffbcaa6ddc in pthread_mutex_lock@@GLIBC_2.17 () from /lib64/glibc-hwcaps/power10/libc.so.6 #2 0x000000001067aee8 in __gthread_mutex_lock () at /usr/include/c++/11/ppc64le-redhat-linux/bits/gthr-default.h:749 #3 0x000000001067afc8 in __gthread_recursive_mutex_lock () at /usr/include/c++/11/ppc64le-redhat-linux/bits/gthr-default.h:811 #4 0x000000001067b0d4 in std::recursive_mutex::lock () at /usr/include/c++/11/mutex:108 #5 0x000000001067b380 in std::lock_guard<std::recursive_mutex>::lock_guard () at /usr/include/c++/11/bits/std_mutex.h:229 #6 0x0000000010679d3c in set_quit_flag () at gdb/extension.c:865 #7 0x000000001066b6dc in handle_sigint () at gdb/event-top.c:1264 #8 0x00000000109e3b3c in handler_wrapper () at gdb/posix-hdep.c:70 #9 <signal handler called> #10 0x00007fffbcaa6d14 in pthread_mutex_lock@@GLIBC_2.17 () from /lib64/glibc-hwcaps/power10/libc.so.6 #11 0x000000001067aee8 in __gthread_mutex_lock () at /usr/include/c++/11/ppc64le-redhat-linux/bits/gthr-default.h:749 #12 0x000000001067afc8 in __gthread_recursive_mutex_lock () at /usr/include/c++/11/ppc64le-redhat-linux/bits/gthr-default.h:811 #13 0x000000001067b0d4 in std::recursive_mutex::lock () at /usr/include/c++/11/mutex:108 #14 
0x000000001067b380 in std::lock_guard<std::recursive_mutex>::lock_guard () at /usr/include/c++/11/bits/std_mutex.h:229 #15 0x00000000106799cc in set_active_ext_lang () at gdb/extension.c:775 #16 0x0000000010b287ac in gdbpy_enter::gdbpy_enter () at gdb/python/python.c:232 #17 0x0000000010a8e3f8 in bpfinishpy_handle_stop () at gdb/python/py-finishbreakpoint.c:414 ... What happens here is the following: - the gdbpy_enter constructor attempts to set the current extension language to python using set_active_ext_lang - set_active_ext_lang attempts to lock ext_lang_mutex - while doing so, it is interrupted by sigint_wrapper (the SIGINT handler), handling a SIGINT - sigint_wrapper calls handle_sigint, which calls set_quit_flag, which also tries to lock ext_lang_mutex - since std::recursive_mutex::lock is not async-signal-safe, things go wrong, resulting in a hang. The hang bisects to commit 8bb8f834672 ("Fix gdb.interrupt race"), which introduced the lock, making PR python/32167 a regression since gdb 15.1. Commit 8bb8f834672 fixes PR dap/31263, a race reported by ThreadSanitizer: ... WARNING: ThreadSanitizer: data race (pid=615372) Read of size 1 at 0x00000328064c by thread T19: #0 set_active_ext_lang(extension_language_defn const*) gdb/extension.c:755 #1 scoped_disable_cooperative_sigint_handling::scoped_disable_cooperative_sigint_handling() gdb/extension.c:697 #2 gdbpy_interrupt gdb/python/python.c:1106 #3 cfunction_vectorcall_NOARGS <null> Previous write of size 1 at 0x00000328064c by main thread: #0 scoped_disable_cooperative_sigint_handling::scoped_disable_cooperative_sigint_handling() gdb/extension.c:704 #1 fetch_inferior_event() gdb/infrun.c:4591 ... Location is global 'cooperative_sigint_handling_disabled' of size 1 at 0x00000328064c ... SUMMARY: ThreadSanitizer: data race gdb/extension.c:755 in \ set_active_ext_lang(extension_language_defn const*) ... 
The problem here is that gdb.interrupt is called from a worker thread, and its implementation, gdbpy_interrupt, races with the main thread on some variable.

The fix presented here is based on the fix that Tom proposed, but fills in the missing Mingw support.

The problem is basically split into two: hosts that support Unix-like signals, and Mingw, which doesn't support signals.

For signal supporting hosts, I've adopted the approach that Tom suggests: gdbpy_interrupt uses kill() to send SIGINT to the GDB process. This is then handled in the main thread as if the user had pressed Ctrl+C. For these hosts no locking is required, so the existing lock is removed. However, everywhere the lock currently exists I've added an assert:

  gdb_assert (is_main_thread ());

If this assert ever triggers then we're setting or reading the quit flag on a worker thread, which will be a problem without the mutex.

For Mingw, the current mutex is retained. This is fine as there are no signals, so there is no chance of the mutex acquisition being interrupted by a signal, and so deadlock shouldn't be an issue.

To manage the complexity of when we need an assert, and when we need the mutex, I've created 'struct ext_lang_guard', which can be used as a RAII object. This object either performs the assertion check, or acquires the mutex, depending on the host.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32167
Co-Authored-By: Tom de Vries <tdevries@suse.de>
Approved-By: Tom Tromey <tom@tromey.com>
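The signal-based approach can be demonstrated outside GDB. The Python sketch below is an analogy under stated assumptions, not the GDB code: it relies on the fact that the Python interpreter runs signal handlers in the main thread, it needs a POSIX host where `os.kill` can deliver SIGINT to the current process, and the names `on_sigint`, `worker` and `handled_in_main_thread` are hypothetical.

```python
import os
import signal
import threading
import time

handled_in_main_thread = []


def on_sigint(signum, frame):
    # Record which thread runs the handler; no lock is needed because
    # only the main thread ever touches this list.
    handled_in_main_thread.append(
        threading.current_thread() is threading.main_thread())


previous = signal.signal(signal.SIGINT, on_sigint)


def worker():
    # The worker does not touch shared state itself; it asks the process
    # to interrupt itself, as if the user had pressed Ctrl+C.
    os.kill(os.getpid(), signal.SIGINT)


thread = threading.Thread(target=worker)
thread.start()
thread.join()

# Give the interpreter a moment to run the pending handler.
deadline = time.monotonic() + 2.0
while not handled_in_main_thread and time.monotonic() < deadline:
    time.sleep(0.01)

signal.signal(signal.SIGINT, previous)  # restore the earlier handler
```

Routing the request through a signal keeps all updates of the interrupt state on one thread, which is the property the `gdb_assert (is_main_thread ())` check in the commit is meant to enforce.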
2025-08-28 | gdb/gdb-gdb.gdb.in: skip gdb::ref_ptr<.*>::get | Simon Marchi | 1 | -0/+3
I think it's uninteresting to step into gdb::ref_ptr::get, so add a skip entry for it. I am adding just one to get the party started, but there are certainly more like this that we could skip. Change-Id: Ib074535c96a62137de63bbe58ff168a1e913688f Approved-By: Tom Tromey <tom@tromey.com>
2025-08-28 | gdb/testsuite: use gdb_test_no_output when dumping in gdb.base/dump.exp | Simon Marchi | 1 | -8/+1
I don't know if this is true on all platforms, but from what I can see on Linux, the dump commands don't output anything. Use gdb_test_no_output, which should be a bit more robust than checking for some specific error patterns. Change-Id: Idc82298c4752ba7808659dfea2f8324c8a97052d Approved-By: Tom Tromey <tom@tromey.com>
2025-08-28 | Fix documentation of -list-[target-]features results | Christian Walther | 1 | -2/+2
The manual claims that the -list-features and -list-target-features MI commands return their result in a field named "result". The field is actually named "features", and always has been since the introduction of these commands in 084344d and c6ebd6c. See mi_cmd_list_features and mi_cmd_list_target_features in gdb/mi/mi-main.c. Approved-By: Tom Tromey <tom@tromey.com>
2025-08-28 | testsuite: add untested in case OS corefile is not found | Christina Schimpe | 7 | -0/+7
Even though the core_find proc will log a warning, it's better to log "untested" and then terminate the test. This will help to avoid silently skipped tests, when running the testsuite. Most of the tests already do that. This patch adds the missing ones. Approved-By: Luis Machado <luis.machado.foss@gmail.com>
2025-08-28 | gdb/python: check return from final PyObject_New in py-disasm.c | Andrew Burgess | 1 | -44/+41
In this commit: commit dbd05b9edcf760a7001985f89bc760358a3c19d7 Date: Wed Aug 20 10:45:09 2025 +0100 gdb/python: check return value of PyObject_New in all cases I missed a call to PyObject_New in python/py-disasm.c, which this commit addresses. Unlike the previous commit, the call to PyObject_New in py-disasm.c is contained within the scoped_disasm_info_object class, which makes it harder to check for NULL and return. So in this commit I've rewritten the scoped_disasm_info_object class, moving the call to PyObject_New out into gdbpy_print_insn, which is the only place that scoped_disasm_info_object was being used. As scoped_disasm_info_object is no longer responsible for creating the underlying Python object, I figured that I might as well move the initialisation of that object out of scoped_disasm_info_object too. With that done, the scoped_disasm_info_object now has just one task, invalidating the existing disasm_info_object at the end of the scope. So I renamed scoped_disasm_info_object to scoped_invalidate_disasm_info, which reflects its only task. I made a couple of other small adjustments that were requested during review, these are both in the same code area: updating disasm_info_fill to take an object reference rather than a pointer, and removing the local variable insn_disas_obj from gdbpy_print_insn, and inline its value at the one place it was used. There should be no user visible changes after this commit. Except for the PyObject_New call, which now has proper error checking. But in the working case, nothing should have changed. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-28 | gdb/objfiles: add comment explaining when obj_section::the_bfd_section is nullptr | Simon Marchi | 1 | -1/+3
Change-Id: Iae17492f468efba7b76463a6ff8526171e412040 Reviewed-By: Tom de Vries <tdevries@suse.de>
2025-08-28 | [gdb/testsuite] Use build_executable in gdb.tui/tui-missing-src.exp | Tom de Vries | 1 | -4/+2
While looking at test-case gdb.tui/tui-missing-src.exp I noticed that gdb_compile is used to compile multiple sources: ... if { [gdb_compile "${srcfiles}" "${binfile}" \ executable {debug additional_flags=-O0}] != "" } { ... meaning there are no separate compile and link steps, as is required for fission [1]. Fix this by using build_executable instead. Tested on aarch64-linux. [1] https://gcc.gnu.org/wiki/DebugFission
2025-08-28 | gdb/record: Support wfi, sfence.vma, sret and mret instructions in risc-v | timurgol007 | 1 | -11/+45
During testing of bare-metal applications on QEMU for RISC-V, it was discovered that the instructions wfi, sfence.vma, sret, and mret were not supported. This patch introduces support for these instructions. Additionally, it wraps the fetch_instruction function in a try-catch block to gracefully handle errors that may occur when attempting to read an invalid address. Reviewed-By: Guinevere Larsen <guinevere@redhat.com> Approved-By: Andrew Burgess <aburgess@redhat.com>
2025-08-28 | [gdb/testsuite] Fix require dwarf2_support check in some test-cases, some more | Tom de Vries | 6 | -2/+10
The Linaro CI reported a regression in test-case gdb.dwarf2/macro-source-path-clang14-dw4.exp due to recent commit 81e5a23c7b8 ("[gdb/testsuite] Fix require dwarf2_support check in some test-cases"). The problem is that the "require dwarf2_support" in its new location doesn't work because proc dwarf2_support is not defined. I didn't notice this because I tested all gdb.dwarf2 test-cases together, and a different test-case had already imported the proc. Fix this by moving load_lib dwarf.exp earlier. Tested on x86_64-linux.
2025-08-27 | gdb/testsuite: get real executable in gdb.gdb/index-file.exp | Simon Marchi | 2 | -9/+44
Similar to a previous patch, if the gdb executable is in fact a libtool wrapper, we need to get the path to the real executable to load it in the top-level gdb. With this change, the test runs on Cygwin, although I do see two failures: FAIL: gdb.gdb/index-file.exp: debug_names files are identical FAIL: gdb.gdb/index-file.exp: debug_str files are identical Change-Id: Ie06d1ece67e61530e5b664e65b5ef0edccaf6afa Reviewed-By: Keith Seitz <keiths@redhat.com>
2025-08-27 | gdb/testsuite: turn thread events off in selftests | Simon Marchi | 1 | -0/+4
When running gdb.gdb/selftest.exp on Cygwin, the test eventually times out on this command: (gdb) PASS: gdb.gdb/selftest.exp: printed version as pointer continue Continuing. [New Thread 4804.0x1728] [New Thread 4804.0x2f24] [New Thread 4804.0x934] [New Thread 4804.0x23a8] [New Thread 4804.0x2cf4] [New Thread 4804.0x1408] [New Thread 4804.0x2c90] [New Thread 4804.0xc58] [New Thread 4804.0x1d40] [New Thread 4804.0x1824] GNU gdb (GDB) 17.0.50.20250530-git Copyright (C) 2024 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-pc-cygwin". Type "show configuration" for configuration details. For bug reporting instructions, please see: <https://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word". (gdb) [New Thread 4804.0x2c64] [New Thread 4804.0x23c4] [New Thread 4804.0x2814] [Thread 4804.0x1200 exited with code 0] [Thread 4804.0x293c exited with code 0] [Thread 4804.0x2c9c exited with code 0] FAIL: gdb.gdb/selftest.exp: xgdb is at prompt (timeout) The problem is the new thread notification, and the fact that the test expects the prompt to be the last thing in the buffer. To avoid the thread events interfering with the test, disable them, they are not useful here. With this patch, gdb.gdb/selftest.exp mostly runs fine on Cygwin, the only remaining problem appears to be: (gdb) PASS: gdb.gdb/selftest.exp: send ^C to child process signal SIGINT Continuing with signal SIGINT. 
PASS: gdb.gdb/selftest.exp: send SIGINT signal to child process, top GDB message FAIL: gdb.gdb/selftest.exp: send SIGINT signal to child process, bottom GDB message (timeout) Change-Id: I0b1df0503c1961c042c8de559b4d223c5d3cb95c Reviewed-By: Keith Seitz <keiths@redhat.com>
2025-08-27 | gdb/testsuite: use libtool to launch selftests | Simon Marchi | 1 | -17/+87
When building GDB on Cygwin, gdb/gdb.exe is a libtool wrapper (which happens to be a PE executable). The real executable is at gdb/.libs/gdb.exe. The "does gdb have debug info" test that _selftest_setup does is bogus, because it loads the libtool wrapper (which doesn't have debug info), doesn't see any debug info, and thus the test is skipped.

The "correct" way to deal with libtool wrappers is to run the shell command you want to run under `libtool --mode=execute`. That will replace any path resembling a libtool wrapper with the real executable path. But it will also add to the environment the library paths necessary for this executable to find the libraries it needs.

Therefore, modify the `do_self_tests` proc to:

 - run the top-level GDB commands under `libtool --mode=execute`
 - pass the path to the inferior GDB on the command-line of the top-level, so that it gets replaced with the real executable's path

However, the "file" command was previously used to detect the presence of debug info in the GDB executable. It's not easy to implement this check when loading the executable directly on the command line. So, add a separate proc, _selftest_check_executable_debug_info, that spawns a temporary GDB and does the debug info check through the file command. This proc uses libtool to obtain the path to the real executable.

When building, we use the bundled libtool.m4 at the top of the tree. This means that the libtool system package, and therefore the libtool binary, might not be available. Check for the presence of the libtool binary first, and only do the conversion if it is found. If it is not found, the test should still work on platforms that don't require the conversion.

With this commit, the test runs on Cygwin, even though there are failures later.

Change-Id: Ie7b712cdc84671a5a017655a7e41687ff23f906c
Reviewed-By: Keith Seitz <keiths@redhat.com>
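The "only use libtool if it is found" logic can be sketched as follows. The testsuite itself is written in Tcl, so this Python version is only an illustration of the decision described above: the function name `wrap_with_libtool` and its `which` parameter are hypothetical, while `libtool --mode=execute` is the invocation named in the commit message.

```python
import shutil


def wrap_with_libtool(command, which=shutil.which):
    """Prefix COMMAND with 'libtool --mode=execute' when libtool exists.

    WHICH is injectable so the lookup can be replaced in tests.
    """
    libtool = which("libtool")
    if libtool is None:
        # No libtool binary: run the command unchanged and hope the
        # platform does not use wrapper executables.
        return list(command)
    return [libtool, "--mode=execute", *command]


# Example: command line for the top-level GDB, with the inferior GDB's
# path passed as an argument so libtool can rewrite it.
cmd = wrap_with_libtool(["gdb/gdb", "-q", "gdb/gdb"],
                        which=lambda name: "/usr/bin/libtool")
```

Falling back to the unwrapped command keeps the tests usable on hosts that build from the bundled libtool.m4 without installing the libtool package.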
2025-08-27 | gdb/testsuite: do not copy gdb executable in self tests | Simon Marchi | 1 | -21/+2
In the ROCm-GDB testing process, we hit a problem that is a combination of these 3 factors: 1. In the downstream ROCm-GDB packages, the gdb executable is built with a relative RUNPATH: 0x000000000000001d (RUNPATH) Library runpath: [${ORIGIN}/../lib] This is done so that the installation is relocatable (the whole ROCm directory can be copied around) and things still work. For instance, the rocgdb executable needs to be able to find the libraries it needs, such as `librocm-dbgapi.so.0`. The relative runpath allows that. 2. For testing, we run the testsuite against the gdb executable installed from one of those packages. It is possible to ./configure the testsuite directory on its own, and then do: $ make check RUNTESTFLAGS="GDB=/opt/rocm/bin/rocgdb" 3. The selftests (such as gdb.gdb/selftest.exp) copy the GDB under test to the standard output directory, before trying to debug it. The problem is that the gdb executable under test that has been copied can't find the libraries it needs. With this patch, I propose that we don't copy the gdb executable, but debug it in place instead. The comment removed in this patch says "in case this OS doesn't like to edit its own text space", and has been there since forever in some form. But it's not clear if there is a host OS (where we intend to run this test) that needs this nowadays. I would bet that there isn't. If there is in fact a GDB host OS (where we intend to run this test) that needs it, we can reinstate the copying, but as an opt-in operation. Another situation where this change helps is on Windows, where gdb/gdb.exe is a libtool wrapper (the real executable is at gdb/.libs/gdb.exe). Copying gdb/gdb.exe doesn't accomplish anything useful. The next patch does further changes to account for the libtool wrapper case. I tested on Linux and Cygwin, more testing would be welcome. Change-Id: Id4148517d4fc4ecdd49f099c12003e3d16c6a93d Reviewed-By: Keith Seitz <keiths@redhat.com>
2025-08-27 | gdb/testsuite: remove function parameter from do_self_tests | Simon Marchi | 3 | -8/+7
The function to stop at is always main. Remove the parameter and hard-code main in _selftest_setup. Change-Id: Ibbbf598203b1658305eb6bc631d029652c10edac Reviewed-By: Keith Seitz <keiths@redhat.com>
2025-08-27 | gdb/testsuite: namespace procs in lib/selftest-support.exp | Simon Marchi | 1 | -5/+5
Rename some procs in lib/selftest-support.exp that are only used internally, to make it a bit clearer that they are just internal helpers. Change-Id: Icd399ac42698209fbc8e798bf43a7d8464aa848c Reviewed-By: Keith Seitz <keiths@redhat.com>
2025-08-27 | Fix formatting of gdbarch_components.py | Tom Tromey | 1 | -1/+1
pre-commit pointed out that gdbarch_components.py had a minor formatting issue, according to the official version of 'black'. This patch corrects the oversight.
2025-08-27 | gdb/testsuite: work around empty substring bug in expect | Andrew Burgess | 1 | -11/+25
There is a bug in expect, see:

  https://sourceforge.net/p/expect/patches/26/

which causes empty substring matches from a regexp to instead return the complete input buffer. To reproduce this bug, try this command:

  expect -c 'spawn sh -c "echo -n -e \"abc\""; \
             expect -re "(a?)(a)(bc)"; \
             puts "\n"; \
             for { set i 1 } { $i < 4 } { incr i } { \
               puts -nonewline "($i): \""; \
               puts -nonewline $expect_out($i,string); \
               puts "\"" \
             }'

For a working expect the output looks like:

  spawn sh -c echo -n -e "abc"
  abc
  (1): ""
  (2): "a"
  (3): "bc"

But for a broken expect the output looks like:

  spawn sh -c echo -n -e "abc"
  abc
  (1): "abc"
  (2): "a"
  (3): "bc"

Notice that (1) is now returning the complete input buffer rather than the empty string, this is wrong.

This is not the first time this bug has impacted GDB's testsuite, this commit seems to be working around the same problem:

  commit e579b537353cd91cb8fac1eaeb69901d4936766f
  Date: Sat Aug 16 20:32:37 2025 +0200

      [gdb/testsuite] Fix TUI tests on freebsd

I recently pushed this commit:

  commit 3825c972a636852600b47c242826313f4b9963b8
  Date: Wed Jun 18 15:02:29 2025 +0100

      gdb: allow gdb.Color to work correctly with pagination

Which added gdb.python/py-color-pagination.exp. Bug PR gdb/33321 was then created as the test was failing on some hosts. Turns out, this is the same expect bug.

The fix presented here is the same as for e579b537353cd91cb8: avoid using optional regexp substrings at the start of a regexp, and instead use two separate regexp patterns. With this change in place, the test now passes on all hosts.

There's no change in what is being tested after this commit.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33321
Approved-By: Tom de Vries <tdevries@suse.de>
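The restructuring used in the fix, replacing one pattern with an optional leading group by two patterns tried in turn, can be illustrated with the example regexp from the reproducer. Python's `re` module does not have the expect bug, so this sketch only shows that the two-pattern form yields the same groups; the function name `match_without_optional_prefix` is hypothetical and the real change is in the Tcl test.

```python
import re


def match_without_optional_prefix(text):
    """Equivalent of matching "(a?)(a)(bc)" without an optional group."""
    # First pattern: the formerly optional prefix is present.
    m = re.search(r"(a)(a)(bc)", text)
    if m:
        return m.groups()
    # Second pattern: the prefix is absent, so report it as empty
    # explicitly instead of relying on an empty substring match.
    m = re.search(r"(a)(bc)", text)
    if m:
        return ("",) + m.groups()
    return None


with_prefix = match_without_optional_prefix("aabc")
without_prefix = match_without_optional_prefix("abc")
```

Because neither pattern can produce an empty capture, a broken expect has no empty substring to mishandle, and the order of the two patterns preserves the original greedy preference for a non-empty prefix.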
2025-08-27[gdb/testsuite] Fix gdb.server/non-existing-program.exp on msys2-ucrt64Tom de Vries1-1/+3
On msys2-ucrt64, with test-case gdb.server/non-existing-program.exp I get: ... (gdb) quit^M gdb_caching_proc allow_xml_test caused gdb_exit to be called gdb_caching_proc allow_xml_test marked as called gdb_caching_proc get_mount_point_map marked as called builtin_spawn gdbserver stdio non-existing-program^M Error creating process "non-existing-program " (error 2): \ The system cannot find the file specified.^M^M Exiting^M^M FAIL: gdb.server/non-existing-program.exp: gdbserver exits cleanly ... This happens because this regexp fails to match: ... # This is what we get on Windows. -re "Error creating process\r\n\r\nExiting\r\n" { ... Fix this by updating the regexp. Tested on x86_64-w64-mingw32 (msys2-ucrt64).
2025-08-27[gdb/testsuite] Add have_startup_shellTom de Vries6-3/+54
Say we disable startup-with-shell, we get: ... (gdb) run `echo 8`^M Starting program: a2-run `echo 8`^M [Thread debugging using libthread_db enabled]^M Using host libthread_db library "/lib64/libthread_db.so.1".^M usage: factorial <number>^M [Inferior 1 (process 10787) exited with code 01]^M (gdb) FAIL: gdb.base/a2-run.exp: run "a2-run" with shell (timeout) ... Fix this by only doing this test if startup-with-shell is supported. This fixes the test-case on msys2-ucrt64, where startup-with-shell is not supported. Likewise in other test-cases. Tested on x86_64-linux.
2025-08-27[gdb/testsuite] Add missing require {!is_remote host}Tom de Vries6-1/+8
I ran test-case gdb.python/py-color-pagination.exp with make-check-all.sh and noticed failures when using remote host. So I grepped to find all test-cases using with_ansi_styling_terminal and ran them with host/target board local-remote-host-native. Fix the failing test-cases using require {!is_remote host}. Tested on x86_64-linux.
2025-08-26gdb/python: return gdbpy_ref<> from gdbpy_create_ptid_objectAndrew Burgess3-7/+12
Update gdbpy_create_ptid_object (python/py-infthread.c) to return a gdbpy_ref<> rather than a 'PyObject *'. This reduces the chances that a caller will leak an object, though no such memory leaks are fixed in this commit, this is just a code improvement patch. There should be no user visible changes after this commit. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-26gdb: use existing argument more in rename_vmcore_idle_reg_sectionsAndrew Burgess1-5/+8
In corelow.c, in the function rename_vmcore_idle_reg_sections, the argument ABFD holds the core file bfd pointer. When this function is called current_program_space->core_bfd() is passed as the argument value. Within this function, we sometimes use the function argument, and sometimes access current_program_space->core_bfd() directly. This is confusing, and unnecessary. Let's not do that. I've renamed the argument to cbfd (for Core file BFD), and then updated the function to make use of this argument throughout. This reduces the number of accesses to global state, which is, I think, a good thing. There should be no user visible changes after this commit. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-26gdb: more current_program_space->core_bfd() removalAndrew Burgess6-20/+24
This commit changes the signature of the gdbarch_core_info_proc method so that it takes a 'struct bfd *' as an extra argument. This argument is used to pass through the core file bfd pointer. Now, in corelow.c, when calling gdbarch_core_info_proc, we can pass through current_program_space->core_bfd() as the argument. Within the implementations, (Linux and FreeBSD) we can use this argument rather than having to access the core file through current_program_space. This reduces the use of global state, which I think is a good thing. There should be no user visible changes after this commit. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-26gdb: use current_program_space->core_bfd() a little lessAndrew Burgess1-11/+7
The function linux_read_core_file_mappings is passed an argument CBFD, which is the BFD for the core file. In core_target::build_file_mappings, where the function is called, we pass current_program_space->core_bfd() as the argument. However, in linux_read_core_file_mappings, in some places we use the CBFD argument, and in other places we directly use current_program_space->core_bfd(). This is confusing, and unnecessary. Let's not do that. Standardise on just using CBFD. This removes some references to global state in favour of passing the global state in as an argument, which I think is a good thing. There should be no user visible changes after this commit. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-26[gdb/tdep] Add XOP support in amd64_get_insn_detailsTom de Vries1-2/+30
Implement support for XOP instructions [1] in amd64_get_insn_details. The encoding scheme is documented here [2]. Essentially it's a variant of the VEX3 encoding scheme, with: - 0x8f as the first byte instead of 0xc4, and - an opcode map >= 8. The changes are roughly the same as the XOP part of an earlier submission [3], hence the tag. The only real difference is that that patch proposed to implement xop_prefix_p using: ... return pfx[0] == 0x8f && (pfx[1] & 0x38); ... which tries to resolve the conflict between the XOP prefix (starts with 0x8f) and the POP instruction (opcode 0x8f) by detecting that it's not a POP instruction. Instead, use the way AMD has resolved this conflict in the specification, by checking for opcode map >= 8: ... gdb_byte m = pfx[1] & 0x1f; return pfx[0] == 0x8f && m >= 8; ... Tested on x86_64-linux. Co-Authored-By: Jan Beulich <jbeulich@suse.com> Reviewed-By: Klaus Gerlicher<klaus.gerlicher.@intel.com> [1] https://en.wikipedia.org/wiki/XOP_instruction_set [2] https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/43479.pdf [3] https://sourceware.org/pipermail/gdb-patches/2019-February/155347.html
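The opcode-map check quoted above can be tried on a few byte sequences. The following is a minimal Python model of that check, not the C++ code in amd64-tdep.c; the sample byte values are illustrative assumptions chosen to exercise both branches.

```python
def xop_prefix_p(pfx: bytes) -> bool:
    """Model of the check described above: byte 0 is 0x8f and the opcode
    map (low five bits of byte 1) is at least 8."""
    if len(pfx) < 2:
        return False
    m = pfx[1] & 0x1F
    return pfx[0] == 0x8F and m >= 8


# POP r/m is encoded as 8F /0, so the ModRM reg field (bits 5:3) is zero
# and the low five bits of the second byte are always below 8.
pop_reg = bytes([0x8F, 0xC0])

# Illustrative start of an XOP prefix: second byte 0xE8 selects map 8.
xop_map8 = bytes([0x8F, 0xE8])
```

Because every 8F /0 ModRM byte has bits 4:3 clear, the map test separates the two encodings without having to decode the POP instruction itself.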
2025-08-26gdb/python: fix an unlikely memory leakAndrew Burgess1-16/+11
I noticed a possible memory leak in gdbpy_create_ptid_object, in py-infthread.c. We create a Tuple, and hold the reference in a 'PyObject*' local. If we then fail to create any of the tuple contents we perform an early exit, returning nullptr, this will leak the Tuple object. Currently, we create the Tuple as the first action in the function, but we don't really need the tuple until the end of the function. In this commit I have: 1. Moved creation of the Tuple until the end of the function, just before we need it. 2. Stored the Tuple reference in a gdbpy_ref<>. This is not strictly needed any more, but is (I think) good practice as future changes to the function will not need to worry about releasing the Tuple object. 3. Taken the opportunity to replace a NULL with nullptr in this function. 4. Inlined the local variable declarations to the point of first use. There should be no user visible changes after this commit. No tests as I have no idea how to make gdb_py_object_from_longest (and friends) fail, and so trigger the memory leak. I suspect we'd never actually see this leak in the real world, but it doesn't hurt to clean these things up. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-26gdb: LoongArch: Improve loongarch_scan_prologue for correct backtraceHui Li1-0/+32
(1) Description of Problem: When debugging the following code, the execution result of the backtrace command is incorrect. $ cat test.S .text .globl fun1 .type fun1, @function fun1: or $r12,$r0,$r0 or $r4,$r12,$r0 jr $r1 .globl fun .type fun, @function fun: addi.d $r3,$r3,-16 st.d $r1,$r3,8 bl fun1 or $r12,$r4,$r0 or $r4,$r12,$r0 ld.d $r1,$r3,8 addi.d $r3,$r3,16 jr $r1 .globl main .type main, @function main: addi.d $r3,$r3,-16 st.d $r1,$r3,8 bl fun nop ld.d $r1,$r3,8 addi.d $r3,$r3,16 jr $r1 $ gcc test.S -o test $ gdb test ... (gdb) b fun1 Breakpoint 1 at 0x748 (gdb) r Breakpoint 1, 0x0000555555554748 in fun1 () (gdb) bt #0 0x0000555555554748 in fun1 () #1 0x0000555555554758 in fun () #2 0x0000555555554758 in fun () #3 0x0000555555554758 in fun () .... --Type <RET> for more, q to quit, c to continue without paging (2) Root Cause Analysis: The return address of fun() in r1(ra) is saved on the stack: addi.d $r3,$r3,-16 st.d $r1,$r3,8 The bl instruction in fun () calls fun1 () and saves the value of pc+4 to r1(ra). bl fun1 or $r12,$r4,$r0 The current code does not record where registers such as fp and ra are saved on the stack of the sub-function. As a result, when tracing back from fun() to main(), the pc of the previous frame is read from the ra register instead of from the saved location on the stack. At this point, the value of the ra register in fun() is already the address of the next instruction after the bl, so it is impossible to trace back to main(). (3) Solution: Record the location of ra, fp, s0 to s8 on the stack to ensure the correct execution of backtrace. (4) Test: $ gdb test ... (gdb) b fun1 Breakpoint 1 at 0x748 (gdb) r Breakpoint 1, 0x0000555555554748 in fun1 () (gdb) bt #0 0x0000555555554748 in fun1 () #1 0x0000555555554758 in fun () #2 0x0000555555554778 in main () Signed-off-by: Hui Li <lihui@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
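The idea behind the fix, recording where the prologue saves ra and reading the caller's pc from that slot, can be sketched as follows. This is a simplified Python illustration that works on assembly text; the real loongarch_scan_prologue decodes instruction words, and the names scan_prologue and caller_pc are hypothetical.

```python
import re

# Prologue of fun() from the example above.
PROLOGUE = """
addi.d $r3,$r3,-16
st.d $r1,$r3,8
bl fun1
"""


def scan_prologue(asm: str):
    """Return (frame_size, saved), where saved maps a register name to its
    offset from the adjusted stack pointer."""
    frame_size = 0
    saved = {}
    for line in asm.strip().splitlines():
        line = line.strip()
        m = re.fullmatch(r"addi\.d \$r3,\$r3,(-\d+)", line)
        if m:
            frame_size = -int(m.group(1))
            continue
        m = re.fullmatch(r"st\.d \$(r\d+),\$r3,(\d+)", line)
        if m:
            saved[m.group(1)] = int(m.group(2))
            continue
        break  # first instruction that is not part of the prologue
    return frame_size, saved


def caller_pc(memory, sp, saved, ra_register):
    """Prefer the ra slot saved on the stack over the live ra register."""
    if "r1" in saved:
        return memory[sp + saved["r1"]]
    return ra_register
```

Without the recorded slot, the unwinder in this model falls back to the live ra value, which after the bl points back into fun() and produces the repeated frame seen in the broken backtrace.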
2025-08-26gdb: LoongArch: Improve loongarch_scan_prologue to record stack informationHui Li1-0/+42
(1) Description of Problem: When debugging the following code, the execution result of nexti command is incorrect. $ cat test.S .text .globl fun .type fun, @function fun: or $r12,$r0,$r0 or $r4,$r12,$r0 jr $r1 .globl main .type main, @function main: addi.d $r3,$r3,-16 st.d $r1,$r3,8 bl fun or $r12,$r4,$r0 or $r4,$r12,$r0 ld.d $r1,$r3,8 addi.d $r3,$r3,16 jr $r1 $ gcc test.S -o test $ gdb test ... (gdb) set disassemble-next-line on (gdb) start ... Temporary breakpoint 1, 0x0000555555554754 in main () => 0x0000555555554754 <main+8>: 57ffefff bl -20 # 0x555555554740 <fun> (gdb) ni 0x0000555555554740 in fun () => 0x0000555555554740 <fun+0>: 0015000c move $t0, $zero (2) Root Cause Analysis: In the internal execution flow of the ni command, a single-step will be executed first. After that, it will enter process_event_stop_test (), some conditions are judged in this function. if ((get_stack_frame_id (frame) != ecs->event_thread->control.step_stack_frame_id) && get_frame_type (frame) != SIGTRAMP_FRAME && ((frame_unwind_caller_id (frame) == ecs->event_thread->control.step_stack_frame_id) && ((ecs->event_thread->control.step_stack_frame_id != outer_frame_id) || (ecs->event_thread->control.step_start_function != find_pc_function (ecs->event_thread->stop_pc ()))))) { ... if (ecs->event_thread->control.step_over_calls == STEP_OVER_ALL) ... else insert_step_resume_breakpoint_at_caller (frame); } Here, it will be judged whether a sub-function has been called based on whether the frame id before the single step is not equal to the current frame id and whether there is a calling relationship. If a sub-function is called at this time and the current operation is nexti, it will not stop immediately. Instead, insert_step_resume_breakpoint_at_caller() will be called to complete the execution of the sub-function and then stop. In above debugging examples, the executable program being debugged is compiled from an asm source file that does not contain dwarf information. 
Therefore, the frame id of the function is calculated by loongarch_frame_unwind rather than dwarf2_frame_unwind. However, loongarch_scan_prologue() does not yet record stack information in loongarch_frame_cache, which causes problems in operations that depend on the frame id. (3) Solution: Improve loongarch_scan_prologue() to record the stack information in loongarch_frame_cache, and improve loongarch_frame_unwind_stop_reason() to use the information recorded in loongarch_frame_cache. (4) Test: After this patch: $ gdb test (gdb) set disassemble-next-line on (gdb) start Temporary breakpoint 1, 0x0000555555554754 in main () => 0x0000555555554754 <main+8>: 57ffefff bl -20 # 0x555555554740 <fun> (gdb) ni 0x0000555555554758 in main () => 0x0000555555554758 <main+12>: 0015008c move $t0, $a0 (gdb) ni 0x000055555555475c in main () => 0x000055555555475c <main+16>: 00150184 move $a0, $t0 Signed-off-by: Hui Li <lihui@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
2025-08-26gdb: LoongArch: Refactor member functions of loongarch_frame_unwindHui Li1-17/+156
In the current code, loongarch_frame_unwind is a LoongArch prologue unwinder, it contains the required member functions, but they do not calculate a valid frame id through prologue of a function frame. Refactor these functions and use loongarch_frame_cache to record the information of the function frame. No functional change intended. Signed-off-by: Hui Li <lihui@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
2025-08-26gdb: LoongArch: Add the definition of loongarch_frame_cacheHui Li1-1/+28
Add the definition of loongarch_frame_cache for loongarch_frame_unwind, this is preparation for later patch on LoongArch. Signed-off-by: Hui Li <lihui@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
2025-08-26amd64-tdep: need_modrm = 1 for VEX/EVEX instructions, except vzeroall/vzeroupperKlaus Gerlicher1-6/+32
VEX and EVEX-encoded instructions generally require a ModR/M byte, with the notable exception of vzeroall and vzeroupper (opcode 0x77), which do not use ModR/M. This change sets need_modrm = 1 for VEX instructions, and adds an exception for instructions where *insn == 0x77, following Intel’s SDM. EVEX has no exceptions and thus always sets need_modrm to 1. Additionally, the legacy twobyte_has_modrm table cannot be used for VEX and EVEX instructions, as these encodings have different requirements and exceptions. The logic is now explicit for VEX/EVEX handling. Add vpblendw to selftest amd64_insn_decode. The Intel SDM says the following: 1. Intel® 64 and IA-32 Architectures Software Developer’s Manual Section 2.2.1.2 — Instruction Prefixes "The VEX prefix is a multi-byte prefix that replaces several legacy prefixes and opcode bytes. The VEX prefix is not an opcode; it is a prefix that modifies the instruction that follows." Section 2.2.1.3 — Opcode Bytes "The opcode byte(s) follow any instruction prefixes (including VEX). The opcode specifies the operation to be performed." Section 2.2.2 — Instruction Format "If a VEX prefix is present, it is processed as a single prefix, and the opcode bytes follow immediately after the VEX prefix." Source: Intel® SDM Vol. 2A, Section 2.2.1.2 and 2.2.2 (See Vol. 2A, PDF pages 2-4, 2-5, and 2-7) 2. ModRM Byte Requirement Intel® SDM Vol. 2A, Table 2-2 — VEX Prefix Encoding "Most VEX-encoded instructions require a ModRM byte, except for a few instructions such as VZEROALL and VZEROUPPER." Source: Intel® SDM Vol. 2A, Table 2-2 (See Vol. 2A, PDF page 2-13) Approved-By: Tom de Vries <tdevries@suse.de>
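The rule described above is small enough to state as a function. This is a hypothetical Python model of the decision only, assuming the caller has already classified the encoding and located the opcode byte; it is not the code in amd64-tdep.c.

```python
VZERO_OPCODE = 0x77  # vzeroall / vzeroupper, per the commit message


def need_modrm(encoding: str, opcode: int) -> bool:
    """Hypothetical model: encoding is "vex" or "evex"; opcode is the
    first opcode byte after the prefix."""
    if encoding == "evex":
        return True  # no exceptions for EVEX
    if encoding == "vex":
        return opcode != VZERO_OPCODE
    # Legacy encodings keep using the existing lookup tables, which this
    # sketch does not model.
    raise ValueError("legacy encodings are not modelled here")
```

Keeping the VEX/EVEX rule separate from the legacy tables mirrors the point that twobyte_has_modrm does not apply to these encodings.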
2025-08-26[gdb/testsuite] Fix require dwarf2_support check in some test-casesTom de Vries6-2/+10
On x86_64-freebsd, I ran into trouble with test-case gdb.dwarf2/macro-source-path-clang14-dw4.exp (and similar), and I managed to reproduce the problem on x86_64-linux by making dwarf2_support return 0. The failure looks like: ... UNSUPPORTED: $exp: require failed: dwarf2_support UNRESOLVED: $exp: testcase aborted due to invalid command name: do_test ERROR: tcl error sourcing $exp. ... I fixed a similar problem in commit 3e488d8ccd0 ("[gdb/testsuite] Fix gdb.dwarf2/dw-form-strx-out-of-bounds.exp with make-check-all.sh"). Fix this by moving "require dwarf2_support" from gdb.dwarf2/macro-source-path.exp.tcl to the files including it. Tested on x86_64-linux.
2025-08-25Fix tekhex format related gdb.base/dump.exp failuresKevin Buettner1-1/+1
On s390x, a big-endian machine, I'm seeing these test failures: FAIL: gdb.base/dump.exp: array as memory, tekhex; file restored ok FAIL: gdb.base/dump.exp: array as memory, tekhex; value restored ok FAIL: gdb.base/dump.exp: array as value, tekhex; file restored ok FAIL: gdb.base/dump.exp: array as value, tekhex; value restored ok FAIL: gdb.base/dump.exp: array copy, tekhex; file restored ok FAIL: gdb.base/dump.exp: array copy, tekhex; value restored ok FAIL: gdb.base/dump.exp: array partial, tekhex; file restored ok FAIL: gdb.base/dump.exp: array partial, tekhex; value restored ok FAIL: gdb.base/dump.exp: dump array as memory, tekhex FAIL: gdb.base/dump.exp: dump array as value, tekhex FAIL: gdb.base/dump.exp: dump struct as memory, tekhex FAIL: gdb.base/dump.exp: dump struct as value, tekhex FAIL: gdb.base/dump.exp: reload array as memory, tekhex; value restored ok FAIL: gdb.base/dump.exp: reload array as value, tekhex; value restored ok FAIL: gdb.base/dump.exp: reload struct as memory, tekhex; value restored ok FAIL: gdb.base/dump.exp: reload struct as value, tekhex; value restored ok FAIL: gdb.base/dump.exp: struct as memory, tekhex; file restored ok FAIL: gdb.base/dump.exp: struct as memory, tekhex; value restored ok FAIL: gdb.base/dump.exp: struct as value, tekhex; file restored ok FAIL: gdb.base/dump.exp: struct as value, tekhex; value restored ok FAIL: gdb.base/dump.exp: struct copy, tekhex; file restored ok FAIL: gdb.base/dump.exp: struct copy, tekhex; value restored ok It turns out that there's a subtle bug in move_section_contents in bfd/tekhex.c. The bug is that when attempting to write a buffer that starts with a zero byte, the function will return false, an error condition, without writing anything. 
But it also doesn't set bfd_error, so GDB ends up displaying whatever the last unrelated error was, e.g.: warning: writing dump file '.../intstr1.tekhex' (No such file or directory) When I investigated this, the bfd error was set during failure to open a separate debug file for the test case, which is totally unrelated to this problem. The reason this fails on big endian machines is that the test case writes out structs and arrays of int initialized to small values. On little endian machines, the small integer is the first byte, so the error doesn't occur. On big endian machines, a zero byte occurs first, triggering the error. On the GDB side of things, I've made a one line change to the test case to cause the error to also happen on little endian machines. I simply shift the value of the first field in the struct left by 16 bits. That leaves at least one zero byte on both sides of the non-zero part of the int. I shifted it by 16 because, for a moment, there was a question in my mind about what would happen with a second zero byte, but it turns out that it's not a problem. On the bfd side of things, take a look at move_section_contents() and find_chunk() in tekhex.c. The scenario is this: we enter move_section_contents with locationp pointing at a character buffer whose first byte is zero. The 'get' parameter is false, i.e. we're writing, not reading. The other critical fact is that the abfd->tdata.tekhex_data->data is NULL (0). I'm going to go through the execution path pretty much line by line with commentary below the line(s) just executed. char *location = (char *) locationp; bfd_vma prev_number = 1; /* Nothing can have this as a high bit. */ I can't say that the comment provides the best explanation about what's happening, but the gist is this: later on, chunk_number will have its low bits masked away, therefore no matter what it is, it can't possibly be equal to prev_number when it's set to 1. 
struct data_struct *d = NULL; BFD_ASSERT (offset == 0); for (addr = section->vma; count != 0; count--, addr++) { Set d to NULL and enter the loop. /* Get high bits of address. */ bfd_vma chunk_number = addr & ~(bfd_vma) CHUNK_MASK; bfd_vma low_bits = addr & CHUNK_MASK; Use CHUNK_MASK, which is 0x1fff, to obtain the chunk number, i.e. whatever's left after masking off the low 13 bits of addr, and low_bits, which are the low 13 bits of addr. chunk_number matters for understanding this bug, low_bits does not. Remember that no matter what addr is, once you mask off the low 13 bits, it can't be equal to 1. bool must_write = !get && *location != 0; !get is true, *location != 0 is false, therefore the conjunction is false, and furthermore must_write is false. I.e. even though we are writing, we don't transfer zero bytes to the chunk - this is why must_write is false. (The reason this works is that a chunk, once allocated, is zero'd as part of the allocation using bfd_zalloc. Therefore we can skip transferring zero bytes and, if enough of them are skipped one after another, chunk allocation simply doesn't happen. That's a good thing.) if (chunk_number != prev_number || (!d && must_write)) For the reason provided above, chunk_number != prev_number is true. The other part of the disjunction doesn't matter since the first part is true. This means that the if-block is entered. /* Different chunk, so move pointer. */ d = find_chunk (abfd, chunk_number, must_write); find_chunk is entered with must_write set to false. Now, remember where we left off here, because we're going to switch to find_chunk. static struct data_struct * find_chunk (bfd *abfd, bfd_vma vma, bool create) { (Above 3 lines indented to distinguish code from commentary.) When we enter find_chunk, create is false because must_write was false. struct data_struct *d = abfd->tdata.tekhex_data->data; d is set to NULL since abfd->tdata.tekhex_data->data is NULL (one of the conditions for the scenario). 
vma &= ~CHUNK_MASK; while (d && (d->vma) != vma) d = d->next; d is NULL, so the while loop doesn't execute. if (!d && create) ... d is NULL so !d is true, but create is false, so the condition evaluates to false, meaning that the if-block is skipped. return d; find_chunk returns NULL, since d is NULL. Back in move_section_contents: if (!d) return false; d is NULL (because that's what find_chunk returned), so move_section_contents returns false at this point. Note that move_section_contents has allocated no memory, nor even tried to transfer any bytes beyond the first (zero) byte. This is a bug. The key to understanding this bug is to observe that find_chunk can return NULL to indicate that no chunk was found. This is especially important for the read (get=true) case. But it can also be NULL to indicate a memory allocation error. I toyed around with the idea of using a different value to distinguish these cases, i.e. something like (struct data_struct *) -1, but although bfd contains plenty of code where -1 is used to indicate various interesting conditions for scalars, there's no prior art where this is done for a pointer. Therefore the idea was discarded in favor of modifying this statement: if (!d) return false; to: if (!d && must_write) return false; This works because, in find_chunk, the only way to return NULL for a memory allocation error is for must_write / create to be true. When it is true, if bfd_zalloc successfully allocates a chunk, then that (non-NULL) chunk will be returned at the end of the function. When it fails, it'll return NULL early. The point is that when bfd_zalloc() fails and returns NULL, must_write (in move_section_contents) / create (in find_chunk) HAD to be true. That provides us with an easy test back in move_section_contents to distinguish a memory-allocation-NULL from a block-not-found-NULL. 
The other NULL return case happens when the end of the function is reached when either searching for a chunk to read or attempting to find a chunk to write when abfd->tdata.tekhex_data->data is NULL. But for the latter case, must_write was false, which does not (now, with the above fix) trigger the early return of false. (Alan Modra approved the bfd/tekhex.c change.) Approved-By: Simon Marchi <simon.marchi@efficios.com> (GDB)
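The walkthrough above can be replayed with a small model. This is a hedged Python sketch of the write path only, not the code in bfd/tekhex.c: chunks are held in a dict, allocation failure is not modelled, and the function name write_section and the fixed flag are hypothetical.

```python
import struct

CHUNK_MASK = 0x1FFF  # low 13 bits, as in the description above


def find_chunk(chunks, vma, create):
    """Return the chunk for vma, creating a zero-filled one only on request."""
    vma &= ~CHUNK_MASK
    if vma not in chunks and create:
        chunks[vma] = bytearray(CHUNK_MASK + 1)
    return chunks.get(vma)


def write_section(chunks, start, data, fixed):
    """Model of the write path; fixed selects the new early-return test."""
    prev_number = 1  # never equal to a masked chunk number
    d = None
    for addr, byte in enumerate(data, start):
        chunk_number = addr & ~CHUNK_MASK
        low_bits = addr & CHUNK_MASK
        must_write = byte != 0  # zero bytes are never transferred
        if chunk_number != prev_number or (d is None and must_write):
            d = find_chunk(chunks, chunk_number, must_write)
            # Old test: any None is an error.  New test: only when writing.
            if d is None and (must_write or not fixed):
                return False
            prev_number = chunk_number
        if must_write:
            d[low_bits] = byte
    return True


big_endian_one = struct.pack(">i", 1)           # 00 00 00 01
little_endian_one = struct.pack("<i", 1)        # 01 00 00 00
little_shifted = struct.pack("<i", 1 << 16)     # 00 00 01 00
```

In this model the old test fails for any buffer whose first byte is zero, which is why shifting the test value left by 16 bits reproduces the failure on little endian hosts as well.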
2025-08-25gdb: fix indentation in objfiles.cSimon Marchi1-1/+1
Change-Id: I3d39ee767a3b2b743b3a90386fb30a6703e9733e
2025-08-25gdb: LoongArch: Handle newly added llsc instructionsXi Ruoyao1-2/+7
We can't put a breakpoint in the middle of a ll/sc atomic sequence. Handle the instructions sc.q, llacq.{w/d}, and screl.{w/d}, newly added in the LoongArch Reference Manual v1.10, so that a ll/sc atomic sequence using them won't loop forever when being debugged. Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
2025-08-24gdb: allow gdb.Color to work correctly with paginationAndrew Burgess4-12/+195
This commit allows gdb.Color objects to be used to style output from GDB commands written in Python, and the styled output should work correctly with pagination. There are two parts to fixing this: First, GDB needs to be able to track the currently applied style within the pager_file class. This means that style changes need to be achieved with calls to pager_file::emit_style_escape. Now usually, GDB does this by calling something like fprintf_styled, which takes care to apply the style for us. However, that's not really an option here as a gdb.Color isn't a full style, and as the gdb.Color object is designed to be converted directly into escape sequences that can then be printed, we really need a solution that works with this approach. However pager_file::puts already has code in place to handle escape sequences. Right now all this code does is spot the escape sequence and append it to the m_wrap_buffer. But in this commit I propose that we go one step further, parse the escape sequence back into a ui_file_style object in pager_file::puts, and then we can call pager_file::emit_style_escape. If the parsing doesn't work then we can just add the escape sequence to m_wrap_buffer as we did before. But wait, how can this work if a gdb.Color isn't a full style? Turns out that's not a problem. We only ever emit the escape sequence for those parts of a style that need changing, so a full style that sets the foreground color will emit the same escape sequence as a gdb.Color for the foreground. When we convert the escape sequence back into a ui_file_style, then we get a style with everything set to default, except the foreground color. I had hoped that this would be all that was needed. But unfortunately this doesn't work because of the second problem... ... 
the implementation of the Python function gdb.write() calls gdb_printf(), which calls gdb_vprintf(), which calls ui_file::vprintf, which calls ui_out::vmessage, which calls ui_out::call_do_message, and finally we reach cli_ui_out::do_message. This final do_message function does this: ui_file *stream = m_streams.back (); stream->emit_style_escape (style); stream->puts (str.c_str ()); stream->emit_style_escape (ui_file_style ()); If we imagine the case where we are emitting a style, triggered from Python like this: gdb.write(gdb.Color('red').escape_sequence(True)) the STYLE in this case will be the default ui_file_style(), and STR will hold the escape sequence we are writing. After the first change, where pager_file::puts now calls pager_file::emit_style_escape, the current style of STREAM will have been updated. But this means that the final emit_style_escape will now restore the default style. The fix for this is to avoid using the high level gdb_printf from gdb.write(), and use gdb_puts instead. The gdb_puts function doesn't restore the default style, which means our style modification survives. There's a new test included. This test includes what appears to be a pointless extra loop (looping over a single value), but this makes sense given the origin of this patch. I've pulled this commit from a longer series: https://inbox.sourceware.org/gdb-patches/cover.1755080429.git.aburgess@redhat.com I want to get this bug fix merged before GDB 17 branches, but the longer series is not getting reviews, so for now I'm just merging this one fix. Once the rest of the series gets merged, I'll be extending the test, and the loop (mentioned above) will now loop over more values.
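The interaction between the two problems can be modelled in a few lines. This is a hypothetical Python stand-in for the pager stream, not GDB's C++ classes; the escape string for red is assumed to be "\033[31m", and the names Stream, write_like_printf and write_like_puts are invented for illustration.

```python
DEFAULT = {}  # the default style: nothing set
RED_ESCAPE = "\033[31m"  # assumed escape sequence for a red foreground


class Stream:
    """Hypothetical stand-in for a pager stream that tracks its style."""

    def __init__(self):
        self.style = DEFAULT

    def emit_style_escape(self, style):
        self.style = style

    def puts(self, text):
        # Model of the first change: an escape sequence seen by puts is
        # parsed back into a style and applied via emit_style_escape.
        if text == RED_ESCAPE:
            self.emit_style_escape({"foreground": "red"})


def write_like_printf(stream, text):
    # Mirrors the do_message sequence quoted above.
    stream.emit_style_escape(DEFAULT)
    stream.puts(text)
    stream.emit_style_escape(DEFAULT)  # restores the default style


def write_like_puts(stream, text):
    stream.puts(text)  # no trailing style reset
```

With the printf-like path the style set by the escape sequence is immediately undone, while the puts-like path leaves it in effect for the text that follows.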
2025-08-23Update comment in rust-parse.cTom Tromey1-1/+1
I noticed an out-of-date comment in rust-parse.c.