|
A user noted that "set environment" does not affect things like
"shell" or "pipe". This makes some sense, because the environment is
per-inferior, and also in the general case it may be affecting a
remote machine -- i.e., the settings may not be locally appropriate.
This patch adds a new set of "local-environment" commands (set, show,
and unset) that affect gdb's own environment.
Approved-By: Andrew Burgess <aburgess@redhat.com>
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-by: Kévin Le Gouguec <legouguec@adacore.com>
|
|
In a review early in the year:
https://inbox.sourceware.org/gdb-patches/874iz3f4ek.fsf@redhat.com/
Andrew pointed out that a new test I proposed failed with read1.
This test was essentially a copy of gdb.base/environ.exp.
I couldn't reproduce the read1 problem, but this patch rewrites the
one test there that seems like it could possibly fail in this mode.
Now it uses line-by-line mode and checks for a certain environment
variable appearing in the output.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Simplify regexps in test-case gdb.multi/remote-with-running-inferior.exp using
string_to_regexp and {}.
While we're at it, also break an overly long line.
Tested on x86_64-linux.
|
|
Fix two typos in test-case gdb.multi/remote-with-running-inferior.exp.
|
|
Occasionally, with test-case gdb.multi/remote-with-running-inferior.exp I run
into:
...
(gdb) continue^M
Continuing.^M
^M
Thread 1.1 "remote-with-run" hit Breakpoint 1.1, main () at \
remote-with-running-inferior.c:31^M
31 for (int i = 0; i < 30; ++i)^M
(gdb) PASS: $exp: target_non_stop=auto: non_stop=off: \
continue to breakpoint: continue to breakpoint in breakpt
bt 1^M
(gdb) FAIL: $exp: target_non_stop=auto: non_stop=off: \
check inferior 1 is in breakpt
...
The problem is a race between:
- inferior 1 reaching main, and
- gdb setting a breakpoint on main.
If the breakpoint on main is set before inferior 1 reaches main, the
breakpoint triggers for inferior 1, which the test-case doesn't expect.
Fix this by setting the breakpoint on main only for inferior 2.
Tested on x86_64-linux.
PR testsuite/33621
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33621
|
|
I noticed that when a command like 'list foo.c:10' displays multiple
files, the symbol would always be shown as "???", e.g.:
file: "/tmp/foo.c", line number: 10, symbol: "???"
This is because, when the symtab_and_line is created for 'foo.c:10',
the pc and symbol are never filled in.
In this commit, I propose that, when we decide that the above header
line needs to be printed, we should attempt to lookup a symbol for the
relevant line, and if one is found, we can use that.
The symbol lookup is done by first calling find_pc_for_line, and then
using find_symbol_for_pc to find a suitable symbol.
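As a rough sketch of that two-step lookup, with stand-in types and
stubbed helpers (the real find_pc_for_line / find_symbol_for_pc
prototypes in gdb differ):
...
#include <cstdint>
#include <optional>
#include <string>

// Stand-in for gdb's symtab_and_line, for illustration only.
struct sal
{
  std::string file;
  int line = 0;
  std::uint64_t pc = 0;
  std::string symbol = "???";
};

// Hypothetical stubs standing in for the real lookup helpers.
std::optional<std::uint64_t> find_pc_for_line (const std::string &, int)
{ return 0x401000; }
std::optional<std::string> find_symbol_for_pc (std::uint64_t)
{ return "main"; }

// Before printing the per-file header line, try to fill in pc and
// symbol so the header no longer shows "???".
void fill_header_symbol (sal &s)
{
  if (auto pc = find_pc_for_line (s.file, s.line))
    {
      s.pc = *pc;
      if (auto sym = find_symbol_for_pc (*pc))
        s.symbol = *sym;
    }
}
...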
Approved-By: Tom Tromey <tom@tromey.com>
|
|
The gdb.rocm/break-kernel-no-debug-info.exp builds a testcase without
debug info, and loads the binary in GDB with:
gdb_test "file $::binfile" ".*No debugging symbols.*" "load file"
On some configurations (SLES with glibc-devel installed), the compiler
links /usr/lib64/crt1.o (which contains some debug information) into
the final binary. As a result, the binary contains "some" debug
information, which makes the above pattern fail. The testcase is still
valid as we only really need the GPU code object (embedded in the main
binary) to not contain debug information, so this patch proposes to
relax this first step by using "gdb_load" instead.
Tested on x86_64-linux + AMDGPU gfx1031.
Change-Id: Id903f62e170af69c001b39f4681602a54e2fdaf1
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This commit:
commit c7a45b98a61451f05ff654c4fb72a9c9cb2fba36
Date: Thu Jun 12 15:37:50 2025 +0000
gdb, linespec: avoid multiple locations with same PC
broke GDB's ability to list multiple source files using a 'list'
command. In GDB 16 and earlier something like 'list foo.c:10' could
print multiple results if there were multiple 'foo.c' files compiled
into the executable.
The above commit added a filter to add_sal_to_sals (linespec.c) such
that multiple sals in the same program space, but with the same pc
value, could not be added; only the first such sal would actually be
recorded. The problem with this is that add_sal_to_sals is used from
decode_digits_list_mode (also linespec.c), where the pc value is forced
to zero. Forcing the pc to zero makes sense, I think, as there might
not be any compiled code for the requested line (this is for 'list'
after all), so there might not be a valid pc to use.
I'm not a fan of using '0' as a special pc value -- there are embedded
targets where 0 is a valid pc value -- but given we're already using 0
here, I propose to just roll with it.
So, my proposal is that, if the pc is 0, add_sal_to_sals should always
add the sal. This fixes the decode_digits_list_mode case, while
keeping the fix that c7a45b98a614 introduced.
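A simplified sketch of the resulting rule (not the actual
add_sal_to_sals code): a sal whose pc is 0 is always appended, while
sals with a real pc are still deduplicated per program space:
...
#include <cstdint>
#include <vector>

struct sal { int pspace; std::uint64_t pc; int line; };

// Sketch of the dedup rule: pc == 0 (as used by decode_digits_list_mode)
// bypasses the duplicate check; real pcs are deduplicated per pspace.
void add_sal (std::vector<sal> &sals, const sal &s)
{
  if (s.pc != 0)
    for (const sal &other : sals)
      if (other.pspace == s.pspace && other.pc == s.pc)
        return;   // Same location already recorded, skip it.
  sals.push_back (s);
}
...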
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33647
Approved-By: Tom Tromey <tom@tromey.com>
Reviewed-By: Kevin Buettner <kevinb@redhat.com>
|
|
On armv7hl-suse-linux-gnu, I run into:
...
(gdb) p/f val.oct
$26 = -5.9822653797615723e-120
(gdb) FAIL: gdb.base/long_long.exp: p/f val.oct
...
There's some complicated logic here in the test-case:
...
if { $sizeof_double == 8 || $sizeof_long_double == 8 } {
    # ARM FPA floating point numbers are not strictly little endian or big
    # endian, but a hybrid. They are in little endian format with the two
    # words swapped in big endian format.
    # EABI targets default to natural-endian VFP format.
    if { ([istarget "arm*-*-*"]) \
             && !([istarget "*-*-*eabi*"] || \
                  [istarget "*-*-mingw32ce*"] || \
                  [istarget "*-*-openbsd*"]) } then {
        # assume the long long represents a floating point double in ARM format
        gdb_test "p/f val.oct" "2.1386676354387559e\\+265"
    } else {
        # assume the long long represents a floating point double in little
        # endian format
        gdb_test "p/f val.oct" "-5.9822653797615723e-120"
    }
} else {
    gdb_test "p/f val.oct" "-2.42716126e-15"
}
...
which makes the test-case expect the ARM FPA variant.
I think trying to pin down the exact circumstances under which we have ARM FPA
is unnecessary, so I could live with:
...
set re_val_oct_f [string_to_regexp "-5.9822653797615723e-120"]
if { [istarget "arm*-*-*"] } {
    set re_val_oct_f_arm_fpa [string_to_regexp "2.1386676354387559e+265"]
    set re_val_oct_f "$re_val_oct_f|$re_val_oct_f_arm_fpa"
}
...
But instead, I propose to just drop the arm-specific part.
I doubt whether an ARM FPA setup is still part of any test matrix. And if it
is, then I expect other tests to fail as well for the same reason. The ARM
FPA logic is used only once in the test-case, but -5.9822653797615723e-120 is
used more frequently.
Indeed, I wonder the same about setups where -2.42716126e-15 is expected.
On such a setup, won't this fail as well:
...
gdb_test "x/2gf g" "3.5127005640885037e-303.*-5.9822653797615723e-120"
...
?
Unfortunately I'm not sure what setup to use to trigger the -2.42716126e-15
case.
Fix this as well as possible by choosing once between -5.9822653797615723e-120
and -2.42716126e-15, assigning the chosen value to re_val_oct_f, and using it
everywhere those constants are used.
Tested on x86_64-linux.
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
PR testsuite/33669
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33669
|
|
If an Ada modular type is the same size as gdb's own ULONGEST, ptype
will show "mod 0". This happens because ada_modulus does:
return (ULONGEST) high.const_val () + 1;
which wraps around to zero when the upper bound is the maximal
ULONGEST value.
This patch cleans this up, replacing ada_modulus with a function to
return the upper bound (if available), and then fixing the various
callers. The type-printing caller still does the "+1", but now this
is done with a gdb_mpz.
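The underlying overflow is easy to reproduce outside gdb; assuming
ULONGEST is a 64-bit unsigned type, adding 1 to the maximal upper bound
wraps to 0, which is where the "mod 0" came from:
...
#include <cstdint>
#include <cstdio>

int main ()
{
  using ULONGEST = std::uint64_t;   // gdb's ULONGEST is a 64-bit unsigned type.
  ULONGEST high = UINT64_MAX;       // Upper bound of a full-width modular type.
  ULONGEST modulus = high + 1;      // Unsigned overflow wraps around to 0.
  std::printf ("mod %llu\n", (unsigned long long) modulus);  // Prints "mod 0".
  return 0;
}
...
Doing the "+1" in a gdb_mpz (arbitrary precision) avoids the wraparound.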
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33690
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
I recently added two requires in test-case gdb.python/py-missing-objfile.exp:
...
# The following tests assume that the build-ids of binfile and libfile can be
# found in the core file.
require {expect_build_id_in_core_file $binfile}
require {expect_build_id_in_core_file $libfile}
...
However, at the point where the check is done, the files are no longer
available at that location, which makes the require fail.
First, make the problem visible, by making proc expect_build_id_in_core_file
throw an error if the filename argument specifies a non-existing file.
Then, fix test-case gdb.python/py-missing-objfile.exp by moving the calls to
expect_build_id_in_core_file to a point where the files exist.
Tested on x86_64-linux.
|
|
The comment says that it returns 0 if the target supports SME and 1 if
it doesn't, but it actually does the opposite. The code does the right
thing, so fix the comment.
|
|
This bug points out that a template parameter that is a constant is
not printed by 'ptype' -- in fact, only type parameters are displayed.
However, there's no real reason for this, and any template parameter
that the DWARF reader can turn into a symbol should be printable.
Note that some parameters are still missing from the DWARF; see the
kfails in the test for details.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33670
Reviewed-By: Christina Schimpe <christina.schimpe@intel.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
With test-case gdb.base/until-trailing-insns.exp I run into:
...
ERROR: tcl error code TCL READ VARNAME
ERROR: can't read "_line_unit_advertised_version": no such variable
while executing
"if {$_line_unit_advertised_version == "default"} {
set _line_unit_advertised_version $_line_unit_version
}"
...
Fix this by using the correct variable name in two places in Dwarf::lines:
use _line_unit_advertised_version instead of _line_advertised_unit_version.
Tested on x86_64-linux.
|
|
On openSUSE Leap 15.6 x86_64, with gcc 7 and test-case
gdb.base/condbreak-multi-context.exp I run into:
...
(gdb) print aaa^M
$3 = <optimized out>^M
(gdb) FAIL: $exp: start_before=true: scenario_1: print aaa
...
This is a regression since commit 86ac8c54623 ("Convert
lookup_symbol_in_objfile").
Likewise in test-cases gdb.cp/m-static.exp and gdb.cp/namespace.exp.
The failure is specific to using Dwarf v4:
- using target board unix/gdb:debug_flags=-gdwarf-5 fixes it
- using target board unix/gdb:debug_flags=-gdwarf-4 on Tumbleweed (with gcc 15
and Dwarf v5 default) triggers it
The variable we're trying to print, A::aaa, is a static const int member:
...
class A : public Base
{
public:
static const int aaa = 10;
...
};
...
With Dwarf v5, we have this DIE:
...
<2><356>: Abbrev Number: 2 (DW_TAG_variable)
<357> DW_AT_name : aaa
<35c> DW_AT_linkage_name: _ZN1A3aaaE
<364> DW_AT_external : 1
<364> DW_AT_accessibility: 1 (public)
<364> DW_AT_declaration : 1
<364> DW_AT_const_value : 10
...
and the cooked index contains these corresponding entries:
...
[45] ((cooked_index_entry *) 0x7facf0004730)
name: _ZN1A3aaaE
canonical: _ZN1A3aaaE
qualified: _ZN1A3aaaE
DWARF tag: DW_TAG_variable
flags: 0x4 [IS_LINKAGE]
DIE offset: 0x356
parent: ((cooked_index_entry *) 0)
[52] ((cooked_index_entry *) 0x7facf0004700)
name: aaa
canonical: aaa
qualified: A::aaa
DWARF tag: DW_TAG_variable
flags: 0x0 []
DIE offset: 0x356
parent: ((cooked_index_entry *) 0x7facf00046d0) [A]
...
With Dwarf v4, we have instead the following DIE:
...
<2><350>: Abbrev Number: 3 (DW_TAG_member)
<351> DW_AT_name : aaa
<35b> DW_AT_external : 1
<35b> DW_AT_accessibility: 1 (public)
<35c> DW_AT_declaration : 1
<35c> DW_AT_const_value : 4 byte block: a 0 0 0
...
and there are no corresponding entries.
Fix this by adding an entry:
...
[47] ((cooked_index_entry *) 0x7f5a24004660)
name: aaa
canonical: aaa
qualified: A::aaa
DWARF tag: DW_TAG_member
flags: 0x0 []
DIE offset: 0x350
parent: ((cooked_index_entry *) 0x7f5a24004630) [A]
...
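As a rough illustration of the condition involved (hypothetical helper
and field names, not the actual DWARF scanner code), a declaration-only
DW_TAG_member carrying a constant value now has to be treated as worth
indexing:
...
// Hypothetical sketch: decide whether a DW_TAG_member DIE should get a
// cooked index entry.  The real reader works on its own DIE structures.
struct die_info
{
  int tag;               // e.g. DW_TAG_member
  bool has_declaration;  // DW_AT_declaration present
  bool has_const_value;  // DW_AT_const_value present
};

constexpr int DW_TAG_member = 0x0d;

bool member_needs_index_entry (const die_info &die)
{
  // A static data member emitted only as a declaration with a constant
  // value has no DW_TAG_variable definition elsewhere, so index it here.
  return die.tag == DW_TAG_member
         && die.has_declaration
         && die.has_const_value;
}
...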
Add a regression test in the form of a dwarf assembly test-case printing the
value of variable A::aaa.
In the test-case, for A::aaa, DW_FORM_flag is used to encode
DW_AT_declaration. In v1 of this patch that mattered (because of using
has_hardcoded_declaration in abbrev_table::read), but that's no longer the
case. Nevertheless, also add an A::bbb using DW_FORM_flag_present for
DW_AT_declaration (and while we're at it, DW_AT_external as well).
Also add two variants, for which <optimized out> is printed:
- A::ccc: a variant with DW_AT_location instead of DW_AT_const_value, and
- A::ddd: a variant without external.
This behavior is not part of the regression. I've reproduced it using a
system gdb based on 14.2. It's also not specific to using the cooked index,
it reproduces with -readnow as well.
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
PR symtab/33415
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33415
|
|
The Free Pascal Compiler fpc supports generating different versions of DWARF:
...
$ fpc -h
Free Pascal Compiler version 3.2.2 [2025/09/10] for x86_64
...
-gw Generate DWARFv2 debug information (same as -gw2)
-gw2 Generate DWARFv2 debug information
-gw3 Generate DWARFv3 debug information
-gw4 Generate DWARFv4 debug information (experimental)
...
The v4 support is experimental, and indeed the line number information is
broken (missing maximum_operations_per_instruction field in the .debug_line
header), so setting a breakpoint on a line number is not possible:
...
$ fpc -gw4 hello.pas
...
$ gdb -q hello
Reading symbols from hello...
(gdb) b hello.pas:8
No compiled code for line 8 in file "hello.pas".
Make breakpoint pending on future shared library load? (y or [n])
...
The brokenness is detected by llvm-dwarfdump (second warning):
...
$ llvm-dwarfdump -debug-line hello
hello: file format elf64-x86-64
.debug_line contents:
debug_line[0x00000000]
warning: parsing line table prologue at offset 0x0 found opcode base of 0. \
Assuming no standard opcodes
warning: unknown data in line table prologue at offset 0x0: parsing ended (at \
offset 0x00000017) before reaching the prologue end at offset 0x2a
...
Likewise, detect the situation that the second warning describes in
dwarf_decode_line_header, so that we instead get:
...
(gdb) b hello.pas:8
malformed line number program header, advertised length does not match \
actual length
(gdb)
...
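The new check boils down to comparing the header length advertised in
.debug_line with the number of bytes actually consumed while parsing
it. A minimal, self-contained illustration of that idea (not the real
dwarf_decode_line_header code):
...
#include <cstddef>
#include <stdexcept>

// Sketch: after parsing the standard header fields, the cursor must end
// up exactly at header_start + advertised_header_length.
void check_line_header (const unsigned char *header_start,
                        std::size_t advertised_header_length,
                        const unsigned char *cursor_after_parsing)
{
  const unsigned char *advertised_end
    = header_start + advertised_header_length;
  if (cursor_after_parsing != advertised_end)
    throw std::runtime_error ("malformed line number program header, "
                              "advertised length does not match actual length");
}
...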
Tested on x86_64-linux.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This is the result of:
...
$ git rm -f $(find gdb* -name "*.ac")
$ git rm -f $(find gdb* -name "*.m4")
$ git commit -a -m tmp
$ git revert HEAD
$ git rebase --whitespace=fix HEAD^
$ git reset --soft HEAD^
$ git commit --amend
...
and running autoreconf -f in directories gdb, gdb/testsuite, gdbsupport and
gdbserver.
Tested on x86_64-linux.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
The .size calculation for the function bar incorrectly uses the label func
instead of bar. Fix it.
Signed-off-by: Yu-Cheng Liang <yclwlcy@gmail.com>
|
|
Evaluating an Objective-C "NSString" literal will cause gdb to crash.
This patch fixes the crash.
I think the result here still isn't correct -- I see a warning from
the runtime ("autorelease called without pool for object") with the
new code.
However, not crashing is an improvement on its own.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=20501
|
|
Once the gdb.objc tests compile, you can see that they do not pass.
The issue is primarily that they are over-strict about line numbers.
This fixes the problem by replacing these line numbers with $decimal.
|
|
The gdb.objc tests haven't compiled in years. This patch fixes this,
based on a comment in bug 31671.
I don't know whether this approach works with the clang implementation
of Objective-C. However, it does work with GCC, provided that
gnustep-base is installed.
|
|
This patch applies some minor formatting changes and cleanups to the
gdb.objc tests.
|
|
Tom pointed out that gdb.ada/recursive-access.exp causes a compiler
crash with gcc 7.5. This patch adjusts the test case to skip testing
in this case.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33664
Approved-By: Tom de Vries <tdevries@suse.de>
|
|
New in v2:
- make the test try with indexes by default
- use uint8_t instead of unsigned char
In some specific circumstances, it is possible for GDB to read a type
unit from a .dwo file without ever reading in the section of the stub in
the main file. In that case, calling any of these methods:
- dwarf2_per_cu::addr_size()
- dwarf2_per_cu::offset_size()
- dwarf2_per_cu::ref_addr_size()
will cause a crash, because they will try to read the unit header from
the not-read-in section buffer. See the previous patch for more
details.
The remaining calls to these methods are in the loc.c and expr.c
files. That is, in the location and expression machinery. It is
possible to set things up to cause them to trigger a crash, as shown by
the new test, when running it with the cc-with-gdb-index board:
$ make check TESTS="gdb.dwarf2/fission-type-unit-locexpr.exp" RUNTESTFLAGS="--target_board=cc-with-gdb-index"
Running /home/simark/src/binutils-gdb/gdb/testsuite/gdb.dwarf2/fission-type-unit-locexpr.exp ...
ERROR: GDB process no longer exists
The backtrace at the moment of the crash is:
#0 0x0000555566968b1f in bfd_getl32 (p=0x78) at /home/simark/src/binutils-gdb/bfd/libbfd.c:846
#1 0x00005555642e51b7 in read_initial_length (abfd=0x7d1ff1eb0e40, buf=0x78 <error: Cannot access memory at address 0x78>, bytes_read=0x7bfff09daca0, handle_nonstd=true)
at /home/simark/src/binutils-gdb/gdb/dwarf2/leb.c:92
#2 0x00005555647ca584 in read_unit_head (header=0x7d0ff1e06c70, info_ptr=0x78 <error: Cannot access memory at address 0x78>, section=0x7c3ff1dea7d0, section_kind=ruh_kind::COMPILE)
at /home/simark/src/binutils-gdb/gdb/dwarf2/unit-head.c:44
#3 0x000055556452df18 in dwarf2_per_cu::get_header (this=0x7d0ff1e06c40) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:18531
#4 0x000055556452e10e in dwarf2_per_cu::addr_size (this=0x7d0ff1e06c40) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:18544
#5 0x0000555564314ac3 in dwarf2_locexpr_baton_eval (dlbaton=0x7bfff0c9a508, frame=..., addr_stack=0x7bfff0b59150, valp=0x7bfff0c9a430, push_values=..., is_reference=0x7bfff0d33030)
at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1593
#6 0x0000555564315bd2 in dwarf2_evaluate_property (prop=0x7bfff0c9a450, initial_frame=..., addr_stack=0x7bfff0b59150, value=0x7bfff0c9a430, push_values=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1668
#7 0x0000555564a14ee1 in resolve_dynamic_field (field=..., addr_stack=0x7bfff0b59150, frame=...) at /home/simark/src/binutils-gdb/gdb/gdbtypes.c:2758
#8 0x0000555564a15e24 in resolve_dynamic_struct (type=0x7e0ff1f02550, addr_stack=0x7bfff0b59150, frame=...) at /home/simark/src/binutils-gdb/gdb/gdbtypes.c:2839
#9 0x0000555564a17061 in resolve_dynamic_type_internal (type=0x7e0ff1f02550, addr_stack=0x7bfff0b59150, frame=..., top_level=true) at /home/simark/src/binutils-gdb/gdb/gdbtypes.c:2972
#10 0x0000555564a17899 in resolve_dynamic_type (type=0x7e0ff1f02550, valaddr=..., addr=0x4010, in_frame=0x7bfff0d32e60) at /home/simark/src/binutils-gdb/gdb/gdbtypes.c:3019
#11 0x000055556675fb34 in value_from_contents_and_address (type=0x7e0ff1f02550, valaddr=0x0, address=0x4010, frame=...) at /home/simark/src/binutils-gdb/gdb/value.c:3674
#12 0x00005555666ce911 in get_value_at (type=0x7e0ff1f02550, addr=0x4010, frame=..., lazy=1) at /home/simark/src/binutils-gdb/gdb/valops.c:992
#13 0x00005555666ceb89 in value_at_lazy (type=0x7e0ff1f02550, addr=0x4010, frame=...) at /home/simark/src/binutils-gdb/gdb/valops.c:1039
#14 0x000055556491909f in language_defn::read_var_value (this=0x5555725fce40 <minimal_language_defn>, var=0x7e0ff1f02500, var_block=0x7e0ff1f025d0, frame_param=...)
at /home/simark/src/binutils-gdb/gdb/findvar.c:504
#15 0x000055556491961b in read_var_value (var=0x7e0ff1f02500, var_block=0x7e0ff1f025d0, frame=...) at /home/simark/src/binutils-gdb/gdb/findvar.c:518
#16 0x00005555666d1861 in value_of_variable (var=0x7e0ff1f02500, b=0x7e0ff1f025d0) at /home/simark/src/binutils-gdb/gdb/valops.c:1384
#17 0x00005555647f7099 in evaluate_var_value (noside=EVAL_NORMAL, blk=0x7e0ff1f025d0, var=0x7e0ff1f02500) at /home/simark/src/binutils-gdb/gdb/eval.c:533
#18 0x00005555647f740c in expr::var_value_operation::evaluate (this=0x7c2ff1e3b690, expect_type=0x0, exp=0x7c2ff1e3aa00, noside=EVAL_NORMAL) at /home/simark/src/binutils-gdb/gdb/eval.c:559
#19 0x00005555647f3347 in expression::evaluate (this=0x7c2ff1e3aa00, expect_type=0x0, noside=EVAL_NORMAL) at /home/simark/src/binutils-gdb/gdb/eval.c:109
#20 0x000055556543ac2f in process_print_command_args (args=0x7fffffffe728 "global_var", print_opts=0x7bfff0be4a30, voidprint=true) at /home/simark/src/binutils-gdb/gdb/printcmd.c:1328
#21 0x000055556543ae65 in print_command_1 (args=0x7fffffffe728 "global_var", voidprint=1) at /home/simark/src/binutils-gdb/gdb/printcmd.c:1341
#22 0x000055556543b707 in print_command (exp=0x7fffffffe728 "global_var", from_tty=1) at /home/simark/src/binutils-gdb/gdb/printcmd.c:1408
The problem to solve is: in order to evaluate a location expression, we
need to know some information (the various sizes) found in the unit
header. In that context, it's not possible to get it from
dwarf2_cu::header, like the previous patch did: at the time the
expression is evaluated, the corresponding dwarf2_cu might have been
freed. We don't want to re-build a dwarf2_cu just for that, it would be
very inefficient. We could force reading in the dwarf2_per_cu::section
section (in the main file), but we never needed to read that section
before, so it would be better to avoid reading it unnecessarily.
My initial attempt was to store this information in baton objects
(dwarf2_locexpr_baton & co), so that it can be retrieved when the time
comes to evaluate the expressions. However, it quickly became obvious
that storing it there would be redundant and wasteful.
I instead opted to store this information directly inside dwarf2_per_cu,
making it easily available when evaluating expressions. These fields
initially have the value 0, and are set by cutu_reader whenever the
unit is parsed. The various getters (dwarf2_per_cu::addr_size & al) now
just return these fields.
Doing so allows removing anything related to reading the header from
dwarf2_per_cu, which I think is a nice simplification. This means that
nothing ever needs to read the header from just a dwarf2_per_cu.
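A simplified sketch of the caching idea (member and function names here
are illustrative, not the actual dwarf2_per_cu layout): the sizes start
out as 0 and are filled in when the reader parses the unit, so the
getters never need to touch a section again:
...
#include <cstdint>

// Illustrative stand-in for dwarf2_per_cu: the sizes are cached when
// cutu_reader parses the unit header, instead of being re-read from a
// possibly not-read-in section.
struct per_cu_sketch
{
  std::uint8_t m_addr_size = 0;     // 0 means "not parsed yet".
  std::uint8_t m_offset_size = 0;

  void set_sizes (std::uint8_t addr_size, std::uint8_t offset_size)
  {
    m_addr_size = addr_size;
    m_offset_size = offset_size;
  }

  std::uint8_t addr_size () const { return m_addr_size; }
  std::uint8_t offset_size () const { return m_offset_size; }
  std::uint8_t ref_addr_size (int dwarf_version) const
  {
    // DWARF >= 3 uses the offset size for DW_FORM_ref_addr; DWARF 2
    // used the address size.
    return dwarf_version >= 3 ? m_offset_size : m_addr_size;
  }
};
...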
It also happens to shrink the dwarf2_per_cu object size a bit, going
from:
(top-gdb) p sizeof(dwarf2_per_cu)
$1 = 176
to
(top-gdb) p sizeof(dwarf2_per_cu)
$1 = 120
I placed the new fields at this strange location in dwarf2_per_cu
because there happened to be a nice 3-byte hole there (on Linux amd64
at least).
The new test sets things up as described previously. Note that the crash
only occurs if using the cc-with-gdb-index board.
Change-Id: I50807a1bbb605f0f92606a9e61c026e3376a4fcf
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
New in v2: make the test try with indexes by default
This patch fixes a crash caused by GDB trying to read from a section not
read in. The bug happens under these specific circumstances:
- reading a type unit from .dwo
- that type unit has a stub in the main file
- there is a GDB index (.gdb_index) present
This crash is the cause of the following test failure, with the
cc-with-gdb-index target board:
$ make check TESTS="gdb.dwarf2/fission-reread.exp" RUNTESTFLAGS="--target_board=cc-with-gdb-index"
Running /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.dwarf2/fission-reread.exp ...
ERROR: GDB process no longer exists
Or, manually:
$ ./gdb -nx -q --data-directory=data-directory /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/fission-reread/fission-reread -ex "p 1"
Reading symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/fission-reread/fission-reread...
Fatal signal: Segmentation fault
For this last one, you need to interrupt the test (e.g. add a return)
before the test deletes the .dwo file.
The backtrace at the moment of the crash is:
#0 0x0000555566968f7f in bfd_getl32 (p=0x0) at /home/simark/src/binutils-gdb/bfd/libbfd.c:846
#1 0x00005555642e561d in read_initial_length (abfd=0x7d1ff1eb0e40, buf=0x0, bytes_read=0x7bfff0962fa0, handle_nonstd=true) at /home/simark/src/binutils-gdb/gdb/dwarf2/leb.c:92
#2 0x00005555647ca9ea in read_unit_head (header=0x7d0ff1e068b0, info_ptr=0x0, section=0x7c3ff1dea7d0, section_kind=ruh_kind::COMPILE) at /home/simark/src/binutils-gdb/gdb/dwarf2/unit-head.c:44
#3 0x000055556452e37e in dwarf2_per_cu::get_header (this=0x7d0ff1e06880) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:18531
#4 0x000055556452e574 in dwarf2_per_cu::addr_size (this=0x7d0ff1e06880) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:18544
#5 0x000055556406af91 in dwarf2_cu::addr_type (this=0x7d7ff1e20880) at /home/simark/src/binutils-gdb/gdb/dwarf2/cu.c:124
#6 0x0000555564534e48 in set_die_type (die=0x7e0ff1f23dd0, type=0x7e0ff1f027f0, cu=0x7d7ff1e20880, skip_data_location=false) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:19020
#7 0x00005555644dcc7b in read_structure_type (die=0x7e0ff1f23dd0, cu=0x7d7ff1e20880) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:11239
#8 0x000055556451c834 in read_type_die_1 (die=0x7e0ff1f23dd0, cu=0x7d7ff1e20880) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:16878
#9 0x000055556451c5e0 in read_type_die (die=0x7e0ff1f23dd0, cu=0x7d7ff1e20880) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:16861
#10 0x0000555564526f3a in get_signatured_type (die=0x7e0ff1f0ffb0, signature=10386129560629316377, cu=0x7d7ff1e0f480) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:17998
#11 0x000055556451c23b in lookup_die_type (die=0x7e0ff1f0ffb0, attr=0x7e0ff1f10008, cu=0x7d7ff1e0f480) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:16808
#12 0x000055556451b2e9 in die_type (die=0x7e0ff1f0ffb0, cu=0x7d7ff1e0f480) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:16684
#13 0x000055556451457f in new_symbol (die=0x7e0ff1f0ffb0, type=0x0, cu=0x7d7ff1e0f480, space=0x0) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:16089
#14 0x00005555644c52a4 in read_variable (die=0x7e0ff1f0ffb0, cu=0x7d7ff1e0f480) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:9119
#15 0x0000555564494072 in process_die (die=0x7e0ff1f0ffb0, cu=0x7d7ff1e0f480) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:5197
#16 0x000055556449c88e in read_file_scope (die=0x7e0ff1f0fdd0, cu=0x7d7ff1e0f480) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:6125
#17 0x0000555564493671 in process_die (die=0x7e0ff1f0fdd0, cu=0x7d7ff1e0f480) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:5098
#18 0x00005555644912f5 in process_full_comp_unit (cu=0x7d7ff1e0f480) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:4851
#19 0x0000555564485e18 in process_queue (per_objfile=0x7d6ff1e71100) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:4161
#20 0x000055556446391d in dw2_do_instantiate_symtab (per_cu=0x7ceff1de42d0, per_objfile=0x7d6ff1e71100, skip_partial=true) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:1650
#21 0x0000555564463b3c in dw2_instantiate_symtab (per_cu=0x7ceff1de42d0, per_objfile=0x7d6ff1e71100, skip_partial=true) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:1671
#22 0x00005555644687fd in dwarf2_base_index_functions::expand_all_symtabs (this=0x7c1ff1e04990, objfile=0x7d5ff1e46080) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:1990
#23 0x0000555564381050 in cooked_index_functions::expand_all_symtabs (this=0x7c1ff1e04990, objfile=0x7d5ff1e46080) at /home/simark/src/binutils-gdb/gdb/dwarf2/cooked-index.h:237
#24 0x0000555565df5b0d in objfile::expand_all_symtabs (this=0x7d5ff1e46080) at /home/simark/src/binutils-gdb/gdb/symfile-debug.c:372
#25 0x0000555565eafc4a in maintenance_expand_symtabs (args=0x0, from_tty=1) at /home/simark/src/binutils-gdb/gdb/symmisc.c:914
The main file contains a stub (skeleton) for a compilation unit and a
stub for a type unit. The .dwo file contains a compilation unit and a
type unit matching those stubs. When doing the initial scan of the main
file, the DWARF reader parses the CU/TU list from the GDB index
(.gdb_index), and thus creates a signatured_type object based on that.
The section field of this signatured_type points to the .debug_types
section in the main file, the one containing the stub. And because GDB
trusts the GDB index, it never needs to look at that .debug_types
section in the main file. That section remains not read in.
When expanding the compilation unit, GDB encounters a type unit
reference (by signature) corresponding to the type in the type unit. We
get in lookup_dwo_signatured_type, trying to see if there is a type unit
matching that signature in the current .dwo file. We proceed to read
and expand that type unit, until we eventually get to a
dwarf2_cu::addr_type() call, which does:
int addr_size = this->per_cu->addr_size ();
dwarf2_per_cu::addr_size() tries to read the header from the section
pointed to by dwarf2_per_cu::section which, if you recall, is the
.debug_types section in the main file that was never read in. That
causes the segfault.
All this was working fine before these patches of mine, which tried to do
some cleanups:
a47e2297fc28 ("gdb/dwarf: pass section offset to dwarf2_per_cu_data constructor")
c44ab627b021 ("gdb/dwarf: pass section to dwarf2_per_cu_data constructor")
39ee8c928551 ("gdb/dwarf: pass unit length to dwarf2_per_cu_data constructor")
Before these patches, the fill_in_sig_entry_from_dwo_entry function
(called from lookup_dwo_signatured_type, among others) would overwrite
some dwarf2_per_cu fields (including the section) to point to the .dwo,
rather than represent what's in the main file. Therefore, the header
would have been read from the unit in the .dwo file, and things would
have been fine.
When doing these changes, I mistakenly assumed that the section written
by fill_in_sig_entry_from_dwo_entry was the same as the section already
there, which is why I removed the statements overwriting the section
field (and the two others). In my defense, here's the comment on
dwarf2_per_cu::section:
/* The section this CU/TU lives in.
If the DIE refers to a DWO file, this is always the original die,
not the DWO file. */
struct dwarf2_section_info *section = nullptr;
I would prefer to not reintroduce the behavior of overwriting the
section info in dwarf2_per_cu, because:
1. I find it confusing; I like the invariant that dwarf2_per_cu::section
points to the stub, and dwarf2_cu::section points to where we
actually read the debug info from.
2. The dwarf2_per_bfd::all_units vector is nowadays sorted by (section,
section offset). If we change the section and section offset of a
dwarf2_per_cu, then we can no longer do binary searches in it, we
would have to re-sort the vector (not a big deal, but still adds to
the confusion).
One possible fix would be to make sure that the section is read in when
reading the header, probably in dwarf2_per_cu::get_header. An approach
like that was proposed by Andrew initially, here:
https://inbox.sourceware.org/gdb-patches/60ba2b019930fd6164f8e6ab6cb2e396c32c6ac2.1759486109.git.aburgess@redhat.com/
It would work, but there is a more straightforward fix for this
particular problem, I believe. In dwarf2_cu, we have access to the
header read from the unit we're actually reading the DWARF from. In the
DWO case, that is the header read from the .dwo file. We can get the
address size from there instead of going through the dwarf2_per_cu
object. This is what this patch does.
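In sketch form (illustrative names only), the change is to take the size
from the header that was actually read for the unit, which is reachable
through the dwarf2_cu, instead of going back through dwarf2_per_cu:
...
#include <cstdint>

// Illustrative only: the unit header that was read from whichever
// section the DIEs actually came from (.dwo or main file).
struct unit_head_sketch { std::uint8_t addr_size; };

struct cu_sketch
{
  unit_head_sketch header;          // Filled in when the unit was read.

  // Before: went through per_cu->addr_size (), which could try to read
  // a header from a section that was never read in.  After: use the
  // header we already have for this unit.
  std::uint8_t addr_size () const { return header.addr_size; }
};
...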
However, there are other cases where we get the address (or offset) size
from the dwarf2_per_cu in the DWARF expression evaluator (expr.c,
loc.c), which could cause a similar crash. The next patch handles these
cases.
Modify the gdb.dwarf2/fission-reread.exp test so that it tries running
with an index even with the standard board (that part was originally
written by Andrew).
Finally, just to put things in context, having a stub in the main file
for a type unit is obsolete. It happened in the gcc 4.x days, until
this commit:
commit 4dd7c3b285daf030da0ff9c0d5e2f79b24943d1e
Author: Cary Coutant <ccoutant@google.com>
Date: Fri Aug 8 20:33:26 2014 +0000
Remove skeleton type units that were being produced with -gsplit-dwarf.
In DWARF 5, split type units don't have stubs; only split compilation
units do.
Change-Id: Icc5014276c75bf3126ccb43a4424e96ca1a51f06
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33307
Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
In an effort to generate ELF files and DWARF info as close as possible
to what an actual compiler would generate, change how we emit type unit
sections: each type unit is now emitted in its own section, in a COMDAT
section group.
We currently emit all type units in a single, standard section (either
.debug_info or .debug_types, depending on the DWARF version). Compilers
emit each type unit in its own .debug_types section. Each section is
placed in a COMDAT section group with a signature based on the type
unit's signature. This lets the linker deduplicate identical type units
by discarding section groups with identical signatures (keeping only one
group with a given signature).
Looking at a .s file generated by gcc, we can see that this is how it
specifies the section for a type unit:
.section .debug_info,"G",@progbits,wi.006fd2152a3054a6,comdat
The "G" flag tells the assembler to place the section in a section
group with signature "wi.006fd2152a3054a6". That string was generated
from the type unit's signature. Finally, the comdat linkage field
indicates that the section group should have the COMDAT flag.
Update the tu proc to emit a section with these properties. In this
case, it's mandatory to specify the type of the section (progbits).
Update the _section proc to accept the new options "group_signature" and
"linkage".
As a result, if you look at the .debug_types section in a .o file from
gcc:
$ readelf -g main.o
COMDAT group section [ 1] `.group' [wt.006fd2152a3054a6] contains 2 sections:
[Index] Name
[ 11] .debug_types
[ 12] .rela.debug_types
COMDAT group section [ 2] `.group' [wt.c621aa8e3c23e450] contains 2 sections:
[Index] Name
[ 13] .debug_types
[ 14] .rela.debug_types
And a .o file created by our DWARF assembler:
$ readelf -g testsuite/outputs/gdb.dwarf2/struct-with-sig/struct-with-sig1.o
COMDAT group section [ 1] `.group' [sig.0x0000000000000001] contains 2 sections:
[Index] Name
[ 10] .debug_types
[ 11] .rela.debug_types
COMDAT group section [ 2] `.group' [sig.0x0000000000000002] contains 2 sections:
[Index] Name
[ 12] .debug_types
[ 13] .rela.debug_types
In both cases, in the fully linked files, there is a single .debug_types section
containing the two type units, as expected.
Before this patch, when I run gdb.dwarf2/fission-with-type-unit.exp, the
resulting .dwo file has a single .debug_info.dwo. After this patch it
has two: one with the type unit and one with the compile unit (the test
uses DWARF 5). Based on what I see gcc generate when using "-gdwarf-5
-gsplit-dwarf -fdebug-types-section", this is what we expect.
Change-Id: Ie1954dc697fe100b5dbe664d148c71fa02888d96
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
New in v2: return early if the section is .debug_str
The following patch will add more options to the _section proc, so
convert it to use options, to be more user-friendly.
Change-Id: I971e4e10e55d63af2a5e29dc5e1d00f368b3ecaa
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
When I wrote test gdb.dwarf2/fission-with-type-unit.exp, I did not use
build_executable_and_dwo_files, because it wouldn't work to have
multiple units in the .dwo file, each referring to their own abbrev
table using labels. build_executable_and_dwo_files extracts the .dwo
file content from the .o using objcopy (just like gcc does, I learned),
meaning that the .dwo file never runs through a linker. Anything
needing relocation (like labels pointing to abbrev tables) doesn't work.
I instead opted to use gdb_compile_shlib to build the .dwo file on its
own, so that those labels would get resolved. That causes problems now
that I'm trying to write a test with multiple type units in a .dwo file,
where each type unit should be in its own .debug_types section. Running
the .dwo file through the linker causes all the .debug_types section to
be collapsed into one. And generally, I think it was a bad idea to
generate a .dwo file using the linker, since the idea behind .dwo files
is that they do not need to be linked (therefore improving link
times). We want to produce files as close as possible to what an actual
compiler would produce.
This patch fixes this by doing what compilers do in the same situation:
use a single abbrev table shared by all units in the .dwo file. This
requires the following changes in lib/dwarf.exp:
- Declare a new variable _dwo_abbrev_num, which holds the next
abbrev number to use in the .dwo file's abbrev section
(.debug_abbrev.dwo). Initialize this variable to 1.
- When producing a CU or TU in a .dwo file, use 0 as the abbrev table
offset.
- When generating a DIE, return $_dwo_abbrev_num or $_abbrev_num,
depending on whether the current CU is in a .dwo file.
- After producing a CU or TU in a .dwo file, don't append the
terminator byte.
- After finishing producing the CUs and TUs, append the terminator byte
in .debug_abbrev.dwo if we did output anything there.
Update gdb.dwarf2/fission-with-type-unit.exp to use
build_executable_and_dwo_files, as it should. Remove the
gdb_remote_download call from gdb.dwarf2/fission-with-type-unit.exp,
because build_executable_and_dwo_files does not support remote hosts
anyway.
With this change, running with the cc-with-gdb-index board, I see:
(gdb) maint expand-symtabs
/home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:3056: internal-error: cutu_reader: Assertion `sig_type->signature == cu->header.signature' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
FAIL: gdb.dwarf2/fission-with-type-unit.exp: maint expand-symtabs (GDB internal error)
This is actually an improvement, as the test case didn't run properly
before. The compilation failed with:
gdb compile failed, During symbol reading: Could not find DWO CU fission-with-type-unit.dwo(0xf00d) referenced by CU at offset 0xc [in module /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/fission-with-type-unit/.tmp/fission-with-type-unit]
The reason was that the old code would try to generate the GDB index
during this step:
# Build main file.
if { [build_executable "${testfile}.exp" $binfile \
[list ${srcfile} ${main_asm_file}] {nodebug}] } {
return
}
... which is before the DWO file is even generated. With this patch
things are done in the correct order:
- The -dw.S file is generated
- The -dw.o file is compiled from the -dw.S
- The .dwo sections are extracted to the .dwo file, and stripped from
the -dw.o file
- The executable is linked from the .o and -dw.o
- gdb-add-index is run on the executable
When gdb-add-index runs, the .dwo file exists, so GDB is able to produce
an index. That index is bogus though, because the .gdb_index format is
unable to describe skeletonless type units. And then GDB gets confused
trying to use that index, leading to the internal error.
Change-Id: Iabbcf00db97faf2a4fa5fc71652ad273081189f9
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
There are some paths of build_executable_and_dwo_files that return -1
without calling "untested". As a result, tests such as
gdb.dwarf2/fission-absolute-dwo.exp would exit without leaving a trace.
Add some untested calls to fix that.
Change-Id: I2e632b5b44b11b4beb39791316f1203f9a12bf4f
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Commits:
commit aaad5a3254db53434eaf1cf70384e7ee0dfb886a
Author: Tom de Vries <tdevries@suse.de>
Date: Fri Sep 5 15:36:23 2025 +0200
[gdb/testsuite] Fix clean_restart <absolute filename> in gdb.base, part 3
commit 2e61486fcefe8812714dcb0fb787581592675502
Author: Tom de Vries <tdevries@suse.de>
Date: Fri Sep 5 15:36:23 2025 +0200
[gdb/testsuite] Fix clean_restart <absolute filename> in gdb.base, part 2
commit 202beb3feebd44fbc1979d9fdb4d74c44e16a417
Author: Tom de Vries <tdevries@suse.de>
Date: Fri Sep 5 15:36:23 2025 +0200
[gdb/testsuite] Fix clean_restart <absolute filename> in gdb.base, part 1
were made to work around the changes to clean_restart in commit:
commit cba778b944af90c362a618af0630877736a54baa
Date: Sun Sep 7 11:53:30 2025 +0200
[gdb/testsuite] Error out on clean_restart <absolute filename>
These commits added a lot of calls to gdb_load which can be removed in
many cases by passing $testfile to clean_restart, or by switching to
use prepare_for_testing to compile the test executable.
In this commit I've gone through the gdb.base/ directory and removed
as many of the gdb_load calls as possible. I was only looking for
places where the gdb_load call immediately follows the call to
clean_restart. And I did skip a few where it was not as simple as
just passing $testfile.
Where possible I've updated tests to use calls to prepare_for_testing,
and simply removed the clean_restart call altogether (this is done as
part of prepare_for_testing). This is, I think, the best solution.
In other cases I've removed the gdb_load call, and passed $testfile to
clean_restart. I've preferred $::testfile to adding a 'global'
declaration, and in some cases switching to testfile has allowed me to
remove the 'global binfile' as an additional cleanup.
I ran the complete set of tests that I touched and I didn't see any
regressions, so I don't believe I broke anything.
I know that there are probably gdb_load calls that can be cleaned up
in other testsuite sub-directories, if/when this patch is merged I'll
take a look at those too.
Reviewed-By: Tom de Vries <tdevries@suse.de>
|
|
At Red Hat we have an out of tree AArch64 watchpoint test which broke
after this commit:
commit cf16ab724a41e4cbaf723b5633d4e7b29f61372b
Date: Tue Mar 12 17:08:18 2024 +0100
[gdb/tdep] Fix gdb.base/watch-bitfields.exp on aarch64
The problem with AArch64 hardware watchpoints is that they (as I
understand it) are restricted to a minimum of 8 bytes.
The problem is that current AArch64 hardware has imprecise hardware
watchpoint events due to unaligned accesses. The address reported for
the watchpoint event will depend on the access size. As a result, it
is possible that multiple watchpoints could potentially account for a
single watchpoint event, which is the case in the RH test. GDB can
then mis-identify which watchpoint actually triggered.
Prior to the above commit the RH test was passing. However, the test
was relying on, in the case of ambiguity, GDB selecting the first
created watchpoint. That behaviour changed with the above commit.
Now GDB favours reporting non-write breakpoints, and will only report
a write breakpoint if no non-write breakpoint exists in the same
region.
I originally posted a patch that tried to tweak the existing logic to
restore enough of the original behaviour that the RH test would pass;
it can be found here (2 iterations):
https://inbox.sourceware.org/gdb-patches/65e746b6394f04faa027e778f733eda95d20f368.1753115072.git.aburgess@redhat.com
https://inbox.sourceware.org/gdb-patches/638cbe9b738c0c529f6370f90ba4a395711f63ae.1753971315.git.aburgess@redhat.com
Neither of these really resolved the problem; they fixed some cases,
but broke others.
Ultimately, the problem on AArch64 is that for a single watchpoint
trap, there could be multiple watchpoints that are potentially
responsible. The existing API defined by the target_ops methods
stopped_by_watchpoint() and stopped_data_address() only allow for two
possible options:
1. If stopped_by_watchpoint() is true then stopped_data_address()
can return true and a single address which identifies all
watchpoints at that single address, or
2. If stopped_by_watchpoint() is true then stopped_data_address()
can return false, in which case GDB will check all write
watchpoints to see if any have changed; if one has, then GDB
tells the user that that was the triggering watchpoint.
If we are in a situation where we have to choose between multiple
write and read watchpoints then the current API doesn't allow the
architecture specific code to tell GDB core about this case.
In this commit I propose that we change the target_ops API,
specifically, the method:
bool target_ops::stopped_data_address (CORE_ADDR *);
will change to:
std::vector<CORE_ADDR> target_ops::stopped_data_addresses ();
The architecture specific code can now return a set of watchpoint
addresses, allowing GDB to identify a set of watchpoints that might
have triggered. GDB core can then select the most likely watchpoint,
and present that to the user.
As with the old API, target_ops::stopped_data_addresses should only be
called when target_ops::stopped_by_watchpoint is true, in which case
its return value can be interpreted like this:
a. An empty vector; this replaces the old case where false was
returned. GDB should check all the write watchpoints and select
the one that changed as the responsible watchpoint.
b. A single entry vector; all targets except AArch64 currently
return at most a single entry vector. The single address
indicates the watchpoint(s) that triggered.
c. A multi-entry vector; currently AArch64 only. These addresses
indicate the set of watchpoints that might have triggered. GDB
will check the write watchpoints to see which (if any) changed,
and if no write watchpoints changed, GDB will present the first
access watchpoint.
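A condensed sketch of how the core side might consume these three
cases, using mock types rather than the real breakpoint.c code:
...
#include <cstdint>
#include <vector>

using CORE_ADDR = std::uint64_t;

struct watchpoint_sketch
{
  CORE_ADDR addr;
  bool is_write;
  bool value_changed;   // Result of re-reading the watched expression.
  bool triggered = false;
};

// Sketch: mark triggered watchpoints given the addresses reported by
// the target's stopped_data_addresses().
void mark_triggered (std::vector<watchpoint_sketch> &wps,
                     const std::vector<CORE_ADDR> &addrs)
{
  if (addrs.empty ())
    {
      // Case (a): no address info; fall back to checking which write
      // watchpoint changed value.
      for (auto &w : wps)
        w.triggered = w.is_write && w.value_changed;
      return;
    }

  // Cases (b) and (c): one or more candidate addresses.
  for (auto &w : wps)
    w.triggered = false;
  for (CORE_ADDR a : addrs)
    for (auto &w : wps)
      if (w.addr == a)
        w.triggered = true;
}
...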
In the future, we might want to improve the handling of (c) so that
GDB tells the user that multiple access watchpoints might have
triggered, and then list all of them. This might clear up some
confusion. But I think that can be done in the future (I don't have
an immediate plan to work on this). I think this change is already a
good improvement.
The changes for this are pretty extensive, but here's a basic summary:
* Within gdb/ changing the API name from stopped_data_address to
stopped_data_addresses throughout. Comments are updated too where
needed.
* For targets other than AArch64, the existing code is retained with
as few changes as possible: we only allow for a single address to
be returned, and that address is now wrapped in a vector. Where we
used to return false, we now return the empty vector.
* For AArch64, the return a vector logic is pushed through to
gdb/nat/aarch64-hw-point.{c,h}, and aarch64_stopped_data_address
changes to aarch64_stopped_data_addresses, and is updated to
return a vector of addresses.
* In infrun.c there are some updates to the debug output.
* In breakpoint.c the interesting changes are in
watchpoints_triggered. The existing code has three cases to
handle:
(i) target_stopped_by_watchpoint returns false. This case is
unchanged.
(ii) target_stopped_data_address returns false. This case is now
calling target_stopped_data_addresses, and checks for the
empty vector, but otherwise is unchanged.
(iii) target_stopped_data_address returns true, and a single
address. This code calls target_stopped_data_addresses, and
now handles the possibility of a vector containing multiple
entries. We need to first loop over every watchpoint
setting its triggered status to 'no', then check every
address in the vector, setting each matching watchpoint's
triggered status to 'yes'. But the actual logic for whether a
watchpoint matches an address or not is unchanged.
The important thing to notice here is that in case (iii), before
this patch, GDB could already set _multiple_ watchpoints to
triggered. For example, setting a read and write watchpoint on
the same address would result in multiple watchpoints being marked
as triggered. This patch just extends this so that multiple
watchpoints, at multiple addresses, can now be marked as
triggered.
* In remote.c there is an interesting change. We need to allow
gdbserver to pass the multiple addresses back to GDB. To achieve
this, I now allow multiple 'watch', 'rwatch', and 'awatch' tokens
in a 'T' stop reply packet. There's a new feature multi-wp-addr
which is passed in the qSupported packet to determine if the
remote is allowed to pass back multiple watchpoint stop reasons.
If the remote passed multiple watchpoint addresses then these are
collected and returned from the target_ops::stopped_data_addresses
call.
If a new GDB connects to an old gdbserver that doesn't understand
the multi-wp-addr feature, then gdbserver will continue to return
a single watchpoint address in the 'T' packet, which is what
happens before this patch.
* In gdbserver/ the changes are pretty similar. The API is renamed
from ::stopped_data_address to ::stopped_data_addresses, and
::low_stopped_data_address to ::low_stopped_data_addresses.
There's also code added to detect the new multi-wp-addr feature.
If this feature is not advertised from GDB then only a single
watchpoint address will be returned in the 'T' stop reply packet.
* In GDB and gdbserver, for all targets except AArch64, the existing
code to figure out a watchpoint address is retained; we just wrap
the single address into a vector.
* For AArch64, we call aarch64_stopped_data_addresses, which returns
the required vector.
For testing, I've built GDB on GNU/Linux for i386, x86-64, PPC64le,
ARM, and AArch64. That still leaves a lot of targets possibly
impacted by this change as untested. Which is a risk. I certainly
wouldn't want to push this patch until after GDB 17 branches so we
have time to find and fix any regressions that are introduced.
I've run a full regression test on AArch64 and x86-64 (both GNU/Linux)
with no regressions. As I said above, for other targets nothing
should really have changed, all non-AArch64 targets just return a
single watchpoint address from target_ops::stopped_data_addresses(),
so, as long as the target builds, it should run unchanged.
I also sent the branch through the sourceware CI, and everything
passed.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33240
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33252
Acked-By: Tom de Vries <tdevries@suse.de>
|
|
When merging the big "whitespace fix" commit in a downstream repo,
pre-commit/codespell identified a lot of typos; fix a few of them.
Change-Id: Ie898e9903daa4e6e0e49a623891a739071e91392
Approved-By: Tom de Vries <tdevries@suse.de>
|
|
While working with objfiles in Python I noticed that
gdb.Progspace.objfile_for_address () does not return "dynamic" objfiles
created by (for example) GDB's JIT reader API.
This is because is_addr_in_objfile() checks if a given address falls into
any (mappped) section of that objfile. However objfiles created by JIT
reader API do not have sections.
To solve this issue, this commit updates is_addr_in_objfile() to also
check if the address fall into any compunit in that objfile. It does so
only if the objfile has no sections.
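In outline, the updated check behaves roughly like this sketch (mock
data structures, not the actual objfile/compunit APIs):
...
#include <cstdint>
#include <vector>

using CORE_ADDR = std::uint64_t;

struct range { CORE_ADDR low, high; };   // Half-open: [low, high)

struct objfile_sketch
{
  std::vector<range> sections;    // Empty for JIT-reader objfiles.
  std::vector<range> compunits;
};

bool is_addr_in_objfile_sketch (CORE_ADDR addr, const objfile_sketch &of)
{
  for (const range &s : of.sections)
    if (addr >= s.low && addr < s.high)
      return true;

  // JIT-created objfiles have no sections; fall back to the compunit
  // ranges so objfile_for_address can still find them.
  if (of.sections.empty ())
    for (const range &cu : of.compunits)
      if (addr >= cu.low && addr < cu.high)
        return true;

  return false;
}
...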
|
|
Fix three typos in test-case gdb.base/long_long.exp:
...
$ codespell gdb.base/long_long.exp
gdb.base/long_long.exp:19: differnet ==> different
gdb.base/long_long.exp:208: Implict ==> Implicit
gdb.base/long_long.exp:224: Implict ==> Implicit
...
|
|
According to GDB coding style, test names should start with lowercase:
https://sourceware.org/gdb/wiki/GDBTestcaseCookbook#Follow_the_test_name_convention
Modify the test names in gdb.base/gdbvars.exp accordingly.
|
|
On PowerPC targets, RTTI typeinfo objects for simple base types such as
`int`, `char*`, and `const char*` may not be emitted until the inferior
has been started. As a result, the `gdb.cp/typeid.exp` test fails when
checking typeid results before program execution begins.
This patch extends the existing Clang-specific logic that skips base type
RTTI checks before the inferior starts, to also apply on PowerPC. This
ensures consistent test behavior across compilers and targets.
gdb/testsuite/
* gdb.cp/typeid.exp (do_typeid_tests): Skip base type RTTI tests
before inferior start on PowerPC.
Approved-By: Tom de Vries <tdevries@suse.de>
|
|
This particular path for value_cast does not attempt to resolve a
dynamic target type before assigning it to the new value. Having a
value with a dynamic type that hasn't been resolved causes an assert
later, when printing the value. For instance, running the added test
without the fix yields:
(gdb) p (outer_type) g_outer_as_char_array
$7 = ( /home/smarchi/src/binutils-gdb/gdb/value.c:3111: internal-error: primitive_field: Assertion `type->dyn_prop (DYN_PROP_DATA_LOCATION)->is_constant ()' failed.
This code path is taken when the value being cast has the same size as
the target type, and no earlier more specific rule matched. Fix it by
adding a call to resolve_dynamic_type before assigning the target type
to the value.
The test exercises this by defining a char array
(`g_outer_as_char_array`) with the same size as `outer_type` in the
DWARF info, then casting it to `outer_type`. Without the fix, this
triggers the assertion when printing the result.
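Schematically, the change amounts to the following (stand-in types and
a stubbed resolve function, not the real value_cast code):
...
// Illustrative pseudo-version of the affected value_cast branch; the
// `value' and `type' structs and resolve_dynamic_type_sketch are
// stand-ins for the gdb internals, not their real signatures.
struct type {};

struct value
{
  const type *m_type = nullptr;
  void deprecated_set_type (const type *t) { m_type = t; }
};

// Stub: the real resolve_dynamic_type evaluates DW_AT_data_location and
// other dynamic properties for this particular value.
const type *resolve_dynamic_type_sketch (const type *t, const value &)
{ return t; }

void cast_same_size (value &v, const type *to_type)
{
  // Before the fix: the dynamic target type was assigned directly,
  // leaving DYN_PROP_DATA_LOCATION unresolved and tripping an assert
  // when the value was later printed.
  // After the fix: resolve the dynamic type for this value first.
  v.deprecated_set_type (resolve_dynamic_type_sketch (to_type, v));
}
...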
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33575
Change-Id: I547da32bbd24462b779e76bceb6e0a87859188d1
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Using the test program from the added test, I get this internal error:
$ ./gdb -q -nx --data-directory=data-directory testsuite/outputs/gdb.dwarf2/data-loc-cast/data-loc-cast -ex "with language fortran -- p (outer_type) g_outer_var" -batch
$1 = ( /home/smarchi/src/binutils-gdb/gdb/value.c:1697: internal-error: set_component_location: Assertion `this->lval () == lval_memory' failed.
The test added by this patch mimics a problem that was reported when
debugging a Fortran program. The situation is:
- The DWARF defines a Fortran-style array type, with a
DW_AT_data_location attribute.
- It then defines a structure type (outer_type) with a field
(outer_type::assoc) of that array type.
- Trying to cast a minimal symbol (g_outer_var) to that structure type
leads to the internal error shown above.
The use case for this is: for some reason, a variable of type S isn't
described in the debug info, but GDB still knows about type S. The user
is trying to say: I know that the variable at this address is an S, show
it to me.
A Fortran-style array doesn't hold the data directly. It's a structure
(a descriptor) that contains a pointer to the data and possibly other
properties, like the length of the array. The DW_AT_data_location
attribute contains a DWARF expression that yields the location of the
actual data, given the location of the descriptor. In GDB's type
system, this translates to a dynamic property of kind
DYN_PROP_DATA_LOCATION pointing to the DWARF expression. Before
instantiating a value with such a dynamic type, the dynamic type must
first be resolved, which is done by function resolve_dynamic_type. That
is, for each dynamic property, compute a concrete value for the specific
instance we're dealing with here. The result of resolve_dynamic_type is
a type that is just like the input type, but where all dynamic
properties now have concrete, constant values.
Here's the timeline of what happens when doing "p (outer_type)
g_outer_var":
1. We start in var_msym_value_operation::evaluate_for_cast.
2. We go in function value_cast, in this branch:
else if (arg2->lval () == lval_memory)
return value_at_lazy (to_type, arg2->address ());
... where to_type is the "outer_type" structure and arg2 is the
minsym. This effectively builds a value pretending that there
exists an instance of outer_type at the address of the minsym, which
is what we want. The resulting value is "lval_memory".
3. Somewhere inside value_at_lazy, there is a call to
resolve_dynamic_type:
#0 resolve_dynamic_type (type=0x7e0ff1cfd560, valaddr=..., addr=0x4340, in_frame=0x7bfff0b3f100) at /home/smarchi/src/binutils-gdb/gdb/gdbtypes.c:3011
#1 0x000055556687dcba in value_from_contents_and_address (type=0x7e0ff1cfd560, valaddr=0x0, address=0x4340, frame=...) at /home/smarchi/src/binutils-gdb/gdb/value.c:3669
#2 0x00005555667ec527 in get_value_at (type=0x7e0ff1cfd560, addr=0x4340, frame=..., lazy=1) at /home/smarchi/src/binutils-gdb/gdb/valops.c:992
#3 0x00005555667ec79f in value_at_lazy (type=0x7e0ff1cfd560, addr=0x4340, frame=...) at /home/smarchi/src/binutils-gdb/gdb/valops.c:1039
#4 0x00005555667e902b in value_cast (type=0x7e0ff1cfd560, arg2=0x7d0ff1c35540) at /home/smarchi/src/binutils-gdb/gdb/valops.c:645
This is good, it returns a structure type where the type of field
"assoc" has a constant DYN_PROP_DATA_LOCATION property that holds
the memory address where the data for this array resides.
4. Back in var_msym_value_operation::evaluate_for_cast, we do:
val = value_cast (to_type, val);
/* Don't allow e.g. '&(int)var_with_no_debug_info'. */
if (val->lval () == lval_memory)
{
if (val->lazy ())
val->fetch_lazy ();
val->set_lval (not_lval);
}
This is meant to make GDB behave more or less like C, where the
result of a cast is not an lvalue, of which you can't take the
address, for instance. I am not an expert in this area, but Pedro
explained that this lval thing in GDB actually conflates two things:
- where is the value (memory, register, only in GDB's mind, etc)
- is this an lvalue (is it assignable, can you take its address,
etc)
Here, we would ideally want to say that the value is not an lvalue,
but still say that it lives at a given address in memory. But since
the two concepts are conflated, we set it to "not_lval", which means
"not an lval and does not exist on target".
If there was a way to say "non-lvalue" and "in memory", I think that
the bug that follows would be hidden.
5. When printing the value, the value-printing code attempts to fetch
the "assoc" field of the struct using value::primitive_field, which
then goes into value::set_component_location. In
set_component_location there is this code, which is where we find
the assert that fails:
/* If the COMPONENT has a dynamic location, and is an
lval_internalvar_component, then we change it to a lval_memory.
Usually a component of an internalvar is created non-lazy, and has
its content immediately copied from the parent internalvar.
However, for components with a dynamic location, the content of
the component is not contained within the parent, but is instead
accessed indirectly. Further, the component will be created as a
lazy value.
By changing the type of the component to lval_memory we ensure
that value_fetch_lazy can successfully load the component.
This solution isn't ideal, but a real fix would require values to
carry around both the parent value contents, and the contents of
any dynamic fields within the parent. This is a substantial
change to how values work in GDB. */
if (this->lval () == lval_internalvar_component)
{
gdb_assert (lazy ());
m_lval = lval_memory;
}
else
gdb_assert (this->lval () == lval_memory);
I think that what this comment is really trying to say is: if a
structure is an internalvar, and a field of that structure has a
dynamic data location, then the actual data is not contained in the
internalvar, it is in memory.
The message for the commit that introduced that code (3c8c6de21da
"gdb: user variables with components of dynamic type")) confirms it,
I think. The message also goes on to explain that we could imagine
a world where the internalvar outer struct value would also capture
the (indirect) contents of the array field. The internalvar value
would then be completely self-contained. I imagine this could be
useful in some cases, but we don't have that today.
The comment makes it sound like it's a hack, but I actually think it
makes sense. This is what is really happening.
This all assumes that the result of the DW_AT_data_location is a
location in memory. I guess this is true for all practical purposes
today, but it would be possible for DW_AT_data_location to yield a
register, a composite, or even "undefined" (meaning optimized out)
as a location (which would be even easier to implement with the
upcoming DWARF 6 "Location descriptions on the DWARF stack" feature
[1]). In that case, the location that we set for the array
component should reflect whatever the DWARF expression returned.
But that is future work; for now, we assume that the data location
can only be in memory.
So, the fix is basically to always set the location of a value
sub-component to "memory", because we assume (for now) that the result
of all DW_AT_data_location expressions will be memory locations.
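Roughly, the direction is something like the following sketch in
value::set_component_location (a sketch of the idea only, not
necessarily the exact code of the patch):
  /* A component whose type carries a (resolved) DW_AT_data_location
     property is accessed indirectly, and today such a location is
     always a memory address, so mark the component as lval_memory
     regardless of where the parent value lives.  */
  if (type ()->dyn_prop (DYN_PROP_DATA_LOCATION) != nullptr)
    m_lval = lval_memory;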
Now, referring back to point 4 above: if the code in
var_msym_value_operation::evaluate_for_cast could in fact set the value
to "non-lvalue in memory", then we wouldn't hit the assert in
value::set_component_location, because the subcomponent would have
inherited lval_memory from the parent structure. However, I still think
that the fix in this patch is valid on its own. Imagine that the DWARF
says that the outer struct (and thus the array descriptor) is in a
register. Once we evaluate the DW_AT_data_location of the array field,
the value describing the actual data is still in memory.
The takeaway is that regardless of the location of the descriptor, the
DW_AT_data_location expression always returns a memory location (for
now), so we should always set the location of that value to memory. In
other words, we apply the same logic as commit 3c8c6de21da regardless of
the location of the outer value.
Finally, a note about the test: I made the array content very long (100
elements), because I did spot some issues where GDB would conflate the
size of the array data with the size of the array field, within the
outer structure. The test does not expose these issues and I didn't try
to fix them here (one thing at a time), but making the array size very
large might help spot these issues in the future.
[1] https://dwarfstd.org/issues/230524.1.html
Change-Id: Ib10a77b62cd168fc7c08702e0f6dd47b5ac0f097
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33575
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This is the result of:
...
$ git rm -rf gdb/testsuite
$ git commit -a -m tmp
$ git revert HEAD
$ git rebase --whitespace=fix HEAD^
...
Tested on x86_64-linux.
PR build/33616
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33616
|
|
In test-case gdb.ada/unchecked_union.exp, git's --check option reports some whitespace
issues:
...
$ git diff-tree --check $(git hash-object -t tree /dev/null) HEAD \
-- gdb/testsuite/gdb.ada/unchecked_union.exp \
| grep -c "indent with spaces"
12
...
The problem is that this style of string is used, containing space-indented
text:
...
set inner_string { case ? is
when 0 =>
small: range 0 .. 255;
second: range 0 .. 255;
when ? =>
bval: range 0 .. 255;
when others =>
large: range 255 .. 510;
more: range 255 .. 510;
end case;
}
...
Fix this by changing the string into a list of strings:
...
set inner_string \
[list \
" case ? is" \
" when 0 =>" \
" small: range 0 .. 255;" \
" second: range 0 .. 255;" \
" when ? =>" \
" bval: range 0 .. 255;" \
" when others =>" \
" large: range 255 .. 510;" \
" more: range 255 .. 510;" \
" end case;"]
...
which also fixes the odd position of the first line in the original version.
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
PR build/33616
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33616
|
|
A test added in:
commit 91eee81d23537366bb6072c66f662ab46d88380a
Date: Tue Jan 21 17:22:04 2025 +0000
gdb: include NT_I386_TLS note in generated core files
includes a pointer value in the test name. The value changes from run
to run, making it harder to compare test results. Fix this by giving
the test an actual name.
There's no change to what is being tested by this commit.
|
|
This commit extends GDB for x86/Linux to include the NT_I386_TLS note
in generated core files (i.e. created with `generate-core-file` or
`gcore` command). This note contains the 3 per-thread TLS-related
GDT (global descriptor table) entries, and is present for i386
binaries, or those compiled on x86-64 with -m32.
The approach I have taken to achieve this is to make the 3 GDT
entries available within 3 new registers. I added these registers to
the org.gnu.gdb.i386.linux target description feature, as this feature
seemed perfectly named. As the new registers are optional I don't see
any harm in extending this existing feature. I did consider adding a
new feature with `tls` in the name, but this seemed excessive given
the existing feature.
Which GDT entries are used for TLS varies between i386 and x86-64
running in 32-bit mode. As such the registers are named with suffixes
0, 1, and 2, and it is left to GDB or gdbserver, to find the correct
GDT entries (based on the precise target) and place the contents into
these registers.
With this done, adding the relevant regset is sufficient to get the
tls contents emitted as a core file note. Support for emitting the
note into the generated core file relies on some BFD changes which
were made in an earlier commit:
commit ea6ec00ff4520895735e4913cb90c933c7296f04
Date: Fri Jul 25 19:51:58 2025 +0100
bfd: support for NT_386_TLS notes
The three new registers are readable and writable. Writing to one of
the new registers will update the relevant kernel GDT entry.
Each TLS GDT entry is represented by a 'struct user_desc' (see 'man 2
get_thread_area' for details). The first 4 bytes of each 'user_desc'
are the 'entry_number' field; this is the index of the GDT entry
within the kernel, and cannot be modified. Attempts to write to this
region of the register will be ignored, but will not give an error.
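For reference, the 'user_desc' layout from that man page looks roughly
like this (copied here for convenience; it is not something this patch
defines):
  struct user_desc {
    unsigned int  entry_number;   /* GDT index; writes via the new registers are ignored.  */
    unsigned int  base_addr;      /* Base address of the TLS segment.  */
    unsigned int  limit;
    unsigned int  seg_32bit:1;
    unsigned int  contents:2;
    unsigned int  read_exec_only:1;
    unsigned int  limit_in_pages:1;
    unsigned int  seg_not_present:1;
    unsigned int  useable:1;
  };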
I did consider not including this part of the user_desc within the
register value, but this becomes difficult when we consider remote
targets: GDB would then need to figure out what these indexes were so
that the core file note could be generated. Sure, we probably could
figure the correct index values out, but why bother? We can just pass
them through in the register and know for certain that we have the
correct values.
For testing, there's a new test that covers the basic functionality,
including read/write access to the new registers, and checking that
the NT_386_TLS note is added to the core file, and that the note
contents can be read by GDB.
I also manually tested opening a core file generated from an old
GDB (so no NT_386_TLS notes) using a GDB with this patch. This works
fine; the new tls registers are not created, as the NT_GDB_TDESC
note (the target description) doesn't include the new registers.
Out of interest I also patched an old version of GDB to avoid creating
the NT_GDB_TDESC, and created a core file. This core file contained
neither the NT_386_TLS nor NT_GDB_TDESC. When opening this core file
with a patched GDB, the new registers do show up, but their contents
are given as <unavailable>, which is exactly what we'd expect: GDB
builds a target description based on the architecture, the
architecture says these registers should exist, but they are missing
from the core file, hence, <unavailable>.
I also tested using a patched GDB with an old version of gdbserver;
the new registers don't show up, as the old gdbserver doesn't send them
in its target description. And a core file created using the gcore
command in such a setup contains no NT_386_TLS note, which is
what we'd expect.
And I also tested a new gdbserver running with an old version of GDB.
As the new tls registers are now mentioned in the target description,
the old GDB obviously does see the registers and presents them to the
user. However, the old GDB doesn't know how to use these registers to
create an NT_386_TLS note, so that note isn't added to any core files.
Also, while a new GDB places the tls registers into the 'system'
group, an old GDB doesn't do this, so the registers end up in the
'general' group by default. This means they show up within 'info
registers' output. This isn't ideal, but there's not much that can be
done about this.
Overall, I feel the combinations of old and new tools has been tested,
and the behaviours are what we'd want or expect.
I'm tagging this commit with PR gdb/15591, even though this patch
isn't directly related. That bug is for improving GDB's testing of
TLS support in core files. The test in this commit does do some very
simple reading of a TLS variable, but there are only two threads and
one TLS variable, so it's not extensive. Additionally, the test in
this commit is x86 only, so this should not be considered a full
resolution to that bug. But still, it's something.
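For illustration, the rough shape of such a test program is as follows
(a sketch, not the actual testsuite source):
  #include <pthread.h>
  static thread_local int tls_var = 42;   /* The one TLS variable.  */
  static void *
  thread_func (void *)
  {
    tls_var = 99;   /* The second thread writes its own copy.  */
    return nullptr;
  }
  int
  main ()
  {
    pthread_t thr;
    pthread_create (&thr, nullptr, thread_func, nullptr);
    pthread_join (thr, nullptr);
    return 0;
  }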
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=15591
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Christina Schimpe <christina.schimpe@intel.com>
Reviewed-By: Keith Seitz <keiths@redhat.com>
|
|
A user reported that gdb would crash when debugging a certain Fortran
executable.
The bug is that the DWARF reader may try to apply dynamic properties
to an arch-allocated type. This came as a bit of a surprise, but the
issue is that the function-type-allocation code could end up creating
an arch-owned type when the return type is arch-owned.
This patch fixes the problem, and any other potential future problems,
by arranging for all types created by the DWARF reader to be
objfile-owned.
A better long-term solution might be the fabled "type GC", where the
arch/objfile distinction is finally removed. However, this is more
difficult to implement.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32793
|
|
In Ada, it's possible to create a recursive pointer type, or two
mutually recursive pointer types. gdb will crash when reading the
DWARF for these.
These types aren't actually useful, and so are unlikely to appear in
real code. Still, gdb shouldn't crash.
This patch handles this case by noticing recursive pointer types and
representing them as pointer-to-void.
Approved-by: Kevin Buettner <kevinb@redhat.com>
|
|
Add tests for FPMR support in gdb/gdbserver. These tests check
availability of FPMR, reading/writing to FPMR, core file generation and
preservation under sighandler frame unwinding.
A run of the full gdb testsuite has been done on aarch64-none-linux-gnu
without FPMR support. The gdb.arch tests were run on Shrinkwrap with
FPMR support.
Approved-By: Luis Machado <luis.machado.foss@gmail.com>
|
|
A user pointed out that if multiple breakpoints are set at the same
spot, in DAP mode, then changing the breakpoints won't reset all of
them.
The problem here is that the breakpoint map only stores a single
breakpoint, so if two breakpoints have the same key, only one will be
stored. Then, when breakpoints are changed, the "missing" breakpoint
will not be deleted.
The fix is to change the map to store a list of breakpoints.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33467
Reviewed-By: Ciaran Woodward <ciaranwoodward@xmos.com>
|
|
On i686-linux, with test-case gdb.rust/methods.exp I get:
...
(gdb) print x.take()
$5 = methods::HasMethods {value: 4}
(gdb) FAIL: $exp: print x.take()
...
The instructions for the function methods::HasMethods::take look like this:
...
00007b90 <_ZN7methods10HasMethods4take17hf373500ea3bd6e27E>:
7b90: 8b 44 24 04 mov 0x4(%esp),%eax
7b94: c3 ret
...
which is equivalent to what you get for:
...
$ cat test.c
int foo (int val) { return val; }
$ gcc test.c -O2 -S -o-
...
movl 4(%esp), %eax
ret
...
$
...
The inferior call mechanism, however, decides that this is a return_method_struct
function, and adds an implicit first parameter pointing to the return value
location. Then two things go wrong:
- the argument is written to a place where the code doesn't read from, and
- the return value is read from a place where the code doesn't write to.
AFAIU, both gdb and rustc are behaving correctly:
- there's no stable ABI and consequently rustc is at liberty to optimize this
function how it wants, and
- gdb cannot be expected to target an unstable ABI.
The solution is to mark the function for interoperability using 'extern "C"'.
Doing so causes a compilation warning:
...
warning: `extern` fn uses type `HasMethods`, which is not FFI-safe
--> gdb.rust/methods.rs:50:28
|
50 | pub extern "C" fn take(self) -> HasMethods {
| ^^^^ not FFI-safe
|
= help: consider adding a `#[repr(C)]` or `#[repr(transparent)]` attribute
to this struct
= note: this struct has unspecified layout
...
which we fix by using '#[repr(C)]'.
Likewise in gdb.rust/generics.exp.
Tested on i686-linux and x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Fix those:
gdb/testsuite/gdb.base/gdb-index-many-types.py:17:18: F821 undefined name 'gdb'
gdb/testsuite/gdb.base/gdb-index-many-types.py:26:42: F821 undefined name 'gdb'
gdb/testsuite/gdb.base/gdb-index-many-types.py:29:16: F821 undefined name 'gdb'
gdb/testsuite/gdb.base/gdb-index-many-types.py:31:19: F821 undefined name 'gdb'
gdb/testsuite/gdb.base/gdb-index-many-types.py:33:16: F821 undefined name 'gdb'
gdb/testsuite/gdb.base/gdb-index-many-types.py:33:51: F821 undefined name 'gdb'
gdb/testsuite/gdb.base/gdb-index-many-types.py:47:17: E722 do not use bare 'except'
Change-Id: Iba1949a211af66e8dd1e6cb77a44845e5fa60c2e
|
|
This commit removes the attempted de-duplication of types when
building the gdb-index. This commit is the natural extension of this
earlier commit:
commit aef36dee93bf194cb0b976a4ae49a79a736a188d
Date: Sun Aug 13 14:08:06 2023 +0200
[gdb/symtab] Don't deduplicate variables in gdb-index
Which removed the de-duplication of variables. It is worth reading
the earlier commit as all the justifications for that patch also
apply to this one.
Currently, when building the gdb-index we sort the type entries,
moving declarations to the end of the entry list, and non-declarations
to the front. Then, within each group (declarations and
non-declarations), the index entries are sorted by CU offset.
We then emit the first entry for any given type name.
There are two problems with this.
First, a non-declaration entry could be a definition, but it could
also be a typedef. Now sure, a typedef is a type definition, but not
necessarily a useful one.
If we have a header file that contains:
typedef struct foo_t foo_t;
And a CU which makes use of 'foo_t', then the CU will include both a
typedef and a type declaration. The target of the typedef will be the
declaration. But notice, the CU will not include a type definition.
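As a concrete illustration (file and identifier names are made up
here), the two-CU scenario described next could come from sources like
these:
  /* foo.h */
  typedef struct foo_t foo_t;
  /* cu1.c: sees only the typedef and the declaration of struct foo_t.  */
  #include "foo.h"
  foo_t *ptr_in_cu1;
  /* cu2.c: additionally sees the full definition of struct foo_t.  */
  #include "foo.h"
  struct foo_t { int member; };
  foo_t var_in_cu2;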
If we have two CUs, one which only sees the above typedef and
declaration, and another which sees the typedef and an actual type
definition, then the final list of entries for this type's name will
be:
1. A typedef entry that points at the declaration.
2. A typedef entry that points at the definition.
3. A definition.
4. A declaration.
Now (4) will get sorted to the end of the entry list. But the order
of (1), (2), and (3) will depend on the CU offset. If the CU which
containing the typedef and declaration has the smallest offset,
then (1) will be sorted to the front of the list of entries for this
type name. Due to the de-duplication code this means that only (1)
will be added to the gdb-index.
After GDB starts and parses the index, if a user references 'foo_t'
GDB will look in the index and find just (1). GDB loads the CU
containing (1) and finds both the typedef and the declaration. But
GDB does not find the full type definition. As a result GDB will
display 'foo_t' as an incomplete type.
This differs from the behaviour when no index is used. With no index
GDB expands the first CU containing 'foo_t', finds the typedef and
type declaration, decides that this is not good enough and carries on.
GDB will then expand the second CU and find the type's definition; GDB
now has a full understanding of the type, and can print the type
correctly.
We could solve this problem by marking typedefs as a distinct
sub-category of types, just as we do with declarations. Then we could
sort definitions to the front of the list, then typedefs, and
declarations last. This would, I think, mean that we always
prefer emitting a definition for a type, which would resolve this
first problem, or at least, it would resolve it well enough, but it
wouldn't fix the second problem.
The second problem is that the Python API and the 'info types' command
can be used to query all type symbols. As such, GDB needs to be able
to find all the CUs which contain a given type, especially as it is
possible that a type might be defined differently within different
CUs.
NOTE: Obviously a program doing this (defining a type differently in
different CUs) would need to be mindful of the One Definition Rule,
but so long as the type doesn't escape outside of a single CU then
reusing a type name isn't, as I understand it, wrong. And even if
it is, the fact that it compiles, and could be a source of bugs,
means (in my opinion) that GDB should handle this case to enable
debugging of it.
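For example (made-up code; each CU keeps its own, different 'struct
item' private, so the name reuse never escapes a CU):
  /* cu_a.c */
  struct item { int id; };
  static struct item item_a;
  /* cu_b.c */
  struct item { double weight; char tag; };
  static struct item item_b;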
Even something as simple as 'info types ....' relies on GDB being able
to find multiple entries for a given type in different CUs. If the
index only contains a single type entry, then this means GDB will see
different things depending on which CUs happen to have been expanded.
Given all of the above, I think that any attempt to remove type
entries from the gdb-index is unsafe and can result in GDB behaving
differently when using the gdb-index compared to using no index.
The solution is to remove the de-duplication code, which is what this
patch does.
Now that we no longer need to sort declarations to the end of the
entry list, I've removed all the code related to the special use of
GDB_INDEX_SYMBOL_KIND_UNUSED5 (which is how we marked declarations),
this cleans things up a little bit.
I've also renamed some of the functions away from minimize, now that
there's no minimization being done.
A problem was revealed by this change. When running the test
gdb.cp/stub-array-size.exp with --target_board=cc-with-gdb-index,
I was seeing a failure using gcc 15.1.0.
This test has two CUs, and a type 'A'. The test description says:
Test size of arrays of stubbed types (structures where the full
definition is not immediately available).
Which I don't really understand given the test's source code. The
type 'A' is defined in a header, which is included in both CUs.
However, the test description does seem to be accurate; in one CU the
type looks like this:
<1><4a>: Abbrev Number: 8 (DW_TAG_structure_type)
<4b> DW_AT_name : A
<4d> DW_AT_declaration : 1
<4d> DW_AT_sibling : <0x6d>
<2><51>: Abbrev Number: 9 (DW_TAG_subprogram)
<52> DW_AT_external : 1
<52> DW_AT_name : ~A
<55> DW_AT_decl_file : 2
<56> DW_AT_decl_line : 20
<57> DW_AT_decl_column : 11
<58> DW_AT_linkage_name: (indirect string, offset: 0x103): _ZN1AD4Ev
<5c> DW_AT_virtuality : 1 (virtual)
<5d> DW_AT_containing_type: <0x4a>
<61> DW_AT_declaration : 1
<61> DW_AT_object_pointer: <0x66>
<65> DW_AT_inline : 0 (not inlined)
<3><66>: Abbrev Number: 10 (DW_TAG_formal_parameter)
<67> DW_AT_type : <0x8c>
<6b> DW_AT_artificial : 1
<3><6b>: Abbrev Number: 0
<2><6c>: Abbrev Number: 0
while in the second CU, the type looks like this:
<1><178>: Abbrev Number: 4 (DW_TAG_structure_type)
<179> DW_AT_name : A
<17b> DW_AT_byte_size : 8
<17c> DW_AT_decl_file : 2
<17d> DW_AT_decl_line : 18
<17e> DW_AT_decl_column : 8
<17f> DW_AT_containing_type: <0x178>
<183> DW_AT_sibling : <0x1ac>
<2><187>: Abbrev Number: 5 (DW_TAG_member)
<188> DW_AT_name : (indirect string, offset: 0x19e): _vptr.A
<18c> DW_AT_type : <0x1be>
<190> DW_AT_data_member_location: 0
<191> DW_AT_artificial : 1
<2><191>: Abbrev Number: 6 (DW_TAG_subprogram)
<192> DW_AT_external : 1
<192> DW_AT_name : ~A
<195> DW_AT_decl_file : 1
<196> DW_AT_decl_line : 20
<197> DW_AT_decl_column : 1
<198> DW_AT_linkage_name: (indirect string, offset: 0x103): _ZN1AD4Ev
<19c> DW_AT_virtuality : 1 (virtual)
<19d> DW_AT_containing_type: <0x178>
<1a1> DW_AT_declaration : 1
<1a1> DW_AT_object_pointer: <0x1a5>
<3><1a5>: Abbrev Number: 7 (DW_TAG_formal_parameter)
<1a6> DW_AT_type : <0x1cd>
<1aa> DW_AT_artificial : 1
<3><1aa>: Abbrev Number: 0
<2><1ab>: Abbrev Number: 0
So, for reasons that I don't understand, the type, despite (as far as
I can see) having its full definition available, is recorded only as
declared in one CU.
The test then performs some actions that rely on 'sizeof(A)' and
expects GDB to correctly figure out the size. This requires GDB to
find and expand the CU containing the real definition of 'A'.
Prior to this patch GDB would sort the two type entries for 'A',
placing the declaration second, and then record only one entry, the
definition. When it came to expansion there was only one thing to
expand, and this was the definition we needed. It happens that in
this test the definition is in the second CU, that is, the CU with the
biggest offset. This means that, if all index entries were considered
equal, the definition entry would be second. However, currently, due
to the way GDB forces definitions to the front, the entry for the
second CU, the definition, is placed first in the index, and with
de-duplication, this is the only entry added to the index.
After this patch, both the declaration and the definition are placed
in the index, and as the declaration is in the CU at offset 0, the
declaration is added first to the index.
This should be fine. When looking for 'A' GDB should expand the CU
containing the declaration, see that all we have is a declaration, and
so continue, next expanding the CU containing the definition, at which point we're done.
However, in read-gdb-index.c, in the function
mapped_gdb_index::build_name_components, there is a work around for
gold bug PR gold/15646. Ironically, the bug here is that gold was not
removing duplicate index entries, and it is noted that this has a
performance impact on GDB. A work around for this was added to GDB in
commit:
commit 8943b874760d9cf35b71890a70af9866e4fab2a6
Date: Tue Nov 12 09:43:17 2013 -0800
Work around gold/15646.
A test for this was added in:
commit 40d22035a7fc239ac1e944b75a2e3ee9029d1b76
Date: Tue May 26 11:35:32 2020 +0200
[gdb/testsuite] Add test-case gold-gdb-index.exp
And the fix was tweaked in commit:
commit f030440daa989ae3dadc1fa4342cfa16d690db3c
Date: Thu May 28 17:26:22 2020 +0200
[gdb/symtab] Make gold index workaround more precise
The problem specifically called out in the bug report is that
namespaces can appear in multiple CUs, and that trying to complete
'ns::misspelled' would expand every CU containing namespace 'ns' due
to the duplicate 'ns' type symbols.
The work around that was added in 8943b874760d9cf3 was to ignore
duplicate global symbols when expanding entries from the index. In
commit f030440daa989ae3 this work around was restricted to only ignore
duplicate type entries. This restriction was required to allow the
earlier de-duplication patch aef36dee93bf194c to function correctly.
Now that I'm taking the work started in aef36dee93bf194c to its
logical conclusion, and allowing duplicate type entries, the work
around of ignoring duplicate global type symbols is no longer needed,
and can be removed.
The associated test for this, added in 40d22035a7fc239a, is also
removed in this commit.
To be clear: the performance issue mentioned in PR gold/15646 is now
back again. But my claim is that gold was right all along to include
the duplicate index entries, and any performance hit we see as a
result, though unfortunate, is just a consequence of doing it right.
That doesn't mean there's not room for optimisation and improvement in
the future, though I don't have any immediate ideas, or plans in this
area. It's just that we can't throw out a bunch of index entries that are
critical, and claim this as a performance optimisation.
I am seeing some failures with this patch when using the board file
dwarf5-fission-debug-types. These failures all have the error:
DWARF Error: wrong unit_type in unit header (is DW_UT_skeleton, should be DW_UT_type) [in module ....]
However, I ran the whole testsuite with this board, and this error
crops up often, so I don't think this is something specific to my
patch, and I'm choosing to ignore it.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=15646
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=15035
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Consider the C construct:
typedef struct foo
{
int a;
int b;
} foo;
GDB will see two types here, 'struct foo' and the typedef 'foo'.
However, if we use 'info types foo' we will see this:
File test.c:
18: struct foo;
At least that's what I see with current HEAD of master. However, it
is really just luck that we see the 'struct' here. See more below.
When searching for symbols matching 'foo' GDB ends up in the function
global_symbol_searcher::add_matching_symbols, where we consider all
possible matching symbols. This will include the 'struct foo' and the
typedef 'foo'. However, before a new symbols is added to the results,
we attempt to remove duplicates with this code:
/* Match, insert if not already in the results. */
symbol_search ss (block, sym);
if (result_set->find (ss) == result_set->end ())
result_set->insert (ss);
If a symbol is already present in result_set then it will not be added
a second time.
The symbol_search equality check is done using the function
symbol_search::compare_search_syms; this function does a number of
checks, but at the end, any two symbols that are in the same block
within the same file, with the same name, are considered the same,
even if the types of those symbols are different.
This makes sense in most cases; it usually wouldn't make sense to have
two symbols within a single block with different types. But the
'struct foo' and typedef 'foo' case is a bit of a strange one. Within
DWARF and GDB we consider both of these as just types. But in C
types and structure names live in different namespaces, and so we can
have both in the same block. I don't think that GDB should consider
these two as the same, especially if we consider something really
ill-advised like this:
struct foo
{
int a;
int b;
};
typedef int foo;
This is perfectly valid C code: 'struct foo' and the typedef 'foo' are
in different namespaces, and can be used within the same block. But
please, never write C code like this.
Given the above, I think that, when asked about 'foo', GDB should report
both 'struct foo' and the typedef 'foo'.
To do this I propose extending symbol_search::compare_search_syms such
that if two symbol_search objects are in the same block, within the
same file, and they have the same name, then if just one of them is a
typedef, the two objects will not be considered equal. The results
will be sorted by line number if the line numbers are different, or,
if the line numbers are the same, the non-typedef will be sorted
first. This means that for something like this:
typedef struct foo { int a; } foo;
We'll get an 'info types foo' result like:
File test.c:
18: struct foo;
18: typedef struct foo foo;
I mentioned earlier that it is really just luck that we see 'struct
foo'. I ran into this problem while working on another patch. When
testing with the 'debug-types' board file I was seeing the typedef
being reported rather than the struct. In "normal" DWARF, given the
'typedef struct foo { ...} foo;' construct, the compiler will usually
emit the struct definition first, and then the typedef definition. So
when GDB parses the DWARF it sees the struct first. It is the typedef
that becomes the duplicate which is not added to the results list.
But with the 'debug-types' board the compiler moves the struct
definition out to the .debug_types section. And GDB now parses the CU
containing the typedef first, and then expands the structure
definition from the separate section afterwards. As a result, it is
the structure that is now considered the duplicate, and the typedef is
the result that gets reported.
I think this is yet another motivation for this patch. Changes like
this (the use of the .debug_types section) shouldn't impact what results
GDB shows to the user.
There is an interesting update to the gdb.base/info-types.exp.tcl test
script. In this case the C results only needed to change to include
the typedef. The C++ results already included both the struct and the
typedef in the expected results. The reason for this is that C places
both the struct baz_t and the typedef for baz_t into the global block,
while C++ places the struct in the global block, and the typedef into
the static block. I have no idea why there's a difference in the
placement, but I'm choosing to believe the difference is correct. But
this explains why only the C results needed to change. If anything
this (I think) is yet another justification for this change; having C
not show the typedef in this case seems weird when the same source
code compiled as C++ does show the typedef.
Approved-By: Tom Tromey <tom@tromey.com>
|