Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch changes a few more uses of static_cast to use
checked_static_cast. In this patch, cast-to-references are converted
by moving the dereference outside of the cast, as checked_static_cast
only handles pointers.
|
|
I looked through all the uses of static_cast<... *> in gdb and
converted many of them to checked_static_cast.
I couldn't test a few of these changes.
|
|
This changes GDB to use frame_info_ptr instead of frame_info *
The substitution was done with multiple sequential `sed` commands:
sed 's/^struct frame_info;/class frame_info_ptr;/'
sed 's/struct frame_info \*/frame_info_ptr /g' - which left some
issues in a few files, that were manually fixed.
sed 's/\<frame_info \*/frame_info_ptr /g'
sed 's/frame_info_ptr $/frame_info_ptr/g' - used to remove whitespace
problems.
The changed files were then manually checked and some 'sed' changes
undone, some constructors and some gets were added, according to what
made sense, and what Tromey originally did
Co-Authored-By: Bruno Larsen <blarsen@redhat.com>
Approved-by: Tom Tomey <tom@tromey.com>
|
|
The class debug_names has two 'insert' overloads, but only one of them
is ever called externally, and it simply forwards to the other
implementation. It seems cleaner to me to have a single method, so
this patch merges the two.
|
|
After committing 8ba677d3560 ("[gdb/symtab] Don't complain about function
decls") I noticed that quite a bit of code in read_func_scope is used to decide
whether to issue the "cannot get low and high bounds for subprogram DIE at
$hex" complaint, which executes unnecessarily if we have the default
"set complaints 0".
Fix this by (NFC):
- factoring out new static function have_complaint from macro complaint, and
- using it to wrap the relevant code in read_func_scope.
Tested on x86_64-linux.
|
|
[ Requires "[gdb/symtab] Don't complain about inlined functions" as
submitted here (
https://sourceware.org/pipermail/gdb-patches/2022-September/191762.html ). ]
With the test-case included in this patch, we get:
...
(gdb) ptype main^M
During symbol reading: cannot get low and high bounds for subprogram DIE \
at 0xc1^M
type = int (void)^M
(gdb) FAIL: gdb.dwarf2/anon-ns-fn.exp: ptype main without complaints
...
The DIE causing the complaint is a function declaration:
...
<2><c1>: Abbrev Number: 3 (DW_TAG_subprogram)
<c2> DW_AT_name : foo
<c8> DW_AT_declaration : 1
...
which is referred to from the DIE representing the function definition:
...
<1><f4>: Abbrev Number: 7 (DW_TAG_subprogram)
<f5> DW_AT_specification: <0xc1>
<f9> DW_AT_low_pc : 0x4004c7
<101> DW_AT_high_pc : 0x7
...
which does contain the low and high bounds.
Fix this by not complaining about function declarations.
Tested on x86_64-linux.
|
|
With the test-case included in this patch, we get:
...
(gdb) ptype main^M
During symbol reading: cannot get low and high bounds for subprogram DIE \
at 0x113^M
During symbol reading: cannot get low and high bounds for subprogram DIE \
at 0x11f^M
type = int (void)^M
(gdb) FAIL: gdb.dwarf2/inline.exp: ptype main
...
The complaints are about foo, with DW_AT_inline == DW_INL_inlined:
...
<1><11f>: Abbrev Number: 6 (DW_TAG_subprogram)
<120> DW_AT_name : foo
<126> DW_AT_prototyped : 1
<126> DW_AT_type : <0x10c>
<12a> DW_AT_inline : 1 (inlined)
...
and foo2, with DW_AT_inline == DW_INL_declared_inlined:
...
<1><113>: Abbrev Number: 5 (DW_TAG_subprogram)
<114> DW_AT_name : foo2
<11a> DW_AT_prototyped : 1
<11a> DW_AT_type : <0x10c>
<11e> DW_AT_inline : 3 (declared as inline and inlined)
...
Fix this by not complaining about inlined functions.
Tested on x86_64-linux.
|
|
Add all_comp_units/all_type_units views on all_units.
Having the views allows us to:
- easily get the number of CUs or TUs in all_units, and
- easily access the nth CU or TU.
This minimizes the use of tu_stats.nr_tus.
Tested on x86_64-linux.
|
|
Mechanically rename all_comp_units to all_units:
...
$ sed -i 's/all_comp_units/all_units/' gdb/dwarf2/*
...
Tested on x86_64-linux.
|
|
When a GDB built with -D_GLIBCXX_DEBUG=1 reads a binary with a single
character name, we hit this assertion failure:
$ ./gdb -q --data-directory=data-directory -nx ./x
/usr/include/c++/12.1.0/string_view:239: constexpr const std::basic_string_view<_CharT, _Traits>::value_type& std::basic_string_view<_CharT, _Traits>::operator[](size_type) const [with _CharT = char; _Traits = std::char_traits<char>; const_reference = const char&; size_type = long unsigned int]: Assertion '__pos < this->_M_len' failed.
The backtrace:
#3 0x00007ffff6c0f002 in std::__glibcxx_assert_fail (file=<optimized out>, line=<optimized out>, function=<optimized out>, condition=<optimized out>) at /usr/src/debug/gcc/libstdc++-v3/src/c++11/debug.cc:60
#4 0x000055555da8a864 in std::basic_string_view<char, std::char_traits<char> >::operator[] (this=0x7fffffffcc30, __pos=1) at /usr/include/c++/12.1.0/string_view:239
#5 0x00005555609dcb88 in path_join[abi:cxx11](gdb::array_view<std::basic_string_view<char, std::char_traits<char> > const>) (paths=...) at /home/simark/src/binutils-gdb/gdbsupport/pathstuff.cc:203
#6 0x000055555e0443f4 in path_join<char const*, char const*> () at /home/simark/src/binutils-gdb/gdb/../gdbsupport/pathstuff.h:84
#7 0x00005555609dc336 in gdb_realpath_keepfile[abi:cxx11](char const*) (filename=0x6060000a8d40 "/home/simark/build/binutils-gdb-one-target/gdb/./x") at /home/simark/src/binutils-gdb/gdbsupport/pathstuff.cc:122
#8 0x000055555ebd2794 in exec_file_attach (filename=0x7fffffffe0f9 "./x", from_tty=1) at /home/simark/src/binutils-gdb/gdb/exec.c:471
#9 0x000055555f2b3fb0 in catch_command_errors (command=0x55555ebd1ab6 <exec_file_attach(char const*, int)>, arg=0x7fffffffe0f9 "./x", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:513
#10 0x000055555f2b7e11 in captured_main_1 (context=0x7fffffffdb60) at /home/simark/src/binutils-gdb/gdb/main.c:1209
#11 0x000055555f2b9144 in captured_main (data=0x7fffffffdb60) at /home/simark/src/binutils-gdb/gdb/main.c:1319
#12 0x000055555f2b9226 in gdb_main (args=0x7fffffffdb60) at /home/simark/src/binutils-gdb/gdb/main.c:1344
#13 0x000055555d938c5e in main (argc=5, argv=0x7fffffffdcf8) at /home/simark/src/binutils-gdb/gdb/gdb.c:32
The problem is this line in path_join:
gdb_assert (strlen (path) == 0 || !IS_ABSOLUTE_PATH (path));
... where `path` is "x". IS_ABSOLUTE_PATH eventually calls
HAS_DRIVE_SPEC_1:
#define HAS_DRIVE_SPEC_1(dos_based, f) \
((f)[0] && ((f)[1] == ':') && (dos_based))
This macro accesses indices 0 and 1 of the input string. However, `f`
is a string_view of length 1, so it's incorrect to try to access index
1. We know that the string_view's underlying object is a null-terminated
string, so in practice there's no harm. But as far as the string_view
is concerned, index 1 is considered out of bounds.
This patch makes the easy fix, that is to change the path_join parameter
from a vector of to a vector of `const char *`. Another solution would
be to introduce a non-standard gdb::cstring_view class, which would be a
view over a null-terminated string. With that class, it would be
correct to access index 1, it would yield the NUL character. If there
is interest in having this class (it has been mentioned a few times in
the past) I can do it and use it here.
This was found by running tests such as gdb.ada/arrayidx.exp, which
produce 1-char long filenames, so adding a new test is not necessary.
Change-Id: Ia41a16c7243614636b18754fd98a41860756f7af
|
|
Remove the macro, replace all uses with calls to type::length.
Change-Id: Ib9bdc954576860b21190886534c99103d6a47afb
|
|
Add the `length` and `set_length` methods on `struct type`, in order to remove
the `TYPE_LENGTH` macro. In this patch, the macro is changed to use the
getter, so all the call sites of the macro that are used as a setter are
changed to use the setter method directly. The next patch will remove the
macro completely.
Change-Id: Id1090244f15c9856969b9be5006aefe8d8897ca4
|
|
Remove the macro, replace all uses by calls to type::target_type.
Change-Id: Ie51d3e1e22f94130176d6abd723255282bb6d1ed
|
|
Add the `target_type` and `set_target_type` methods on `struct type`, in order
to remove the `TYPE_TARGET_TYPE` macro. In this patch, the macro is changed to
use the getter, so all the call sites of the macro that are used as a setter
are changed to use the setter method directly. The next patch will remove the
macro completely.
Change-Id: I85ce24d847763badd34fdee3e14b8c8c14cb3161
|
|
With the test-case included in this commit, we run into this FAIL:
...
(gdb) p var^M
During symbol reading: file index out of range^M
$1 = 0^M
(gdb) FAIL: gdb.dwarf2/dw2-no-code-cu.exp: p var with no complaints
...
This is a regression since commit 6d263fe46e0 ("Avoid bad breakpoints with
--gc-sections"), which contains this change in read_file_scope:
...
- handle_DW_AT_stmt_list (die, cu, fnd, lowpc);
+ if (lowpc != highpc)
+ handle_DW_AT_stmt_list (die, cu, fnd, lowpc);
...
The change intends to avoid a problem with a check in
lnp_state_machine::check_line_address, but also prevents the file and dir
tables from being read, which causes the complaint.
Fix the FAIL by reducing the scope of the "lowpc != highpc" condition to the
call to dwarf_decode_lines in handle_DW_AT_stmt_list.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29561
|
|
With the test-case included in the patch, we run into:
...
(gdb) info types -q std::nullptr_t^M
During symbol reading: unsupported tag: 'DW_TAG_unspecified_type'^M
^M
File /usr/include/c++/7/x86_64-suse-linux/bits/c++config.h:^M
2198: typedef decltype(nullptr) std::nullptr_t;^M
(gdb) FAIL: gdb.dwarf2/nullptr_t.exp: info types -q std::nullptr_t \
without complaint
...
Fix the complaint by handling DW_TAG_unspecified_type in new_symbol, and verify
in the test-case using "maint print symbols" that the symbol exists.
Tested on x86_64-linux, with gcc 7.5.0 and clang 13.0.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17271
|
|
The .gdb_index variant of commit d878bb39e41 ("[gdb/symtab] Support
.debug_names section with TUs in .debug_info").
Tested on x86_64-linux.
|
|
Currently, the test-case contained in this patch fails:
...
(gdb) p (int) foo ()^M
Invalid cast.^M
(gdb) FAIL: gdb.dwarf2/dw2-unspecified-type.exp: p (int) foo ()
...
because DW_TAG_unspecified_type is translated as void.
There's some code in read_unspecified_type that marks the type as stub, but
that's only active for ada:
...
if (cu->lang () == language_ada)
type->set_is_stub (true);
...
Fix this by:
- marking the type as a stub for all languages, and
- handling the stub return type case in call_function_by_hand_dummy.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29558
|
|
When running test-case gdb.cp/cpexprs-debug-types.exp on target board
cc-with-debug-names/gdb:debug_flags=-gdwarf-5, we get an executable with
a .debug_names section, but no .debug_types section. For dwarf-5, the TUs
are no longer put in a separate unit, but instead they're put in the
.debug_info section.
When loading the executable, the .debug_names section is silently ignored
because of this check in dwarf2_read_debug_names:
...
if (map->tu_count != 0)
{
/* We can only handle a single .debug_types when we have an
index. */
if (per_bfd->types.size () != 1)
return false;
...
which triggers because per_bfd->types.size () == 0.
The intention of the check is to make sure we don't have more that one
.debug_types section, as can happen in a object file (see PR12984):
...
$ grep "\.debug_types" 11.s
.section .debug_types,"G",@progbits,wt.75c042c23a9a07ee,comdat
.section .debug_types,"G",@progbits,wt.c59c413bf50a4607,comdat
...
Fix this by:
- changing the check condition to "per_bfd->types.size () > 1", and
- handling per_bfd->types.size () == 0.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29385
|
|
The following GDB behavior was also reported as a GDB bug in
https://sourceware.org/bugzilla/show_bug.cgi?id=28396
I will reiterate the problem a bit and give some more information here.
This patch closes the above mentioned bug.
The DWARF 5 standard 2.23 'Template Parameters' reads:
A template type parameter is represented by a debugging information
entry with the tag DW_TAG_template_type_parameter. A template value
parameter is represented by a debugging information entry with the tag
DW_TAG_template_value_parameter. The actual template parameter entries
appear in the same order as the corresponding template formal
parameter declarations in the source progam.
A type or value parameter entry may have a DW_AT_name attribute, whose
value is a null-terminated string containing the name of the
corresponding formal parameter.
So the DW_AT_name attribute for DW_TAG_template_type_parameter and
DW_TAG_template_value_parameter is optional.
Within GDB, creating a new symbol from some read DIE usually requires the
presence of a DW_AT_name for the DIE (an exception here is the case of
unnamed namespaces or the existence of a linkage name).
This patch makes the presence of the DW_AT_name for template value/type
tags optional, similar to the unnamed namespaces.
For unnamed namespaces dwarf2_name simply returns the constant string
CP_ANONYMOUS_NAMESPACE_STR '(anonymous namespace)'. For template tags a
case was added to the switch statement calling the
unnamed_template_tag_name helper. Within the scope of parent which
the template parameter is a child of, the helper counts the position
of the template tag within the unnamed template tags and returns
'<unnamedNUMBER>' where NUMBER is its position. This way we end up with
unique names within the respective scope of the function/class/struct
(these are the only currenltly supported template kinds within GDB and
usually the compilers) where we discovered the template tags in.
While I do not know of a way to bring GCC to emit template tags without
names there is one for clang/icpx. Consider the following example
template<typename A, typename B, typename C>
class Foo {};
template<typename, typename B, typename>
class Foo;
int main () {
Foo<double, int, float> f;
return 0;
}
The forward declaration for 'Foo' with the missing template type names
'A' and 'C' makes clang emit a bunch of template tags without names:
...
<2><43>: Abbrev Number: 3 (DW_TAG_variable)
<44> DW_AT_location : 2 byte block: 91 78 (DW_OP_fbreg: -8)
<47> DW_AT_name : (indirect string, offset: 0x63): f
<4b> DW_AT_decl_file : 1
<4c> DW_AT_decl_line : 8
<4d> DW_AT_type : <0x59>
...
<1><59>: Abbrev Number: 5 (DW_TAG_class_type)
<5a> DW_AT_calling_convention: 5 (pass by value)
<5b> DW_AT_name : (indirect string, offset: 0x74): Foo<double, int, float>
<5f> DW_AT_byte_size : 1
<60> DW_AT_decl_file : 1
<61> DW_AT_decl_line : 2
<2><62>: Abbrev Number: 6 (DW_TAG_template_type_param)
<63> DW_AT_type : <0x76>
<2><67>: Abbrev Number: 7 (DW_TAG_template_type_param)
<68> DW_AT_type : <0x52>
<6c> DW_AT_name : (indirect string, offset: 0x6c): B
<2><70>: Abbrev Number: 6 (DW_TAG_template_type_param)
<71> DW_AT_type : <0x7d>
...
Befor this patch, GDB would not create any symbols for the read template
tag DIEs and thus lose knowledge about them. Breaking at the return
statement and printing f's type would read
(gdb) ptype f
type = class Foo<double, int, float> [with B = int] {
<no data fields>
}
After this patch GDB does generate symbols from the DWARF (with their
artificial names:
(gdb) ptype f
type = class Foo<double, int, float> [with <unnamed0> = double, B = int,
<unnamed1> = float] {
<no data fields>
}
The same principle theoretically applies to template functions. Also
here, GDB would not record unnamed template TAGs but I know of no visual
way to trigger and test this changed behavior. Template functions do
not emit a '[with...]' list and their name generation also does not
suffer from template tags without names. GDB does not check whether or
not a template tag has a name in 'dwarf2_compute_name' and thus, the
names of the template functions are created independently of whether or
not the template TAGs have a DW_TAT_name attribute. A testcase has
been added in the gdb.dwarf2 for template classes and structs.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28396
|
|
When running the included test-case, we run into:
...
(gdb) break _start^M
read.h:309: internal-error: set_length: \
Assertion `m_length == length' failed.^M
...
The problem is that while there are two CUs:
...
$ readelf -wi debug-names-missing-cu | grep @
Compilation Unit @ offset 0x0:
Compilation Unit @ offset 0x2d:
...
the CU table in the .debug_names section only contains the first one:
...
CU table:
[ 0] 0x0
...
The incomplete CU table makes create_cus_from_debug_names_list set the size of
the CU at 0x0 to the actual size of both CUs combined.
This eventually leads to the assert, when we read the actual size from the CU
header.
While having an incomplete CU table in a .debug_names section is incorrect,
we need a better failure mode than asserting.
The easiest way to fix this is to set the length to 0 (meaning: unkown) in
create_cus_from_debug_names_list.
This makes the failure mode to accept the incomplete CU table, but to ignore
the missing CU.
It would be nice to instead reject the .debug_names index, and build a
complete CU list, but the point where we find this out is well after
dwarf2_initialize_objfile, so it looks rather intrusive to restart at that
point.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29453
|
|
Bug 29374 shows this crash:
$ ./gdb -nx --data-directory=data-directory -q -batch -ex "catch throw" -ex r -ex bt a.out
...
/home/simark/src/binutils-gdb/gdb/../gdbsupport/array-view.h:217: internal-error: copy: Assertion `dest.size () == src.size ()' failed.
The backtrace is:
#0 internal_error (file=0x5555606504c0 "/home/simark/src/binutils-gdb/gdb/../gdbsupport/array-view.h", line=217, fmt=0x55556064b700 "%s: Assertion `%s' failed.") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:51
#1 0x000055555d41c0bb in gdb::copy<unsigned char const, unsigned char> (src=..., dest=...) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/array-view.h:217
#2 0x000055555deef28c in dwarf_expr_context::fetch_result (this=0x7fffffffb830, type=0x621007a86830, subobj_type=0x621007a86830, subobj_offset=0, as_lval=false) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1040
#3 0x000055555def0015 in dwarf_expr_context::evaluate (this=0x7fffffffb830, addr=0x62f00004313e "0", len=1, as_lval=false, per_cu=0x60b000069550, frame=0x621007c9e910, addr_info=0x0, type=0x621007a86830, subobj_type=0x621007a86830, subobj_offset=0) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1091
#4 0x000055555e084327 in dwarf2_evaluate_loc_desc_full (type=0x621007a86830, frame=0x621007c9e910, data=0x62f00004313e "0", size=1, per_cu=0x60b000069550, per_objfile=0x613000006080, subobj_type=0x621007a86830, subobj_byte_offset=0, as_lval=false) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1485
#5 0x000055555e0849e2 in dwarf2_evaluate_loc_desc (type=0x621007a86830, frame=0x621007c9e910, data=0x62f00004313e "0", size=1, per_cu=0x60b000069550, per_objfile=0x613000006080, as_lval=false) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1529
#6 0x000055555e0828c6 in dwarf_entry_parameter_to_value (parameter=0x621007a96e58, deref_size=0x0, type=0x621007a86830, caller_frame=0x621007c9e910, per_cu=0x60b000069550, per_objfile=0x613000006080) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1235
#7 0x000055555e082f55 in value_of_dwarf_reg_entry (type=0x621007a86890, frame=0x621007acc510, kind=CALL_SITE_PARAMETER_DWARF_REG, kind_u=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1332
#8 0x000055555e083449 in value_of_dwarf_block_entry (type=0x621007a86890, frame=0x621007acc510, block=0x61e000033568 "T\004\205\001\240\004\004\243\001T\237\004\240\004\261\004\001T\004\261\004\304\005\004\243\001T\237\004\304\005\310\005\001T\004\310\005\311\005\004\243\001T\237", block_len=1) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1365
#9 0x000055555e094d40 in loclist_read_variable_at_entry (symbol=0x621007a99bd0, frame=0x621007acc510) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:3889
#10 0x000055555f5192e0 in read_frame_arg (fp_opts=..., sym=0x621007a99bd0, frame=0x621007acc510, argp=0x7fffffffbf20, entryargp=0x7fffffffbf60) at /home/simark/src/binutils-gdb/gdb/stack.c:559
#11 0x000055555f51c352 in print_frame_args (fp_opts=..., func=0x621007a99ad0, frame=0x621007acc510, num=-1, stream=0x6030000bad90) at /home/simark/src/binutils-gdb/gdb/stack.c:887
#12 0x000055555f521919 in print_frame (fp_opts=..., frame=0x621007acc510, print_level=1, print_what=LOCATION, print_args=1, sal=...) at /home/simark/src/binutils-gdb/gdb/stack.c:1390
#13 0x000055555f51f22e in print_frame_info (fp_opts=..., frame=0x621007acc510, print_level=1, print_what=LOCATION, print_args=1, set_current_sal=0) at /home/simark/src/binutils-gdb/gdb/stack.c:1116
#14 0x000055555f526c6d in backtrace_command_1 (fp_opts=..., bt_opts=..., count_exp=0x0, from_tty=0) at /home/simark/src/binutils-gdb/gdb/stack.c:2079
#15 0x000055555f527ae5 in backtrace_command (arg=0x0, from_tty=0) at /home/simark/src/binutils-gdb/gdb/stack.c:2198
The problem is that the type that gets passed down to
dwarf_expr_context::fetch_result (the type of a variable of which we're
trying to read the entry value) is a typedef whose size has never been
computed yet (check_typedef has never been called on it). As we get in
the DWARF_VALUE_STACK case (line 1028 of dwarf2/expr.c), the `len`
variable is therefore set to 0, instead of the actual type length. We
then call allocate_value on subobj_type, which does call check_typedef,
so the length of the typedef gets filled in at that point. We end up
passing to the copy function a source array view of length 0 and a
target array view of length 4, and the assertion fails.
Fix this by calling check_typedef on both type and subobj_type at the
beginning of fetch_result.
I tried writing a test for this using the DWARF assembler, but I haven't
succeeded. It's possible that we need to get into this specific code
path (value_of_dwarf_reg_entry and all) to manage to get to
dwarf_expr_context::fetch_result with a typedef type that has never been
resolved. In all my attempts, the typedef would always be resolved
already, so the bug wouldn't show up.
As a fallback, I made a gdb.dwarf2 test with compiler-generated .S
files. I don't particularly like those, but I think it's better than no
test. The .cpp source code is the smallest reproducer I am able to make
from the reproducer given in the bug (thanks to Pedro for suggestions on
how to minimize it further than I had). Since I tested on both amd64
and aarch64, I added versions of the test for these two architectures.
Change-Id: I182733ad08e34df40d8bcc47af72c482fabf4900
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29374
|
|
When loading the debug-names-duplicate-cu executable included in this
test-case, we run into:
...
(gdb) file debug-names-duplicate-cu^M
Reading symbols from debug-names-duplicate-cu...^M
src/gdb/dwarf2/read.c:2353: internal-error: read_addrmap_from_aranges: \
Assertion `insertpair.second' failed.^M
...
This assert was added in recent commit 75337cbc147 ("[gdb/symtab] Fix
.debug_aranges duplicate offset warning").
The assert triggers because the CU table in the .debug_names section contains
a duplicate:
...
Version 5
Augmentation string: 47 44 42 00 ("GDB")
CU table:
[ 0] 0x0
[ 1] 0x0
...
Fix this by rejecting the .debug_names index:
...
(gdb) file debug-names-duplicate-cu^M
Reading symbols from debug-names-duplicate-cu...^M
warning: Section .debug_names has duplicate entry in CU table, \
ignoring .debug_names.^M
...
Likewise for the case where the CU table is not sorted by increasing offset.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29436
|
|
In dwarf2_build_psymtabs_hard, we use a parallel_for_each to distribute CUs
over threads.
Ensuring a fair distribution over the worker threads and main thread in terms
of number of CUs might not be the most efficient way, given that CUs can vary
in size.
Fix this by using per_cu->get_length () as the task size.
I've used this experiment to verify the performance impact:
...
$ for n in $(seq 1 10); do \
time gdb -q -batch ~/firefox/libxul.so-93.0-1.1.x86_64.debug \
2>&1 \
| grep "real:"; \
done
...
and without the patch got:
...
real: 4.71
real: 4.88
real: 4.29
real: 4.30
real: 4.65
real: 4.27
real: 4.27
real: 4.27
real: 4.75
real: 4.41
...
and with the patch:
...
real: 3.68
real: 3.81
real: 3.80
real: 3.68
real: 3.75
real: 3.69
real: 3.69
real: 3.74
real: 3.67
real: 3.74
...
so that seems a reasonable improvement.
With parallel_for_each_debug set to true, we get some more detail about
the difference in behaviour. Without the patch we have:
...
Parallel for: n_elements: 2818
Parallel for: minimum elements per thread: 1
Parallel for: elts_per_thread: 704
Parallel for: elements on worker thread 0 : 705
Parallel for: elements on worker thread 1 : 705
Parallel for: elements on worker thread 2 : 704
Parallel for: elements on worker thread 3 : 0
Parallel for: elements on main thread : 704
...
and with the patch:
...
Parallel for: n_elements: 2818
Parallel for: total_size: 1483674865
Parallel for: size_per_thread: 370918716
Parallel for: elements on worker thread 0 : 752 (size: 371811790)
Parallel for: elements on worker thread 1 : 360 (size: 371509370)
Parallel for: elements on worker thread 2 : 1130 (size: 372681710)
Parallel for: elements on worker thread 3 : 0 (size: 0)
Parallel for: elements on main thread : 576 (size: 367671995)
...
Tested on x86_64-linux.
|
|
gdbarch implements its own registry-like approach. This patch changes
it to instead use registry.h. It's a rather large patch but largely
uninteresting -- it's mostly a straightforward conversion from the old
approach to the new one.
The main benefit of this change is that it introduces type safety to
the gdbarch registry. It also removes a bunch of code.
One possible drawback is that, previously, the gdbarch registry
differentiated between pre- and post-initialization setup. This
doesn't seem very important to me, though.
|
|
This changes struct objfile to use a gdb_bfd_ref_ptr. In addition to
removing some manual memory management, this fixes a use-after-free
that was introduced by the registry rewrite series. The issue there
was that, in some cases, registry shutdown could refer to memory that
had already been freed. This help fix the bug by delaying the
destruction of the BFD reference (and thus the per-bfd object) until
after the registry has been shut down.
|
|
The function read_addrmap_from_aranges contains code to issue a warning:
...
if (!insertpair.second)
{
warning (_("Section .debug_aranges in %s has duplicate "
"debug_info_offset %s, ignoring .debug_aranges."),
objfile_name (objfile), sect_offset_str (per_cu->sect_off));
return false;
}
...
but the warning is in fact activated when all_comp_units has duplicate
entries, which is very misleading.
Fix this by:
- adding a test-case that should trigger the warning,
- replacing the current implementation of the warning with an
assert that all_comp_units should not contain duplicates, and
- properly re-implementing the warning, such that it is triggered
by the test-case.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29381
|
|
Printing macros defined in the main source file doesn't work reliably
using various toolchains, especially when DWARF 5 is used. For example,
using the binaries produced by either of these commands:
$ gcc --version
gcc (GCC) 11.2.0
$ ld --version
GNU ld (GNU Binutils) 2.38
$ gcc test.c -g3 -gdwarf-5
$ clang --version
clang version 13.0.1
$ clang test.c -gdwarf-5 -fdebug-macro
I get:
$ ./gdb -nx -q --data-directory=data-directory a.out
(gdb) start
Temporary breakpoint 1 at 0x111d: file test.c, line 6.
Starting program: /home/simark/build/binutils-gdb-one-target/gdb/a.out
Temporary breakpoint 1, main () at test.c:6
6 return ZERO;
(gdb) p ZERO
No symbol "ZERO" in current context.
When starting to investigate this (taking the gcc-compiled binary as an
example), we see that GDB fails to look up the appropriate macro scope
when evaluating the expression. While stopped in
macro_lookup_inclusion:
(top-gdb) p name
$1 = 0x62100011a980 "test.c"
(top-gdb) p source.filename
$2 = 0x62100011a9a0 "/home/simark/build/binutils-gdb-one-target/gdb/test.c"
`source` is the macro_source_file that we would expect GDB to find.
`name` comes from the symtab::filename field of the symtab we are
stopped in. GDB doesn't find the appropriate macro_source_file because
the name of the macro_source_file doesn't match exactly the name of the
symtab.
The name of the main symtab comes from the compilation unit's
DW_AT_name, passed to the buildsym_compunit's constructor:
https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/read.c#L10627-10630
The contents of DW_AT_name, in this case, is "test.c". It is typically
(what I witnessed all compilers do) the same string that was passed to
the compiler on the command-line.
The name of the macro_source_file comes from the line number program
header's file table, from the call to the line_header::file_file_name
method:
https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/macro.c#L54-65
line_header::file_file_name prepends the directory path that the file
entry refers to, in the file table (if the file name is not already
absolute). In this case, the file name is "test.c", appended to the
directory "/home/simark/build/binutils-gdb-one-target/gdb".
Because the symtab's name is not created the same way as the
macro_source_file's name is created, we get this mismatch. GDB fails to
find the appropriate macro scope for the symtab, and we can't print
macros when stopped in that symtab.
To make this work, we must ensure that paths produced in these two ways
end up identical. This can be tricky because of the different ways a
path can be passed to the compiler by the user.
Another thing to consider is that while the main symtab's name (or
subfile, before it becomes a symtab) is created using DW_AT_name, the
main symtab is also referred to using its entry in the line table
header's file table, when processing the line table. We must therefore
ensure that the same name is produced in both cases, so that a call to
"start_subfile" for the main subfile will correctly find the
already-created subfile, created by buildsym_compunit's constructor. If
we fail to do that, things still often work, because of a fallback: the
watch_main_source_file_lossage method. This method determines that if
the main subfile has no symbols but there exists another subfile with
the same basename (e.g. "test.c") that does have symbols, it's probably
because there was some filename mismatch. So it replaces the main
subfile with that other subfile. I think that heuristic is useful as a
last effort to work around any bug or bad debug info, but I don't think
we should design things such as to rely on it. It's a heuristic, it can
get things wrong. So in my search for a fix, it is important that given
some good debug info, we don't end up relying on that for things to
work.
A first attempt at fixing this was to try to prepend the compilation
directory here or not prepend it there. In practice, because of all the
possible combinations of debug info the compilers produce, it was not
possible to get something that would produce reliable, consistent paths.
Another attempt at fixing this was to make both macro_source_file
objects and symtab objects use the most complete form of path possible.
That means to prepend directories at least until we get an absolute
path. In theory, we should end up with the same path in all cases.
This generally worked, but because it changed the symtab names, it
resulted in user-visible changes (for example, paths to source files in
Breakpoint hit messages becoming always absolute). I didn't find this
very good, first because there is a "set filename-display" setting that
lets the user control how they want the paths to be displayed, and that
would suddenly make this setting completely ineffective (although even
today, it is a bit dependent on the debug info). Second, it would
require a good amount of testsuite tweaks to make tests accept these
suddenly absolute paths.
This new patch is a slight variation of that: it adds a new field called
"filename_for_id" in struct symtab and struct subfile, next to the
existing filename field. The goal is to separate the internal ids used
for finding objects from the names used for presentation. This field is
used for identifying subfiles, symtabs and macro_source_files
internally. For DWARF symtabs, this new field is meant to contain the
"most complete possible" path, as discussed above. So for a given file,
it must always be in the same form, everywhere. The existing
symtab::filename field remains the one used for printing to the user, so
there shouldn't be any change in how paths are printed.
Changes in the core symtab files are:
- Add "name_for_id" and "filename_for_id" fields to "struct subfile"
and "struct symtab", next to existing "name" and "filename" fields.
- Make buildsym_compunit::buildsym_compunit and
buildsym_compunit::start_subfile accept a "name_for_id" parameter
next to the existing "name" ones.
- Make buildsym_compunit::start_subfile use "name_for_id" for looking
up existing subfiles. This is the key thing for making calls
to start_subfile for the main source file look up the existing
subfile successfully, and avoid relying on
watch_main_source_file_lossage.
- Make sal_macro_scope pass "filename_for_id", rather than "filename",
to macro_lookup_inclusion. This is the key thing to making the
lookup work and macro printing work.
Changes in the DWARF files are:
- Make line_header::file_file_name return the "most complete possible"
name. The only pre-existing user of this method is the macro code,
to give the macro_source_file objects their name. And we now want
them to have this "most complete possible" name, which will match the
corresponding symtab's "filename_for_id".
- Make dwarf2_cu::start_compunit_symtab pass the "most complete
possible" name for the main symtab's "filename_for_id". In this
context, where the info comes from the compilation unit's DW_AT_name
/ DW_AT_comp_dir, it means prepending DW_AT_comp_dir to DW_AT_name if
DW_AT_name is not already absolute.
- Change dwarf2_start_subfile to build a name_for_id for the subfile
being started. The simplest way is to re-use
line_header::file_file_name, since the callers always have a
file_entry handy. This ensures that it will get the exact same path
representation as the macro code does, for the same file (since it
also uses line_header::file_file_name).
- Update calls to allocate_symtab to pass the "name_for_id" from the
subfile.
Tests exercising all this are added by the following patch.
Of all the cases I tried, the only one I found that ends up relying on
watch_main_source_file_lossage is the following one:
$ clang --version
clang version 13.0.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang ./test.c -g3 -O0 -gdwarf-4
$ ./gdb -nx --data-directory=data-directory -q -readnow -iex "set debug symtab-create 1" a.out
...
[symtab-create] start_subfile: name = test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/test.c
[symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c
[symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c
[symtab-create] start_subfile: found existing symtab with name_for_id /home/simark/build/binutils-gdb-one-target/gdb/./test.c (/home/simark/build/binutils-gdb-one-target/gdb/./test.c)
[symtab-create] watch_main_source_file_lossage: using subfile ./test.c as the main subfile
As we can see, there are two forms used for "test.c", one with a "." and
one without. This comes from the fact that the compilation unit DIE
contains:
DW_AT_name ("test.c")
DW_AT_comp_dir ("/home/simark/build/binutils-gdb-one-target/gdb")
without a ".", and the line table for that file contains:
include_directories[ 1] = "."
file_names[ 1]:
name: "test.c"
dir_index: 1
When assembling the filename from that entry, we get a ".".
It is a bit unexpected that the main filename resulting from the line
table header does not match exactly the name in the compilation unit.
For instance, gcc uses "./test.c" for the DW_AT_name, which gives
identical paths in the compilation unit and in the line table header.
Similarly, with DWARF 5:
$ clang ./test.c -g3 -O0 -gdwarf-5
clang create two entries that refer to the same file but are of in a different
form.
include_directories[ 0] = "/home/simark/build/binutils-gdb-one-target/gdb"
include_directories[ 1] = "."
file_names[ 0]:
name: "test.c"
dir_index: 0
file_names[ 1]:
name: "test.c"
dir_index: 1
The first file name produces a path without a "." while the second does.
This is not caught by watch_main_source_file_lossage, because of
dwarf_decode_lines that creates a symtab for each file entry in the line
table. It therefore appears as "non-empty" to
watch_main_source_file_lossage. This results in two symtabs:
(gdb) maintenance info symtabs
{ objfile /home/simark/build/binutils-gdb-one-target/gdb/a.out ((struct objfile *) 0x613000005d00)
{ ((struct compunit_symtab *) 0x62100011aca0)
debugformat DWARF 5
producer clang version 13.0.1
name test.c
dirname /home/simark/build/binutils-gdb-one-target/gdb
blockvector ((struct blockvector *) 0x621000129ec0)
user ((struct compunit_symtab *) (null))
{ symtab test.c ((struct symtab *) 0x62100011ad20)
fullname (null)
linetable ((struct linetable *) 0x0)
}
{ symtab ./test.c ((struct symtab *) 0x62100011ad60)
fullname (null)
linetable ((struct linetable *) 0x621000129ef0)
}
}
}
I am not sure what is the consequence of this, but this is also what
happens before my patch, so I think its acceptable to leave it as-is.
To handle these two cases nicely, I think we will need a function that
removes the unnecessary "." from path names, something that can be done
later.
Finally, I made a change in find_file_and_directory is necessary to
avoid breaking test
gdb.dwarf2/dw2-compdir-oldgcc.exp: info source gcc42
Without that change, we would get:
(gdb) info source
Current source file is /dir/d/dw2-compdir-oldgcc42.S
Compilation directory is /dir/d
whereas the expected result is:
(gdb) info source
Current source file is dw2-compdir-oldgcc42.S
Compilation directory is /dir/d
This test was added here:
https://sourceware.org/pipermail/gdb-patches/2012-November/098144.html
Long story short, GCC <= 4.2 apparently had a bug where it would
generate a DW_AT_name with a full path ("/dir/d/dw2-compdir-oldgcc42.S")
and no DW_AT_comp_dir. The line table has one entry with filename
"dw2-compdir-oldgcc42.S", which refers to directory 0. Directory 0
normally refers to the compilation unit's comp dir, but it is
non-existent in this case.
This caused some symtab lookup problems, and to work around them, some
workaround was added, which today reads as:
if (res.get_comp_dir () == nullptr
&& producer_is_gcc_lt_4_3 (cu)
&& res.get_name () != nullptr
&& IS_ABSOLUTE_PATH (res.get_name ()))
res.set_comp_dir (ldirname (res.get_name ()));
Source: https://gitlab.com/gnutools/binutils-gdb/-/blob/6577f365ebdee7dda71cb996efa29d3714cbccd0/gdb/dwarf2/read.c#L9428-9432
It extracts an artificial DW_AT_comp_dir from DW_AT_name, if there is no
DW_AT_comp_dir and DW_AT_name is absolute.
Prior to my patch, a subfile would get created with filename
"/dir/d/dw2-compdir-oldgcc42.S", from DW_AT_name, and another would get
created with filename "dw2-compdir-oldgcc42.S" from the line table's
file table. Then watch_main_source_file_lossage would kick in and merge
them, keeping only the "dw2-compdir-oldgcc42.S" one:
[symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name dw2-compdir-oldgcc42.S (dw2-compdir-oldgcc42.S)
[symtab-create] watch_main_source_file_lossage: using subfile dw2-compdir-oldgcc42.S as the main subfile
And so "info source" would show "dw2-compdir-oldgcc42.S" as the
filename.
With my patch applied, but without the change in
find_file_and_directory, both DW_AT_name and the line table would try to
start a subfile with the same filename_for_id, and there was no need for
watch_main_source_file_lossage - which is what we want:
[symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S)
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S)
But since the one with name == "/dir/d/dw2-compdir-oldgcc42.S", coming
from DW_AT_name, gets created first, it wins, and the symtab ends up
with "/dir/d/dw2-compdir-oldgcc42.S" as the name, "info source" shows
"/dir/d/dw2-compdir-oldgcc42.S" and the test breaks.
This is not wrong per-se, after all DW_AT_name is
"/dir/d/dw2-compdir-oldgcc42.S", so it wouldn't be wrong to report the
current source file as "/dir/d/dw2-compdir-oldgcc42.S". If you compile
a file passing "/an/absolute/path.c", DW_AT_name typically contains (at
least with GCC) "/an/absolute/path.c" and GDB tells you that the source
file is "/an/absolute/path.c". But we can also keep the existing
behavior fairly easily with a little change in find_file_and_directory.
When extracting an artificial DW_AT_comp_dir from DW_AT_name, we now
modify the name to just keep the file part. The result is coherent with
what compilers do when you compile a file by just passing its filename
("gcc path.c -g"):
DW_AT_name ("path.c")
DW_AT_comp_dir ("/home/simark/build/binutils-gdb-one-target/gdb")
With this change, filename_for_id is still the full name,
"/dir/d/dw2-compdir-oldgcc42.S", but the filename of the subfile /
symtab (what ends up shown by "info source") is just
"dw2-compdir-oldgcc42.S", and that makes the test happy.
Change-Id: I8b5cc4bb3052afdb172ee815c051187290566307
|
|
In the following patch, there will be some callers of file_file_name
that will already have access to the file_entry object for which they
want the file name. It would be inefficient to have them pass an index,
only for line_header::file_file_name to re-lookup the same file_entry
object. Change line_header::file_file_name to accept a file_entry
object reference, instead of an index to look up.
I think this change makes sense in any case. Callers that have an index
can first obtain a file_entry using line_header::file_name_at or
line_header::file_names.
When passing a file_entry object, we can assume that the file_entry's
index is valid, unlike when passing an index. So, push the special case
about an invalid index to the sole current caller of file_file_name,
macro_start_file. I think that error belongs there anyway, since it
specifically talks about "bad file number in macro information".
This requires recording the file index in the file_entry structure, so
add that.
Change-Id: Ic6e44c407539d92b7863d7ba82405ade17f384ad
|
|
The following patch changes line_header::file_file_name to prepend the
compilation directory to the file name, if needed. For that, the line
header needs to know about the compilation directory. Prepare for that
by adding a constructor that takes it as a parameter, and passing the
value down everywhere needed. Add a second constructor for the special
case of building a line_header for doing a hash table lookup, since that
case doesn't require a compilation directory value.
Change-Id: Iba3ba0293e4e2d13a64b257cf9a3094684d54330
|
|
This rewrites registry.h, removing all the macros and replacing it
with relatively ordinary template classes. The result is less code
than the previous setup. It replaces large macros with a relatively
straightforward C++ class, and now manages its own cleanup.
The existing type-safe "key" class is replaced with the equivalent
template class. This approach ended up requiring relatively few
changes to the users of the registry code in gdb -- code using the key
system just required a small change to the key's declaration.
All existing users of the old C-like API are now converted to use the
type-safe API. This mostly involved changing explicit deletion
functions to be an operator() in a deleter class.
The old "save/free" two-phase process is removed, and replaced with a
single "free" phase. No existing code used both phases.
The old "free" callbacks took a parameter for the enclosing container
object. However, this wasn't truly needed and is removed here as
well.
|
|
When running test-case gdb.cp/cpexprs-debug-types.exp with target board
cc-with-debug-names on a system with gcc 12.1.1 (defaulting to dwarf 5), I
run into:
...
(gdb) file cpexprs-debug-types^M
Reading symbols from cpexprs-debug-types...^M
warning: Section .debug_aranges in cpexprs-debug-types has duplicate \
debug_info_offset 0x0, ignoring .debug_aranges.^M
gdb/dwarf2/read.h:309: internal-error: set_length: \
Assertion `m_length == length' failed.^M
...
The exec contains a .debug_names section, which gdb rejects due to
.debug_names containing a list of TUs, while the exec doesn't contain a
.debug_types section (which is what you'd expect for dwarf 4).
Gdb then falls back onto the cooked index, which calls create_all_comp_units
to create all_comp_units. However, the failed index reading left some
elements in all_comp_units, so we end up with duplicates in all_comp_units,
which causes the misleading complaint and the assert.
Fix this by:
- asserting at the start of create_all_comp_units that all_comp_units is empty,
as we do in create_cus_from_index and create_cus_from_debug_names, and
- cleaning up all_comp_units when failing in dwarf2_read_debug_names.
Add a similar cleanup in dwarf2_read_gdb_index.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29381
|
|
I noticed this code in dw2_debug_names_iterator::next:
...
case DW_IDX_compile_unit:
/* Don't crash on bad data. */
if (ull >= per_bfd->all_comp_units.size ())
{
complaint (_(".debug_names entry has bad CU index %s"
" [in module %s]"),
pulongest (ull),
objfile_name (objfile));
continue;
}
per_cu = per_bfd->get_cu (ull);
break;
...
This code used to DTRT, before we started keeping both CUs and TUs in
all_comp_units.
Fix by using "per_bfd->all_comp_units.size () - per_bfd->tu_stats.nr_tus"
instead.
It's hard to produce a test-case for this, but let's try at least to trigger
the complaint somehow. We start out by creating an exec with .debug_types and
.debug_names:
...
$ gcc -g ~/hello.c -fdebug-types-section
$ gdb-add-index -dwarf-5 a.out
...
and verify that we don't see any complaints:
...
$ gdb -q -batch -iex "set complaints 100" ./a.out
...
We look at the CU and TU table using readelf -w and conclude that we have
nr_cus == 6 and nr_tus == 1.
Now override ull in dw2_debug_names_iterator::next for the DW_IDX_compile_unit
case to 6, and we have:
...
$ gdb -q -batch -iex "set complaints 100" ./a.out
During symbol reading: .debug_names entry has bad CU index 6 [in module a.out]
...
After this, it still crashes because this code in
dw2_debug_names_iterator::next:
...
/* Skip if already read in. */
if (m_per_objfile->symtab_set_p (per_cu))
goto again;
...
is called with per_cu == nullptr.
Fix this by skipping the entry if per_cu == nullptr.
Now revert the fix and observe that the complaint disappears, so we've
confirmed that the fix is required.
A somewhat similar issue for .gdb_index in dw2_symtab_iter_next has been filed
as PR29367.
Tested on x86_64-linux, with native and target board cc-with-debug-names.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29336
|
|
When building gdb with -fsanitize-threads and running test-case
gdb.ada/char_enum_unicode.exp, I run into:
...
WARNING: ThreadSanitizer: data race (pid=21301)^M
Write of size 8 at 0x7b2000008080 by main thread:^M
#0 free <null> (libtsan.so.2+0x4c5e2)^M
#1 _dl_close_worker <null> (ld-linux-x86-64.so.2+0x4b7b)^M
#2 convert_between_encodings() charset.c:584^M
...
#21 cooked_index_functions::expand_symtabs_matching() read.c:18606
...
This is fixed by making cooked_index_functions::expand_symtabs_matching wait
for the cooked index finalization to be done.
Tested on x86_64-linux.
https://sourceware.org/bugzilla/show_bug.cgi?id=29311
https://sourceware.org/bugzilla/show_bug.cgi?id=29286
|
|
When building gdb with -fsanitize=thread and running test-case
gdb.dwarf2/inlined_subroutine-inheritance.exp, we run into a data race
between:
...
Read of size 1 at 0x7b2000003010 by thread T4:
#0 packed<language, 1ul>::operator language() const packed.h:54
#1 dwarf2_per_cu_data::set_lang(language) read.h:363
...
and:
...
Previous write of size 1 at 0x7b2000003010 by main thread:
#0 dwarf2_per_cu_data::set_lang(language) read.h:365
...
Fix this by making per_cu->m_lang atomic.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29286
|
|
With gdb with -fsanitize=thread and test-case gdb.ada/array_bounds.exp, I run
into a data race between:
...
Read of size 1 at 0x7b2000025f0f by main thread:
#0 packed<dwarf_unit_type, 1ul>::operator dwarf_unit_type() const packed.h:54
#1 dwarf2_per_cu_data::set_unit_type(dwarf_unit_type) read.h:339
...
and:
...
Previous write of size 1 at 0x7b2000025f0f by thread T3:
#0 dwarf2_per_cu_data::set_unit_type(dwarf_unit_type) read.h:341
...
Fix this by making per_cu->unit_type atomic.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29286
|
|
We have in per_cu->set_lang this comment:
...
void set_lang (enum language lang)
{
/* We'd like to be more strict here, similar to what is done in
set_unit_type, but currently a partial unit can go from unknown to
minimal to ada to c. */
...
Fix this by not setting m_lang for partial units.
This requires us to move the m_unit_type initialization to ensure that
m_unit_type is initialized before per_cu->m_lang.
Tested on x86_64-linux, with native and target board cc-with-dwz-m.
|
|
The cu->per_cu->lang field was added to carry information from the initial
partial symtabs phase to the symtab expansion phase, for the benefit of a
particular optimization in process_imported_unit_die.
Other uses have been added, but since the first phase now has been
parallelized, those have become problematic and sources of race conditions.
Fix this by adding dwarf2_cu::lang () and using it where we can to replace
cu->per_cu->lang () with cu->lang ().
Also assert in dwarf2_cu::lang () that we're not returning language_unknown.
Tested on x86_64-linux.
|
|
When building gdb with -fsanitize=thread and gcc 12, and running test-case
gdb.dwarf2/dwz.exp, we run into a data race between:
...
Read of size 1 at 0x7b200000300d by thread T2:^M
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6164 \
(gdb+0x82ec95)^M
...
and:
...
Previous write of size 1 at 0x7b200000300d by main thread:^M
#0 prepare_one_comp_unit gdb/dwarf2/read.c:23588 (gdb+0x86f973)^M
...
In other words, between:
...
if (this_cu->reading_dwo_directly)
...
and:
...
cu->per_cu->lang = pretend_language;
...
Likewise, we run into a data race between:
...
Write of size 1 at 0x7b200000300e by thread T4:
#0 process_psymtab_comp_unit gdb/dwarf2/read.c:6789 (gdb+0x830720)
...
and:
...
Previous read of size 1 at 0x7b200000300e by main thread:
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6164 \
(gdb+0x82edab)
...
In other words, between:
...
this_cu->unit_type = DW_UT_partial;
...
and:
...
if (this_cu->reading_dwo_directly)
...
Likewise for the write to addresses_seen in cooked_indexer::check_bounds and a
read from is_dwz in dwarf2_find_containing_comp_unit for test-case
gdb.dwarf2/dw2-dir-file-name.exp and target board cc-with-dwz-m.
The problem is that the written fields are part of the same memory location as
the read fields, so executing a read and write in different threads is
undefined behavour.
Making the written fields separate memory locations, using the new
struct packed template fixes this.
The set of fields has been established experimentally to be the
minimal set to get rid of this type of -fsanitize=thread errors, but
more fields might require the same treatment.
Looking at the properties of the lang field, unlike dwarf_version it's
not available in the unit header, so it will be set the first time
during the parallel cooked index reading. The same holds for
unit_type, and likewise for addresses_seen.
dwarf2_per_cu_data::addresses_seen is moved so that the bitfields that
currently follow it can be merged in the same memory location as the
bitfields that currently precede it, for better packing.
Tested on x86_64-linux.
Co-Authored-By: Pedro Alves <pedro@palves.net>
Change-Id: Ifa94f0a2cebfae5e8f6ddc73265f05e7fd9e1532
|
|
With gdb build with -fsanitize=thread and test-case
gdb.dwarf2/dw4-sig-types.exp and target board cc-with-dwz-m we run into a data
race between:
...
Write of size 4 at 0x7b2800002268 by thread T4:^M
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6236 \
(gdb+0x82f525)^M
...
and this read:
...
Previous read of size 4 at 0x7b2800002268 by thread T1:^M
#0 dwarf2_find_containing_comp_unit gdb/dwarf2/read.c:23444 \
(gdb+0x86e22e)^M
...
In other words, between this write:
...
this_cu->length = cu->header.get_length ();
...
and this read:
...
&& mid_cu->sect_off + mid_cu->length > sect_off))
...
of per_cu->length.
Fix this similar to the per_cu->dwarf_version case, by only setting it if
needed, and otherwise verifying that the same value is used. [ Note that the
same code is already present in the other cutu_reader constructor. ]
Move this logic into into a member function set_length to make sure it's used
consistenly, and make the field private in order to enforce access through the
member functions, and rename it to m_length.
This exposes (running test-case gdb.dwarf2/fission-reread.exp) that in
fill_in_sig_entry_from_dwo_entry, the m_length field is overwritten.
For now, allow for that exception.
While we're at it, make sure that the length is set before read.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29344
|
|
There's a spot in read_comp_units_from_section where we explictly use
initial_length_size to get the total length:
...
this_cu->length = cu_header.length + cu_header.initial_length_size;
...
Instead, just use cu_header.get_length ().
Tested on x86_64-linux.
|
|
When running test-case gdb.dwarf2/dw2-symtab-includes.exp with target board
cc-with-debug-names, I run into:
...
(gdb) maint expand-symtab dw2-symtab-includes.h^M
src/gdb/dwarf2/read.h:311: internal-error: unit_type: \
Assertion `m_unit_type != 0' failed.^M
A problem internal to GDB has been detected,^M
further debugging may prove unreliable.^M
----- Backtrace -----^M
FAIL: gdb.dwarf2/dw2-symtab-includes.exp: maint expand-symtab \
dw2-symtab-includes.h (GDB internal error)
...
The assert was recently added in commit 2c474c46943 ("[gdb/symtab] Add get/set
functions for per_cu->lang/unit_type").
The assert is triggered here:
...
/* We're importing a C++ compilation unit with tag DW_TAG_compile_unit
into another compilation unit, at root level. Regard this as a hint,
and ignore it. */
if (die->parent && die->parent->parent == NULL
&& per_cu->unit_type () == DW_UT_compile
&& per_cu->lang () == language_cplus)
return;
...
We're trying to access unit_type / lang which hasn't been set yet.
Normally, these are set during cooked index creation, but that's not the case
when using an index (or using -readnow).
In other words, this lto binary reading speed optimization only works in the
normal use case.
IWBN to have this working in all use cases, but for now, allow lang () and
unit_type () to return language_unknown and 0 here.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29321
|
|
When running test-case gdb.cp/cpexprs-debug-types.exp with target board
cc-with-debug-names, I run into:
...
(gdb) print base::operator new^M
^M
^M
Fatal signal: Segmentation fault^M
----- Backtrace -----^M
0x57ea46 gdb_internal_backtrace_1^M
/home/vries/gdb_versions/devel/src/gdb/bt-utils.c:122^M
0x57eae9 _Z22gdb_internal_backtracev^M
/home/vries/gdb_versions/devel/src/gdb/bt-utils.c:168^M
0x75b8ad handle_fatal_signal^M
/home/vries/gdb_versions/devel/src/gdb/event-top.c:946^M
0x75ba19 handle_sigsegv^M
/home/vries/gdb_versions/devel/src/gdb/event-top.c:1019^M
0x7f795f46a8bf ???^M
0x6d3cb1 _ZNK18dwarf2_per_objfile12symtab_set_pEPK18dwarf2_per_cu_data^M
/home/vries/gdb_versions/devel/src/gdb/dwarf2/read.c:1515^M
...
The problem is in this piece of code in dw2_debug_names_iterator::next:
...
case DW_IDX_type_unit:
/* Don't crash on bad data. */
if (ull >= per_bfd->tu_stats.nr_tus)
{
complaint (_(".debug_names entry has bad TU index %s"
" [in module %s]"),
pulongest (ull),
objfile_name (objfile));
continue;
}
per_cu = per_bfd->get_cu (ull + per_bfd->tu_stats.nr_tus);
break;
...
The all_comp_units vector (which get_cu accesses) contains both CUs and TUs,
with CUs first.
So to get the nth TU we need the element at "nr_cus + n", but
the code uses "nr_tus + n" instead.
Fix this by using "nr_cus + n".
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29334
|
|
The dwarf2_per_cu_data fields lang and unit_type both have a dont-know
initial value (respectively language_unknown and (dwarf_unit_type)0), which
allows us to add certain checks, f.i. checking that that a field is not read
before written.
Add get/set member functions for the two fields as a convenient location to
add such checks, make the fields private to enforce using the member
functions, and add the m_ prefix.
Tested on x86_64-linux.
|
|
When building gdb with -fsanitize=thread and gcc 12, and running test-case
gdb.dwarf2/dwz.exp, we run into a data race between thread T2 and the main
thread in the same write:
...
Write of size 1 at 0x7b200000300c:^M
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6252 \
(gdb+0x82f3b3)^M
...
which is here:
...
this_cu->dwarf_version = cu->header.version;
...
Both writes are called from the parallel for in dwarf2_build_psymtabs_hard,
this one directly:
...
#1 process_psymtab_comp_unit gdb/dwarf2/read.c:6774 (gdb+0x8304d7)^M
#2 operator() gdb/dwarf2/read.c:7098 (gdb+0x8317be)^M
#3 operator() gdbsupport/parallel-for.h:163 (gdb+0x872380)^M
...
and this via the PU import:
...
#1 cooked_indexer::ensure_cu_exists(cutu_reader*, dwarf2_per_objfile*, \
sect_offset, bool, bool) gdb/dwarf2/read.c:17964 (gdb+0x85c43b)^M
#2 cooked_indexer::index_imported_unit(cutu_reader*, unsigned char const*, \
abbrev_info const*) gdb/dwarf2/read.c:18248 (gdb+0x85d8ff)^M
#3 cooked_indexer::index_dies(cutu_reader*, unsigned char const*, \
cooked_index_entry const*, bool) gdb/dwarf2/read.c:18302 (gdb+0x85dcdb)^M
#4 cooked_indexer::make_index(cutu_reader*) gdb/dwarf2/read.c:18443 \
(gdb+0x85e68a)^M
#5 process_psymtab_comp_unit gdb/dwarf2/read.c:6812 (gdb+0x830879)^M
#6 operator() gdb/dwarf2/read.c:7098 (gdb+0x8317be)^M
#7 operator() gdbsupport/parallel-for.h:171 (gdb+0x8723e2)^M
...
Fix this by setting the field earlier, in read_comp_units_from_section.
The write in cutu_reader::cutu_reader() is still needed, in case
read_comp_units_from_section is not used (run the test-case with say, target
board cc-with-gdb-index).
Make the write conditional, such that it doesn't trigger if the field is
already set by read_comp_units_from_section. Instead, verify that the
field already has the value that we're trying to set it to.
Move this logic into into a member function set_version (in analogy to the
already present member function version) to make sure it's used consistenly,
and make the field private in order to enforce access through the member
functions, and rename it to m_dwarf_version.
While we're at it, make sure that the version is set before read, to avoid
say returning true for "per_cu.version () < 5" if "per_cu.version () == 0".
Tested on x86_64-linux.
|
|
When running test-case gdb.dwarf2/fission-mix.exp with target board dwarf64
and gcc-12 (defaulting to DWARF5), I run into:
...
(gdb) break func2^M
Offset from DW_FORM_GNU_str_index or DW_FORM_strx pointing outside of \
.debug_str.dwo section in CU at offset 0x0 [in module fission-mix]^M
(gdb) FAIL: gdb.dwarf2/fission-mix.exp: break func2
...
The .debug_str_offsets section has version 5, so as per the standard it has
it's own header, with initial length and version:
...
Contents of the .debug_str_offsets.dwo section (loaded from fission-mix2.dwo):
Length: 0x1c
Version: 0x5
Index Offset [String]
0 0 build/gdb/testsuite
1 33 GNU C17
2 8f src/gdb/testsuite/gdb.dwarf2/fission-mix-2.c
...
But when trying to read the string offset at index 0 in the table (which
is 0), we start reading at offset 8, which points in the header, at the last
4 bytes of the initial length (it's 12 bytes because of 64-bit dwarf), as well
at the 2-byte version field and 2 bytes of padding, so we get:
...
(gdb) p /x str_offset
$1 = 0x500000000
...
which indeed is an offset that doesn't fit in the .debug_str section.
The offset 8 is based on reader->cu->header.addr_size:
...
static const char *
read_dwo_str_index (const struct die_reader_specs *reader, ULONGEST str_index)
{
ULONGEST str_offsets_base = reader->cu->header.version >= 5
? reader->cu->header.addr_size : 0;
...
which doesn't in look in agreement with the standard.
Note that this happens to give the right answer for 32-bit dwarf and
addr_size == 8, because then we have header size ==
(initial length (4) + version (2) + padding (2)) == 8.
Conversely, for 32-bit dwarf and addr_size == 4 (target board unix/-m32)
we run into a similar problem. It just happens to not trigger the warning,
instead we get the wrong strings, like "func2" for DW_AT_producer and
"build/gdb/testsuite" for DW_AT_name of the DW_TAG_compile_unit DIE.
Fix this by parsing the .debug_str_offsets header in read_dwo_str_index.
Add a FIXME that we should not parse this for every call.
Tested on x86_64-linux.
|
|
cooked_indexer::index_dies incorrect computes the end of the current
CU in the .debug_info. This isn't readily testable without writing
intentionally corrupt DWARF, but it's apparent through observation: it
is currently based on 'info_ptr', which does not always point to the
start of the CU. This patch fixes the expression. Tested on x86-64
Fedora 34.
|
|
Starting with (future) Clang 15 (since
https://reviews.llvm.org/D120989), Clang emits the DWARF information
of global alias variables as DW_TAG_imported_declaration. However,
GDB does not handle it. It incorrectly always reads this tag as
C++/Fortran imported declaration (type alias, namespace alias and
Fortran module). This commit adds support to handle this tag as an
alias variable.
This change fixes the failures in the gdb.base/symbol-alias.exp
testcase with current git Clang. This testcase is also updated to
test nested (recursive) aliases.
|
|
While working on addrmaps, I noticed that psymtab_addrmap is no longer
needed now. It was introduced in ancient times as an optimization for
DWARF, but no other symbol reader was ever updated to use it. Now
that DWARF does not use psymtabs, it can be deleted.
|
|
Mutable addrmaps currently require an obstack. This was probably done
to avoid having to call splay_tree_delete, but examination of the code
shows that all mutable obstacks have a limited lifetime -- now it's
simple to treat them as ordinary C++ objects, in some cases
stack-allocating them, and have a destructor to make the needed call.
This patch implements this change.
|