|
When compiling with -gsplit-dwarf -fdebug-types-section, DWARF 5
.debug_info.dwo sections may contain some type units:
$ llvm-dwarfdump -F -color a-test.dwo | head -n 5
a-test.dwo: file format elf64-x86-64
.debug_info.dwo contents:
0x00000000: Type Unit: length = 0x000008a0, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_type, abbr_offset = 0x0000, addr_size = 0x08, name = 'vector<int, std::allocator<int> >', type_signature = 0xb499dcf29e2928c4, type_offset = 0x0023 (next unit at 0x000008a4)
In this case, create_dwo_cus_hash_table wrongly creates a dwo_unit for
it and adds it to dwo_file::cus. create_dwo_debug_type_hash_table later
correctly creates a dwo_unit that it puts in dwo_file::tus.
This can be observed with:
$ ./gdb -nx -q --data-directory=data-directory -ex 'maint set dwarf sync on' -ex "maint set worker-threads 0" -ex "set debug dwarf-read 2" -ex "file a.out" -batch
...
[dwarf-read] create_dwo_cus_hash_table: Reading .debug_info.dwo for /home/smarchi/build/binutils-gdb/gdb/a-test.dwo:
[dwarf-read] create_dwo_cus_hash_table: offset 0x0, dwo_id 0xb499dcf29e2928c4
[dwarf-read] create_dwo_cus_hash_table: offset 0x8a4, dwo_id 0x496a8791a842701b
[dwarf-read] create_dwo_cus_hash_table: offset 0x941, dwo_id 0xefd13b3f62ea9fea
...
[dwarf-read] create_dwo_debug_type_hash_table: Reading .debug_info.dwo for /home/smarchi/build/binutils-gdb/gdb/a-test.dwo
[dwarf-read] create_dwo_debug_type_hash_table: offset 0x0, signature 0xb499dcf29e2928c4
[dwarf-read] create_dwo_debug_type_hash_table: offset 0x8a4, signature 0x496a8791a842701b
[dwarf-read] create_dwo_debug_type_hash_table: offset 0x941, signature 0xefd13b3f62ea9fea
...
Fix it by skipping anything that isn't a compile unit in
create_dwo_cus_hash_table. After this patch, the debug output of
create_dwo_cus_hash_table only shows one created dwo_unit, as we expect.
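As a rough illustration of the filtering described above (not the actual GDB code), the sketch below walks a list of unit headers and keeps only the split compile units. The `unit_header` struct, the `unit_type` enum and `collect_compile_unit_offsets` are hypothetical names made up for this example; only the DW_UT_* numeric values come from the DWARF 5 specification.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a unit header found in .debug_info.dwo.
// The numeric values are the DW_UT_* codes from the DWARF 5 specification.
enum unit_type : uint8_t
{
  UT_split_compile = 0x05,  // DW_UT_split_compile
  UT_split_type = 0x06,     // DW_UT_split_type
};

struct unit_header
{
  uint64_t offset;     // Offset of the unit in the section.
  unit_type type;      // Kind of unit.
  uint64_t signature;  // DWO id (compile unit) or type signature (type unit).
};

// Return the offsets of the units that belong in the compile unit table.
// Anything that is not a split compile unit is skipped; a separate pass is
// assumed to collect the type units.
std::vector<uint64_t>
collect_compile_unit_offsets (const std::vector<unit_header> &units)
{
  std::vector<uint64_t> offsets;

  for (const unit_header &unit : units)
    {
      if (unit.type != UT_split_compile)
        continue;

      offsets.push_back (unit.offset);
    }

  return offsets;
}
```

With one compile unit surrounded by type units, only the compile unit's offset is returned, which mirrors the "only one created dwo_unit" observation.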
I couldn't find any user-visible problem related to this; I just noticed
it while debugging.
Change-Id: I7dddf766fe1164123b6702027b1beb56114f25b1
Reviewed-By: Tom de Vries <tdevries@suse.de>
|
|
Rename some functions to make it clearer that they are only relevant
when dealing with DWO files.
Change-Id: Ia0cd3320bf16ebdbdc3c09d7963f372e6679ef7c
Reviewed-By: Tom de Vries <tdevries@suse.de>
|
|
On riscv64-linux, with test-case gdb.base/vla-optimized-out.exp I ran into:
...
(gdb) p sizeof (a)^M
$2 = <optimized out>^M
(gdb) FAIL: $exp: o1: printed size of optimized out vla
...
The variable a has type 0xbf:
...
<1><bf>: Abbrev Number: 12 (DW_TAG_array_type)
<c0> DW_AT_type : <0xe3>
<c4> DW_AT_sibling : <0xdc>
<2><c8>: Abbrev Number: 13 (DW_TAG_subrange_type)
<c9> DW_AT_type : <0xdc>
<cd> DW_AT_upper_bound : 13 byte block:
a3 1 5a 23 1 8 20 24 8 20 26 31 1c
(DW_OP_entry_value: (DW_OP_reg10 (a0));
DW_OP_plus_uconst: 1; DW_OP_const1u: 32;
DW_OP_shl; DW_OP_const1u: 32; DW_OP_shra;
DW_OP_lit1; DW_OP_minus)
...
which has an upper bound using a DW_OP_entry_value, and since the
corresponding call site contains no information to resolve the value of a0 at
function entry:
...
<2><6b>: Abbrev Number: 6 (DW_TAG_call_site)
<6c> DW_AT_call_return_pc: 0x638
<74> DW_AT_call_origin : <0x85>
...
evaluating the DWARF expression fails, and we get <optimized out>.
My first thought was to try breaking at *f1 instead of f1 to see if that would
help, but actually the breakpoint resolved to the same address.
In other words, the inferior is stopped at function entry.
Fix this by resolving DW_OP_entry_value when stopped at function entry by
simply evaluating the expression.
This handles these two cases (x86_64, using reg rdi):
- DW_OP_entry_value: (DW_OP_regx: 5 (rdi))
- DW_OP_entry_value: (DW_OP_bregx: 5 (rdi) 0; DW_OP_deref_size: 4)
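The idea can be sketched as follows, under heavy simplification. `frame_model` and `resolve_entry_value` are hypothetical names for this example and do not correspond to GDB's real frame or expression APIs; the callback stands in for "evaluate the wrapped expression with the current register values".

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>

// Hypothetical, simplified frame model: only what this sketch needs.
struct frame_model
{
  uint64_t pc;              // Current program counter.
  uint64_t function_entry;  // Entry address of the enclosing function.
};

// Resolve the sub-expression wrapped by DW_OP_entry_value.
// EVAL_IN_FRAME stands in for evaluating that sub-expression using the
// current register and memory contents.
std::optional<uint64_t>
resolve_entry_value (const frame_model &frame,
                     const std::function<uint64_t ()> &eval_in_frame)
{
  // When stopped at function entry, nothing has been clobbered yet, so the
  // current value is the value at function entry.
  if (frame.pc == frame.function_entry)
    return eval_in_frame ();

  // Otherwise call site information would be needed; this sketch gives up.
  return std::nullopt;
}
```

Away from the entry point the sketch returns no value, which corresponds to the `<optimized out>` result when the call site carries no usable information.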
Tested on x86_64-linux.
Tested gdb.base/vla-optimized-out.exp on riscv64-linux.
Tested an earlier version of gdb.dwarf2/dw2-entry-value-2.exp on
riscv64-linux, but atm I'm running into trouble on that machine (cfarm92) so
I haven't tested the current version there.
|
|
Commit 71a48752660b ("gdb/dwarf: remove create_dwo_cu_reader")
introduced a regression when handling files compiled with "-gsplit-dwarf
-fdebug-types-section" (at least with clang):
$ cat test.cpp
#include <vector>
int main()
{
std::vector<int> v;
return v.size ();
}
$ clang++ -O0 test.cpp -g -gdwarf-5 -gsplit-dwarf -fdebug-types-section -o test
$ ./gdb -nx -q --data-directory=data-directory ./test -ex "maint expand-symtabs"
Reading symbols from ./test...
/home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:6159: internal-error: setup_type_unit_groups: Assertion `per_cu->is_debug_types' failed.
In the main file, we have a skeleton CU with a certain DWO ID:
0x00000000: Compile Unit: ..., unit_type = DW_UT_skeleton, ..., DWO_id = 0x146eaa4daf5deef2, ...
In the .dwo file, the first unit is a type unit with a certain type
signature:
0x00000000: Type Unit: ..., unit_type = DW_UT_split_type, ..., type_signature = 0xb499dcf29e2928c4, ...
and the split compile unit matching the DWO ID from the skeleton from
the main file comes later:
0x0000117f: Compile Unit: ..., unit_type = DW_UT_split_compile, ..., DWO_id = 0x146eaa4daf5deef2, ...
The problem introduced by the aforementioned commit is that when
creating a dwo_unit structure representing the type unit, we use the
signature (DWO id) from the skeleton, instead of the signature from the
type unit's header. As a result, all dwo_units get created with the
same signature (the DWO id) and only the first unit gets inserted in the
hash table. When looking up the comp unit by DWO ID later on, we
wrongly find the type unit, and try to expand a type unit as a comp
unit, hitting the assert.
Before that commit, we passed `reader.cu ()` to lookup_dwo_id, which
yields a dwarf2_cu built from parsing the type unit's header. This
dwarf2_cu contains the comp_unit_header with the correct signature. Fix
the code to use `reader.cu ()` again.
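The failure mode can be reproduced with a toy table. This is only a sketch: `unit_table` and `add_unit` are hypothetical, and a `std::unordered_map` stands in for the real hash table. It shows why keying every unit by the skeleton's DWO id lets only the first unit in, and why a later lookup by DWO id then returns the type unit.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a table of DWO units keyed by signature.
using unit_table = std::unordered_map<uint64_t, std::string>;

// Insert a unit under KEY.  Returns false if KEY was already present, in
// which case the table is left unchanged.
bool
add_unit (unit_table &table, uint64_t key, const std::string &description)
{
  return table.emplace (key, description).second;
}
```

Inserting the type unit and then the compile unit under the same key (the DWO id) leaves only the type unit in the table; inserting each unit under the signature from its own header keeps both reachable.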
Another thing that enables this bug is the fact that since DWARF 5, type
and compile units are all in .debug_info, and therefore read by
create_cus_hash_table, so they both end up in dwo_file::cus. Type units
should end up in dwo_file::tus, otherwise they won't be found by
lookup_dwo_cutu. This bug hasn't given me trouble so far, so I'm not
fixing it right now, but it's on my todo list.
The problem can be seen with some tests, when using the
dwarf5-fission-debug-types board:
$ make check TESTS="gdb.cp/expand-sals.exp" RUNTESTFLAGS="--target_board=dwarf5-fission-debug-types CC_FOR_TARGET=clang CXX_FOR_TARGET=clang++"
Running /home/simark/src/binutils-gdb/gdb/testsuite/gdb.cp/expand-sals.exp ...
FAIL: gdb.cp/expand-sals.exp: gdb_breakpoint: set breakpoint at main (GDB internal error)
But this patch also adds a DWARF assembler-based test that triggers the
internal error.
Note that the new test does not use the build_executable_and_dwo_files
proc, because I found that it is subtly broken and doesn't work to put
multiple units in a single .dwo file. The debug abbrev offset field in
the second unit's header would be 0, when it should have been something
else. The problem is that no linking is ever done to generate the .dwo
file, so the relocation that would apply for this field is never
applied. Instead, I generate two DWARF debug infos separately and link
the .dwo file using gdb_compile, which seems to work fine.
Change-Id: I96f809c56f703e25f72b8622c32e6bb91de20d6a
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This updates the copyright headers to include 2025. I did this by
running gdb/copyright.py and then manually modifying a few files as
noted by the script.
Approved-By: Eli Zaretskii <eliz@gnu.org>
|
|
"cache" is just a bit too generic to be clear.
Change-Id: I8bf01c5fe84e076af1afd2453b1a115777630271
|
|
Some architectures, such as MIPS, have signed addresses and this changes
read_addrmap_from_aranges to record them as signed when required.
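For readers unfamiliar with signed addresses, here is a minimal sketch of what "recording an address as signed" means: the address read from the section is sign-extended to the full width. `sign_extend_address` is a hypothetical helper written for this example; it is not claimed to be how read_addrmap_from_aranges does it.

```cpp
#include <cassert>
#include <cstdint>

// Interpret the low ADDR_SIZE bytes of RAW as a signed address and
// sign-extend it to 64 bits.
int64_t
sign_extend_address (uint64_t raw, unsigned addr_size)
{
  assert (addr_size >= 1 && addr_size <= 8);

  // A full-width value needs no extension.
  if (addr_size == 8)
    return static_cast<int64_t> (raw);

  const unsigned bits = addr_size * 8;
  const uint64_t mask = (uint64_t {1} << bits) - 1;
  const uint64_t sign_bit = uint64_t {1} << (bits - 1);
  const uint64_t value = raw & mask;

  // (value ^ sign_bit) - sign_bit is the usual branch-free sign extension.
  return static_cast<int64_t> ((value ^ sign_bit) - sign_bit);
}
```

A 4-byte address with the top bit set, such as 0x80001000, becomes 0xffffffff80001000 when viewed as a 64-bit pattern, while 0x00401000 is unchanged.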
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32658
Approved-By: Tom Tromey <tom@tromey.com>
|
|
The cooked index worker maintains the state for the various state
transition in the scanner. It is held by the cooked_index while
scanning is in progress, then deleted once this has completed.
I noticed that none of the arguments to cooked_index::done_reading
were really needed -- the cooked_index already has access to the
worker should it need it. Removing these parameters makes the code a
bit simpler and also cleans up some confusing code around the use of
the deferred warnings object.
Regression tested on x86-64 Fedora 40.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Remove some includes reported as unused by clangd.
Change-Id: I841938c3c6254e4f0d154a1e172c4968ff326333
|
|
This updates the cooked_index comment with some notes about object
lifetimes, in an attempt to make navigating this code a bit simpler.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
The two readers currently using cooked_index_worker shared some code.
This patch factors this out into a new "done_reading" method.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
cooked_index_worker::result_type is an ad hoc tuple type used for
transferring data between phases of the indexer. It's a bit unwieldy
and another patch I'm working on would be somewhat nicer without it.
This patch removes the type. Now cooked_index_ephemeral objects are
transferred instead, which is handy because they already hold the
needed state.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This updates the "See xyz.h" comments for all the methods that were
moved earlier in this series. Perhaps I should have removed them
instead.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This moves the cooked_index_worker class to cooked-index-worker.[ch].
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes cooked-index-worker.h to include the new header files.
This breaks the circular dependency that would otherwise be introduced
in the next patch.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This moves cooked_index_shard to a couple of new files,
dwarf2/cooked-index-shard.[ch]. The rationale is the same as the
previous patch: cooked-index.h had to be split to enable other
cleanups.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This moves cooked_index_entry and some related helper code to a couple
of new files, dwarf2/cooked-index-entry.[ch].
The main rationale for this is that in order to finish this series and
remove "cooked_index_worker::result_type", I had to split
cooked-index.h into multiple parts to avoid circular includes.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
language_requires_canonicalization is only called from cooked-index.c,
so mark it as static.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This renames cooked_index_storage to cooked_index_worker_result,
making its function more clear. It also updates the class comment
as well.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
A discussion with Simon made me realize that cooked_index_storage
isn't a very clear name, especially now that it's escaped from read.c.
While it does provide some storage (I guess any object does in a
sense), it is really a helper for cooked_index_worker -- a temporary
object that is destroyed after reading has completed.
This patch renames this file. Later patches will rename the class and
move cooked_index_worker here, something I think is reasonable given
that cooked_index_storage is really something of a helper class for
cooked_index_worker.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
GDB's compile subsystem is deeply tied to GDB's ability to understand
DWARF. A future patch will add the option to disable DWARF at configure
time, but for that to work, the compile subsystem will need to be
entirely disabled as well, so this patch adds that possibility.
I also think there is motive for a security-conscious user to disable
compile for its own sake. Considering that the code is quite
unmaintained, and depends on an equally unmaintained gcc plugin, there
is a case to be made that this is an unnecessary increase in the attack
surface if a user knows they won't use the subsystem. Additionally, this
can make compilation slightly faster and the final binary is around 3 MB
smaller. But these are all secondary to the main goal of being able to
disable DWARF at configure time.
To be able to achieve optional compilation, some of the code that
interfaces with compile had to be changed. All parts that directly
called compile things have been wrapped by ifdefs checking for compile
support. The file compile/compile.c has been set up in a similar way to
how Python's and Guile's main files are set up: it is still compiled,
but only with a placeholder command.
Finally, to avoid several new errors, a new TCL proc was introduced to
gdb.exp, allow_compile_tests, which checks if the "compile" command is
recognized before the inferior is started and otherwise skips the compile
tests. All tests in the gdb.compile subfolder have been updated to use
that, and the test gdb.base/filename-completion also uses this. The proc
skip_compile_feature_tests was updated to recognize when the subsystem has
been disabled at compile time.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Change some parameters to be references instead of pointers, when the
value must not be nullptr. I'd like to do more of this kind of
change, but I have to limit the scope of the change, otherwise there's
just no end (and some local variables could also be turned into
references). So for now, just do it for the cutu_reader constructors.
Change-Id: I9442c6043726981d58f9b141f516c590c0a71bcc
Approved-By: Tom Tromey <tom@tromey.com>
|
|
The comment on this constructor is really outdated. Update it to better
reflect the reality today.
I'd eventually like to change this cutu_reader constructor not to use
dwarf2_per_cu, because it seems like an abuse of dwarf2_per_cu just to
pass 3 values. But for now, just document the existing behavior.
Change-Id: Id96db020c361e64d9b0d2f25d51950b206658aa2
Approved-By: Tom Tromey <tom@tromey.com>
|
|
lookup_dwo_unit receives the name of the DWO unit to look up, as read
from the DW_AT_dwo_name attribute of the skeleton DIE. But then, it
doesn't use it:
/* Yeah, we look dwo_name up again, but it simplifies the code. */
dwo_name = dwarf2_dwo_name (comp_unit_die, cu);
Perhaps this comment made sense at some point, but with the code we have
today, I don't understand it. It should be fine to use the name passed
as a parameter, which the caller also obtained by calling
dwarf2_dwo_name.
Change-Id: I84723e12726f77e4202d042428ee0eed9962ceb8
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Looking at `cooked_index_shard::find`, I thought that we could make a
small optimization: when finding the upper bound, we already know the
lower bound. And we know that the upper bound is >= the lower bound.
So we could pass `lower` as the first argument of the `std::upper_bound`
call to cut the part of the search space that is below `lower`.
It then occurred to me that what we do is basically what
`std::equal_range` is for, so why not use it. Implementations of
`std::equal_range` are likely to do things as efficiently as possible.
Unfortunately, because `cooked_index_entry::compare` is sensitive to the
order of its parameters, we need to provide two different comparison
functions (just like we do now, to the lower_bound and upper_bound
calls). But I think that the use of equal_range makes it clear what the
intent of the code is.
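A small, self-contained sketch of `std::equal_range` with an order-sensitive comparison follows. It uses a prefix search over sorted strings as a stand-in problem; `prefix_key`, `prefix_compare` and `find_prefix_range` are hypothetical names, and the comparison is not `cooked_index_entry::compare`. The point is only that the comparator must accept both argument orders.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical key type: matches every entry that starts with TEXT.
struct prefix_key
{
  std::string text;
};

// The comparison is asymmetric, so both argument orders are spelled out.
struct prefix_compare
{
  // True if ENTRY sorts before every string starting with the prefix.
  bool operator() (const std::string &entry, const prefix_key &key) const
  {
    return entry.compare (0, key.text.size (), key.text) < 0;
  }

  // True if ENTRY sorts after every string starting with the prefix.
  bool operator() (const prefix_key &key, const std::string &entry) const
  {
    return entry.compare (0, key.text.size (), key.text) > 0;
  }
};

// Return the [first, last) index range of the entries of SORTED that start
// with PREFIX.  SORTED must be sorted in ascending order.
std::pair<std::size_t, std::size_t>
find_prefix_range (const std::vector<std::string> &sorted,
                   const std::string &prefix)
{
  auto range = std::equal_range (sorted.begin (), sorted.end (),
                                 prefix_key {prefix}, prefix_compare {});

  return { static_cast<std::size_t> (range.first - sorted.begin ()),
           static_cast<std::size_t> (range.second - sorted.begin ()) };
}
```

A single call replaces the separate lower_bound and upper_bound searches, and an empty result shows up as a range whose two ends are equal.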
Regression tested using the various DWARF target boards on Debian 12.
Change-Id: Idfad812fb9abae1b942d81ad9976aeed7c2cf762
Approved-By: Tom Tromey <tom@tromey.com>
|
|
I believe that the `(mode == MATCH && a == munge ('<'))` part of the
condition is unnecessary. Or perhaps I don't understand the algorithm.
The use of "munge" above effectively makes it so that the template
portion of names is completely ignored for the sake of the comparison.
Then, in the condition, this:
a == munge ('<')
is functionally equivalent to
a == '\0'
If `a` is indeed '\0', and `b` is also '\0', then we would have taken
the earlier branch:
if (a == b)
return 0;
If `b` is not '\0', then we won't take this branch and we'll go into the
final comparison:
return a < b ? -1 : 1;
So, as far as I can see, there is no case where `mode == MATCH`, where
we're going to use this special `return 0`.
Regression tested using the various DWARF target boards on Debian 12.
Change-Id: I5ea0463c1fdbbc1b003de2f0a423fd0073cc9dec
Approved-By: Tom Tromey <tom@tromey.com>
|
|
We have this pattern of check in multiple places:
/* Skip dummy compilation units. */
if (m_info_ptr >= begin_info_ptr + this_cu->length ()
|| peek_abbrev_code (abfd, m_info_ptr) == 0)
m_dummy_p = true;
In all places except one (read_cutu_die_from_dwo), this is done after
reading the unit header but before potentially reading the first DIE.
The effect is that units that have no DIE at all are considered dummy.
Either the "data" portion of the unit (the portion after the header) has
a size of zero, or the first abbrev code is 0, i.e. "end of list".
According to this old commit I found [1], dummy CUs were used as filler
for incremental LTO linking. A comment reads:
WARNING: If THIS_CU is a "dummy CU" (used as filler by the incremental
linker) then DIE_READER_FUNC will not get called.
In read_cutu_die_from_dwo, however, this check is done after having read
the first DIE. So at the time of the check, m_info_ptr has already been
advanced just past the first DIE. As a result, compilation units with
a single DIE are considered (erroneously, IMO) as dummy.
In commit aab6de1613df ("gdb/dwarf: fix spurious error when encountering
dummy CU") [2], I mentioned a real world case where compilation units
with a single top-level DIE were being considered dummy. I believe that
those units should not actually have been treated as dummy. A CU with
just one DIE may not be very interesting, but I don't see any reason to
consider it dummy.
Move the dummy check above the read_toplevel_die call, and return early
if the CU is dummy.
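To make the ordering issue concrete, here is a simplified model of the dummy check. `unit_is_dummy` is a hypothetical helper, and the vector stands in for the bytes of a unit that follow its header. The check only gives the intended answer if it runs before any DIE has been consumed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// UNIT_DATA models the bytes of a unit that follow its header, with no DIE
// consumed yet.  Return true if the unit contains no DIE at all.
bool
unit_is_dummy (const std::vector<uint8_t> &unit_data)
{
  // Nothing after the header: there is no DIE to read.
  if (unit_data.empty ())
    return true;

  // The first byte starts the abbrev code of the first DIE.  Code 0 means
  // "end of list", so there is no DIE either.  This assumes the code is
  // encoded in its shortest ULEB128 form.
  return unit_data[0] == 0;
}
```

Running the same test on the bytes remaining after the first DIE would report a unit with exactly one DIE as dummy, which is the behavior being fixed.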
I am 99% convinced that it's not even possible to encounter an empty
unit here, and considered turning it into an assert (it did pass the
testsuite). This function is passed a dwo_unit, and functions that
create a dwo_unit are:
- create_debug_type_hash_table (creates a dwo_unit for each type unit
found in a dwo file)
- create_cus_hash_table (creates a dwo_unit for each comp unit found in
a dwo file)
- create_dwo_unit_in_dwp_v1
- create_dwo_unit_in_dwp_v2
- create_dwo_unit_in_dwp_v5
In the first two, there are already dummy checks, so we wouldn't even
get to read_cutu_die_from_dwo for such an empty CU. However, in the
last three, there are no such checks: we just trust the dwp file's index
and create dwo_units out of that. So I guess it would be possible to
craft a broken dwp file with a CU that has no DIE. Out of caution, I
didn't switch that to an assert, but I also don't really know what would
be the mode of failure if that were to happen.
Regtested using the various DWARF target boards on Debian 12.
[1] https://gitlab.com/gnutools/binutils-gdb/-/commit/dee91e82ae87f379c90fddff8db7c4b54a116609#dd409f60ba6f9c066432dafbda7093ac5eec76d1_3434_3419
[2] https://gitlab.com/gnutools/binutils-gdb/-/commit/aab6de1613df693059a6a2b505cc8f20d479d109
Change-Id: I90e6fa205cb2d23ebebeae6ae7806461596f9ace
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This parameter is always used to set cutu_reader::m_dwo_abbrev_table.
Remove the parameter, and have read_cutu_die_from_dwo set the field
directly.
Change-Id: I6c0c7d23591fb2c3d28cdea1befa4e6b379fd0d3
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This adds a new die_info::children method. This returns a range that
can be used to iterate over a DIE's children.
Then this goes through and updates all the relevant loops to use
foreach instead. This is a net code reduction.
You'll note that in some places the code was checking the tag as well,
like:
while (child_die && child_die->tag)
I believe this can't happen and is just a copy-paste oddity from the
old days.
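A rough sketch of what such a children range could look like is given below. `die_node`, `child_iterator` and `child_range` are hypothetical names for this example; the real die_info and next_iterator are more involved.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, stripped-down DIE node.
struct die_node
{
  int tag = 0;
  die_node *child = nullptr;  // First child, or nullptr.
  die_node *next = nullptr;   // Next sibling, or nullptr.

  // Minimal forward iterator that follows the `next` links.
  class child_iterator
  {
  public:
    explicit child_iterator (die_node *die) : m_die (die) {}

    die_node *operator* () const { return m_die; }

    child_iterator &operator++ ()
    {
      m_die = m_die->next;
      return *this;
    }

    bool operator!= (const child_iterator &other) const
    { return m_die != other.m_die; }

  private:
    die_node *m_die;
  };

  // Range object usable in a range-based for loop.
  struct child_range
  {
    die_node *first;

    child_iterator begin () const { return child_iterator (first); }
    child_iterator end () const { return child_iterator (nullptr); }
  };

  child_range children () const { return child_range {child}; }
};

// Collect the tags of all direct children of PARENT.
std::vector<int>
child_tags (const die_node &parent)
{
  std::vector<int> tags;

  for (die_node *child : parent.children ())
    tags.push_back (child->tag);

  return tags;
}
```

The loop body no longer needs to advance a `child_die` pointer by hand, which is where most of the code reduction would come from.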
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
I want to add support for C++ foreach iteration over DIE siblings.
I considered writing a custom iterator for this, but it would be
largely identical to the already-existing next_iterator. I didn't
want to duplicate the code...
Then I tried parameterizing next_iterator by having it take an
optional pointer-to-member template argument. However, this would
involve changes in many places, because currently a next_iterator can
be instantiated before the underlying type is complete.
So in the end I decided to rename die_info::sibling to die_info::next.
This name is slightly worse but (1) IMO it isn't really all that bad,
nobody would have blinked if it was called 'next' in the initial
patch, and (2) with the change to iteration it is barely used.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
A recent patch of mine had a comment with bad grammar; apparently I
didn't finish editing it. This patch cleans it up.
|
|
Replace an htab with gdb::unordered_set. I think we could also use the
dwarf2_per_cu pointer itself as the identity, basically have the
functional equivalent of:
gdb::unordered_map<dwarf2_per_cu *, cutu_reader_up>
But I kept the existing behavior of using dwarf2_per_cu::index as the
identity.
Change-Id: Ief3df9a71ac26ca7c07a7b79ca0c26c9d031c11d
Approved-By: Tom Tromey <tom@tromey.com>
|
|
The type_unit_group is an indirection between a stmt_list_hash (possible
dwo_unit + line table section offset) and a type_unit_group_unshareable
that provides no real value. In dwarf2_per_bfd, we maintain a
stmt_list_hash -> type_unit_group mapping, and in dwarf2_per_objfile, we
maintain a type_unit_group -> type_unit_group_unshareable mapping. The type_unit_group
type is empty and only exists to have an identity and to be a link
between the two mappings.
This patch changes it so that we have a single stmt_list_hash ->
type_unit_group_unshareable mapping.
Regression tested on Debian 12 amd64 with a bunch of DWARF target
boards.
Change-Id: I9c5778ecb18963f353e9dd058e0f8152f7d8930c
Approved-By: Tom Tromey <tom@tromey.com>
|
|
dwarf2_per_bfd::{quick_file_names_table,type_unit_groups}
Change these two hash tables to use gdb::unordered_map. I changed these
two at the same time because they both use the same key, a
stmt_list_hash. Unlike other previous patches that used a
gdb::unordered_set, use an unordered_map here because the key isn't
found in the element itself (well, it was before, because of how htab
works, but it didn't need to be).
You'll notice that the type_unit_group structure is empty. That
structure isn't really needed. It is removed in the following patch.
Regression tested on Debian 12 amd64 with a bunch of DWARF target
boards.
Change-Id: Iec2289958d0f755cab8198f5b72ecab48358ba11
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This removes attribute::is_nonnegative and attribute::as_nonnegative
in favor of a call to unsigned_constant.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
I noticed that gdb doesn't handle DW_END_default. This patch adds
support for this.
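As an illustration only, resolving a DW_AT_endianity value could look like the sketch below. The DW_END_* numeric values are those of the DWARF standard; `resolve_byte_order` and the `byte_order` enum are hypothetical names, and treating unknown values like the default is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// DW_END_* values as defined by the DWARF standard.
enum dwarf_endianity : uint8_t
{
  END_default = 0x00,  // DW_END_default
  END_big = 0x01,      // DW_END_big
  END_little = 0x02,   // DW_END_little
};

enum class byte_order { big, little };

// Resolve a DW_AT_endianity value to a concrete byte order.  ARCH_ORDER is
// the default byte order of the target architecture.
byte_order
resolve_byte_order (dwarf_endianity endianity, byte_order arch_order)
{
  switch (endianity)
    {
    case END_big:
      return byte_order::big;

    case END_little:
      return byte_order::little;

    case END_default:
    default:
      // Assumption of this sketch: anything else falls back to the target.
      return arch_order;
    }
}
```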
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes get_alignment to assume that DW_AT_alignment refers to an
unsigned value.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes read_decl_line and new_symbol to assume that
DW_AT_decl_line should refer to an unsigned value.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This changes dwarf2_record_block_entry_pc to issue a complaint using
the form name rather than a value. This seems more correct to me.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This introduces a new 'unsigned_constant' method on attribute. This
method can be used to get the value as an unsigned number. Unsigned
scalar forms are handled, and signed scalar forms are handled as well
provided that the value is non-negative.
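The rule described above can be sketched as follows. `attr_model` and `unsigned_constant_sketch` are hypothetical and much simpler than the real attribute class; in particular, returning an empty optional for a negative value is an assumption of this example, not a statement about what the real method does in that case.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical, simplified attribute: a raw 64-bit payload plus a flag
// saying whether the form stores signed data.
struct attr_model
{
  bool form_is_signed;
  uint64_t raw;  // For signed forms, the two's complement bit pattern.
};

// Return the attribute's value as an unsigned number, or an empty optional
// if the form is signed and the value is negative.
std::optional<uint64_t>
unsigned_constant_sketch (const attr_model &attr)
{
  // Unsigned scalar forms are used as-is.
  if (!attr.form_is_signed)
    return attr.raw;

  // Signed scalar forms are accepted only when non-negative.
  const int64_t value = static_cast<int64_t> (attr.raw);
  if (value >= 0)
    return static_cast<uint64_t> (value);

  return std::nullopt;
}
```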
Several spots in the reader that expect small DWARF-defined constants
are updated to use this new method.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
This renames attribute::form_is_signed to form_is_strictly_signed. I
think this more accurately captures what it does: it says whether a
form will always use signed data -- not whether a form might use
signed data, which DW_FORM_data* do depending on context.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
read_cutu_die_from_dwo currently returns the dwo's top-level DIE through
a parameter. Following the previous patch, all code paths end up
setting m_top_level_die. Simplify this by having read_cutu_die_from_dwo
set m_top_level_die directly. I think it's easier to understand,
because there's one less indirection to follow.
Change-Id: Ib659f1d2e38501a8fe2b5dd0ca2add3ef55e8d60
Approved-By: Tom Tromey <tom@tromey.com>
|
|
I built an application with -gsplit-dwarf (i.e. dwo), and some CUs are
considered "dummy" by the DWARF reader. That is, the top-level DIE
(DW_TAG_compile_unit) does not have any children. Here's the skeleton:
0x0000c0cb: Compile Unit: length = 0x0000001d, format = DWARF32, version = 0x0005, unit_type = DW_UT_skeleton, abbr_offset = 0x529b, addr_size = 0x08, DWO_id = 0x0ed2693dd2a756dc (next unit at 0x0000c0ec)
0x0000c0df: DW_TAG_skeleton_unit
DW_AT_stmt_list [DW_FORM_sec_offset] (0x09dee00f)
DW_AT_dwo_name [DW_FORM_strp] ("CMakeFiles/lib_crl.dir/crl/dispatch/crl_dispatch_queue.cpp.dwo")
DW_AT_comp_dir [DW_FORM_strp] ("/home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/Telegram/lib_crl")
DW_AT_GNU_pubnames [DW_FORM_flag_present] (true)
And here's the entire debug info in the .dwo file:
.debug_info.dwo contents:
0x00000000: Compile Unit: length = 0x0000001a, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_compile, abbr_offset = 0x0000, addr_size = 0x08, DWO_id = 0x0ed2693dd2a756dc (next unit at 0x0000001e)
0x00000014: DW_TAG_compile_unit
DW_AT_producer [DW_FORM_strx] ("GNU C++20 14.2.1 20250207 -mno-direct-extern-access -mtune=generic -march=x86-64 -gsplit-dwarf -g3 -gz=none -O2 -std=gnu++20 -fPIC -fno-strict-aliasing")
DW_AT_language [DW_FORM_data1] (DW_LANG_C_plus_plus_14)
DW_AT_name [DW_FORM_strx] ("/home/simark/src/tdesktop/Telegram/lib_crl/crl/dispatch/crl_dispatch_queue.cpp")
DW_AT_comp_dir [DW_FORM_strx] ("/home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/Telegram/lib_crl")
When loading the binary in GDB, I see some warnings:
$ ./gdb -q -nx --data-directory=data-directory -ex 'maint set dwarf sync on' -ex "file /home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/telegram-desktop"
Reading symbols from /home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/telegram-desktop...
DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc0cb
DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc152
DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc194
DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc1b5
(gdb)
It turns out that these errors are not really justified. What happens
is:
- cutu_reader::read_cutu_die_from_dwo return 0, indicating that the CU
is "dummy"
- back in cutu_reader::cutu_reader, we omit setting m_top_level_die to
the DIE from the dwo file, meaning that m_top_level_die keeps
pointing to the DIE from the main file (DW_TAG_skeleton_unit)
- later, in cutu_reader::prepare_one_comp_unit, there is a check that
m_top_level_die->tag is one of DW_TAG_{compile,partial,type}_unit,
which triggers
My proposal to fix this is to set m_top_level_die even if the CU is
dummy. Even if the top-level DIE does not have any children, I don't
see any reason to leave cutu_reader::m_top_level_die in a different
state than when the CU is not dummy.
While at it, set m_dummy_p directly in read_cutu_die_from_dwo, instead
of returning a value and having the caller do it. This is all inside
cutu_reader anyway.
Change-Id: I483a68a369bb461a8dfa5bf2106ab1d6a0067198
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This function, as can be seen by its comment, is a remnant of past
design. Inline its content into create_cus_hash_table.
Change-Id: Id900bae2cdce8f33bf01199fb1d366646effc76e
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Following the previous patch, this parameter is now unused. Remove it.
Change-Id: I7e96a3ba61ad9a0d6b64f9129aeeb9a8f3da22a7
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Add a few -Wunused-* diagnostic flags that look useful. Some are known
to gcc, some to clang, some to both. Fix the fallouts.
-Wunused-const-variable=1 is understood by gcc, but not clang.
-Wunused-const-variable would be understood by both, but for gcc at
least it would flag the unused const variables in headers. This doesn't
make sense to me, because as soon as one source file includes a header
but doesn't use a const variable defined in that header, it's an error.
With `=1`, gcc only warns about unused const variable in the main source
file. It's not a big deal that clang doesn't understand it though: any
instance of that problem will be flagged by any gcc build.
Change-Id: Ie20d99524b3054693f1ac5b53115bb46c89a5156
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Direct replacement of an htab with a gdb::unordered_set.
Using a large test program, I see a small but consistent performance
improvement. The "file" command time goes on average from 7.88 to 7.73
seconds (~2%). To give a rough estimate of the scale of the test
program, the 8 seen_names hash tables (one for each worker thread) had
between 173846 and 866961 entries.
Change-Id: I0157cbd04bb55338bb1fcefd2690aeef52fe3afe
Approved-By: Tom Tromey <tom@tromey.com>
|
|
After staring at the code, I got convinced that it was not possible for
load_full_comp_unit to be called while a dwarf2_cu object exists in
per_objfile for this_cu. If you follow all callers of
load_full_comp_unit, you can see that all calls to load_full_comp_unit
(except one, see below) are gated one way or another by the fact that:
per_objfile->get_cu (per_cu) == nullptr
Some calls are gated by maybe_queue_comp_unit returning true. If it
returns true, then necessarily the dwarf2_cu is unset for that per_cu.
The spot that didn't seem to check for whether the dwarf2_cu is already
set before calling load_full_comp_unit is dw2_do_instantiate_symtab. It
didn't trigger when running the testsuite, but I could imagine a made up
case where the dwarf2_cu would already be set because we looked up a DIE
reference to it (follow_die_ref) for whatever reason. Then, something
would cause the symtab for that CU to be expanded and
dw2_do_instantiate_symtab to be called.
I added a check in that function, because it seemed prudent to do so.
All other load_cu calls are gated by this check, so it makes this call
look just like the others.
Finally, because all call sites that use cutu_reader::release_cu pass
nullptr for `existing_cu` (and therefore cutu_reader creates a new
dwarf2_cu), we know that cutu_reader::release_cu will always return a
non-nullptr value. Add an assert in it and remove checks in
load_full_comp_unit and read_signatured_type.
Change-Id: I496be34bd4bf7edfa38d5135cf4bc4ccd960abe2
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Following the previous patch, all callers now pass the same thing:
per_objfile->get_cu (this_cu)
Remove that parameter and do the call in the function itself.
Change-Id: Iafd36b058d7b95efae518bb65035c6a03728b018
Approved-By: Tom Tromey <tom@tromey.com>
|
|
After staring at the code for a while, I got convinced that it's not
possible for cu->dies to be nullptr in follow_die_offset. It might be a
leftover from the psymtab days.
In most cases, we see that the dwarf2_cu passed as `*ref_cu` has been
obtained by doing:
per_objfile->get_cu (per_cu);
The only way for a dwarf2_cu to end up in the per_objfile like this is
through load_full_comp_unit or read_signatured_type. Both of these
functions call `reader.read_all_dies ()` (which loads the DIEs in memory
and assigns dwarf2_cu::dies) before transferring the newly created
dwarf2_cu to the per_objfile. So any dwarf2_cu obtained through
per_objfile->get_cu (per_cu)
... will have its DIEs set.
The only case today I'm aware of of a dwarf2_cu without DIEs is in the
cooked indexer. It creates a cutu_reader, but does not call
read_all_dies. Instead, it gets the info_ptr from the cutu_reader and
reads the DIEs from the section buffer directly, on its own. But this
is an entirely different code path that doesn't assign dwarf2_cu
objects to per_objfile.
So, remove the code path in follow_die_offset that tests for
`source_cu->dies == NULL`. I added an assert at the top of the function
to verify that `source_cu->dies` is always non-nullptr, as a way to
test my hypothesis. We could probably get rid of it, but I left it
there because it doesn't cost much to have it.
Change-Id: I97f269f092128800850aa5e64eda7032c2edec60
Approved-By: Tom Tromey <tom@tromey.com>
|