|
Internally we noticed that some tests would fail like so on Windows:
warning: Section .debug_aranges in [...] has duplicate debug_info_offset 0x0, ignoring .debug_aranges.
Debugging showed that, in fact, a second CU was being created at this
offset. We tracked this down to the fact that, while the ELF reader
is careful to re-use the per-BFD data, other readers are not, and
could re-read the DWARF data multiple times.
However, since the change to allow an objfile to have multiple "quick
symbol" implementations, there's no reason for this approach -- it's
safe and easy for all symbol readers to reuse the per-BFD data when
reading DWARF.
This patch implements this idea, simplifying dwarf2_build_psymtabs and
making it private, and then switching to dwarf2_initialize_objfile as
the sole way to start the DWARF reader.
Note that, while I think the call to dwarf2_build_frame_info in
machoread.c is also obsolete, I haven't attempted to remove it here.
|
|
In this review [1], Eli pointed out that we should be careful when
concatenating file names to avoid duplicated slashes. On Windows, a
double slash at the beginning of a file path has a special meaning. So
naively concatenating "/" and "foo/bar" would give "//foo/bar", which
would not give the desired results. We already have a few spots doing:
  if (first_path ends with a slash)
    path = first_path + second_path
  else
    path = first_path + slash + second_path
In general, I think it's nice to avoid superfluous slashes in file
paths, since they might end up visible to the user and look a bit
unprofessional.
Introduce the path_join function that can be used to join multiple path
components together (along with unit tests).
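
The joining rule described above can be sketched as follows. This is only
an illustration, not GDB's path_join: the name join_two_paths and its
two-argument signature are hypothetical, only '/' is treated as a
separator, and the right-hand side is assumed to be a relative path.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not GDB's path_join): join two path components
// so that exactly one '/' separates them.  RIGHT is assumed to be a
// relative path.
static std::string
join_two_paths (const std::string &left, const std::string &right)
{
  if (left.empty ())
    return right;
  if (right.empty ())
    return left;

  std::string result = left;
  // Add a separator only if LEFT does not already end with one.
  if (result.back () != '/')
    result += '/';
  result += right;
  return result;
}
```

With this rule, joining "/" and "foo/bar" gives "/foo/bar" rather than
"//foo/bar".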
I initially wanted to make it possible to join two absolute paths, to
support the use case of prepending a sysroot path to a target file path,
or of prepending the debug-file-directory to a target file path. But
the code in solib_find_1 shows that it is more complex than this anyway
(for example, when the right hand side is a Windows path with a drive
letter). So I don't think we need to support that case in path_join.
That also keeps the implementation simpler.
Change a few spots to use path_join to show how it can be used. I
believe that all the spots I changed are guarded by some checks that
ensure the right hand side operand is not an absolute path.
Regression-tested on Ubuntu 18.04. Built-tested on Windows, and I also
ran the new unit-test there.
[1] https://sourceware.org/pipermail/gdb-patches/2022-April/187559.html
Change-Id: I0df889f7e3f644e045f42ff429277b732eb6c752
|
|
This turns symbol_symtab into a method on symbol. It also replaces
symbol_set_symtab with a method.
|
|
For a series I'm experimenting with, it was handy to hide a symbol's
"artificial" field behind accessors. This patch is the result.
|
|
The dwarf2_per_bfd object has a separate field for each possible kind
of index. Until an earlier patch in this series, two of these were
even derived from a common base class, but still had separate slots.
This patch unifies all the index fields using the common base class
that was introduced earlier in this series. This makes it more
obvious that only a single index can be active at a time, and also
removes some code from dwarf2_initialize_objfile.
|
|
Some generic code in the DWARF reader has a special case for older
versions of .gdb_index. This patch adds an ad hoc version check
method so that these spots can work without specific knowledge of
which index is in use.
|
|
This simplifies the index version check in dw2_symtab_iter_next, by
passing a reference to the index object to this function. This avoids
an indirection via the per_bfd object.
|
|
This introduces dwarf_scanner_base, a base class for all the index
readers in the DWARF code. Then, it changes both mapped_index_base
and cooked_index_vector to derive from this new base class.
|
|
This introduces readnow_functions, a new subclass of
dwarf2_base_index_functions, and changes the DWARF reader to use it.
This lets us drop the "index is NULL" hack from the gdb index code.
|
|
The dwarf2_debug_names_index code treats a NULL debug_names_table as
if it were from OBJF_READNOW. However, this trick is only done for
gdb_index, never for debug_names -- see dwarf2_initialize_objfile.
|
|
This changes the mapped index classes to create the
quick_symbol_functions objects. This is a step toward having a more
abstract interface to mapped indices.
|
|
This changes mapped_index_base to have a virtual destructor, so it can
be destroyed via its base class.
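
As a generic illustration of why the virtual destructor matters (the
types index_base and derived_index below are hypothetical stand-ins, not
the GDB classes), deleting through a base pointer only runs the derived
destructor when the base destructor is virtual:

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a base class such as mapped_index_base.
struct index_base
{
  virtual ~index_base () = default;
};

// Hypothetical derived class that records when it is destroyed.
struct derived_index : index_base
{
  explicit derived_index (bool *flag) : m_destroyed (flag) {}
  ~derived_index () override { *m_destroyed = true; }

  bool *m_destroyed;
};

// Destroy a derived object through a pointer to its base class and
// report whether the derived destructor ran.
static bool
destroy_via_base ()
{
  bool destroyed = false;
  {
    std::unique_ptr<index_base> p (new derived_index (&destroyed));
  }
  return destroyed;
}
```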
|
|
This moves mapped_index_base and the helper struct name_component to a
new header file in gdb/dwarf2/.
|
|
I noticed that cooked_index_entry::full_name can return the canonical
string when there is no parent entry.
Regression tested on x86-64 Fedora 34.
|
|
I'm trying to switch these functions to use std::string instead of char
arrays, as much as possible. Some callers benefit from it (can avoid
doing a copy of the result), while others suffer (have to make one more
copy).
Change-Id: Iced49b8ee2f189744c5072a3b217aab5af17a993
|
|
This patch started when I noticed that the unordered_set include
wasn't needed in abbrev-cache.h. (That was probably leftover from
some earlier implementation of the class.) Then, I noticed that the
class itself was under-commented. This patch fixes both issues.
|
|
When running the internal AdaCore test suite against the new DWARF
indexer, I found one regression on RISC-V. The test in question uses
--gc-sections, and winds up with an entry in the middle of a
.debug_aranges that has both address and length of 0. In this
scenario, gdb assumes the entries are terminated and then proceeds to
reject the section because it reads a subsequent entry as if it were a
header.
It seems to me that, because each header describes the size of each
.debug_aranges CU, it's better to ignore 0,0 entries and simply read to
the end of the CU. That is what this patch does.
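
A minimal sketch of that policy, assuming the (address, length) tuples of
one CU have already been decoded into a vector whose extent came from the
header. The names arange and collect_aranges are hypothetical and this is
not the code in the DWARF reader:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// One decoded tuple: (address, length).
using arange = std::pair<uint64_t, uint64_t>;

// Hypothetical helper: keep every usable tuple of one CU.  The caller
// is assumed to have bounded TUPLES using the size from the header, so
// a 0,0 tuple is skipped rather than treated as a terminator.
static std::vector<arange>
collect_aranges (const std::vector<arange> &tuples)
{
  std::vector<arange> result;
  for (const arange &t : tuples)
    {
      if (t.first == 0 && t.second == 0)
        continue;
      result.push_back (t);
    }
  return result;
}
```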
I've patched an existing test to provide a regression test for this.
|
|
Building with clang++-14, I see:
CXX dwarf2/cooked-index.o
In file included from /home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.c:21:
/home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.h:172:12: error: explicitly defaulted move constructor is implicitly deleted [-Werror,-Wdefaulted-function-deleted]
explicit cooked_index (cooked_index &&other) = default;
^
/home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.h:225:16: note: move constructor of 'cooked_index' is implicitly deleted because field 'm_storage' has a deleted move constructor
auto_obstack m_storage;
^
/home/smarchi/src/binutils-gdb/gdb/../gdbsupport/gdb_obstack.h:128:28: note: 'auto_obstack' has been explicitly marked deleted here
DISABLE_COPY_AND_ASSIGN (auto_obstack);
^
In file included from /home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.c:21:
/home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.h:174:17: error: explicitly defaulted move assignment operator is implicitly deleted [-Werror,-Wdefaulted-function-deleted]
cooked_index &operator= (cooked_index &&other) = default;
^
/home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.h:225:16: note: move assignment operator of 'cooked_index' is implicitly deleted because field 'm_storage' has a deleted move assignment operator
auto_obstack m_storage;
^
/home/smarchi/src/binutils-gdb/gdb/../gdbsupport/gdb_obstack.h:128:3: note: 'operator=' has been explicitly marked deleted here
DISABLE_COPY_AND_ASSIGN (auto_obstack);
^
/home/smarchi/src/binutils-gdb/gdb/../include/ansidecl.h:425:8: note: expanded from macro 'DISABLE_COPY_AND_ASSIGN'
void operator= (const TYPE &) = delete
^
We explicitly make cooked_index have a default move constructor and
move assignment operator. But it doesn't actually happen because
cooked_index has a field of type auto_obstack, which isn't movable.
We don't actually need cooked_index to be movable at the moment, so
remove those lines.
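
The language rule behind the error can be reproduced in isolation. In
this sketch, non_movable and holder are hypothetical stand-ins for
auto_obstack and cooked_index:

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for auto_obstack: copying is deleted, and because the copy
// operations are user-declared, no move operations are implicitly
// declared either.
struct non_movable
{
  non_movable () = default;
  non_movable (const non_movable &) = delete;
  non_movable &operator= (const non_movable &) = delete;
};

// Stand-in for cooked_index once the defaulted move operations are
// removed: a class with such a member is simply neither copyable nor
// movable.
struct holder
{
  holder () = default;

  non_movable m_storage;
};

static_assert (!std::is_move_constructible<holder>::value,
               "holder cannot be move-constructed");
static_assert (!std::is_move_assignable<holder>::value,
               "holder cannot be move-assigned");
```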
Change-Id: Ifc1fe3d7d67e3ae1a14363d6c1869936fe80b0a2
|
|
Reading a simple file compiled with:
$ gcc -DONE=1 -gdwarf-4 -g3 test.c
$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0
I get:
Reading symbols from /tmp/cwd/a.out...
/home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.c:332:11: runtime error: null pointer passed as argument 2, which is declared to never be null
It looks like even if the size is 0 (the size of the `entries` vector is
0), we shouldn't be passing a NULL pointer to memcpy. And
`entries.data ()` returns NULL.
Fix that by using std::vector::insert to insert the items of entries
into m_entries. I haven't checked, but it should essentially compile
down to a memcpy, since the vector elements are trivially copyable.
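
A small sketch of the replacement pattern, using int elements and a
hypothetical append_entries helper rather than the real cooked index
types:

```cpp
#include <cassert>
#include <vector>

// Append ENTRIES to DEST.  Unlike a hand-written memcpy from
// entries.data (), this is well-defined when ENTRIES is empty, since
// our code never hands a possibly-null pointer to memcpy.
static void
append_entries (std::vector<int> &dest, const std::vector<int> &entries)
{
  dest.insert (dest.end (), entries.begin (), entries.end ());
}
```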
Change-Id: I75f1c901e9b522e42e89eb5936e2c70d68eb21e5
|
|
Change subfile::name to be a string, for easier memory management.
Change buildsym_compunit::m_comp_dir as well, since we move one into
the other at some point in patch_subfile_names, so it's easier to do
both at the same time. There are currently various NULL checks for
both fields; replace them with empty checks, which I think ends up
equivalent.
I can't test the change in xcoffread.c, it's best-effort.
Change-Id: I62b5fb08b2089e096768a090627ac7617e90a016
|
|
When building with -std=c++11, I get:
CXX dwarf2/read.o
/home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c: In function ‘void dwarf2_build_psymtabs_hard(dwarf2_per_objfile*)’:
/home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:7130:23: error: expected type-specifier before ‘typeof’
7130 | using iter_type = typeof (per_bfd->all_comp_units.begin ());
| ^~~~~~
This is because typeof is a GNU extension. Use C++'s decltype keyword
instead.
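
For illustration, the same pattern with a plain std::vector (the
variable all_units and the function sum_units are hypothetical):

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

static std::vector<int> all_units = {1, 2, 3};

// decltype is standard C++11, so this builds with -std=c++11, whereas
// typeof is only available as a GNU extension.
using iter_type = decltype (all_units.begin ());

static_assert (std::is_same<iter_type, std::vector<int>::iterator>::value,
               "decltype yields the iterator type");

// Walk the vector using the deduced iterator type.
static int
sum_units ()
{
  int sum = 0;
  for (iter_type it = all_units.begin (); it != all_units.end (); ++it)
    sum += *it;
  return sum;
}
```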
Change-Id: Ieca2e8d25e50f71dc6c615a405a972a54de3ef14
|
|
Now that the psymtab reader has been removed, the
dwarf2_per_cu_data::v union is no longer needed. Instead, we can
simply move the members from dwarf2_per_cu_quick_data into
dwarf2_per_cu_data and remove the "quick" object entirely.
|
|
This removes the DWARF psymtab reader.
|
|
This patch finally enables the new indexer. It is left until this
point in the series to avoid any regressions; in particular, it has to
come after the changes to the DWARF index writer to avoid this
problem.
However, if you experiment with the series, this patch can be moved
anywhere from the patch to wire in the new reader to this point.
Moving this patch around is how I got separate numbers for the
parallelization and background finalization patches.
In the ongoing performance example, this reduces the time from the
baseline of 1.598869 to 0.903534.
|
|
This updates the .debug_names writer to work with the new DWARF
scanner.
|
|
This updates the .gdb_index writer to work with the new DWARF scanner.
The .debug_names writer is deferred to another patch, to make review
simpler.
This introduces a small hack to psyms_seen_size, but this is
inconsequential because the function will be deleted in a subsequent
patch.
|
|
This updates the DWARF index writing code to make the addrmap-writing
a bit more generic. Now, it can handle multiple maps, and it can work
using the maps generated by the new indexer.
Note that the new addrmap_index_data::using_index field will be
deleted in a future patch, when the rest of the DWARF psymtab code is
removed.
|
|
To support the removal of partial symtabs from the DWARF index writer,
this makes a small change to have write_address_map accept the address
map as a parameter, rather than assuming it always comes from the
per-BFD object.
|
|
In order to change the DWARF index writer to avoid partial symtabs,
this patch changes the key type in psym_index_map (and renames that
type as well). Using the dwarf2_per_cu_data as the key makes it
simpler to reuse this code with the new indexer.
|
|
We'll be removing all the psymtab code from the DWARF reader. As a
preparatory step, this renames write_psymtabs_to_index to avoid the
"psymtab" name.
|
|
After scanning the CUs, the DWARF indexer merges all the data into a
single vector, canonicalizing C++ names as it proceeds. While not
necessarily single-threaded, this process is currently done in just
one thread, to keep memory costs lower.
However, this work is all done without reference to any data outside
of the indexes. This patch improves the apparent performance of GDB
by moving it to the background. All uses of the index are then made
to wait for this process to complete.
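
The "finalize in the background, wait on first use" idea can be sketched
as below. This is an assumption-laden illustration: it uses std::async
rather than gdb's thread pool, sorting stands in for finalization, and
the class background_index is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical index whose finalization (here, just sorting) runs in
// the background.  Every query first waits for that work to finish.
class background_index
{
public:
  explicit background_index (std::vector<std::string> names)
    : m_names (std::move (names))
  {
    m_future = std::async (std::launch::async, [this] ()
      {
        std::sort (m_names.begin (), m_names.end ());
      });
  }

  // The background task captures "this", so forbid copying and moving.
  background_index (const background_index &) = delete;
  background_index &operator= (const background_index &) = delete;

  bool contains (const std::string &name)
  {
    wait ();
    return std::binary_search (m_names.begin (), m_names.end (), name);
  }

private:
  // Block until finalization is done; later calls return immediately.
  void wait ()
  {
    if (m_future.valid ())
      m_future.get ();
  }

  std::vector<std::string> m_names;
  std::future<void> m_future;
};
```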
In our ongoing example, this reduces the scanning time on gdb itself
to 0.173937 (wall). Recall that before this patch, the time was
0.668923; and the psymbol reader does this in 1.598869. That is, at the
end of this series, we see about a 10x speedup.
|
|
This parallelizes the new DWARF indexer. The indexer's storage was
designed so that each storage object and each indexer is fully
independent. This setup makes it simple to scan different CUs
independently.
This patch creates a new cooked index storage object per thread, and
then scans a subset of all the CUs in each such thread, using gdb's
existing thread pool.
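
A rough sketch of the "one storage object per task" arrangement. It uses
std::async instead of gdb's thread pool, integers stand in for CUs, and
the names storage and scan_in_parallel are hypothetical; N_TASKS is
assumed to be greater than zero.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for a per-thread storage object.
struct storage
{
  std::vector<int> entries;
};

// Scan UNITS using N_TASKS independent tasks.  Each task writes only to
// its own storage object and reads only its own subset of the units, so
// no locking is needed.
static std::vector<storage>
scan_in_parallel (const std::vector<int> &units, std::size_t n_tasks)
{
  std::vector<storage> results (n_tasks);
  std::vector<std::future<void>> tasks;

  for (std::size_t t = 0; t < n_tasks; ++t)
    tasks.push_back (std::async (std::launch::async,
                                 [&units, &results, t, n_tasks] ()
      {
        for (std::size_t i = t; i < units.size (); i += n_tasks)
          results[t].entries.push_back (units[i] * 2);
      }));

  // Wait for every task before handing the storage objects back.
  for (auto &task : tasks)
    task.get ();
  return results;
}
```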
In the ongoing "gdb gdb" example, this patch reduces the wall time
down to 0.668923, from 0.903534. (Note that the 0.903534 is the time
for the new index -- that is, when the "enable the new index" patch is
rebased to before this one. However, in the final series, that patch
appears toward the end. Hopefully this isn't too confusing.)
|
|
Because BFD is not thread-safe, we need to be sure that any section
data that is needed is read before trying to do any DWARF indexing in
the background.
This patch takes a simple approach to this -- it pre-reads the
"info"-related sections. This is done for the main file, but also for
any auxiliary files, such as the DWO file.
This patch could perhaps be enhanced by removing some now-redundant
calls to dwarf2_section_info::read.
|
|
This wires the new DWARF indexer into the existing reader code. That
is, this patch makes the modification necessary to enable the new
indexer. It is not actually enabled by this patch -- that will be
done later.
I did a bit of performance testing for this patch and a few others. I
copied my built gdb to /tmp, so that each test would be done on the
same executable. Then, each time, I did:
$ ./gdb -nx
(gdb) maint time 1
(gdb) file /tmp/gdb
This patch is the baseline and on one machine came in at 1.598869 wall
time.
|
|
This implements quick_symbol_functions for the cooked DWARF index.
This is the code that interfaces between the new index and the rest of
gdb. Cooked indexes still aren't created by anything.
For the most part this is straightforward. It shares some concepts
with the existing DWARF indices. However, because names are stored
pre-split in the cooked index, name lookup here is necessarily
different; see expand_symtabs_matching for the gory details.
|
|
This patch adds the code to index DWARF. This is just the scanner; it
reads the DWARF and constructs the index, but nothing calls it yet.
The indexer is split into two parts: a storage object and an indexer
object. This is done to support the parallelization of this code -- a
future patch will create a single storage object per thread.
|
|
This patch introduces the new DWARF index class. It is called
"cooked" to contrast against a "raw" index, which is mapped from disk
without extra effort.
Nothing constructs a cooked index yet. The essential idea here is
that index entries are created via the "add" method; then when all the
entries have been read, they are "finalize"d -- name canonicalization
is performed and the entries are added to a sorted vector.
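
The add-then-finalize shape can be sketched with a toy class. Everything
here is hypothetical: toy_index is not the cooked index, and stripping
spaces is only a placeholder for real name canonicalization.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy index: entries are added cheaply, and the expensive work
// (canonicalization and sorting) is deferred to finalize.
class toy_index
{
public:
  void add (std::string name)
  {
    m_entries.push_back (std::move (name));
  }

  void finalize ()
  {
    for (std::string &name : m_entries)
      name = canonicalize (name);
    std::sort (m_entries.begin (), m_entries.end ());
  }

  // Only valid after finalize has been called.
  bool contains (const std::string &name) const
  {
    return std::binary_search (m_entries.begin (), m_entries.end (), name);
  }

private:
  // Placeholder canonicalization: remove all spaces.
  static std::string canonicalize (const std::string &name)
  {
    std::string result;
    for (char c : name)
      if (c != ' ')
        result += c;
    return result;
  }

  std::vector<std::string> m_entries;
};
```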
Entries use the DWARF name (DW_AT_name) or linkage name, not the full
name as is done for partial symbols.
These two facets -- the short name and the deferred canonicalization
-- help improve the performance of this approach. This will become
clear in later patches, when parallelization is added.
Some special code is needed for Ada, because GNAT only emits mangled
("encoded", in the Ada lingo) names, and so we reconstruct the
hierarchical structure after the fact. This is also done in the
finalization phase.
One other aspect worth noting is that the way the "main" function is
found is different in the new code. Currently gdb will notice
DW_AT_main_subprogram, but won't recognize "main" during reading --
this is done later, via explicit symbol lookup. This is done
differently in the new code so that finalization can be done in the
background without then requiring a synchronization to look up the
symbol.
|
|
This updates skip_one_die to speed it up in the cases where either
sibling_offset or size_if_constant are set.
|
|
The new DIE scanner works more or less along the lines indicated by
the text for the .debug_names section, disregarding the bugs in the
specification.
While working on this, I noticed that whether a DIE is interesting is
a static property of the DIE's abbrev. It also turns out that many
abbrevs imply a static size for the DIE data, and additionally that
for many abbrevs, the sibling offset is stored at a constant offset
from the start of the DIE.
This patch changes the abbrev reader to analyze each abbrev and stash
the results on the abbrev. These combine to speed up the new indexer.
If the "interesting" flag is false, GDB knows to skip the DIE
immediately. If the sibling offset is statically known, skipping can
be done without reading any attributes; and in some other cases, the
DIE can be skipped using simple arithmetic.
|
|
The replacement for the DWARF psymbol reader works in a somewhat
different way. The current reader reads and stores all the DIEs that
might be interesting. Then, if it is missing a DIE, it re-scans the
CU and reads them all. This approach is used for both intra- and
inter-CU references.
I instrumented the partial DIE hash to see how frequently it was used:
[ 0] -> 1538165
[ 1] -> 4912
[ 2] -> 96102
[ 3] -> 175
[ 4] -> 244
That is, most DIEs are never used, and some are looked up twice -- but
this is just an artifact of the implementation of
partial_die_info::fixup, which may do two lookups.
Based on this, the new implementation doesn't try to store any DIEs,
but instead just re-scans them on demand. In order to do this,
though, it is convenient to have a cache of DWARF abbrevs. This way,
if a second CU is needed to resolve an inter-CU reference, the abbrevs
for that CU need only be computed a single time.
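
A minimal sketch of such a cache, keyed only by an offset. The names
abbrev_table, toy_abbrev_cache and find_or_compute are hypothetical, and
the real cache may well use a different key and lookup scheme:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical stand-in for a parsed abbrev table.
struct abbrev_table
{
  uint64_t offset;
};

// Compute each abbrev table at most once, however many CUs refer to it.
class toy_abbrev_cache
{
public:
  const abbrev_table *find_or_compute (uint64_t offset)
  {
    auto it = m_tables.find (offset);
    if (it == m_tables.end ())
      {
        // Cache miss: "parse" the table and remember it.
        ++m_computations;
        std::unique_ptr<abbrev_table> table (new abbrev_table {offset});
        it = m_tables.emplace (offset, std::move (table)).first;
      }
    return it->second.get ();
  }

  int computations () const { return m_computations; }

private:
  std::map<uint64_t, std::unique_ptr<abbrev_table>> m_tables;
  int m_computations = 0;
};
```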
|
|
This changes the file_and_directory object to be able to compute and
cache the "fullname" in the same way that is done by other code, like
the psymtab reader.
|
|
The new DWARF scanner needs to save the entire cutu_reader object, not
just parts of it. In order to make this possible, this patch
refactors build_type_psymtabs_reader. This change is done separately
because it is easy to review in isolation and it helps make the later
patches smaller.
|
|
This adds a new overload of dwarf5_djb_hash. This is used in
subsequent patches.
|
|
This patch adds an option to skip_one_die that causes it not to skip
child DIEs. This is needed in the new scanner.
|
|
This changes dwarf2_get_pc_bounds so that it does not directly access
a psymtab or psymtabs_addrmap. Instead, both the addrmap and the
desired payload are passed as parameters. This makes it suitable to
be used by the new indexer.
|
|
This adds a new member to dwarf2_per_cu_data that indicates whether
addresses have been seen for this CU. This is then set by the
.debug_aranges reader. The idea here is to detect when a CU does not
have address information, so that the new indexer will know to do
extra scanning in that case.
|
|
Tom de Vries found a failure that we tracked down to a latent bug in
read_addrmap_from_aranges (previously create_addrmap_from_aranges).
The bug is that this code can erroneously reject .debug_aranges when
dwz is in use, due to CUs at duplicate offsets. Because aranges can't
refer to a CU coming from the dwz file, the fix is to simply skip such
CUs in the loop.
|
|
This patch splits create_addrmap_from_aranges into a wrapper function
and a worker function. The worker function is then used in a later
patch.
|
|
Remove all macros related to getting and setting some symbol value:
#define SYMBOL_VALUE(symbol) (symbol)->value.ivalue
#define SYMBOL_VALUE_ADDRESS(symbol) \
#define SET_SYMBOL_VALUE_ADDRESS(symbol, new_value) \
#define SYMBOL_VALUE_BYTES(symbol) (symbol)->value.bytes
#define SYMBOL_VALUE_COMMON_BLOCK(symbol) (symbol)->value.common_block
#define SYMBOL_BLOCK_VALUE(symbol) (symbol)->value.block
#define SYMBOL_VALUE_CHAIN(symbol) (symbol)->value.chain
#define MSYMBOL_VALUE(symbol) (symbol)->value.ivalue
#define MSYMBOL_VALUE_RAW_ADDRESS(symbol) ((symbol)->value.address + 0)
#define MSYMBOL_VALUE_ADDRESS(objfile, symbol) \
#define BMSYMBOL_VALUE_ADDRESS(symbol) \
#define SET_MSYMBOL_VALUE_ADDRESS(symbol, new_value) \
#define MSYMBOL_VALUE_BYTES(symbol) (symbol)->value.bytes
#define MSYMBOL_BLOCK_VALUE(symbol) (symbol)->value.block
Replace them with equivalent methods on the appropriate objects.
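
As an illustration of the macro-to-method conversion only: the struct
toy_symbol and the method names below are hypothetical and are not
claimed to match the ones added by this commit.

```cpp
#include <cassert>

// Hypothetical symbol type.  Instead of a macro such as
//   #define SYMBOL_VALUE(symbol) (symbol)->value.ivalue
// callers go through accessor methods, and the union can be private.
struct toy_symbol
{
  long value_longest () const { return m_value.ivalue; }
  void set_value_longest (long v) { m_value.ivalue = v; }

private:
  union
  {
    long ivalue;
    const unsigned char *bytes;
  } m_value {};
};
```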
Change-Id: Iafdab3b8eefc6dc2fd895aa955bf64fafc59ed50
|
|
Straightforward change: return a std::string instead of a
gdb::unique_xmalloc_ptr<char>. No behavior change expected.
Change-Id: Ia5e94c94221c35f978bb1b7bdffbff7209e0520e
|