path: root/gdb/dwarf2
Age | Commit message | Author | Files | Lines
2022-04-21 | Always use dwarf2_initialize_objfile | Tom Tromey | 2 | -13/+5
Internally we noticed that some tests would fail like so on Windows:

    warning: Section .debug_aranges in [...] has duplicate debug_info_offset 0x0, ignoring .debug_aranges.

Debugging showed that, in fact, a second CU was being created at this offset. We tracked this down to the fact that, while the ELF reader is careful to re-use the per-BFD data, other readers are not, and could re-read the DWARF data multiple times.

However, since the change to allow an objfile to have multiple "quick symbol" implementations, there's no reason for this approach -- it's safe and easy for all symbol readers to reuse the per-BFD data when reading DWARF.

This patch implements this idea, simplifying dwarf2_build_psymtabs and making it private, and then switching to dwarf2_initialize_objfile as the sole way to start the DWARF reader.

Note that, while I think the call to dwarf2_build_frame_info in machoread.c is also obsolete, I haven't attempted to remove it here.
2022-04-21 | gdbsupport: add path_join function | Simon Marchi | 2 | -24/+21
In this review [1], Eli pointed out that we should be careful when concatenating file names to avoid duplicated slashes. On Windows, a double slash at the beginning of a file path has a special meaning. So naively concatenating "/" and "foo/bar" would give "//foo/bar", which would not give the desired results.

We already have a few spots doing:

    if (first_path ends with a slash)
      path = first_path + second_path
    else
      path = first_path + slash + second_path

In general, I think it's nice to avoid superfluous slashes in file paths, since they might end up visible to the user and look a bit unprofessional.

Introduce the path_join function that can be used to join multiple path components together (along with unit tests).

I initially wanted to make it possible to join two absolute paths, to support the use case of prepending a sysroot path to a target file path, or prepending the debug-file-directory to a target file path. But the code in solib_find_1 shows that it is more complex than this anyway (for example, when the right hand side is a Windows path with a drive letter). So I don't think we need to support that case in path_join. That also keeps the implementation simpler.

Change a few spots to use path_join to show how it can be used. I believe that all the spots I changed are guarded by some checks that ensure the right hand side operand is not an absolute path.

Regression-tested on Ubuntu 18.04. Built-tested on Windows, and I also ran the new unit-test there.

[1] https://sourceware.org/pipermail/gdb-patches/2022-April/187559.html

Change-Id: I0df889f7e3f644e045f42ff429277b732eb6c752
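The joining rule described in this message can be sketched as follows. This is only an illustration, not the gdbsupport implementation: the name join_paths_sketch and its signature are made up here, and, as in the commit message, every component after the first is assumed not to be an absolute path.

```cpp
#include <initializer_list>
#include <string>

// Hypothetical helper (not gdbsupport's path_join): joins components with
// exactly one '/' between them.  Assumes only the first component may be
// absolute; Windows drive letters are deliberately not handled.
std::string
join_paths_sketch (std::initializer_list<std::string> parts)
{
  std::string result;

  for (const std::string &part : parts)
    {
      if (part.empty ())
        continue;

      // Add a separator only when the text so far does not already end
      // with one; this is what avoids producing "//foo/bar".
      if (!result.empty () && result.back () != '/')
        result += '/';

      result += part;
    }

  return result;
}
```

With this sketch, joining "/" and "foo/bar" yields "/foo/bar" rather than "//foo/bar".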
2022-04-20 | Replace symbol_symtab with symbol::symtab | Tom Tromey | 1 | -6/+6
This turns symbol_symtab into a method on symbol. It also replaces symbol_set_symtab with a method.
2022-04-20 | Add accessors for symbol's artificial field | Tom Tromey | 1 | -1/+1
For a series I'm experimenting with, it was handy to hide a symbol's "artificial" field behind accessors. This patch is the result.
2022-04-20 | Unify the DWARF index holders | Tom Tromey | 3 | -73/+72
The dwarf2_per_bfd object has a separate field for each possible kind of index. Until an earlier patch in this series, two of these were even derived from a common base class, but still had separate slots. This patch unifies all the index fields using the common base class that was introduced earlier in this series. This makes it more obvious that only a single index can be active at a time, and also removes some code from dwarf2_initialize_objfile.
2022-04-20 | Add an ad hoc version check to dwarf_scanner_base | Tom Tromey | 2 | -2/+15
Some generic code in the DWARF reader has a special case for older versions of .gdb_index. This patch adds an ad hoc version check method so that these spots can work without specific knowledge of which index is in use.
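As a rough illustration of such a check, the base class can expose a virtual method that defaults to "fine" while a versioned index overrides it. The class names and the threshold below are placeholders chosen for the sketch, not the actual GDB definitions.

```cpp
// Placeholder threshold for the sketch; not taken from the GDB sources.
constexpr int minimum_version_sketch = 8;

// Hypothetical stand-in for a common index base class.
struct scanner_base_sketch
{
  virtual ~scanner_base_sketch () = default;

  // Indexes without a notion of "old versions" simply pass the check.
  virtual bool version_check () const
  { return true; }
};

// Hypothetical stand-in for an index format that carries a version.
struct versioned_index_sketch : scanner_base_sketch
{
  explicit versioned_index_sketch (int v) : version (v) {}

  bool version_check () const override
  { return version >= minimum_version_sketch; }

  int version;
};

// Generic code only needs the base class interface.
bool
index_is_recent_sketch (const scanner_base_sketch &index)
{
  return index.version_check ();
}
```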
2022-04-20 | Simplify version check in dw2_symtab_iter_next | Tom Tromey | 1 | -5/+5
This simplifies the index version check in dw2_symtab_iter_next, by passing a reference to the index object to this function. This avoids an indirection via the per_bfd object.
2022-04-20 | Introduce and use dwarf_scanner_base | Tom Tromey | 3 | -8/+26
This introduces dwarf_scanner_base, a base class for all the index readers in the DWARF code. Then, it changes both mapped_index_base and cooked_index_vector to derive from this new base class.
2022-04-20 | Introduce readnow_functions | Tom Tromey | 1 | -53/+56
This introduces readnow_functions, a new subclass of dwarf2_base_index_functions, and changes the DWARF reader to use it. This lets us drop the "index is NULL" hack from the gdb index code.
2022-04-20 | Remove some "OBJF_READNOW" code from dwarf2_debug_names_index | Tom Tromey | 1 | -16/+1
The dwarf2_debug_names_index code treats a NULL debug_names_table as if it were from OBJF_READNOW. However, this trick is only done for gdb_index, never for debug_names -- see dwarf2_initialize_objfile.
2022-04-20 | Let mapped index classes create the quick_symbol_functions object | Tom Tromey | 2 | -7/+28
This changes the mapped index classes to create the quick_symbol_functions objects. This is a step toward having a more abstract interface to mapped indices.
2022-04-20 | Give mapped_index_base a virtual destructor | Tom Tromey | 1 | -4/+1
This changes mapped_index_base to have a virtual destructor, so it can be destroyed via its base class.
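A small sketch of why the virtual destructor matters: when an object is owned through a pointer to its base class, the derived destructor only runs if the base destructor is virtual. The types below are invented for the illustration and are unrelated to the real mapped_index_base.

```cpp
#include <memory>

// Hypothetical base class with a virtual destructor.
struct index_base_sketch
{
  virtual ~index_base_sketch () = default;
};

// Hypothetical derived class that records when it is destroyed.
struct derived_index_sketch : index_base_sketch
{
  explicit derived_index_sketch (bool *flag) : destroyed (flag) {}

  ~derived_index_sketch () override
  { *destroyed = true; }

  bool *destroyed;
};

// Owns a derived object through a base-class pointer and reports whether
// the derived destructor ran when the owner went out of scope.
bool
destroy_through_base_sketch ()
{
  bool destroyed = false;
  {
    std::unique_ptr<index_base_sketch> holder
      (new derived_index_sketch (&destroyed));
  }
  return destroyed;
}
```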
2022-04-20 | Move mapped_index_base to new header file | Tom Tromey | 3 | -72/+98
This moves mapped_index_base and the helper struct name_component to a new header file in gdb/dwarf2/.
2022-04-20 | Micro-optimize cooked_index_entry::full_name | Tom Tromey | 1 | -6/+5
I noticed that cooked_index_entry::full_name can return the canonical string when there is no parent entry. Regression tested on x86-64 Fedora 34.
2022-04-18 | gdbsupport: make gdb_abspath return an std::string | Simon Marchi | 1 | -3/+1
I'm trying to switch these functions to use std::string instead of char arrays, as much as possible. Some callers benefit from it (can avoid doing a copy of the result), while others suffer (have to make one more copy). Change-Id: Iced49b8ee2f189744c5072a3b217aab5af17a993
2022-04-16 | Add comments to dwarf2/abbrev-cache.h | Tom Tromey | 1 | -1/+9
This patch started when I noticed that the unordered_set include wasn't needed in abbrev-cache.h. (That was probably leftover from some earlier implementation of the class.) Then, I noticed that the class itself was under-commented. This patch fixes both issues.
2022-04-14 | Ignore 0,0 entries in .debug_aranges | Tom Tromey | 1 | -2/+9
When running the internal AdaCore test suite against the new DWARF indexer, I found one regression on RISC-V. The test in question uses --gc-sections, and winds up with an entry in the middle of a .debug_aranges that has both address and length of 0. In this scenario, gdb assumes the entries are terminated and then proceeds to reject the section because it reads a subsequent entry as if it were a header.

It seems to me that, because each header describes the size of each .debug_aranges CU, it's better to ignore 0,0 entries and simply read to the end. That is what this patch does.

I've patched an existing test to provide a regression test for this.
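The idea can be sketched on already-decoded (address, length) pairs. This is a simplified model rather than the reader in dwarf2/read.c: it assumes the caller has used the set's header to find where the set ends, so the loop just runs to the end of the input and skips 0,0 pairs instead of treating the first one as a terminator.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical decoded entry: (address, length).
using arange_sketch = std::pair<std::uint64_t, std::uint64_t>;

// Collect the meaningful entries of one address-range set.  RAW is
// assumed to hold every pair up to the end of the set, as given by the
// set's header.
std::vector<arange_sketch>
collect_aranges_sketch (const std::vector<arange_sketch> &raw)
{
  std::vector<arange_sketch> result;

  for (const arange_sketch &entry : raw)
    {
      // A 0,0 pair may appear in the middle of a set (for example after
      // --gc-sections); skip it rather than stopping.
      if (entry.first == 0 && entry.second == 0)
        continue;

      result.push_back (entry);
    }

  return result;
}
```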
2022-04-14 | gdb: remove move constructor and move assignment operator from cooked_index | Simon Marchi | 1 | -2/+0
Building with clang++-14, I see:

      CXX    dwarf2/cooked-index.o
    In file included from /home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.c:21:
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.h:172:12: error: explicitly defaulted move constructor is implicitly deleted [-Werror,-Wdefaulted-function-deleted]
      explicit cooked_index (cooked_index &&other) = default;
               ^
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.h:225:16: note: move constructor of 'cooked_index' is implicitly deleted because field 'm_storage' has a deleted move constructor
      auto_obstack m_storage;
                   ^
    /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/gdb_obstack.h:128:28: note: 'auto_obstack' has been explicitly marked deleted here
      DISABLE_COPY_AND_ASSIGN (auto_obstack);
                               ^
    In file included from /home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.c:21:
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.h:174:17: error: explicitly defaulted move assignment operator is implicitly deleted [-Werror,-Wdefaulted-function-deleted]
      cooked_index &operator= (cooked_index &&other) = default;
                    ^
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.h:225:16: note: move assignment operator of 'cooked_index' is implicitly deleted because field 'm_storage' has a deleted move assignment operator
      auto_obstack m_storage;
                   ^
    /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/gdb_obstack.h:128:3: note: 'operator=' has been explicitly marked deleted here
      DISABLE_COPY_AND_ASSIGN (auto_obstack);
      ^
    /home/smarchi/src/binutils-gdb/gdb/../include/ansidecl.h:425:8: note: expanded from macro 'DISABLE_COPY_AND_ASSIGN'
      void operator= (const TYPE &) = delete
           ^

We explicitly make cooked_index have a default move constructor and move assignment operator. But it doesn't actually happen because cooked_index has a field of type auto_obstack, which isn't movable. We don't actually need cooked_index to be movable at the moment, so remove those lines.

Change-Id: Ifc1fe3d7d67e3ae1a14363d6c1869936fe80b0a2
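The language rule behind this diagnostic can be reproduced in a few lines. The types below are stand-ins invented for the sketch (they are not auto_obstack or cooked_index): a member whose copy operations are deleted has no move operations either, so the enclosing class cannot be moved.

```cpp
#include <type_traits>

// Stand-in for a type whose copy operations are explicitly deleted.
// Declaring them suppresses the implicit move operations as well.
struct no_copy_sketch
{
  no_copy_sketch () = default;
  no_copy_sketch (const no_copy_sketch &) = delete;
  no_copy_sketch &operator= (const no_copy_sketch &) = delete;
};

// Stand-in for a class holding such a member.  Any move constructor it
// asks the compiler to generate is defined as deleted.
struct holder_sketch
{
  holder_sketch () = default;

  no_copy_sketch m_storage;
};

static_assert (!std::is_move_constructible<holder_sketch>::value,
	       "holder_sketch is not move-constructible");
static_assert (!std::is_move_assignable<holder_sketch>::value,
	       "holder_sketch is not move-assignable");
```

Writing `holder_sketch (holder_sketch &&) = default;` in such a class compiles to a deleted function, which is what clang's -Wdefaulted-function-deleted warns about.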
2022-04-12 | gdb: fix "passing NULL to memcpy" UBsan error in dwarf2/cooked-index.c | Simon Marchi | 1 | -4/+2
Reading a simple file compiled with:

    $ gcc -DONE=1 -gdwarf-4 -g3 test.c
    $ gcc --version
    gcc (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0

I get:

    Reading symbols from /tmp/cwd/a.out...
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.c:332:11: runtime error: null pointer passed as argument 2, which is declared to never be null

It looks like even if the size is 0 (the size of the `entries` vector is 0), we shouldn't be passing a NULL pointer to memcpy. And `entries.data ()` returns NULL.

Fix that by using std::vector::insert to insert the items of entries into m_entries. I haven't checked, but it should essentially compile down to a memcpy, since the vector elements are trivially copyable.

Change-Id: I75f1c901e9b522e42e89eb5936e2c70d68eb21e5
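A minimal sketch of the replacement pattern, using invented names (entry_sketch, append_entries_sketch) rather than the code in cooked-index.c: std::vector::insert with an iterator range is well defined for an empty source, so no pointer is ever handed to memcpy.

```cpp
#include <vector>

// Hypothetical trivially copyable element type.
struct entry_sketch
{
  int offset;
};

// Append all of SRC to DEST.  Unlike memcpy with SRC.data (), this is
// fine when SRC is empty.  DEST and SRC must be different vectors.
void
append_entries_sketch (std::vector<entry_sketch> &dest,
		       const std::vector<entry_sketch> &src)
{
  dest.insert (dest.end (), src.begin (), src.end ());
}
```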
2022-04-12 | gdb: change subfile::name and buildsym_compunit::m_comp_dir to strings | Simon Marchi | 1 | -5/+5
Change subfile::name to be a string, for easier memory management. Change buildsym_compunit::m_comp_dir as well, since we move one into the other at some point in patch_subfile_names, so it's easier to do both at the same time. There are currently various NULL checks for both fields; replace them with empty checks, which I think ends up equivalent. I can't test the change in xcoffread.c, so it's best-effort. Change-Id: I62b5fb08b2089e096768a090627ac7617e90a016
2022-04-12 | gdb: use decltype instead of typeof in dwarf2/read.c | Simon Marchi | 1 | -1/+1
When building with -std=c++11, I get:

      CXX    dwarf2/read.o
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c: In function ‘void dwarf2_build_psymtabs_hard(dwarf2_per_objfile*)’:
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:7130:23: error: expected type-specifier before ‘typeof’
     7130 |     using iter_type = typeof (per_bfd->all_comp_units.begin ());
          |                       ^~~~~~

This is because typeof is a GNU extension. Use C++'s decltype keyword instead.

Change-Id: Ieca2e8d25e50f71dc6c615a405a972a54de3ef14
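For illustration, the same kind of alias written with standard decltype; the function and container here are invented for the sketch and are unrelated to all_comp_units.

```cpp
#include <vector>

// Sums the elements of UNITS, naming the iterator type with decltype,
// which is standard C++11, instead of the GNU typeof extension.
int
sum_units_sketch (const std::vector<int> &units)
{
  using iter_type = decltype (units.begin ());

  int total = 0;
  for (iter_type it = units.begin (); it != units.end (); ++it)
    total += *it;

  return total;
}
```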
2022-04-12 | Remove dwarf2_per_cu_data::v | Tom Tromey | 2 | -65/+33
Now that the psymtab reader has been removed, the dwarf2_per_cu_data::v union is no longer needed. Instead, we can simply move the members from dwarf2_per_cu_quick_data into dwarf2_per_cu_data and remove the "quick" object entirely.
2022-04-12 | Delete DWARF psymtab code | Tom Tromey | 6 | -2964/+198
This removes the DWARF psymtab reader.
2022-04-12 | Enable the new DWARF indexer | Tom Tromey | 1 | -1/+2
This patch finally enables the new indexer. It is left until this point in the series to avoid any regressions; in particular, it has to come after the changes to the DWARF index writer to avoid this problem. However, if you experiment with the series, this patch can be moved anywhere from the patch to wire in the new reader to this point. Moving this patch around is how I got separate numbers for the parallelization and background finalization patches. In the ongoing performance example, this reduces the time from the baseline of 1.598869 to 0.903534.
2022-04-12 | Adapt .debug_names writer to new DWARF scanner | Tom Tromey | 1 | -10/+42
This updates the .debug_names writer to work with the new DWARF scanner.
2022-04-12 | Adapt .gdb_index writer to new DWARF scanner | Tom Tromey | 1 | -8/+55
This updates the .gdb_index writer to work with the new DWARF scanner. The .debug_names writer is deferred to another patch, to make review simpler. This introduces a small hack to psyms_seen_size, but is inconsequential because this function will be deleted in a subsequent patch.
2022-04-12 | Genericize addrmap handling in the DWARF index writer | Tom Tromey | 1 | -9/+28
This updates the DWARF index writing code to make the addrmap-writing a bit more generic. Now, it can handle multiple maps, and it can work using the maps generated by the new indexer. Note that the new addrmap_index_data::using_index field will be deleted in a future patch, when the rest of the DWARF psymtab code is removed.
2022-04-12 | Change parameters to write_address_map | Tom Tromey | 1 | -4/+4
To support the removal of partial symtabs from the DWARF index writer, this makes a small change to have write_address_map accept the address map as a parameter, rather than assuming it always comes from the per-BFD object.
2022-04-12 | Change the key type in psym_index_map | Tom Tromey | 1 | -9/+9
In order to change the DWARF index writer to avoid partial symtabs, this patch changes the key type in psym_index_map (and renames that type as well). Using the dwarf2_per_cu_data as the key makes it simpler to reuse this code with the new indexer.
2022-04-12 | Rename write_psymtabs_to_index | Tom Tromey | 3 | -9/+9
We'll be removing all the psymtab code from the DWARF reader. As a preparatory step, this renames write_psymtabs_to_index to avoid the "psymtab" name.
2022-04-12 | "Finalize" the DWARF index in the background | Tom Tromey | 2 | -2/+25
After scanning the CUs, the DWARF indexer merges all the data into a single vector, canonicalizing C++ names as it proceeds. While not necessarily single-threaded, this process is currently done in just one thread, to keep memory costs lower.

However, this work is all done without reference to any data outside of the indexes. This patch improves the apparent performance of GDB by moving it to the background. All uses of the index are then made to wait for this process to complete.

In our ongoing example, this reduces the scanning time on gdb itself to 0.173937 (wall). Recall that before this patch, the time was 0.668923, and the psymbol reader does this in 1.598869. That is, at the end of this series, we see about a 10x speedup.
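The "finalize in the background, wait on first use" pattern can be sketched with std::async and a shared future. This is an illustration only: the class name is invented, sorting stands in for the real finalization work, and GDB uses its own thread pool rather than std::async.

```cpp
#include <algorithm>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical index whose finalization (here: just sorting) runs in a
// background task started by the constructor.
class background_index_sketch
{
public:
  explicit background_index_sketch (std::vector<std::string> names)
    : m_names (std::move (names)),
      m_done (std::async (std::launch::async, [this] ()
	{
	  std::sort (m_names.begin (), m_names.end ());
	}).share ())
  {
  }

  // The background task captures "this", so the object must stay put.
  background_index_sketch (const background_index_sketch &) = delete;
  background_index_sketch &operator= (const background_index_sketch &)
    = delete;

  // Every user of the index first waits for finalization to complete.
  bool contains (const std::string &name) const
  {
    m_done.wait ();
    return std::binary_search (m_names.begin (), m_names.end (), name);
  }

private:
  // Declared before m_done so that it is constructed before the task
  // starts and destroyed only after the task has finished.
  std::vector<std::string> m_names;
  std::shared_future<void> m_done;
};
```

The caller gets control back as soon as the constructor returns; only the first lookup pays for any finalization time that has not yet elapsed.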
2022-04-12 | Parallelize DWARF indexing | Tom Tromey | 4 | -88/+289
This parallelizes the new DWARF indexer. The indexer's storage was designed so that each storage object and each indexer is fully independent. This setup makes it simple to scan different CUs independently. This patch creates a new cooked index storage object per thread, and then scans a subset of all the CUs in each such thread, using gdb's existing thread pool. In the ongoing "gdb gdb" example, this patch reduces the wall time down to 0.668923, from 0.903534. (Note that the 0.903534 is the time for the new index -- that is, when the "enable the new index" patch is rebased to before this one. However, in the final series, that patch appears toward the end. Hopefully this isn't too confusing.)
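A sketch of the "one storage object per thread" arrangement described here. It uses plain std::thread and integers in place of CUs, both of which are simplifications made for the illustration; the patch itself uses gdb's existing thread pool.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-thread storage; each worker writes only to its own.
struct storage_sketch
{
  std::vector<int> entries;
};

// "Scan" CUS using N_THREADS workers.  Worker T handles elements
// T, T + N_THREADS, T + 2 * N_THREADS, ... so no two workers share data.
std::vector<storage_sketch>
scan_in_parallel_sketch (const std::vector<int> &cus, std::size_t n_threads)
{
  if (n_threads == 0)
    n_threads = 1;

  std::vector<storage_sketch> storages (n_threads);
  std::vector<std::thread> workers;

  for (std::size_t t = 0; t < n_threads; ++t)
    workers.emplace_back ([&cus, &storages, t, n_threads] ()
      {
	for (std::size_t i = t; i < cus.size (); i += n_threads)
	  storages[t].entries.push_back (cus[i]);
      });

  for (std::thread &worker : workers)
    worker.join ();

  return storages;
}
```

Because the storage objects are independent, no locking is needed during the scan; merging them is a separate, later step.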
2022-04-12 | Pre-read DWARF section data | Tom Tromey | 2 | -122/+103
Because BFD is not thread-safe, we need to be sure that any section data that is needed is read before trying to do any DWARF indexing in the background. This patch takes a simple approach to this -- it pre-reads the "info"-related sections. This is done for the main file, and also for any auxiliary files, such as the DWO file. This patch could perhaps be enhanced by removing some now-redundant calls to dwarf2_section_info::read.
2022-04-12 | Wire in the new DWARF indexer | Tom Tromey | 1 | -59/+146
This wires the new DWARF indexer into the existing reader code. That is, this patch makes the modification necessary to enable the new indexer. It is not actually enabled by this patch -- that will be done later.

I did a bit of performance testing for this patch and a few others. I copied my built gdb to /tmp, so that each test would be done on the same executable. Then, each time, I did:

    $ ./gdb -nx
    (gdb) maint time 1
    (gdb) file /tmp/gdb

This patch is the baseline and on one machine came in at 1.598869 wall time.
2022-04-12 | Implement quick_symbol_functions for cooked DWARF index | Tom Tromey | 2 | -0/+281
This implements quick_symbol_functions for the cooked DWARF index. This is the code that interfaces between the new index and the rest of gdb. Cooked indexes still aren't created by anything. For the most part this is straightforward. It shares some concepts with the existing DWARF indices. However, because names are stored pre-split in the cooked index, name lookup here is necessarily different; see expand_symtabs_matching for the gory details.
2022-04-12 | The new DWARF indexer | Tom Tromey | 2 | -2/+818
This patch adds the code to index DWARF. This is just the scanner; it reads the DWARF and constructs the index, but nothing calls it yet. The indexer is split into two parts: a storage object and an indexer object. This is done to support the parallelization of this code -- a future patch will create a single storage object per thread.
2022-04-12 | Introduce the new DWARF index class | Tom Tromey | 2 | -0/+530
This patch introduces the new DWARF index class. It is called "cooked" to contrast against a "raw" index, which is mapped from disk without extra effort. Nothing constructs a cooked index yet. The essential idea here is that index entries are created via the "add" method; then when all the entries have been read, they are "finalize"d -- name canonicalization is performed and the entries are added to a sorted vector. Entries use the DWARF name (DW_AT_name) or linkage name, not the full name as is done for partial symbols. These two facets -- the short name and the deferred canonicalization -- help improve the performance of this approach. This will become clear in later patches, when parallelization is added. Some special code is needed for Ada, because GNAT only emits mangled ("encoded", in the Ada lingo) names, and so we reconstruct the hierarchical structure after the fact. This is also done in the finalization phase. One other aspect worth noting is that the way the "main" function is found is different in the new code. Currently gdb will notice DW_AT_main_subprogram, but won't recognize "main" during reading -- this is done later, via explicit symbol lookup. This is done differently in the new code so that finalization can be done in the background without then requiring a synchronization to look up the symbol.
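The add-then-finalize shape described above might look roughly like this. Everything in the sketch is a placeholder: the class name is invented, entries are reduced to bare names, and "canonicalization" is modeled as removing spaces, which is far simpler than real C++ name canonicalization.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical two-phase index: cheap "add" calls while scanning, then a
// single "finalize" step that canonicalizes and sorts.
class two_phase_index_sketch
{
public:
  // Called during scanning; no canonicalization happens here.
  void add (std::string name)
  {
    m_entries.push_back (std::move (name));
  }

  // Called once, after all entries have been added.
  void finalize ()
  {
    for (std::string &name : m_entries)
      name.erase (std::remove (name.begin (), name.end (), ' '),
		  name.end ());	// Placeholder for canonicalization.

    std::sort (m_entries.begin (), m_entries.end ());
  }

  // Only meaningful after finalize has run.
  const std::vector<std::string> &entries () const
  {
    return m_entries;
  }

private:
  std::vector<std::string> m_entries;
};
```

Deferring the expensive work to one step is what lets the scan itself stay cheap, and it is also what allows that step to be moved off the main thread.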
2022-04-12 | Update skip_one_die for new abbrev properties | Tom Tromey | 1 | -0/+18
This updates skip_one_die to speed it up in the cases where either sibling_offset or size_if_constant are set.
2022-04-12 | Statically examine abbrev properties | Tom Tromey | 2 | -2/+159
The new DIE scanner works more or less along the lines indicated by the text for the .debug_names section, disregarding the bugs in the specification. While working on this, I noticed that whether a DIE is interesting is a static property of the DIE's abbrev. It also turns out that many abbrevs imply a static size for the DIE data, and additionally that for many abbrevs, the sibling offset is stored at a constant offset from the start of the DIE. This patch changes the abbrev reader to analyze each abbrev and stash the results on the abbrev. These combine to speed up the new indexer. If the "interesting" flag is false, GDB knows to skip the DIE immediately. If the sibling offset is statically known, skipping can be done without reading any attributes; and in some other cases, the DIE can be skipped using simple arithmetic.
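To make the "static size" idea concrete, here is a sketch that analyzes an abbrev once and records a constant DIE size when every attribute form has a fixed width. The enum, the struct, and the small set of forms are simplifications invented for the example; the real analysis covers the full set of DWARF forms and also tracks the other properties mentioned above.

```cpp
#include <cstddef>
#include <vector>

// Simplified, hypothetical set of attribute forms.
enum class form_sketch
{
  data1, data2, data4, data8,	// Fixed-width forms.
  string_inline, uleb128	// Size depends on the data itself.
};

// Return the byte size of FORM, or 0 if it is not statically known.
std::size_t
fixed_size_sketch (form_sketch form)
{
  switch (form)
    {
    case form_sketch::data1: return 1;
    case form_sketch::data2: return 2;
    case form_sketch::data4: return 4;
    case form_sketch::data8: return 8;
    case form_sketch::string_inline:
    case form_sketch::uleb128:
      return 0;
    }
  return 0;
}

struct abbrev_sketch
{
  std::vector<form_sketch> forms;
  // Total size of the attribute data, or 0 when it is not constant.
  std::size_t size_if_constant = 0;
};

// Analyze ABBREV once, so that later every DIE using it can be skipped
// with simple arithmetic when the size is constant.
void
analyze_abbrev_sketch (abbrev_sketch &abbrev)
{
  std::size_t total = 0;

  for (form_sketch form : abbrev.forms)
    {
      std::size_t size = fixed_size_sketch (form);
      if (size == 0)
	{
	  abbrev.size_if_constant = 0;
	  return;
	}
      total += size;
    }

  abbrev.size_if_constant = total;
}
```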
2022-04-12 | Introduce DWARF abbrev cache | Tom Tromey | 4 | -3/+129
The replacement for the DWARF psymbol reader works in a somewhat different way.

The current reader reads and stores all the DIEs that might be interesting. Then, if it is missing a DIE, it re-scans the CU and reads them all. This approach is used for both intra- and inter-CU references.

I instrumented the partial DIE hash to see how frequently it was used:

    [ 0] -> 1538165
    [ 1] -> 4912
    [ 2] -> 96102
    [ 3] -> 175
    [ 4] -> 244

That is, most DIEs are never used, and some are looked up twice -- but this is just an artifact of the implementation of partial_die_info::fixup, which may do two lookups.

Based on this, the new implementation doesn't try to store any DIEs, but instead just re-scans them on demand. In order to do this, though, it is convenient to have a cache of DWARF abbrevs. This way, if a second CU is needed to resolve an inter-CU reference, the abbrevs for that CU need only be computed a single time.
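A compute-once cache of this kind can be sketched as a map keyed by offset. The class name, the loader callback, and the use of a string as the "table" are all assumptions made for the illustration; this does not mirror the interface in dwarf2/abbrev-cache.h.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical cache: a "table" is modeled as a string, and LOADER
// stands in for the code that parses abbrevs at a given offset.
class abbrev_cache_sketch
{
public:
  using loader = std::function<std::string (std::uint64_t)>;

  explicit abbrev_cache_sketch (loader load)
    : m_load (std::move (load))
  {
  }

  // Return the table at OFFSET, computing it only on the first request.
  const std::string &find_or_load (std::uint64_t offset)
  {
    auto it = m_tables.find (offset);
    if (it == m_tables.end ())
      it = m_tables.emplace (offset, m_load (offset)).first;
    return it->second;
  }

private:
  loader m_load;
  std::map<std::uint64_t, std::string> m_tables;
};
```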
2022-04-12 | Add "fullname" handling to file_and_directory | Tom Tromey | 2 | -0/+60
This changes the file_and_directory object to be able to compute and cache the "fullname" in the same way that is done by other code, like the psymtab reader.
2022-04-12 | Refactor build_type_psymtabs_reader | Tom Tromey | 1 | -12/+7
The new DWARF scanner needs to save the entire cutu_reader object, not just parts of it. In order to make this possible, this patch refactors build_type_psymtabs_reader. This change is done separately because it is easy to review in isolation and it helps make the later patches smaller.
2022-04-12 | Add new overload of dwarf5_djb_hash | Tom Tromey | 2 | -0/+18
This adds a new overload of dwarf5_djb_hash. This is used in subsequent patches.
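For readers unfamiliar with the function, the sketch below shows a case-folding DJB hash with two overloads, one for NUL-terminated strings and one taking an explicit length. It is an approximation under stated assumptions: the names and signatures are invented, the folding here only covers ASCII letters, and the parameter type of the overload added by the patch may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical length-taking overload: DJB hash (seed 5381, multiplier
// 33) over LEN bytes of STR, folding ASCII upper case to lower case.
std::uint32_t
djb_hash_sketch (const char *str, std::size_t len)
{
  std::uint32_t hash = 5381;

  for (std::size_t i = 0; i < len; ++i)
    {
      unsigned char c = static_cast<unsigned char> (str[i]);
      if (c >= 'A' && c <= 'Z')
	c = static_cast<unsigned char> (c - 'A' + 'a');

      hash = hash * 33 + c;
    }

  return hash;
}

// Overload for NUL-terminated strings, defined in terms of the one above.
std::uint32_t
djb_hash_sketch (const char *str)
{
  return djb_hash_sketch (str, std::strlen (str));
}
```

A length-taking overload is convenient when the name is a slice of a larger buffer and is not NUL-terminated at the right place.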
2022-04-12 | Let skip_one_die not skip children | Tom Tromey | 1 | -6/+10
This patch adds an option to skip_one_die that causes it not to skip child DIEs. This is needed in the new scanner.
2022-04-12 | Refactor dwarf2_get_pc_bounds | Tom Tromey | 1 | -20/+29
This changes dwarf2_get_pc_bounds so that it does not directly access a psymtab or psymtabs_addrmap. Instead, both the addrmap and the desired payload are passed as parameters. This makes it suitable to be used by the new indexer.
2022-04-12 | Add dwarf2_per_cu_data::addresses_seen | Tom Tromey | 2 | -0/+7
This adds a new member to dwarf2_per_cu_data that indicates whether addresses have been seen for this CU. This is then set by the .debug_aranges reader. The idea here is to detect when a CU does not have address information, so that the new indexer will know to do extra scanning in that case.
2022-04-12 | Fix latent bug in read_addrmap_from_aranges | Tom Tromey | 1 | -2/+3
Tom de Vries found a failure that we tracked down to a latent bug in read_addrmap_from_aranges (previously create_addrmap_from_aranges). The bug is that this code can erroneously reject .debug_aranges when dwz is in use, due to CUs at duplicate offsets. Because aranges can't refer to a CU coming from the dwz file, the fix is to simply skip such CUs in the loop.
2022-04-12 | Split create_addrmap_from_aranges | Tom Tromey | 1 | -17/+32
This patch splits create_addrmap_from_aranges into a wrapper function and a worker function. The worker function is then used in a later patch.
2022-04-11 | gdb: remove symbol value macros | Simon Marchi | 2 | -22/+19
Remove all macros related to getting and setting some symbol value:

    #define SYMBOL_VALUE(symbol) (symbol)->value.ivalue
    #define SYMBOL_VALUE_ADDRESS(symbol) \
    #define SET_SYMBOL_VALUE_ADDRESS(symbol, new_value) \
    #define SYMBOL_VALUE_BYTES(symbol) (symbol)->value.bytes
    #define SYMBOL_VALUE_COMMON_BLOCK(symbol) (symbol)->value.common_block
    #define SYMBOL_BLOCK_VALUE(symbol) (symbol)->value.block
    #define SYMBOL_VALUE_CHAIN(symbol) (symbol)->value.chain
    #define MSYMBOL_VALUE(symbol) (symbol)->value.ivalue
    #define MSYMBOL_VALUE_RAW_ADDRESS(symbol) ((symbol)->value.address + 0)
    #define MSYMBOL_VALUE_ADDRESS(objfile, symbol) \
    #define BMSYMBOL_VALUE_ADDRESS(symbol) \
    #define SET_MSYMBOL_VALUE_ADDRESS(symbol, new_value) \
    #define MSYMBOL_VALUE_BYTES(symbol) (symbol)->value.bytes
    #define MSYMBOL_BLOCK_VALUE(symbol) (symbol)->value.block

Replace them with equivalent methods on the appropriate objects.

Change-Id: Iafdab3b8eefc6dc2fd895aa955bf64fafc59ed50
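The general shape of such a conversion, shown on a toy type: a macro that reaches into a union member becomes a pair of accessor methods, and the union itself can then be made private. The struct and method names below are illustrative and should not be read as the exact names introduced by the patch.

```cpp
// Before (macro style, shown for comparison only):
//   #define TOY_VALUE(symbol) (symbol)->value.ivalue

// After: hypothetical accessor methods on the object itself.
struct toy_symbol_sketch
{
  long value_longest () const
  {
    return m_value.ivalue;
  }

  void set_value_longest (long new_value)
  {
    m_value.ivalue = new_value;
  }

private:
  // The storage is no longer reachable from outside the class.
  union
  {
    long ivalue;
    const unsigned char *bytes;
  } m_value {};
};
```

Methods give the compiler a chance to type-check each call site, and they leave room to change the underlying representation later without touching the callers.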
2022-04-07 | gdb: change file_file_name to return an std::string | Simon Marchi | 3 | -18/+10
Straightforward change, return an std::string instead of a gdb::unique_xmalloc_ptr<char>. No behavior change expected. Change-Id: Ia5e94c94221c35f978bb1b7bdffbff7209e0520e