path: root/gdb/dwarf2
Age | Commit message | Author | Files | Lines
47 hours | gdb/dwarf: skip type units in create_dwo_cus_hash_table | Simon Marchi | 1 | -0/+6
When compiling with -gsplit-dwarf -fdebug-types-section, DWARF 5 .debug_info.dwo sections may contain some type units: $ llvm-dwarfdump -F -color a-test.dwo | head -n 5 a-test.dwo: file format elf64-x86-64 .debug_info.dwo contents: 0x00000000: Type Unit: length = 0x000008a0, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_type, abbr_offset = 0x0000, addr_size = 0x08, name = 'vector<int, std::allocator<int> >', type_signature = 0xb499dcf29e2928c4, type_offset = 0x0023 (next unit at 0x000008a4) In this case, create_dwo_cus_hash_table wrongly creates a dwo_unit for it and adds it to dwo_file::cus. create_dwo_debug_type_hash_table later correctly creates a dwo_unit that it puts in dwo_file::tus. This can be observed with: $ ./gdb -nx -q --data-directory=data-directory -ex 'maint set dwarf sync on' -ex "maint set worker-threads 0" -ex "set debug dwarf-read 2" -ex "file a.out" -batch ... [dwarf-read] create_dwo_cus_hash_table: Reading .debug_info.dwo for /home/smarchi/build/binutils-gdb/gdb/a-test.dwo: [dwarf-read] create_dwo_cus_hash_table: offset 0x0, dwo_id 0xb499dcf29e2928c4 [dwarf-read] create_dwo_cus_hash_table: offset 0x8a4, dwo_id 0x496a8791a842701b [dwarf-read] create_dwo_cus_hash_table: offset 0x941, dwo_id 0xefd13b3f62ea9fea ... [dwarf-read] create_dwo_debug_type_hash_table: Reading .debug_info.dwo for /home/smarchi/build/binutils-gdb/gdb/a-test.dwo [dwarf-read] create_dwo_debug_type_hash_table: offset 0x0, signature 0xb499dcf29e2928c4 [dwarf-read] create_dwo_debug_type_hash_table: offset 0x8a4, signature 0x496a8791a842701b [dwarf-read] create_dwo_debug_type_hash_table: offset 0x941, signature 0xefd13b3f62ea9fea ... Fix it by skipping anything that isn't a compile unit in create_dwo_cus_hash_table. After this patch, the debug output of create_dwo_cus_hash_table only shows one created dwo_unit, as we expect. I couldn't find any user-visible problem related to this, I just noticed it while debugging. 
Change-Id: I7dddf766fe1164123b6702027b1beb56114f25b1 Reviewed-By: Tom de Vries <tdevries@suse.de>
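The filtering described above can be sketched roughly as follows. This is a minimal illustration, not GDB's code: `unit_header` and `collect_dwo_cus` are hypothetical names, and only the `DW_UT_*` values come from the DWARF 5 specification.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Unit type codes as defined by DWARF 5.
enum dwarf_unit_type : std::uint8_t
{
  DW_UT_compile = 0x01,
  DW_UT_type = 0x02,
  DW_UT_partial = 0x03,
  DW_UT_skeleton = 0x04,
  DW_UT_split_compile = 0x05,
  DW_UT_split_type = 0x06,
};

// Hypothetical, simplified stand-in for a parsed unit header.
struct unit_header
{
  std::uint64_t offset;
  dwarf_unit_type unit_type;
  std::uint64_t signature;	// DWO id or type signature.
};

// Return only the units that belong in the compile unit table.
std::vector<unit_header>
collect_dwo_cus (const std::vector<unit_header> &units)
{
  std::vector<unit_header> cus;

  for (const unit_header &hdr : units)
    {
      // Type units are picked up by a separate pass; skip them here.
      if (hdr.unit_type != DW_UT_compile
	  && hdr.unit_type != DW_UT_split_compile)
	continue;

      cus.push_back (hdr);
    }

  return cus;
}
```

With a .dwo file holding several split type units and one split compile unit, only the compile unit would survive this filter, which matches the single dwo_unit reported in the debug output after the patch.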
47 hours | gdb/dwarf: rename some functions to specify "dwo" | Simon Marchi | 1 | -12/+13
Rename some functions to make it clearer that they are only relevant when dealing with DWO files. Change-Id: Ia0cd3320bf16ebdbdc3c09d7963f372e6679ef7c Reviewed-By: Tom de Vries <tdevries@suse.de>
8 days | [gdb/symtab] Handle DW_OP_entry_value at function entry | Tom de Vries | 2 | -16/+76
On riscv64-linux, with test-case gdb.base/vla-optimized-out.exp I ran into: ... (gdb) p sizeof (a)^M $2 = <optimized out>^M (gdb) FAIL: $exp: o1: printed size of optimized out vla ... The variable a has type 0xbf: ... <1><bf>: Abbrev Number: 12 (DW_TAG_array_type) <c0> DW_AT_type : <0xe3> <c4> DW_AT_sibling : <0xdc> <2><c8>: Abbrev Number: 13 (DW_TAG_subrange_type) <c9> DW_AT_type : <0xdc> <cd> DW_AT_upper_bound : 13 byte block: a3 1 5a 23 1 8 20 24 8 20 26 31 1c (DW_OP_entry_value: (DW_OP_reg10 (a0)); DW_OP_plus_uconst: 1; DW_OP_const1u: 32; DW_OP_shl; DW_OP_const1u: 32; DW_OP_shra; DW_OP_lit1; DW_OP_minus) ... which has an upper bound using a DW_OP_entry_value, and since the corresponding call site contains no information to resolve the value of a0 at function entry: ... <2><6b>: Abbrev Number: 6 (DW_TAG_call_site) <6c> DW_AT_call_return_pc: 0x638 <74> DW_AT_call_origin : <0x85> ... evaluating the DWARF expression fails, and we get <optimized out>. My first thought was to try breaking at *f1 instead of f1 to see if that would help, but actually the breakpoint resolved to the same address. In other words, the inferior is stopped at function entry. Fix this by resolving DW_OP_entry_value when stopped at function entry by simply evaluating the expression. This handles these two cases (x86_64, using reg rdi): - DW_OP_entry_value: (DW_OP_regx: 5 (rdi)) - DW_OP_entry_value: (DW_OP_bregx: 5 (rdi) 0; DW_OP_deref_size: 4) Tested on x86_64-linux. Tested gdb.base/vla-optimized-out.exp on riscv64-linux. Tested an earlier version of gdb.dwarf2/dw2-entry-value-2.exp on riscv64-linux, but atm I'm running into trouble on that machine (cfarm92) so I haven't tested the current version there.
9 days | gdb/dwarf2: pass correct dwarf2_cu to lookup_dwo_id in create_cus_hash_table | Simon Marchi | 1 | -1/+1
Commit 71a48752660b ("gdb/dwarf: remove create_dwo_cu_reader") introduced a regression when handling files compiled with "-gsplit-dwarf -fdebug-types-section" (at least with clang): $ cat test.cpp #include <vector> int main() { std::vector<int> v; return v.size (); } $ clang++ -O0 test.cpp -g -gdwarf-5 -gsplit-dwarf -fdebug-types-section -o test $ ./gdb -nx -q --data-directory=data-directory ./test -ex "maint expand-symtabs" Reading symbols from ./test... /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:6159: internal-error: setup_type_unit_groups: Assertion `per_cu->is_debug_types' failed. In the main file, we have a skeleton CU with a certain DWO ID: 0x00000000: Compile Unit: ..., unit_type = DW_UT_skeleton, ..., DWO_id = 0x146eaa4daf5deef2, ... In the .dwo file, the first unit is a type unit with a certain type signature: 0x00000000: Type Unit: ..., unit_type = DW_UT_split_type, ..., type_signature = 0xb499dcf29e2928c4, ... and the split compile unit matching the DWO ID from the skeleton from the main file comes later: 0x0000117f: Compile Unit: ..., unit_type = DW_UT_split_compile, ..., DWO_id = 0x146eaa4daf5deef2, ... The problem introduced by the aforementioned commit is that when creating a dwo_unit structure representing the type unit, we use the signature (DWO id) from the skeleton, instead of the signature from the type unit's header. As a result, all dwo_units get created with the same signature (the DWO id) and only the first unit gets inserted in the hash table. When looking up the comp unit by DWO ID later on, we wrongly find the type unit, and try to expand a type unit as a comp unit, hitting the assert. Before that commit, we passed `reader.cu ()` to lookup_dwo_id, which yields a dwarf2_cu built from parsing the type unit's header. This dwarf2_cu contains the comp_unit_header with the correct signature. Fix the code to use `reader.cu ()` again. 
Another thing that enables this bug is the fact that since DWARF 5, type and compile units are all in .debug_info, and therefore read by create_cus_hash_table, so they both end up in dwo_file::cus. Type units should end up in dwo_file::tus, otherwise they won't be found by lookup_dwo_cutu. This bug hasn't given me trouble so far, so I'm not fixing it right now, but it's on my todo list. The problem can be seen with some tests, when using the dwarf5-fission-debug-types board: $ make check TESTS="gdb.cp/expand-sals.exp" RUNTESTFLAGS="--target_board=dwarf5-fission-debug-types CC_FOR_TARGET=clang CXX_FOR_TARGET=clang++" Running /home/simark/src/binutils-gdb/gdb/testsuite/gdb.cp/expand-sals.exp ... FAIL: gdb.cp/expand-sals.exp: gdb_breakpoint: set breakpoint at main (GDB internal error) But this patch also adds a DWARF assembler-based test that triggers the internal error. Note that the new test does not use the build_executable_and_dwo_files proc, because I found that it is subtly broken and doesn't work to put multiple units in a single .dwo file. The debug abbrev offset field in the second unit's header would be 0, when it should have been something else. The problem is that no linking is ever done to generate the .dwo file, so the relocation that would apply for this field is never applied. Instead, I generate two DWARF debug infos separately and link the .dwo file using gdb_compile, it seems to work fine. Change-Id: I96f809c56f703e25f72b8622c32e6bb91de20d6a Approved-By: Tom Tromey <tom@tromey.com>
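The keying problem described in this commit message can be illustrated with a small sketch. It uses `std::unordered_map` and hypothetical names (`unit`, `build_keyed_by_skeleton`, `build_keyed_by_header`) rather than GDB's own data structures; the point is only that inserting every unit under the same key keeps the first one and silently drops the rest.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified unit found in a .dwo file.
struct unit
{
  std::uint64_t offset;
  std::uint64_t own_signature;	// From the unit's own header.
  bool is_type_unit;
};

using unit_table = std::unordered_map<std::uint64_t, const unit *>;

// Buggy keying: every unit is keyed by the skeleton's DWO id.
unit_table
build_keyed_by_skeleton (const std::vector<unit> &units,
			 std::uint64_t skeleton_dwo_id)
{
  unit_table table;

  // emplace does nothing when the key already exists, so only the
  // first unit in the file ends up in the table.
  for (const unit &u : units)
    table.emplace (skeleton_dwo_id, &u);

  return table;
}

// Fixed keying: each unit is keyed by the signature in its own header.
unit_table
build_keyed_by_header (const std::vector<unit> &units)
{
  unit_table table;

  for (const unit &u : units)
    table.emplace (u.own_signature, &u);

  return table;
}
```

When the first unit in the file is a type unit, a lookup by DWO id in the first table returns that type unit, which is the situation that led to the failed assertion.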
9 days | Update copyright dates to include 2025 | Tom Tromey | 62 | -62/+62
This updates the copyright headers to include 2025. I did this by running gdb/copyright.py and then manually modifying a few files as noted by the script. Approved-By: Eli Zaretskii <eliz@gnu.org>
14 days | gdb/dwarf: rename cache -> abbrev_cache | Simon Marchi | 2 | -6/+6
"cache" is just a bit too generic to be clear. Change-Id: I8bf01c5fe84e076af1afd2453b1a115777630271
14 days | Fix parsing .debug_aranges section for signed addresses. | Martin Simmons | 1 | -2/+8
Some architectures, such as MIPS, have signed addresses and this changes read_addrmap_from_aranges to record them as signed when required. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32658 Approved-By: Tom Tromey <tom@tromey.com>
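As a rough illustration of what recording an address as signed means, here is a minimal sketch of sign extension from a narrow address field. `extend_address` and its parameters are hypothetical and are not taken from read_addrmap_from_aranges.

```cpp
#include <cassert>
#include <cstdint>

// Interpret the low ADDR_SIZE bytes of RAW as an address.  When
// SIGNED_ADDR_P is true, sign-extend from the top bit of that field;
// otherwise zero-extend.
std::uint64_t
extend_address (std::uint64_t raw, unsigned addr_size, bool signed_addr_p)
{
  assert (addr_size >= 1 && addr_size <= 8);

  // A full-width field needs no extension.
  if (addr_size == 8)
    return raw;

  const unsigned bits = addr_size * 8;
  const std::uint64_t mask = (std::uint64_t (1) << bits) - 1;
  raw &= mask;

  const std::uint64_t sign_bit = std::uint64_t (1) << (bits - 1);
  if (signed_addr_p && (raw & sign_bit) != 0)
    raw |= ~mask;

  return raw;
}
```

On a target with 4-byte signed addresses, 0x80001000 would become 0xffffffff80001000 under this scheme, while an unsigned target would keep 0x80001000.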
2025-04-02 | Clean up cooked_index::done_reading | Tom Tromey | 4 | -30/+28
The cooked index worker maintains the state for the various state transitions in the scanner. It is held by the cooked_index while scanning is in progress, then deleted once this has completed. I noticed that none of the arguments to cooked_index::done_reading were really needed -- the cooked_index already has access to the worker should it need it. Removing these parameters makes the code a bit simpler and also cleans up some confusing code around the use of the deferred warnings object. Regression tested on x86-64 Fedora 40. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-02 | gdb/dwarf2: remove unused includes | Simon Marchi | 4 | -11/+2
Remove some includes reported as unused by clangd. Change-Id: I841938c3c6254e4f0d154a1e172c4968ff326333
2025-04-01 | Update cooked_index comment | Tom Tromey | 1 | -0/+9
This updates the cooked_index comment with some notes about object lifetimes, in an attempt to make navigating this code a bit simpler. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-01 | Add cooked_index_worker::done_reading | Tom Tromey | 4 | -33/+35
The two readers currently using cooked_index_worker shared some code. This patch factors this out into a new "done_reading" method. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-01 | Remove cooked_index_worker::result_type | Tom Tromey | 5 | -66/+122
cooked_index_worker::result_type is an ad hoc tuple type used for transferring data between phases of the indexer. It's a bit unwieldy and another patch I'm working on would be somewhat nicer without it. This patch removes the type. Now cooked_index_ephemeral objects are transferred instead, which is handy because they already hold the needed state. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-01 | Update comments from moved methods | Tom Tromey | 3 | -14/+14
This updates the "See xyz.h" comments for all the methods that were moved earlier in this series. Perhaps I should have removed them instead. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-01 | Move cooked_index_worker to cooked-index-worker.[ch] | Tom Tromey | 4 | -273/+270
This moves the cooked_index_worker class to cooked-index-worker.[ch]. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-01 | Change includes in cooked-index-worker.h | Tom Tromey | 1 | -4/+3
This changes cooked-index-worker.h to include the new header files. This breaks the circular dependency that would otherwise be introduced in the next patch. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-01 | Move cooked_index_shard to new files | Tom Tromey | 4 | -417/+466
This moves cooked_index_shard to a couple of new files, dwarf2/cooked-index-shard.[ch]. The rationale is the same as the previous patch: cooked-index.h had to be split to enable other cleanups. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-01 | Move cooked_index_entry to new files | Tom Tromey | 4 | -446/+501
This moves cooked_index_entry and some related helper code to a couple of new files, dwarf2/cooked-index-entry.[ch]. The main rationale for this is that in order to finish this series and remove "cooked_index_worker::result_type", I had to split cooked-index.h into multiple parts to avoid circular includes. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-01 | Make language_requires_canonicalization 'static' | Tom Tromey | 2 | -9/+5
language_requires_canonicalization is only called from cooked-index.c, so mark it as static. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-01 | Rename cooked_index_storage | Tom Tromey | 5 | -23/+27
This renames cooked_index_storage to cooked_index_worker_result, making its function more clear. It also updates the class comment. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-01 | Rename cooked-index-storage.[ch] | Tom Tromey | 4 | -13/+13
A discussion with Simon made me realize that cooked_index_storage isn't a very clear name, especially now that it's escaped from read.c. While it does provide some storage (I guess any object does in a sense), it is really a helper for cooked_index_worker -- a temporary object that is destroyed after reading has completed. This patch renames this file. Later patches will rename the class and move cooked_index_worker here, something I think is reasonable given that cooked_index_storage is really something of a helper class for cooked_index_worker. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-26 | gdb: add configure option to disable compile | Guinevere Larsen | 1 | -0/+12
GDB's compile subsystem is deeply tied to GDB's ability to understand DWARF. A future patch will add the option to disable DWARF at configure time, but for that to work, the compile subsystem will need to be entirely disabled as well, so this patch adds that possibility. I also think there is motive for a security-conscious user to disable compile for its own sake. Considering that the code is quite unmaintained, and depends on an equally unmaintained gcc plugin, there is a case to be made that this is an unnecessary increase in the attack surface if a user knows they won't use the subsystem. Additionally, this can make compilation slightly faster and the final binary is around 3Mb smaller. But these are all secondary to the main goal of being able to disable DWARF at configure time. To be able to achieve optional compilation, some of the code that interfaces with compile had to be changed. All parts that directly called compile things have been wrapped by ifdefs checking for compile support. The file compile/compile.c has been set up in a similar way to how python's and guile's main files have been set up, still being compiled but only with a placeholder command. Finally, to avoid several new errors, a new TCL proc was introduced to gdb.exp, allow_compile_tests, which checks if the "compile" command is recognized before the inferior is started and otherwise skips the compile tests. All tests in the gdb.compile subfolder have been updated to use that, and the test gdb.base/filename-completion also uses this. The proc skip_compile_feature_tests was updated to recognize when the subsystem has been disabled at compile time. Reviewed-By: Eli Zaretskii <eliz@gnu.org> Approved-By: Tom Tromey <tom@tromey.com>
2025-03-26 | gdb/dwarf: use reference in cutu_reader::cutu_reader interface | Simon Marchi | 3 | -64/+60
Change some parameters to be references instead of pointers, when the value must not be nullptr. I'd like to do more of this kind of change, but I have to limit the scope of the change, otherwise there's just no end (and some local variables could also be turned into references). So for now, just do it for the cutu_reader constructors. Change-Id: I9442c6043726981d58f9b141f516c590c0a71bcc Approved-By: Tom Tromey <tom@tromey.com>
2025-03-26 | gdb/dwarf: update comment of cutu_reader::cutu_reader (the DWO variant) | Simon Marchi | 1 | -13/+9
The comment on this constructor is really outdated. Update it to better reflect the reality today. I'd eventually like to change this cutu_reader constructor not to use dwarf2_per_cu, because it seems like an abuse of dwarf2_per_cu just to pass 3 values. But for now, just document the existing behavior. Change-Id: Id96db020c361e64d9b0d2f25d51950b206658aa2 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-26 | gdb/dwarf: remove redundant read of dwo_name | Simon Marchi | 1 | -4/+6
lookup_dwo_unit receives the name of the DWO unit to look up, as read from the DW_AT_dwo_name attribute of the skeleton DIE. But then, it doesn't use it: /* Yeah, we look dwo_name up again, but it simplifies the code. */ dwo_name = dwarf2_dwo_name (comp_unit_die, cu); Perhaps this comment made sense at some point, but with the code we have today, I don't understand it. It should be fine to use the name passed as a parameter, which the caller also obtained by calling dwarf2_dwo_name. Change-Id: I84723e12726f77e4202d042428ee0eed9962ceb8 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-25 | gdb/dwarf: use std::equal_range in cooked_index_shard::find | Simon Marchi | 1 | -16/+19
Looking at `cooked_index_shard::find`, I thought that we could make a small optimization: when finding the upper bound, we already know the lower bound. And we know that the upper bound is >= the lower bound. So we could pass `lower` as the first argument of the `std::upper_bound` call to cut the part of the search space that is below `lower`. It then occurred to me that what we do is basically what `std::equal_range` is for, so why not use it. Implementations of `std::equal_range` are likely to do things as efficiently as possible. Unfortunately, because `cooked_index_entry::compare` is sensitive to the order of its parameters, we need to provide two different comparison functions (just like we do now, to the lower_bound and upper_bound calls). But I think that the use of equal_range makes it clear what the intent of the code is. Regression tested using the various DWARF target boards on Debian 12. Change-Id: Idfad812fb9abae1b942d81ad9976aeed7c2cf762 Approved-By: Tom Tromey <tom@tromey.com>
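One way to hand `std::equal_range` two order-sensitive comparisons is a single function object with two overloads, one for each argument order. The sketch below shows the idea on a sorted vector of names with a prefix search; `entry`, `prefix_less` and `find_prefix` are hypothetical and much simpler than cooked_index_entry::compare.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

struct entry
{
  std::string name;
};

// One comparator object with two overloads.  std::equal_range calls
// comp (element, value) while searching for the lower bound and
// comp (value, element) while searching for the upper bound.
struct prefix_less
{
  // True if E sorts strictly before every name starting with PREFIX.
  bool operator() (const entry &e, std::string_view prefix) const
  {
    return std::string_view (e.name).substr (0, prefix.size ()) < prefix;
  }

  // True if every name starting with PREFIX sorts strictly before E.
  bool operator() (std::string_view prefix, const entry &e) const
  {
    return prefix < std::string_view (e.name).substr (0, prefix.size ());
  }
};

using entry_iter = std::vector<entry>::const_iterator;

// Return the range of entries whose name starts with PREFIX.
// ENTRIES must already be sorted by name.
std::pair<entry_iter, entry_iter>
find_prefix (const std::vector<entry> &entries, std::string_view prefix)
{
  return std::equal_range (entries.begin (), entries.end (), prefix,
			   prefix_less ());
}
```

The two overloads play the role of the two lambdas previously passed to the lower_bound and upper_bound calls; the library is then free to share work between the two searches.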
2025-03-25 | gdb/dwarf: remove unnecessary comparison in cooked_index_entry::compare | Simon Marchi | 1 | -5/+2
I believe that the `(mode == MATCH && a == munge ('<'))` part of the condition is unnecessary. Or perhaps I don't understand the algorithm. The use of "munge" above effectively makes it so that the template portion of names is completely ignored for the sake of the comparison. Then, in the condition, this: a == munge ('<') is functionally equivalent to a == '\0' If `a` is indeed '\0', and `b` is also '\0', then we would have taken the earlier branch: if (a == b) return 0; If `b` is not '\0', then we won't take this branch and we'll go into the final comparison: return a < b ? -1 : 1; So, as far as I can see, there is no case where `mode == MATCH`, where we're going to use this special `return 0`. Regression tested using the various DWARF target boards on Debian 12. Change-Id: I5ea0463c1fdbbc1b003de2f0a423fd0073cc9dec Approved-By: Tom Tromey <tom@tromey.com>
2025-03-24 | gdb/dwarf: move CU check up in cutu_reader::read_cutu_die_from_dwo | Simon Marchi | 1 | -5/+8
We have this pattern of check in multiple places: /* Skip dummy compilation units. */ if (m_info_ptr >= begin_info_ptr + this_cu->length () || peek_abbrev_code (abfd, m_info_ptr) == 0) m_dummy_p = true; In all places except one (read_cutu_die_from_dwo), this is done after reading the unit header but before potentially reading the first DIE. The effect is that we consider dummy units that have no DIE at all. Either the "data" portion of the unit (the portion after the header) has a size of zero, or the first abbrev code is 0, i.e. "end of list". According to this old commit I found [1], dummy CUs were used as filler for incremental LTO linking. A comment reads: WARNING: If THIS_CU is a "dummy CU" (used as filler by the incremental linker) then DIE_READER_FUNC will not get called. In read_cutu_die_from_dwo, however, this check is done after having read the first DIE. So at the time of the check, m_info_ptr has already been advanced just past the first DIE. As a result, compilation units with a single DIE are considered (erroneously, IMO) as dummy. In commit aab6de1613df ("gdb/dwarf: fix spurious error when encountering dummy CU") [2], I mentioned a real world case where compilation units with a single top-level DIE were being considered dummy. I believe that those units should not actually have been treated as dummy. A CU with just one DIE may not be very interesting, but I don't see any reason to consider it dummy. Move the dummy check above the read_toplevel_die call, and return early if the CU is dummy. I am 99% convinced that it's not even possible to encounter an empty unit here, and considered turning it into an assert (it did pass the testsuite).
This function is passed a dwo_unit, and functions that create a dwo_unit are: - create_debug_type_hash_table (creates a dwo_unit for each type unit found in a dwo file) - create_cus_hash_table (creates a dwo_unit for each comp unit found in a dwo file) - create_dwo_unit_in_dwp_v1 - create_dwo_unit_in_dwp_v2 - create_dwo_unit_in_dwp_v5 In the first two, there are already dummy checks, so we wouldn't even get to read_cutu_die_from_dwo for such an empty CU. However, in the last three, there are no such checks, we just trust the dwp file's index and create dwo_units out of that. So I guess it would be possible to craft a broken dwp file with a CU that has no DIE. Out of caution, I didn't switch that to an assert, but I also don't really know what would be the mode of failure if that were to happen. Regtested using the various DWARF target boards on Debian 12. [1] https://gitlab.com/gnutools/binutils-gdb/-/commit/dee91e82ae87f379c90fddff8db7c4b54a116609#dd409f60ba6f9c066432dafbda7093ac5eec76d1_3434_3419 [2] https://gitlab.com/gnutools/binutils-gdb/-/commit/aab6de1613df693059a6a2b505cc8f20d479d109 Change-Id: I90e6fa205cb2d23ebebeae6ae7806461596f9ace Approved-By: Tom Tromey <tom@tromey.com>
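The ordering issue in this commit can be sketched as follows. The check looks at the position right after the unit header, so it has to run before anything advances past the first DIE. `unit_view` and `unit_is_dummy` are hypothetical names, and the sketch relies on the fact that an abbrev code of zero is encoded as a single 0x00 byte in ULEB128.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified view of one unit inside a section buffer.
struct unit_view
{
  const std::uint8_t *begin;	// Start of the unit (its header).
  std::size_t header_size;	// Size of the unit header in bytes.
  std::size_t length;		// Total unit size, header included.
};

// Return true if the unit contains no DIE at all: either nothing
// follows the header, or the first abbrev code is 0 ("end of list").
// This must be evaluated before reading the first DIE.
bool
unit_is_dummy (const unit_view &unit)
{
  const std::uint8_t *info_ptr = unit.begin + unit.header_size;

  if (info_ptr >= unit.begin + unit.length)
    return true;

  return *info_ptr == 0;
}
```

If the same test were applied after consuming the first DIE, a unit made of a header plus exactly one DIE would have its pointer at the end of the unit and would be misclassified as dummy, which is the behavior the patch removes.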
2025-03-24 | gdb/dwarf: remove cutu_reader::read_cutu_die_from_dwo abbrev table parameter | Simon Marchi | 2 | -10/+6
This parameter is always used to set cutu_reader::m_dwo_abbrev_table. Remove the parameter, and have read_cutu_die_from_dwo set the field directly. Change-Id: I6c0c7d23591fb2c3d28cdea1befa4e6b379fd0d3 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-21 | Introduce die_info::children and use it | Tom Tromey | 2 | -166/+72
This adds a new die_info::children method. This returns a range that can be used to iterate over a DIE's children. Then this goes through and updates all the relevant loops to use foreach instead. This is a net code reduction. You'll note that in some places the code was checking the tag as well, like: while (child_die && child_die->tag) I believe this can't happen and is just a copy-paste oddity from the old days. Approved-By: Simon Marchi <simon.marchi@efficios.com>
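A children range of this kind can be sketched with a small iterator that follows a `next` pointer. This is only an illustration with a hypothetical `die` type; GDB builds its range on the existing next_iterator helper rather than on the hand-written iterator shown here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, minimal DIE node: children form a singly linked list
// through the "next" pointer.
struct die
{
  int tag = 0;
  die *child = nullptr;	// First child, or nullptr.
  die *next = nullptr;	// Next sibling, or nullptr.

  // Forward iterator that follows "next" until it reaches nullptr.
  struct iterator
  {
    die *cur;

    die *operator* () const { return cur; }
    iterator &operator++ () { cur = cur->next; return *this; }
    bool operator!= (const iterator &other) const
    { return cur != other.cur; }
  };

  struct range
  {
    die *first;

    iterator begin () const { return iterator {first}; }
    iterator end () const { return iterator {nullptr}; }
  };

  // Range over the direct children of this DIE.
  range children () const { return range {child}; }
};

// Collect the tags of PARENT's direct children, in order.
std::vector<int>
child_tags (const die &parent)
{
  std::vector<int> tags;

  for (die *c : parent.children ())
    tags.push_back (c->tag);

  return tags;
}
```

The range-based loop replaces the manual `while (child_die != nullptr)` pattern, so the loop body no longer has to advance the pointer itself.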
2025-03-21 | Rename die_info::sibling to die_info::next | Tom Tromey | 3 | -44/+44
I want to add support for C++ foreach iteration over DIE siblings. I considered writing a custom iterator for this, but it would be largely identical to the already-existing next_iterator. I didn't want to duplicate the code... Then I tried parameterizing next_iterator by having it take an optional pointer-to-member template argument. However, this would involve changes in many places, because currently a next_iterator can be instantiated before the underlying type is complete. So in the end I decided to rename die_info::sibling to die_info::next. This name is slightly worse but (1) IMO it isn't really all that bad, nobody would have blinked if it was called 'next' in the initial patch, and (2) with the change to iteration it is barely used. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-20 | Fix grammar error in dwarf2/attribute.h | Tom Tromey | 1 | -2/+2
A recent patch of mine had a comment with bad grammar; apparently I didn't finish editing it. This patch cleans it up.
2025-03-18 | gdb/dwarf: use gdb::unordered_set for cooked_index_storage::m_reader_hash | Simon Marchi | 2 | -29/+53
Replace an htab with gdb::unordered_set. I think we could also use the dwarf2_per_cu pointer itself as the identity, basically have the functional equivalent of: gdb::unordered_map<dwarf2_per_cu *, cutu_reader_up> But I kept the existing behavior of using dwarf2_per_cu::index as the identity. Change-Id: Ief3df9a71ac26ca7c07a7b79ca0c26c9d031c11d Approved-By: Tom Tromey <tom@tromey.com>
2025-03-18 | gdb/dwarf: remove type_unit_group | Simon Marchi | 2 | -79/+46
The type_unit_group is an indirection between a stmt_list_hash (possible dwo_unit + line table section offset) and a type_unit_group_unshareable that provides no real value. In dwarf2_per_bfd, we maintain a stmt_list_hash -> type_unit_group mapping, and in dwarf2_per_objfile, we maintain a type_unit_group_unshareable mapping. The type_unit_group type is empty and only exists to have an identity and to be a link between the two mappings. This patch changes it so that we have a single stmt_list_hash -> type_unit_group_unshareable mapping. Regression tested on Debian 12 amd64 with a bunch of DWARF target boards. Change-Id: I9c5778ecb18963f353e9dd058e0f8152f7d8930c Approved-By: Tom Tromey <tom@tromey.com>
2025-03-18 | gdb/dwarf: use gdb::unordered_map for dwarf2_per_bfd::{quick_file_names_table,type_unit_groups} | Simon Marchi | 4 | -157/+60
Change these two hash tables to use gdb::unordered_map. I changed these two at the same time because they both use the same key, a stmt_list_hash. Unlike other previous patches that used a gdb::unordered_set, use an unordered_map here because the key isn't found in the element itself (well, it was before, because of how htab works, but it didn't need to be). You'll notice that the type_unit_group structure is empty. That structure isn't really needed. It is removed in the following patch. Regression tested on Debian 12 amd64 with a bunch of DWARF target boards. Change-Id: Iec2289958d0f755cab8198f5b72ecab48358ba11 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-18 | Remove is_nonnegative and as_nonnegative | Tom Tromey | 2 | -39/+20
This removes attribute::is_nonnegative and attribute::as_nonnegative in favor of a call to unsigned_constant. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-18 | Handle DW_END_default | Tom Tromey | 1 | -0/+3
I noticed that gdb doesn't handle DW_END_default. This patch adds support for this. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-18 | Assume DW_AT_alignment is unsigned | Tom Tromey | 1 | -10/+5
This changes get_alignment to assume that DW_AT_alignment refers to an unsigned value. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-18 | Assume DW_AT_decl_line is unsigned | Tom Tromey | 1 | -9/+6
This changes read_decl_line and new_symbol to assume that DW_AT_decl_line should refer to an unsigned value. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-18 | Use form name in complaint in dwarf2_record_block_entry_pc | Tom Tromey | 1 | -2/+2
This changes dwarf2_record_block_entry_pc to issue a complaint using the form name rather than a value. This seems more correct to me. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-18 | Introduce and use attribute::unsigned_constant | Tom Tromey | 5 | -70/+147
This introduces a new 'unsigned_constant' method on attribute. This method can be used to get the value as an unsigned number. Unsigned scalar forms are handled, and signed scalar forms are handled as well provided that the value is non-negative. Several spots in the reader that expect small DWARF-defined constants are updated to use this new method. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680 Approved-By: Simon Marchi <simon.marchi@efficios.com>
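The rule described above (unsigned forms are accepted, signed forms only when the value is non-negative) can be sketched like this. `attr_value` is a hypothetical reduction of an attribute, and the free function below is not the signature of GDB's method.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical, reduced attribute: a raw value plus a flag saying
// whether the form that produced it always carries signed data
// (for example DW_FORM_sdata).
struct attr_value
{
  std::int64_t raw;
  bool strictly_signed;
};

// Return the value as an unsigned number, or nothing if a strictly
// signed form holds a negative value.
std::optional<std::uint64_t>
unsigned_constant (const attr_value &attr)
{
  if (attr.strictly_signed && attr.raw < 0)
    return std::nullopt;

  return static_cast<std::uint64_t> (attr.raw);
}
```

A caller that expects a small DWARF-defined constant can then treat the empty result as malformed input instead of silently using a huge unsigned value.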
2025-03-18 | Rename form_is_signed to form_is_strictly_signed | Tom Tromey | 2 | -6/+9
This renames attribute::form_is_signed to form_is_strictly_signed. I think this more accurately captures what it does: it says whether a form will always use signed data -- not whether a form might use signed data, which DW_FORM_data* do depending on context. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680 Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-18 | gdb/dwarf: set m_top_level_die directly in read_cutu_die_from_dwo | Simon Marchi | 2 | -11/+5
read_cutu_die_from_dwo currently returns the dwo's top-level DIE through a parameter. Following the previous patch, all code paths end up setting m_top_level_die. Simplify this by having read_cutu_die_from_dwo set m_top_level_die directly. I think it's easier to understand, because there's one less indirection to follow. Change-Id: Ib659f1d2e38501a8fe2b5dd0ca2add3ef55e8d60 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-18 | gdb/dwarf: fix spurious error when encountering dummy CU | Simon Marchi | 2 | -32/+15
I built an application with -gsplit-dwarf (i.e. dwo), and some CUs are considered "dummy" by the DWARF reader. That is, the top-level DIE (DW_TAG_compile_unit) does not have any children. Here's the skeleton: 0x0000c0cb: Compile Unit: length = 0x0000001d, format = DWARF32, version = 0x0005, unit_type = DW_UT_skeleton, abbr_offset = 0x529b, addr_size = 0x08, DWO_id = 0x0ed2693dd2a756dc (next unit at 0x0000c0ec) 0x0000c0df: DW_TAG_skeleton_unit DW_AT_stmt_list [DW_FORM_sec_offset] (0x09dee00f) DW_AT_dwo_name [DW_FORM_strp] ("CMakeFiles/lib_crl.dir/crl/dispatch/crl_dispatch_queue.cpp.dwo") DW_AT_comp_dir [DW_FORM_strp] ("/home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/Telegram/lib_crl") DW_AT_GNU_pubnames [DW_FORM_flag_present] (true) And here's the entire debug info in the .dwo file: .debug_info.dwo contents: 0x00000000: Compile Unit: length = 0x0000001a, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_compile, abbr_offset = 0x0000, addr_size = 0x08, DWO_id = 0x0ed2693dd2a756dc (next unit at 0x0000001e) 0x00000014: DW_TAG_compile_unit DW_AT_producer [DW_FORM_strx] ("GNU C++20 14.2.1 20250207 -mno-direct-extern-access -mtune=generic -march=x86-64 -gsplit-dwarf -g3 -gz=none -O2 -std=gnu++20 -fPIC -fno-strict-aliasing") DW_AT_language [DW_FORM_data1] (DW_LANG_C_plus_plus_14) DW_AT_name [DW_FORM_strx] ("/home/simark/src/tdesktop/Telegram/lib_crl/crl/dispatch/crl_dispatch_queue.cpp") DW_AT_comp_dir [DW_FORM_strx] ("/home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/Telegram/lib_crl") When loading the binary in GDB, I see some warnings: $ ./gdb -q -nx --data-directory=data-directory -ex 'maint set dwarf sync on' -ex "file /home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/telegram-desktop" Reading symbols from /home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/telegram-desktop... 
DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc0cb DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc152 DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc194 DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc1b5 (gdb) It turns out that these errors are not really justified. What happens is: - cutu_reader::read_cutu_die_from_dwo returns 0, indicating that the CU is "dummy" - back in cutu_reader::cutu_reader, we omit setting m_top_level_die to the DIE from the dwo file, meaning that m_top_level_die keeps pointing to the DIE from the main file (DW_TAG_skeleton_unit) - later, in cutu_reader::prepare_one_comp_unit, there is a check that m_top_level_die->tag is one of DW_TAG_{compile,partial,type}_unit, which triggers My proposal to fix this is to set m_top_level_die even if the CU is dummy. Even if the top-level DIE does not have any children, I don't see any reason to leave cutu_reader::m_top_level_die in a different state than when the CU is not dummy. While at it, set m_dummy_p directly in read_cutu_die_from_dwo, instead of returning a value and having the caller do it. This is all inside cutu_reader anyway. Change-Id: I483a68a369bb461a8dfa5bf2106ab1d6a0067198 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-18 | gdb/dwarf: remove create_dwo_cu_reader | Simon Marchi | 1 | -37/+23
This function, as can be seen by its comment, is a remnant of past design. Inline its content into create_cus_hash_table. Change-Id: Id900bae2cdce8f33bf01199fb1d366646effc76e Approved-By: Tom Tromey <tom@tromey.com>
2025-03-17 | gdb/dwarf: remove unused cooked_index::cooked_index parameter | Simon Marchi | 4 | -8/+4
Following the previous patch, this parameter is now unused. Remove it. Change-Id: I7e96a3ba61ad9a0d6b64f9129aeeb9a8f3da22a7 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-17 | gdbsupport: add some -Wunused-* warning flags | Simon Marchi | 5 | -13/+2
Add a few -Wunused-* diagnostic flags that look useful. Some are known to gcc, some to clang, some to both. Fix the fallout. -Wunused-const-variable=1 is understood by gcc, but not clang. -Wunused-const-variable would be understood by both, but for gcc at least it would flag the unused const variables in headers. This doesn't make sense to me, because as soon as one source file includes a header but doesn't use a const variable defined in that header, it's an error. With `=1`, gcc only warns about unused const variables in the main source file. It's not a big deal that clang doesn't understand it though: any instance of that problem will be flagged by any gcc build. Change-Id: Ie20d99524b3054693f1ac5b53115bb46c89a5156 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-17 gdb/dwarf: use gdb::unordered_set for seen_names Simon Marchi 1 -26/+38
Direct replacement of an htab with a gdb::unordered_set. Using a large test program, I see a small but consistent performance improvement. The "file" command time goes on average from 7.88 to 7.73 seconds (~2%). To give a rough estimate of the scale of the test program, the 8 seen_names hash tables (one for each worker thread) had between 173846 and 866961 entries. Change-Id: I0157cbd04bb55338bb1fcefd2690aeef52fe3afe Approved-By: Tom Tromey <tom@tromey.com>
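For illustration, a small sketch of the "seen names" pattern with a hash set. It uses std::unordered_set as a stand-in for gdb::unordered_set and assumes plain strings as keys, which may differ from what the real seen_names table stores. The type `seen_names_sketch` and its member `note` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical helper; std::unordered_set stands in for
// gdb::unordered_set, and the string keys are an assumption.
struct seen_names_sketch
{
  // Return true the first time NAME is seen, false for repeats.
  bool note (const std::string &name)
  {
    // insert returns a pair whose second member says whether the
    // element was actually inserted.
    return m_seen.insert (name).second;
  }

  std::size_t size () const
  {
    return m_seen.size ();
  }

private:
  std::unordered_set<std::string> m_seen;
};
```

Calling `note ("main")` twice returns true and then false, so a caller can skip work for names it has already processed without a separate lookup.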
2025-03-14 gdb/dwarf: assume that no dwarf2_cu exist when calling load_full_comp_unit Simon Marchi 1 -13/+12
After staring at the code, I got convinced that it was not possible for load_full_comp_unit to be called while a dwarf2_cu object exists in per_objfile for this_cu. If you follow all callers of load_full_comp_unit, you can see that all calls to load_full_comp_unit (except one, see below) are gated one way or another by the fact that:

    per_objfile->get_cu (per_cu) == nullptr

Some calls are gated by maybe_queue_comp_unit returning true. If it returns true, then necessarily the dwarf2_cu is unset for that per_cu.

The spot that didn't seem to check for whether the dwarf2_cu is already set before calling load_full_comp_unit is dw2_do_instantiate_symtab. It didn't trigger when running the testsuite, but I could imagine a made-up case where the dwarf2_cu would already be set because we looked up a DIE reference to it (follow_die_ref) for whatever reason. Then, something would cause the symtab for that CU to be expanded and dw2_do_instantiate_symtab to be called. I added a check in that function, because it seemed prudent to do so. All other load_cu calls are gated by this check, so it makes this call look just like the others.

Finally, because all call sites that use cutu_reader::release_cu pass nullptr for `existing_cu` (and therefore cutu_reader creates a new dwarf2_cu), we know that cutu_reader::release_cu will always return a non-nullptr value. Add an assert in it and remove checks in load_full_comp_unit and read_signatured_type.

Change-Id: I496be34bd4bf7edfa38d5135cf4bc4ccd960abe2 Approved-By: Tom Tromey <tom@tromey.com>
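The invariant argued above can be sketched as follows. This is a toy model with hypothetical `_sketch` types, an integer standing in for a per_cu, and a map standing in for the per-objfile CU storage; it is not the real GDB code. It shows the gate on "no CU exists yet" and the assert in the release function.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins; an int plays the role of a per_cu key.
struct cu_sketch
{
  int id;
};

struct per_objfile_sketch
{
  // Return the CU for PER_CU, or nullptr if none was loaded yet.
  cu_sketch *get_cu (int per_cu) const
  {
    auto it = m_cus.find (per_cu);
    return it == m_cus.end () ? nullptr : it->second.get ();
  }

  void set_cu (int per_cu, std::unique_ptr<cu_sketch> cu)
  {
    // Callers are expected to have checked this already.
    assert (get_cu (per_cu) == nullptr);
    m_cus.emplace (per_cu, std::move (cu));
  }

private:
  std::unordered_map<int, std::unique_ptr<cu_sketch>> m_cus;
};

struct cu_reader_sketch
{
  // No "existing CU" is ever passed in, so a new one is always created.
  explicit cu_reader_sketch (int per_cu)
    : m_cu (new cu_sketch {per_cu})
  {}

  std::unique_ptr<cu_sketch> release_cu ()
  {
    // Since the reader always creates the CU, this cannot be null.
    assert (m_cu != nullptr);
    return std::move (m_cu);
  }

private:
  std::unique_ptr<cu_sketch> m_cu;
};

// Precondition: no CU exists yet for PER_CU.
void load_full_sketch (per_objfile_sketch &per_objfile, int per_cu)
{
  cu_reader_sketch reader (per_cu);
  per_objfile.set_cu (per_cu, reader.release_cu ());
}

// Models the added gate: only load when no CU is set.
void instantiate_sketch (per_objfile_sketch &per_objfile, int per_cu)
{
  if (per_objfile.get_cu (per_cu) == nullptr)
    load_full_sketch (per_objfile, per_cu);
}
```

Calling `instantiate_sketch` twice for the same key loads the CU once; the second call sees the existing object and leaves it alone, so the assert in `set_cu` never fires.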
2025-03-14 gdb/dwarf: remove existing_cu parameter of load_full_comp_unit Simon Marchi 1 -20/+8
Following the previous patch, all callers now pass the same thing:

    per_objfile->get_cu (this_cu)

Remove that parameter and do the call in the function itself.

Change-Id: Iafd36b058d7b95efae518bb65035c6a03728b018 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14 gdb/dwarf: assume that source_cu->dies is always set in follow_die_offset Simon Marchi 1 -7/+2
After staring at the code for a while, I got convinced that it's not possible for cu->dies to be nullptr in follow_die_offset. It might be a leftover from the psymtab days.

In most cases, we see that the dwarf2_cu passed as `*ref_cu` has been obtained by doing:

    per_objfile->get_cu (per_cu);

The only way for a dwarf2_cu to end up in the per_objfile like this is through load_full_comp_unit or read_signatured_type. Both of these functions call `reader.read_all_dies ()` (which loads the DIEs in memory and assigns dwarf2_cu::dies) before transferring the newly created dwarf2_cu to the per_objfile. So any dwarf2_cu obtained through

    per_objfile->get_cu (per_cu)

... will have its DIEs set.

The only case today I'm aware of of a dwarf2_cu without DIEs is in the cooked indexer. It creates a cutu_reader, but does not call read_all_dies. Instead, it gets the info_ptr from the cutu_reader and reads the DIEs from the section buffer directly, on its own. But this is an entirely different code path that doesn't assign dwarf2_cu objects to per_objfile.

So, remove the code path in follow_die_offset that tests for `source_cu->dies == NULL`. I added an assert at the top of the function to verify that `source_cu->dies` is always non-nullptr, as a way to test my hypothesis. We could probably get rid of it, but I left it there because it doesn't cost much to have it.

Change-Id: I97f269f092128800850aa5e64eda7032c2edec60 Approved-By: Tom Tromey <tom@tromey.com>
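As a rough sketch of the hypothesis-checking assert described above: the lookup function below asserts that the DIEs are loaded instead of carrying a fallback path for the "not loaded" case. The types, the offset-keyed map, and the two sample offsets are all hypothetical and only stand in for the real dwarf2_cu and DIE structures.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-ins for a DIE and a CU holding its DIEs.
struct offset_die_sketch
{
  std::uint64_t offset;
};

struct loaded_cu_sketch
{
  // Root DIE; null until read_all_dies has run.
  const offset_die_sketch *dies = nullptr;
  std::map<std::uint64_t, offset_die_sketch> by_offset;

  // Models loading the DIEs in memory and assigning the root.
  // The two offsets are made-up sample data.
  void read_all_dies ()
  {
    by_offset.emplace (0x0b, offset_die_sketch {0x0b});
    by_offset.emplace (0x2a, offset_die_sketch {0x2a});
    dies = &by_offset.begin ()->second;
  }
};

// Look up the DIE at OFFSET in SOURCE_CU, or return nullptr.
const offset_die_sketch *
follow_die_offset_sketch (const loaded_cu_sketch &source_cu,
                          std::uint64_t offset)
{
  // Every CU reaching this point is expected to have its DIEs
  // loaded, so assert rather than handle the unloaded case.
  assert (source_cu.dies != nullptr);

  auto it = source_cu.by_offset.find (offset);
  return it == source_cu.by_offset.end () ? nullptr : &it->second;
}
```

The design trade-off is the one the message names: the assert costs almost nothing, and it turns a silently wrong assumption into an immediate failure.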