aboutsummaryrefslogtreecommitdiff
path: root/gdb/dwarf2/read.h
AgeCommit message (Collapse)AuthorFilesLines
7 daysRename expand_symtabs_matchingTom Tromey1-5/+5
After this series, expand_symtabs_matching is now misnamed. This patch renames it, renames some associated types, and also fixes up some comments that I previously missed. Acked-By: Simon Marchi <simon.marchi@efficios.com>
7 daysMake dw_expand_symtabs_matching_file_matcher staticTom Tromey1-8/+0
dw_expand_symtabs_matching_file_matcher is no longer needed outside of read.c, so it can be made static. Acked-By: Simon Marchi <simon.marchi@efficios.com>
7 daysRemove dwarf2_per_cu_data::markTom Tromey1-12/+34
This removes dwarf2_per_cu_data::mark, replacing it with a locally-allocated boolean vector. It also inverts the sense of the flag -- now, the flag is true when a CU should be skipped, and false when the CU should be further examined. Also, the validity of the flag is no longer dependent on 'file_matcher != NULL'. This patch makes the subsequent patch to searching a bit simpler, so I've separated it out. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16994 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16998 Acked-By: Simon Marchi <simon.marchi@efficios.com>
7 daysAda import functions not in indexTom Tromey1-0/+6
The cooked index does not currently contain entries for Ada import functions. This means that whether or not these are visible to "break" depends on which CUs were previously expanded -- clearly a bug. This patch fixes the issue. I think the comments in the patch explain the fix reasonably well. Perhaps one to-do item here is to change GNAT to use DW_TAG_imported_declaration for these imports. This may eventually let us remove some of the current hacks. This version includes a fix from Simon to initialize the new member. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32511 Acked-By: Simon Marchi <simon.marchi@efficios.com>
9 daysMove lnp_state_machine to new fileTom Tromey1-0/+31
This patch moves lnp_state_machine and some supporting code to a new file, dwarf2/line-program.c. The main benefit of this is shrinking dwarf2/read.c a bit. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-14gdb/dwarf: clear per_bfd::num_{comp,type}_units on errorSimon Marchi1-0/+30
Commit bedd6a7a44 ("gdb/dwarf: track compilation and type unit count") causes this internal error: $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.dwarf2/debug-names-duplicate-cu/debug-names-duplicate-cu -ex "save gdb-index -dwarf-5 /tmp" -batch warning: Section .debug_names has incorrect number of CUs in CU table, ignoring .debug_names. /home/smarchi/src/binutils-gdb/gdb/dwarf2/index-write.c:1454: internal-error: write_debug_names: Assertion `comp_unit_counter == per_bfd->num_comp_units' failed. This is visible when running this test: $ make check TESTS="gdb.dwarf2/debug-names-duplicate-cu.exp" RUNTESTFLAGS="--target_board=cc-with-debug-names" ... Running /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.dwarf2/debug-names-duplicate-cu.exp ... gdb compile failed, warning: Section .debug_names has incorrect number of CUs in CU table, ignoring .debug_names. /home/smarchi/src/binutils-gdb/gdb/dwarf2/index-write.c:1454: internal-error: write_debug_names: Assertion `comp_unit_counter == per_bfd->num_comp_units' failed. ... === gdb Summary === # of untested testcases 1 However, it's easy to miss because it only causes an "UNTESTED" to be recorded, not a FAIL or UNRESOLVED. This is because the problem happens while trying to create the .debug_names index, as part of the test case compilation. The problem is: when we bail out from using .debug_names because we detect it is inconsistent with the units in .debug_info, we clear per_bfd->all_units, to destroy all units previously created, before proceeding to read the units with an index. However, we don't clear per_bfd->num_{comp,type}_units. As a result, per_bfd->all_units contains one unit, while per_bfd->num_comp_units is 2. Whenever we clear per_bfd->all_units, we should also clear per_bfd->num_{comp,type}_units. While at it, move this logic inside a scoped object. I added an assertion in finalize_all_units to verify that the size of per_bfd->all_units equals per_bfd->num_comp_units + per_bfd->num_type_units. This makes the problem (if omitting the fix) visible when running gdb.dwarf2/debug-names-duplicate-cu.exp with the unix (default) target board: ERROR: Couldn't load debug-names-duplicate-cu into GDB (GDB internal error). FAIL: gdb.dwarf2/debug-names-duplicate-cu.exp: find index type (GDB internal error) FAIL: gdb.dwarf2/debug-names-duplicate-cu.exp: find index type, check type is valid === gdb Summary === # of expected passes 1 # of unexpected failures 2 # of unresolved testcases 1 I considered changing the code to build a local vector of units first, then move it in per_bfd->all_units on success, that would avoid having to clean it up on error. I did not do it because it's a much larger change, but we could consider it. Change-Id: I49bcc0cb4b34aba3d882b27c8a93c168e8875c08 Approved-By: Tom Tromey <tom@tromey.com>
2025-08-01gdb/dwarf: sort dwarf2_per_bfd::all_units by (section, offset)Simon Marchi1-8/+21
This patch started as a fix for PR 29518 ("GDB doesn't handle DW_FORM_ref_addr DIE references correctly with .debug_types sections") [1], but the scope has expanded a bit to fix the problem more generally, after I spotted a few issues related to the order of all_units. The first version of this patch is here [2]. PR 29518 shows that dwarf2_find_containing_comp_unit can erroneously find a type unit. The obvious problem is that the dwarf2_find_containing_comp_unit function searches the whole all_units vector (containing both comp and type units), when really it should just search the compilation units. A simple solution would be to make it search the all_comp_units view (which is removed in a patch earlier in this series). I then realized that in DWARF 5, since type units are in .debug_info (versus .debug_types in DWARF 4), type units can be interleaved with comp type in the all_units vector. That would make the all_comp_units and all_type_units views erroneous, and dwarf2_find_containing_comp_unit could still return something wrong. In v1, I added a sort in finalize_all_units to make sure all_units is in the order that dwarf2_find_containing_comp_unit expects: - comp units from the main file - type units from the main file - comp units from the dwz file - type units from the dwz file (not actually supported, see PR 30838) Another problem I spotted is that the .gdb_index reader creates units in this order: - comp units from .gdb_index from main file - comp units from .gdb_index from dwz file - type units from .gdb_index from main file This isn't the same order as above, so it would need the same sort step. Finally, I'm not exactly sure if and when it happens, but it looks like lookup_signatured_type can be called at a later time (after the initial scan and creation of dwarf2_per_cu object creation), when expanding a symtab. And that could lead to the creation of a new type unit (see function add_type_unit), which would place the new type unit at the end of the all_units vector, possibly screwing up the previous order. To handle all this in a nice and generic way, Tom Tromey proposed to change the all_units order, so that units are sorted by section, then section offset. This is what this patch implements. The sorting is done in finalize_all_units. This works well, because when looking up a unit by section offset, the caller knows which section the unit is in. Passing down a (section, section offset) tuple makes it clear and unambiguous what unit the caller is referring to. It should help eliminate some bugs where the callee used the section offset in the wrong section. Passing down the section along with the section offset replaces the "is_dwz" flag passed to dwarf2_find_containing_comp_unit and a bunch of other functions in a more general way. dwarf2_find_containing_comp_unit can now legitimately find and return type units even though it should be needed (type units are typically referred to by signature). But I don't think there is harm for this function to be more generic than needed. I therefore I renamed it to dwarf2_find_containing_unit. The sort criterion for "section" can be anything, as long as we use the same for sorting and searching. In this patch, I use the pointer to dwarf2_section_info, because it's easy. The downside is that the actual order depends on what the memory allocator decided to return, so could change from run to run, or machine to machine. Later, I might change it so that sections are ordered based on their properties, making the order stable across the board. This logic is encapsulated in the all_units_less_than function, so it's easy to change. The .debug_names reader can no longer rely on the order of the all_units vector for its checks, since all_units won't be the same order as found in the .debug_names lists. In fact, even before, it wasn't: this check assumed that .debug_info had all CUs before TUs, and that the index listed them in the exact same order. When I build a file with gcc and "-gdwarf-5 -fdebug-types-section", type units appear first in .debug_info. This caused GDB to reject a .debug_names index that is had produced: $ GDB="./gdb -nx -q --data-directory=data-directory" /home/smarchi/src/binutils-gdb/gdb/contrib/gdb-add-index.sh -dwarf-5 hello.so $ ./gdb -nx -q --data-directory=data-directory hello.so Reading symbols from hello.so... ⚠️ warning: Section .debug_names has incorrect entry in CU table, ignoring .debug_names. To make it work, add a new dwarf2_find_unit function that allows looking up a unit by start address (unlike dwarf2_find_containing_unit, which can find by any containing address), and make the .debug_names reader use it. It might make the load time of .debug_names a bit longer (the build and check step is now going to be O(n*log(n)) instead of O(n) where n is the number of units, or something like that), but I think it's important to be correct here. This patch adds a test (gdb.dwarf2/dw-form-ref-addr-with-type-units.exp), which tries to replicate the problem as shown by PR 29518. gdb.base/varval.exp needs a small change, because an error message changes (for the better, I think) gdb.dwarf2/debug-names-non-ascending-cu.exp now fails, because GDB no longer rejects a .debug_names index which lists CUs in a different order than .debug_info. Given the change I did to the .debug_names reader, explained above, I don't think this is a problem anymore (GDB can accept an index like that). I also don't think that DWARF 5 mandates that CUs are in ascending order. Delete this test. [1] https://sourceware.org/bugzilla/show_bug.cgi?id=29518 [2] https://inbox.sourceware.org/gdb-patches/20250218193443.118139-1-simon.marchi@efficios.com/ Change-Id: I45f982d824d3842ac1eb73f8cce721a0a24b5faa Approved-By: Tom Tromey <tom@tromey.com>
2025-08-01gdb/dwarf: move index unit vectors to .debug_names reader and use themSimon Marchi1-17/+0
The all_comp_units_index_cus and all_comp_units_index_tus vectors contain the CU and TU lists as found in the .debug_names list. It seems like they are meant to be used by the .debug_names reader when handling a DW_IDX_compile_unit or DW_IDX_type_unit attribute. The value of the attribute would translate directly into an index into one of these vectors. However, it looks like these vectors aren't actually used in practice. They are used in the dwarf2_per_bfd::get_index_{c,t}u methods, which in turn aren't used anywhere. The handlers of DW_IDX_compile_unit and DW_IDX_type_unit use the dwarf2_per_bfd::get_unit method, with the assumption that dwarf2_per_bfd::all_units has comp units before type units. This is not the case: the .debug_names reader creates the units in dwarf2_per_bfd::all_units using the create_all_units function, which creates the units in the order found in .debug_info, where type units can be interleaved with comp units. Since those vectors are specific to the .debug_names reader, move them to the mapped_debug_names_reader struct. Then, update the handlers of DW_IDX_compile_unit and DW_IDX_type_unit to actually use them. Change-Id: Ie7db81f4442f634ac6d02280a60c6c671bcd22a5 Approved-By: Tom Tromey <tom@tromey.com>
2025-08-01gdb/dwarf: remove all_{comp,type}_units viewsSimon Marchi1-5/+0
In DWARF 5, type units appear in the .debug_info section, interleaved with comp units, and the order in all_units reflects that. The all_comp_units and all_type_units views are wrong in that case (all_comp_units contains some type units, and vice-versa). It would be possible to manually sort all_units to ensure that type units follow comp units, but this series takes the approach of sorting the units by section and section offset. Remove those views, and replace their uses with num_comp_units and num_type_units. It appears that the views were only used to know the number of each kind. The finalize_all_units function is now empty, but I am keeping it because a subsequent patch adds a call to std::sort in there to sort the all_units vector. Change-Id: I42a65b6f1b6192957b55cea0e2eaff097e13a33b Approved-By: Tom Tromey <tom@tromey.com>
2025-08-01gdb/dwarf: track compilation and type unit countSimon Marchi1-1/+4
A subsequent commit will remove the all_comp_units and all_type_units array views, since it's not possible to assume that that all_units vector is segmented between comp and type units. Some callers still need to know the number of each kind, so track that separately. Change-Id: I712fbdfbf10b333c431b688b881cc0987e74f688 Approved-By: Tom Tromey <tom@tromey.com>
2025-06-17gdb/dwarf: rename get_cu -> get_unitSimon Marchi1-4/+4
This method returns type units too, so "get_unit" is a better name. Change-Id: I6ec9de3f783637a3e206bcaaec96a4e00b4b7d31 Approved-By: Tom Tromey <tom@tromey.com>
2025-05-23gdb: guard <mutex> include with CXX_STD_THREADSimon Marchi1-0/+2
Change-Id: I4335fbfdabe49778fe37b08689eec59be94c424b
2025-05-23gdb: include <mutex> in dwarf2/read.hSimon Marchi1-0/+1
The buildbot pointed out this compilation failure on AlmaLinux, with g++ 8.5.0, which is not seen on more recent systems: CXX gdbtypes.o In file included from ../../binutils-gdb/gdb/gdbtypes.c:39: ../../binutils-gdb/gdb/dwarf2/read.h:639:8: error: ‘mutex’ in namespace ‘std’ does not name a type std::mutex dwo_files_lock; ^~~~~ ../../binutils-gdb/gdb/dwarf2/read.h:639:3: note: ‘std::mutex’ is defined in header ‘<mutex>’; did you forget to ‘#include <mutex>’? ../../binutils-gdb/gdb/dwarf2/read.h:35:1: +#include <mutex> ../../binutils-gdb/gdb/dwarf2/read.h:639:3: std::mutex dwo_files_lock; ^~~ Fix it by including <mutex> in dwarf2/read.h. Change-Id: Iba334a3dad217c86841a5e804d0f386876f5ff2f
2025-05-23gdb/dwarf: split dwo_lock in more granular locksSimon Marchi1-0/+5
The dwo_lock mutex is used to synchronize access to some dwo/dwp-related data structures, such as dwarf2_per_bfd::dwo_files and dwp_file::loaded_{cus,tus}. Right now the scope of the lock is kind of coarse. It is taken in the top-level lookup_dwo_unit function, and held while the thread: - looks up an existing dwo_file in the per-bfd hash table for the given id/signature - if there's no existing dwo_file, attempt to find a .dwo file, open it, build the list of units it contains - if a new dwo_file was created, insert it in the per-bfd hash table - look up the desired unit in the dwo_file And something similar for the dwp code path. This means that two indexing thread can't read in two dwo files simultaneously. This isn't ideal in terms of parallelism. This patch breaks this lock into 3 more fine grained locks: - one lock to access dwarf2_per_bfd::dwo_files - one lock to access dwp_file::loaded_{cus,tus} - one lock in try_open_dwop_file, where we do two operations that aren't thread safe (bfd_check_format and gdb_bfd_record_inclusion) Unfortunately I don't see a clear speedup on my computer with 8 threads. But the change shouldn't hurt, in theory, and hopefully this can be a piece that helps in making GDB scale better on machines with many cores (if we ever bump the max number of worker threads). This patch uses "double-checked locking" to avoid holding the lock(s) for the whole duration of reading in dwo files. The idea is, when looking up a dwo with a given name: - with the lock held, check for an existing dwo_file with that name in dwarf2_per_bfd::dwo_files, if found return it - if not found, drop the lock, load the dwo file and create a dwo_file describing it - with the lock held, attempt to insert the new dwo_file in dwarf2_per_bfd::dwo_files. If an entry exists, it means another thread simultaneously created an equivalent dwo_file, but won the race. Drop the new dwo_file and use the existing one. The new dwo_file is automatically deleted, because it is help by a unique_ptr and the insertion into the unordered_set fails. Note that it shouldn't normally happen for two threads to look up a dwo file with the same name, since different units will point to different dwo files. But it were to happen, we handle it. This way of doing things allows two threads to read in two different dwo files simulatenously, which in theory should help get better parallelism. The same technique is used for dwp_file::loaded_{cus,tus}. I have some local CI jobs that run the fission and fission-dwp boards, and I haven't seen regressions. In addition to the regular testing, I ran a few tests using those boards on a ThreadSanitizer build of GDB. Change-Id: I625c98b0aa97b47d5ee59fe22a137ad0eafc8c25 Reviewed-By: Andrew Burgess <aburgess@redhat.com>
2025-05-12gdb/dwarf: move loops into locate_dw{o,z}_sectionsSimon Marchi1-2/+1
For a subsequent patch, it would be easier if the loop over sections inside locate_dwo_sections (I want to maintain some state for the duration of the loop). Move the for loop in there. And because locate_dwz_sections is very similar, modify that one too, to keep both in sync. Change-Id: I90b3d44184910cc2d86af265bb4b41828a5d2c2e Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29gdb/dwarf: avoid cutu_reader movesSimon Marchi1-2/+0
In process_psymtab_comp_unit and ensure_cu_exists, we create a temporary cutu_reader on the stack, then move it to a heap allocated cutu_reader once we confirmed the unit is not dummy. I think it's unnecessary to create a temporary cutu_reader. The only downside of not doing so is that if it ends up that the CU is dummy, we made an allocation/deallocation for nothing. Dummy CUs are a rare thing, it shouldn't change anything. This allows removing the cutu_reader move constructor. Change-Id: I44742d471c495055ee46db41c0e7bdfbd2d5c0b7 Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29gdb/dwarf: scan .debug_info.dwo just onceSimon Marchi1-10/+3
When building -gsplit-dwarf and -fdebug-types-section in DWARF 5, the resulting .dwo files will typically have a .debug_info.dwo section with multiple type units followed by one compile unit: $ llvm-dwarfdump -F -color a-test.dwo | grep ' Unit' 0x00000000: Type Unit: length = 0x000008a0, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_type, abbr_offset = 0x0000, addr_size = 0x08, name = 'vector<int, std::allocator<int> >', type_signature = 0xb499dcf29e2928c4, type_offset = 0x0023 (next unit at 0x000008a4) 0x000008a4: Type Unit: length = 0x00000099, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_type, abbr_offset = 0x0000, addr_size = 0x08, name = 'allocator<int>', type_signature = 0x496a8791a842701b, type_offset = 0x0023 (next unit at 0x00000941) ... 0x000015c1: Compile Unit: length = 0x00000f58, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_compile, abbr_offset = 0x0000, addr_size = 0x08, DWO_id = 0xe8e359820d1c5803 (next unit at 0x0000251d) In open_and_init_dwo_file, we call create_dwo_cus_hash_table, which scans the section, looking for compile units, then call create_dwo_debug_types_hash_table, which scans the section again, looking for type units. It would make more sense to scan the section just once and handle both compile and type units at the same time. To achieve this, add create_dwo_unit_hash_tables, which knows how to handle both unit kinds in a single scan. It replaces create_dwo_cus_hash_table and create_dwo_debug_type_hash_table. Change open_and_init_dwo_file to call it. Note that I removed the DWARF version check in open_and_init_dwo_file when processing .debug_type.dwo sections: in DWARF 5, the .debug_type.dwo sections will just not exist, so the `dwo_file->sections.types` vector will be empty. Change-Id: I6e51d0ca06c258e0bf0e59927d62ae2df314a162 Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29gdb/dwarf: replace some "compile unit" terminology with "unit"Simon Marchi1-9/+9
In DWARF 5 (and even previous versions, with type units), compile units are just one type of units. In many places, we still use "compile units" when in reality it would be better to talk about "units" (unless we specifically want to talk about compile units). Rename comp-unit-head.{c.h} to unit-head.{c,h}, and do a big pass of renames in it to remove the specific mentions of compile units, where in fact we want to talk about units in general. Change-Id: Ia06c90ccb25756c366f269a12620f2f7c8378adb Approved-By: Tom Tromey <tom@tromey.com>
2025-04-22Handle DWARF 5 separate debug sectionsTom Tromey1-4/+4
DWARF 5 standardized the .gnu_debugaltlink section that dwz emits in multi-file mode. This is handled via some new forms, and a new .debug_sup section. This patch adds support for this to gdb. It is largely straightforward, I think, though one oddity is that I chose not to have this code search the system build-id directories for the supplementary file. My feeling was that, while it makes sense for a distro to unify the build-id concept with the hash stored in the .debug_sup section, there's no intrinsic need to do so. This in turn means that a few tests -- for example those that test the index cache -- will not work in this mode. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32808 Acked-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-20gdb/dwarf: make some more functions methods of cutu_readerSimon Marchi1-0/+33
These are only used by cutu_reader, so make them methods of cutu_reader. This makes it a bit more obvious in which context this code is called. lookup_dwo_unit_in_dwp can't be made a method of cutu_reader, as it is used in another context (lookup_dwp_signatured_type / lookup_signatured_type), which happens during CU expansion. Change-Id: Ic62c3119dd6ec198411768323aaf640ed165f51b Approved-By: Tom Tromey <tom@tromey.com>
2025-04-20gdb/dwarf: look up .dwp file ahead of timeSimon Marchi1-3/+0
get_dwp_file lazily looks for a .dwp file for the given objfile. It is called by indexing workers, when a cutu_reader object looks for a DWO file. It is called with the "dwo_lock" held, meaning that the first worker to get there will do the work, while the others will wait at the lock. I'm trying to reduce the time where this lock is taken and do other refactorings to make it easier to reason about the DWARF reader code. Moving the lookup of the .dwp file ahead, before we start parallelizing work, helps makes things simpler, because we can then assume everywhere else that we have already checked for a .dwp file. Put the call to open_and_init_dwp_file in dwarf2_has_info, right next to where we look up .dwz files. I used the same try-catch pattern as for the .dwz file lookup. Change-Id: I615da85f62a66d752607f0dbe9f0372dfa04b86b Approved-By: Tom Tromey <tom@tromey.com>
2025-04-08Update copyright dates to include 2025Tom Tromey1-1/+1
This updates the copyright headers to include 2025. I did this by running gdb/copyright.py and then manually modifying a few files as noted by the script. Approved-By: Eli Zaretskii <eliz@gnu.org>
2025-04-03gdb/dwarf: rename cache -> abbrev_cacheSimon Marchi1-1/+1
"cache" is just a bit too generic to be clear. Change-Id: I8bf01c5fe84e076af1afd2453b1a115777630271
2025-03-26gdb/dwarf: use reference in cutu_reader::cutu_reader interfaceSimon Marchi1-6/+6
Change some parameters to be references instead of pointers, when the value must not be nullptr. I'd like to do this more of this kind of change, but I have to limit the scope of the change, otherwise there's just no end (and some local variables could also be turned into references). So for now, just do it the cutu_reader constructors. Change-Id: I9442c6043726981d58f9b141f516c590c0a71bcc Approved-By: Tom Tromey <tom@tromey.com>
2025-03-24gdb/dwarf: remove cutu_reader::read_cutu_die_from_dwo abbrev table parameterSimon Marchi1-2/+1
This parameter is always used to set cutu_reader::m_dwo_abbrev_table. Remove the parameter, and have read_cutu_die_from_dwo set the field directly. Change-Id: I6c0c7d23591fb2c3d28cdea1befa4e6b379fd0d3 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-18gdb/dwarf: remove type_unit_groupSimon Marchi1-23/+29
The type_unit_group is an indirection between a stmt_list_hash (possible dwo_unit + line table section offset) and a type_unit_group_unshareable that provides no real value. In dwarf2_per_objfile, we maintain a stmt_list_hash -> type_unit_group mapping, and in dwarf2_per_objfile, we maintain a type_unit_group_unshareable mapping. The type_unit_group type is empty and only exists to have an identity and to be a link between the two mappings. This patch changes it so that we have a single stmt_list_hash -> type_unit_group_unshareable mapping. Regression tested on Debian 12 amd64 with a bunch of DWARF target boards. Change-Id: I9c5778ecb18963f353e9dd058e0f8152f7d8930c Approved-By: Tom Tromey <tom@tromey.com>
2025-03-18gdb/dwarf: use gdb::unordered_map for ↵Simon Marchi1-6/+15
dwarf2_per_bfd::{quick_file_names_table,type_unit_groups} Change these two hash tables to use gdb::unordered_map. I changed these two at the same time because they both use the same key, a stmt_list_hash. Unlike other previous patches that used a gdb::unordered_set, use an unordered_map here because the key isn't found in the element itself (well, it was before, because of how htab works, but it didn't need to be). You'll notice that the type_unit_group structure is empty. That structure isn't really needed. It is removed in the following patch. Regression tested on Debian 12 amd64 with a bunch of DWARF target boards. Change-Id: Iec2289958d0f755cab8198f5b72ecab48358ba11 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-18Introduce and use attribute::unsigned_constantTom Tromey1-1/+1
This introduces a new 'unsigned_constant' method on attribute. This method can be used to get the value as an unsigned number. Unsigned scalar forms are handled, and signed scalar forms are handled as well provided that the value is non-negative. Several spots in the reader that expect small DWARF-defined constants are updated to use this new method. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680 Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-18gdb/dwarf: set m_top_level_die directly in read_cutu_die_from_dwoSimon Marchi1-1/+0
read_cutu_die_from_dwo currently returns the dwo's top-level DIE through a parameter. Following the previous patch, all code paths end up setting m_top_level_die. Simplify this by having read_cutu_die_from_dwo set m_top_level_die directly. I think it's easier to understand, because there's one less indirection to follow. Change-Id: Ib659f1d2e38501a8fe2b5dd0ca2add3ef55e8d60 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-18gdb/dwarf: fix spurious error when encountering dummy CUSimon Marchi1-5/+5
I built an application with -gsplit-dwarf (i.e. dwo), and some CUs are considered "dummy" by the DWARF reader. That is, the top-level DIE (DW_TAG_compile_unit) does not have any children. Here's the skeleton: 0x0000c0cb: Compile Unit: length = 0x0000001d, format = DWARF32, version = 0x0005, unit_type = DW_UT_skeleton, abbr_offset = 0x529b, addr_size = 0x08, DWO_id = 0x0ed2693dd2a756dc (next unit at 0x0000c0ec) 0x0000c0df: DW_TAG_skeleton_unit DW_AT_stmt_list [DW_FORM_sec_offset] (0x09dee00f) DW_AT_dwo_name [DW_FORM_strp] ("CMakeFiles/lib_crl.dir/crl/dispatch/crl_dispatch_queue.cpp.dwo") DW_AT_comp_dir [DW_FORM_strp] ("/home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/Telegram/lib_crl") DW_AT_GNU_pubnames [DW_FORM_flag_present] (true) And here's the entire debug info in the .dwo file: .debug_info.dwo contents: 0x00000000: Compile Unit: length = 0x0000001a, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_compile, abbr_offset = 0x0000, addr_size = 0x08, DWO_id = 0x0ed2693dd2a756dc (next unit at 0x0000001e) 0x00000014: DW_TAG_compile_unit DW_AT_producer [DW_FORM_strx] ("GNU C++20 14.2.1 20250207 -mno-direct-extern-access -mtune=generic -march=x86-64 -gsplit-dwarf -g3 -gz=none -O2 -std=gnu++20 -fPIC -fno-strict-aliasing") DW_AT_language [DW_FORM_data1] (DW_LANG_C_plus_plus_14) DW_AT_name [DW_FORM_strx] ("/home/simark/src/tdesktop/Telegram/lib_crl/crl/dispatch/crl_dispatch_queue.cpp") DW_AT_comp_dir [DW_FORM_strx] ("/home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/Telegram/lib_crl") When loading the binary in GDB, I see some warnings: $ ./gdb -q -nx --data-directory=data-directory -ex 'maint set dwarf sync on' -ex "file /home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/telegram-desktop" Reading symbols from /home/simark/src/tdesktop/build-relwithdebuginfo-split-nogz/telegram-desktop... DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc0cb DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc152 DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc194 DWARF Error: unexpected tag 'DW_TAG_skeleton_unit' at offset 0xc1b5 (gdb) It turns out that these errors are not really justified. What happens is: - cutu_reader::read_cutu_die_from_dwo return 0, indicating that the CU is "dummy" - back in cutu_reader::cutu_reader, we omit setting m_top_level_die to the DIE from the dwo file, meaning that m_top_level_die keeps pointing to the DIE from the main file (DW_TAG_skeleton_unit) - later, in cutu_reader::prepare_one_comp_unit, there is a check that m_top_level_die->tag is one of DW_TAG_{compile,partial,type}_unit, which triggers My proposal to fix this is to set m_top_level_die even if the CU is dummy. Even if the top-level DIE does not have any children, I don't see any reason to leave cutu_reader::m_top_level_die in a different state than when the CU is not dummy. While at it, set m_dummy_p directly in read_cutu_die_from_dwo, instead of returning a value and having the caller do it. This is all inside cutu_reader anyway. Change-Id: I483a68a369bb461a8dfa5bf2106ab1d6a0067198 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-17gdbsupport: add some -Wunused-* warning flagsSimon Marchi1-1/+0
Add a few -Wunused-* diagnostic flags that look useful. Some are known to gcc, some to clang, some to both. Fix the fallouts. -Wunused-const-variable=1 is understood by gcc, but not clang. -Wunused-const-variable would be undertsood by both, but for gcc at least it would flag the unused const variables in headers. This doesn't make sense to me, because as soon as one source file includes a header but doesn't use a const variable defined in that header, it's an error. With `=1`, gcc only warns about unused const variable in the main source file. It's not a big deal that clang doesn't understand it though: any instance of that problem will be flagged by any gcc build. Change-Id: Ie20d99524b3054693f1ac5b53115bb46c89a5156 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: remove some _1 suffixesSimon Marchi1-2/+2
These methods don't have (or no longer have) a counterpart without the _1 suffix, so remove the suffix. Change-Id: Ifdfe4fb3b6b09c6bb9e30c27acf9f9ecbcb207f2 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: remove cutu_reader::keep, add cutu_reader::release_cuSimon Marchi1-3/+2
This is a bit subjective, but I often struggle to understand what cutu_reader::keep is meant to do (keep what, where). Perhaps it's just a question of bad naming, but I think it's a bit confusing for cutu_reader to transfer the ownership of the dwarf2_cu to the per_objfile directly. Add the cutu::release_cu method and make the caller of cutu_reader transfer the ownership to the per_objfile object. Right now, it is theoretically possible for release_cu to return nullptr, so I made callers check the return value. A patch later in this series will change release_cu to ensure it always return non-nullptr, so those callers will get simplified. Change-Id: I3103ff894d1654a95c9d69001073c218501c988a Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: change cutu_reader::read_die_and_siblings to ↵Simon Marchi1-1/+2
cutu_reader::read_all_dies After construction of a cutu_reader, only the top-level DIE has been read in memory. If the caller wants to access the full DIE tree, it does: reader.top_level_die ()->child = reader.read_die_and_siblings (reader.top_level_die ()); I don't really like this poking into cutu_reader's data structures from the outside, I would prefer if that work was done by cutu_reader. Rename the read_die_and_siblings method to read_all_dies, and do that work inside cutu_reader. I also moved these operations inside the read_all_dies method: gdb_assert (cu->die_hash.empty ()); cu->die_hash.reserve (cu->header.get_length_without_initial () / 12); ... cu->dies = reader.top_level_die (); The rationale for this is that read_all_dies (and the functions it calls) is responsible for filling the die_hash set. So I think it makes sense for it to do the reserve. It is also cutu_reader's job, currently, to create and fill the fields of dwarf2_cu. So I think it makes sense for it to set cu->dies, after having read the DIEs in memory. Change-Id: I088c2e0b367db7d1f67e8c9e2d5b0d61165292fc Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: access m_info_ptr directly instead of passing info_ptr aroundSimon Marchi1-14/+7
The few methods of cutu_reader that read DIEs into memory generally receive an info_ptr that says where to start reading and return another one (either by return value or parameter) indicating where the caller should continue reading. We can avoid all this passing around by having these methods access m_info_ptr directly. This allows changing some methods that read DIEs to return `die_info *`, instead of returning it by parameter, which just makes the code simpler to read, I think. The only method that meaningfully reads and writes m_info_ptr (except the places that initially set it up) is read_full_die_1. It reads and increments m_info_ptr once to read the abbrev and once again to read each attribute. Other methods use it for logging. The methods cutu_reader::read_attribute and cutu_reader::read_attribute_value do not touch m_info_ptr directly, because they are used in cooked-indexer.c, which appears to read some things in a non-linear fashion, unlike cutu_reader's DIE-reading methods. The cooked indexer calls cutu_reader::info_ptr to get the m_info_ptr value just after the top-level DIE, and then it does its own attribute reading after that. Change-Id: I251f63d13d453a2827b21349760da033171880e2 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: factor out to cutu_reader::skip_one_attribute methodSimon Marchi1-0/+5
I was reading cutu_reader::skip_one_die, and thought that the code to skip one attribute made it quite difficult to read. Factor this code out to a new method, to get it out of the way. As a bonus, it transforms one goto in a recursion call, which is also easier to follow. Unfortunately, I have no idea how to test DW_FORM_indirect, as it doesn't seem to appear anywhere in the testsuite, and I don't think that compilers often emit that. Change-Id: I2257b3e594aafb7c7da52ddd55baa651cefb802f Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: remove pretend_language parameter from load_full_{comp,type}_unitSimon Marchi1-6/+2
I noticed that load_full_comp_unit and load_full_type_unit didn't use their pretend_language parameter. Remove them, and then remove more things that were needed to get the language value to that point, including the dwarf2_queue_item field. Change-Id: Ie8cb21c54ae49da065a1b0a20bf18ccb93961d1a Approved-By: Tom Tromey <tom@tromey.com>
2025-03-10Revert past commitsSimon Marchi1-19/+36
I accidentally pushed my work-in-progress branch... revert that. Sorry for the noise :(. The list of commits reverted are: ae2a50a9ae15 attempt to revamp to the CU/TU list e9386435c94f gdb/dwarf: print DWARF CUs/TUs in "maint print objfiles" 6cbd64aa3eb0 gdb/dwarf: add dwarf_source_language_name 32a187da7622 libiberty: move DW_LANG_* definitions to dwarf2.def b3fa38aef59d gdb/dwarf: move index unit vectors to debug names reader and use them 30ba74418982 gdb/dwarf: track comp and type units count bedb4e09f292 gdb/dwarf: remove unnecessary braces b4f18de12c77 gdb/dwarf: use ranged for loop in some pots Change-Id: I80aed2847025f5b15c16c997680783b39858a703
2025-03-10attempt to revamp to the CU/TU listSimon Marchi1-19/+17
Change-Id: I1c8214413583d540c10c9a2322ef2a21f8bb54e7
2025-03-10gdb/dwarf: move index unit vectors to debug names reader and use themSimon Marchi1-17/+0
Since these vectors contain the CU and TU lists as found in the .debug_names header, it seems like they are meant to be used by the .debug_names reader when handling a DW_IDX_compile_unit or DW_IDX_type_unit attribute. The value of the attribute would translate directly into an index into one of these vectors. However there's something fishy: it looks like these vectors aren't actually used in practice. They are used in the dwarf2_per_bfd::get_index_{c,t}u methods, which in turn aren't used anywhere. The handlers of DW_IDX_compile_unit and DW_IDX_type_unit use the dwarf2_per_bfd::get_cu method, assuming that all compile units are placed before type units in the dwarf2_per_bfd::all_units vector. I see several problems with that: 1. I found out [1] that the dwarf2_per_bfd::all_units didn't always have the CUs before the TUs. So indexing dwarf2_per_bfd::all_units with that assumption will not work. 2. The dwarf2_find_containing_comp_unit function assumes an ordering of units by section offset (among other criteria) in order to do a binary search. Even though it's probably commonly the case, nothing guarantees that the order of CUs and TUs in the .debug_names header (which defines the indices used to refer to them) will be sorted by section offset. It's not possible to make dwarf2_find_containing_comp_unit (assuming it wants to do a binary search by section offset) and the DW_IDX_compile_unit / DW_IDX_type_unit handlers use the same vector. 3. I have not tested this, but in the presence of a dwz supplementary file, the .debug_names reader should probably not put the units from the main and dwz files in the same vectors to look them up by index. Presumably, if both the main and dwz files have a .debug_names index, they have distinct CU / TU lists. So, an CU index of 1 in an index entry in the main file would refer to a different CU than an index of 1 in an index entry in the dwz file. The current code doesn't seem to account for that, it just indexes dwarf2_per_bfd::all_units. Since those vectors are kind of specific to the .debug_names reader, move them there, in the mapped_debug_names_reader struct. Then, update the handlers of DW_IDX_compile_unit and DW_IDX_type_unit to use them. [1] https://inbox.sourceware.org/gdb-patches/87a5ab5i5m.fsf@tromey.com/T/#mbdcfe35f94db33e59500eb0d3d225661cab016a4 Change-Id: I3958d70bb3875268143471da745aa09336ab2500
2025-03-10gdb/dwarf: track comp and type units countSimon Marchi1-1/+3
A subsequent commit will remove the all_comp_units and all_type_units array views, since the all_units vector will no longer be segmented between comp and type units. Some callers still need to know the number of each kind, so track that separately. Change-Id: I6ef184767a96e5be095bbf9142aa850adbb083ac
2025-03-10gdb/dwarf: save DWARF version in dwarf2_loclist_baton, remove it from ↵Simon Marchi1-21/+0
dwarf2_per_cu When running: $ make check TESTS="gdb.cp/cpexprs-debug-types.exp" RUNTESTFLAGS="--target_board=fission" I get: (gdb) break -qualified main /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.h:295: internal-error: version: Assertion `m_dwarf_version != 0' failed. The problem is that dwarf2_per_cu objects created in the read_cutu_die_from_dwo code path never have their DWARF version set. A seemingly obvious solution would be to add a call to dwarf2_per_cu::set_version in there (there's a patch in the referenced PR that does that). However, this comment in read_comp_units_from_section is a bit scary: /* Init this asap, to avoid a data race in the set_version in cutu_reader::cutu_reader (which may be run in parallel for the cooked index case). */ this_cu->set_version (cu_header.version); I don't know if a DWO file can be read while the cooked indexer runs, so if it would be a problem here, but I prefer to be safe than sorry. This patch side-steps the problem by deleting the DWARF version from dwarf2_per_cu. The only users of dwarf2_per_cu::version are the loclists callbacks in `loc.c`. Add the DWARF version to dwarf2_loclist_baton and modify those callbacks to get the version from there instead. Initialize that new field in fill_in_loclist_baton. I like this approach because there is no version field that is possibly unset now. I wasn't keen on doing this at first because I thought it would waste space, but the dwarf2_loclist_baton has 7 bytes of padding at the end anyway, so we might as well use that. Cc: Ricky Zhou <ricky@rzhou.org> Cc: Tom de Vries <tdevries@suse.de> Cc: Tom Tromey <tom@tromey.com> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32309 Change-Id: I30d4ede7d67da5d80ff65c6122f5868e1098ec52 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-10gdb/dwarf: rename comp_unit_die to top_level_dieSimon Marchi1-2/+2
The name "comp_unit_die" is a bit misleading, because it can also represent a type unit (DW_TAG_type_unit). I think that "top_level_die" is clear. Change-Id: Ibaac99897f0ac7499f0f82caeed3385e1e6ee870 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-10gdb/dwarf: add doc for cutu_reader::is_dummySimon Marchi1-0/+5
Change-Id: Ifb80557187c12822bdea7ad400c32c3dce968a7f Approved-By: Tom Tromey <tom@tromey.com>
2025-03-07gdb/dwarf: call other cutu_reader constructor in ensure_lang and ↵Simon Marchi1-2/+2
dw2_get_file_names PR 32742 shows this failing: $ make check TESTS="gdb.ada/access_to_unbounded_array.exp" RUNTESTFLAGS="--target_board=fission" Running /home/simark/src/binutils-gdb/gdb/testsuite/gdb.ada/access_to_unbounded_array.exp ... FAIL: gdb.ada/access_to_unbounded_array.exp: scenario=all: gdb_breakpoint: set breakpoint at foo.adb:23 (GDB internal error) Or, interactively: $ ./gdb -q -nx --data-directory=data-directory testsuite/outputs/gdb.ada/access_to_unbounded_array/foo-all -ex 'b foo.adb:23' -batch /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:19567: internal-error: set_lang: Assertion `old_value == language_unknown || old_value == language_minimal || old_value == lang' failed. The symptom is that for a given dwarf2_per_cu, the language gets set twice. First, set to `language_ada`, and then, to `language_minimal`. It's unexpected for the language of a CU to get changed like this. The CU at offset 0x0 in the main file looks like: 0x00000000: Compile Unit: length = 0x00000030, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000034) 0x0000000b: DW_TAG_compile_unit DW_AT_low_pc [DW_FORM_addr] (0x000000000000339a) DW_AT_high_pc [DW_FORM_data8] (0x0000000000000432) DW_AT_stmt_list [DW_FORM_sec_offset] (0x00000000) DW_AT_GNU_dwo_name [DW_FORM_strp] ("b~foo.dwo") DW_AT_comp_dir [DW_FORM_strp] ("/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.ada/access_to_unbounded_array") DW_AT_GNU_pubnames [DW_FORM_flag_present] (true) DW_AT_GNU_addr_base [DW_FORM_sec_offset] (0x00000000) DW_AT_GNU_dwo_id [DW_FORM_data8] (0x277aee54e7bd47f7) This refers to the DWO file b~foo.dwo, whose top-level DIE is: .debug_info.dwo contents: 0x00000000: Compile Unit: length = 0x00000b63, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000b67) 0x0000000b: DW_TAG_compile_unit DW_AT_producer [DW_FORM_GNU_str_index] ("GNU Ada 14.2.1 20250207 -fgnat-encodings=minimal -gdwarf-4 -fdebug-types-section -fuse-ld=gold -gnatA -gnatWb -gnatiw -gdwarf-4 -gsplit-dwarf -ggnu-pubnames -gnatws -mtune=generic -march=x86-64") DW_AT_language [DW_FORM_data1] (DW_LANG_Ada95) DW_AT_name [DW_FORM_GNU_str_index] ("/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.ada/access_to_unbounded_array/b~foo.adb") DW_AT_comp_dir [DW_FORM_GNU_str_index] ("/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.ada/access_to_unbounded_array") DW_AT_GNU_dwo_id [DW_FORM_data8] (0xdbeffefab180a2cb) The thing to note is that the language attribute is only present in the DIE in the DWO file, not on the DIE in the main file. The first time the language gets set is here: #0 dwarf2_per_cu::set_lang (this=0x50f0000044b0, lang=language_ada, dw_lang=DW_LANG_Ada95) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:20788 #1 0x0000555561666af6 in cutu_reader::prepare_one_comp_unit (this=0x7ffff10bf2b0, cu=0x51700008e000, pretend_language=language_minimal) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:21029 #2 0x000055556159f740 in cutu_reader::cutu_reader (this=0x7ffff10bf2b0, this_cu=0x50f0000044b0, per_objfile=0x516000066080, abbrev_table=0x510000004640, existing_cu=0x0, skip_partial=false, pretend_language=language_minimal, cache=0x7ffff11b95e0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:3371 #3 0x00005555615a547a in process_psymtab_comp_unit (this_cu=0x50f0000044b0, per_objfile=0x516000066080, storage=0x7ffff11b95e0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:3799 #4 0x00005555615a9292 in cooked_index_worker_debug_info::process_cus (this=0x51700008dc80, task_number=0, first=std::unique_ptr<dwarf2_per_cu> = {...}, end=std::unique_ptr<dwarf2_per_cu> = {...}) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:4122 In this code path (particularly this specific cutu_reader constructir), the work is done to find and read the DWO file. So the language is properly identifier as language_ada, all good so far. The second time the language gets set is: #0 dwarf2_per_cu::set_lang (this=0x50f0000044b0, lang=language_minimal, dw_lang=0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:20788 #1 0x0000555561666af6 in cutu_reader::prepare_one_comp_unit (this=0x7ffff0f42730, cu=0x517000091b80, pretend_language=language_minimal) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:21029 #2 0x00005555615a1822 in cutu_reader::cutu_reader (this=0x7ffff0f42730, this_cu=0x50f0000044b0, per_objfile=0x516000066080, pretend_language=language_minimal, parent_cu=0x0, dwo_file=0x0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:3464 #3 0x000055556158c850 in dw2_get_file_names (this_cu=0x50f0000044b0, per_objfile=0x516000066080) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:1956 #4 0x000055556158f4f5 in dw_expand_symtabs_matching_file_matcher (per_objfile=0x516000066080, file_matcher=...) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:2157 #5 0x00005555616329e2 in cooked_index_functions::expand_symtabs_matching (this=0x50200002ab50, objfile=0x516000065780, file_matcher=..., lookup_name=0x0, symbol_matcher=..., expansion_notify=..., search_flags=..., domain=..., lang_matcher=...) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:15912 #6 0x0000555562ca8a14 in objfile::map_symtabs_matching_filename (this=0x516000065780, name=0x50200002ad90 "break pck.adb", real_path=0x0, callback=...) at /home/smarchi/src/binutils-gdb/gdb/symfile-debug.c:207 #7 0x0000555562d68775 in iterate_over_symtabs (pspace=0x513000005600, name=0x50200002ad90 "break pck.adb", callback=...) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:727 Here, we use the other cutu_reader constructor, the one that does not look up the DWO file for the passed CU. If a DWO file exists for this CU, the caller is expected to pass it as a parameter. That cutu_reader constructor also ends up setting the language of the CU. But because it didn't read the DWO file, it didn't figure out the language is language_ada, so it tries to set the language to the default, language_minimal. A question is: why do we end up trying to set the CU's language is this context. This is completely unrelated to what we're trying to do, that is get the file names from the line table. Setting the language is a side-effect of just constructing a cutu_reader, which we need to look up attributes in dw2_get_file_names_reader. There are probably some cleanups to be done here, to avoid doing useless work like looking up and setting the CU's language when all we need is an object to help reading the DIEs and attributes. But that is future work. The same cutu_reader constructor is used in `dwarf2_per_cu::ensure_lang`. Since this is the version of cutu_reader that does not look up the DWO file, it will conclude that the language is language_minimal and set that as the CU's language. In other words, `dwarf2_per_cu::ensure_lang` will get the language wrong, pretty ironic. Fix this by using the other cutu_reader constructor in those two spots. Pass `per_objfile->get_cu (this_cu)`, as the `existing_cu` parameter. I think this is necessary, because that constructor has an assert to check that if `existing_cu` is nullptr, then there must not be an existing `dwarf2_cu` in the per_objfile. To avoid getting things wrong like this, I think that the second cutu_reader constructor should be reserved for the spots that do pass a non-nullptr dwo_file. The only spot at the moment in create_cus_hash_table, where we read multiple units from the same DWO file. In this context, I guess it makes sense for efficiency to get the dwo_file once and pass it down to cutu_reader. For that constructor, make the parameters non-optional, add "non-nullptr" asserts, and update the code to assume the passed values are not nullptr. What I don't know is if this change is problematic thread-wise, if the functions I have modified to use the other cutu_reader constructor can be called concurrently in worker threads. If so, I think it would be problematic. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32742 Change-Id: I980d16875b9a43ab90e251504714d0d41165c7c8 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-07gdb/dwarf: move cooked_indexer to cooked-indexer.{h,c}Simon Marchi1-0/+87
Move the cooked_indexer class declaration to a new cooked-indexer.h file, and the implementation to cooked-indexer.c. Change-Id: Ibff3b06045b2af65fa9516097acf732d7c2d9414 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-07gdb/dwarf: move cutu_reader to read.hSimon Marchi1-0/+148
In order to move some things outside of read.c, cutu_reader needs to be in a header file. Change-Id: Ib26d7949c55867848d109332caf2efb1a6e72923 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-05gdb/dwarf: store dwo_file_up in dwo_file_setSimon Marchi1-5/+8
Heap-allocated dwo_file objects, stored in dwarf2_per_bfd::dwo_files, are never freed. They are created in one of the create_dwo_unit_in_dwp_* or lookup_dwo_cutu functions. I confirmed this by running: $ make check TESTS="gdb.cp/anon-ns.exp" RUNTESTFLAGS="--target_board=fission-dwp" $ ./gdb -q -nx --data-directory=data-directory testsuite/outputs/gdb.cp/anon-ns/anon-ns -ex "p main" -ex "file" -batch ... and checking the ASan leak report. I also debugged this invocation of GDB, placed a breakpoint on ~dwo_file, and didn't see any hit. Change the dwo_file set to hold dwo_file_up objects. When the dwarf2_per_bfd object gets destroyed, dwo_file objects will automatically get destroyed. With this change, I see the related leaks disappear in the ASan leak report, and my ~dwo_file breakpoint gets hit when debugging GDB. Change-Id: Icb38539c3f9e553f3625c625a00fc63dd6e9f3c5 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-05gdb/dwarf: make dwarf2_per_bfd::dwo_files a gdb::unordered_setSimon Marchi1-3/+42
Change the dwarf2_per_bfd::dwo_files htab to a gdb::unordered_set. No behavior change expected, except maybe the failure case in lookup_dwo_cutu. If open_and_init_dwo_file returns nullptr, the previous code would leave the slot value empty (nullptr). Is this legit? With the new hash table, the only thing we can do really is not attempt to insert the nullptr value. Change-Id: I63992f388b1197e696ded4ea483634e8ae67fce4 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-04gdb/dwarf: pass is_dwz to dwarf2_per_cu constructorSimon Marchi1-5/+7
It is always known at construction time whether a dwarf2_per_cu is built to represent a unit from a dwz file or not, so pass that information through the constructor. Change-Id: I278c1894ed606451aad02e830085190bb724c473 Approved-By: Tom Tromey <tom@tromey.com>