aboutsummaryrefslogtreecommitdiff
path: root/gdb/dwarf2
AgeCommit message (Collapse)AuthorFilesLines
2025-03-14gdb/dwarf: assume that no dwarf2_cu exist when calling load_full_comp_unitSimon Marchi1-13/+12
After staring at the code, I got convinced that it was not possible for load_full_comp_unit to be called while a dwarf2_cu object exists in per_objfile for this_cu. If you follow all callers of load_full_comp_unit, you can see that all calls to load_full_comp_unit (except one, see below) are gated one way or another by the fact that: per_objfile->get_cu (per_cu) == nullptr Some calls are gated by maybe_queue_comp_unit returning true. If it returns true, then necessarily the dwarf2_cu is unset for that per_cu. The spot that didn't seem to check for whether the dwarf2_cu is already set before calling load_full_comp_unit is dw2_do_instantiate_symtab. It didn't trigger when running the testsuite, but I could imagine a made up case where the dwarf2_cu would already be set because we looked up a DIE reference to it (follow_die_ref) for whatever reason. Then, something would cause the symtab for that CU to be expanded and dw2_do_instantiate_symtab to be called. I added a check in that function, because it seemed prudent to do so. All other load_cu calls are gated by this check, so it makes this call look just like the others. Finally, because all call sites that use cutu_reader::release_cu pass nullptr for `existing_cu` (and therefore cutu_reader creates a new dwarf2_cu), we know that cutu_reader::release_cu will always return a non-nullptr value. Add an assert in it and remove checks in load_full_comp_unit and read_signatured_type. Change-Id: I496be34bd4bf7edfa38d5135cf4bc4ccd960abe2 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: remove existing_cu parameter of load_full_comp_unitSimon Marchi1-20/+8
Following the previous patch, all callers now pass the same thing: per_objfile->get_cu (this_cu) Remove that parameter and to the call in the function itself. Change-Id: Iafd36b058d7b95efae518bb65035c6a03728b018 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: assume that source_cu->dies is always set in follow_die_offsetSimon Marchi1-7/+2
After staring at the code for a while, I got convinced that it's not possible for cu->dies to be nullptr in follow_die_offset. It might be a leftover from the psymtab days. In most cases, we see that the dwarf2_cu passedas `*ref_cu` has been obtained by doing: per_objfile->get_cu (per_cu); The only way for a dwarf2_cu to end up in the per_objfile like this is through load_full_comp_unit or read_signatured_type. Both of these functions call `reader.read_all_dies ()` (which loads the DIEs in memory and assigns dwarf2_cu::dies) before transferring the newly created dwarf2_cu to the per_objfile. So any dwarf2_cu obtained through per_objfile->get_cu (per_cu) ... will have its DIEs set. The only case today I'm aware of of a dwarf2_cu without DIEs is in the cooked indexer. It creates a cutu_reader, but does not call read_all_dies. Instead, it gets the info_ptr from the cutu_reader and reads the DIEs from the section buffer directly, on its own. But this is an entirely different code path that doesn't assign dwarf2_cu objects to per_objfile. So, remove the code path in follow_die_offset that tests for `source_cu->dies == NULL`. I added an assert at the top of the function to verify that `source_cu->dies` is always non-nullptr, as a way to test my hypothesis. We could probably get rid of it, but I left it there because it doesn't cost much to have it. Change-Id: I97f269f092128800850aa5e64eda7032c2edec60 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: rename local variables in follow_die_offsetSimon Marchi1-21/+21
Rename some local variables to better make the distinction between the source and target CUs. Change-Id: I8b43fac91b8a6f1ca6fd1972846fd6bf28608fe3 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: remove unnecessary per_objfile parameter in ↵Simon Marchi2-6/+3
cooked_indexer::ensure_cu_exists The per_objfile object can be obtained from the cutu_reader. This is actually how both callers get it in order to pass it as argument. Change-Id: Iac134ded247d841f80ab5ca55dd9055b556410c3 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: remove some _1 suffixesSimon Marchi2-8/+8
These methods don't have (or no longer have) a counterpart without the _1 suffix, so remove the suffix. Change-Id: Ifdfe4fb3b6b09c6bb9e30c27acf9f9ecbcb207f2 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: remove cutu_reader::keep, add cutu_reader::release_cuSimon Marchi2-15/+22
This is a bit subjective, but I often struggle to understand what cutu_reader::keep is meant to do (keep what, where). Perhaps it's just a question of bad naming, but I think it's a bit confusing for cutu_reader to transfer the ownership of the dwarf2_cu to the per_objfile directly. Add the cutu::release_cu method and make the caller of cutu_reader transfer the ownership to the per_objfile object. Right now, it is theoretically possible for release_cu to return nullptr, so I made callers check the return value. A patch later in this series will change release_cu to ensure it always return non-nullptr, so those callers will get simplified. Change-Id: I3103ff894d1654a95c9d69001073c218501c988a Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: change cutu_reader::read_die_and_siblings to ↵Simon Marchi2-35/+17
cutu_reader::read_all_dies After construction of a cutu_reader, only the top-level DIE has been read in memory. If the caller wants to access the full DIE tree, it does: reader.top_level_die ()->child = reader.read_die_and_siblings (reader.top_level_die ()); I don't really like this poking into cutu_reader's data structures from the outside, I would prefer if that work was done by cutu_reader. Rename the read_die_and_siblings method to read_all_dies, and do that work inside cutu_reader. I also moved these operations inside the read_all_dies method: gdb_assert (cu->die_hash.empty ()); cu->die_hash.reserve (cu->header.get_length_without_initial () / 12); ... cu->dies = reader.top_level_die (); The rationale for this is that read_all_dies (and the functions it calls) is responsible for filling the die_hash set. So I think it makes sense for it to do the reserve. It is also cutu_reader's job, currently, to create and fill the fields of dwarf2_cu. So I think it makes sense for it to set cu->dies, after having read the DIEs in memory. Change-Id: I088c2e0b367db7d1f67e8c9e2d5b0d61165292fc Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: access m_info_ptr directly instead of passing info_ptr aroundSimon Marchi2-138/+103
The few methods of cutu_reader that read DIEs into memory generally receive an info_ptr that says where to start reading and return another one (either by return value or parameter) indicating where the caller should continue reading. We can avoid all this passing around by having these methods access m_info_ptr directly. This allows changing some methods that read DIEs to return `die_info *`, instead of returning it by parameter, which just makes the code simpler to read, I think. The only method that meaningfully reads and writes m_info_ptr (except the places that initially set it up) is read_full_die_1. It reads and increments m_info_ptr once to read the abbrev and once again to read each attribute. Other methods use it for logging. The methods cutu_reader::read_attribute and cutu_reader::read_attribute_value do not touch m_info_ptr directly, because they are used in cooked-indexer.c, which appears to read some things in a non-linear fashion, unlike cutu_reader's DIE-reading methods. The cooked indexer calls cutu_reader::info_ptr to get the m_info_ptr value just after the top-level DIE, and then it does its own attribute reading after that. Change-Id: I251f63d13d453a2827b21349760da033171880e2 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: factor out to cutu_reader::skip_one_attribute methodSimon Marchi2-99/+108
I was reading cutu_reader::skip_one_die, and thought that the code to skip one attribute made it quite difficult to read. Factor this code out to a new method, to get it out of the way. As a bonus, it transforms one goto in a recursion call, which is also easier to follow. Unfortunately, I have no idea how to test DW_FORM_indirect, as it doesn't seem to appear anywhere in the testsuite, and I don't think that compilers often emit that. Change-Id: I2257b3e594aafb7c7da52ddd55baa651cefb802f Approved-By: Tom Tromey <tom@tromey.com>
2025-03-14gdb/dwarf: remove pretend_language parameter from load_full_{comp,type}_unitSimon Marchi2-31/+18
I noticed that load_full_comp_unit and load_full_type_unit didn't use their pretend_language parameter. Remove them, and then remove more things that were needed to get the language value to that point, including the dwarf2_queue_item field. Change-Id: Ie8cb21c54ae49da065a1b0a20bf18ccb93961d1a Approved-By: Tom Tromey <tom@tromey.com>
2025-03-13gdb/dwarf: keep going even if reading macro information failsSimon Marchi1-3/+14
On Debian 12, with gcc 12 and ld 2.40, I get some failures when running: $ make check TESTS="gdb.base/style.exp" RUNTESTFLAGS="--target_board=fission" I think I stumble on this bug [1], preventing the test from doing anything that requires expanding the compilation unit: $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.base/style/style Reading symbols from testsuite/outputs/gdb.base/style/style... (gdb) p main DW_FORM_strp pointing outside of .debug_str section [in module /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/style/style] (gdb) The error is thrown here: #0 0x00007ffff693f0a1 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6 #1 0x0000555569ce6852 in throw_it(return_reason, errors, const char *, typedef __va_list_tag __va_list_tag *) (reason=RETURN_ERROR, error=GENERIC_ERROR, fmt=0x555562a9fc40 "%s pointing outside of %s section [in module %s]", ap=0x7fffffff8df0) at /home/smarchi/src/binutils-gdb/gdbsupport/common-exceptions.cc:203 #2 0x0000555569ce690f in throw_verror (error=GENERIC_ERROR, fmt=0x555562a9fc40 "%s pointing outside of %s section [in module %s]", ap=0x7fffffff8df0) at /home/smarchi/src/binutils-gdb/gdbsupport/common-exceptions.cc:211 #3 0x000055556879c0cb in verror (string=0x555562a9fc40 "%s pointing outside of %s section [in module %s]", args=0x7fffffff8df0) at /home/smarchi/src/binutils-gdb/gdb/utils.c:193 #4 0x0000555569cfa88d in error (fmt=0x555562a9fc40 "%s pointing outside of %s section [in module %s]") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:45 #5 0x000055556667dbff in dwarf2_section_info::read_string (this=0x61b000042a08, objfile=0x616000055e80, str_offset=262811, form_name=0x555562886b40 "DW_FORM_strp") at /home/smarchi/src/binutils-gdb/gdb/dwarf2/section.c:211 #6 0x00005555662486b7 in dwarf_decode_macro_bytes (per_objfile=0x616000056180, builder=0x614000006040, abfd=0x6120000f4b40, mac_ptr=0x60300004f5be "", mac_end=0x60300004f5bb "\002\004", current_file=0x62100007ad70, lh=0x60f000028bd0, section=0x61700008ba78, section_is_gnu=1, section_is_dwz=0, offset_size=4, str_section=0x61700008bac8, str_offsets_section=0x61700008baf0, str_offsets_base=std::optional<unsigned long> = {...}, include_hash=..., cu=0x61700008b600) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/macro.c:511 #7 0x000055556624af0e in dwarf_decode_macros (per_objfile=0x616000056180, builder=0x614000006040, section=0x61700008ba78, lh=0x60f000028bd0, offset_size=4, offset=0, str_section=0x61700008bac8, str_offsets_section=0x61700008baf0, str_offsets_base=std::optional<unsigned long> = {...}, section_is_gnu=1, cu=0x61700008b600) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/macro.c:934 #8 0x000055556642cb82 in dwarf_decode_macros (cu=0x61700008b600, offset=0, section_is_gnu=1) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:19435 #9 0x000055556639bd12 in read_file_scope (die=0x6210000885c0, cu=0x61700008b600) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:6366 #10 0x0000555566392d99 in process_die (die=0x6210000885c0, cu=0x61700008b600) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:5310 #11 0x0000555566390d72 in process_full_comp_unit (cu=0x61700008b600, pretend_language=language_minimal) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:5075 The exception is then only caught at the event-loop level (start_event_loop), causing the whole debug info reading process to be aborted. I think it's a little harsh, considering that a lot of things could work even if we failed to read macro information. Catch the exception inside read_file_scope, print the exception, and carry on. We could go even more fine-grained: if reading the string for one macro definition fails, we could continue reading the macro information. Perhaps it's just that one macro definition that is broken. However, I don't need this level of granularity, so I haven't attempted this. Also, my experience is that macro reading fails when the compiler or linker has a bug, in which case pretty much everything is messed up. With this patch, it now looks like: $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.base/style/style Reading symbols from testsuite/outputs/gdb.base/style/style... (gdb) p main While reading section .debug_macro.dwo: DW_FORM_strp pointing outside of .debug_str section [in module /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/style/style] $1 = {int (int, char **)} 0x684 <main> (gdb) In the test I am investigating (gdb.base/style.exp with the fission board), it allows more tests to run: -# of expected passes 107 -# of unexpected failures 17 +# of expected passes 448 +# of unexpected failures 19 Of course, we still see the error about the macro information, and some macro-related tests still fail (those would be kfailed ideally), but many tests that are not macro-dependent now pass. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111409 Change-Id: I0bdb01f153eff23c63c96ce3f41114bb027e5796 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-12gdb/dwarf: use all_units_range in ↵Simon Marchi1-2/+2
dwarf2_base_index_functions::expand_all_symtabs Commit 292041562289 ("gdb/dwarf: use ranged for loop in some spots") broke some tests notably gdb.base/maint.exp with the fission board. $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.base/maint/maint -ex start -ex "maint expand-sym" -batch ... Temporary breakpoint 1, main (argc=1, argv=0x7fffffffdc48, envp=0x7fffffffdc58) at /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.base/break.c:43 43 if (argc == 12345) { /* an unlikely value < 2^16, in case uninited */ /* set breakpoint 6 here */ /usr/include/c++/14.2.1/debug/safe_iterator.h:392: In function: gnu_debug::_Safe_iterator<_Iterator, _Sequence, _Category>& gnu_debug::_Safe_iterator<_Iterator, _Sequence, _Category>::operator++() [with _Iterator = gnu_cxx:: normal_iterator<std::unique_ptr<dwarf2_per_cu, dwarf2_per_cu_deleter>*, std::vector<std::unique_ptr<dwarf2_per_cu, dwarf2_per_cu_deleter>, std::allocator<std::unique_ptr<dwarf2_per_cu, dwarf2_per_cu_deleter> > > >; _Sequence = std::debug::vector<std::unique_ptr<dwarf2_per_cu, dwarf2_per_cu_deleter> >; _Category = std::forward_iterator_tag] Error: attempt to increment a singular iterator. Note that this is caught because I build with -D_GLIBCXX_DEBUG=1. Otherwise, it might crash more randomly, or just not crash at all (but still be buggy). While iterating on the all_units vector, some type units get added there: #0 add_type_unit (per_bfd=0x51b000044b80, section=0x50e0000c2280, sect_off=0, length=74, sig=4367013491293299229) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:2576 #1 0x00005555618a3a40 in lookup_dwo_signatured_type (cu=0x51700009b580, sig=4367013491293299229) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:2664 #2 0x00005555618ee176 in queue_and_load_dwo_tu (dwo_unit=0x521000120e00, cu=0x51700009b580) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:8329 #3 0x00005555618eeafe in queue_and_load_all_dwo_tus (cu=0x51700009b580) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:8366 #4 0x00005555618966a6 in dw2_do_instantiate_symtab (per_cu=0x50f0000043c0, per_objfile=0x516000065a80, skip_partial=true) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:1695 #5 0x00005555618968d4 in dw2_instantiate_symtab (per_cu=0x50f0000043c0, per_objfile=0x516000065a80, skip_partial=true) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:1719 #6 0x000055556189ac3f in dwarf2_base_index_functions::expand_all_symtabs (this=0x502000024390, objfile=0x516000065780) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:1977 This invalidates the iterator in dwarf2_base_index_functions::expand_all_symtabs, which is caught by the libstdc++ debug mode. I'm not entirely sure that it is correct to append type units from dwo files to the all_units vector like this. The dwarf2_find_containing_comp_unit function expects a precise ordering of the elements of the all_units vector, to be able to do a binary search. Appending a type unit at the end at this point certainly doesn't respect that ordering. For now I'd just like to undo the regression. Do that by using all_units_range in the ranged for loop. I will keep in mind to investigate whether this insertion of type units in all_units after the fact really makes sense or not. Change-Id: Iec131e59281cf2dbd12d3f3d163b59018fdc54da
2025-03-12gdb/dwarf: remove unused parameter of create_dwo_cu_readerSimon Marchi1-5/+4
Change-Id: I0c5b7591eab8e6616b653be7c04bc75159427ad6
2025-03-12gdb/dwarf: remove unnecessary bracesSimon Marchi1-9/+8
Change-Id: I3cd6b932d0dfb4cc07b6d48a1dc9ec35e7bfa03e
2025-03-12gdb/dwarf: use ranged for loop in some spotsSimon Marchi2-14/+7
I noticed that these loops could be written to avoid the iteration variable `i`. Change-Id: I8b58eb9913b6ac8505ee45eb8009ef7027236cb9
2025-03-11Use gdb set in dwarf2/aranges.cTom Tromey1-1/+1
This changes dwarf2/aranges.c to use gdb::unordered_set. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-10Add string cache and use it in cooked indexTom Tromey2-13/+6
The cooked index needs to allocate names in some cases -- when canonicalizing or when synthesizing Ada package names. This process currently uses a vector of unique_ptrs to manage the memory. Another series I'm writing adds another spot where this allocation must be done, and examining the result showed that certain names were allocated multiple times. To clean this up, this patch introduces a string cache object and changes the cooked indexer to use it. I considered using bcache here, but bcache doesn't work as nicely with string_view -- because bcache is fundamentally memory-based, a temporary copy of the contents must be made to ensure that bcache can see the trailing \0. Furthermore, writing a custom class lets us avoid another copy when canonicalizing C++ names. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-10Revert past commitsSimon Marchi11-433/+326
I accidentally pushed my work-in-progress branch... revert that. Sorry for the noise :(. The list of commits reverted are: ae2a50a9ae15 attempt to revamp to the CU/TU list e9386435c94f gdb/dwarf: print DWARF CUs/TUs in "maint print objfiles" 6cbd64aa3eb0 gdb/dwarf: add dwarf_source_language_name 32a187da7622 libiberty: move DW_LANG_* definitions to dwarf2.def b3fa38aef59d gdb/dwarf: move index unit vectors to debug names reader and use them 30ba74418982 gdb/dwarf: track comp and type units count bedb4e09f292 gdb/dwarf: remove unnecessary braces b4f18de12c77 gdb/dwarf: use ranged for loop in some pots Change-Id: I80aed2847025f5b15c16c997680783b39858a703
2025-03-10attempt to revamp to the CU/TU listSimon Marchi8-221/+283
Change-Id: I1c8214413583d540c10c9a2322ef2a21f8bb54e7
2025-03-10gdb/dwarf: print DWARF CUs/TUs in "maint print objfiles"Simon Marchi2-7/+55
This was useful to me, to debug some problems. Before printing cooked index entries, print a list of CUs and TUs. The information printed for each is a bit arbitrary, I took a look at the types and printed what seemed relevant. An example of output for a CU: [0] ((dwarf2_per_cu_data *) 0x50f000007840) type: DW_UT_compile offset: 0x0 size: 0x1bff artificial: false GDB lang: c++ DWARF lang: DW_LANG_C_plus_plus And for a TU: [2] ((signatured_type *) 0x511000040000) type: DW_UT_type offset: 0x0 size: 0x94 signature: 0x2e966c0dc94b065b I moved the call to cooked_index_functions::wait before printing the CU/TU list, otherwise trying to call "maint print objfiles" quickly, like this, would lead to an internal error: $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.dwarf2/struct-with-sig/struct-with-sig -ex "maint print objfiles" This is because dwarf2_per_cu_data::m_unit_type was not yet set, when trying to read it. Waiting for the index to be built ensures that it is set, since setting the unit type is done as a side-effect somewhere. Change-Id: Ic810ec3bb4d3f5abb481cf1cee9b2954ff4f0874
2025-03-10gdb/dwarf: add dwarf_source_language_nameSimon Marchi2-0/+16
Add dwarf_source_language_name, to convert a DW_LANG_* constant to string. This will be used in a following patch. Change-Id: I552ebd318e2e770d590de5920edbd0b75075c1b7 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-10gdb/dwarf: move index unit vectors to debug names reader and use themSimon Marchi2-87/+72
Since these vectors contain the CU and TU lists as found in the .debug_names header, it seems like they are meant to be used by the .debug_names reader when handling a DW_IDX_compile_unit or DW_IDX_type_unit attribute. The value of the attribute would translate directly into an index into one of these vectors. However there's something fishy: it looks like these vectors aren't actually used in practice. They are used in the dwarf2_per_bfd::get_index_{c,t}u methods, which in turn aren't used anywhere. The handlers of DW_IDX_compile_unit and DW_IDX_type_unit use the dwarf2_per_bfd::get_cu method, assuming that all compile units are placed before type units in the dwarf2_per_bfd::all_units vector. I see several problems with that: 1. I found out [1] that the dwarf2_per_bfd::all_units didn't always have the CUs before the TUs. So indexing dwarf2_per_bfd::all_units with that assumption will not work. 2. The dwarf2_find_containing_comp_unit function assumes an ordering of units by section offset (among other criteria) in order to do a binary search. Even though it's probably commonly the case, nothing guarantees that the order of CUs and TUs in the .debug_names header (which defines the indices used to refer to them) will be sorted by section offset. It's not possible to make dwarf2_find_containing_comp_unit (assuming it wants to do a binary search by section offset) and the DW_IDX_compile_unit / DW_IDX_type_unit handlers use the same vector. 3. I have not tested this, but in the presence of a dwz supplementary file, the .debug_names reader should probably not put the units from the main and dwz files in the same vectors to look them up by index. Presumably, if both the main and dwz files have a .debug_names index, they have distinct CU / TU lists. So, an CU index of 1 in an index entry in the main file would refer to a different CU than an index of 1 in an index entry in the dwz file. The current code doesn't seem to account for that, it just indexes dwarf2_per_bfd::all_units. Since those vectors are kind of specific to the .debug_names reader, move them there, in the mapped_debug_names_reader struct. Then, update the handlers of DW_IDX_compile_unit and DW_IDX_type_unit to use them. [1] https://inbox.sourceware.org/gdb-patches/87a5ab5i5m.fsf@tromey.com/T/#mbdcfe35f94db33e59500eb0d3d225661cab016a4 Change-Id: I3958d70bb3875268143471da745aa09336ab2500
2025-03-10gdb/dwarf: track comp and type units countSimon Marchi2-8/+11
A subsequent commit will remove the all_comp_units and all_type_units array views, since the all_units vector will no longer be segmented between comp and type units. Some callers still need to know the number of each kind, so track that separately. Change-Id: I6ef184767a96e5be095bbf9142aa850adbb083ac
2025-03-10gdb/dwarf: remove unnecessary bracesSimon Marchi1-9/+8
Change-Id: If0b38b860e79771a16ea914af3e337fca0ee3a7d
2025-03-10gdb/dwarf: use ranged for loop in some potsSimon Marchi2-13/+7
I noticed that these loops could be written to avoid the iteration variable `i`. Change-Id: Ia3717acbbf732f0337870d35ac60fe6400383324
2025-03-10gdb/dwarf: save DWARF version in dwarf2_loclist_baton, remove it from ↵Simon Marchi4-38/+12
dwarf2_per_cu When running: $ make check TESTS="gdb.cp/cpexprs-debug-types.exp" RUNTESTFLAGS="--target_board=fission" I get: (gdb) break -qualified main /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.h:295: internal-error: version: Assertion `m_dwarf_version != 0' failed. The problem is that dwarf2_per_cu objects created in the read_cutu_die_from_dwo code path never have their DWARF version set. A seemingly obvious solution would be to add a call to dwarf2_per_cu::set_version in there (there's a patch in the referenced PR that does that). However, this comment in read_comp_units_from_section is a bit scary: /* Init this asap, to avoid a data race in the set_version in cutu_reader::cutu_reader (which may be run in parallel for the cooked index case). */ this_cu->set_version (cu_header.version); I don't know if a DWO file can be read while the cooked indexer runs, so if it would be a problem here, but I prefer to be safe than sorry. This patch side-steps the problem by deleting the DWARF version from dwarf2_per_cu. The only users of dwarf2_per_cu::version are the loclists callbacks in `loc.c`. Add the DWARF version to dwarf2_loclist_baton and modify those callbacks to get the version from there instead. Initialize that new field in fill_in_loclist_baton. I like this approach because there is no version field that is possibly unset now. I wasn't keen on doing this at first because I thought it would waste space, but the dwarf2_loclist_baton has 7 bytes of padding at the end anyway, so we might as well use that. Cc: Ricky Zhou <ricky@rzhou.org> Cc: Tom de Vries <tdevries@suse.de> Cc: Tom Tromey <tom@tromey.com> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32309 Change-Id: I30d4ede7d67da5d80ff65c6122f5868e1098ec52 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-10Use flags enum for cooked_index_entry::full_nameTom Tromey5-30/+50
I found a small bug coming from a couple of recent patches of mine for cooked_index_entry::full_name. First, commit aab26529b30 (Add "Ada linkage" mode to cooked_index_entry::full_name) added a small hack to optionally compute the Ada linkage name. Then, commit aab2ac34d7f (Avoid excessive CU expansion on failed matches) changed the relevant expand_symtabs_matching implementation to use this feature. However, the feature was used unconditionally, causing a bad side effect: the non-canonical name is now used for all languages, not just Ada. But, for C++ this is wrong. Furthermore, consider the declaration of full_name: const char *full_name (struct obstack *storage, bool for_main = false, bool for_ada_linkage = false, const char *default_sep = nullptr) const; ... and then consider this call in cooked_index::dump: gdb_printf (" qualified: %s\n", entry->full_name (&temp_storage, false, "::")); Oops! The "::" is silently converted to 'true' here. To fix both of these problems, this patch changes full_name to accept a flags enum rather than booleans. This avoids the type-safety problem. Then, full_name is changed to remove the "Ada" flag when the entry is not in fact an Ada symbol. Regression tested on x86-64 Fedora 40. Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-10gdb/dwarf: rename comp_unit_die to top_level_dieSimon Marchi3-42/+42
The name "comp_unit_die" is a bit misleading, because it can also represent a type unit (DW_TAG_type_unit). I think that "top_level_die" is clear. Change-Id: Ibaac99897f0ac7499f0f82caeed3385e1e6ee870 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-10gdb/dwarf: add doc for cutu_reader::is_dummySimon Marchi1-0/+5
Change-Id: Ifb80557187c12822bdea7ad400c32c3dce968a7f Approved-By: Tom Tromey <tom@tromey.com>
2025-03-07gdb/dwarf: call other cutu_reader constructor in ensure_lang and ↵Simon Marchi2-13/+14
dw2_get_file_names PR 32742 shows this failing: $ make check TESTS="gdb.ada/access_to_unbounded_array.exp" RUNTESTFLAGS="--target_board=fission" Running /home/simark/src/binutils-gdb/gdb/testsuite/gdb.ada/access_to_unbounded_array.exp ... FAIL: gdb.ada/access_to_unbounded_array.exp: scenario=all: gdb_breakpoint: set breakpoint at foo.adb:23 (GDB internal error) Or, interactively: $ ./gdb -q -nx --data-directory=data-directory testsuite/outputs/gdb.ada/access_to_unbounded_array/foo-all -ex 'b foo.adb:23' -batch /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:19567: internal-error: set_lang: Assertion `old_value == language_unknown || old_value == language_minimal || old_value == lang' failed. The symptom is that for a given dwarf2_per_cu, the language gets set twice. First, set to `language_ada`, and then, to `language_minimal`. It's unexpected for the language of a CU to get changed like this. The CU at offset 0x0 in the main file looks like: 0x00000000: Compile Unit: length = 0x00000030, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000034) 0x0000000b: DW_TAG_compile_unit DW_AT_low_pc [DW_FORM_addr] (0x000000000000339a) DW_AT_high_pc [DW_FORM_data8] (0x0000000000000432) DW_AT_stmt_list [DW_FORM_sec_offset] (0x00000000) DW_AT_GNU_dwo_name [DW_FORM_strp] ("b~foo.dwo") DW_AT_comp_dir [DW_FORM_strp] ("/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.ada/access_to_unbounded_array") DW_AT_GNU_pubnames [DW_FORM_flag_present] (true) DW_AT_GNU_addr_base [DW_FORM_sec_offset] (0x00000000) DW_AT_GNU_dwo_id [DW_FORM_data8] (0x277aee54e7bd47f7) This refers to the DWO file b~foo.dwo, whose top-level DIE is: .debug_info.dwo contents: 0x00000000: Compile Unit: length = 0x00000b63, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000b67) 0x0000000b: DW_TAG_compile_unit DW_AT_producer [DW_FORM_GNU_str_index] ("GNU Ada 14.2.1 20250207 -fgnat-encodings=minimal -gdwarf-4 -fdebug-types-section -fuse-ld=gold -gnatA -gnatWb -gnatiw -gdwarf-4 -gsplit-dwarf -ggnu-pubnames -gnatws -mtune=generic -march=x86-64") DW_AT_language [DW_FORM_data1] (DW_LANG_Ada95) DW_AT_name [DW_FORM_GNU_str_index] ("/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.ada/access_to_unbounded_array/b~foo.adb") DW_AT_comp_dir [DW_FORM_GNU_str_index] ("/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.ada/access_to_unbounded_array") DW_AT_GNU_dwo_id [DW_FORM_data8] (0xdbeffefab180a2cb) The thing to note is that the language attribute is only present in the DIE in the DWO file, not on the DIE in the main file. The first time the language gets set is here: #0 dwarf2_per_cu::set_lang (this=0x50f0000044b0, lang=language_ada, dw_lang=DW_LANG_Ada95) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:20788 #1 0x0000555561666af6 in cutu_reader::prepare_one_comp_unit (this=0x7ffff10bf2b0, cu=0x51700008e000, pretend_language=language_minimal) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:21029 #2 0x000055556159f740 in cutu_reader::cutu_reader (this=0x7ffff10bf2b0, this_cu=0x50f0000044b0, per_objfile=0x516000066080, abbrev_table=0x510000004640, existing_cu=0x0, skip_partial=false, pretend_language=language_minimal, cache=0x7ffff11b95e0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:3371 #3 0x00005555615a547a in process_psymtab_comp_unit (this_cu=0x50f0000044b0, per_objfile=0x516000066080, storage=0x7ffff11b95e0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:3799 #4 0x00005555615a9292 in cooked_index_worker_debug_info::process_cus (this=0x51700008dc80, task_number=0, first=std::unique_ptr<dwarf2_per_cu> = {...}, end=std::unique_ptr<dwarf2_per_cu> = {...}) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:4122 In this code path (particularly this specific cutu_reader constructir), the work is done to find and read the DWO file. So the language is properly identifier as language_ada, all good so far. The second time the language gets set is: #0 dwarf2_per_cu::set_lang (this=0x50f0000044b0, lang=language_minimal, dw_lang=0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:20788 #1 0x0000555561666af6 in cutu_reader::prepare_one_comp_unit (this=0x7ffff0f42730, cu=0x517000091b80, pretend_language=language_minimal) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:21029 #2 0x00005555615a1822 in cutu_reader::cutu_reader (this=0x7ffff0f42730, this_cu=0x50f0000044b0, per_objfile=0x516000066080, pretend_language=language_minimal, parent_cu=0x0, dwo_file=0x0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:3464 #3 0x000055556158c850 in dw2_get_file_names (this_cu=0x50f0000044b0, per_objfile=0x516000066080) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:1956 #4 0x000055556158f4f5 in dw_expand_symtabs_matching_file_matcher (per_objfile=0x516000066080, file_matcher=...) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:2157 #5 0x00005555616329e2 in cooked_index_functions::expand_symtabs_matching (this=0x50200002ab50, objfile=0x516000065780, file_matcher=..., lookup_name=0x0, symbol_matcher=..., expansion_notify=..., search_flags=..., domain=..., lang_matcher=...) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:15912 #6 0x0000555562ca8a14 in objfile::map_symtabs_matching_filename (this=0x516000065780, name=0x50200002ad90 "break pck.adb", real_path=0x0, callback=...) at /home/smarchi/src/binutils-gdb/gdb/symfile-debug.c:207 #7 0x0000555562d68775 in iterate_over_symtabs (pspace=0x513000005600, name=0x50200002ad90 "break pck.adb", callback=...) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:727 Here, we use the other cutu_reader constructor, the one that does not look up the DWO file for the passed CU. If a DWO file exists for this CU, the caller is expected to pass it as a parameter. That cutu_reader constructor also ends up setting the language of the CU. But because it didn't read the DWO file, it didn't figure out the language is language_ada, so it tries to set the language to the default, language_minimal. A question is: why do we end up trying to set the CU's language is this context. This is completely unrelated to what we're trying to do, that is get the file names from the line table. Setting the language is a side-effect of just constructing a cutu_reader, which we need to look up attributes in dw2_get_file_names_reader. There are probably some cleanups to be done here, to avoid doing useless work like looking up and setting the CU's language when all we need is an object to help reading the DIEs and attributes. But that is future work. The same cutu_reader constructor is used in `dwarf2_per_cu::ensure_lang`. Since this is the version of cutu_reader that does not look up the DWO file, it will conclude that the language is language_minimal and set that as the CU's language. In other words, `dwarf2_per_cu::ensure_lang` will get the language wrong, pretty ironic. Fix this by using the other cutu_reader constructor in those two spots. Pass `per_objfile->get_cu (this_cu)`, as the `existing_cu` parameter. I think this is necessary, because that constructor has an assert to check that if `existing_cu` is nullptr, then there must not be an existing `dwarf2_cu` in the per_objfile. To avoid getting things wrong like this, I think that the second cutu_reader constructor should be reserved for the spots that do pass a non-nullptr dwo_file. The only spot at the moment in create_cus_hash_table, where we read multiple units from the same DWO file. In this context, I guess it makes sense for efficiency to get the dwo_file once and pass it down to cutu_reader. For that constructor, make the parameters non-optional, add "non-nullptr" asserts, and update the code to assume the passed values are not nullptr. What I don't know is if this change is problematic thread-wise, if the functions I have modified to use the other cutu_reader constructor can be called concurrently in worker threads. If so, I think it would be problematic. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32742 Change-Id: I980d16875b9a43ab90e251504714d0d41165c7c8 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-07Avoid excessive CU expansion on failed matchesTom Tromey1-22/+24
PR symtab/31010 points out that something like "ptype INT" will expand all CUs in a typical program. The OP further points out that the original patch for PR symtab/30520: https://sourceware.org/pipermail/gdb-patches/2024-January/205924.html ... did solve the problem, but the patch changed after (my) review and reintroduced the bug. In cooked_index_functions::expand_symtabs_matching, the final component of a split name is compared with the entry's name using the usual method of calling get_symbol_name_matcher. This code iterates over languages and tries to split the original name according to each style. But, the Ada splitter uses the decoded name -- "int". This causes every C or C++ CU to be expanded. Clearly this is wrong. And, it seems to me that looping over languages and trying to guess the splitting style for the input text is probably bad. However, fixing the problem is not so easy (again due to Ada). I've filed a follow-up bug, PR symtab/32733, for this. Meanwhile, this patch changes the code to be closer to the originally-submitted patch. This works because the comparison is now done between the full name and the "lookup_name_without_params" object, which is a less adulterated variant of the original input. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31010 Tested-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-07gdb/dwarf: move cooked_indexer to cooked-indexer.{h,c}Simon Marchi4-825/+912
Move the cooked_indexer class declaration to a new cooked-indexer.h file, and the implementation to cooked-indexer.c. Change-Id: Ibff3b06045b2af65fa9516097acf732d7c2d9414 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-07gdb/dwarf: move cooked_index_storage to cooked-index-storage.{h,c}Simon Marchi4-137/+192
cooked_index_storage is currently declared in `cooked-index.h` and implemented in `read.c`. Move all that to new `cooked-index-storage.{h,c}` files. Change-Id: I2a07eb446d8a07b15c5664dfe01e3a820cdd45be Approved-By: Tom Tromey <tom@tromey.com>
2025-03-07gdb/dwarf: move cutu_reader to read.hSimon Marchi2-146/+148
In order to move some things outside of read.c, cutu_reader needs to be in a header file. Change-Id: Ib26d7949c55867848d109332caf2efb1a6e72923 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-06Add support for hierarchical Ada namesTom Tromey2-33/+134
In the near future, GNAT will start emitting DWARF names in a more standard way -- specifically, the package structure will be indicated by nested DW_TAG_module DIEs and a given entity will be nested in its package and only have a simple name. This patch changes gdb to understand this style of naming, while still supporting the existing GNAT output. A few special cases are needed. I've commented them. The name-computing code for the full DWARF reader is very complicated -- much too complicated, in my opinion. There are already several bugs in bugzilla about this (search for "physname"... but there are others as well), so I haven't filed any new ones. When I started this project, I thought it would solve some memory overuse issues we sometimes see from how the index-sharding code interacts with the GNAT-specific post-pass. However, to my surprise, the Ada code in gdb relies on some details of symbol naming, and so I've had to add code here to synthesize "linkage" names in some cases. This is unfortunate, but I think can eventually be fixed; I will file a bug to track this issue.
2025-03-06Add "Ada linkage" mode to cooked_index_entry::full_nameTom Tromey2-10/+26
Unfortunately, due to some details of how the Ada support in gdb currently works, the DWARF reader will still have to synthesize some "full name" entries after the cooked index has been constructed. You can see one particular finding related to this in: https://sourceware.org/bugzilla/show_bug.cgi?id=32142 This patch adds a new flag to cooked_index_entry::full_name to enable the construction of these names. I hope to redo this part of the Ada support eventually, so that this code can be removed and the full-name entries simply not created.
2025-03-06Store new Ada entries in cooked_index_shard::m_entriesTom Tromey2-8/+21
handle_gnat_encoded_entry might create synthetic cooked index entries for Ada packages. These aren't currently kept in m_entries, but it seems to me that they should be, particularly because a forthcoming GNAT will emit explicit DW_TAG_module for these names -- with this change, the indexes will be roughly equivalent regardless of which compiler was used.
2025-03-06Handle DW_TAG_module for AdaTom Tromey1-8/+19
This updates read_module_type to turn DW_TAG_module into a TYPE_CODE_NAMESPACE when the CU represents Ada code. Note that the GNAT that generates this isn't generally available yet and so this shouldn't have an impact on current code.
2025-03-06Add "synthetic" marker for index entriesTom Tromey3-9/+11
Currently, gdb will synthesize DW_TAG_module entries for Ada names. These entries are treated specially by the index writer, When GNAT starts emitting DW_TAG_module, the special case will be incorrect, because there will be non-synthetic DW_TAG_module entries in the index. This patch arranges to mark the synthetic entries and changes the index writer to follow.
2025-03-06Use DW_TAG_module for AdaTom Tromey3-3/+6
In GCC we decided to use DW_TAG_module to represent Ada packages, so make this same decision in gdb. This also updates tag_matches_domain to handle this case.
2025-03-06Use dwarf2_full_name when computing type namesTom Tromey1-4/+5
This changes a few spots in the DWARF reader to use dwarf2_full_name when computing the name of a type. This gives the correct name when a type is nested in a namespace. This oddity probably wasn't noticed before because some of the types in question are either normally anonymous in C++ (e.g, array type) or do not appear in a namespace (base type).
2025-03-06gdb/dwarf: remove unnecessary `this->` in read.cSimon Marchi1-45/+44
I like using `this->` when it's unclear that the method or field accessed is within the current class, but when accessing a private member prefixed with `m_`, it's unnecessary, as the prefix makes it clear. Remove some instances of it (some coming from the previous patch, other pre-existing) to de-clutter the code a bit. Change-Id: Ia83d0bce51d222fa3ac3d756d50170ec6ed12b94 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-06gdb/dwarf: make all fields of cutu_reader privateSimon Marchi1-272/+279
Make all fields of cutu_reader private, then add getters for whatever needs to be accessed outside of cutu_reader. This should help spot what's used by cutu_reader itself, and what is used by others. Change-Id: I71cb73fffa5d70cc9c7fc68bf74db937e84c2db1 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-06gdb/dwarf: pass dwarf2_cu instead of cutu_reader to two functionsSimon Marchi1-7/+4
These functions don't need to receive a cutu_reader, they only use it to obtain the contained dwarf2_cu, so change them to accept a dwarf2_cu. This helps reduce the creep of cutu_reader a little bit. Change-Id: Iebb3c4697a4aec638b47423b3ac59077d4fa5090 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-06gdb/dwarf: move a bunch of DIE-reading functions to cutu_readerSimon Marchi1-153/+154
With the hope of organizing things better and spotting patterns that could lead to simplification, move all these functions to be methods of cutu_reader. At least, this gives a good picture of what the entry points for DIE and attribute reading are, by looking at what methods are public. Right now, my vague understanding of cutu_reader is that it does 3 things: - it provides means to navigate and read the DIE tree, abstracting things like whether the real content is in a DWO file or not - it builds a dwarf2_cu object, for its own use but also for the use of the caller - it fills in missing details in the passed in dwarf2_per_cu In the future, I'd like to separate those concerns. I think that cutu_reader could retain the first one of those concerns, while the other two could be done by other classes or functions, perhaps using cutu_reader under the hood. Change-Id: I04e0d6c864bbc09c7071ac8e9493e1e54c093d68 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-06gdb/dwarf: add empty lines in cutu_reader::read_cutu_die_from_dwo commentSimon Marchi1-0/+3
I find it much more readable this way, with one idea per paragraph. Change-Id: Ib31b410867c8444e0f3200681881f54f1b8ebea8 Approved-By: Tom Tromey <tom@tromey.com>
2025-03-06gdb/dwarf: make init_cu_die_reader a method of cutu_readerSimon Marchi1-16/+20
init_cu_die_reader is only used inside cutu_reader, to initialize fields of cutu_reader, so make it a private method. Change-Id: Iaa80d4dbb8d0fa35bcac18ee70e147276874cc1b Approved-By: Tom Tromey <tom@tromey.com>
2025-03-06gdb/dwarf: make read_cutu_die_from_dwo a method of cutu_readerSimon Marchi1-20/+25
read_cutu_die_from_dwo is only used as a helper to cutu_reader, so make it a private method of cutu_reader. Remove the "result_reader" parameter, because it's always "this". Change-Id: I7df6162137451c160f0e6bf3539569fcb2421eff Approved-By: Tom Tromey <tom@tromey.com>
2025-03-06Use "::" as separator for Fortran in cooked indexTom Tromey1-0/+1
This teaches cooked_index_entry::full_name that "::" is the separator for Fortran. I don't know enough Fortran to write a test case for this. However, a different series I am working on has a regression if this patch is not applied. Approved-By: Simon Marchi <simon.marchi@efficios.com>