aboutsummaryrefslogtreecommitdiff
path: root/gdb/dwarf2
AgeCommit message (Collapse)AuthorFilesLines
2025-05-23gdb: guard <mutex> include with CXX_STD_THREADSimon Marchi1-0/+2
Change-Id: I4335fbfdabe49778fe37b08689eec59be94c424b
2025-05-23gdb: include <mutex> in dwarf2/read.hSimon Marchi1-0/+1
The buildbot pointed out this compilation failure on AlmaLinux, with g++ 8.5.0, which is not seen on more recent systems: CXX gdbtypes.o In file included from ../../binutils-gdb/gdb/gdbtypes.c:39: ../../binutils-gdb/gdb/dwarf2/read.h:639:8: error: ‘mutex’ in namespace ‘std’ does not name a type std::mutex dwo_files_lock; ^~~~~ ../../binutils-gdb/gdb/dwarf2/read.h:639:3: note: ‘std::mutex’ is defined in header ‘<mutex>’; did you forget to ‘#include <mutex>’? ../../binutils-gdb/gdb/dwarf2/read.h:35:1: +#include <mutex> ../../binutils-gdb/gdb/dwarf2/read.h:639:3: std::mutex dwo_files_lock; ^~~ Fix it by including <mutex> in dwarf2/read.h. Change-Id: Iba334a3dad217c86841a5e804d0f386876f5ff2f
2025-05-23gdb/dwarf: split dwo_lock in more granular locksSimon Marchi2-33/+64
The dwo_lock mutex is used to synchronize access to some dwo/dwp-related data structures, such as dwarf2_per_bfd::dwo_files and dwp_file::loaded_{cus,tus}. Right now the scope of the lock is kind of coarse. It is taken in the top-level lookup_dwo_unit function, and held while the thread: - looks up an existing dwo_file in the per-bfd hash table for the given id/signature - if there's no existing dwo_file, attempt to find a .dwo file, open it, build the list of units it contains - if a new dwo_file was created, insert it in the per-bfd hash table - look up the desired unit in the dwo_file And something similar for the dwp code path. This means that two indexing thread can't read in two dwo files simultaneously. This isn't ideal in terms of parallelism. This patch breaks this lock into 3 more fine grained locks: - one lock to access dwarf2_per_bfd::dwo_files - one lock to access dwp_file::loaded_{cus,tus} - one lock in try_open_dwop_file, where we do two operations that aren't thread safe (bfd_check_format and gdb_bfd_record_inclusion) Unfortunately I don't see a clear speedup on my computer with 8 threads. But the change shouldn't hurt, in theory, and hopefully this can be a piece that helps in making GDB scale better on machines with many cores (if we ever bump the max number of worker threads). This patch uses "double-checked locking" to avoid holding the lock(s) for the whole duration of reading in dwo files. The idea is, when looking up a dwo with a given name: - with the lock held, check for an existing dwo_file with that name in dwarf2_per_bfd::dwo_files, if found return it - if not found, drop the lock, load the dwo file and create a dwo_file describing it - with the lock held, attempt to insert the new dwo_file in dwarf2_per_bfd::dwo_files. If an entry exists, it means another thread simultaneously created an equivalent dwo_file, but won the race. Drop the new dwo_file and use the existing one. The new dwo_file is automatically deleted, because it is help by a unique_ptr and the insertion into the unordered_set fails. Note that it shouldn't normally happen for two threads to look up a dwo file with the same name, since different units will point to different dwo files. But it were to happen, we handle it. This way of doing things allows two threads to read in two different dwo files simulatenously, which in theory should help get better parallelism. The same technique is used for dwp_file::loaded_{cus,tus}. I have some local CI jobs that run the fission and fission-dwp boards, and I haven't seen regressions. In addition to the regular testing, I ran a few tests using those boards on a ThreadSanitizer build of GDB. Change-Id: I625c98b0aa97b47d5ee59fe22a137ad0eafc8c25 Reviewed-By: Andrew Burgess <aburgess@redhat.com>
2025-05-23gdb/dwarf: allocate DWP dwarf2_section_info with newSimon Marchi2-2/+9
For the same reason as explained in the previous patch (allocations on obstacks aren't thread-safe), change the allocation of dwarf2_section_info object for dwo files within dwp files to use "new". The dwo_file::section object is not always owned by the dwo_file, so introduce a new "dwo_file::section_holder" object that is only set when the dwo_file owns the dwarf2_section_info. Change-Id: I74c4608573c7a435bf3dadb83f96a805d21798a2 Approved-By: Tom Tromey <tom@tromey.com>
2025-05-23gdb/dwarf: allocate dwo_unit with newSimon Marchi1-29/+31
The following patch reduces the duration where the dwo_lock mutex is taken. One operation that is not thread safe is the allocation on dwo_units on the per_bfd obstack: dwo_unit *dwo_unit = OBSTACK_ZALLOC (&per_bfd->obstack, struct dwo_unit); We could take the lock around this allocation, but I think it's just easier to avoid the problem by having the dwo_unit objects allocated with "new". Change-Id: Ida04f905cb7941a8826e6078ed25dbcf57674090 Approved-By: Tom Tromey <tom@tromey.com>
2025-05-22Ensure cooked_index_entry self-tests are runTom Tromey1-1/+2
While looking at code coverage for gdb, I noticed that the cooked_index_entry self-tests were not run. I tracked this down to a formatting error in cooked-index-entry.c. I suspect it might be better to use a macro to define these initialization functions. That would probably remove the possibility for this kind of error.
2025-05-16Update comment for find_field_create_batonTom Tromey1-7/+3
Andrew pointed out that a recent commit neglected to update the comment for find_field_create_baton. This patch fixes the oversight.
2025-05-15Fix regression with dynamic array boundsTom Tromey4-81/+36
Kévin discovered that commit ba005d32b0f ("Handle dynamic field properties") regressed a test in the internal AdaCore test suite. The problem here is that, when writing that patch, I did not consider the case where an array type's bounds might come from a member of a structure -- but where the array is not defined in the structure's scope. In this scenario the field-resolution logic would trip this condition: /* Defensive programming in case we see unusual DWARF. */ if (fi == nullptr) return nullptr; This patch reworks this area, partly backing out that commit, and fixes the problem. In the new code, I chose to simply duplicate the field's location information. This isn't totally ideal, in that it might result in multiple copies of a baton. However, this seemed nicer than tracking the DIE/field correspondence for every field in every CU -- my thinking here is that this particular dynamic scenario is relatively rare overall. Also, if the baton cost does prove onerous, we could intern the batons somewhere. Regression tested on x86-64 Fedora 41. I also tested this using the AdaCore internal test suite. Tested-By: Simon Marchi <simon.marchi@efficios.com>
2025-05-15gdb: rename ldirname to gdb_ldirnameAndreas Schwab2-4/+4
It conflicts with the ldirname function that will be added in the next libiberty sync.
2025-05-12gdb/dwarf: skip broken .debug_macro.dwoSimon Marchi1-1/+19
Running gdb.base/errno.exp with gcc <= 13 with split DWARF results in: $ make check TESTS="gdb.base/errno.exp" RUNTESTFLAGS="CC_FOR_TARGET=gcc-13 --target_board=fission" (gdb) break -qualified main /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:7549: internal-error: locate_dwo_sections: Assertion `!dw_sect->readin' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. ... FAIL: gdb.base/errno.exp: macros: gdb_breakpoint: set breakpoint at main (GDB internal error) The assert being hit has been added in 28f15782adab ("gdb/dwarf: read multiple .debug_info.dwo sections"), but it merely exposed an existing problem. gcc versions <= 13 are affected by this bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111409 Basically, it produces .dwo files with multiple .debug_macro.dwo sections, with some unresolved links between them. I think that this macro debug info is unusable, and all we can do is ignore it. In locate_dwo_sections, if we detect a second .debug_macro.dwo section, forget about the previous .debug_macro.dwo and any subsequent one. This will effectively make it as if the macro debug info wasn't there at all. The errno test seems happy with it: # of expected passes 84 # of expected failures 8 Change-Id: I6489b4713954669bf69f6e91865063ddcd1ac2c8 Approved-By: Tom Tromey <tom@tromey.com>
2025-05-12gdb/dwarf: move loops into locate_dw{o,z}_sectionsSimon Marchi3-69/+69
For a subsequent patch, it would be easier if the loop over sections inside locate_dwo_sections (I want to maintain some state for the duration of the loop). Move the for loop in there. And because locate_dwz_sections is very similar, modify that one too, to keep both in sync. Change-Id: I90b3d44184910cc2d86af265bb4b41828a5d2c2e Approved-By: Tom Tromey <tom@tromey.com>
2025-05-06Handle field with dynamic bit offsetTom Tromey2-20/+55
I discovered that GCC emitted incorrect DWARF for the test case included in this patch. Eric wrote a fix for GCC, but then he found that gdb crashed on the resulting file. This test has a field that is at a non-constant bit offset from the start of the type. DWARF 5 does not allow for this situation (I've sent a report to the DWARF list), but DWARF 3 did allow for this via a combination of an expression for the byte offset and then the use of DW_AT_bit_offset. This looks like: <5><117a>: Abbrev Number: 17 (DW_TAG_member) <117b> DW_AT_name : (indirect string, offset: 0x1959): another_field ... <1188> DW_AT_bit_offset : 6 <1189> DW_AT_data_member_location: 6 byte block: 99 3d 1 0 0 22 (DW_OP_call4: <0x1193>; DW_OP_plus) ... <3><1193>: Abbrev Number: 2 (DW_TAG_dwarf_procedure) <1194> DW_AT_location : 15 byte block: 97 94 1 37 1a 32 1e 23 7 38 1b 31 1c 23 3 (DW_OP_push_object_address; DW_OP_deref_size: 1; DW_OP_lit7; DW_OP_and; DW_OP_lit2; DW_OP_mul; DW_OP_plus_uconst: 7; DW_OP_lit8; DW_OP_div; DW_OP_lit1; DW_OP_minus; DW_OP_plus_uconst: 3) Now, that combination is not fully general, in that the bit offset must be a constant -- only the byte offset may really vary. However, I couldn't come up with a situation where full generality is needed, mainly because GNAT won't seem to pack fields into the padding of a variable-length array. Meanwhile, the reason for the gdb crash is that the code handling DW_AT_bit_offset assumes that the byte offset is a constant. This causes an assertion failure. This patch arranges for DW_AT_bit_offset to be applied during field resolution, when needed.
2025-05-06Introduce apply_bit_offset_to_field helper functionTom Tromey1-40/+11
This patch makes a new function, apply_bit_offset_to_field, that is used to handle the logic of DW_AT_bit_offset. Currently there is just a single caller, but the next patch will change this.
2025-05-06Use OBSTACK_ZALLOC when allocating batonsTom Tromey1-6/+9
I found some places in dwarf2/read.c that allocate a location baton, but fail to initialize one of the fields. It seems safer to me to use OBSTACK_ZALLOC here, so this patch makes this change. This will be useful in a subsequent patch as well, where a new field is added to one of the batons.
2025-05-06Clean up handle_member_locationTom Tromey1-3/+2
This removes a redundant check from handle_member_location, and also changes the complaint -- currently it will issue the "complex location" complaint, but really what is happening here is an unrecognized form.
2025-05-06Handle dynamic field propertiesTom Tromey4-86/+117
I found a situation where gdb could not properly decode an Ada type. In this first scenario, the discriminant of a type is a bit-field. PROP_ADDR_OFFSET does not handle this situation, because it only allows an offset -- not a bit-size. My original approach to this just added a bit size as well, but after some discussion with Eric Botcazou, we found another failing case: a tagged type can have a second discriminant that appears at a variable offset. So, this patch changes this code to accept a general 'struct field' instead of trying to replicate the field-finding machinery by itself. This is handled at property-evaluation time by simply using a 'field' and resolving its dynamic properties. Then the usual field-extraction function is called to get the value. Because the baton now just holds a field, I renamed PROP_ADDR_OFFSET to PROP_FIELD. The DWARF reader now defers filling in the property baton until the fields have been attached to the type. Finally, I noticed that if the discriminant field has a biased representation, then unpack_field_as_long would not handle this either. This bug is also fixed here, and the test case checks this. Regression tested on x86-64 Fedora 41.
2025-05-06Constify property_addr_infoTom Tromey1-1/+1
This changes most places to use a const property_addr_info. This seems more correct to me because normally the user of a property_addr_info should not modify it. Furthermore, some functions already take a const object, and for a subsequent patch it is convenient if other functions do as well.
2025-05-05Fix sign of Ada rational constantsTom Tromey1-6/+7
My earlier patch commit 0c03db90 ("Use correct sign in get_mpz") was (very) incorrect. It changed get_mpz to check for a strict sign when examining part of an Ada rational constant. However, in Ada the "delta" for a fixed-point type must be positive, and so the components of the rational representation will be positive. This patch corrects the error. It also renames the get_mpz function, in case anyone is tempted to reuse this code for another purpose. Finally, this pulls over the test from the internal AdaCore test suite that found this issue.
2025-05-02[gdb/symtab] Throw DWARF error on out-of-bounds DW_FORM_strxTom de Vries1-3/+10
With the test-case contained in the patch, and gdb build with -fsanitize=address we get: ... ==23678==ERROR: AddressSanitizer: heap-buffer-overflow ...^M READ of size 1 at 0x6020000c30dc thread T3^[[1m^[[0m^M ptype global_var^M #0 0x2c6a40b in bfd_getl32 bfd/libbfd.c:846^M #1 0x168f96c in read_str_index gdb/dwarf2/read.c:15349^M ... The executable contains an out-of-bounds DW_FORM_strx attribute: ... $ readelf -wi $exec <2eb> DW_AT_name :readelf: Warning: string index of 1 converts to \ an offset of 0xc which is too big for section .debug_str (indexed string: 0x1): <string index too big> ... and read_str_index doesn't check for this: ... info_ptr = (str_offsets_section->buffer + str_offsets_base + str_index * offset_size); if (offset_size == 4) str_offset = bfd_get_32 (abfd, info_ptr); ... and consequently reads out-of-bounds. Fix this in read_str_index by checking for the out-of-bounds condition and throwing a DWARF error: ... (gdb) ptype global_var DWARF Error: Offset from DW_FORM_GNU_str_index or DW_FORM_strx pointing \ outside of .debug_str_offsets section in CU at offset 0x2d7 \ [in module dw-form-strx-out-of-bounds] No symbol "global_var" in current context. (gdb) ... Tested on x86_64-linux. Approved-By: Tom Tromey <tom@tromey.com>
2025-05-02[gdbsupport] Reimplement phex and phex_nz as templatesTom de Vries1-1/+1
Gdbsupport functions phex and phex_nz have a parameter sizeof_l: ... extern const char *phex (ULONGEST l, int sizeof_l); extern const char *phex_nz (ULONGEST l, int sizeof_l); ... and a lot of calls use: ... phex (l, sizeof (l)) ... Make this easier by reimplementing the functions as a template, allowing us to simply write: ... phex (l) ... Simplify existing code using: ... $ find gdb* -type f \ | xargs sed -i 's/phex (\([^,]*\), sizeof (\1))/phex (\1)/' $ find gdb* -type f \ | xargs sed -i 's/phex_nz (\([^,]*\), sizeof (\1))/phex_nz (\1)/' ... and manually review: ... $ find gdb* -type f | xargs grep "phex (.*, sizeof.*)" $ find gdb* -type f | xargs grep "phex_nz (.*, sizeof.*)" ... Tested on x86_64-linux. Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29gdb/dwarf: change a bunch of functions to be methods of ↵Simon Marchi1-59/+52
cooked_index_worker_debug_info Move a few functions exclusively used to process units to become methods of cooked_index_worker_debug_info. Rename them to a more consistent name scheme, which gets rid of outdated naming. The comments were also quite outdated. Change-Id: I2e7dcc2e4ff372007dcb4f6c3d34187c9cc2da05 Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29gdb/dwarf: move cooked_index_worker_debug_info upSimon Marchi1-75/+73
The next patch moves some functions to be methods of cooked_index_worker_debug_info. Move cooked_index_worker_debug_info above those functions, to make that easier (methods can't be defined before the class declaration). Change-Id: I7723cb42efadb2cc86f2227b3c2fb275e2d620f9 Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29gdb/dwarf: clean up some cutu_reader::is_dummy() callsSimon Marchi2-6/+4
This patch tries to standardize the places where we check if units are dummy. When checking if a unit is dummy, it is not necessary to check for some other conditions. - cutu_reader::is_dummy() is a superset of cutu_reader::cu() returning nullptr, so it's not necessary to check if the cu method return nullptr if also checking if the unit is dummy. - cutu_reader::is_dummy() is a superset of cutu_reader::top_level_die() returning nullptr, so same deal. Remove some spots that check for these conditions in addition to cutu_reader::is_dummy(). In addition, also remove the checks for: !new_reader->top_level_die ()->has_children in cooked_indexer::ensure_cu_exists. IMO, it is not useful to special case the units having a single DIE. Especially in this function, which deals with importing things from another unit, a unit with a single DIE would be an edge case that should not happen with good debug info. I think it's preferable to have simpler code. Change-Id: I4529d7b3a0bd2891a60f41671de8cfd3114adb4a Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29gdb/dwarf: avoid cutu_reader movesSimon Marchi3-17/+19
In process_psymtab_comp_unit and ensure_cu_exists, we create a temporary cutu_reader on the stack, then move it to a heap allocated cutu_reader once we confirmed the unit is not dummy. I think it's unnecessary to create a temporary cutu_reader. The only downside of not doing so is that if it ends up that the CU is dummy, we made an allocation/deallocation for nothing. Dummy CUs are a rare thing, it shouldn't change anything. This allows removing the cutu_reader move constructor. Change-Id: I44742d471c495055ee46db41c0e7bdfbd2d5c0b7 Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29gdb/dwarf: read multiple .debug_info.dwo sectionsSimon Marchi1-11/+26
When building with gcc, with flags -gdwarf-5, -gsplit-dwarf and -fdebug-types-section, the resulting .dwo files contain multiple .debug_info.dwo sections. One for each type unit and one for the compile unit. This is correct, as per DWARF 5, section F.2.3 ("Contents of the Split DWARF Object Files"): The split DWARF object files each contain the following sections: ... .debug_info.dwo (for the compilation unit) .debug_info.dwo (one COMDAT section for each type unit) ... GDB currently assumes that there is a single .debug_info.dwo section, causing unpredictable behavior. For example, sometimes this crash: ==81781==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x508000007a71 at pc 0x58704d32a59c bp 0x7ffc0acc0bb0 sp 0x7ffc0acc0ba0 READ of size 1 at 0x508000007a71 thread T0 #0 0x58704d32a59b in bfd_getl32 /home/smarchi/src/binutils-gdb/bfd/libbfd.c:846 #1 0x58704ae62dce in read_initial_length(bfd*, unsigned char const*, unsigned int*, bool) /home/smarchi/src/binutils-gdb/gdb/dwarf2/leb.c:92 #2 0x58704aaf76bf in read_comp_unit_head(comp_unit_head*, unsigned char const*, dwarf2_section_info*, rcuh_kind) /home/smarchi/src/binutils-gdb/gdb/dwarf2/comp-unit-head.c:47 #3 0x58704aaf8f97 in read_and_check_comp_unit_head(dwarf2_per_objfile*, comp_unit_head*, dwarf2_section_info*, dwarf2_section_info*, unsigned char const*, rcuh_kind) /home/smarchi/src/binutils-gdb/gdb/dwarf2/comp-unit-head.c:193 #4 0x58704b022908 in create_dwo_unit_hash_tables /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:6233 #5 0x58704b0334a5 in open_and_init_dwo_file /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:7588 #6 0x58704b03965a in lookup_dwo_cutu /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:7935 #7 0x58704b03a5b1 in lookup_dwo_comp_unit /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:8009 #8 0x58704aff5b70 in lookup_dwo_unit /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:2802 The first time that locate_dwo_sections gets called for a .debug_info.dwo section, dwo_sections::info gets initialized properly. The second time it gets called for a .debug_info.dwo section, the size field in dwo_sections::info gets overwritten with the size of the second section. But the buffer remains pointing to the contents of the first section, because the section is already "read in". So the size does not match the buffer. And even if it did, we would only keep the information about one .debug_info.dwo, out of the many. First, add an assert in locate_dwo_sections to make sure we don't try to fill in a dwo section info twice. Add the assert to other functions with the same pattern, while at it. Then, change dwo_sections::info to be a vector of sections (just like we do for type sections). Update locate_dwo_sections to append to that vector when seeing a new .debug_info.dwo section. Update open_and_init_dwo_file to read the units from each section. The problem can be observed by running some tests with the dwarf5-fission-debug-types target board. For example, gdb.base/condbreak.exp crashes (with the ASan failure shown above) before the patch and passes after). [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119766 Change-Id: Iedf275768b6057dee4b1542396714f3d89903cf3 Reviewed-By: Tom de Vries <tdevries@suse.de>
2025-04-29gdb/dwarf: scan .debug_info.dwo just onceSimon Marchi2-183/+94
When building -gsplit-dwarf and -fdebug-types-section in DWARF 5, the resulting .dwo files will typically have a .debug_info.dwo section with multiple type units followed by one compile unit: $ llvm-dwarfdump -F -color a-test.dwo | grep ' Unit' 0x00000000: Type Unit: length = 0x000008a0, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_type, abbr_offset = 0x0000, addr_size = 0x08, name = 'vector<int, std::allocator<int> >', type_signature = 0xb499dcf29e2928c4, type_offset = 0x0023 (next unit at 0x000008a4) 0x000008a4: Type Unit: length = 0x00000099, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_type, abbr_offset = 0x0000, addr_size = 0x08, name = 'allocator<int>', type_signature = 0x496a8791a842701b, type_offset = 0x0023 (next unit at 0x00000941) ... 0x000015c1: Compile Unit: length = 0x00000f58, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_compile, abbr_offset = 0x0000, addr_size = 0x08, DWO_id = 0xe8e359820d1c5803 (next unit at 0x0000251d) In open_and_init_dwo_file, we call create_dwo_cus_hash_table, which scans the section, looking for compile units, then call create_dwo_debug_types_hash_table, which scans the section again, looking for type units. It would make more sense to scan the section just once and handle both compile and type units at the same time. To achieve this, add create_dwo_unit_hash_tables, which knows how to handle both unit kinds in a single scan. It replaces create_dwo_cus_hash_table and create_dwo_debug_type_hash_table. Change open_and_init_dwo_file to call it. Note that I removed the DWARF version check in open_and_init_dwo_file when processing .debug_type.dwo sections: in DWARF 5, the .debug_type.dwo sections will just not exist, so the `dwo_file->sections.types` vector will be empty. Change-Id: I6e51d0ca06c258e0bf0e59927d62ae2df314a162 Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29gdb/dwarf: scan DWARF 5 DWO CUs by just reading the headerSimon Marchi1-24/+50
create_dwo_cus_hash_table is implemented by creating a cutu_reader (which is somewhat heavy) for all units in a .dwo file. The purpose of this cutu_reader is to be able to get the DWO ID, if it is specified by a DW_AT_GNU_dwo_id attribute. In DWARF 5, however, the DWO ID is available in the CU header. We can access it without accessing the DIEs, by just reading the header, which is more lightweight. Add a new code path to create_dwo_cus_hash_table that does that. The logic is copied from create_dwo_debug_type_hash_table, which does this already. This change helps circumvent a performance problem I want to fix (the same I was trying to fix in this patch [1]) when loading a file built with -gdwarf-5, -gsplit-dwarf and -fdebug-types-section. In this configuration, the produced .dwo files contain one compile unit and many type units each. All units in a given .dwo share the same abbrev table. Creating a cutu_reader for each unit meant re-reading the same abbrev table over and over. What's particularly bad is that this is done with the dwo_lock held, preventing other indexing threads from making progress. To give a rough idea, here's the time take by each worker to index units before this patch (on a rather large program): Time for "DWARF indexing worker": wall 18.627, user 0.885, sys 0.042, user+sys 0.927, 5.0 % CPU Time for "DWARF indexing worker": wall 18.888, user 0.862, sys 0.042, user+sys 0.904, 4.8 % CPU Time for "DWARF indexing worker": wall 19.172, user 1.848, sys 0.069, user+sys 1.917, 10.0 % CPU Time for "DWARF indexing worker": wall 19.297, user 1.544, sys 0.051, user+sys 1.595, 8.3 % CPU Time for "DWARF indexing worker": wall 19.545, user 3.408, sys 0.084, user+sys 3.492, 17.9 % CPU Time for "DWARF indexing worker": wall 19.759, user 4.221, sys 0.117, user+sys 4.338, 22.0 % CPU Time for "DWARF indexing worker": wall 19.789, user 4.187, sys 0.105, user+sys 4.292, 21.7 % CPU Time for "DWARF indexing worker": wall 19.825, user 4.933, sys 0.135, user+sys 5.068, 25.6 % CPU And the times with this patch: Time for "DWARF indexing worker": wall 0.163, user 0.089, sys 0.029, user+sys 0.118, 72.4 % CPU Time for "DWARF indexing worker": wall 0.176, user 0.096, sys 0.041, user+sys 0.137, 77.8 % CPU Time for "DWARF indexing worker": wall 0.265, user 0.167, sys 0.054, user+sys 0.221, 83.4 % CPU Time for "DWARF indexing worker": wall 0.353, user 0.257, sys 0.060, user+sys 0.317, 89.8 % CPU Time for "DWARF indexing worker": wall 0.524, user 0.399, sys 0.088, user+sys 0.487, 92.9 % CPU Time for "DWARF indexing worker": wall 0.648, user 0.517, sys 0.107, user+sys 0.624, 96.3 % CPU Time for "DWARF indexing worker": wall 0.657, user 0.523, sys 0.107, user+sys 0.630, 95.9 % CPU Time for "DWARF indexing worker": wall 0.753, user 0.612, sys 0.120, user+sys 0.732, 97.2 % CPU [1] https://inbox.sourceware.org/gdb-patches/20250326200002.136200-8-simon.marchi@efficios.com/ Change-Id: I34a422577e4c3ad7d478ec6df12a0e44d284c249 Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29gdb/dwarf: replace some "compile unit" terminology with "unit"Simon Marchi8-175/+165
In DWARF 5 (and even previous versions, with type units), compile units are just one type of units. In many places, we still use "compile units" when in reality it would be better to talk about "units" (unless we specifically want to talk about compile units). Rename comp-unit-head.{c.h} to unit-head.{c,h}, and do a big pass of renames in it to remove the specific mentions of compile units, where in fact we want to talk about units in general. Change-Id: Ia06c90ccb25756c366f269a12620f2f7c8378adb Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29gdb: add some scoped_time_its to profile startup timeSimon Marchi3-3/+17
I'm investigating some issues where GDB takes a lot of time to start up (read: for the DWARF index to be ready to do anything useful). Adding those scoped_time_it instances helped me gain some insights about where GDB spends time. I think they would be useful to have upstream, to make investigating future problems easier. It would also be useful to be able to give some numbers in the commit messages. Here's an example of what GDB outputs: Time for "minsyms install worker": wall 0.045, user 0.040, sys 0.004, user+sys 0.044, 97.8 % CPU Time for "minsyms install worker": wall 0.511, user 0.457, sys 0.048, user+sys 0.505, 98.8 % CPU Time for "minsyms install worker": wall 1.513, user 1.389, sys 0.111, user+sys 1.500, 99.1 % CPU Time for "minsyms install worker": wall 1.688, user 1.451, sys 0.102, user+sys 1.553, 92.0 % CPU Time for "minsyms install worker": wall 1.897, user 1.518, sys 0.089, user+sys 1.607, 84.7 % CPU Time for "minsyms install worker": wall 2.811, user 2.558, sys 0.231, user+sys 2.789, 99.2 % CPU Time for "minsyms install worker": wall 3.257, user 3.049, sys 0.188, user+sys 3.237, 99.4 % CPU Time for "minsyms install worker": wall 3.617, user 3.089, sys 0.211, user+sys 3.300, 91.2 % CPU Time for "DWARF indexing worker": wall 19.517, user 0.894, sys 0.075, user+sys 0.969, 5.0 % CPU Time for "DWARF indexing worker": wall 19.807, user 0.891, sys 0.086, user+sys 0.977, 4.9 % CPU Time for "DWARF indexing worker": wall 20.270, user 1.559, sys 0.119, user+sys 1.678, 8.3 % CPU Time for "DWARF indexing worker": wall 20.329, user 1.878, sys 0.147, user+sys 2.025, 10.0 % CPU Time for "DWARF indexing worker": wall 20.848, user 3.483, sys 0.224, user+sys 3.707, 17.8 % CPU Time for "DWARF indexing worker": wall 21.088, user 4.285, sys 0.295, user+sys 4.580, 21.7 % CPU Time for "DWARF indexing worker": wall 21.109, user 4.501, sys 0.274, user+sys 4.775, 22.6 % CPU Time for "DWARF indexing worker": wall 21.198, user 5.087, sys 0.319, user+sys 5.406, 25.5 % CPU Time for "DWARF skeletonless type units": wall 4.024, user 3.858, sys 0.115, user+sys 3.973, 98.7 % CPU Time for "DWARF add parent map": wall 0.092, user 0.086, sys 0.004, user+sys 0.090, 97.8 % CPU Time for "DWARF finalize worker": wall 0.278, user 0.241, sys 0.009, user+sys 0.250, 89.9 % CPU Time for "DWARF finalize worker": wall 0.307, user 0.304, sys 0.000, user+sys 0.304, 99.0 % CPU Time for "DWARF finalize worker": wall 0.727, user 0.719, sys 0.000, user+sys 0.719, 98.9 % CPU Time for "DWARF finalize worker": wall 0.913, user 0.901, sys 0.003, user+sys 0.904, 99.0 % CPU Time for "DWARF finalize worker": wall 0.776, user 0.767, sys 0.004, user+sys 0.771, 99.4 % CPU Time for "DWARF finalize worker": wall 1.897, user 1.869, sys 0.006, user+sys 1.875, 98.8 % CPU Time for "DWARF finalize worker": wall 2.534, user 2.512, sys 0.005, user+sys 2.517, 99.3 % CPU Time for "DWARF finalize worker": wall 2.607, user 2.583, sys 0.006, user+sys 2.589, 99.3 % CPU Time for "DWARF finalize worker": wall 3.142, user 3.094, sys 0.022, user+sys 3.116, 99.2 % CPU Change-Id: I9453589b9005c3226499428ae9cab9f4a8c22904 Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29Handle base type without DW_AT_byte_sizeTom Tromey1-29/+38
DWARF says that a base type can have DW_AT_bit_size, without DW_AT_byte_size. However, gdb does not correctly handle this; in fact, it crashes, as pointed out in this LLVM merge request: https://github.com/llvm/llvm-project/pull/137123 This patch reworks the base type size logic a bit to handle this situation. Tested-by: Kevin Buettner <kevinb@redhat.com> Approved-by: Kevin Buettner <kevinb@redhat.com>
2025-04-24gdb/dwarf: add dwarf2_cu::find_die methodSimon Marchi2-5/+13
I added this small helper method in the series I'm writing, to make finding a DIE by section offset a bit nicer than using the unordered_set methods. It doesn't have any dependencies, so I thought I would submit it on its own. Change-Id: If7313194ab09d9bd6d6a52c24eb6fd9a9c1b76e0 Approved-by: Kevin Buettner <kevinb@redhat.com>
2025-04-24Use correct sign extension for enumeration typesTom Tromey1-22/+37
This changes update_enumeration_type_from_children to use the correct sign-extension method on the attribute. The logic here is a bit complicated: if the enum has an underlying type, then we use that type's signed-ness to interpret attributes; otherwise we must assume attributes are encoded as signed values. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-24Use bool in update_enumeration_type_from_childrenTom Tromey1-6/+6
This is just a small preliminary cleanup to use 'bool' in update_enumeration_type_from_children.
2025-04-24Remove dead code from dwarf2_const_value_dataTom Tromey1-35/+16
dwarf2_const_value_data checks the size of the data like so: if (bits < sizeof (*value) * 8) ... else if (bits == sizeof (*value) * 8) ... else ... However, 'bits' can only be 8, 16, 32, or 64. And, because 'value' is a LONGEST, which is alwasy 64-bit, the final 'else' can never be taken. This patch removes the dead code. And, because this was the only reason for a non-void return value, the return type is changed as well. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-24Use attribute::signed_constant in attribute::as_booleanTom Tromey1-1/+4
This changes attribute::as_boolean to use attribute::signed_constant. This is maybe overkill but lets any reasonable constant form through. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-24Use correct sign for variant part discriminantsTom Tromey1-10/+24
The discriminant value for a variant part may be signed or unsigned, depending on the type of the variant. This patch changes the DWARF reader to delay interpretation of the relevant attribute until the signed-ness is known. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-24Use correct sign in get_mpzTom Tromey2-2/+11
This changes dwarf2/read.c:get_mpz to use the correct sign-extension function. Normally a rational constant uses signed values, but a purely unsigned form also seems fine here. This adds a new attribute::form_is_strictly_unsigned, which is more precise than form_is_unsigned (which accepts a lot of forms that aren't really for ordinary constants). Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-24Use attribute::unsigned_constant for DW_AT_data_member_locationTom Tromey1-3/+3
This changes the DWARF reader to use attribute::unsigned_constant for DW_AT_data_member_location. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-24Use attribute::unsigned_constant for DW_AT_data_bit_offsetTom Tromey1-10/+16
This changes the DWARF reader to use attribute::unsigned_constant when examining DW_AT_data_bit_offset. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-24Use correct sign for DW_AT_GNU_biasTom Tromey1-2/+7
DW_AT_GNU_bias may be signed or unsigned, depending on the underlying type. This patch changes the DWARF reader to examine the type before decoding the attribute. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-24Use attribute::unsigned_constant for DW_AT_bit_strideTom Tromey1-1/+1
DW_AT_bit_stride uses an unsigned constant, so make this explicit in the reader. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-24Use attribute::signed_constant for fixed-point scaleTom Tromey1-2/+2
This changes the DWARF reader to use attribute::signed_constant for DW_AT_binary_scale and DW_AT_decimal_scale. (FWIW these were the attributes that first lead me to find this problem.) Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-24Introduce attribute::signed_constantTom Tromey2-0/+39
This introduces a new method, attribute::signed_constant. This should be used wherever DWARF specifies a signed integer constant, or where this is implied by the context. It properly handles sign-extension for DW_FORM_data*. To my surprise, there doesn't seem to be a pre-existing sign-extension function. I've added one to common-utils.h alongside the align functions. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-24Use attribute::unsigned_constant for sizesTom Tromey1-17/+16
This changes the DWARF reader to use attribute::unsigned_constant for DW_AT_bit_size, DW_AT_byte_size, DW_AT_data_byte_size, and DW_AT_string_length. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
2025-04-22Handle DWARF 5 separate debug sectionsTom Tromey9-44/+197
DWARF 5 standardized the .gnu_debugaltlink section that dwz emits in multi-file mode. This is handled via some new forms, and a new .debug_sup section. This patch adds support for this to gdb. It is largely straightforward, I think, though one oddity is that I chose not to have this code search the system build-id directories for the supplementary file. My feeling was that, while it makes sense for a distro to unify the build-id concept with the hash stored in the .debug_sup section, there's no intrinsic need to do so. This in turn means that a few tests -- for example those that test the index cache -- will not work in this mode. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32808 Acked-By: Simon Marchi <simon.marchi@efficios.com>
2025-04-22Remove 'read' call from dwz_file::read_stringTom Tromey3-16/+19
dwz_file::read_string calls 'read' on the section, but this isn't needed as the sections have all been pre-read. This patch makes this change, and refactors dwz_file a bit to make this more obvious -- by making it clear that only the "static constructor" can create a dwz_file. Approved-By: Simon Marchi <simon.marchi@efficios.com> Tested-By: Alexandra Petlanova Hajkova <ahajkova@redhat.com>
2025-04-20gdb/dwarf: make some more functions methods of cutu_readerSimon Marchi2-36/+66
These are only used by cutu_reader, so make them methods of cutu_reader. This makes it a bit more obvious in which context this code is called. lookup_dwo_unit_in_dwp can't be made a method of cutu_reader, as it is used in another context (lookup_dwp_signatured_type / lookup_signatured_type), which happens during CU expansion. Change-Id: Ic62c3119dd6ec198411768323aaf640ed165f51b Approved-By: Tom Tromey <tom@tromey.com>
2025-04-20gdb/dwarf: look up .dwp file ahead of timeSimon Marchi2-29/+22
get_dwp_file lazily looks for a .dwp file for the given objfile. It is called by indexing workers, when a cutu_reader object looks for a DWO file. It is called with the "dwo_lock" held, meaning that the first worker to get there will do the work, while the others will wait at the lock. I'm trying to reduce the time where this lock is taken and do other refactorings to make it easier to reason about the DWARF reader code. Moving the lookup of the .dwp file ahead, before we start parallelizing work, helps makes things simpler, because we can then assume everywhere else that we have already checked for a .dwp file. Put the call to open_and_init_dwp_file in dwarf2_has_info, right next to where we look up .dwz files. I used the same try-catch pattern as for the .dwz file lookup. Change-Id: I615da85f62a66d752607f0dbe9f0372dfa04b86b Approved-By: Tom Tromey <tom@tromey.com>
2025-04-20gdb/dwarf: replace some per_objfile parameters with per_bfdSimon Marchi1-14/+14
Following a previous patch, these functions can accept a per_bfd instead of a per_objfile. Change-Id: Iacc8924d2e49a05920d9a7fde2f7584f709fbdd2 Approved-By: Tom Tromey <tom@tromey.com>
2025-04-20gdb/dwarf: pass section to create_dwp_hash_tableSimon Marchi1-14/+8
Instead of passing a boolean to create_dwp_hash_table to select the section to read, it's simpler to just pass the section. Change-Id: Ie043c31e80518239f6403288dcf03f7769c58e8c Approved-By: Tom Tromey <tom@tromey.com>