path: root/gdb/dwarf2/index-write.c
Age | Commit message | Author | Files | Lines
2024-05-30 | gdb: remove unused includes in utils.h | Simon Marchi | 1 | -0/+1
Remove some includes reported as unused by clangd. Add some includes in other files that were previously relying on the transitive include. Change-Id: Ibdd0a998b04d21362a20d0ca8e5267e21e2e133e
2024-04-25 | gdb: remove gdbcmd.h | Simon Marchi | 1 | -1/+1
Most files including gdbcmd.h currently rely on it to access things actually declared in cli/cli-cmds.h (setlist, showlist, etc). To make things easy, replace all includes of gdbcmd.h with includes of cli/cli-cmds.h. This might lead to some unused includes of cli/cli-cmds.h, but it's harmless, and much faster than going through the 170 or so files by hand. Change-Id: I11f884d4d616c12c05f395c98bbc2892950fb00f Approved-By: Tom Tromey <tom@tromey.com>
2024-03-26 | gdb, gdbserver, gdbsupport: remove includes of early headers | Simon Marchi | 1 | -1/+0
Now that defs.h, server.h and common-defs.h are included via the `-include` option, it is no longer necessary for source files to include them. Remove all the inclusions of these files I could find. Update the generation scripts where relevant. Change-Id: Ia026cff269c1b7ae7386dd3619bc9bb6a5332837 Approved-By: Pedro Alves <pedro@palves.net>
2024-02-29 | Use DW_FORM_ref_addr for DIE offset in .debug_names | Tom Tromey | 1 | -2/+5
Today I realized that while the .debug_names writer uses DW_FORM_udata for the DIE offset, DW_FORM_ref_addr would be more appropriate here. This patch makes this change. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31361
2024-01-28 | Use the new symbol domains | Tom Tromey | 1 | -4/+3
This patch changes the DWARF reader to use the new symbol domains. It also adjusts many bits of associated code to adapt to this change. The non-DWARF readers are updated on a best-effort basis. This is somewhat simpler since most of them only support C and C++. I have no way to test a few of these. I went back and forth a few times on how to handle the "tag" situation. The basic problem is that C has a special namespace for tags, which is separate from the type namespace. Other languages don't do this. So, the question is, should a DW_TAG_structure_type end up in the tag domain, or the type domain, or should it be language-dependent? I settled on making it language-dependent using a thought experiment. Suppose there was a Rust compiler that only emitted nameless DW_TAG_structure_type objects, and specified all structure type names using DW_TAG_typedef. This DWARF would be correct, in that it faithfully represents the source language -- but would not work with a purely struct-domain implementation in gdb. Therefore gdb would be wrong. Now, this approach is a little tricky for C++, which uses tags but also enters a typedef for them. I notice that some other readers -- like stabsread -- actually emit a typedef symbol as well. And, I think this is a reasonable approach. It uses more memory, but it makes the internals simpler. However, DWARF never did this for whatever reason, and so in the interest of keeping the series slightly shorter, I've left some C++-specific hacks in place here. Note that this patch includes language_minimal as a language that uses tags. I did this to avoid regressing gdb.dwarf2/debug-names-tu.exp, which doesn't specify the language for a type unit. Arguably this test case is wrong. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30164
2024-01-18 | Rewrite .debug_names writer | Tom Tromey | 1 | -239/+156
This rewrites GDB's .debug_names writer. It is now closer to the form imagined in the DWARF spec. In particular, names are emitted exactly as they appear in the original DWARF. In order to make the reader work nicely, some extensions were needed. These were all documented in an earlier patch. Note that in particular this writer solves the "main name" problem by putting a flag into the table. GDB does not use the .debug_names hash table, so it also does not write one. I consider this hash table to be essentially useless in general, due to the name canonicalization problem -- while DWARF says that writers should use the system demangling style, (1) this style varies across systems, so it can't truly be relied on; and (2) at least GCC and one other compiler don't actually follow this part of the spec anyway. It's important to note, though, that even if the hash was somehow useful, GDB probably still would not use it -- a sorted list of names is needed for completion and performs reasonably well for other lookups, so a hash table is just overhead, IMO. String emission is also simplified. There's no need in this writer to ingest the contents of .debug_str. A couple of tests are updated to reflect the fact that they now "fail" because the tests don't include .debug_aranges in the .S file. Arguably the .debug_names writer should also create this section; but I did not implement that in this series, and there is a separate bug about it. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24820 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549
2024-01-18 | Do not write the index cache from an index | Tom Tromey | 1 | -0/+2
The new .debug_names reader will work by creating a cooked index from .debug_names. This patch updates cooked_index::maybe_write_index to avoid writing the index in this case. However, in order to do this in a clean way, the readers are changed so that a nullptr result from index_for_writing means "cannot be done", and then the error message is moved into write_dwarf_index (where it historically lived).
2024-01-18 | Add language to cooked_index_entry | Tom Tromey | 1 | -2/+2
This adds a new 'lang' member to cooked_index_entry. This holds the language of the symbol. This is primarily useful for the new .debug_names reader, which will not scan the CUs for languages up front. This also changes cooked_index_shard::add to return a non-const pointer. This doesn't impact the current code, but is needed for the new reader.
2024-01-12 | Update copyright year range in header of all files managed by GDB | Andrew Burgess | 1 | -1/+1
This commit is the result of the following actions: - Running gdb/copyright.py to update all of the copyright headers to include 2024, - Manually updating a few files the copyright.py script told me to update, these files had copyright headers embedded within the file, - Regenerating gdbsupport/Makefile.in to refresh its copyright date, - Using grep to find other files that still mentioned 2023. If these files were updated last year from 2022 to 2023 then I've updated them this year to 2024. I'm sure I've probably missed some dates. Feel free to fix them up as you spot them.
2023-12-29 | dwarf, fortran: add support for DW_TAG_entry_point | Nils-Christian Kempke | 1 | -1/+2
Fortran provides additional entry points for subroutines and functions. These entry points may use only a subset (or a different set) of the parameters of the original subroutine. The entry points may be described via the DWARF tag DW_TAG_entry_point. This commit adds support for parsing the DW_TAG_entry_point DWARF tag. Currently, between ifx/ifort/gfortran, only ifort is actually emitting this tag. Both ifx and gfortran use the DW_TAG_subprogram tag as workaround/alternative. Thus, this patch really only adds more ifort support. Even so, some of the attached tests still fail for ifort, due to some wrong line info generated for the entry points in ifort. After this patch it is possible to set a breakpoint in gdb with the ifort compiled example at the entry points 'foo' and 'foobar', which was not possible before. As gcc and ifx do not emit the tag I also added a test to gdb.dwarf2 which uses some underlying C compiled code and adds some Fortran style DWARF to it emitting the DW_TAG_entry_point. Before this patch it was not possible to actually define a breakpoint at the entry point tags. For gfortran there actually exists a bug on bugzilla, asking for the use of DW_TAG_entry_point over DW_TAG_subprogram: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37134 This patch was originally posted here https://sourceware.org/legacy-ml/gdb-patches/2017-07/msg00317.html but its review/pinging got lost after a while. I reworked it to fit the current GDB. Co-authored-by: Bernhard Heckel <bernhard.heckel@intel.com> Co-authored-by: Tim Wiederhake <tim.wiederhake@intel.com> Approved-by: Tom Tromey <tom@tromey.com>
2023-12-13 | gdb: improve error reporting for 'save gdb-index' | Andrew Burgess | 1 | -1/+9
While making recent changes to 'save gdb-index' command I triggered some errors -- of the kind a user might be expected to trigger if they do something wrong -- and I didn't find GDB's output as helpful as it might be. For example: $ gdb -q /tmp/hello.x ... (gdb) save gdb-index /non_existing_dir Error while writing index for `/tmp/hello': mkstemp: No such file or directory. That the error message mentions '/tmp/hello', which does exist, but doesn't mention '/non_existing_dir', which doesn't, is, I think, confusing. Also, I find the 'mkstemp' in the error message confusing for a user facing error. A user might not know what mkstemp means, and even if they do, that it appears in the error message is an internal GDB detail. The user doesn't care what function failed, but wants to know what was wrong with their input, and what they should do to fix things. Similarly, for a directory that does exist, but can't be written to: (gdb) save gdb-index /no_access_dir Error while writing index for `/tmp/hello': mkstemp: Permission denied. In this case, the 'Permission denied' might make the user think there is a permissions issue with '/tmp/hello', which is not the case. After this patch, the new errors are: (gdb) save gdb-index /non_existing_dir Error while writing index for `/tmp/hello': `/non_existing_dir': No such file or directory. and: (gdb) save gdb-index /no_access_dir Error while writing index for `/tmp/hello': `/no_access_dir': Permission denied. we also have: (gdb) save gdb-index /tmp/not_a_directory Error while writing index for `/tmp/hello': `/tmp/not_a_directory': Is not a directory. I think these do a better job of guiding the user towards fixing the problem. I've added a new test that exercises all of these cases, and also checks the case where a user tries to use an executable that already contains an index in order to generate an index.
As part of the new test I've factored out some code from ensure_gdb_index (lib/gdb.exp) into a new proc (get_index_type), which I've then used in the new test. I've confirmed that all the tests that use ensure_gdb_index still pass. During review it was pointed out that the testsuite proc have_index (lib/gdb.exp) is similar to the new get_index_type proc, so I've rewritten have_index to also use get_index_type, I've confirmed that all the tests that use have_index still pass. Nothing that worked correctly before this patch should give an error after this patch; I've only changed the output when the user was going to get an error anyway. Reviewed-By: Tom de Vries <tdevries@suse.de> Reviewed-By: Tom Tromey <tom@tromey.com> Approved-By: Tom Tromey <tom@tromey.com>
2023-11-29 | Use try_emplace in index-write.c | Tom Tromey | 1 | -16/+9
index-write.c has a comment indicating that C++17's try_emplace could be used. This patch makes the change. Approved-By: Pedro Alves <pedro@palves.net>
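As a minimal sketch of the pattern this commit refers to (not the actual index-write.c code), the following shows how C++17's try_emplace replaces a find-then-insert sequence with a single lookup. The names entry_list and add_entry are hypothetical and exist only for illustration.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an index entry; not GDB's actual type.
struct entry_list
{
  std::vector<int> cu_indices;
};

// Record CU_INDEX under NAME.  try_emplace constructs the mapped value
// only if NAME is absent, and in both cases returns an iterator to the
// element, so one lookup replaces the find-then-insert pattern.
// Returns true if NAME was newly added.
inline bool
add_entry (std::unordered_map<std::string, entry_list> &table,
           const std::string &name, int cu_index)
{
  auto [iter, inserted] = table.try_emplace (name);
  iter->second.cu_indices.push_back (cu_index);
  return inserted;
}
```

Calling add_entry twice with the same name leaves one map element holding both CU indices; only the first call reports an insertion.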
2023-11-28 | gdb: generate dwarf-5 index identically as worker-thread count changes | Andrew Burgess | 1 | -2/+15
Similar to the previous commit, this commit ensures that the dwarf-5 index files are generated identically as the number of worker-threads changes. Building the dwarf-5 index makes use of a closed hash table, the bucket_hash local within debug_names::build(). Entries are added to bucket_hash from m_name_to_value_set, which, in turn, is populated by calls to debug_names::insert() in write_debug_names. The insert calls are ordered based on the entries within the cooked_index, and the ordering within cooked_index depends on the number of worker threads that GDB is using. My proposal is to sort each chain within the bucket_hash closed hash table prior to using this to build the dwarf-5 index. The buckets within bucket_hash will always have the same ordering (for a given GDB build with a given executable), and by sorting the chains within each bucket, we can be sure that GDB will see each entry in a deterministic order. I've extended the index creation test to cover this case. Approved-By: Tom Tromey <tom@tromey.com>
2023-11-28 | gdb: generate gdb-index identically regardless of work thread count | Andrew Burgess | 1 | -0/+69
It was observed that changing the number of worker threads that GDB uses (maintenance set worker-threads NUM) would have an impact on the layout of the generated gdb-index. The cause seems to be how the CU are distributed between threads, and then symbols that appear in multiple CU can be encountered earlier or later depending on whether a particular CU moves between threads. I certainly found this behaviour was reproducible when generating an index for GDB itself, like: gdb -q -nx -nh -batch \ -eiex 'maint set worker-threads NUM' \ -ex 'save gdb-index /tmp/' And then setting different values for NUM will change the generated index. Now, the question is: does this matter? I would like to suggest that yes, this does matter. At Red Hat we generate a gdb-index as part of the build process, and we would ideally like to have reproducible builds: for the same source, compiled with the same tool-chain, we should get the exact same output binary. And we do .... except for the index. Now we could simply force GDB to only use a single worker thread when we build the index, but, I don't think the idea of reproducible builds is that strange, so I think we should ensure that our generated indexes are always reproducible. To achieve this, I propose that we add an extra step when building the gdb-index file. After constructing the initial symbol hash table contents, we will pull all the symbols out of the hash, sort them, then re-insert them in sorted order. This will ensure that the structure of the generated hash will remain consistent (given the same set of symbols). I've extended the existing index-file test to check that the generated index doesn't change if we adjust the number of worker threads used. Given that this test is already rather slow, I've only made one change to the worker-thread count. Maybe this test should be changed to use a smaller binary, which is quicker to load, and for which we could then try many different worker thread counts. 
Approved-By: Tom Tromey <tom@tromey.com>
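The sort-then-re-insert idea can be sketched as follows. This is an illustration under stated assumptions, not GDB's implementation: build_table is a hypothetical helper, the table uses simple linear probing, names are assumed to be non-empty (the empty string marks a free slot), and the slot count is assumed to exceed the number of distinct names.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Build a fixed-size open-addressing table (linear probing) from NAMES.
// With linear probing, the slot a colliding name lands in depends on
// insertion order, so the names are sorted (and de-duplicated) first.
// The resulting layout then depends only on the set of names, not on
// the order in which they were discovered.
inline std::vector<std::string>
build_table (std::vector<std::string> names, std::size_t slots)
{
  std::sort (names.begin (), names.end ());
  names.erase (std::unique (names.begin (), names.end ()), names.end ());

  std::vector<std::string> table (slots);
  for (const std::string &name : names)
    {
      std::size_t i = std::hash<std::string> () (name) % slots;
      while (!table[i].empty ())
        i = (i + 1) % slots;
      table[i] = name;
    }
  return table;
}
```

Two calls given the same names in different orders, as might happen when CUs move between worker threads, produce identical tables.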
2023-11-28 | gdb: C++-ify mapped_symtab from dwarf2/index-write.c | Andrew Burgess | 1 | -46/+92
Make the static functions add_index_entry, find_slot, and hash_expand member functions of the mapped_symtab class. Fold an additional snippet of code from write_gdbindex into mapped_symtab::minimize; this code relates to minimisation, so this seems like a good home for it. Make the n_elements, data, and m_string_obstack member variables of mapped_symtab private. Provide a new obstack() member function to provide access to the obstack when needed, and also add member functions begin(), end(), cbegin(), and cend() so that the mapped_symtab class can be treated like a container and iterated over. I've also taken this opportunity to split out the logic for whether the hash table (m_data) needs expanding; this is the new function hash_needs_expanding. This will be useful in a later commit. There should be no user visible changes after this commit. Approved-By: Tom Tromey <tom@tromey.com>
2023-11-28 | gdb: reduce size of generated gdb-index file | Andrew Burgess | 1 | -10/+19
I noticed in passing that our algorithm for generating the gdb-index file is incorrect. When building the hash table in add_index_entry we count every incoming entry, and rehash when the number of entries gets too large. However, some of the incoming entries will be duplicates, which don't actually result in new items being added to the hash table. As a result, we grow the gdb-index hash table far too often. With an unmodified GDB, generating a gdb-index for GDB, I see a file size of 90M, with a hash usage (in the generated index file) of just 2.6%. With a patched GDB, generating a gdb-index for the _same_ GDB binary, I now see a gdb-index file size of 30M, with a hash usage of 41.9%. This is a 67% reduction in gdb-index file size. Obviously, not every gdb-index file is going to see such big savings, however, the larger a program, and the more symbols that are duplicated between compilation units, the more GDB would over count, and so, over-grow the index. The gdb-index hash table we create has a minimum size of 1024, and then we grow the hash when it is 75% full, doubling the hash table at that time. Given this, then we expect that either: a. The hash table is size 1024, and less than 75% full, or b. The hash table is between 37.5% and 75% full. I've included a test that checks some of these constraints -- I've not bothered to check the upper limit, an over-full hash table isn't really a problem here, but if the fill percentage is less than 37.5% then this indicates that we've done something wrong (obviously, I also check for the 1024 minimum size). Approved-By: Tom Tromey <tom@tromey.com>
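The counting fix described above can be illustrated with a small sketch. The class name_table below is hypothetical and much simpler than GDB's mapped_symtab; it assumes non-empty names, linear probing, a 1024-slot minimum, and doubling once a new entry would push the fill past 75%. The point it demonstrates is that duplicates are detected before the element count or the growth check is touched.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical simplified table; not GDB's mapped_symtab.
class name_table
{
public:
  name_table () : m_slots (1024) {}

  void add (const std::string &name)
  {
    std::size_t i = find_slot (m_slots, name);
    if (!m_slots[i].empty ())
      return;  // Duplicate: neither counted nor a reason to grow.

    // Grow only when a genuinely new entry would exceed 75% fill.
    if ((m_count + 1) * 4 > m_slots.size () * 3)
      {
        expand ();
        i = find_slot (m_slots, name);
      }
    m_slots[i] = name;
    ++m_count;
  }

  std::size_t size () const { return m_slots.size (); }
  std::size_t count () const { return m_count; }

private:
  // Return the slot holding NAME, or the free slot where it belongs.
  static std::size_t
  find_slot (const std::vector<std::string> &slots, const std::string &name)
  {
    std::size_t i = std::hash<std::string> () (name) % slots.size ();
    while (!slots[i].empty () && slots[i] != name)
      i = (i + 1) % slots.size ();
    return i;
  }

  // Double the table and re-insert every existing name.
  void expand ()
  {
    std::vector<std::string> bigger (m_slots.size () * 2);
    for (const std::string &name : m_slots)
      if (!name.empty ())
        bigger[find_slot (bigger, name)] = name;
    m_slots.swap (bigger);
  }

  std::vector<std::string> m_slots;
  std::size_t m_count = 0;
};
```

With this scheme, inserting the same few names many times leaves the table at its minimum size, whereas counting every incoming entry would have doubled it repeatedly.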
2023-11-28 | gdb: option completion for 'save gdb-index' command | Andrew Burgess | 1 | -17/+52
Add proper support for option completion to the 'save gdb-index' command. Update save_gdb_index_command function to make use of the new option_def data structures for parsing the '-dwarf-5' option. Approved-By: Tom Tromey <tom@tromey.com>
2023-11-28 | gdb: allow use of ~ in 'save gdb-index' command | Andrew Burgess | 1 | -2/+5
Add a call to gdb_tilde_expand in the save_gdb_index_command function; this means that we can now do: (gdb) save gdb-index ~/blah/ Previously this wouldn't work. Approved-By: Tom Tromey <tom@tromey.com>
2023-11-21 | gdb: Replace gdb::optional with std::optional | Lancelot Six | 1 | -2/+2
Since GDB now requires C++17, we don't need the internally maintained gdb::optional implementation. This patch does the following replacing: - gdb::optional -> std::optional - gdb::in_place -> std::in_place - #include "gdbsupport/gdb_optional.h" -> #include <optional> This change has mostly been done automatically. One exception is gdbsupport/thread-pool.* which did not use the gdb:: prefix as it already lives in the gdb namespace. Change-Id: I19a92fa03e89637bab136c72e34fd351524f65e9 Approved-By: Tom Tromey <tom@tromey.com> Approved-By: Pedro Alves <pedro@palves.net>
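For readers unfamiliar with the standard replacements, here is a brief sketch of std::optional and std::in_place as they would appear after such a conversion. The functions lookup_name and make_entry are hypothetical and are not taken from GDB.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical lookup used only for illustration.
inline std::optional<std::string>
lookup_name (int id)
{
  if (id == 1)
    return std::string ("main");
  return std::nullopt;  // No value available.
}

// std::in_place constructs the contained value directly from the
// given arguments, in the role gdb::in_place played before.
inline std::optional<std::pair<int, std::string>>
make_entry ()
{
  return std::optional<std::pair<int, std::string>> (std::in_place,
                                                     7, "seven");
}
```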
2023-10-20 | Don't include cooked-index.h from dwarf2/read.h | Tom Tromey | 1 | -0/+1
dwarf2/read.h includes cooked-index.h, but it doesn't need to. This patch removes the inclusion from this header, and adds one to index-write.c to make up for the absence.
2023-10-20 | [gdb/symtab] Fix more style issues in v9 .gdb_index section support | Tom de Vries | 1 | -1/+1
I noticed a few more style issues in commit 8b9c08eddac ("[gdb/symtab] Add name_of_main and language_of_main to the DWARF index"), after checking it with gcc's check_GNU_style.{sh,py}. Fix these. Built on x86_64-linux.
2023-10-19 | Fix race in DWARF reader | Tom Tromey | 1 | -1/+1
The recent change to record the DWARF language in the per-CU data yielded a race warning in my testing: ThreadSanitizer: data race ../../binutils-gdb/gdb/dwarf2/read.c:21779 in prepare_one_comp_unit This patch fixes the bug by applying the same style of fix that was done for the ordinary (gdb) language. I wonder if this code could be improved. Requiring an atomic for the language in particular seems unfortunate, as it is often consulted during index finalization. However, I haven't investigated this. Regression tested on x86-64 Fedora 38. Reviewed-by: Tom de Vries <tdevries@suse.de>
2023-10-18 | [gdb/symtab] Fix two style issues in gdb/dwarf2/index-write.c | Tom de Vries | 1 | -3/+3
While reviewing gdb/dwarf2/index-write.c I noticed two style issues. Fix these. Tested on x86_64-linux. Approved-By: Tom Tromey <tom@tromey.com>
2023-10-18 | [gdb/symtab] Fix style issues in v9 .gdb_index section support | Tom de Vries | 1 | -4/+4
Post-commit review pointed out a few style issues in commit 8b9c08eddac ("[gdb/symtab] Add name_of_main and language_of_main to the DWARF index"). Fix these. Tested on x86_64-linux. Reported-By: Tom Tromey <tom@tromey.com> Approved-By: Tom Tromey <tom@tromey.com>
2023-10-10 | [gdb/symtab] Add name_of_main and language_of_main to the DWARF index | Matheus Branco Borella | 1 | -8/+46
This patch adds a new section to the DWARF index containing the name and the language of the main function symbol, gathered from `cooked_index::get_main`, if available. Currently, for lack of a better name, this section is called the "shortcut table". The way this name is both saved and applied upon an index being loaded in mirrors how it is done in `cooked_index_functions`, more specifically, the full name of the main function symbol is saved and `set_objfile_main_name` is used to apply it after it is loaded. The main use case for this patch is in improving startup times when dealing with large binaries. Currently, when an index is used, GDB has to expand symtabs until it finds out what the language of the main function symbol is. For some large executables, this may take a considerable amount of time to complete, slowing down startup. This patch bypasses that operation by having both the name and language of the main function symbol be provided ahead of time by the index. In my testing (a binary with about 1.8GB worth of DWARF data) this change brings startup time down from about 34 seconds to about 1.5 seconds. When testing the patch with target board cc-with-gdb-index, test-case gdb.fortran/nested-funcs-2.exp starts failing, but this is due to a pre-existing issue, filed as PR symtab/30946. Tested on x86_64-linux, with target board unix and cc-with-gdb-index. PR symtab/24549 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549 Approved-By: Tom de Vries <tdevries@suse.de>
2023-09-20 | [gdb/symtab] Error out for .debug_types section in dwz file | Tom de Vries | 1 | -0/+3
There are two methods to factor out type information in a dwarf4 executable: - use -fdebug-types-section to generate type units in a .debug_types section, and - use dwz to create partial units. The dwz method has an extra benefit: it also allows factoring out information between executables into a newly created .dwz file, pointed to by a .gnu_debugaltlink section. There is nothing prohibiting a .gnu_debugaltlink file from containing a .debug_types section. It's just not generated by dwz or any other tool atm, and consequently gdb has no support for it. Enhancement PR symtab/30838 is open about the lack of support. Make the current situation explicit by emitting a dwarf error: ... (gdb) file struct-with-sig-2^M Reading symbols from struct-with-sig-2...^M Dwarf Error: .debug_types section not supported in dwz file^M ... and add an assert in write_gdbindex: ... + /* See enhancement PR symtab/30838. */ + gdb_assert (!(per_cu->is_dwz && per_cu->is_debug_types)); ... to clarify why we can use: ... data_buf &cu_list = (per_cu->is_debug_types ? types_cu_list : per_cu->is_dwz ? dwz_cu_list : objfile_cu_list); ... The test-case is a modified copy from gdb.dwarf2/struct-with-sig.exp, so it keeps the copyright years range. Tested on x86_64-linux. Tested-By: Guinevere Larsen <blarsen@redhat.com> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30838
2023-09-16 | [gdb/symtab] Fix overly large gdb-index file check for 32-bit | Tom de Vries | 1 | -2/+82
Add a unit test which checks that write_gdb_index_1 will throw an error when the size of the file would exceed the maximum value capable of being represented by 'offset_type'. The unit test fails on 32-bit systems due to wrapping overflow. Fix this by changing the type of total_len in write_gdbindex_1 from size_t to uint64_t. Tested on x86_64-linux. Co-Authored-By: Kevin Buettner <kevinb@redhat.com> Approved-by: Kevin Buettner <kevinb@redhat.com>
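The wrapping problem and its fix can be sketched as follows. This is an illustration only: checked_total is a hypothetical helper, the error text is invented for the example, and offset_type is assumed here to be a 32-bit unsigned type. Accumulating in a 64-bit variable keeps the sum exact even on hosts where size_t is 32 bits, so the comparison against the 32-bit limit is meaningful.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Assumed 32-bit offset type, standing in for the index's offset_type.
using offset_type = std::uint32_t;

// Sum the section sizes in a 64-bit accumulator so the total cannot
// wrap where size_t is 32 bits, then reject totals that do not fit in
// offset_type.
inline std::uint64_t
checked_total (const std::vector<std::uint64_t> &section_sizes)
{
  std::uint64_t total = 0;
  for (std::uint64_t size : section_sizes)
    total += size;

  if (total > std::numeric_limits<offset_type>::max ())
    throw std::runtime_error ("index too large for 32-bit offsets");
  return total;
}
```

Had the accumulator been a 32-bit type, two sections of 2^31 bytes each would sum to zero and slip past the check; with the 64-bit accumulator the same input is rejected.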
2023-09-14 | Throw error when creating an overly large gdb-index file | Kevin Buettner | 1 | -1/+8
The header in a .gdb_index section uses 32-bit unsigned offsets to refer to other areas of the section. Thus, there is a size limit of 2^32-1 which is currently unaccounted for by GDB's code for outputting these sections. At the moment, when GDB creates an overly large section, it will exit abnormally due to an internal error, which is caused by a failed assert in assert_file_size, which in turn is called from write_gdbindex_1, both of which are in gdb/dwarf2/index-write.c. This is what happens when that assert fails: $ gdb -q -nx -iex 'set auto-load no' -iex 'set debuginfod enabled off' -ex file ./libgraph_tool_inference.so -ex "save gdb-index `pwd`/" Reading symbols from ./libgraph_tool_inference.so... No executable file now. Discard symbol table from `libgraph_tool_inference.so'? (y or n) n Not confirmed. ../../gdb/dwarf2/index-write.c:1069: internal-error: assert_file_size: Assertion `file_size == expected_size' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. ----- Backtrace ----- 0x55fddb4d78b0 gdb_internal_backtrace_1 ../../gdb/bt-utils.c:122 0x55fddb4d78b0 _Z22gdb_internal_backtracev ../../gdb/bt-utils.c:168 0x55fddb98b5d4 internal_vproblem ../../gdb/utils.c:396 0x55fddb98b8de _Z15internal_verrorPKciS0_P13__va_list_tag ../../gdb/utils.c:476 0x55fddbb71654 _Z18internal_error_locPKciS0_z ../../gdbsupport/errors.cc:58 0x55fddb5a0f23 assert_file_size ../../gdb/dwarf2/index-write.c:1069 0x55fddb5a1ee0 assert_file_size /usr/include/c++/13/bits/stl_iterator.h:1158 0x55fddb5a1ee0 write_gdbindex_1 ../../gdb/dwarf2/index-write.c:1119 0x55fddb5a51be write_gdbindex ../../gdb/dwarf2/index-write.c:1273 [...] --------------------- ../../gdb/dwarf2/index-write.c:1069: internal-error: assert_file_size: Assertion `file_size == expected_size' failed. This problem was encountered while building the python-graph-tool package on Fedora. 
The Fedora bugzilla bug can be found here: https://bugzilla.redhat.com/show_bug.cgi?id=1773651 This commit prevents the internal error from occurring by calling error() when the file size exceeds 2^32-1. Using a gdb built with this commit, I now see this behavior instead: $ gdb -q -nx -iex 'set auto-load no' -iex 'set debuginfod enabled off' -ex file ./libgraph_tool_inference.so -ex "save gdb-index `pwd`/" Reading symbols from ./libgraph_tool_inference.so... No executable file now. Discard symbol table from `/mesquite2/fedora-bugs/1773651/libgraph_tool_inference.so'? (y or n) n Not confirmed. Error while writing index for `/mesquite2/fedora-bugs/1773651/libgraph_tool_inference.so': gdb-index maximum file size of 4294967295 exceeded (gdb) I wish I could provide a test case, but due to the sizes of both the input and output files, I think that testing resources would be strained or exceeded in many environments. My testing on Fedora 38 shows no regressions. Approved-by: Tom Tromey <tom@tromey.com>
2023-09-07 | [gdb/symtab] Fix gdb-index writing for .debug_types | Tom de Vries | 1 | -7/+3
With test-case gdb.ada/same_enum.exp and target board dwarf4-gdb-index we run into: ... (gdb) print red^M No definition of "red" in current context.^M (gdb) FAIL: gdb.ada/same_enum.exp: print red ... [ This is a regression since commit 844a72efbce ("Simplify gdb_index writing"), so this is broken in gdb 12 and 13. ] The easiest way to see what's going wrong is with readelf. We have in section .gdb_index: ... [7194] pck__red: 2 [static, variable] 3 [static, variable] ... which points to the CUs 2 and 3 in the CU list (shown using "2" and "3"), but should be pointing to the TUs 2 and 3 in the TU list (shown using "T2" and "T3"). Fix this by removing the counter / types_counter distinction in write_gdbindex, such that we get the expected: ... [7194] pck__red: T2 [static, variable] T3 [static, variable] ... [ While reading write_gdbindex I noticed a few oddities related to dwz handling, I've filed PR30829 about this. ] Tested on x86_64-linux. Approved-By: Tom Tromey <tom@tromey.com> PR symtab/30827 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30827
2023-08-13 | [gdb/symtab] Don't deduplicate variables in gdb-index | Tom de Vries | 1 | -3/+2
When running test-case gdb.python/py-symbol.exp with target board cc-with-gdb-index, we run into: ... (gdb) python print (len (gdb.lookup_static_symbols ('rr')))^M 1^M (gdb) FAIL: gdb.python/py-symbol.exp: print (len (gdb.lookup_static_symbols ('rr'))) ... [ Note that the test-case contains rr in both py-symtab.c: ... static int __attribute__ ((used)) rr = 42; /* line of rr */ ... and py-symtab-2.c: ... static int __attribute__ ((used)) rr = 99; /* line of other rr */ ... ] This passes with gdb-12-branch, and fails with gdb-13-branch. AFAIU the current code in symtab_index_entry::minimize makes the assumption that it's fine to store only one copy of rr in the gdb-index, because "print rr" will only ever print one, and always the same. But that fails to recognize that gdb supports gdb.lookup_static_symbols, which returns a list of variables rather than the first one. In other words, the current approach breaks feature parity between cooked index and gdb-index. Note btw that also debug-names has both instances: ... [ 5] #00597969 rr: <4> DW_TAG_variable DW_IDX_compile_unit=3 DW_IDX_GNU_internal=1 <4> DW_TAG_variable DW_IDX_compile_unit=4 DW_IDX_GNU_internal=1 ... Fix this in symtab_index_entry::minimize, by not deduplicating variables. Tested on x86_64-linux, with target boards unix and cc-with-gdb-index. Reviewed-by: Kevin Buettner <kevinb@redhat.com> PR symtab/30720 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30720
2023-08-10 | [gdb/symtab] Fix assertion in write_debug_names | Tom de Vries | 1 | -1/+1
When running test-case gdb.dwarf2/pr13961.exp with target-board cc-with-debug-names, I run into: ... Running gdb.dwarf2/pr13961.exp ... gdb compile failed, gdb/dwarf2/index-write.c:1305: internal-error: \ write_debug_names: Assertion `counter == per_bfd->all_units.size ()' failed. ... This is a regression since commit 542a33e348a ("Only use the per-BFD object to write a DWARF index"), which did: ... - gdb_assert (counter == per_objfile->per_bfd->all_comp_units.size ()); + gdb_assert (counter == per_bfd->all_units.size ()); ... Fix this by reverting to using all_comp_units: ... gdb_assert (counter == per_bfd->all_comp_units.size ()); ... Tested on x86_64-linux, using target boards unix and cc-with-debug-names. PR symtab/30741 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30741
2023-08-03 | [gdb/symtab] Remove superfluous handling of Ada main in write_cooked_index | Tom de Vries | 1 | -21/+6
I filed PR29179 about the following FAIL in test-case gdb.ada/O2_float_param.exp with target board cc-with-gdb-index: ... (gdb) break increment^M Function "increment" not defined.^M Make breakpoint pending on future shared library load? (y or [n]) n^M (gdb) FAIL: gdb.ada/O2_float_param.exp: scenario=all: gdb_breakpoint: \ set breakpoint at increment ... The FAIL was a regression since commit 2cf349be0e3 ("Do not put linkage names into .gdb_index"). Before that commit we had: ... $ readelf -w foo > READELF $ grep callee.*increment READELF [1568] callee__increment: 5 [global, function] [3115] callee.increment: 5 [global, function] ... but after only: ... $ grep callee.*increment READELF [3115] callee.increment: 5 [global, function] ... The regression was fixed by commit 67e83a0deef ("Fix regression in c-linkage-name.exp with gdb index"), which got us again: ... $ grep callee.*increment READELF [1568] callee__increment: 5 [global, function] [3115] callee.increment: 5 [global, function] ... The commit however did not claim that particular PR. A subsequent commit, commit 5fea9794325 ("Improve Ada support in .gdb_index") did claim to fix it, together with commit dd05fc7071a ("Change .gdb_index de-duplication implementation"). The commit 5fea9794325 contained the following addition in write_cooked_index: ... + if (entry->per_cu->lang () == language_ada) + { + /* We want to ensure that the Ada main function's name + appears verbatim in the index. However, this name will + be of the form "_ada_mumble", and will be rewritten by + ada_decode. So, recognize it specially here and add it + to the index by hand. */ + if (entry->tag == DW_TAG_subprogram + && strcmp (main_for_ada, name) == 0) + { + /* Leave it alone. */ + } + else + { + /* In order for the index to work when read back into + gdb, it has to use the encoded name, with any + suffixes stripped. 
*/ + std::string encoded = ada_encode (name, false); + name = obstack_strdup (&symtab->m_string_obstack, + encoded.c_str ()); + } + } ... The code contains some special handling related to the Ada main function, so let's look at that one: foo. Before commit 67e83a0deef we have: ... $ grep foo.*function READELF [3733] foo: 7 [global, function] ... and after: ... $ grep foo.*function READELF [2738] _ada_foo: 7 [global, function] [3733] foo: 7 [global, function] ... so that looks identical to the callee.increment case. At commit 5fea9794325, we have slightly different index numbers: ... $ grep foo.*function READELF [1658] foo: 7 [global, function] [2738] _ada_foo: 7 [global, function] ... but otherwise the same result. If we disable the special handling of the Ada main function like so: ... - if (entry->tag == DW_TAG_subprogram + if (false && entry->tag == DW_TAG_subprogram ... we still have the exact same result because: ... (gdb) p main_for_ada $1 = 0x352e6a0 "_ada_foo" ... and ada_encode ("_ada_foo", false) == "_ada_foo". The comment seems to be copied from debug_names::insert, which does indeed use ada_decode, while the code in write_cooked_index uses ada_encode instead. Remove the superfluous special handling of Ada main in write_cooked_index. Tested on x86_64-linux, with target boards unix and cc-with-gdb-index. Approved-By: Tom Tromey <tom@tromey.com>
2023-02-24Only use the per-BFD object to write a DWARF indexTom Tromey1-44/+36
The DWARF index does not need access to the objfile or per-objfile objects when writing -- it's entirely based on the objfile-independent per-BFD data. This patch implements this idea by changing the entire API to only be passed the per-BFD object. This simplifies some lifetime reasoning for the next patch. This patch removes some code that ensures that the BFD came from a file. It seems to me that checking for the existence of a build-id is good enough for the index cache.
2023-01-31gdb/dwarf: rename cooked_index_vector to cooked_indexSimon Marchi1-7/+4
See previous patch's commit message for rationale. Change-Id: I6b8cdc045dffccc1c01ed690ff258af09f6ff076 Approved-By: Tom Tromey <tom@tromey.com>
2023-01-30Make addrmap const-correct in cooked indexTom Tromey1-5/+7
After the cooked index is created, the addrmaps should be const. Change-Id: I8234520ab346ced40a8dd6e478ba21fc438c2ba2
2023-01-30gdb: provide const-correct versions of addrmap::find and addrmap::foreachSimon Marchi1-1/+1
Users of addrmap::find and addrmap::foreach that have a const addrmap should ideally receive const pointers to objects, to indicate they should not be modified. However, users that have a non-const addrmap should still receive a non-const pointer. To achieve this, without adding more virtual methods, make the existing find and foreach virtual methods private and prefix them with "do_". Add small non-const and const wrappers for find and foreach. Obviously, the const can be cast away, but if using static_cast instead of C-style casts, then the compiler won't let you cast the const away. I changed all the callers of addrmap::find and addrmap::foreach I could find to make them use static_cast. Change-Id: Ia8e69d022564f80d961413658fe6068edc71a094
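The wrapper pattern described above can be sketched as follows. This is a minimal illustration, not the GDB source: `toy_addrmap`, `toy_fixed_addrmap`, and the `unsigned long` address type are hypothetical stand-ins, and only `find` is shown.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for an address map; not the real GDB class.
class toy_addrmap
{
public:
  virtual ~toy_addrmap () = default;

  // A const map hands out pointers to const objects.
  const void *find (unsigned long addr) const
  { return this->do_find (addr); }

  // A non-const map hands out mutable pointers.
  void *find (unsigned long addr)
  { return this->do_find (addr); }

private:
  // One virtual implementation shared by both public wrappers.
  virtual void *do_find (unsigned long addr) const = 0;
};

// Hypothetical concrete map backed by std::map.
class toy_fixed_addrmap : public toy_addrmap
{
public:
  void set (unsigned long addr, void *obj) { m_map[addr] = obj; }

private:
  void *do_find (unsigned long addr) const override
  {
    auto it = m_map.find (addr);
    return it == m_map.end () ? nullptr : it->second;
  }

  std::map<unsigned long, void *> m_map;
};
```

A caller holding a `const toy_addrmap &` would write `static_cast<const int *> (map.find (addr))`; trying `static_cast<int *>` on that result fails to compile, which is the property the commit message relies on.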
2023-01-01Update copyright year range in header of all files managed by GDBJoel Brobecker1-1/+1
This commit is the result of running the gdb/copyright.py script, which automated the update of the copyright year range for all source files managed by the GDB project to be updated to include year 2023.
2022-10-21Fix incorrect .gdb_index with new DWARF scannerTom Tromey1-0/+5
PR symtab/29694 points out a regression caused by the new DWARF scanner when the cc-with-gdb-index target board is used. What happens here is that an older version of gdb will make an index describing the "A" type as: [737] A: 1 [global, type] whereas the new gdb says: [1008] A: 0 [global, type] Here the old one is correct because the A in CU 0 is just a declaration without a size: <1><45>: Abbrev Number: 10 (DW_TAG_structure_type) <46> DW_AT_name : A <48> DW_AT_declaration : 1 <48> DW_AT_sibling : <0x6d> This patch fixes the problem by introducing the idea of a "type declaration". I think gdb still needs to recurse into these types, searching for methods, but by marking the type itself as a declaration, gdb can skip this type during lookups and when writing the index. Regression tested on x86-64 using the cc-with-gdb-index board. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29694
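The effect of marking a type as a declaration can be illustrated with a small filter. This is only a sketch under assumptions: `toy_index_entry`, its fields, and `entries_to_write` are hypothetical names, and the real index writer works on different data structures.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical index entry; the field names are illustrative only.
struct toy_index_entry
{
  std::string name;
  int cu_index;
  bool is_type;
  bool is_declaration;
};

// Return the entries that would be written to the index, skipping
// types that are only declarations.
std::vector<toy_index_entry>
entries_to_write (const std::vector<toy_index_entry> &all)
{
  std::vector<toy_index_entry> out;
  for (const toy_index_entry &e : all)
    {
      if (e.is_type && e.is_declaration)
        continue;
      out.push_back (e);
    }
  return out;
}
```

With a declaration of "A" in CU 0 and its definition in CU 1, only the CU 1 entry survives, matching the index the older gdb produced.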
2022-10-17Change .gdb_index de-duplication implementationTom Tromey1-36/+41
While investigating PR symtab/29179, I found that one Ada test failed because, although a certain symbol was present in the index, with the new DWARF reader it pointed to a different CU than was chosen by earlier versions of gdb. This patch changes how symbol de-duplication is done, deferring the process until the entire symbol table has been constructed. This way, it's possible to always choose the lower-numbered CU among duplicates, which is how gdb (implicitly) previously worked. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29179
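The deferred de-duplication idea can be sketched like this. The names `toy_symtab`, `add`, and `finish` are hypothetical; the sketch only shows the "collect everything first, then pick the lowest CU" shape, not the actual .gdb_index writer.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical symbol table that defers de-duplication.
class toy_symtab
{
public:
  // Record every (name, CU index) pair; no decision is made yet.
  void add (const std::string &name, int cu_index)
  { m_pending[name].push_back (cu_index); }

  // Once all symbols are known, keep the lowest-numbered CU per name.
  std::map<std::string, int> finish () const
  {
    std::map<std::string, int> result;
    for (const auto &item : m_pending)
      result[item.first] = *std::min_element (item.second.begin (),
                                              item.second.end ());
    return result;
  }

private:
  std::map<std::string, std::vector<int>> m_pending;
};
```

Because the choice is made in `finish`, the order in which CUs are scanned no longer affects which CU a duplicated name points to.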
2022-10-17Improve Ada support in .gdb_indexTom Tromey1-9/+35
The cooked index work changed how .gdb_index is constructed, and in the process broke .gdb_index support. This is PR symtab/29179. This patch partially fixes the problem. It arranges for Ada names to be encoded in the form expected by the index code. In particular, linkage names for Ada are emitted, including the "main" name; names are Ada-encoded; and names are no longer case-folded, something that prevented operator names from round-tripping correctly. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29179
2022-10-17Fix regression in c-linkage-name.exp with gdb indexTom Tromey1-3/+4
c-linkage-name.exp started failing with the gdb-index target board due to an earlier patch. The problem here is that some linkage names must be in the index -- but, based on inspection, not C++ linkage names. This patch updates the code to exclude only these.
2022-10-08Merge both implementations of debug_names::insertTom Tromey1-27/+24
The class debug_names has two 'insert' overloads, but only one of them is ever called externally, and it simply forwards to the other implementation. It seems cleaner to me to have a single method, so this patch merges the two.
2022-09-22[gdb/symtab] Add all_comp_units/all_type_units views on all_unitsTom de Vries1-3/+2
Add all_comp_units/all_type_units views on all_units. Having the views allows us to: - easily get the number of CUs or TUs in all_units, and - easily access the nth CU or TU. This minimizes the use of tu_stats.nr_tus. Tested on x86_64-linux.
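A rough sketch of such views follows. It assumes, without confirmation from the message above, that the combined vector stores all CUs before all TUs; `toy_unit`, `toy_view`, `toy_units`, and `finalize` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical unit record: only tracks whether it is a type unit.
struct toy_unit
{
  bool is_type_unit;
};

// Minimal non-owning view over a contiguous run of units.
struct toy_view
{
  const toy_unit *data;
  std::size_t count;

  std::size_t size () const { return count; }
  const toy_unit &operator[] (std::size_t i) const { return data[i]; }
};

struct toy_units
{
  std::vector<toy_unit> all_units;
  toy_view all_comp_units {nullptr, 0};
  toy_view all_type_units {nullptr, 0};

  // Assumption: all_units is ordered with CUs first, then TUs.
  // Must be called again if all_units is modified afterwards.
  void finalize ()
  {
    std::size_t nr_cus = 0;
    while (nr_cus < all_units.size () && !all_units[nr_cus].is_type_unit)
      ++nr_cus;
    all_comp_units = toy_view {all_units.data (), nr_cus};
    all_type_units = toy_view {all_units.data () + nr_cus,
                               all_units.size () - nr_cus};
  }
};
```

With views like these, a count of CUs can be checked against `all_comp_units.size ()` directly instead of being derived from the total size and a separate TU counter.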
2022-09-22[gdb/symtab] Rename all_comp_units to all_unitsTom de Vries1-10/+10
Mechanically rename all_comp_units to all_units: ... $ sed -i 's/all_comp_units/all_units/' gdb/dwarf2/* ... Tested on x86_64-linux.
2022-08-03Use gdb_bfd_ref_ptr in objfileTom Tromey1-1/+1
This changes struct objfile to use a gdb_bfd_ref_ptr. In addition to removing some manual memory management, this fixes a use-after-free that was introduced by the registry rewrite series. The issue there was that, in some cases, registry shutdown could refer to memory that had already been freed. This helps fix the bug by delaying the destruction of the BFD reference (and thus the per-bfd object) until after the registry has been shut down.
2022-07-11[gdb/symtab] Fix data race in per_cu->lengthTom de Vries1-1/+1
With gdb built with -fsanitize=thread and test-case gdb.dwarf2/dw4-sig-types.exp and target board cc-with-dwz-m we run into a data race between: ... Write of size 4 at 0x7b2800002268 by thread T4:^M #0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \ abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6236 \ (gdb+0x82f525)^M ... and this read: ... Previous read of size 4 at 0x7b2800002268 by thread T1:^M #0 dwarf2_find_containing_comp_unit gdb/dwarf2/read.c:23444 \ (gdb+0x86e22e)^M ... In other words, between this write: ... this_cu->length = cu->header.get_length (); ... and this read: ... && mid_cu->sect_off + mid_cu->length > sect_off)) ... of per_cu->length. Fix this similarly to the per_cu->dwarf_version case, by only setting it if needed, and otherwise verifying that the same value is used. [ Note that the same code is already present in the other cutu_reader constructor. ] Move this logic into a member function set_length to make sure it's used consistently, and make the field private in order to enforce access through the member functions, and rename it to m_length. This exposes (running test-case gdb.dwarf2/fission-reread.exp) that in fill_in_sig_entry_from_dwo_entry, the m_length field is overwritten. For now, allow for that exception. While we're at it, make sure that the length is set before read. Tested on x86_64-linux. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29344
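The "set only if needed, otherwise verify" logic might look roughly like the sketch below. `toy_per_cu` is a hypothetical class, plain `assert` stands in for whatever assertion macro the real code uses, and zero is assumed to mean "not set yet". The exception for fill_in_sig_entry_from_dwo_entry is deliberately left out, since the message does not say how it is expressed.

```cpp
#include <cassert>

// Hypothetical per-CU record with a write-once length field.
class toy_per_cu
{
public:
  // Write the length only when it is still unset; otherwise check
  // that the caller agrees with the stored value instead of
  // rewriting it.
  void set_length (unsigned int length)
  {
    if (m_length == 0)
      m_length = length;
    else
      assert (m_length == length);
  }

  // Reading before the length has been set is treated as a bug.
  unsigned int length () const
  {
    assert (m_length != 0);
    return m_length;
  }

private:
  unsigned int m_length = 0;
};
```

Avoiding the redundant store is what removes the reported write in this sketch; the class is not otherwise synchronized.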
2022-07-04[gdb/symtab] Add get/set functions for per_cu->lang/unit_typeTom de Vries1-1/+1
The dwarf2_per_cu_data fields lang and unit_type both have a dont-know initial value (respectively language_unknown and (dwarf_unit_type)0), which allows us to add certain checks, for instance checking that a field is not read before written. Add get/set member functions for the two fields as a convenient location to add such checks, make the fields private to enforce using the member functions, and add the m_ prefix. Tested on x86_64-linux.
2022-06-12Remove addrmap wrapper functionsTom Tromey1-1/+1
This removes the various addrmap wrapper functions in favor of simple method calls on the objects themselves.
2022-05-04Fix crash when creating index from indexTom Tromey1-18/+9
My patches yesterday to unify the DWARF index base classes had a bug -- namely, I did the wholesale dynamic_cast-to-static_cast too hastily and introduced a crash. This can be seen by trying to add an index to a file that has an index, or by running a test like gdb-index-cxx.exp using the cc-with-debug-names.exp target board. This patch fixes the crash by introducing a new virtual method and removing some of the static casts.
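The general technique of replacing an unchecked downcast with a virtual method can be sketched as follows. All names here (`toy_index_base`, `toy_cooked_index`, `toy_mapped_index`, `index_for_writing`, `can_write_index`) are hypothetical illustrations and are not claimed to match the method the patch actually introduced.

```cpp
#include <cassert>

struct toy_cooked_index;

// Hypothetical common base for the different index kinds.
struct toy_index_base
{
  virtual ~toy_index_base () = default;

  // Return the object to use when writing a new index, or nullptr
  // when this kind of index cannot be used as a source.
  virtual toy_cooked_index *index_for_writing () { return nullptr; }
};

// An index built by scanning DWARF; it can be written out.
struct toy_cooked_index : toy_index_base
{
  toy_cooked_index *index_for_writing () override { return this; }
};

// An index that was itself read from a file; it keeps the default.
struct toy_mapped_index : toy_index_base
{
};

// Callers ask the object instead of blindly using static_cast.
bool
can_write_index (toy_index_base &index)
{
  return index.index_for_writing () != nullptr;
}
```

A blind `static_cast<toy_cooked_index *>` on a `toy_mapped_index` would be undefined behavior, which is the kind of crash described above; the virtual call makes the "wrong kind of index" case an ordinary null result.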
2022-04-29De-duplicate .gdb_indexTom Tromey1-2/+23
This de-duplicates variables and types in .gdb_index, making the new index closer to what gdb generated before the new DWARF scanner series. Spot-checking the resulting index for gdb itself, it seems that the new scanner picks up some extra symbols not detected by the old one. I tested both the new and old versions of gdb on both new and old versions of the index, and startup time in all cases is roughly the same (it's worth noting that, for gdb itself, the index no longer provides any benefit over the DWARF scanner). So, I think this fixes the size issue with the new index writer. Regression tested on x86-64 Fedora 34.