aboutsummaryrefslogtreecommitdiff
path: root/gdb/corelow.c
diff options
context:
space:
mode:
authorAndrew Burgess <aburgess@redhat.com>2024-04-30 14:21:47 +0100
committerAndrew Burgess <aburgess@redhat.com>2024-09-07 20:28:57 +0100
commitfa826a4bbe98563d1ee98278c3ff0f52b77c413d (patch)
treec449d7efa41882523210bd31521ff03fab9017d9 /gdb/corelow.c
parenta47a679c76922687f6e904c64224af914da63f89 (diff)
downloadbinutils-fa826a4bbe98563d1ee98278c3ff0f52b77c413d.zip
binutils-fa826a4bbe98563d1ee98278c3ff0f52b77c413d.tar.gz
binutils-fa826a4bbe98563d1ee98278c3ff0f52b77c413d.tar.bz2
gdb: improve shared library build-id check for core-files
When GDB opens a core file, in 'core_target::build_file_mappings ()', we collection information about the files that are mapped into the core file, specifically, the build-id and the DT_SONAME attribute for the file, which will be set for some shared libraries. We then cache the DT_SONAME to build-id information on the core file bfd object in the function set_cbfd_soname_build_id. Later, when we are loading the shared libraries for the core file, we can use the library's file name to look in the DT_SONAME to build-id map, and, if we find a matching entry, we can use the build-id to validate that we are loading the correct shared library. This works OK, but has some limitations: not every shared library will have a DT_SONAME attribute. Though it is good practice to add such an attribute, it's not required. A library without this attribute will not have its build-id checked, which can lead to GDB loading the wrong shared library. What I want to do in this commit is to improve GDB's ability to use the build-ids extracted in core_target::build_file_mappings to both validate the shared libraries being loaded, and then to use these build-ids to potentially find (via debuginfod) the shared library. To do this I propose making the following changes to GDB: (1) Rather than just recording the DT_SONAME to build-id mapping in set_cbfd_soname_build_id, we should also record, the full filename to build-id mapping, and also the memory ranges to build-id mapping for every memory range covered by every mapped file. (2) Add a new callback solib_ops::find_solib_addr. This callback takes a solib object and returns an (optional) address within the inferior that is part of this library. We can use this address to find a mapped file using the stored memory ranges which will increase the cases in which a match can be found. (3) Move the mapped file record keeping out of solib.c and into corelow.c. Future commits will make use of this information from other parts of GDB. This information was never solib specific, it lived in the solib.c file because that was the only user of the data, but really, the data is all about the core file, and should be stored in core_target, other parts of GDB can then query this data as needed. Now, when we load a shared library for a core file, we do the following lookups: 1. Is the exact filename of the shared library found in the filename to build-id map? If so then use this build-id for validation. 2. Find an address within the shared library using ::find_solib_addr and then look for an entry in the mapped address to build-id map. If an entry is found then use this build-id. 3. Finally, look in the soname to build-id map. If an entry is found then use this build-id. The addition of step #2 here means that GDB is now far more likely to find a suitable build-id for a shared library. Having acquired a build-id the existing code for using debuginfod to lookup a shared library object can trigger more often. On top of this, we also create a build-id to filename map. This is useful as often a shared library is implemented as a symbolic link to the actual shared library file. The mapped file information is stored based on the actual, real file name, while the shared library information holds the original symbolic link file name. If when loading the shared library, we find the symbolic link has disappeared, we can use the build-id to file name map to check if the actual file is still around, if it is (and if the build-id matches) then we can fall back to use that file. This is another way in which we can slightly increase the chances that GDB will find the required files when loading a core file. Adding all of the above required pretty much a full rewrite of the existing set_cbfd_soname_build_id function and the corresponding get_cbfd_soname_build_id function, so I have taken the opportunity to move the information caching out of solib.c and into corelow.c where it is now accessed through the function core_target_find_mapped_file. At this point the benefit of this move is not entirely obvious, though I don't think the new location is significantly worse than where it was originally. The benefit though is that the cached information is no longer tied to the shared library loading code. I already have a second set of patches (not in this series) that make use of this caching from elsewhere in GDB. I've not included those patches in this series as this series is already pretty big, but even if those follow up patches don't arrive, I think the new location is just as good as the original location. Rather that caching the information within the core file BFD via the registry mechanism, the information used for the mapped file lookup is now stored within the core_file target directly.
Diffstat (limited to 'gdb/corelow.c')
-rw-r--r--gdb/corelow.c334
1 files changed, 325 insertions, 9 deletions
diff --git a/gdb/corelow.c b/gdb/corelow.c
index 2798495..e23d7d7 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -60,6 +60,119 @@
#define O_LARGEFILE 0
#endif
+/* A mem_range and the build-id associated with the file mapped into the
+ given range. */
+
+struct mem_range_and_build_id
+{
+ mem_range_and_build_id (mem_range &&r, const bfd_build_id *id)
+ : range (r),
+ build_id (id)
+ { /* Nothing. */ }
+
+ /* A range of memory addresses. */
+ mem_range range;
+
+ /* The build-id of the file mapped into RANGE. */
+ const bfd_build_id *build_id;
+};
+
+/* An instance of this class is created within the core_target and is used
+ to hold all the information that relating to mapped files, their address
+ ranges, and their corresponding build-ids. */
+
+struct mapped_file_info
+{
+ /* See comment on function definition. */
+
+ void add (const char *soname, const char *expected_filename,
+ const char *actual_filename, std::vector<mem_range> &&ranges,
+ const bfd_build_id *build_id);
+
+ /* See comment on function definition. */
+
+ std::optional <core_target_mapped_file_info>
+ lookup (const char *filename, const std::optional<CORE_ADDR> &addr);
+
+private:
+
+ /* Helper for ::lookup. BUILD_ID is a build-id that was found in
+ one of the data structures within this class. Lookup the
+ corresponding filename in m_build_id_to_filename_map and return a pair
+ containing the build-id and filename.
+
+ If no corresponding filename is found in m_build_id_to_filename_map
+ then the returned pair contains BUILD_ID and an empty string.
+
+ If BUILD_ID is nullptr then the returned pair contains nullptr and an
+ empty string. */
+
+ struct core_target_mapped_file_info
+ make_result (const bfd_build_id *build_id)
+ {
+ if (build_id != nullptr)
+ {
+ auto it = m_build_id_to_filename_map.find (build_id);
+ if (it != m_build_id_to_filename_map.end ())
+ return { build_id, it->second };
+ }
+
+ return { build_id, {} };
+ }
+
+ /* A type that maps a string to a build-id. */
+ using string_to_build_id_map
+ = std::unordered_map<std::string, const bfd_build_id *>;
+
+ /* A type that maps a build-id to a string. */
+ using build_id_to_string_map
+ = std::unordered_map<const bfd_build_id *, std::string>;
+
+ /* When loading a core file, the build-ids are extracted based on the
+ file backed mappings. This map associates the name of a file that was
+ mapped into the core file with the corresponding build-id. The
+ build-id pointers in this map will never be nullptr as we only record
+ files if they have a build-id. */
+
+ string_to_build_id_map m_filename_to_build_id_map;
+
+ /* Map a build-id pointer back to the name of the file that was mapped
+ into the inferior's address space. If we lookup a matching build-id
+ using either a soname or an address then this map allows us to also
+ provide a full path to a file with a matching build-id. */
+
+ build_id_to_string_map m_build_id_to_filename_map;
+
+ /* If the file that was mapped into the core file was a shared library
+ then it might have a DT_SONAME tag in its .dynamic section, this tag
+ contains the name of a shared object. When opening a shared library,
+ if it's basename appears in this map then we can use the corresponding
+ build-id.
+
+ In the rare case that two different files have the same DT_SONAME
+ value then the build-id pointer in this map will be nullptr, this
+ indicates that it's not possible to find a build-id based on the given
+ DT_SONAME value. */
+
+ string_to_build_id_map m_soname_to_build_id_map;
+
+ /* This vector maps memory ranges onto an associated build-id. The
+ ranges are those of the files mapped into the core file.
+
+ Entries in this vector must not overlap, and are sorted be increasing
+ memory address. Within each entry the build-id pointer will not be
+ nullptr.
+
+ While building this vector the entries are not sorted, they are
+ sorted once after the table has finished being built. */
+
+ std::vector<mem_range_and_build_id> m_address_to_build_id_list;
+
+ /* False if address_to_build_id_list is unsorted, otherwise true. */
+
+ bool m_address_to_build_id_list_sorted = false;
+};
+
/* The core file target. */
static const target_info core_target_info = {
@@ -136,6 +249,13 @@ public:
/* See definition. */
void info_proc_mappings (struct gdbarch *gdbarch);
+ std::optional <core_target_mapped_file_info>
+ lookup_mapped_file_info (const char *filename,
+ const std::optional<CORE_ADDR> &addr)
+ {
+ return m_mapped_file_info.lookup (filename, addr);
+ }
+
private: /* per-core data */
/* Get rid of the core inferior. */
@@ -158,7 +278,13 @@ private: /* per-core data */
still be useful. */
std::vector<mem_range> m_core_unavailable_mappings;
- /* Build m_core_file_mappings. Called from the constructor. */
+ /* Data structure that holds information mapping filenames and address
+ ranges to the corresponding build-ids as well as the reverse build-id
+ to filename mapping. */
+ mapped_file_info m_mapped_file_info;
+
+ /* Build m_core_file_mappings and m_mapped_file_info. Called from the
+ constructor. */
void build_file_mappings ();
/* FIXME: kettenis/20031023: Eventually this field should
@@ -354,6 +480,10 @@ core_target::build_file_mappings ()
}
}
+ std::vector<mem_range> ranges;
+ for (const mapped_file::region &region : file_data.regions)
+ ranges.emplace_back (region.start, region.end - region.start);
+
if (expanded_fname == nullptr
|| abfd == nullptr
|| !bfd_check_format (abfd.get (), bfd_object))
@@ -430,16 +560,26 @@ core_target::build_file_mappings ()
}
}
- /* If this is a bfd of a shared library, record its soname and
- build-id. */
- if (file_data.build_id != nullptr && abfd != nullptr)
+ /* If this is a bfd with a build-id then record the filename,
+ optional soname (DT_SONAME .dynamic attribute), and the range of
+ addresses at which this bfd is mapped. This information can be
+ used to perform build-id checking when loading the shared
+ libraries. */
+ if (file_data.build_id != nullptr)
{
- gdb::unique_xmalloc_ptr<char> soname
- = gdb_bfd_read_elf_soname (bfd_get_filename (abfd.get ()));
+ normalize_mem_ranges (&ranges);
+
+ const char *actual_filename = nullptr;
+ gdb::unique_xmalloc_ptr<char> soname;
+ if (abfd != nullptr)
+ {
+ actual_filename = bfd_get_filename (abfd.get ());
+ soname = gdb_bfd_read_elf_soname (actual_filename);
+ }
- if (soname != nullptr)
- set_cbfd_soname_build_id (current_program_space->cbfd,
- soname.get (), file_data.build_id);
+ m_mapped_file_info.add (soname.get (), filename.c_str (),
+ actual_filename, std::move (ranges),
+ file_data.build_id);
}
}
@@ -1634,6 +1774,182 @@ maintenance_print_core_file_backed_mappings (const char *args, int from_tty)
targ->info_proc_mappings (targ->core_gdbarch ());
}
+/* Add more details discovered while processing the core-file's mapped file
+ information, we're building maps between filenames and the corresponding
+ build-ids, between address ranges and the corresponding build-ids, and
+ also a reverse map between build-id and the corresponding filename.
+
+ SONAME is the DT_SONAME attribute extracted from the .dynamic section of
+ a shared library that was mapped into the core file. This can be
+ nullptr if the mapped files was not a shared library, or didn't have a
+ DT_SONAME attribute.
+
+ EXPECTED_FILENAME is the name of the file that was mapped into the
+ inferior as extracted from the core file, this should never be nullptr.
+
+ ACTUAL_FILENAME is the name of the actual file GDB found to provide the
+ mapped file information, this can be nullptr if GDB failed to find a
+ suitable file. This might be different to EXPECTED_FILENAME, e.g. GDB
+ might have downloaded the file from debuginfod and so ACTUAL_FILENAME
+ will be a file in the debuginfod client cache.
+
+ RANGES is the list of memory ranges at which this file was mapped into
+ the inferior.
+
+ BUILD_ID is the build-id for this mapped file, this will never be
+ nullptr. Not every mapped file will have a build-id, but there's no
+ point calling this function if we failed to find a build-id; this
+ structure only exists so we can lookup files based on their build-id. */
+
+void
+mapped_file_info::add (const char *soname,
+ const char *expected_filename,
+ const char *actual_filename,
+ std::vector<mem_range> &&ranges,
+ const bfd_build_id *build_id)
+{
+ gdb_assert (build_id != nullptr);
+ gdb_assert (expected_filename != nullptr);
+
+ if (soname != nullptr)
+ {
+ /* If we already have an entry with this SONAME then this indicates
+ that the inferior has two files mapped into memory with different
+ file names (and most likely different build-ids), but with the
+ same DT_SONAME attribute. In this case we can't use the
+ DT_SONAME to figure out the expected build-id of a shared
+ library, so poison the entry for this SONAME by setting the entry
+ to nullptr. */
+ auto it = m_soname_to_build_id_map.find (soname);
+ if (it != m_soname_to_build_id_map.end ()
+ && it->second != nullptr
+ && !build_id_equal (it->second, build_id))
+ m_soname_to_build_id_map[soname] = nullptr;
+ else
+ m_soname_to_build_id_map[soname] = build_id;
+ }
+
+ /* When the core file is initially opened and the mapped files are
+ parsed, we group the build-id information based on the file name. As
+ a consequence, we should see each EXPECTED_FILENAME value exactly
+ once. This means that each insertion should always succeed. */
+ const auto [it, inserted]
+ = m_filename_to_build_id_map.emplace (expected_filename, build_id);
+ gdb_assert (inserted);
+
+ /* Setup the reverse build-id to file name map. */
+ if (actual_filename != nullptr)
+ m_build_id_to_filename_map.emplace (build_id, actual_filename);
+
+ /* Setup the list of memory range to build-id objects. */
+ for (mem_range &r : ranges)
+ m_address_to_build_id_list.emplace_back (std::move (r), build_id);
+
+ /* At this point the m_address_to_build_id_list is unsorted (we just
+ added some entries to the end of the list). All entries should be
+ added before any look-ups are performed, and the list is only sorted
+ when the first look-up is performed. */
+ gdb_assert (!m_address_to_build_id_list_sorted);
+}
+
+/* FILENAME is the name of a file GDB is trying to load, and ADDR is
+ (optionally) an address within the file in the inferior's address space.
+
+ Search through the information gathered from the core-file's mapped file
+ information looking for a file named FILENAME, or for a file that covers
+ ADDR. If a match is found then return the build-id for the file along
+ with the location where GDB found the mapped file.
+
+ The location of the mapped file might be the empty string if GDB was
+ unable to find the mapped file.
+
+ If no build-id can be found for FILENAME then GDB will return a pair
+ containing nullptr (for the build-id) and an empty string for the file
+ name. */
+
+std::optional <core_target_mapped_file_info>
+mapped_file_info::lookup (const char *filename,
+ const std::optional<CORE_ADDR> &addr)
+{
+ if (filename != nullptr)
+ {
+ /* If there's a matching entry in m_filename_to_build_id_map then the
+ associated build-id will not be nullptr, and can be used to
+ validate that FILENAME is correct. */
+ auto it = m_filename_to_build_id_map.find (filename);
+ if (it != m_filename_to_build_id_map.end ())
+ return make_result (it->second);
+ }
+
+ if (addr.has_value ())
+ {
+ /* On the first lookup, sort the address_to_build_id_list. */
+ if (!m_address_to_build_id_list_sorted)
+ {
+ std::sort (m_address_to_build_id_list.begin (),
+ m_address_to_build_id_list.end (),
+ [] (const mem_range_and_build_id &a,
+ const mem_range_and_build_id &b) {
+ return a.range < b.range;
+ });
+ m_address_to_build_id_list_sorted = true;
+ }
+
+ /* Look for the first entry whose range's start address is not less
+ than, or equal too, the address ADDR. If we find such an entry,
+ then the previous entry's range might contain ADDR. If it does
+ then that previous entry's build-id can be used. */
+ auto it = std::lower_bound
+ (m_address_to_build_id_list.begin (),
+ m_address_to_build_id_list.end (),
+ *addr,
+ [] (const mem_range_and_build_id &a,
+ const CORE_ADDR &b) {
+ return a.range.start <= b;
+ });
+
+ if (it != m_address_to_build_id_list.begin ())
+ {
+ --it;
+
+ if (it->range.contains (*addr))
+ return make_result (it->build_id);
+ }
+ }
+
+ if (filename != nullptr)
+ {
+ /* If the basename of FILENAME appears in m_soname_to_build_id_map
+ then when the mapped files were processed, we saw a file with a
+ DT_SONAME attribute corresponding to FILENAME, use that build-id
+ to validate FILENAME.
+
+ However, the build-id in this map might be nullptr if we saw
+ multiple mapped files with the same DT_SONAME attribute (though
+ this should be pretty rare). */
+ auto it
+ = m_soname_to_build_id_map.find (lbasename (filename));
+ if (it != m_soname_to_build_id_map.end ()
+ && it->second != nullptr)
+ return make_result (it->second);
+ }
+
+ return {};
+}
+
+/* See gdbcore.h. */
+
+std::optional <core_target_mapped_file_info>
+core_target_find_mapped_file (const char *filename,
+ std::optional<CORE_ADDR> addr)
+{
+ core_target *targ = get_current_core_target ();
+ if (targ == nullptr || current_program_space->cbfd.get () == nullptr)
+ return {};
+
+ return targ->lookup_mapped_file_info (filename, addr);
+}
+
void _initialize_corelow ();
void
_initialize_corelow ()