gdb: remove some caching from the dwarf reader

While working on some changes to 'info sources' I ran into a situation where I was seeing the same source files reported twice in the output of the 'info sources' command when using either .gdb_index or the .debug_name index. I traced the problem back to some caching in dwarf2_base_index_functions::map_symbol_filenames; when called GDB caches the set of filenames, but, filesnames are not removed as the index entries are expanded into full symtabs. As a result we can end up seeing filenames reported both from a full symtab _and_ from a (stale) previously cached index entry. Now, obviously, when seeing a problem like this the "correct" fix is to remove the stale entries from the cache, however, I ran a few experiments to see why this wasn't really hitting us anywhere, and, as far as I can tell, ::map_symbol_filenames is only called from three places: 1. The mi command -file-list-exec-source-files, 2. The 'info sources' command, and 3. Filename completion However, the result of this "bug" is that we will see duplicate filenames, and readline's completion mechanism already removes duplicates, so for case #3 we will never see any problems. Cases #1 and #2 are basically the same, and in each case, to see a problem we need to ensure we craft the test in a particular way, start up ensuring we have some unexpected symtabs, then run one of the commands to populate the cache, then expand one of the symtabs, and list the sources again. At this point you'll see duplicate entries in the results. Hardly surprising we haven't randomly hit this situation in testing. So, considering that use cases #1 and #2 are certainly not "high performance" code (i.e. I don't think these justify the need for caching) this leaves use case #3. Does this use justify the need for caching? Well the psymbol_functions::map_symbol_filenames function doesn't seem to do any extra caching, and within dwarf2_base_index_functions::map_symbol_filenames, the only expensive bit appears to be the call to dw2_get_file_names, and this already does its own caching via this_cu->v.quick->file_names. The upshot of all this analysis was that I'm not convinced the need for the additional caching is justified, and so, I propose that to fix the bug in GDB, I just remove the extra caching (for now). If we later find that the caching _was_ useful, then we can reintroduce it, but add it back such that it doesn't reintroduce this bug. As I was changing dwarf2_base_index_functions::map_symbol_filenames I replaced the use of htab_up with std::unordered_set. Tested using target_boards cc-with-debug-names and dwarf4-gdb-index. gdb/ChangeLog: * dwarf2/read.c: Add 'unordered_set' include. (dwarf2_base_index_functions::map_symbol_filenames): Replace 'visited' hash table with 'qfn_cache' unordered_set. Remove use of per_Bfd->filenames_cache cache, and use function local filenames_cache instead. Reindent. * dwarf2/read.h (struct dwarf2_per_bfd) <filenames_cache>: Delete. gdb/testsuite/ChangeLog: * gdb.base/info_sources.exp: Add new tests.
author: Andrew Burgess <andrew.burgess@embecosm.com> 2021-04-19 13:14:41 +0100
committer: Andrew Burgess <andrew.burgess@embecosm.com> 2021-04-23 09:40:42 +0100
commit: 9fc299558896b4ff19c45a3e62459851e4d96cb9 (patch)
tree: 0af699b88a9261fbec735e25b2e59038f7fa36a9 /gdb/dwarf2
parent: c21346c5e224ace15a3ba8b7f56cd375e30e29a8 (diff)
download: gdb-9fc299558896b4ff19c45a3e62459851e4d96cb9.zip
gdb-9fc299558896b4ff19c45a3e62459851e4d96cb9.tar.gz
gdb-9fc299558896b4ff19c45a3e62459851e4d96cb9.tar.bz2
2 files changed, 29 insertions, 46 deletions
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 40aeb29..306971e 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -88,6 +88,7 @@
 #include "rust-lang.h"
 #include "gdbsupport/pathstuff.h"
 #include "count-one-bits.h"
+#include <unordered_set>
 
 /* When == 1, print basic high level tracing messages.
    When > 1, be more verbose.
@@ -4713,58 +4714,44 @@ dwarf2_base_index_functions::map_symbol_filenames
 {
   dwarf2_per_objfile *per_objfile = get_dwarf2_per_objfile (objfile);
 
-  if (!per_objfile->per_bfd->filenames_cache)
-    {
-      per_objfile->per_bfd->filenames_cache.emplace ();
-
-      htab_up visited (htab_create_alloc (10,
-					  htab_hash_pointer, htab_eq_pointer,
-					  NULL, xcalloc, xfree));
+  /* Use caches to ensure we only call FUN once for each filename.  */
+  filename_seen_cache filenames_cache;
+  std::unordered_set<quick_file_names *> qfn_cache;
 
-      /* The rule is CUs specify all the files, including those used
-	 by any TU, so there's no need to scan TUs here.  We can
-	 ignore file names coming from already-expanded CUs.  */
+  /* The rule is CUs specify all the files, including those used by any TU,
+     so there's no need to scan TUs here.  We can ignore file names coming
+     from already-expanded CUs.  It is possible that an expanded CU might
+     reuse the file names data from a currently unexpanded CU, in this
+     case we don't want to report the files from the unexpanded CU.  */
 
-      for (dwarf2_per_cu_data *per_cu : per_objfile->per_bfd->all_comp_units)
+  for (dwarf2_per_cu_data *per_cu : per_objfile->per_bfd->all_comp_units)
+    {
+      if (per_objfile->symtab_set_p (per_cu))
 	{
-	  if (per_objfile->symtab_set_p (per_cu))
-	    {
-	      void **slot = htab_find_slot (visited.get (),
-					    per_cu->v.quick->file_names,
-					    INSERT);
-
-	      *slot = per_cu->v.quick->file_names;
-	    }
+	  if (per_cu->v.quick->file_names != nullptr)
+	    qfn_cache.insert (per_cu->v.quick->file_names);
 	}
+    }
 
-      for (dwarf2_per_cu_data *per_cu : per_objfile->per_bfd->all_comp_units)
-	{
-	  /* We only need to look at symtabs not already expanded.  */
-	  if (per_objfile->symtab_set_p (per_cu))
-	    continue;
-
-	  quick_file_names *file_data
-	    = dw2_get_file_names (per_cu, per_objfile);
-	  if (file_data == NULL)
-	    continue;
+  for (dwarf2_per_cu_data *per_cu : per_objfile->per_bfd->all_comp_units)
+    {
+      /* We only need to look at symtabs not already expanded.  */
+      if (per_objfile->symtab_set_p (per_cu))
+	continue;
 
-	  void **slot = htab_find_slot (visited.get (), file_data, INSERT);
-	  if (*slot)
-	    {
-	      /* Already visited.  */
-	      continue;
-	    }
-	  *slot = file_data;
+      quick_file_names *file_data = dw2_get_file_names (per_cu, per_objfile);
+      if (file_data == nullptr
+	  || qfn_cache.find (file_data) != qfn_cache.end ())
+	continue;
 
-	  for (int j = 0; j < file_data->num_file_names; ++j)
-	    {
-	      const char *filename = file_data->file_names[j];
-	      per_objfile->per_bfd->filenames_cache->seen (filename);
-	    }
+      for (int j = 0; j < file_data->num_file_names; ++j)
+	{
+	  const char *filename = file_data->file_names[j];
+	  filenames_cache.seen (filename);
 	}
     }
 
-  per_objfile->per_bfd->filenames_cache->traverse ([&] (const char *filename)
+  filenames_cache.traverse ([&] (const char *filename)
     {
       gdb::unique_xmalloc_ptr<char> this_real_name;
 
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index 4dea86f..dc25ff1 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -235,10 +235,6 @@ public:
   /* The CUs we recently read.  */
   std::vector<dwarf2_per_cu_data *> just_read_cus;
 
-  /* Table containing all filenames.  This is an optional because the
-     table is lazily constructed on first access.  */
-  gdb::optional<filename_seen_cache> filenames_cache;
-
   /* If we loaded the index from an external file, this contains the
      resources associated to the open file, memory mapping, etc.  */
   std::unique_ptr<index_cache_resource> index_cache_res;
author	Andrew Burgess <andrew.burgess@embecosm.com>	2021-04-19 13:14:41 +0100
committer	Andrew Burgess <andrew.burgess@embecosm.com>	2021-04-23 09:40:42 +0100
commit	9fc299558896b4ff19c45a3e62459851e4d96cb9 (patch)
tree	0af699b88a9261fbec735e25b2e59038f7fa36a9 /gdb/dwarf2
parent	c21346c5e224ace15a3ba8b7f56cd375e30e29a8 (diff)
download	gdb-9fc299558896b4ff19c45a3e62459851e4d96cb9.zip gdb-9fc299558896b4ff19c45a3e62459851e4d96cb9.tar.gz gdb-9fc299558896b4ff19c45a3e62459851e4d96cb9.tar.bz2