Per-language symbol name hashing algorithm

Currently, we have a mess of symbol name hashing/comparison routines. There's msymbol_hash for mangled names, and dict_hash and msymbol_hash_iw for demangled names. Then there's strcmp_iw, strcmp_iw_ordered and Ada's full_match/wild_match, which all have to agree with the hashing routines. That's why dict_hash is really about Ada names. From the inconsistency department, minimal symbol hashing doesn't go via dict_hash, so Ada's wild matching can't ever work with minimal symbols. This patch starts fixing this, by doing two things: #1 - adds a language vector method to let each language decide how to compute a symbol name hash. #2 - makes dictionaries know the language of the symbols they hold, and then use the dictionaries language to decide which hashing method to use. For now, this is just scaffolding, since all languages install the default method. The series will make C++ install its own hashing method later on, and will add per-language symbol name comparison routines too. This patch was originally based on a patch that Keith wrote for the libcc1/C++ WIP support. gdb/ChangeLog: 2017-11-08 Keith Seitz <keiths@redhat.com> Pedro Alves <palves@redhat.com> * ada-lang.c (ada_language_defn): Install default_search_name_hash. * buildsym.c (struct buildsym_compunit): <language>: New field. (finish_block_internal): Pass language when creating dictionaries. (start_buildsym_compunit, start_symtab): New language parameters. Use them. (restart_symtab): Pass down compilation unit's language. * buildsym.h (enum language): Forward declare. (start_symtab): New 'language' parameter. * c-lang.c (c_language_defn, cplus_language_defn) (asm_language_defn, minimal_language_defn): Install default_search_name_hash. * coffread.c (coff_start_symtab): Adjust. * d-lang.c (d_language_defn): Install default_search_name_hash. * dbxread.c (struct symloc): Add 'pst_language' field. (PST_LANGUAGE): Define. (start_psymtab, read_ofile_symtab): Use it. (process_one_symbol): New 'language' parameter. Pass it down. * dictionary.c (struct dictionary) <language>: New field. (DICT_LANGUAGE): Define. (dict_create_hashed, dict_create_hashed_expandable) (dict_create_linear, dict_create_linear_expandable): New parameter 'language'. Set the dictionary's language. (iter_match_first_hashed): Adjust to rename. (insert_symbol_hashed): Assert we don't see mismatching languages. Adjust to rename. (dict_hash): Rename to ... (default_search_name_hash): ... this and make extern. * dictionary.h (struct language_defn): Forward declare. (dict_create_hashed): New parameter 'language'. * dwarf2read.c (dwarf2_start_symtab): Pass down language. * f-lang.c (f_language_defn): Install default_search_name_hash. * go-lang.c (go_language_defn): Install default_search_name_hash. * jit.c (finalize_symtab): Pass compunit's language to dictionary creation. * language.c (unknown_language_defn, auto_language_defn): * language.h (language_defn::la_search_name_hash): New field. (default_search_name_hash): Declare. * m2-lang.c (m2_language_defn): Install default_search_name_hash. * mdebugread.c (new_block): New parameter 'language'. * mdebugread.c (parse_symbol): Pass symbol language to block allocation. (psymtab_to_symtab_1): Pass down language. (new_symtab): Pass compunit's language to block allocation. * objc-lang.c (objc_language_defn): Install default_search_name_hash. * opencl-lang.c (opencl_language_defn): * p-lang.c (pascal_language_defn): Install default_search_name_hash. * rust-lang.c (rust_language_defn): Install default_search_name_hash. * stabsread.h (enum language): Forward declare. (process_one_symbol): Add 'language' parameter. * symtab.c (search_name_hash): New function. * symtab.h (search_name_hash): Declare. * xcoffread.c (read_xcoff_symtab): Pass language to start_symtab.
author: Pedro Alves <palves@redhat.com> 2017-11-08 15:07:56 +0000
committer: Pedro Alves <palves@redhat.com> 2017-11-08 16:02:24 +0000
commit: 5ffa0793690b42b2a0c1c21dbb5e64634e58fa00 (patch)
tree: 08f6c4540b28cefd8f8ae67386c9af8ed455c0a0 /gdb/dictionary.c
parent: 2a1dde5da23779fd0d269c3e51c7814aba9001dd (diff)
download: gdb-5ffa0793690b42b2a0c1c21dbb5e64634e58fa00.zip
gdb-5ffa0793690b42b2a0c1c21dbb5e64634e58fa00.tar.gz
gdb-5ffa0793690b42b2a0c1c21dbb5e64634e58fa00.tar.bz2
1 files changed, 27 insertions, 28 deletions
diff --git a/gdb/dictionary.c b/gdb/dictionary.c
index b2cfca2..1ffa4f3 100644
--- a/gdb/dictionary.c
+++ b/gdb/dictionary.c
@@ -165,6 +165,7 @@ struct dictionary_linear_expandable
 
 struct dictionary
 {
+  const struct language_defn *language;
   const struct dict_vector *vector;
   union
   {
@@ -179,6 +180,7 @@ struct dictionary
 /* Accessor macros.  */
 
 #define DICT_VECTOR(d)			(d)->vector
+#define DICT_LANGUAGE(d)                (d)->language
 
 /* These can be used for DICT_HASHED_EXPANDABLE, too.  */
 
@@ -245,8 +247,6 @@ static struct symbol *iter_match_next_hashed (const char *name,
 					      symbol_compare_ftype *compare,
 					      struct dict_iterator *iterator);
 
-static unsigned int dict_hash (const char *string);
-
 /* Functions only for DICT_HASHED.  */
 
 static int size_hashed (const struct dictionary *dict);
@@ -348,12 +348,11 @@ static void expand_hashtable (struct dictionary *dict);
 
 /* The creation functions.  */
 
-/* Create a dictionary implemented via a fixed-size hashtable.  All
-   memory it uses is allocated on OBSTACK; the environment is
-   initialized from SYMBOL_LIST.  */
+/* See dictionary.h.  */
 
 struct dictionary *
 dict_create_hashed (struct obstack *obstack,
+		    enum language language,
 		    const struct pending *symbol_list)
 {
   struct dictionary *retval;
@@ -363,6 +362,7 @@ dict_create_hashed (struct obstack *obstack,
 
   retval = XOBNEW (obstack, struct dictionary);
   DICT_VECTOR (retval) = &dict_hashed_vector;
+  DICT_LANGUAGE (retval) = language_def (language);
 
   /* Calculate the number of symbols, and allocate space for them.  */
   for (list_counter = symbol_list;
@@ -391,17 +391,15 @@ dict_create_hashed (struct obstack *obstack,
   return retval;
 }
 
-/* Create a dictionary implemented via a hashtable that grows as
-   necessary.  The dictionary is initially empty; to add symbols to
-   it, call dict_add_symbol().  Call dict_free() when you're done with
-   it.  */
+/* See dictionary.h.  */
 
 extern struct dictionary *
-dict_create_hashed_expandable (void)
+dict_create_hashed_expandable (enum language language)
 {
   struct dictionary *retval = XNEW (struct dictionary);
 
   DICT_VECTOR (retval) = &dict_hashed_expandable_vector;
+  DICT_LANGUAGE (retval) = language_def (language);
   DICT_HASHED_NBUCKETS (retval) = DICT_EXPANDABLE_INITIAL_CAPACITY;
   DICT_HASHED_BUCKETS (retval) = XCNEWVEC (struct symbol *,
 					   DICT_EXPANDABLE_INITIAL_CAPACITY);
@@ -410,13 +408,11 @@ dict_create_hashed_expandable (void)
   return retval;
 }
 
-/* Create a dictionary implemented via a fixed-size array.  All memory
-   it uses is allocated on OBSTACK; the environment is initialized
-   from the SYMBOL_LIST.  The symbols are ordered in the same order
-   that they're found in SYMBOL_LIST.  */
+/* See dictionary.h.  */
 
 struct dictionary *
 dict_create_linear (struct obstack *obstack,
+		    enum language language,
 		    const struct pending *symbol_list)
 {
   struct dictionary *retval;
@@ -426,6 +422,7 @@ dict_create_linear (struct obstack *obstack,
 
   retval = XOBNEW (obstack, struct dictionary);
   DICT_VECTOR (retval) = &dict_linear_vector;
+  DICT_LANGUAGE (retval) = language_def (language);
 
   /* Calculate the number of symbols, and allocate space for them.  */
   for (list_counter = symbol_list;
@@ -455,17 +452,15 @@ dict_create_linear (struct obstack *obstack,
   return retval;
 }
 
-/* Create a dictionary implemented via an array that grows as
-   necessary.  The dictionary is initially empty; to add symbols to
-   it, call dict_add_symbol().  Call dict_free() when you're done with
-   it.  */
+/* See dictionary.h.  */
 
 struct dictionary *
-dict_create_linear_expandable (void)
+dict_create_linear_expandable (enum language language)
 {
   struct dictionary *retval = XNEW (struct dictionary);
 
   DICT_VECTOR (retval) = &dict_linear_expandable_vector;
+  DICT_LANGUAGE (retval) = language_def (language);
   DICT_LINEAR_NSYMS (retval) = 0;
   DICT_LINEAR_EXPANDABLE_CAPACITY (retval) = DICT_EXPANDABLE_INITIAL_CAPACITY;
   DICT_LINEAR_SYMS (retval)
@@ -638,7 +633,9 @@ iter_match_first_hashed (const struct dictionary *dict, const char *name,
 			 symbol_compare_ftype *compare,
 			 struct dict_iterator *iterator)
 {
-  unsigned int hash_index = dict_hash (name) % DICT_HASHED_NBUCKETS (dict);
+  unsigned int hash_index
+    = (search_name_hash (DICT_LANGUAGE (dict)->la_language, name)
+       % DICT_HASHED_NBUCKETS (dict));
   struct symbol *sym;
 
   DICT_ITERATOR_DICT (iterator) = dict;
@@ -689,10 +686,15 @@ insert_symbol_hashed (struct dictionary *dict,
 		      struct symbol *sym)
 {
   unsigned int hash_index;
+  unsigned int hash;
   struct symbol **buckets = DICT_HASHED_BUCKETS (dict);
 
-  hash_index = 
-    dict_hash (SYMBOL_SEARCH_NAME (sym)) % DICT_HASHED_NBUCKETS (dict);
+  /* We don't want to insert a symbol into a dictionary of a different
+     language.  The two may not use the same hashing algorithm.  */
+  gdb_assert (SYMBOL_LANGUAGE (sym) == DICT_LANGUAGE (dict)->la_language);
+
+  hash = search_name_hash (SYMBOL_LANGUAGE (sym), SYMBOL_SEARCH_NAME (sym));
+  hash_index = hash % DICT_HASHED_NBUCKETS (dict);
   sym->hash_next = buckets[hash_index];
   buckets[hash_index] = sym;
 }
@@ -765,13 +767,10 @@ expand_hashtable (struct dictionary *dict)
   xfree (old_buckets);
 }
 
-/* Produce an unsigned hash value from STRING0 that is consistent
-   with strcmp_iw, strcmp, and, at least on Ada symbols, wild_match.
-   That is, two identifiers equivalent according to any of those three
-   comparison operators hash to the same value.  */
+/* See dictionary.h.  */
 
-static unsigned int
-dict_hash (const char *string0)
+unsigned int
+default_search_name_hash (const char *string0)
 {
   /* The Ada-encoded version of a name P1.P2...Pn has either the form
      P1__P2__...Pn<suffix> or _ada_P1__P2__...Pn<suffix> (where the Pi
author	Pedro Alves <palves@redhat.com>	2017-11-08 15:07:56 +0000
committer	Pedro Alves <palves@redhat.com>	2017-11-08 16:02:24 +0000
commit	5ffa0793690b42b2a0c1c21dbb5e64634e58fa00 (patch)
tree	08f6c4540b28cefd8f8ae67386c9af8ed455c0a0 /gdb/dictionary.c
parent	2a1dde5da23779fd0d269c3e51c7814aba9001dd (diff)
download	gdb-5ffa0793690b42b2a0c1c21dbb5e64634e58fa00.zip gdb-5ffa0793690b42b2a0c1c21dbb5e64634e58fa00.tar.gz gdb-5ffa0793690b42b2a0c1c21dbb5e64634e58fa00.tar.bz2