From 55fc1623f942fba10362cb199f9356d75ca5835b Mon Sep 17 00:00:00 2001
From: Tom Tromey
Date: Thu, 3 Nov 2022 13:49:17 -0600
Subject: Add name canonicalization for C

PR symtab/29105 shows a number of situations where symbol lookup can
result in the expansion of too many CUs.

What happens is that lookup_signed_typename will try to look up a type
like "signed int". In cooked_index_functions::expand_symtabs_matching,
when looping over languages, the C++ case will canonicalize this type
name to be "int" instead. Then this method will proceed to expand
every CU that has an entry for "int" -- i.e., nearly all of them. A
crucial component of this is that the caller, objfile::lookup_symbol,
does not do this canonicalization, so when it tries to find the symbol
for "signed int", it fails -- causing the loop to continue.

This patch fixes the problem by introducing name canonicalization for
C. The idea here is that, by making C and C++ agree on the canonical
name when a symbol name can have multiple spellings, we avoid the bad
behavior in objfile::lookup_symbol (and any other such code -- I don't
know if there is any).

Unlike C++, C only has a few situations where canonicalization is
needed. And, in particular, due to the lack of overloading (thus
avoiding any issues in linespec) and due to the way c-exp.y works, I
think that no canonicalization is needed during symbol lookup -- only
during symtab construction. This explains why lookup_name_info is not
touched.

The stabs reader is modified on a "best effort" basis.

The DWARF reader needed one small tweak in dwarf2_name to avoid a
regression in dw2-unusual-field-names.exp. I think this is adequately
explained by the comment, but basically this is a scenario that should
not occur in real code, only the gdb test suite.

lookup_signed_typename is simplified. It used to search for two
different type names, but now gdb can search just for the canonical
form.

gdb.dwarf2/enum-type.exp needed a small tweak, because the
canonicalizer turns "unsigned integer" into "unsigned int integer".
It seems better here to use the correct C type name.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29105
Tested-by: Simon Marchi
Reviewed-by: Andrew Burgess
---
 gdb/c-lang.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
 (limited to 'gdb/c-lang.c')

diff --git a/gdb/c-lang.c b/gdb/c-lang.c
index e15541f..46c0da0 100644
--- a/gdb/c-lang.c
+++ b/gdb/c-lang.c
@@ -727,6 +727,20 @@ c_is_string_type_p (struct type *type)
 
 
 
+/* See c-lang.h.  */
+
+gdb::unique_xmalloc_ptr<char>
+c_canonicalize_name (const char *name)
+{
+  if (strchr (name, ' ') != nullptr
+      || streq (name, "signed")
+      || streq (name, "unsigned"))
+    return cp_canonicalize_string (name);
+  return nullptr;
+}
+
+
+
 void
 c_language_arch_info (struct gdbarch *gdbarch,
                       struct language_arch_info *lai)
-- 
cgit v1.1
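
For readers without a gdb tree at hand, the filter in c_canonicalize_name
can be tried out with the standalone sketch below. It is not gdb code.
toy_cp_canonicalize is a hypothetical stand-in for cp_canonicalize_string
that knows only the two spellings quoted in the commit message, and
std::nullopt plays the role of the nullptr return. The sketch assumes that
nullptr means "no canonical form differing from the input".

```cpp
#include <cstring>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for gdb's cp_canonicalize_string.  It knows only
// the two spellings mentioned in the commit message; the real function
// parses the name instead of using a table.
static std::optional<std::string>
toy_cp_canonicalize (const std::string &name)
{
  static const std::map<std::string, std::string> table = {
    { "signed int", "int" },
    { "unsigned", "unsigned int" },
  };

  auto it = table.find (name);
  if (it == table.end ())
    return std::nullopt;	// Assumed: name is already canonical.
  return it->second;
}

// Same filter as the patch: only names containing a space, or the bare
// words "signed" and "unsigned", are handed to the canonicalizer.
static std::optional<std::string>
toy_c_canonicalize_name (const char *name)
{
  if (std::strchr (name, ' ') != nullptr
      || std::strcmp (name, "signed") == 0
      || std::strcmp (name, "unsigned") == 0)
    return toy_cp_canonicalize (name);
  return std::nullopt;
}
```

A single-word name such as "int" never reaches the canonicalizer, which
presumably keeps the common case cheap during symtab construction.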