aboutsummaryrefslogtreecommitdiff
path: root/libctf/libctf.ver
AgeCommit message (Collapse)AuthorFilesLines
2021-09-27libctf: try several possibilities for linker versioning flagsNick Alcock1-6/+4
Checking for linker versioning by just grepping ld --help output for mentions of --version-script is inadequate now that Solaris 11.4 implements a --version-script with different semantics. Try linking a test program with a small wildcard-using version script with each supported set of flags in turn, to make sure that linker versioning is not only advertised but actually works. The Solaris "GNU-compatible" linker versioning is not quite GNU-compatible enough, but we can work around the differences by generating a new version script that removes the comments from the original (Solaris ld requires #-style comments), and making another version script for libctf-nonbfd in particular which doesn't mention any of the symbols that appear in libctf.la, to avoid Solaris ld introducing corresponding new NOTYPE symbols to match the version script. libctf/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> PR libctf/27967 * configure.ac (VERSION_FLAGS): Replace with... (ac_cv_libctf_version_script): ... this multiple test. (VERSION_FLAGS_NOBFD): Substitute this too. * Makefile.am (libctf_nobfd_la_LDFLAGS): Use it. Split out... (libctf_ldflags_nover): ... non-versioning flags here. (libctf_la_LDFLAGS): Use it. * libctf.ver: Give every symbol not in libctf-nobfd a comment on the same line noting as much.
2021-05-06libctf, include: support an alternative encoding for nonrepresentable typesNick Alcock1-0/+1
Before now, types that could not be encoded in CTF were represented as references to type ID 0, which does not itself appear in the dictionary. This choice is annoying in several ways, principally that it forces generators and consumers of CTF to grow special cases for types that are referenced in valid dicts but don't appear. Allow an alternative representation (which will become the only representation in format v4) whereby nonrepresentable types are encoded as actual types with kind CTF_K_UNKNOWN (an already-existing kind theoretically but not in practice used for padding, with value 0). This is backward-compatible, because CTF_K_UNKNOWN was not used anywhere before now: it was used in old-format function symtypetabs, but these were never emitted by any compiler and the code to handle them in libctf likely never worked and was removed last year, in favour of new-format symtypetabs that contain only type IDs, not type kinds. In order to link this type, we need an API addition to let us add types of unknown kind to the dict: we let them optionally have names so that GCC can emit many different unknown types and those types with identical names will be deduplicated together. There are also small tweaks to the deduplicator to actually dedup such types, to let opening of dicts with unknown types with names work, to return the ECTF_NONREPRESENTABLE error on resolution of such types (like ID 0), and to print their names as something useful but not a valid C identifier, mostly for the sake of the dumper. Tests added in the next commit. include/ChangeLog 2021-05-06 Nick Alcock <nick.alcock@oracle.com> * ctf.h (CTF_K_UNKNOWN): Document that it can be used for nonrepresentable types, not just padding. * ctf-api.h (ctf_add_unknown): New. libctf/ChangeLog 2021-05-06 Nick Alcock <nick.alcock@oracle.com> * ctf-open.c (init_types): Unknown types may have names. * ctf-types.c (ctf_type_resolve): CTF_K_UNKNOWN is as non-representable as type ID 0. (ctf_type_aname): Print unknown types. * ctf-dedup.c (ctf_dedup_hash_type): Do not early-exit for CTF_K_UNKNOWN types: they have real hash values now. (ctf_dedup_rwalk_one_output_mapping): Treat CTF_K_UNKNOWN types like other types with no referents: call the callback and do not skip them. (ctf_dedup_emit_type): Emit via... * ctf-create.c (ctf_add_unknown): ... this new function. * libctf.ver (LIBCTF_1.2): Add it.
2021-02-20libctf, include: find types of symbols by nameNick Alcock1-0/+6
The existing ctf_lookup_by_symbol and ctf_arc_lookup_symbol functions suffice to look up the types of symbols if the caller already has a symbol number. But the caller often doesn't have one of those and only knows the name of the symbol: also, in object files, the caller might not have a useful symbol number in any sense (and neither does libctf: the 'symbol number' we use in that case literally starts at 0 for the lexicographically first-sorted symbol in the symtypetab and counts those symbols, so it corresponds to nothing useful). This means that even though object files have a symtypetab (generated by the compiler or by ld -r), the only way we can look up anything in it is to iterate over all symbols in turn with ctf_symbol_next until we find the one we want. This is unhelpful and pointlessly inefficient. So add a pair of functions to look up symbols by name in a dict and in a whole archive: ctf_lookup_by_symbol_name and ctf_arc_lookup_symbol_name. These are identical to the existing functions except that they take symbol names rather than symbol numbers. To avoid insane repetition, we do some refactoring in the process, so that both ctf_lookup_by_symbol and ctf_arc_lookup_symbol turn into thin wrappers around internal functions that do both lookup by symbol index and lookup by name. This massively reduces code duplication because even the existing lookup-by-index stuff wants to use a name sometimes (when looking up in indexed sections), and the new lookup-by-name stuff has to turn it into an index sometimes (when looking up in non-indexed sections): doing it this way lets us share most of that. The actual name->index lookup is done by ctf_lookup_symbol_idx. We do not anticipate this lookup to be as heavily used as ld.so symbol lookup by many orders of magnitude, so using the ELF symbol hashes would probably take more time to read them than is saved by using the hashes, and it adds a lot of complexity. Instead, do a linear search for the symbol name, caching all the name -> index mappings as we go, so that future searches are likely to hit in the cache. To avoid having to repeat this search over and over in a CTF archive when ctf_arc_lookup_symbol_name is used, have cached archive lookups (the sort done by ctf_arc_lookup_symbol* and the ctf_archive_next iterator) pick out the first dict they cache in a given archive and store it in a new ctf_archive field, ctfi_crossdict_cache. This can be used to store cross-dictionary cached state that depends on things like the ELF symbol table rather than the contents of any one dict. ctf_lookup_symbol_idx then caches its name->index mappings in the dictionary named in the crossdict cache, if any, so that ctf_lookup_symbol_idx in other dicts in the same archive benefit from the previous linear search, and the symtab only needs to be scanned at most once. (Note that if you call ctf_lookup_by_symbol_name in one specific dict, and then follow it with a ctf_arc_lookup_symbol_name, the former will not use the crossdict cache because it's only populated by the dict opens in ctf_arc_lookup_symbol_name. This is harmless except for a small one-off waste of memory and time: it's only a cache, after all. We can fix this later by using the archive caching machinery more aggressively.) In ctf-archive, we do similar things, turning ctf_arc_lookup_symbol into a wrapper around a new function that does both index -> ID and name -> ID lookups across all dicts in an archive. We add a new ctfi_symnamedicts cache that maps symbol names to the ctf_dict_t * that it was found in (so that linear searches for symbols don't need to be repeated): but we also *remove* a cache, the ctfi_syms cache that was memoizing the actual ctf_id_t returned from every call to ctf_arc_lookup_symbol. This is pointless: all it saves is one call to ctf_lookup_by_symbol, and that's basically an array lookup and nothing more so isn't worth caching. (Equally, given that symbol -> index mappings are cached by ctf_lookup_by_symbol_name, those calls are nearly free after the first call, so there's no point caching the ctf_id_t in that case either.) We fix up one test that was doing manual symbol lookup to use ctf_arc_lookup_symbol instead, and enhance it to check that the caching layer is not totally broken: we also add a new test to do lookups in a .o file, and another to do lookups in an archive with conflicted types and make sure that sort of multi-dict lookup is actually working. include/ChangeLog 2021-02-17 Nick Alcock <nick.alcock@oracle.com> * ctf-api.h (ctf_arc_lookup_symbol_name): New. (ctf_lookup_by_symbol_name): Likewise. libctf/ChangeLog 2021-02-17 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dict_t) <ctf_symhash>: New. <ctf_symhash_latest>: Likewise. (struct ctf_archive_internal) <ctfi_crossdict_cache>: New. <ctfi_symnamedicts>: New. <ctfi_syms>: Remove. (ctf_lookup_symbol_name): Remove. * ctf-lookup.c (ctf_lookup_symbol_name): Propagate errors from parent properly. Make static. (ctf_lookup_symbol_idx): New, linear search for the symbol name, cached in the crossdict cache's ctf_symhash (if available), or this dict's (otherwise). (ctf_try_lookup_indexed): Allow the symname to be passed in. (ctf_lookup_by_symbol): Turn into a wrapper around... (ctf_lookup_by_sym_or_name): ... this, supporting name lookup too, using ctf_lookup_symbol_idx in non-writable dicts. Special-case name lookup in dynamic dicts without reported symbols, which have no symtab or dynsymidx but where name lookup should still work. (ctf_lookup_by_symbol_name): New, another wrapper. * ctf-archive.c (enosym): Note that this is present in ctfi_symnamedicts too. (ctf_arc_close): Adjust for removal of ctfi_syms. Free the ctfi_symnamedicts. (ctf_arc_flush_caches): Likewise. (ctf_dict_open_cached): Memoize the first cached dict in the crossdict cache. (ctf_arc_lookup_symbol): Turn into a wrapper around... (ctf_arc_lookup_sym_or_name): ... this. No longer cache ctf_id_t lookups: just call ctf_lookup_by_symbol as needed (but still cache the dicts those lookups succeed in). Add lookup-by-name support, with dicts of successful lookups cached in ctfi_symnamedicts. Refactor the caching code a bit. (ctf_arc_lookup_symbol_name): New, another wrapper. * ctf-open.c (ctf_dict_close): Free the ctf_symhash. * libctf.ver (LIBCTF_1.2): New version. Add ctf_lookup_by_symbol_name, ctf_arc_lookup_symbol_name. * testsuite/libctf-lookup/enum-symbol.c (main): Use ctf_arc_lookup_symbol rather than looking up the name ourselves. Fish it out repeatedly, to make sure that symbol caching isn't broken. (symidx_64): Remove. (symidx_32): Remove. * testsuite/libctf-lookup/enum-symbol-obj.lk: Test symbol lookup in an unlinked object file (indexed symtypetab sections only). * testsuite/libctf-writable/symtypetab-nonlinker-writeout.c (try_maybe_reporting): Check symbol types via ctf_lookup_by_symbol_name as well as ctf_symbol_next. * testsuite/libctf-lookup/conflicting-type-syms.*: New test of lookups in a multi-dict archive.
2021-01-01Update year range in copyright notice of binutils filesAlan Modra1-1/+1
2020-11-25libctf, include: support foreign-endianness symtabs with CTFNick Alcock1-0/+2
The CTF symbol lookup machinery added recently has one deficit: it assumes the symtab is in the machine's native endianness. This is always true when the linker is writing out symtabs (because cross linkers byteswap symbols only after libctf has been called on them), but may be untrue in the cross case when the linker or another tool (objdump, etc) is reading them. Unfortunately the easy way to model this to the caller, as an endianness field in the ctf_sect_t, is precluded because doing so would change the size of the ctf_sect_t, which would be an ABI break. So, instead, allow the endianness of the symtab to be set after open time, by calling one of the two new API functions ctf_symsect_endianness (for ctf_dict_t's) or ctf_arc_symsect_endianness (for entire ctf_archive_t's). libctf calls these functions automatically for objects opened via any of the BFD-aware mechanisms (ctf_bfdopen, ctf_bfdopen_ctfsect, ctf_fdopen, ctf_open, or ctf_arc_open), but the various mechanisms that just take raw ctf_sect_t's will assume the symtab is in native endianness and need a later call to ctf_*symsect_endianness to adjust it if needed. (This call is basically free if the endianness is actually native: it only costs anything if the symtab endianness was previously guessed wrong, and there is a symtab, and we are using it directly rather than using symtab indexing.) Obviously, calling ctf_lookup_by_symbol or ctf_symbol_next before the symtab endianness is correctly set will probably give wrong answers -- but you can set it at any time as long as it is before then. include/ChangeLog 2020-11-23 Nick Alcock <nick.alcock@oracle.com> * ctf-api.h: Style nit: remove () on function names in comments. (ctf_sect_t): Mention endianness concerns. (ctf_symsect_endianness): New declaration. (ctf_arc_symsect_endianness): Likewise. libctf/ChangeLog 2020-11-23 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dict_t) <ctf_symtab_little_endian>: New. (struct ctf_archive_internal) <ctfi_symsect_little_endian>: Likewise. * ctf-create.c (ctf_serialize): Adjust for new field. * ctf-open.c (init_symtab): Note the semantics of repeated calls. (ctf_symsect_endianness): New. (ctf_bufopen_internal): Set ctf_symtab_little_endian suitably for the native endianness. (_Static_assert): Moved... (swap_thing): ... with this... * swap.h: ... to here. * ctf-util.c (ctf_elf32_to_link_sym): Use it, byteswapping the Elf32_Sym if the ctf_symtab_little_endian demands it. (ctf_elf64_to_link_sym): Likewise swap the Elf64_Sym if needed. * ctf-archive.c (ctf_arc_symsect_endianness): New, set the endianness of the symtab used by the dicts in an archive. (ctf_archive_iter_internal): Initialize to unknown (assumed native, do not call ctf_symsect_endianness). (ctf_dict_open_by_offset): Call ctf_symsect_endianness if need be. (ctf_dict_open_internal): Propagate the endianness down. (ctf_dict_open_sections): Likewise. * ctf-open-bfd.c (ctf_bfdopen_ctfsect): Get the endianness from the struct bfd and pass it down to the archive. * libctf.ver: Add ctf_symsect_endianness and ctf_arc_symsect_endianness.
2020-11-20libctf, include: add ctf_getsymsect and ctf_getstrsectNick Alcock1-0/+3
libctf has long provided ctf_getdatasect, which hands back a pointer to the CTF section a (read-only) dict came from. But it has no such functions to return pointers to the ELF symbol table or string table it's working from, which is unfortunate because several libctf functions (ctf_open, ctf_fdopen, and ctf_bfdopen) figure out which string and symbol table to use themselves, and don't tell the user what they decided, so the caller can't agree on which symtab to use with libctf even if it wanted to. Add a pair of functions to return the symtab and strtab in use. Like ctf_getdatasect, these return ctf_sect_t structures by value, filled with all-NULL/0 content if a symtab or strtab is not being used. include/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctf-api.h (ctf_getsymsect): New. (ctf_getstrsect): Likewise. libctf/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctf-open.c (ctf_getsymsect): New. (ctf_getstrsect): Likewise. * libctf.ver: Add them.
2020-11-20libctf, include: CTF-archive-wide symbol lookupNick Alcock1-0/+3
CTF archives may contain multiple dicts, each of which contain many types and possibly a bunch of symtypetab entries relating to those types: each symtypetab entry is going to appear in exactly one dict, with the corresponding entries in the other dicts empty (either pads, or indexed symtypetabs that do not mention that symbol). But users of libctf usually want to get back the type associated with a symbol without having to dig around to find out which dict that type might be in. This adds machinery to do that -- and since you probably want to do it repeatedly, it adds internal caching to the ctf-archive machinery so that iteration over archives via ctf_archive_next and repeated symbol lookups do not have to repeatedly reopen the archive. (Iteration using ctf_archive_iter will gain caching soon.) Two new API functions: ctf_dict_t * ctf_arc_lookup_symbol (ctf_archive_t *arc, unsigned long symidx, ctf_id_t *typep, int *errp); This looks up the symbol with index SYMIDX in the archive ARC, returning the dictionary in which it resides and optionally the type index as well. Errors are returned in ERRP. The dict should be ctf_dict_close()d when done, but is also cached inside the ctf_archive so that the open cost is only paid once. The result of the symbol lookup is also cached internally, so repeated lookups of the same symbol are nearly free. void ctf_arc_flush_caches (ctf_archive_t *arc); Flush all the caches. Done at close time, but also available as an API function if users want to do it by hand. include/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctf-api.h (ctf_arc_lookup_symbol): New. (ctf_arc_flush_caches): Likewise. * ctf.h: Document new auto-ctf_import behaviour. libctf/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (struct ctf_archive_internal) <ctfi_dicts>: New, dicts the archive machinery has opened and cached. <ctfi_symdicts>: New, cache of dicts containing symbols looked up. <ctfi_syms>: New, cache of types of symbols looked up. * ctf-archive.c (ctf_arc_close): Free them on close. (enosym): New, flag entry for 'symbol not present'. (ctf_arc_import_parent): New, automatically import the parent from ".ctf" if this is a child in an archive and ".ctf" is present. (ctf_dict_open_sections): Use it. (ctf_archive_iter_internal): Likewise. (ctf_cached_dict_close): New, thunk around ctf_dict_close. (ctf_dict_open_cached): New, open and cache a dict. (ctf_arc_flush_caches): New, flush the caches. (ctf_arc_lookup_symbol): New, look up a symbol in (all members of) an archive, and cache the lookup. (ctf_archive_iter): Note the new caching behaviour. (ctf_archive_next): Use ctf_dict_open_cached. * libctf.ver: Add ctf_arc_lookup_symbol and ctf_arc_flush_caches.
2020-11-20libctf: symbol type linking supportNick Alcock1-0/+4
This adds facilities to write out the function info and data object sections, which efficiently map from entries in the symbol table to types. The write-side code is entirely new: the read-side code was merely significantly changed and support for indexed tables added (pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff header fields). With this in place, you can use ctf_lookup_by_symbol to look up the types of symbols of function and object type (and, as before, you can use ctf_lookup_variable to look up types of file-scope variables not present in the symbol table, as long as you know their name: but variables that are also data objects are now found in the data object section instead.) (Compatible) file format change: The CTF spec has always said that the function info section looks much like the CTF_K_FUNCTIONs in the type section: an info word (including an argument count) followed by a return type and N argument types. This format is suboptimal: it means function symbols cannot be deduplicated and it causes a lot of ugly code duplication in libctf. But conveniently the compiler has never emitted this! Because it has always emitted a rather different format that libctf has never accepted, we can be sure that there are no instances of this function info section in the wild, and can freely change its format without compatibility concerns or a file format version bump. (And since it has never been emitted in any code that generated any older file format version, either, we need keep no code to read the format as specified at all!) So the function info section is now specified as an array of uint32_t, exactly like the object data section: each entry is a type ID in the type section which must be of kind CTF_K_FUNCTION, the prototype of this function. This allows function types to be deduplicated and also correctly encodes the fact that all functions declared in C really are types available to the program: so they should be stored in the type section like all other types. (In format v4, we will be able to represent the types of static functions as well, but that really does require a file format change.) We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the new function info format is in use. A sufficiently new compiler will always set this flag. New libctf will always set this flag: old libctf will refuse to open any CTF dicts that have this flag set. If the flag is not set on a dict being read in, new libctf will disregard the function info section. Format v4 will remove this flag (or, rather, the flag has no meaning there and the bit position may be recycled for some other purpose). New API: Symbol addition: ctf_add_func_sym: Add a symbol with a given name and type. The type must be of kind CTF_K_FUNCTION (a function pointer). Internally this adds a name -> type mapping to the ctf_funchash in the ctf_dict. ctf_add_objt_sym: Add a symbol with a given name and type. The type kind can be anything, including function pointers. This adds to ctf_objthash. These both treat symbols as name -> type mappings: the linker associates symbol names with symbol indexes via the ctf_link_shuffle_syms callback, which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the ctf_dict. Repeated relinks can add more symbols. Variables that are also exposed as symbols are removed from the variable section at serialization time. CTF symbol type sections which have enough pads, defined by CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols where most types are unknown, or in archive where most types are defined in some child or parent dict, not in this specific dict) are sorted by name rather than symidx and accompanied by an index which associates each symbol type entry with a name: the existing ctf_lookup_by_symbol will map symbol indexes to symbol names and look the names up in the index automatically. (This is currently ELF-symbol-table-dependent, but there is almost nothing specific to ELF in here and we can add support for other symbol table formats easily). The compiler also uses index sections to communicate the contents of object file symbol tables without relying on any specific ordering of symbols: it doesn't need to sort them, and libctf will detect an unsorted index section via the absence of the new CTF_F_IDXSORTED header flag, and sort it if needed. Iteration: ctf_symbol_next: Iterator which returns the types and names of symbols one by one, either for function or data symbols. This does not require any sorting: the ctf_link machinery uses it to pull in all the compiler-provided symbols cheaply, but it is not restricted to that use. (Compatible) changes in API: ctf_lookup_by_symbol: can now be called for object and function symbols: never returns ECTF_NOTDATA (which is now not thrown by anything, but is kept for compatibility and because it is a plausible error that we might start throwing again at some later date). Internally we also have changes to the ctf-string functionality so that "external" strings (those where we track a string -> offset mapping, but only write out an offset) can be consulted via the usual means (ctf_strptr) before the strtab is written out. This is important because ctf_link_add_linker_symbol can now be handed symbols named via strtab offsets, and ctf_link_shuffle_syms must figure out their actual names by looking in the external symtab we have just been fed by the ctf_link_add_strtab callback, long before that strtab is written out. include/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctf-api.h (ctf_symbol_next): New. (ctf_add_objt_sym): Likewise. (ctf_add_func_sym): Likewise. * ctf.h: Document new function info section format. (CTF_F_NEWFUNCINFO): New. (CTF_F_IDXSORTED): New. (CTF_F_MAX): Adjust accordingly. libctf/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New. (_libctf_nonnull_): Likewise. (ctf_in_flight_dynsym_t): New. (ctf_dict_t) <ctf_funcidx_names>: Likewise. <ctf_objtidx_names>: Likewise. <ctf_nfuncidx>: Likewise. <ctf_nobjtidx>: Likewise. <ctf_funcidx_sxlate>: Likewise. <ctf_objtidx_sxlate>: Likewise. <ctf_objthash>: Likewise. <ctf_funchash>: Likewise. <ctf_dynsyms>: Likewise. <ctf_dynsymidx>: Likewise. <ctf_dynsymmax>: Likewise. <ctf_in_flight_dynsym>: Likewise. (struct ctf_next) <u.ctn_next>: Likewise. (ctf_symtab_skippable): New prototype. (ctf_add_funcobjt_sym): Likewise. (ctf_dynhash_sort_by_name): Likewise. (ctf_sym_to_elf64): Rename to... (ctf_elf32_to_link_sym): ... this, and... (ctf_elf64_to_link_sym): ... this. * ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO flag, and presence of index sections. Refactor out ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func sxlate sections if corresponding index section is present. Adjust for new func info section format. (ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error handling. Report incorrect-length index sections. Always do an init_symtab, even if there is no symtab section (there may be index sections still). (flip_objts): Adjust comment: func and objt sections are actually identical in structure now, no need to caveat. (ctf_dict_close): Free newly-added data structures. * ctf-create.c (ctf_create): Initialize them. (ctf_symtab_skippable): New, refactored out of init_symtab, with st_nameidx_set check added. (ctf_add_funcobjt_sym): New, add a function or object symbol to the ctf_objthash or ctf_funchash, by name. (ctf_add_objt_sym): Call it. (ctf_add_func_sym): Likewise. (symtypetab_delete_nonstatic_vars): New, delete vars also present as data objects. (CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters: this is a function emission, not a data object emission. (CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit pads for symbols with no type (only set for unindexed sections). (CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters: always emit indexed. (symtypetab_density): New, figure out section sizes. (emit_symtypetab): New, emit a symtypetab. (emit_symtypetab_index): New, emit a symtypetab index. (ctf_serialize): Call them, emitting suitably sorted symtypetab sections and indexes. Set suitable header flags. Copy over new fields. * ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an order on symtypetab index sections. * ctf-link.c (ctf_add_type_mapping): Delete erroneous comment relating to code that was never committed. (ctf_link_one_variable): Improve variable name. (check_sym): New, symtypetab analogue of check_variable. (ctf_link_deduplicating_one_symtypetab): New. (ctf_link_deduplicating_syms): Likewise. (ctf_link_deduplicating): Call them. (ctf_link_deduplicating_per_cu): Note that we don't call them in this case (yet). (ctf_link_add_strtab): Set the error on the fp correctly. (ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add a linker symbol to the in-flight list. (ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the in-flight list into a mapping we can use, now its names are resolvable in the external strtab. * ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with external strtab offsets. (ctf_str_rollback): Adjust comment. (ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from writeout time... (ctf_str_add_external): ... to string addition time. * ctf-lookup.c (ctf_lookup_var_key_t): Rename to... (ctf_lookup_idx_key_t): ... this, now we use it for syms too. <clik_names>: New member, a name table. (ctf_lookup_var): Adjust accordingly. (ctf_lookup_variable): Likewise. (ctf_lookup_by_id): Shuffle further up in the file. (ctf_symidx_sort_arg_cb): New, callback for... (sort_symidx_by_name): ... this new function to sort a symidx found to be unsorted (likely originating from the compiler). (ctf_symidx_sort): New, sort a symidx. (ctf_lookup_symbol_name): Support dynamic symbols with indexes provided by the linker. Use ctf_link_sym_t, not Elf64_Sym. Check the parent if a child lookup fails. (ctf_lookup_by_symbol): Likewise. Work for function symbols too. (ctf_symbol_next): New, iterate over symbols with types (without sorting). (ctf_lookup_idx_name): New, bsearch for symbol names in indexes. (ctf_try_lookup_indexed): New, attempt an indexed lookup. (ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol. (ctf_func_args): Likewise. (ctf_get_dict): Move... * ctf-types.c (ctf_get_dict): ... here. * ctf-util.c (ctf_sym_to_elf64): Re-express as... (ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and st_nameidx_set (always 0, so st_nameidx can be ignored). Look in the ELF strtab for names. (ctf_elf32_to_link_sym): Likewise, for Elf32_Sym. (ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be. * libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and ctf_add_func_sym.
2020-11-20bfd, include, ld, binutils, libctf: CTF should use the dynstr/symNick Alcock1-0/+2
This is embarrassing. The whole point of CTF is that it remains intact even after a binary is stripped, providing a compact mapping from symbols to types for everything in the externally-visible interface of an ELF object: it has connections to the symbol table for that purpose, and to the string table to avoid duplicating symbol names. So it's a shame that the hooks I implemented last year served to hook it up to the .symtab and .strtab, which obviously disappear on strip, leaving any accompanying the CTF dict containing references to strings (and, soon, symbols) which don't exist any more because their containing strtab has been vaporized. The original Solaris design used .dynsym and .dynstr (well, actually, .ldynsym, which has more symbols) which do not disappear. So should we. Thankfully the work we did before serves as guide rails, and adjusting things to use the .dynstr and .dynsym was fast and easy. The only annoyance is that the dynsym is assembled inside elflink.c in a fairly piecemeal fashion, so that the easiest way to get the symbols out was to hook in before every call to swap_symbol_out (we also leave in a hook in front of symbol additions to the .symtab because it seems plausible that we might want to hook them in future too: for now that hook is unused). We adjust things so that rather than being offered a whole hash table of symbols at once, libctf is now given symbols one at a time, with st_name indexes already resolved and pointing at their final .dynstr offsets: it's now up to libctf to resolve these to names as needed using the strtab info we pass it separately. Some bits might be contentious. The ctf_new_dynstr callback takes an elf_internal_sym, and this remains an elf_internal_sym right down through the generic emulation layers into ldelfgen. This is no worse than the elf_sym_strtab we used to pass down, but in the future when we gain non-ELF CTF symtab support we might want to lower the elf_internal_sym to some other representation (perhaps a ctf_link_symbol) in bfd or in ldlang_ctf_new_dynsym. We rename the 'apply_strsym' hooks to 'acquire_strings' instead, becuse they no longer have anything to do with symbols. There are some API changes to pieces of API which are technically public but actually totally unused by anything and/or unused by anything but ld so they can change freely: the ctf_link_symbol gains new fields to allow symbol names to be given as strtab offsets as well as strings, and a symidx so that the symbol index can be passed in. ctf_link_shuffle_syms loses its callback parameter: the idea now is that linkers call the new ctf_link_add_linker_symbol for every symbol in .dynsym, feed in all the strtab entries with ctf_link_add_strtab, and then a call to ctf_link_shuffle_syms will apply both and arrange to use them to reorder the CTF symtab at CTF serialization time (which is coming in the next commit). Inside libctf we have a new preamble flag CTF_F_DYNSTR which is always set in v3-format CTF dicts from this commit forwards: CTF dicts without this flag are associated with .strtab like they used to be, so that old dicts' external strings don't turn to garbage when loaded by new libctf. Dicts with this flag are associated with .dynstr and .dynsym instead. (The flag is not the next in sequence because this commit was written quite late: the missing flags will be filled in by the next commit.) Tests forthcoming in a later commit in this series. bfd/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * elflink.c (elf_finalize_dynstr): Call examine_strtab after dynstr finalization. (elf_link_swap_symbols_out): Don't call it here. Call ctf_new_symbol before swap_symbol_out. (elf_link_output_extsym): Call ctf_new_dynsym before swap_symbol_out. (bfd_elf_final_link): Likewise. * elf.c (swap_out_syms): Pass in bfd_link_info. Call ctf_new_symbol before swap_symbol_out. (_bfd_elf_compute_section_file_positions): Adjust. binutils/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * readelf.c (dump_section_as_ctf): Use .dynsym and .dynstr, not .symtab and .strtab. include/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * bfdlink.h (struct elf_sym_strtab): Replace with... (struct elf_internal_sym): ... this. (struct bfd_link_callbacks) <examine_strtab>: Take only a symstrtab argument. <ctf_new_symbol>: New. <ctf_new_dynsym>: Likewise. * ctf-api.h (struct ctf_link_sym) <st_symidx>: New. <st_nameidx>: Likewise. <st_nameidx_set>: Likewise. (ctf_link_iter_symbol_f): Removed. (ctf_link_shuffle_syms): Remove most parameters, just takes a ctf_dict_t now. (ctf_link_add_linker_symbol): New, split from ctf_link_shuffle_syms. * ctf.h (CTF_F_DYNSTR): New. (CTF_F_MAX): Adjust. ld/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ldelfgen.c (struct ctf_strsym_iter_cb_arg): Rename to... (struct ctf_strtab_iter_cb_arg): ... this, changing fields: <syms>: Remove. <symcount>: Remove. <symstrtab>: Rename to... <strtab>: ... this. (ldelf_ctf_strtab_iter_cb): Adjust. (ldelf_ctf_symbols_iter_cb): Remove. (ldelf_new_dynsym_for_ctf): New, tell libctf about a single symbol. (ldelf_examine_strtab_for_ctf): Rename to... (ldelf_acquire_strings_for_ctf): ... this, only doing the strtab portion and not symbols. * ldelfgen.h: Adjust declarations accordingly. * ldemul.c (ldemul_examine_strtab_for_ctf): Rename to... (ldemul_acquire_strings_for_ctf): ... this. (ldemul_new_dynsym_for_ctf): New. * ldemul.h: Adjust declarations accordingly. * ldlang.c (ldlang_ctf_apply_strsym): Rename to... (ldlang_ctf_acquire_strings): ... this. (ldlang_ctf_new_dynsym): New. (lang_write_ctf): Call ldemul_new_dynsym_for_ctf with NULL to do the actual symbol shuffle. * ldlang.h (struct elf_strtab_hash): Adjust accordingly. * ldmain.c (bfd_link_callbacks): Wire up new/renamed callbacks. libctf/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctf-link.c (ctf_link_shuffle_syms): Adjust. (ctf_link_add_linker_symbol): New, unimplemented stub. * libctf.ver: Add it. * ctf-create.c (ctf_serialize): Set CTF_F_DYNSTR on newly-serialized dicts. * ctf-open-bfd.c (ctf_bfdopen_ctfsect): Check for the flag: open the symtab/strtab if not present, dynsym/dynstr otherwise. * ctf-archive.c (ctf_arc_bufpreamble): New, get the preamble from some arbitrary member of a CTF archive. * ctf-impl.h (ctf_arc_bufpreamble): Declare it.
2020-11-20libctf, include, binutils, gdb: rename CTF-opening functionsNick Alcock1-0/+2
The functions that return ctf_dict_t's given a ctf_archive_t and a name are very clumsily named. It sounds like they return *archives*, not dictionaries, and the names are very long and clunky. Why do we have a ctf_arc_open_by_name when it opens a dictionary, not an archive, and when there is no way to open a dictionary in any other way? The answer is purely internal: the function is located in ctf-archive.c, and everything in there was called ctf_arc_*, and there is another way to open a dict (by offset in the archive), that is internal to ctf-archive.c and that nothing else can call. This is clearly bad naming. The internal organization of the source tree should not dictate public API names! So rename things (keeping the old, bad names for compatibility), and adjust all users. You now open a dict using ctf_dict_open, and open it giving ELF sections via ctf_dict_open_sections. binutils/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * objdump.c (dump_ctf): Use ctf_dict_open, not ctf_arc_open_by_name. * readelf.c (dump_section_as_ctf): Likewise. gdb/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctfread.c (elfctf_build_psymtabs): Use ctf_dict_open, not ctf_arc_open_by_name. include/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctf-api.h (ctf_arc_open_by_name): Rename to... (ctf_dict_open): ... this, keeping compatibility function. (ctf_arc_open_by_name_sections): Rename to... (ctf_dict_open_sections): ... this, keeping compatibility function. libctf/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctf-archive.c (ctf_arc_open_by_offset): Rename to... (ctf_dict_open_by_offset): ... this. Adjust callers. (ctf_arc_open_by_name_internal): Rename to... (ctf_dict_open_internal): ... this. Adjust callers. (ctf_arc_open_by_name_sections): Rename to... (ctf_dict_open_sections): ... this, keeping compatibility function. (ctf_arc_open_by_name): Rename to... (ctf_dict_open): ... this, keeping compatibility function. * libctf.ver: New functions added. * ctf-link.c (ctf_link_one_input_archive): Adjusted accordingly. (ctf_link_deduplicating_open_inputs): Likewise.
2020-11-20libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_tNick Alcock1-0/+6
The naming of the ctf_file_t type in libctf is a historical curiosity. Back in the Solaris days, CTF dictionaries were originally generated as a separate file and then (sometimes) merged into objects: hence the datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw CTF is essentially never written to a file on its own, and the datatype changed name to a "CTF dictionary" years ago. So the term "CTF file" refers to something that is never a file! This is at best confusing. The type has also historically been known as a 'CTF container", which is even more confusing now that we have CTF archives which are *also* a sort of container (they contain CTF dictionaries), but which are never referred to as containers in the source code. So fix this by completing the renaming, renaming ctf_file_t to ctf_dict_t throughout, and renaming those few functions that refer to CTF files by name (keeping compatibility aliases) to refer to dicts instead. Old users who still refer to ctf_file_t will see (harmless) pointer-compatibility warnings at compile time, but the ABI is unchanged (since C doesn't mangle names, and ctf_file_t was always an opaque type) and things will still compile fine as long as -Werror is not specified. All references to CTF containers and CTF files in the source code are fixed to refer to CTF dicts instead. Further (smaller) renamings of annoyingly-named functions to come, as part of the process of souping up queries across whole archives at once (needed for the function info and data object sections). binutils/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t. (dump_ctf_archive_member): Likewise. (dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close. * readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t. (dump_ctf_archive_member): Likewise. (dump_section_as_ctf): Likewise. Use ctf_dict_close, not ctf_file_close. gdb/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctfread.c: Change uses of ctf_file_t to ctf_dict_t. (ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close. include/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctf-api.h (ctf_file_t): Rename to... (ctf_dict_t): ... this. Keep ctf_file_t around for compatibility. (struct ctf_file): Likewise rename to... (struct ctf_dict): ... this. (ctf_file_close): Rename to... (ctf_dict_close): ... this, keeping compatibility function. (ctf_parent_file): Rename to... (ctf_parent_dict): ... this, keeping compatibility function. All callers adjusted. * ctf.h: Rename references to ctf_file_t to ctf_dict_t. (struct ctf_archive) <ctfa_nfiles>: Rename to... <ctfa_ndicts>: ... this. ld/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ldlang.c (ctf_output): This is a ctf_dict_t now. (lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t. (ldlang_open_ctf): Adjust comment. (lang_merge_ctf): Use ctf_dict_close, not ctf_file_close. * ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to ctf_dict_t. Change opaque declaration accordingly. * ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust. * ldemul.h (examine_strtab_for_ctf): Likewise. (ldemul_examine_strtab_for_ctf): Likewise. * ldeuml.c (ldemul_examine_strtab_for_ctf): Likewise. libctf/ChangeLog 2020-11-20 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations adjusted. (ctf_fileops): Rename to... (ctf_dictops): ... this. (ctf_dedup_t) <cd_id_to_file_t>: Rename to... <cd_id_to_dict_t>: ... this. (ctf_file_t): Fix outdated comment. <ctf_fileops>: Rename to... <ctf_dictops>: ... this. (struct ctf_archive_internal) <ctfi_file>: Rename to... <ctfi_dict>: ... this. * ctf-archive.c: Rename ctf_file_t to ctf_dict_t. Rename ctf_archive.ctfa_nfiles to ctfa_ndicts. Rename ctf_file_close to ctf_dict_close. All users adjusted. * ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers. (ctf_bundle_t) <ctb_file>: Rename to... <ctb_dict): ... this. * ctf-decl.c: Rename ctf_file_t to ctf_dict_t. * ctf-dedup.c: Likewise. Rename ctf_file_close to ctf_dict_close. Refer to CTF dicts, not CTF containers. * ctf-dump.c: Likewise. * ctf-error.c: Likewise. * ctf-hash.c: Likewise. * ctf-inlines.h: Likewise. * ctf-labels.c: Likewise. * ctf-link.c: Likewise. * ctf-lookup.c: Likewise. * ctf-open-bfd.c: Likewise. * ctf-string.c: Likewise. * ctf-subr.c: Likewise. * ctf-types.c: Likewise. * ctf-util.c: Likewise. * ctf-open.c: Likewise. (ctf_file_close): Rename to... (ctf_dict_close): ...this. (ctf_file_close): New trivial wrapper around ctf_dict_close, for compatibility. (ctf_parent_file): Rename to... (ctf_parent_dict): ... this. (ctf_parent_file): New trivial wrapper around ctf_parent_dict, for compatibility. * libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-07-22libctf, link: add the ability to filter out variables from the linkNick Alcock1-0/+1
The CTF variables section (containing variables that have no corresponding symtab entries) can cause the string table to get very voluminous if the names of variables are long. Some callers want to filter out particular variables they know they won't need. So add a "variable filter" callback that does that: it's passed the name of the variable and a corresponding ctf_file_t / ctf_id_t pair, and should return 1 to filter it out. ld doesn't use this machinery yet, but we could easily add it later if desired. (But see later for a commit that turns off CTF variable- section linking in ld entirely by default.) include/ * ctf-api.h (ctf_link_variable_filter_t): New. (ctf_link_set_variable_filter): Likewise. libctf/ * libctf.ver (ctf_link_set_variable_filter): Add. * ctf-impl.h (ctf_file_t) <ctf_link_variable_filter>: New. <ctf_link_variable_filter_arg>: Likewise. * ctf-create.c (ctf_serialize): Adjust. * ctf-link.c (ctf_link_set_variable_filter): New, set it. (ctf_link_one_variable): Call it if set.
2020-07-22libctf, link: add lazy linking: clean up input members: err/warn cleanupNick Alcock1-2/+0
This rather large and intertwined pile of changes does three things: First, it transitions from dprintf to ctf_err_warn for things the user might care about: this one file is the major impetus for the ctf_err_warn infrastructure, because things like file names are crucial in linker error messages, and errno values are utterly incapable of communicating them Second, it stabilizes the ctf_link APIs: you can now call ctf_link_add_ctf without a CTF argument (only a NAME), to lazily ctf_open the file with the given NAME when needed, and close it as soon as possible, to save memory. This is not an API change because a null CTF argument was prohibited before now. Since getting CTF directly from files uses ctf_open, passing in only a NAME requires use of libctf, not libctf-nobfd. The linker's behaviour is unchanged, as it still passes in a ctf_archive_t as before. This also let us fix a leak: we were opening ctf_archives and their containing ctf_files, then only closing the files and leaving the archives open. Third, this commit restructures the ctf_link_in_member argument used by the CTF linking machinery and adjusts its users accordingly. We drop two members: - arcname, which is difficult to construct and then only used in error messages (that were only dprintf()ed, so never seen!) - share_mode, since we store the flags passed to ctf_link (including the share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold of it We rename others whose existing names were fairly dreadful: - done_main_member -> done_parent, using consistent terminology for .ctf as the parent of all archive members - main_input_fp -> in_fp_parent, likewise - file_name -> in_file_name, likewise We add one new member, cu_mapped. Finally, we move the various frees of things like mapping table data to the top-level ctf_link, since deduplicating links will want to do that too. include/ * ctf-api.h (ECTF_NEEDSBFD): New. (ECTF_NERR): Adjust. (ctf_link): Rename share_mode arg to flags. libctf/ * Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere. * Makefile.in: Regenerated. * ctf-impl.h (ctf_link_input_name): New. (ctf_file_t) <ctf_link_flags>: New. * ctf-create.c (ctf_serialize): Adjust accordingly. * ctf-link.c: Define ctf_open as weak when PIC. (ctf_arc_close_thunk): Remove unnecessary thunk. (ctf_file_close_thunk): Likewise. (ctf_link_input_name): New. (ctf_link_input_t): New value of the ctf_file_t.ctf_link_input. (ctf_link_input_close): Adjust accordingly. (ctf_link_add_ctf_internal): New, split from... (ctf_link_add_ctf): ... here. Return error if lazy loading of CTF is not possible. Change to just call... (ctf_link_add): ... this new function. (ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the ctf_file_close_thunk. (ctf_link_in_member_cb_arg_t) <file_name> Rename to... <in_file_name>: ... this. <arcname>: Drop. <share_mode>: Likewise (migrated to ctf_link_flags). <done_main_member>: Rename to... <done_parent>: ... this. <main_input_fp>: Rename to... <in_fp_parent>: ... this. <cu_mapped>: New. (ctf_link_one_type): Adjuwt accordingly. Transition to ctf_err_warn, removing a TODO. (ctf_link_one_variable): Note a case too common to warn about. Report in the debug stream if a cu-mapped link prevents addition of a conflicting variable. (ctf_link_one_input_archive_member): Adjust. (ctf_link_lazy_open): New, open a CTF archive for linking when needed. (ctf_link_close_one_input_archive): New, close it again. (ctf_link_one_input_archive): Adjust for lazy opening, member renames, and ctf_err_warn transition. Move the empty_link_type_mapping call to... (ctf_link): ... here. Adjut for renamings and thunk removal. Don't spuriously fail if some input contains no CTF data. (ctf_link_write): ctf_err_warn transition. * libctf.ver: Remove not-yet-stable comment.
2020-07-22libctf, ld, binutils: add textual error/warning reporting for libctfNick Alcock1-0/+1
This commit adds a long-missing piece of infrastructure to libctf: the ability to report errors and warnings using all the power of printf, rather than being restricted to one errno value. Internally, libctf calls ctf_err_warn() to add errors and warnings to a list: a new iterator ctf_errwarning_next() then consumes this list one by one and hands it to the caller, which can free it. New errors and warnings are added until the list is consumed by the caller or the ctf_file_t is closed, so you can dump them at intervals. The caller can of course choose to print only those warnings it wants. (I am not sure whether we want objdump, readelf or ld to print warnings or not: right now I'm printing them, but maybe we only want to print errors? This entirely depends on whether warnings are voluminous things describing e.g. the inability to emit single types because of name clashes or something. There are no users of this infrastructure yet, so it's hard to say.) There is no internationalization here yet, but this at least adds a place where internationalization can be added, to one of ctf_errwarning_next or ctf_err_warn. We also provide a new ctf_assert() function which uses this infrastructure to provide non-fatal assertion failures while emitting an assert-like string to the caller: to save space and avoid needlessly duplicating unchanging strings, the assertion test is inlined but the print-things-out failure case is not. All assertions in libctf will be converted to use this machinery in future commits and propagate assertion-failure errors up, so that the linker in particular cannot be killed by libctf assertion failures when it could perfectly well just print warnings and drop the CTF section. include/ * ctf-api.h (ECTF_INTERNAL): Adjust error text. (ctf_errwarning_next): New. libctf/ * ctf-impl.h (ctf_assert): New. (ctf_err_warning_t): Likewise. (ctf_file_t) <ctf_errs_warnings>: Likewise. (ctf_err_warn): New prototype. (ctf_assert_fail_internal): Likewise. * ctf-inlines.h (ctf_assert_internal): Likewise. * ctf-open.c (ctf_file_close): Free ctf_errs_warnings. * ctf-create.c (ctf_serialize): Copy it on serialization. * ctf-subr.c (ctf_err_warn): New, add an error/warning. (ctf_errwarning_next): New iterator, free and pass back errors/warnings in succession. * libctf.ver (ctf_errwarning_next): Add. ld/ * ldlang.c (lang_ctf_errs_warnings): New, print CTF errors and warnings. Assert when libctf asserts. (lang_merge_ctf): Call it. (land_write_ctf): Likewise. binutils/ * objdump.c (ctf_archive_member): Print CTF errors and warnings. * readelf.c (dump_ctf_archive_member): Likewise.
2020-07-22libctf, next: introduce new class of easier-to-use iteratorsNick Alcock1-0/+9
The libctf machinery currently only provides one way to iterate over its data structures: ctf_*_iter functions that take a callback and an arg and repeatedly call it. This *works*, but if you are doing a lot of iteration it is really quite inconvenient: you have to package up your local variables into structures over and over again and spawn lots of little functions even if it would be clearer in a single run of code. Look at ctf-string.c for an extreme example of how unreadable this can get, with three-line-long functions proliferating wildly. The deduplicator takes this to the Nth level. It iterates over a whole bunch of things: if we'd had to use _iter-class iterators for all of them there would be twenty additional functions in the deduplicator alone, for no other reason than that the iterator API requires it. Let's do something better. strtok_r gives us half the design: generators in a number of other languages give us the other half. The *_next API allows you to iterate over CTF-like entities in a single function using a normal while loop. e.g. here we are iterating over all the types in a dict: ctf_next_t *i = NULL; int *hidden; ctf_id_t id; while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR) { /* do something with 'hidden' and 'id' */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Here we are walking through the members of a struct with CTF ID 'struct_type': ctf_next_t *i = NULL; ssize_t offset; const char *name; ctf_id_t membtype; while ((offset = ctf_member_next (fp, struct_type, &i, &name, &membtype)) >= 0 { /* do something with offset, name, and membtype */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Like every other while loop, this means you have access to all the local variables outside the loop while inside it, with no need to tiresomely package things up in structures, move the body of the loop into a separate function, etc, as you would with an iterator taking a callback. ctf_*_next allocates 'i' for you on first entry (when it must be NULL), and frees and NULLs it and returns a _next-dependent flag value when the iteration is over: the fp errno is set to ECTF_NEXT_END when the iteartion ends normally. If you want to exit early, call ctf_next_destroy on the iterator. You can copy iterators using ctf_next_copy, which copies their current iteration position so you can remember loop positions and go back to them later (or ctf_next_destroy them if you don't need them after all). Each _next function returns an always-likely-to-be-useful property of the thing being iterated over, and takes pointers to parameters for the others: with very few exceptions all those parameters can be NULLs if you're not interested in them, so e.g. you can iterate over only the offsets of members of a structure this way: while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0) If you pass an iterator in use by one iteration function to another one, you get the new error ECTF_NEXT_WRONGFUN back; if you try to change ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back. Internally the ctf_next_t remembers the iteration function in use, various sizes and increments useful for almost all iterations, then uses unions to overlap the actual entities being iterated over to keep ctf_next_t size down. Iterators available in the public API so far (all tested in actual use in the deduplicator): /* Iterate over the members of a STRUCT or UNION, returning each member's offset and optionally name and member type in turn. On end-of-iteration, returns -1. */ ssize_t ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, const char **name, ctf_id_t *membtype); /* Iterate over the members of an enum TYPE, returning each enumerand's NAME or NULL at end of iteration or error, and optionally passing back the enumerand's integer VALue. */ const char * ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, int *val); /* Iterate over every type in the given CTF container (not including parents), optionally including non-user-visible types, returning each type ID and optionally the hidden flag in turn. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag, int want_hidden); /* Iterate over every variable in the given CTF container, in arbitrary order, returning the name and type of each variable in turn. The NAME argument is not optional. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name); /* Iterate over all CTF files in an archive, returning each dict in turn as a ctf_file_t, and NULL on error or end of iteration. It is the caller's responsibility to close it. Parent dicts may be skipped. Regardless of whether they are skipped or not, the caller must ctf_import the parent if need be. */ ctf_file_t * ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it, const char **name, int skip_parent, int *errp); ctf_label_next is prototyped but not implemented yet. include/ * ctf-api.h (ECTF_NEXT_END): New error. (ECTF_NEXT_WRONGFUN): Likewise. (ECTF_NEXT_WRONGFP): Likewise. (ECTF_NERR): Adjust. (ctf_next_t): New. (ctf_next_create): New prototype. (ctf_next_destroy): Likewise. (ctf_next_copy): Likewise. (ctf_member_next): Likewise. (ctf_enum_next): Likewise. (ctf_type_next): Likewise. (ctf_label_next): Likewise. (ctf_variable_next): Likewise. libctf/ * ctf-impl.h (ctf_next): New. (ctf_get_dict): New prototype. * ctf-lookup.c (ctf_get_dict): New, split out of... (ctf_lookup_by_id): ... here. * ctf-util.c (ctf_next_create): New. (ctf_next_destroy): New. (ctf_next_copy): New. * ctf-types.c (includes): Add <assert.h>. (ctf_member_next): New. (ctf_enum_next): New. (ctf_type_iter): Document the lack of iteration over parent types. (ctf_type_next): New. (ctf_variable_next): New. * ctf-archive.c (ctf_archive_next): New. * libctf.ver: Add new public functions.
2020-07-22libctf: add ctf_refNick Alcock1-0/+1
This allows you to bump the refcount on a ctf_file_t, so that you can smuggle it out of iterators which open and close the ctf_file_t for you around the loop body (like ctf_archive_iter). You still can't use this to preserve a ctf_file_t for longer than the lifetime of its containing entity (e.g. ctf_archive). include/ * ctf-api.h (ctf_ref): New. libctf/ * libctf.ver (ctf_ref): New. * ctf-open.c (ctf_ref): Implement it.
2020-07-22libctf: add ctf_archive_countNick Alcock1-0/+1
Another count that was otherwise unavailable without doing expensive operations. include/ * ctf-api.h (ctf_archive_count): New. libctf/ * ctf-archive.c (ctf_archive_count): New. * libctf.ver: New public function.
2020-07-22libctf: add ctf_member_countNick Alcock1-0/+1
This returns the number of members in a struct or union, or the number of enumerations in an enum. (This was only available before now by iterating across every member, but it can be returned much faster than that.) include/ * ctf-api.h (ctf_member_count): New. libctf/ * ctf-types.c (ctf_member_count): New. * libctf.ver: New public function.
2020-07-22libctf: add ctf_type_kind_forwardedNick Alcock1-0/+1
This is just like ctf_type_kind, except that forwards get the type of the thing being pointed to rather than CTF_K_FORWARD. include/ * ctf-api.h (ctf_type_kind_forwarded): New. libctf/ * ctf-types.c (ctf_type_kind_forwarded): New.
2020-07-22libctf: add ctf_type_name_rawNick Alcock1-0/+1
We already have a function ctf_type_aname_raw, which returns the raw name of a type with no decoration for structures or arrays or anything like that: just the underlying name of whatever it is that's being ultimately pointed at. But this can be inconvenient to use, becauswe it always allocates new storage for the string and copies it in, so it can potentially fail. Add ctf_type_name_raw, which just returns the string directly out of libctf's guts: it will live until the ctf_file_t is closed (if we later gain the ability to remove types from writable dicts, it will live as long as the type lives). Reimplement ctf_type_aname_raw in terms of it. include/ * ctf-api.c (ctf_type_name_raw): New. libctf/ * ctf-types.c (ctf_type_name_raw): New. (ctf_type_aname_raw): Reimplement accordingly.
2020-06-26libctf, binutils: support CTF archives like objdumpNick Alcock1-0/+1
objdump and readelf have one major CTF-related behavioural difference: objdump can read .ctf sections that contain CTF archives and extract and dump their members, while readelf cannot. Since the linker often emits CTF archives, this means that readelf intermittently and (from the user's perspective) randomly fails to read CTF in files that ld emits, with a confusing error message wrongly claiming that the CTF content is corrupt. This is purely because the archive-opening code in libctf was needlessly tangled up with the BFD code, so readelf couldn't use it. Here, we disentangle it, moving ctf_new_archive_internal from ctf-open-bfd.c into ctf-archive.c and merging it with the helper function in ctf-archive.c it was already using. We add a new public API function ctf_arc_bufopen, that looks very like ctf_bufopen but returns an archive given suitable section data rather than a ctf_file_t: the archive is a ctf_archive_t, so it can be called on raw CTF dictionaries (with no archive present) and will return a single-member synthetic "archive". There is a tiny lifetime tweak here: before now, the archive code could assume that the symbol section in the ctf_archive_internal wrapper structure was always owned by BFD if it was present and should always be freed: now, the caller can pass one in via ctf_arc_bufopen, wihch has the usual lifetime rules for such sections (caller frees): so we add an extra field to track whether this is an internal call from ctf-open-bfd, in which case we still free the symbol section. include/ * ctf-api.h (ctf_arc_bufopen): New. libctf/ * ctf-impl.h (ctf_new_archive_internal): Declare. (ctf_arc_bufopen): Remove. (ctf_archive_internal) <ctfi_free_symsect>: New. * ctf-archive.c (ctf_arc_close): Use it. (ctf_arc_bufopen): Fuse into... (ctf_new_archive_internal): ... this, moved across from... * ctf-open-bfd.c: ... here. (ctf_bfdopen_ctfsect): Use ctf_arc_bufopen. * libctf.ver: Add it. binutils/ * readelf.c (dump_section_as_ctf): Support .ctf archives using ctf_arc_bufopen. Automatically load the .ctf member of such archives as the parent of all other members, unless specifically overridden via --ctf-parent. Split out dumping code into... (dump_ctf_archive_member): ... here, as in objdump, and call it once per archive member. (dump_ctf_indent_lines): Code style fix.
2020-01-01Update year range in copyright notice of binutils filesAlan Modra1-1/+1
2019-10-03libctf: installable libctf as a shared libraryNick Alcock1-0/+161
This lets other programs read and write CTF-format data. Two versioned shared libraries are created: libctf.so and libctf-nobfd.so. They contain identical content except that libctf-nobfd.so contains no references to libbfd and does not implement ctf_open, ctf_fdopen, ctf_bfdopen or ctf_bfdopen_ctfsect, so it can be used by programs that cannot use BFD, like readelf. The soname major version is presently .0 until the linker API stabilizes, when it will flip to .1 and hopefully never change again. New in v3. v4: libtoolize and turn into a pair of shared libraries. Drop --enable-install-ctf: now controlled by --enable-shared and --enable-install-libbfd, like everything else. v5: Add ../bfd to ACLOCAL_AMFLAGS and AC_CONFIG_MACRO_DIR. Fix tabdamage. * Makefile.def (host_modules): libctf is no longer no_install. * Makefile.in: Regenerated. libctf/ * configure.ac (AC_DISABLE_SHARED): New, like opcodes/. (LT_INIT): Likewise. (AM_INSTALL_LIBBFD): Likewise. (dlopen): Note why this is necessary in a comment. (SHARED_LIBADD): Initialize for possibly-PIC libiberty: derived from opcodes/. (SHARED_LDFLAGS): Likewise. (BFD_LIBADD): Likewise, for libbfd. (BFD_DEPENDENCIES): Likewise. (VERSION_FLAGS): Initialize, using a version script if ld supports one, or libtool -export-symbols-regex otherwise. (AC_CONFIG_MACRO_DIR): Add ../BFD. * Makefile.am (ACLOCAL_AMFLAGS): Likewise. (INCDIR): New. (AM_CPPFLAGS): Use $(srcdir), not $(top_srcdir). (noinst_LIBRARIES): Replace with... [INSTALL_LIBBFD] (lib_LTLIBRARIES): This, or... [!INSTALL_LIBBFD] (noinst_LTLIBRARIES): ... this, mentioning new libctf-nobfd.la as well. [INSTALL_LIBCTF] (include_HEADERS): Add the CTF headers. [!INSTALL_LIBCTF] (include_HEADERS): New, empty. (libctf_a_SOURCES): Rename to... (libctf_nobfd_la_SOURCES): ... this, all of libctf other than ctf-open-bfd.c. (libctf_la_SOURCES): Now derived from libctf_nobfd_la_SOURCES, with ctf-open-bfd.c added. (libctf_nobfd_la_LIBADD): New, using @SHARED_LIBADD@. (libctf_la_LIBADD): New, using @BFD_LIBADD@ as well. (libctf_la_DEPENDENCIES): New, using @BFD_DEPENDENCIES@. * Makefile.am [INSTALL_LIBCTF]: Use it. * aclocal.m4: Add ../bfd/acinclude.m4, ../config/acx.m4, and the libtool macros. * libctf.ver: New, everything is version LIBCTF_1.0 currently (even the unstable components). * Makefile.in: Regenerated. * config.h.in: Likewise. * configure: Likewise. binutils/ * Makefile.am (LIBCTF): Mention the .la file. (LIBCTF_NOBFD): New. (readelf_DEPENDENCIES): Use it. (readelf_LDADD): Likewise. * Makefile.in: Regenerated. ld/ * configure.ac (TESTCTFLIB): Set to the .so or .a, like TESTBFDLIB. * Makefile.am (TESTCTFLIB): Use it. (LIBCTF): Use the .la file. (check-DEJAGNU): Use it. * Makefile.in: Regenerated. * configure: Likewise. include/ * ctf-api.h: Note the instability of the ctf_link interfaces.