Age | Commit message (Collapse) | Author | Files | Lines |
|
The recent change to detect duplicate enum values and return ECTF_DUPLICATE
when found turns out to perturb a great many callers. In particular, the
pahole-created kernel BTF has the same problem we historically did, and
gleefully emits duplicated enum constants in profusion. Handling the
resulting duplicate errors from BTF -> CTF converters reasonably is
unreasonably difficult (it amounts to forcing them to skip some types or
reimplement the deduplicator).
So let's step back a bit. What we care about mostly is that the
deduplicator treat enums with conflicting enumeration constants as
conflicting types: programs that want to look up enumeration constant ->
value mappings using the new APIs to do so might well want the same checks
to apply to any ctf_add_* operations they carry out (and since they're
*using* the new APIs, added at the same time as this restriction was
imposed, there is likely to be no negative consequence of this).
So we want some way to allow processes that know about duplicate detection
to opt into it, while allowing everyone else to stay clear of it: but we
want ctf_link to get this behaviour even if its caller has opted out.
So add a new concept to the API: dict-wide CTF flags, set via
ctf_dict_set_flag, obtained via ctf_dict_get_flag. They are not bitflags
but simple arbitrary integers and an on/off value, stored in an unspecified
manner (the one current flag, we translate into an LCTF_* flag value in the
internal ctf_dict ctf_flags word). If you pass in an invalid flag or value
you get a new ECTF_BADFLAG error, so the caller can easily tell whether
flags added in future are valid with a particular libctf or not.
We check this flag in ctf_add_enumerator, and set it around the link
(including on child per-CU dicts). The newish enumerator-iteration test is
souped up to check the semantics of the flag as well.
The fact that the flag can be set and unset at any time has curious
consequences. You can unset the flag, insert a pile of duplicates, then set
it and expect the new duplicates to be detected, not only by
ctf_add_enumerator but also by ctf_lookup_enumerator. This means we now
have to maintain the ctf_names and conflicting_enums enum-duplication
tracking as new enums are added, not purely as the dict is opened.
Move that code out of init_static_types_internal and into a new
ctf_track_enumerator function that addition can also call.
(None of this affects the file format or serialization machinery, which has
to be able to handle duplicate enumeration constants no matter what.)
include/
* ctf-api.h (CTF_ERRORS) [ECTF_BADFLAG]: New.
(ECTF_NERR): Update.
(CTF_STRICT_NO_DUP_ENUMERATORS): New flag.
(ctf_dict_set_flag): New function.
(ctf_dict_get_flag): Likewise.
libctf/
* ctf-impl.h (LCTF_STRICT_NO_DUP_ENUMERATORS): New flag.
(ctf_track_enumerator): Declare.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_create_per_cu): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link): Likewise.
(ctf_link_write): Likewise.
* ctf-subr.c (ctf_dict_set_flag): New function.
(ctf_dict_get_flag): New function.
* ctf-open.c (init_static_types_internal): Move enum tracking to...
* ctf-create.c (ctf_track_enumerator): ... this new function.
(ctf_add_enumerator): Call it.
* libctf.ver: Add the new functions.
* testsuite/libctf-lookup/enumerator-iteration.c: Test them.
|
|
Three new functions for looking up the enum type containing a given
enumeration constant, and optionally that constant's value.
The simplest, ctf_lookup_enumerator, looks up a root-visible enumerator by
name in one dict: if the dict contains multiple such constants (which is
possible for dicts created by older versions of the libctf deduplicator),
ECTF_DUPLICATE is returned.
The next simplest, ctf_lookup_enumerator_next, is an iterator which returns
all enumerators with a given name in a given dict, whether root-visible or
not.
The most elaborate, ctf_arc_lookup_enumerator_next, finds all
enumerators with a given name across all dicts in an entire CTF archive,
whether root-visible or not, starting looking in the shared parent dict;
opened dicts are cached (as with all other ctf_arc_*lookup functions) so
that repeated use does not incur repeated opening costs.
All three of these return enumerator values as int64_t: unfortunately, API
compatibility concerns prevent us from doing the same with the other older
enum-related functions, which all return enumerator constant values as ints.
We may be forced to add symbol-versioning compatibility aliases that fix the
other functions in due course, bumping the soname for platforms that do not
support such things.
ctf_arc_lookup_enumerator_next is implemented as a nested ctf_archive_next
iterator, and inside that, a nested ctf_lookup_enumerator_next iterator
within each dict. To aid in this, add support to ctf_next_t iterators for
iterators that are implemented in terms of two simultaneous nested iterators
at once. (It has always been possible for callers to use as many nested or
semi-overlapping ctf_next_t iterators as they need, which is one of the
advantages of this style over the _iter style that calls a function for each
thing iterated over: the iterator change here permits *ctf_next_t iterators
themselves* to be implemented by iterating using multiple other iterators as
part of their internal operation, transparently to the caller.)
Also add a testcase that tests all these functions (which is fairly easy
because ctf_arc_lookup_enumerator_next is implemented in terms of
ctf_lookup_enumerator_next) in addition to enumeration addition in
ctf_open()ed dicts, ctf_add_enumerator duplicate enumerator addition, and
conflicting enumerator constant deduplication.
include/
* ctf-api.h (ctf_lookup_enumerator): New.
(ctf_lookup_enumerator_next): Likewise.
(ctf_arc_lookup_enumerator_next): Likewise.
libctf/
* libctf.ver: Add them.
* ctf-impl.h (ctf_next_t) <ctn_next_inner>: New.
* ctf-util.c (ctf_next_copy): Copy it.
(ctf_next_destroy): Destroy it.
* ctf-lookup.c (ctf_lookup_enumerator): New.
(ctf_lookup_enumerator_next): New.
* ctf-archive.c (ctf_arc_lookup_enumerator_next): New.
* testsuite/libctf-lookup/enumerator-iteration.*: New test.
* testsuite/libctf-lookup/enum-ctf-2.c: New test CTF, used by the
above.
|
|
Starting with ld.lld-17, ld.lld is invoked with the option
--no-undefined-version enabled by default. Furthermore, The functions
ctf_label_set() and ctf_label_get() are not defined. Their inclusion in
libctf/libctf.ver causes ld.lld-17 to fail emitting the following error
messages:
ld.lld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_label_set' failed: symbol not defined
ld.lld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_label_get' failed: symbol not defined
This patch fixes the issue by removing the symbol names from
libctf/libctf.ver.
[nca: fused in later commit that marked ctf_arc_open as libctf
only as well. Added ChangeLog entry.]
Signed-off-by: Nicholas Vinson <nvinson234@gmail.com>
libctf/
* libctf.ver: drop nonexistent label functions: mark
ctf_arc_open as libctf-only.
|
|
Adds two new external authors to etc/update-copyright.py to cover
bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then
updates copyright messages as follows:
1) Update cgen/utils.scm emitted copyrights.
2) Run "etc/update-copyright.py --this-year" with an extra external
author I haven't committed, 'Kalray SA.', to cover gas testsuite
files (which should have their copyright message removed).
3) Build with --enable-maintainer-mode --enable-cgen-maint=yes.
4) Check out */po/*.pot which we don't update frequently.
|
|
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
|
|
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.
The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bfp opcodes files.
|
|
Checking for linker versioning by just grepping ld --help output for
mentions of --version-script is inadequate now that Solaris 11.4
implements a --version-script with different semantics. Try linking a
test program with a small wildcard-using version script with each
supported set of flags in turn, to make sure that linker versioning is
not only advertised but actually works.
The Solaris "GNU-compatible" linker versioning is not quite
GNU-compatible enough, but we can work around the differences by
generating a new version script that removes the comments from the
original (Solaris ld requires #-style comments), and making another
version script for libctf-nonbfd in particular which doesn't mention any
of the symbols that appear in libctf.la, to avoid Solaris ld introducing
corresponding new NOTYPE symbols to match the version script.
libctf/ChangeLog
2021-09-27 Nick Alcock <nick.alcock@oracle.com>
PR libctf/27967
* configure.ac (VERSION_FLAGS): Replace with...
(ac_cv_libctf_version_script): ... this multiple test.
(VERSION_FLAGS_NOBFD): Substitute this too.
* Makefile.am (libctf_nobfd_la_LDFLAGS): Use it. Split out...
(libctf_ldflags_nover): ... non-versioning flags here.
(libctf_la_LDFLAGS): Use it.
* libctf.ver: Give every symbol not in libctf-nobfd a comment on
the same line noting as much.
|
|
Before now, types that could not be encoded in CTF were represented as
references to type ID 0, which does not itself appear in the
dictionary. This choice is annoying in several ways, principally that it
forces generators and consumers of CTF to grow special cases for types
that are referenced in valid dicts but don't appear.
Allow an alternative representation (which will become the only
representation in format v4) whereby nonrepresentable types are encoded
as actual types with kind CTF_K_UNKNOWN (an already-existing kind
theoretically but not in practice used for padding, with value 0).
This is backward-compatible, because CTF_K_UNKNOWN was not used anywhere
before now: it was used in old-format function symtypetabs, but these
were never emitted by any compiler and the code to handle them in libctf
likely never worked and was removed last year, in favour of new-format
symtypetabs that contain only type IDs, not type kinds.
In order to link this type, we need an API addition to let us add types
of unknown kind to the dict: we let them optionally have names so that
GCC can emit many different unknown types and those types with identical
names will be deduplicated together. There are also small tweaks to the
deduplicator to actually dedup such types, to let opening of dicts with
unknown types with names work, to return the ECTF_NONREPRESENTABLE error
on resolution of such types (like ID 0), and to print their names as
something useful but not a valid C identifier, mostly for the sake of
the dumper.
Tests added in the next commit.
include/ChangeLog
2021-05-06 Nick Alcock <nick.alcock@oracle.com>
* ctf.h (CTF_K_UNKNOWN): Document that it can be used for
nonrepresentable types, not just padding.
* ctf-api.h (ctf_add_unknown): New.
libctf/ChangeLog
2021-05-06 Nick Alcock <nick.alcock@oracle.com>
* ctf-open.c (init_types): Unknown types may have names.
* ctf-types.c (ctf_type_resolve): CTF_K_UNKNOWN is as
non-representable as type ID 0.
(ctf_type_aname): Print unknown types.
* ctf-dedup.c (ctf_dedup_hash_type): Do not early-exit for
CTF_K_UNKNOWN types: they have real hash values now.
(ctf_dedup_rwalk_one_output_mapping): Treat CTF_K_UNKNOWN types
like other types with no referents: call the callback and do not
skip them.
(ctf_dedup_emit_type): Emit via...
* ctf-create.c (ctf_add_unknown): ... this new function.
* libctf.ver (LIBCTF_1.2): Add it.
|
|
The existing ctf_lookup_by_symbol and ctf_arc_lookup_symbol functions
suffice to look up the types of symbols if the caller already has a
symbol number. But the caller often doesn't have one of those and only
knows the name of the symbol: also, in object files, the caller might
not have a useful symbol number in any sense (and neither does libctf:
the 'symbol number' we use in that case literally starts at 0 for the
lexicographically first-sorted symbol in the symtypetab and counts those
symbols, so it corresponds to nothing useful).
This means that even though object files have a symtypetab (generated by
the compiler or by ld -r), the only way we can look up anything in it is
to iterate over all symbols in turn with ctf_symbol_next until we find
the one we want.
This is unhelpful and pointlessly inefficient.
So add a pair of functions to look up symbols by name in a dict and in a
whole archive: ctf_lookup_by_symbol_name and ctf_arc_lookup_symbol_name.
These are identical to the existing functions except that they take
symbol names rather than symbol numbers.
To avoid insane repetition, we do some refactoring in the process, so
that both ctf_lookup_by_symbol and ctf_arc_lookup_symbol turn into thin
wrappers around internal functions that do both lookup by symbol index
and lookup by name. This massively reduces code duplication because
even the existing lookup-by-index stuff wants to use a name sometimes
(when looking up in indexed sections), and the new lookup-by-name stuff
has to turn it into an index sometimes (when looking up in non-indexed
sections): doing it this way lets us share most of that.
The actual name->index lookup is done by ctf_lookup_symbol_idx. We do
not anticipate this lookup to be as heavily used as ld.so symbol lookup
by many orders of magnitude, so using the ELF symbol hashes would
probably take more time to read them than is saved by using the hashes,
and it adds a lot of complexity. Instead, do a linear search for the
symbol name, caching all the name -> index mappings as we go, so that
future searches are likely to hit in the cache. To avoid having to
repeat this search over and over in a CTF archive when
ctf_arc_lookup_symbol_name is used, have cached archive lookups (the
sort done by ctf_arc_lookup_symbol* and the ctf_archive_next iterator)
pick out the first dict they cache in a given archive and store it in a
new ctf_archive field, ctfi_crossdict_cache. This can be used to store
cross-dictionary cached state that depends on things like the ELF symbol
table rather than the contents of any one dict. ctf_lookup_symbol_idx
then caches its name->index mappings in the dictionary named in the
crossdict cache, if any, so that ctf_lookup_symbol_idx in other dicts
in the same archive benefit from the previous linear search, and the
symtab only needs to be scanned at most once.
(Note that if you call ctf_lookup_by_symbol_name in one specific dict,
and then follow it with a ctf_arc_lookup_symbol_name, the former will
not use the crossdict cache because it's only populated by the dict
opens in ctf_arc_lookup_symbol_name. This is harmless except for a small
one-off waste of memory and time: it's only a cache, after all. We can
fix this later by using the archive caching machinery more
aggressively.)
In ctf-archive, we do similar things, turning ctf_arc_lookup_symbol into
a wrapper around a new function that does both index -> ID and name ->
ID lookups across all dicts in an archive. We add a new
ctfi_symnamedicts cache that maps symbol names to the ctf_dict_t * that
it was found in (so that linear searches for symbols don't need to be
repeated): but we also *remove* a cache, the ctfi_syms cache that was
memoizing the actual ctf_id_t returned from every call to
ctf_arc_lookup_symbol. This is pointless: all it saves is one call to
ctf_lookup_by_symbol, and that's basically an array lookup and nothing
more so isn't worth caching. (Equally, given that symbol -> index
mappings are cached by ctf_lookup_by_symbol_name, those calls are nearly
free after the first call, so there's no point caching the ctf_id_t in
that case either.)
We fix up one test that was doing manual symbol lookup to use
ctf_arc_lookup_symbol instead, and enhance it to check that the caching
layer is not totally broken: we also add a new test to do lookups in a
.o file, and another to do lookups in an archive with conflicted types
and make sure that sort of multi-dict lookup is actually working.
include/ChangeLog
2021-02-17 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_arc_lookup_symbol_name): New.
(ctf_lookup_by_symbol_name): Likewise.
libctf/ChangeLog
2021-02-17 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_symhash>: New.
<ctf_symhash_latest>: Likewise.
(struct ctf_archive_internal) <ctfi_crossdict_cache>: New.
<ctfi_symnamedicts>: New.
<ctfi_syms>: Remove.
(ctf_lookup_symbol_name): Remove.
* ctf-lookup.c (ctf_lookup_symbol_name): Propagate errors from
parent properly. Make static.
(ctf_lookup_symbol_idx): New, linear search for the symbol name,
cached in the crossdict cache's ctf_symhash (if available), or
this dict's (otherwise).
(ctf_try_lookup_indexed): Allow the symname to be passed in.
(ctf_lookup_by_symbol): Turn into a wrapper around...
(ctf_lookup_by_sym_or_name): ... this, supporting name lookup too,
using ctf_lookup_symbol_idx in non-writable dicts. Special-case
name lookup in dynamic dicts without reported symbols, which have
no symtab or dynsymidx but where name lookup should still work.
(ctf_lookup_by_symbol_name): New, another wrapper.
* ctf-archive.c (enosym): Note that this is present in
ctfi_symnamedicts too.
(ctf_arc_close): Adjust for removal of ctfi_syms. Free the
ctfi_symnamedicts.
(ctf_arc_flush_caches): Likewise.
(ctf_dict_open_cached): Memoize the first cached dict in the
crossdict cache.
(ctf_arc_lookup_symbol): Turn into a wrapper around...
(ctf_arc_lookup_sym_or_name): ... this. No longer cache
ctf_id_t lookups: just call ctf_lookup_by_symbol as needed (but
still cache the dicts those lookups succeed in). Add
lookup-by-name support, with dicts of successful lookups cached in
ctfi_symnamedicts. Refactor the caching code a bit.
(ctf_arc_lookup_symbol_name): New, another wrapper.
* ctf-open.c (ctf_dict_close): Free the ctf_symhash.
* libctf.ver (LIBCTF_1.2): New version. Add
ctf_lookup_by_symbol_name, ctf_arc_lookup_symbol_name.
* testsuite/libctf-lookup/enum-symbol.c (main): Use
ctf_arc_lookup_symbol rather than looking up the name ourselves.
Fish it out repeatedly, to make sure that symbol caching isn't
broken.
(symidx_64): Remove.
(symidx_32): Remove.
* testsuite/libctf-lookup/enum-symbol-obj.lk: Test symbol lookup
in an unlinked object file (indexed symtypetab sections only).
* testsuite/libctf-writable/symtypetab-nonlinker-writeout.c
(try_maybe_reporting): Check symbol types via
ctf_lookup_by_symbol_name as well as ctf_symbol_next.
* testsuite/libctf-lookup/conflicting-type-syms.*: New test of
lookups in a multi-dict archive.
|
|
|
|
The CTF symbol lookup machinery added recently has one deficit: it
assumes the symtab is in the machine's native endianness. This is
always true when the linker is writing out symtabs (because cross
linkers byteswap symbols only after libctf has been called on them), but
may be untrue in the cross case when the linker or another tool
(objdump, etc) is reading them.
Unfortunately the easy way to model this to the caller, as an endianness
field in the ctf_sect_t, is precluded because doing so would change the
size of the ctf_sect_t, which would be an ABI break. So, instead, allow
the endianness of the symtab to be set after open time, by calling one
of the two new API functions ctf_symsect_endianness (for ctf_dict_t's)
or ctf_arc_symsect_endianness (for entire ctf_archive_t's). libctf
calls these functions automatically for objects opened via any of the
BFD-aware mechanisms (ctf_bfdopen, ctf_bfdopen_ctfsect, ctf_fdopen,
ctf_open, or ctf_arc_open), but the various mechanisms that just take
raw ctf_sect_t's will assume the symtab is in native endianness and need
a later call to ctf_*symsect_endianness to adjust it if needed. (This
call is basically free if the endianness is actually native: it only
costs anything if the symtab endianness was previously guessed wrong,
and there is a symtab, and we are using it directly rather than using
symtab indexing.)
Obviously, calling ctf_lookup_by_symbol or ctf_symbol_next before the
symtab endianness is correctly set will probably give wrong answers --
but you can set it at any time as long as it is before then.
include/ChangeLog
2020-11-23 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h: Style nit: remove () on function names in comments.
(ctf_sect_t): Mention endianness concerns.
(ctf_symsect_endianness): New declaration.
(ctf_arc_symsect_endianness): Likewise.
libctf/ChangeLog
2020-11-23 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_symtab_little_endian>: New.
(struct ctf_archive_internal) <ctfi_symsect_little_endian>: Likewise.
* ctf-create.c (ctf_serialize): Adjust for new field.
* ctf-open.c (init_symtab): Note the semantics of repeated calls.
(ctf_symsect_endianness): New.
(ctf_bufopen_internal): Set ctf_symtab_little_endian suitably for
the native endianness.
(_Static_assert): Moved...
(swap_thing): ... with this...
* swap.h: ... to here.
* ctf-util.c (ctf_elf32_to_link_sym): Use it, byteswapping the
Elf32_Sym if the ctf_symtab_little_endian demands it.
(ctf_elf64_to_link_sym): Likewise swap the Elf64_Sym if needed.
* ctf-archive.c (ctf_arc_symsect_endianness): New, set the
endianness of the symtab used by the dicts in an archive.
(ctf_archive_iter_internal): Initialize to unknown (assumed native,
do not call ctf_symsect_endianness).
(ctf_dict_open_by_offset): Call ctf_symsect_endianness if need be.
(ctf_dict_open_internal): Propagate the endianness down.
(ctf_dict_open_sections): Likewise.
* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Get the endianness from the
struct bfd and pass it down to the archive.
* libctf.ver: Add ctf_symsect_endianness and
ctf_arc_symsect_endianness.
|
|
libctf has long provided ctf_getdatasect, which hands back a pointer to
the CTF section a (read-only) dict came from. But it has no such
functions to return pointers to the ELF symbol table or string table
it's working from, which is unfortunate because several libctf functions
(ctf_open, ctf_fdopen, and ctf_bfdopen) figure out which string and
symbol table to use themselves, and don't tell the user what they
decided, so the caller can't agree on which symtab to use with libctf
even if it wanted to.
Add a pair of functions to return the symtab and strtab in use. Like
ctf_getdatasect, these return ctf_sect_t structures by value, filled
with all-NULL/0 content if a symtab or strtab is not being used.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_getsymsect): New.
(ctf_getstrsect): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-open.c (ctf_getsymsect): New.
(ctf_getstrsect): Likewise.
* libctf.ver: Add them.
|
|
CTF archives may contain multiple dicts, each of which contain many
types and possibly a bunch of symtypetab entries relating to those
types: each symtypetab entry is going to appear in exactly one dict,
with the corresponding entries in the other dicts empty (either pads, or
indexed symtypetabs that do not mention that symbol). But users of
libctf usually want to get back the type associated with a symbol
without having to dig around to find out which dict that type might be
in.
This adds machinery to do that -- and since you probably want to do it
repeatedly, it adds internal caching to the ctf-archive machinery so
that iteration over archives via ctf_archive_next and repeated symbol
lookups do not have to repeatedly reopen the archive. (Iteration using
ctf_archive_iter will gain caching soon.)
Two new API functions:
ctf_dict_t *
ctf_arc_lookup_symbol (ctf_archive_t *arc, unsigned long symidx,
ctf_id_t *typep, int *errp);
This looks up the symbol with index SYMIDX in the archive ARC, returning
the dictionary in which it resides and optionally the type index as
well. Errors are returned in ERRP. The dict should be
ctf_dict_close()d when done, but is also cached inside the ctf_archive
so that the open cost is only paid once. The result of the symbol
lookup is also cached internally, so repeated lookups of the same symbol
are nearly free.
void ctf_arc_flush_caches (ctf_archive_t *arc);
Flush all the caches. Done at close time, but also available as an API
function if users want to do it by hand.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_arc_lookup_symbol): New.
(ctf_arc_flush_caches): Likewise.
* ctf.h: Document new auto-ctf_import behaviour.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (struct ctf_archive_internal) <ctfi_dicts>: New, dicts
the archive machinery has opened and cached.
<ctfi_symdicts>: New, cache of dicts containing symbols looked up.
<ctfi_syms>: New, cache of types of symbols looked up.
* ctf-archive.c (ctf_arc_close): Free them on close.
(enosym): New, flag entry for 'symbol not present'.
(ctf_arc_import_parent): New, automatically import the parent from
".ctf" if this is a child in an archive and ".ctf" is present.
(ctf_dict_open_sections): Use it.
(ctf_archive_iter_internal): Likewise.
(ctf_cached_dict_close): New, thunk around ctf_dict_close.
(ctf_dict_open_cached): New, open and cache a dict.
(ctf_arc_flush_caches): New, flush the caches.
(ctf_arc_lookup_symbol): New, look up a symbol in (all members of)
an archive, and cache the lookup.
(ctf_archive_iter): Note the new caching behaviour.
(ctf_archive_next): Use ctf_dict_open_cached.
* libctf.ver: Add ctf_arc_lookup_symbol and ctf_arc_flush_caches.
|
|
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archive where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external symtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
|
|
This is embarrassing.
The whole point of CTF is that it remains intact even after a binary is
stripped, providing a compact mapping from symbols to types for
everything in the externally-visible interface of an ELF object: it has
connections to the symbol table for that purpose, and to the string
table to avoid duplicating symbol names. So it's a shame that the hooks
I implemented last year served to hook it up to the .symtab and .strtab,
which obviously disappear on strip, leaving any accompanying the CTF
dict containing references to strings (and, soon, symbols) which don't
exist any more because their containing strtab has been vaporized. The
original Solaris design used .dynsym and .dynstr (well, actually,
.ldynsym, which has more symbols) which do not disappear. So should we.
Thankfully the work we did before serves as guide rails, and adjusting
things to use the .dynstr and .dynsym was fast and easy. The only
annoyance is that the dynsym is assembled inside elflink.c in a fairly
piecemeal fashion, so that the easiest way to get the symbols out was to
hook in before every call to swap_symbol_out (we also leave in a hook in
front of symbol additions to the .symtab because it seems plausible that
we might want to hook them in future too: for now that hook is unused).
We adjust things so that rather than being offered a whole hash table of
symbols at once, libctf is now given symbols one at a time, with st_name
indexes already resolved and pointing at their final .dynstr offsets:
it's now up to libctf to resolve these to names as needed using the
strtab info we pass it separately.
Some bits might be contentious. The ctf_new_dynstr callback takes an
elf_internal_sym, and this remains an elf_internal_sym right down
through the generic emulation layers into ldelfgen. This is no worse
than the elf_sym_strtab we used to pass down, but in the future when we
gain non-ELF CTF symtab support we might want to lower the
elf_internal_sym to some other representation (perhaps a
ctf_link_symbol) in bfd or in ldlang_ctf_new_dynsym. We rename the
'apply_strsym' hooks to 'acquire_strings' instead, becuse they no longer
have anything to do with symbols.
There are some API changes to pieces of API which are technically public
but actually totally unused by anything and/or unused by anything but ld
so they can change freely: the ctf_link_symbol gains new fields to allow
symbol names to be given as strtab offsets as well as strings, and a
symidx so that the symbol index can be passed in. ctf_link_shuffle_syms
loses its callback parameter: the idea now is that linkers call the new
ctf_link_add_linker_symbol for every symbol in .dynsym, feed in all the
strtab entries with ctf_link_add_strtab, and then a call to
ctf_link_shuffle_syms will apply both and arrange to use them to reorder
the CTF symtab at CTF serialization time (which is coming in the next
commit).
Inside libctf we have a new preamble flag CTF_F_DYNSTR which is always
set in v3-format CTF dicts from this commit forwards: CTF dicts without
this flag are associated with .strtab like they used to be, so that old
dicts' external strings don't turn to garbage when loaded by new libctf.
Dicts with this flag are associated with .dynstr and .dynsym instead.
(The flag is not the next in sequence because this commit was written
quite late: the missing flags will be filled in by the next commit.)
Tests forthcoming in a later commit in this series.
bfd/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* elflink.c (elf_finalize_dynstr): Call examine_strtab after
dynstr finalization.
(elf_link_swap_symbols_out): Don't call it here. Call
ctf_new_symbol before swap_symbol_out.
(elf_link_output_extsym): Call ctf_new_dynsym before
swap_symbol_out.
(bfd_elf_final_link): Likewise.
* elf.c (swap_out_syms): Pass in bfd_link_info. Call
ctf_new_symbol before swap_symbol_out.
(_bfd_elf_compute_section_file_positions): Adjust.
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* readelf.c (dump_section_as_ctf): Use .dynsym and .dynstr, not
.symtab and .strtab.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* bfdlink.h (struct elf_sym_strtab): Replace with...
(struct elf_internal_sym): ... this.
(struct bfd_link_callbacks) <examine_strtab>: Take only a
symstrtab argument.
<ctf_new_symbol>: New.
<ctf_new_dynsym>: Likewise.
* ctf-api.h (struct ctf_link_sym) <st_symidx>: New.
<st_nameidx>: Likewise.
<st_nameidx_set>: Likewise.
(ctf_link_iter_symbol_f): Removed.
(ctf_link_shuffle_syms): Remove most parameters, just takes a
ctf_dict_t now.
(ctf_link_add_linker_symbol): New, split from
ctf_link_shuffle_syms.
* ctf.h (CTF_F_DYNSTR): New.
(CTF_F_MAX): Adjust.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldelfgen.c (struct ctf_strsym_iter_cb_arg): Rename to...
(struct ctf_strtab_iter_cb_arg): ... this, changing fields:
<syms>: Remove.
<symcount>: Remove.
<symstrtab>: Rename to...
<strtab>: ... this.
(ldelf_ctf_strtab_iter_cb): Adjust.
(ldelf_ctf_symbols_iter_cb): Remove.
(ldelf_new_dynsym_for_ctf): New, tell libctf about a single
symbol.
(ldelf_examine_strtab_for_ctf): Rename to...
(ldelf_acquire_strings_for_ctf): ... this, only doing the strtab
portion and not symbols.
* ldelfgen.h: Adjust declarations accordingly.
* ldemul.c (ldemul_examine_strtab_for_ctf): Rename to...
(ldemul_acquire_strings_for_ctf): ... this.
(ldemul_new_dynsym_for_ctf): New.
* ldemul.h: Adjust declarations accordingly.
* ldlang.c (ldlang_ctf_apply_strsym): Rename to...
(ldlang_ctf_acquire_strings): ... this.
(ldlang_ctf_new_dynsym): New.
(lang_write_ctf): Call ldemul_new_dynsym_for_ctf with NULL to do
the actual symbol shuffle.
* ldlang.h (struct elf_strtab_hash): Adjust accordingly.
* ldmain.c (bfd_link_callbacks): Wire up new/renamed callbacks.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_link_shuffle_syms): Adjust.
(ctf_link_add_linker_symbol): New, unimplemented stub.
* libctf.ver: Add it.
* ctf-create.c (ctf_serialize): Set CTF_F_DYNSTR on newly-serialized
dicts.
* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Check for the flag: open the
symtab/strtab if not present, dynsym/dynstr otherwise.
* ctf-archive.c (ctf_arc_bufpreamble): New, get the preamble from
some arbitrary member of a CTF archive.
* ctf-impl.h (ctf_arc_bufpreamble): Declare it.
|
|
The functions that return ctf_dict_t's given a ctf_archive_t and a name
are very clumsily named. It sounds like they return *archives*, not
dictionaries, and the names are very long and clunky. Why do we
have a ctf_arc_open_by_name when it opens a dictionary, not an archive,
and when there is no way to open a dictionary in any other way? The
answer is purely internal: the function is located in ctf-archive.c,
and everything in there was called ctf_arc_*, and there is another
way to open a dict (by offset in the archive), that is internal to
ctf-archive.c and that nothing else can call.
This is clearly bad naming. The internal organization of the source tree
should not dictate public API names!
So rename things (keeping the old, bad names for compatibility), and
adjust all users. You now open a dict using ctf_dict_open, and
open it giving ELF sections via ctf_dict_open_sections.
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf): Use ctf_dict_open, not
ctf_arc_open_by_name.
* readelf.c (dump_section_as_ctf): Likewise.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c (elfctf_build_psymtabs): Use ctf_dict_open, not
ctf_arc_open_by_name.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_arc_open_by_name): Rename to...
(ctf_dict_open): ... this, keeping compatibility function.
(ctf_arc_open_by_name_sections): Rename to...
(ctf_dict_open_sections): ... this, keeping compatibility function.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-archive.c (ctf_arc_open_by_offset): Rename to...
(ctf_dict_open_by_offset): ... this. Adjust callers.
(ctf_arc_open_by_name_internal): Rename to...
(ctf_dict_open_internal): ... this. Adjust callers.
(ctf_arc_open_by_name_sections): Rename to...
(ctf_dict_open_sections): ... this, keeping compatibility function.
(ctf_arc_open_by_name): Rename to...
(ctf_dict_open): ... this, keeping compatibility function.
* libctf.ver: New functions added.
* ctf-link.c (ctf_link_one_input_archive): Adjusted accordingly.
(ctf_link_deduplicating_open_inputs): Likewise.
|
|
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a 'CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldeuml.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict): ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
|
|
The CTF variables section (containing variables that have no
corresponding symtab entries) can cause the string table to get very
voluminous if the names of variables are long. Some callers want to
filter out particular variables they know they won't need.
So add a "variable filter" callback that does that: it's passed the name
of the variable and a corresponding ctf_file_t / ctf_id_t pair, and
should return 1 to filter it out.
ld doesn't use this machinery yet, but we could easily add it later if
desired. (But see later for a commit that turns off CTF variable-
section linking in ld entirely by default.)
include/
* ctf-api.h (ctf_link_variable_filter_t): New.
(ctf_link_set_variable_filter): Likewise.
libctf/
* libctf.ver (ctf_link_set_variable_filter): Add.
* ctf-impl.h (ctf_file_t) <ctf_link_variable_filter>: New.
<ctf_link_variable_filter_arg>: Likewise.
* ctf-create.c (ctf_serialize): Adjust.
* ctf-link.c (ctf_link_set_variable_filter): New, set it.
(ctf_link_one_variable): Call it if set.
|
|
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name> Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjuwt accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjut for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
|
|
This commit adds a long-missing piece of infrastructure to libctf: the
ability to report errors and warnings using all the power of printf,
rather than being restricted to one errno value. Internally, libctf
calls ctf_err_warn() to add errors and warnings to a list: a new
iterator ctf_errwarning_next() then consumes this list one by one and
hands it to the caller, which can free it. New errors and warnings are
added until the list is consumed by the caller or the ctf_file_t is
closed, so you can dump them at intervals. The caller can of course
choose to print only those warnings it wants. (I am not sure whether we
want objdump, readelf or ld to print warnings or not: right now I'm
printing them, but maybe we only want to print errors? This entirely
depends on whether warnings are voluminous things describing e.g. the
inability to emit single types because of name clashes or something.
There are no users of this infrastructure yet, so it's hard to say.)
There is no internationalization here yet, but this at least adds a
place where internationalization can be added, to one of
ctf_errwarning_next or ctf_err_warn.
We also provide a new ctf_assert() function which uses this
infrastructure to provide non-fatal assertion failures while emitting an
assert-like string to the caller: to save space and avoid needlessly
duplicating unchanging strings, the assertion test is inlined but the
print-things-out failure case is not. All assertions in libctf will be
converted to use this machinery in future commits and propagate
assertion-failure errors up, so that the linker in particular cannot be
killed by libctf assertion failures when it could perfectly well just
print warnings and drop the CTF section.
include/
* ctf-api.h (ECTF_INTERNAL): Adjust error text.
(ctf_errwarning_next): New.
libctf/
* ctf-impl.h (ctf_assert): New.
(ctf_err_warning_t): Likewise.
(ctf_file_t) <ctf_errs_warnings>: Likewise.
(ctf_err_warn): New prototype.
(ctf_assert_fail_internal): Likewise.
* ctf-inlines.h (ctf_assert_internal): Likewise.
* ctf-open.c (ctf_file_close): Free ctf_errs_warnings.
* ctf-create.c (ctf_serialize): Copy it on serialization.
* ctf-subr.c (ctf_err_warn): New, add an error/warning.
(ctf_errwarning_next): New iterator, free and pass back
errors/warnings in succession.
* libctf.ver (ctf_errwarning_next): Add.
ld/
* ldlang.c (lang_ctf_errs_warnings): New, print CTF errors
and warnings. Assert when libctf asserts.
(lang_merge_ctf): Call it.
(land_write_ctf): Likewise.
binutils/
* objdump.c (ctf_archive_member): Print CTF errors and warnings.
* readelf.c (dump_ctf_archive_member): Likewise.
|
|
The libctf machinery currently only provides one way to iterate over its
data structures: ctf_*_iter functions that take a callback and an arg
and repeatedly call it.
This *works*, but if you are doing a lot of iteration it is really quite
inconvenient: you have to package up your local variables into
structures over and over again and spawn lots of little functions even
if it would be clearer in a single run of code. Look at ctf-string.c
for an extreme example of how unreadable this can get, with
three-line-long functions proliferating wildly.
The deduplicator takes this to the Nth level. It iterates over a whole
bunch of things: if we'd had to use _iter-class iterators for all of
them there would be twenty additional functions in the deduplicator
alone, for no other reason than that the iterator API requires it.
Let's do something better. strtok_r gives us half the design: generators
in a number of other languages give us the other half.
The *_next API allows you to iterate over CTF-like entities in a single
function using a normal while loop. e.g. here we are iterating over all
the types in a dict:
ctf_next_t *i = NULL;
int *hidden;
ctf_id_t id;
while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR)
{
/* do something with 'hidden' and 'id' */
}
if (ctf_errno (fp) != ECTF_NEXT_END)
/* iteration error */
Here we are walking through the members of a struct with CTF ID
'struct_type':
ctf_next_t *i = NULL;
ssize_t offset;
const char *name;
ctf_id_t membtype;
while ((offset = ctf_member_next (fp, struct_type, &i, &name,
&membtype)) >= 0
{
/* do something with offset, name, and membtype */
}
if (ctf_errno (fp) != ECTF_NEXT_END)
/* iteration error */
Like every other while loop, this means you have access to all the local
variables outside the loop while inside it, with no need to tiresomely
package things up in structures, move the body of the loop into a
separate function, etc, as you would with an iterator taking a callback.
ctf_*_next allocates 'i' for you on first entry (when it must be NULL),
and frees and NULLs it and returns a _next-dependent flag value when the
iteration is over: the fp errno is set to ECTF_NEXT_END when the
iteartion ends normally. If you want to exit early, call
ctf_next_destroy on the iterator. You can copy iterators using
ctf_next_copy, which copies their current iteration position so you can
remember loop positions and go back to them later (or ctf_next_destroy
them if you don't need them after all).
Each _next function returns an always-likely-to-be-useful property of
the thing being iterated over, and takes pointers to parameters for the
others: with very few exceptions all those parameters can be NULLs if
you're not interested in them, so e.g. you can iterate over only the
offsets of members of a structure this way:
while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0)
If you pass an iterator in use by one iteration function to another one,
you get the new error ECTF_NEXT_WRONGFUN back; if you try to change
ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back.
Internally the ctf_next_t remembers the iteration function in use,
various sizes and increments useful for almost all iterations, then
uses unions to overlap the actual entities being iterated over to keep
ctf_next_t size down.
Iterators available in the public API so far (all tested in actual use
in the deduplicator):
/* Iterate over the members of a STRUCT or UNION, returning each member's
offset and optionally name and member type in turn. On end-of-iteration,
returns -1. */
ssize_t
ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it,
const char **name, ctf_id_t *membtype);
/* Iterate over the members of an enum TYPE, returning each enumerand's
NAME or NULL at end of iteration or error, and optionally passing
back the enumerand's integer VALue. */
const char *
ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it,
int *val);
/* Iterate over every type in the given CTF container (not including
parents), optionally including non-user-visible types, returning
each type ID and optionally the hidden flag in turn. Returns CTF_ERR
on end of iteration or error. */
ctf_id_t
ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag,
int want_hidden);
/* Iterate over every variable in the given CTF container, in arbitrary
order, returning the name and type of each variable in turn. The
NAME argument is not optional. Returns CTF_ERR on end of iteration
or error. */
ctf_id_t
ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name);
/* Iterate over all CTF files in an archive, returning each dict in turn as a
ctf_file_t, and NULL on error or end of iteration. It is the caller's
responsibility to close it. Parent dicts may be skipped. Regardless of
whether they are skipped or not, the caller must ctf_import the parent if
need be. */
ctf_file_t *
ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it,
const char **name, int skip_parent, int *errp);
ctf_label_next is prototyped but not implemented yet.
include/
* ctf-api.h (ECTF_NEXT_END): New error.
(ECTF_NEXT_WRONGFUN): Likewise.
(ECTF_NEXT_WRONGFP): Likewise.
(ECTF_NERR): Adjust.
(ctf_next_t): New.
(ctf_next_create): New prototype.
(ctf_next_destroy): Likewise.
(ctf_next_copy): Likewise.
(ctf_member_next): Likewise.
(ctf_enum_next): Likewise.
(ctf_type_next): Likewise.
(ctf_label_next): Likewise.
(ctf_variable_next): Likewise.
libctf/
* ctf-impl.h (ctf_next): New.
(ctf_get_dict): New prototype.
* ctf-lookup.c (ctf_get_dict): New, split out of...
(ctf_lookup_by_id): ... here.
* ctf-util.c (ctf_next_create): New.
(ctf_next_destroy): New.
(ctf_next_copy): New.
* ctf-types.c (includes): Add <assert.h>.
(ctf_member_next): New.
(ctf_enum_next): New.
(ctf_type_iter): Document the lack of iteration over parent
types.
(ctf_type_next): New.
(ctf_variable_next): New.
* ctf-archive.c (ctf_archive_next): New.
* libctf.ver: Add new public functions.
|
|
This allows you to bump the refcount on a ctf_file_t, so that you can
smuggle it out of iterators which open and close the ctf_file_t for you
around the loop body (like ctf_archive_iter).
You still can't use this to preserve a ctf_file_t for longer than the
lifetime of its containing entity (e.g. ctf_archive).
include/
* ctf-api.h (ctf_ref): New.
libctf/
* libctf.ver (ctf_ref): New.
* ctf-open.c (ctf_ref): Implement it.
|
|
Another count that was otherwise unavailable without doing expensive
operations.
include/
* ctf-api.h (ctf_archive_count): New.
libctf/
* ctf-archive.c (ctf_archive_count): New.
* libctf.ver: New public function.
|
|
This returns the number of members in a struct or union, or the number
of enumerations in an enum. (This was only available before now by
iterating across every member, but it can be returned much faster than
that.)
include/
* ctf-api.h (ctf_member_count): New.
libctf/
* ctf-types.c (ctf_member_count): New.
* libctf.ver: New public function.
|
|
This is just like ctf_type_kind, except that forwards get the
type of the thing being pointed to rather than CTF_K_FORWARD.
include/
* ctf-api.h (ctf_type_kind_forwarded): New.
libctf/
* ctf-types.c (ctf_type_kind_forwarded): New.
|
|
We already have a function ctf_type_aname_raw, which returns the raw
name of a type with no decoration for structures or arrays or anything
like that: just the underlying name of whatever it is that's being
ultimately pointed at.
But this can be inconvenient to use, becauswe it always allocates new
storage for the string and copies it in, so it can potentially fail.
Add ctf_type_name_raw, which just returns the string directly out of
libctf's guts: it will live until the ctf_file_t is closed (if we later
gain the ability to remove types from writable dicts, it will live as
long as the type lives).
Reimplement ctf_type_aname_raw in terms of it.
include/
* ctf-api.c (ctf_type_name_raw): New.
libctf/
* ctf-types.c (ctf_type_name_raw): New.
(ctf_type_aname_raw): Reimplement accordingly.
|
|
objdump and readelf have one major CTF-related behavioural difference:
objdump can read .ctf sections that contain CTF archives and extract and
dump their members, while readelf cannot. Since the linker often emits
CTF archives, this means that readelf intermittently and (from the
user's perspective) randomly fails to read CTF in files that ld emits,
with a confusing error message wrongly claiming that the CTF content is
corrupt. This is purely because the archive-opening code in libctf was
needlessly tangled up with the BFD code, so readelf couldn't use it.
Here, we disentangle it, moving ctf_new_archive_internal from
ctf-open-bfd.c into ctf-archive.c and merging it with the helper
function in ctf-archive.c it was already using. We add a new public API
function ctf_arc_bufopen, that looks very like ctf_bufopen but returns
an archive given suitable section data rather than a ctf_file_t: the
archive is a ctf_archive_t, so it can be called on raw CTF dictionaries
(with no archive present) and will return a single-member synthetic
"archive".
There is a tiny lifetime tweak here: before now, the archive code could
assume that the symbol section in the ctf_archive_internal wrapper
structure was always owned by BFD if it was present and should always be
freed: now, the caller can pass one in via ctf_arc_bufopen, wihch has
the usual lifetime rules for such sections (caller frees): so we add an
extra field to track whether this is an internal call from ctf-open-bfd,
in which case we still free the symbol section.
include/
* ctf-api.h (ctf_arc_bufopen): New.
libctf/
* ctf-impl.h (ctf_new_archive_internal): Declare.
(ctf_arc_bufopen): Remove.
(ctf_archive_internal) <ctfi_free_symsect>: New.
* ctf-archive.c (ctf_arc_close): Use it.
(ctf_arc_bufopen): Fuse into...
(ctf_new_archive_internal): ... this, moved across from...
* ctf-open-bfd.c: ... here.
(ctf_bfdopen_ctfsect): Use ctf_arc_bufopen.
* libctf.ver: Add it.
binutils/
* readelf.c (dump_section_as_ctf): Support .ctf archives using
ctf_arc_bufopen. Automatically load the .ctf member of such
archives as the parent of all other members, unless specifically
overridden via --ctf-parent. Split out dumping code into...
(dump_ctf_archive_member): ... here, as in objdump, and call
it once per archive member.
(dump_ctf_indent_lines): Code style fix.
|
|
|
|
This lets other programs read and write CTF-format data.
Two versioned shared libraries are created: libctf.so and
libctf-nobfd.so. They contain identical content except that
libctf-nobfd.so contains no references to libbfd and does not implement
ctf_open, ctf_fdopen, ctf_bfdopen or ctf_bfdopen_ctfsect, so it can be
used by programs that cannot use BFD, like readelf.
The soname major version is presently .0 until the linker API
stabilizes, when it will flip to .1 and hopefully never change again.
New in v3.
v4: libtoolize and turn into a pair of shared libraries. Drop
--enable-install-ctf: now controlled by --enable-shared and
--enable-install-libbfd, like everything else.
v5: Add ../bfd to ACLOCAL_AMFLAGS and AC_CONFIG_MACRO_DIR. Fix tabdamage.
* Makefile.def (host_modules): libctf is no longer no_install.
* Makefile.in: Regenerated.
libctf/
* configure.ac (AC_DISABLE_SHARED): New, like opcodes/.
(LT_INIT): Likewise.
(AM_INSTALL_LIBBFD): Likewise.
(dlopen): Note why this is necessary in a comment.
(SHARED_LIBADD): Initialize for possibly-PIC libiberty: derived from
opcodes/.
(SHARED_LDFLAGS): Likewise.
(BFD_LIBADD): Likewise, for libbfd.
(BFD_DEPENDENCIES): Likewise.
(VERSION_FLAGS): Initialize, using a version script if ld supports
one, or libtool -export-symbols-regex otherwise.
(AC_CONFIG_MACRO_DIR): Add ../BFD.
* Makefile.am (ACLOCAL_AMFLAGS): Likewise.
(INCDIR): New.
(AM_CPPFLAGS): Use $(srcdir), not $(top_srcdir).
(noinst_LIBRARIES): Replace with...
[INSTALL_LIBBFD] (lib_LTLIBRARIES): This, or...
[!INSTALL_LIBBFD] (noinst_LTLIBRARIES): ... this, mentioning new
libctf-nobfd.la as well.
[INSTALL_LIBCTF] (include_HEADERS): Add the CTF headers.
[!INSTALL_LIBCTF] (include_HEADERS): New, empty.
(libctf_a_SOURCES): Rename to...
(libctf_nobfd_la_SOURCES): ... this, all of libctf other than
ctf-open-bfd.c.
(libctf_la_SOURCES): Now derived from libctf_nobfd_la_SOURCES,
with ctf-open-bfd.c added.
(libctf_nobfd_la_LIBADD): New, using @SHARED_LIBADD@.
(libctf_la_LIBADD): New, using @BFD_LIBADD@ as well.
(libctf_la_DEPENDENCIES): New, using @BFD_DEPENDENCIES@.
* Makefile.am [INSTALL_LIBCTF]: Use it.
* aclocal.m4: Add ../bfd/acinclude.m4, ../config/acx.m4, and the
libtool macros.
* libctf.ver: New, everything is version LIBCTF_1.0 currently (even
the unstable components).
* Makefile.in: Regenerated.
* config.h.in: Likewise.
* configure: Likewise.
binutils/
* Makefile.am (LIBCTF): Mention the .la file.
(LIBCTF_NOBFD): New.
(readelf_DEPENDENCIES): Use it.
(readelf_LDADD): Likewise.
* Makefile.in: Regenerated.
ld/
* configure.ac (TESTCTFLIB): Set to the .so or .a, like TESTBFDLIB.
* Makefile.am (TESTCTFLIB): Use it.
(LIBCTF): Use the .la file.
(check-DEJAGNU): Use it.
* Makefile.in: Regenerated.
* configure: Likewise.
include/
* ctf-api.h: Note the instability of the ctf_link interfaces.
|