Age | Commit message (Collapse) | Author | Files | Lines |
|
When you look up a type by name using ctf_lookup_by_name, in most cases
libctf can just strip off any qualifiers and look for the name, but for
pointer types this doesn't work, since the caller will want the pointer
type itself. But pointer types are nameless, and while they cite the
types they point to, looking up a type by name requires a link going the
*other way*, from the type pointed to to the pointer type that points to
it.
libctf has always built this up at open time: ctf_ptrtab is an array of
type indexes pointing from the index of every type to the index of the
type that points to it. But because it is built up at open time (and
because it uses type indexes and not type IDs) it is restricted to
working within a single dict and ignoring parent/child
relationships. This is normally invisible, unless you manage to get a
dict with a type in the parent but the only pointer to it in a child.
The ctf_ptrtab will not track this relationship, so lookups of this
pointer type by name will fail. Since which type is in the parent and
which in the child is largely opaque to the user (which goes where is up
to the deduplicator, and it can and does reshuffle things to save
space), this leads to a very bad user experience, with an
obviously-visible pointer type which ctf_lookup_by_name claims doesn't
exist.
The fix is to have another array, ctf_pptrtab, which is populated in
child dicts: like the parent's ctf_ptrtab, it has one element per type
in the parent, but is all zeroes except for those types which are
pointed to by types in the child: so it maps parent dict indices to
child dict indices. The array is grown, and new child types scanned,
whenever a lookup happens and new types have been added to the child
since the last time a lookup happened that might need the pptrtab.
(So for non-writable dicts, this only happens once, since new types
cannot be added to non-writable dicts at all.)
Since this introduces new complexity (involving updating only part of
the ctf_pptrtab) which is only seen when a writable dict is in use, we
introduce a new libctf-writable testsuite that contains lookup tests
with no corresponding CTF-containing .c files (which can thus be run
even on platforms with no .ctf-section support in the linker yet), and
add a test to check that creation of pointers in children to types in
parents and a following lookup by name works as expected. The non-
writable case is tested in a new libctf-regression testsuite which is
used to track now-fixed outright bugs in libctf.
libctf/ChangeLog
2021-01-05 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_pptrtab>: New.
<ctf_pptrtab_len>: New.
<ctf_pptrtab_typemax>: New.
* ctf-create.c (ctf_serialize): Update accordingly.
(ctf_add_reftype): Note that we don't need to update pptrtab here,
despite updating ptrtab.
* ctf-open.c (ctf_dict_close): Destroy the pptrtab.
(ctf_import): Likewise.
(ctf_import_unref): Likewise.
* ctf-lookup.c (grow_pptrtab): New.
(refresh_pptrtab): New, update a pptrtab.
(ctf_lookup_by_name): Turn into a wrapper around (and rename to)...
(ctf_lookup_by_name_internal): ... this: construct the pptrtab, and
use it in addition to the parent's ptrtab when parent dicts are
searched.
* testsuite/libctf-regression/regression.exp: New testsuite for
regression tests.
* testsuite/libctf-regression/pptrtab*: New test.
* testsuite/libctf-writable/writable.exp: New testsuite for tests of
writable CTF dicts.
* testsuite/libctf-writable/pptrtab*: New test.
|
|
libctf has no intrinsic support for the GCC unnamed structure member
extension. This principally means that you can't look up named members
inside unnamed struct or union members via ctf_member_info: you have to
tiresomely find out the type ID of the unnamed members via iteration,
then look in each of these.
This is ridiculous. Fix it by extending ctf_member_info so that it
recurses into unnamed members for you: this is still unambiguous because
GCC won't let you create ambiguously-named members even in the presence
of this extension.
For consistency, and because the release hasn't happened and we can
still do this, break the ctf_member_next API and add flags: we specify
one flag, CTF_MN_RECURSE, which if set causes ctf_member_next to
automatically recurse into unnamed members for you, returning not only
the members themselves but all their contained members, so that you can
use ctf_member_next to identify every member that it would be valid to
call ctf_member_info with.
New lookup tests are added for all of this.
include/ChangeLog
2021-01-05 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_MN_RECURSE): New.
(ctf_member_next): Add flags argument.
libctf/ChangeLog
2021-01-05 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (struct ctf_next) <u.ctn_next>: Move to...
<ctn_next>: ... here.
* ctf-util.c (ctf_next_destroy): Unconditionally destroy it.
* ctf-lookup.c (ctf_symbol_next): Adjust accordingly.
* ctf-types.c (ctf_member_iter): Reimplement in terms of...
(ctf_member_next): ... this. Support recursive unnamed member
iteration (off by default).
(ctf_member_info): Look up members in unnamed sub-structs.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_member_next call.
(ctf_dedup_emit_struct_members): Likewise.
* testsuite/libctf-lookup/struct-iteration-ctf.c: Test empty unnamed
members, and a normal member after the end.
* testsuite/libctf-lookup/struct-iteration.c: Verify that
ctf_member_count is consistent with the number of successful returns
from a non-recursive ctf_member_next.
* testsuite/libctf-lookup/struct-iteration-*: New, test iteration
over struct members.
* testsuite/libctf-lookup/struct-lookup.c: New test.
* testsuite/libctf-lookup/struct-lookup.lk: New test.
|
|
|
|
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archive where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external symtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
|
|
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a 'CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldeuml.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict): ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
|
|
isspace() notoriously takes an int, not a char. Cast uses
appropriately.
libctf/
* ctf-lookup.c (ctf_lookup_by_name): Adjust.
|
|
The libctf machinery currently only provides one way to iterate over its
data structures: ctf_*_iter functions that take a callback and an arg
and repeatedly call it.
This *works*, but if you are doing a lot of iteration it is really quite
inconvenient: you have to package up your local variables into
structures over and over again and spawn lots of little functions even
if it would be clearer in a single run of code. Look at ctf-string.c
for an extreme example of how unreadable this can get, with
three-line-long functions proliferating wildly.
The deduplicator takes this to the Nth level. It iterates over a whole
bunch of things: if we'd had to use _iter-class iterators for all of
them there would be twenty additional functions in the deduplicator
alone, for no other reason than that the iterator API requires it.
Let's do something better. strtok_r gives us half the design: generators
in a number of other languages give us the other half.
The *_next API allows you to iterate over CTF-like entities in a single
function using a normal while loop. e.g. here we are iterating over all
the types in a dict:
ctf_next_t *i = NULL;
int *hidden;
ctf_id_t id;
while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR)
{
/* do something with 'hidden' and 'id' */
}
if (ctf_errno (fp) != ECTF_NEXT_END)
/* iteration error */
Here we are walking through the members of a struct with CTF ID
'struct_type':
ctf_next_t *i = NULL;
ssize_t offset;
const char *name;
ctf_id_t membtype;
while ((offset = ctf_member_next (fp, struct_type, &i, &name,
&membtype)) >= 0
{
/* do something with offset, name, and membtype */
}
if (ctf_errno (fp) != ECTF_NEXT_END)
/* iteration error */
Like every other while loop, this means you have access to all the local
variables outside the loop while inside it, with no need to tiresomely
package things up in structures, move the body of the loop into a
separate function, etc, as you would with an iterator taking a callback.
ctf_*_next allocates 'i' for you on first entry (when it must be NULL),
and frees and NULLs it and returns a _next-dependent flag value when the
iteration is over: the fp errno is set to ECTF_NEXT_END when the
iteartion ends normally. If you want to exit early, call
ctf_next_destroy on the iterator. You can copy iterators using
ctf_next_copy, which copies their current iteration position so you can
remember loop positions and go back to them later (or ctf_next_destroy
them if you don't need them after all).
Each _next function returns an always-likely-to-be-useful property of
the thing being iterated over, and takes pointers to parameters for the
others: with very few exceptions all those parameters can be NULLs if
you're not interested in them, so e.g. you can iterate over only the
offsets of members of a structure this way:
while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0)
If you pass an iterator in use by one iteration function to another one,
you get the new error ECTF_NEXT_WRONGFUN back; if you try to change
ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back.
Internally the ctf_next_t remembers the iteration function in use,
various sizes and increments useful for almost all iterations, then
uses unions to overlap the actual entities being iterated over to keep
ctf_next_t size down.
Iterators available in the public API so far (all tested in actual use
in the deduplicator):
/* Iterate over the members of a STRUCT or UNION, returning each member's
offset and optionally name and member type in turn. On end-of-iteration,
returns -1. */
ssize_t
ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it,
const char **name, ctf_id_t *membtype);
/* Iterate over the members of an enum TYPE, returning each enumerand's
NAME or NULL at end of iteration or error, and optionally passing
back the enumerand's integer VALue. */
const char *
ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it,
int *val);
/* Iterate over every type in the given CTF container (not including
parents), optionally including non-user-visible types, returning
each type ID and optionally the hidden flag in turn. Returns CTF_ERR
on end of iteration or error. */
ctf_id_t
ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag,
int want_hidden);
/* Iterate over every variable in the given CTF container, in arbitrary
order, returning the name and type of each variable in turn. The
NAME argument is not optional. Returns CTF_ERR on end of iteration
or error. */
ctf_id_t
ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name);
/* Iterate over all CTF files in an archive, returning each dict in turn as a
ctf_file_t, and NULL on error or end of iteration. It is the caller's
responsibility to close it. Parent dicts may be skipped. Regardless of
whether they are skipped or not, the caller must ctf_import the parent if
need be. */
ctf_file_t *
ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it,
const char **name, int skip_parent, int *errp);
ctf_label_next is prototyped but not implemented yet.
include/
* ctf-api.h (ECTF_NEXT_END): New error.
(ECTF_NEXT_WRONGFUN): Likewise.
(ECTF_NEXT_WRONGFP): Likewise.
(ECTF_NERR): Adjust.
(ctf_next_t): New.
(ctf_next_create): New prototype.
(ctf_next_destroy): Likewise.
(ctf_next_copy): Likewise.
(ctf_member_next): Likewise.
(ctf_enum_next): Likewise.
(ctf_type_next): Likewise.
(ctf_label_next): Likewise.
(ctf_variable_next): Likewise.
libctf/
* ctf-impl.h (ctf_next): New.
(ctf_get_dict): New prototype.
* ctf-lookup.c (ctf_get_dict): New, split out of...
(ctf_lookup_by_id): ... here.
* ctf-util.c (ctf_next_create): New.
(ctf_next_destroy): New.
(ctf_next_copy): New.
* ctf-types.c (includes): Add <assert.h>.
(ctf_member_next): New.
(ctf_enum_next): New.
(ctf_type_iter): Document the lack of iteration over parent
types.
(ctf_type_next): New.
(ctf_variable_next): New.
* ctf-archive.c (ctf_archive_next): New.
* libctf.ver: Add new public functions.
|
|
|
|
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
|
|
The first two of these allow you to get function type info and args out
of the types section give a type ID: astonishingly, this was missing
from libctf before now: so even though types of kind CTF_K_FUNCTION were
supported, you couldn't find out anything about them. (The existing
ctf_func_info and ctf_func_args only allow you to get info about
functions in the function section, i.e. given symbol table indexes, not
type IDs.)
The second of these allows you to get the raw undecorated name out of
the CTF section (strdupped for safety) without traversing subtypes to
build a full C identifier out of it. It's useful for things that are
already tracking the type kind etc and just need an unadorned name.
include/
* ctf-api.h (ECTF_NOTFUNC): Fix description.
(ctf_func_type_info): New.
(ctf_func_type_args): Likewise.
libctf/
* ctf-types.c (ctf_type_aname_raw): New.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-error.c (_ctf_errlist): Fix description.
|
|
Not all platforms have it. Use libiberty xstrndup() instead.
(The include of libiberty.h happens in an unusual place due to the
requirements of synchronization of most source files between this
project and another that does not use libiberty. It serves to pull
libiberty.h in for all source files in libctf/, which does the trick.)
Tested on x86_64-pc-linux-gnu, x86_64-unknown-freebsd12.0,
sparc-sun-solaris2.11, i686-pc-cygwin, i686-w64-mingw32.
libctf/
* ctf-decls.h: Include <libiberty.h>.
* ctf-lookup.c (ctf_lookup_by_name): Call xstrndup(), not strndup().
|
|
- Use of nonportable <endian.h>
- Use of qsort_r
- Use of zlib without appropriate magic to pull in the binutils zlib
- Use of off64_t without checking (fixed by dropping the unused fields
that need off64_t entirely)
- signedness problems due to long being too short a type on 32-bit
platforms: ctf_id_t is now 'unsigned long', and CTF_ERR must be
used only for functions that return ctf_id_t
- One lingering use of bzero() and of <sys/errno.h>
All fixed, using code from gnulib where possible.
Relatedly, set cts_size in a couple of places it was missed
(string table and symbol table loading upon ctf_bfdopen()).
binutils/
* objdump.c (make_ctfsect): Drop cts_type, cts_flags, and
cts_offset.
* readelf.c (shdr_to_ctf_sect): Likewise.
include/
* ctf-api.h (ctf_sect_t): Drop cts_type, cts_flags, and cts_offset.
(ctf_id_t): This is now an unsigned type.
(CTF_ERR): Cast it to ctf_id_t. Note that it should only be used
for ctf_id_t-returning functions.
libctf/
* Makefile.am (ZLIB): New.
(ZLIBINC): Likewise.
(AM_CFLAGS): Use them.
(libctf_a_LIBADD): New, for LIBOBJS.
* configure.ac: Check for zlib, endian.h, and qsort_r.
* ctf-endian.h: New, providing htole64 and le64toh.
* swap.h: Code style fixes.
(bswap_identity_64): New.
* qsort_r.c: New, from gnulib (with one added #include).
* ctf-decls.h: New, providing a conditional qsort_r declaration,
and unconditional definitions of MIN and MAX.
* ctf-impl.h: Use it. Do not use <sys/errno.h>.
(ctf_set_errno): Now returns unsigned long.
* ctf-util.c (ctf_set_errno): Adjust here too.
* ctf-archive.c: Use ctf-endian.h.
(ctf_arc_open_by_offset): Use memset, not bzero. Drop cts_type,
cts_flags and cts_offset.
(ctf_arc_write): Drop debugging dependent on the size of off_t.
* ctf-create.c: Provide a definition of roundup if not defined.
(ctf_create): Drop cts_type, cts_flags and cts_offset.
(ctf_add_reftype): Do not check if type IDs are below zero.
(ctf_add_slice): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_member_offset): Cast error-returning ssize_t's to size_t
when known error-free. Drop CTF_ERR usage for functions returning
int.
(ctf_add_member_encoded): Drop CTF_ERR usage for functions returning
int.
(ctf_add_variable): Likewise.
(enumcmp): Likewise.
(enumadd): Likewise.
(membcmp): Likewise.
(ctf_add_type): Likewise. Cast error-returning ssize_t's to size_t
when known error-free.
* ctf-dump.c (ctf_is_slice): Drop CTF_ERR usage for functions
returning int: use CTF_ERR for functions returning ctf_type_id.
(ctf_dump_label): Likewise.
(ctf_dump_objts): Likewise.
* ctf-labels.c (ctf_label_topmost): Likewise.
(ctf_label_iter): Likewise.
(ctf_label_info): Likewise.
* ctf-lookup.c (ctf_func_args): Likewise.
* ctf-open.c (upgrade_types): Cast to size_t where appropriate.
(ctf_bufopen): Likewise. Use zlib types as needed.
* ctf-types.c (ctf_member_iter): Drop CTF_ERR usage for functions
returning int.
(ctf_enum_iter): Likewise.
(ctf_type_size): Likewise.
(ctf_type_align): Likewise. Cast to size_t where appropriate.
(ctf_type_kind_unsliced): Likewise.
(ctf_type_kind): Likewise.
(ctf_type_encoding): Likewise.
(ctf_member_info): Likewise.
(ctf_array_info): Likewise.
(ctf_enum_value): Likewise.
(ctf_type_rvisit): Likewise.
* ctf-open-bfd.c (ctf_bfdopen): Drop cts_type, cts_flags and
cts_offset.
(ctf_simple_open): Likewise.
(ctf_bfdopen_ctfsect): Likewise. Set cts_size properly.
* Makefile.in: Regenerate.
* aclocal.m4: Likewise.
* config.h: Likewise.
* configure: Likewise.
|
|
These functions allow you to look up types given a name in a simple
subset of C declarator syntax (no function pointers), to look up the
types of variables given a name, and to look up the types of data
objects and the type signatures of functions given symbol table offsets.
(Despite its name, one function in this commit, ctf_lookup_symbol_name(),
is for the internal use of libctf only, and does not appear in any
public header files.)
libctf/
* ctf-lookup.c (isqualifier): New.
(ctf_lookup_by_name): Likewise.
(struct ctf_lookup_var_key): Likewise.
(ctf_lookup_var): Likewise.
(ctf_lookup_variable): Likewise.
(ctf_lookup_symbol_name): Likewise.
(ctf_lookup_by_symbol): Likewise.
(ctf_func_info): Likewise.
(ctf_func_args): Likewise.
include/
* ctf-api.h (ctf_func_info): New.
(ctf_func_args): Likewise.
(ctf_lookup_by_symbol): Likewise.
(ctf_lookup_by_symbol): Likewise.
(ctf_lookup_variable): Likewise.
|
|
The CTF creation process looks roughly like (error handling elided):
int err;
ctf_file_t *foo = ctf_create (&err);
ctf_id_t type = ctf_add_THING (foo, ...);
ctf_update (foo);
ctf_*write (...);
Some ctf_add_THING functions accept other type IDs as arguments,
depending on the type: cv-quals, pointers, and structure and union
members all take other types as arguments. So do 'slices', which
let you take an existing integral type and recast it as a type
with a different bitness or offset within a byte, for bitfields.
One class of THING is not a type: "variables", which are mappings
of names (in the internal string table) to types. These are mostly
useful when encoding variables that do not appear in a symbol table
but which some external user has some other way to figure out the
address of at runtime (dynamic symbol lookup or querying a VM
interpreter or something).
You can snapshot the creation process at any point: rolling back to a
snapshot deletes all types and variables added since that point.
You can make arbitrary type queries on the CTF container during the
creation process, but you must call ctf_update() first, which
translates the growing dynamic container into a static one (this uses
the CTF opening machinery, added in a later commit), which is quite
expensive. This function must also be called after adding types
and before writing the container out.
Because addition of types involves looking up existing types, we add a
little of the type lookup machinery here, as well: only enough to
look up types in dynamic containers under construction.
libctf/
* ctf-create.c: New file.
* ctf-lookup.c: New file.
include/
* ctf-api.h (zlib.h): New include.
(ctf_sect_t): New.
(ctf_sect_names_t): Likewise.
(ctf_encoding_t): Likewise.
(ctf_membinfo_t): Likewise.
(ctf_arinfo_t): Likewise.
(ctf_funcinfo_t): Likewise.
(ctf_lblinfo_t): Likewise.
(ctf_snapshot_id_t): Likewise.
(CTF_FUNC_VARARG): Likewise.
(ctf_simple_open): Likewise.
(ctf_bufopen): Likewise.
(ctf_create): Likewise.
(ctf_add_array): Likewise.
(ctf_add_const): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_float): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_function): Likewise.
(ctf_add_integer): Likewise.
(ctf_add_slice): Likewise.
(ctf_add_pointer): Likewise.
(ctf_add_type): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_restrict): Likewise.
(ctf_add_struct): Likewise.
(ctf_add_union): Likewise.
(ctf_add_struct_sized): Likewise.
(ctf_add_union_sized): Likewise.
(ctf_add_volatile): Likewise.
(ctf_add_enumerator): Likewise.
(ctf_add_member): Likewise.
(ctf_add_member_offset): Likewise.
(ctf_add_member_encoded): Likewise.
(ctf_add_variable): Likewise.
(ctf_set_array): Likewise.
(ctf_update): Likewise.
(ctf_snapshot): Likewise.
(ctf_rollback): Likewise.
(ctf_discard): Likewise.
(ctf_write): Likewise.
(ctf_gzwrite): Likewise.
(ctf_compress_write): Likewise.
|