Age | Commit message (Collapse) | Author | Files | Lines |
|
ctf-link.c is unnecessarily confusing because ctf_link_lazy_open is
positioned near functions that have nothing to do with opening files.
Move it around, and fix some tabdamage that's crept in lately.
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_link_lazy_open): Move up in the file, to near
ctf_link_add_ctf.
* ctf-lookup.c (ctf_lookup_symbol_idx): Repair tabdamage.
(ctf_lookup_by_sym_or_name): Likewise.
* testsuite/libctf-lookup/struct-iteration.c: Likewise.
* testsuite/libctf-regression/type-add-unnamed-struct.c: Likewise.
|
|
This is a tricky one. BFD, on the linker's behalf, reports symbols to
libctf via the ctf_new_symbol and ctf_new_dynsym callbacks, which
ultimately call ctf_link_add_linker_symbol. But while this happens
after strtab offsets are finalized, it happens before the .dynstr is
actually laid out, so we can't iterate over it at this stage and
it is not clear what the reported symbols are actually called. So
a second callback, examine_strtab, is called after the .dynstr is
finalized, which calls ctf_link_add_strtab and ultimately leads
to ldelf_ctf_strtab_iter_cb being called back repeatedly until the
offset of every string in the .dynstr has been passed to libctf.
libctf can then use this to get symbol names out of the input (which
usually stores symbol types in the form of a name -> type mapping at
this stage) and extract the types of those symbols, feeding them back
into their final form as a 1:1 association with the real symtab's
STT_OBJECT and STT_FUNC symbols (with a few skipped; see
ctf_symtab_skippable).
This representation is compact, but has one problem: if libctf somehow
gets confused about the st_type of a symbol, it'll stick an entry into
the function symtypetab when it should put it into the object
symtypetab, or vice versa, and *every symbol from that one on* will have
the wrong CTF type because it's actually looking up the type for a
different symbol.
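To see why one wrong st_type poisons everything after it, here is a toy
model (not libctf code: every name in it is made up for illustration).
Each per-kind table is packed, so a symbol's slot is simply the number of
earlier symbols of the same kind. Misclassify one symbol and every later
slot of that kind shifts by one.

```c
#include <stddef.h>

/* Hypothetical toy model of a packed per-kind type table: the N'th
   function symbol in the symtab owns the N'th entry of the function
   table, so the table stores no names or symbol indexes at all.  */

enum toy_kind { TOY_OBJECT, TOY_FUNC };

/* Return the slot in the per-kind table that symbol SYMIDX maps to,
   given the kind of every symbol in the symtab, or -1 if SYMIDX is out
   of range or is not of kind WANT.  */
long
toy_table_slot (const enum toy_kind *kinds, size_t nsyms, size_t symidx,
                enum toy_kind want)
{
  long slot = 0;

  if (symidx >= nsyms || kinds[symidx] != want)
    return -1;

  /* Count the earlier symbols of the same kind.  */
  for (size_t i = 0; i < symidx; i++)
    if (kinds[i] == want)
      slot++;

  return slot;
}
```

If symbol 1 is really an object but is recorded as a function, the
function at symbol 2 moves from slot 1 to slot 2, and so does everything
behind it.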
And we have just such a bug. ctf_link_add_strtab was not taking the
refcounts of strings into consideration, so even strings that had been
eliminated from the strtab by virtue of being in objects eliminated via
--as-needed etc were being reported. This is harmful because it can
lead to multiple strings with the same apparent offset, and if the last
duplicate to be reported relates to an eliminated symbol, we look up the
wrong symbol from the input and get its type wrong: if we are unlucky and
the eliminated symbol is also of the wrong st_type, we will end up with
a corrupted symtypetab.
Thankfully the wrong-st_type case is already diagnosed by a
this-can-never-happen paranoid warning:
CTF warning: Symbol 61a added to CTF as a function but is of type 1
or the converse
CTF warning: Symbol a3 added to CTF as a data object but is of type 2
so at least we can tell when the corruption has spread to more than one
symbol's type.
Skipping zero-refcounted strings is easy: teach _bfd_elf_strtab_str to
skip them, and ldelf_ctf_strtab_iter_cb to loop over skipped strings
until it falls off the end or finds one that isn't skipped.
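A minimal sketch of that shape, assuming a much-simplified string table
(the struct and both functions below are hypothetical stand-ins, not the
real BFD or ld code): the accessor returns NULL for an unreferenced
string, and the iterator keeps stepping until it finds a live string or
runs out of entries.

```c
#include <stddef.h>

/* Hypothetical stand-in for one entry of a refcounted string table.  */
struct toy_strtab_entry
{
  const char *str;		/* The string itself.  */
  unsigned long offset;		/* Its offset in the laid-out table.  */
  unsigned int refcount;	/* Zero if every user was discarded.  */
};

/* Return the string at index IDX and store its offset in *OFFSET, or
   return NULL if the entry is unreferenced and so was never laid out.  */
const char *
toy_strtab_str (const struct toy_strtab_entry *tab, size_t idx,
                unsigned long *offset)
{
  if (tab[idx].refcount == 0)
    return NULL;

  *offset = tab[idx].offset;
  return tab[idx].str;
}

/* Iterator: step *NEXT past unreferenced entries and return the next
   live string, or NULL once the table is exhausted.  */
const char *
toy_strtab_next (const struct toy_strtab_entry *tab, size_t size,
                 size_t *next, unsigned long *offset)
{
  while (*next < size)
    {
      const char *str = toy_strtab_str (tab, (*next)++, offset);

      if (str != NULL)
        return str;
    }
  return NULL;
}
```

With this, a dead entry that happens to share an offset with a live one
is never reported, so it cannot overwrite the live string's mapping.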
bfd/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* elf-strtab.c (_bfd_elf_strtab_str): Skip strings with zero refcount.
ld/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ldelfgen.c (ldelf_ctf_strtab_iter_cb): Skip zero-refcount strings.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-create.c (symtypetab_density): Report the symbol name as
well as index in the name != object error; note the likely
consequences.
* ctf-link.c (ctf_link_shuffle_syms): Report the symbol index
as well as name.
|
|
In the "no symbols" case (commonplace for executables), we were freeing
the ctf_dynsyms using free(), instead of ctf_dynhash_destroy(), leaking
a little memory.
(This is harmless in the common case of ld usage, but libctf might be
used by persistent processes too.)
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_link_shuffle_syms): Free ctf_dynsyms properly.
|
|
A transient bug in the preceding change (fixed before commit) exposed a
new failure, of ld/testsuite/ld-ctf/diag-parname.d. This attempts to
ensure that if we link a dict with child type IDs but no attached
parent, we get a suitable ECTF_NOPARENT error. This was happening
before this commit, but only by chance, because ctf_variable_iter and
ctf_variable_next check to see if the dict they're passed is a child
dict without an associated parent. We forgot error-checking on the
ctf_variable_next call, and as a result this was concealed -- and
looking for the problem exposed a new bug.
If any of the lookups beneath ctf_dedup_hash_type fail, the CTF link
does *not* fail, but acts quite bizarrely, skipping the type but
emitting an error to the CTF error/warning log -- so the linker will
report an error, emit a partial CTF dict missing some types, and exit
with exitcode 0 as if nothing went wrong. Since ctf_dedup_hash_type is
never expected to fail in normal operation, this is surely wrong:
failures at emission time do not emit partial CTF dicts, so failures
at hashing time should not either.
So propagate the error back up.
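The intended behaviour can be sketched as follows. This is an
illustration only: toy_hash_type and toy_hash_all are hypothetical, and a
zero ID merely stands in for "some lookup failed".

```c
#include <stddef.h>

/* Hypothetical per-type hashing step: returns 0 on success or a
   negative error code on failure.  A zero ID stands in for any failed
   lookup.  */
int
toy_hash_type (unsigned long type_id)
{
  return type_id == 0 ? -1 : 0;
}

/* Hash every input type.  Stop at the first failure and hand the error
   to the caller, so that no partial output is ever emitted.  *NHASHED
   reports how many types were hashed before stopping.  */
int
toy_hash_all (const unsigned long *types, size_t ntypes, size_t *nhashed)
{
  *nhashed = 0;

  for (size_t i = 0; i < ntypes; i++)
    {
      int err = toy_hash_type (types[i]);

      if (err < 0)
        return err;		/* Propagate: do not skip and carry on.  */
      (*nhashed)++;
    }
  return 0;
}
```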
Also fix a couple of smaller bugs where we fail to properly free things
and/or propagate error codes on various rare link-time errors and
out-of-memory conditions.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-dedup.c (ctf_dedup): Pass on errors from ctf_dedup_hash_type.
Call ctf_dedup_fini properly on other errors.
(ctf_dedup_emit_type): Set the errno on dynhash insertion failure.
* ctf-link.c (ctf_link_deduplicating_per_cu): Close outputs beyond
output 0 when asserting because >1 output is found.
(ctf_link_deduplicating): Likewise, when asserting because the
shared output is not the same as the passed-in fp.
|
|
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
|
|
There is no such thing, and the comment makes no sense, and doesn't
match what the code is doing. We always want to put variables in the
same dicts as the types they relate to if at all possible.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_link_one_variable): Remove reference to
"unconflicted link mode".
|
|
The nondeduplicating CTF linker was kept around when the deduplicating
one was added so that people had something to fall back to in case the
deduplicating linker turned out to be buggy. It's now much more stable
than the nondeduplicating linker, in addition to being much faster, using much
less memory and producing much better output. In addition, while
libctf has a linker flag to invoke the nondeduplicating linker, ld does
not expose it: the only way to turn it on within ld is an intentionally-
undocumented environment variable. So we can remove it without any ABI
or user-visibility concerns (the only thing we leave around is the
CTF_LINK_NONDEDUP flag, which can easily be interpreted as "deduplicate
less", though right now it does nothing).
This lets us remove a lot of complexity associated with tracking
filenames and CU names separately (something the deduplicating linker
never bothered with, since the cunames are always reliable and ld never
hands us useful filenames anyway).
The biggest lacuna left behind is the ctf_type_mapping machinery, which
slows down deduplicating links quite a lot. We can't just ditch it
because ctf_add_type uses it: removing the slowdown from the
deduplicating linker is a job for another commit.
include/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_LINK_SHARE_DUPLICATED): Note that this might
merely change how much deduplication is done.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_create_per_cu): Drop FILENAME now that it is
always identical to CUNAME.
(ctf_link_deduplicating_one_symtypetab): Adjust.
(ctf_link_one_type): Remove.
(ctf_link_one_input_archive_member): Likewise.
(ctf_link_close_one_input_archive): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link): No longer call it. Drop CTF_LINK_NONDEDUP path.
Improve header comment a bit (dicts, not files). Adjust
ctf_create_per_cu call.
(ctf_link_deduplicating_variables): Simplify.
(ctf_link_in_member_cb_arg_t) <cu_name>: Remove.
<in_input_cu_file>: Likewise.
<in_fp_parent>: Likewise.
<done_parent>: Likewise.
(ctf_link_one_variable): Turn uses of in_file_name to in_cuname.
|
|
The variable section in a CTF dict is meant to contain the types of
variables that do not appear in the symbol table (mostly file-scope
static declarations). We implement this by having the compiler emit
all potential data symbols into both sections, then delete those
symbols from the variable section that correspond to data symbols the
linker has reported.
Unfortunately, the check for this in ctf_serialize is wrong: rather than
checking the set of linker-reported symbols, we check the set of names
in the data object symtypetab section: if the linker has reported no
symbols at all (usually if ld -r has been run, or if a non-linker
program that does not use symbol tables is calling ctf_link) this will
include every single symbol, emptying the variable section completely.
Worse, when ld -r is in use, we want to force writeout of every
symtypetab entry on the inputs, in an indexed section, whether or not
the linker has reported them, since this isn't a final link yet and the
symbol table is not finalized (and may grow more symbols than the linker
has yet reported). But the check for this is flawed too: we were
relying on ctf_link_shuffle_syms not having been called if no symbols
exist, but that function is *always* called by ld even when ld -r is in
use: ctf_link_add_linker_symbol is the one that's not called when there
are no symbols.
We clearly need to rethink this. Using the emptiness of the set of
reported symbols as a test for ld -r is just ugly: the linker already
knows if ld -r is underway and can just tell us. So add a new linker
flag CTF_LINK_NO_FILTER_REPORTED_SYMS that is set to stop the linker
filtering the symbols in the symtypetab sections using the set that the
linker has reported. The presence or absence of this flag determines
whether we emit unindexed symtypetabs. We only remove entries from the
variable section when filtering symbols, and then only if they are in
the reported symbol set, which fixes the case where the linker reports
no symbols at all.
(The negative sense of the new CTF_LINK flag is intentional: the common
case, both for ld and for simple tools that want to do a ctf_link with
no ELF symbol table in sight, is probably to filter out symbols that no
linker has reported: i.e., for the simple tools, all of them.)
There's another wrinkle, though. It is quite possible for a non-linker
to add symbols to a dict via ctf_add_*_sym and then write it out via the
ctf_write APIs: perhaps it's preparing a dict for a later linker
invocation. Right now this would not lead to anything terribly
meaningful happening: ctf_serialize just assumes it was called via
ctf_link if symbols are present. So add an (internal-to-libctf) flag
that indicates that a writeout is happening via ctf_link_write, and set
it there (propagating it to child dicts as needed). ctf_serialize can
then spot when it is not being called by a linker, and arrange to always
write out an indexed, sorted symtypetab for fastest possible future
symbol lookup by name in that case. (The writeouts done by ld -r are
unsorted, because the only thing likely to use those symtabs is the
linker, which doesn't benefit from symtypetab sorting.)
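The three writeout cases described above can be summarized as a small
decision function. This is a sketch of the policy as this message
describes it, not the code in ctf_serialize: the struct, the function
and the flag value are all placeholders.

```c
#include <stdbool.h>

/* Placeholder for CTF_LINK_NO_FILTER_REPORTED_SYMS: the value here is
   an assumption made for this sketch only.  */
#define TOY_LINK_NO_FILTER_REPORTED_SYMS 0x10

/* Hypothetical summary of what a writeout will do.  */
struct toy_writeout
{
  bool filter_syms;	/* Drop symbols the linker did not report.  */
  bool force_indexed;	/* Always emit indexed symtypetabs.  */
  bool sort_index;	/* Sort any index that is emitted by name.  */
  bool prune_vars;	/* Delete reported symbols from the variable section.  */
};

/* LINKING is true if the writeout is happening via the link machinery;
   NREPORTED is the number of symbols the linker has reported.  */
struct toy_writeout
toy_plan_writeout (bool linking, int link_flags, unsigned long nreported)
{
  struct toy_writeout plan = { false, false, false, false };

  if (!linking)
    {
      /* Not a linker: keep every symbol, indexed and sorted, for the
         fastest possible by-name lookup later.  */
      plan.force_indexed = true;
      plan.sort_index = true;
    }
  else if (link_flags & TOY_LINK_NO_FILTER_REPORTED_SYMS)
    {
      /* Non-final link (ld -r): keep every symbol, indexed, but do not
         bother sorting, since only a later link will read it.  */
      plan.force_indexed = true;
    }
  else
    {
      /* Final link: filter by the reported set, and only prune the
         variable section if the linker reported anything at all.  */
      plan.filter_syms = true;
      plan.sort_index = true;
      plan.prune_vars = nreported > 0;
    }
  return plan;
}
```

Note that the variable section is only ever pruned in the last branch,
which is what stops it being emptied when no symbols are reported.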
Tests added for all three linking cases (ld -r, ld -shared, ld), with a
bit of testsuite framework enhancement to stop it unconditionally
linking the CTF to be checked by the lookup program with -shared, so
tests can now examine CTF linked with -r or indeed with no flags at all,
though the output filename is still foo.so even in this case.
Another test added for the non-linker case that endeavours to determine
whether the symtypetab is sorted by examining the order of entries
returned from ctf_symbol_next: nobody outside libctf should rely on
this ordering, but this test is not outside libctf :)
include/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_LINK_NO_FILTER_REPORTED_SYMS): New.
ld/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_merge_ctf): Set CTF_LINK_NO_FILTER_REPORTED_SYMS
when appropriate.
libctf/ChangeLog
2021-01-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (_libctf_nonnull_): Add parameters.
(LCTF_LINKING): New flag.
(ctf_dict_t) <ctf_link_flags>: Mention it.
* ctf-link.c (ctf_link): Keep LCTF_LINKING set across call.
(ctf_write): Likewise, including in child dictionaries.
(ctf_link_shuffle_syms): Make sure ctf_dynsyms is NULL if there
are no reported symbols.
* ctf-create.c (symtypetab_delete_nonstatic_vars): Make sure
the variable has been reported as a symbol by the linker.
(symtypetab_skippable): Mention relationship between SYMFP and the
flags.
(symtypetab_density): Adjust nonnullity. Exit early if no symbols
were reported and force-indexing is off (i.e., we are doing a
final link).
(ctf_serialize): Handle the !LCTF_LINKING case by writing out an
indexed, sorted symtypetab (and allow SYMFP to be NULL in this
case). Turn sorting off if this is a non-final link. Only delete
nonstatic vars if we are filtering symbols and the linker has
reported some.
* testsuite/libctf-regression/nonstatic-var-section-ld-r*:
New test of variable and symtypetab section population when
ld -r is used.
* testsuite/libctf-regression/nonstatic-var-section-ld-executable.lk:
Likewise, when ld of an executable is used.
* testsuite/libctf-regression/nonstatic-var-section-ld.lk:
Likewise, when ld -shared alone is used.
* testsuite/libctf-regression/nonstatic-var-section-ld*.c:
Lookup programs for the above.
* testsuite/libctf-writable/symtypetab-nonlinker-writeout.*: New
test, testing survival of symbols across ctf_write paths.
* testsuite/lib/ctf-lib.exp (run_lookup_test): New option,
nonshared, suppressing linking of the SOURCE with -shared.
|
|
In the last cycle there have been various changes that have replaced
parts of the CTF format with other parts without format
compatibility. This was not a compat break, because the old format was
never accepted by any version of libctf (the not-in-official-release CTF
compiler patch was emitting an invalid func info section), but
nonetheless it can confuse users using that patch if they link together
object files and find the func info sections in the inputs silently
disappearing.
Scan the linker inputs for this problem and emit a warning if any are
found.
libctf/ChangeLog
2021-01-05 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_link_warn_outdated_inputs): New.
(ctf_link_write): Call it.
|
|
|
|
When linking fails, we delete all the generated outputs, but we fail to
remove them from the ctf_link_outputs hash we stuck them in before doing
symbol and variable section linking (which we had to do because that's
where ctf_create_per_cu, used by both, looks for them). This leaves
stale pointers to freed memory behind, and crashes soon follow.
Fix obvious.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_link_deduplicating): Clean up the ctf_link_outputs
hash on error.
|
|
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
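As a rough illustration of the new layout (a toy model with made-up
names, not the libctf reader), the function info section is just an array
of type IDs, and a lookup only has to check that the ID it finds refers
to a function type. The sketch assumes 1-based type IDs with 0 meaning
"no type".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified type section entry.  */
enum toy_type_kind { TOY_K_INTEGER, TOY_K_FUNCTION };

struct toy_type
{
  enum toy_type_kind kind;
  uint32_t return_type;		/* Only meaningful for TOY_K_FUNCTION.  */
};

/* Return the type ID of the prototype of the N'th function symbol, or 0
   if N is out of range, the entry is empty, or the entry does not refer
   to a function type.  Type IDs are 1-based in this sketch, so type ID
   K lives in TYPES[K - 1].  */
uint32_t
toy_func_type (const uint32_t *funcinfo, size_t nfuncs,
               const struct toy_type *types, size_t ntypes, size_t n)
{
  uint32_t id;

  if (n >= nfuncs)
    return 0;

  id = funcinfo[n];
  if (id == 0 || id > ntypes)
    return 0;

  return types[id - 1].kind == TOY_K_FUNCTION ? id : 0;
}
```

Because an entry is only a type ID, two functions with the same
prototype can share one entry in the type section.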
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archives where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
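The indexed lookup amounts to "sort by name once, then bsearch". A
minimal sketch, assuming the index and the type entries are merged into a
single array of (name, type) pairs for brevity (in the real format they
are separate sections, and all names below are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical merged index entry: a symbol name and its type ID.  */
struct toy_idx_entry
{
  const char *name;
  uint32_t type;
};

/* Order entries by symbol name.  */
static int
toy_idx_cmp (const void *a, const void *b)
{
  const struct toy_idx_entry *ea = a;
  const struct toy_idx_entry *eb = b;

  return strcmp (ea->name, eb->name);
}

/* Sort an index that arrived unsorted (e.g. straight from a compiler).  */
void
toy_idx_sort (struct toy_idx_entry *idx, size_t n)
{
  qsort (idx, n, sizeof (*idx), toy_idx_cmp);
}

/* Look NAME up in a sorted index: return its type ID, or 0 if absent.  */
uint32_t
toy_idx_lookup (const struct toy_idx_entry *idx, size_t n, const char *name)
{
  struct toy_idx_entry key = { name, 0 };
  const struct toy_idx_entry *hit;

  hit = bsearch (&key, idx, n, sizeof (*idx), toy_idx_cmp);
  return hit != NULL ? hit->type : 0;
}
```

Nothing here depends on symbol table ordering, which is why an index
section works even when most symbols have no type in this dict.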
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
|
|
This is embarrassing.
The whole point of CTF is that it remains intact even after a binary is
stripped, providing a compact mapping from symbols to types for
everything in the externally-visible interface of an ELF object: it has
connections to the symbol table for that purpose, and to the string
table to avoid duplicating symbol names. So it's a shame that the hooks
I implemented last year served to hook it up to the .symtab and .strtab,
which obviously disappear on strip, leaving any accompanying CTF
dict containing references to strings (and, soon, symbols) which don't
exist any more because their containing strtab has been vaporized. The
original Solaris design used .dynsym and .dynstr (well, actually,
.ldynsym, which has more symbols) which do not disappear. So should we.
Thankfully the work we did before serves as guide rails, and adjusting
things to use the .dynstr and .dynsym was fast and easy. The only
annoyance is that the dynsym is assembled inside elflink.c in a fairly
piecemeal fashion, so that the easiest way to get the symbols out was to
hook in before every call to swap_symbol_out (we also leave in a hook in
front of symbol additions to the .symtab because it seems plausible that
we might want to hook them in future too: for now that hook is unused).
We adjust things so that rather than being offered a whole hash table of
symbols at once, libctf is now given symbols one at a time, with st_name
indexes already resolved and pointing at their final .dynstr offsets:
it's now up to libctf to resolve these to names as needed using the
strtab info we pass it separately.
Some bits might be contentious. The ctf_new_dynsym callback takes an
elf_internal_sym, and this remains an elf_internal_sym right down
through the generic emulation layers into ldelfgen. This is no worse
than the elf_sym_strtab we used to pass down, but in the future when we
gain non-ELF CTF symtab support we might want to lower the
elf_internal_sym to some other representation (perhaps a
ctf_link_symbol) in bfd or in ldlang_ctf_new_dynsym. We rename the
'apply_strsym' hooks to 'acquire_strings' instead, because they no longer
have anything to do with symbols.
There are some API changes to pieces of API which are technically public
but actually totally unused by anything and/or unused by anything but ld
so they can change freely: the ctf_link_symbol gains new fields to allow
symbol names to be given as strtab offsets as well as strings, and a
symidx so that the symbol index can be passed in. ctf_link_shuffle_syms
loses its callback parameter: the idea now is that linkers call the new
ctf_link_add_linker_symbol for every symbol in .dynsym, feed in all the
strtab entries with ctf_link_add_strtab, and then a call to
ctf_link_shuffle_syms will apply both and arrange to use them to reorder
the CTF symtab at CTF serialization time (which is coming in the next
commit).
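The calling sequence described above can be sketched with a toy model.
Everything here (struct toy_linker_sym, struct toy_link_state and the
toy_* functions) is hypothetical and is not the libctf API: it only
shows symbols arriving one at a time carrying a name offset, the string
table arriving separately, and names being resolved once both are known.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SYMS 16

/* Hypothetical stand-in for a linker-reported symbol: its index in the
   dynamic symbol table plus an offset into the dynamic string table.  */
struct toy_linker_sym
{
  unsigned long symidx;
  size_t nameidx;
};

struct toy_link_state
{
  struct toy_linker_sym syms[TOY_MAX_SYMS];
  size_t nsyms;
  const char *strtab;   /* Supplied later, once the strtab is laid out.  */
  size_t strtab_len;
};

/* Record one symbol.  Its name cannot be resolved yet.  */
int
toy_add_linker_symbol (struct toy_link_state *st, unsigned long symidx,
                       size_t nameidx)
{
  assert (st != NULL);
  if (st->nsyms >= TOY_MAX_SYMS)
    return -1;
  st->syms[st->nsyms].symidx = symidx;
  st->syms[st->nsyms].nameidx = nameidx;
  st->nsyms++;
  return 0;
}

/* Hand over the finalized string table.  */
void
toy_add_strtab (struct toy_link_state *st, const char *strtab, size_t len)
{
  assert (st != NULL);
  st->strtab = strtab;
  st->strtab_len = len;
}

/* Resolve the Nth recorded symbol to its name, or NULL if we cannot.  */
const char *
toy_symbol_name (const struct toy_link_state *st, size_t n)
{
  if (n >= st->nsyms || st->strtab == NULL
      || st->syms[n].nameidx >= st->strtab_len)
    return NULL;
  return st->strtab + st->syms[n].nameidx;
}
```

In this model a lookup before toy_add_strtab fails, which mirrors why
the reordering step has to wait until both kinds of input have been fed
in.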
Inside libctf we have a new preamble flag CTF_F_DYNSTR which is always
set in v3-format CTF dicts from this commit forwards: CTF dicts without
this flag are associated with .strtab like they used to be, so that old
dicts' external strings don't turn to garbage when loaded by new libctf.
Dicts with this flag are associated with .dynstr and .dynsym instead.
(The flag is not the next in sequence because this commit was written
quite late: the missing flags will be filled in by the next commit.)
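As a rough illustration of the flag check, assuming a made-up flag value
(TOY_F_DYNSTR and TOY_F_MAX below are hypothetical, and the real value
of CTF_F_DYNSTR is not stated here), a reader might choose its sections
like this:

```c
#include <assert.h>

/* Hypothetical flag bits, for illustration only.  */
#define TOY_F_DYNSTR 0x8u
#define TOY_F_MAX 0xfu

struct toy_sections
{
  const char *symtab;
  const char *strtab;
};

/* Pick the ELF sections that a dict's external strings refer to.  */
struct toy_sections
toy_sections_for_flags (unsigned int flags)
{
  struct toy_sections s;

  /* In this toy, unknown flag bits are a caller bug.  */
  assert ((flags & ~TOY_F_MAX) == 0);

  if (flags & TOY_F_DYNSTR)
    {
      s.symtab = ".dynsym";
      s.strtab = ".dynstr";
    }
  else
    {
      /* Old dicts: keep using the sections they were written against.  */
      s.symtab = ".symtab";
      s.strtab = ".strtab";
    }
  return s;
}
```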
Tests forthcoming in a later commit in this series.
bfd/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* elflink.c (elf_finalize_dynstr): Call examine_strtab after
dynstr finalization.
(elf_link_swap_symbols_out): Don't call it here. Call
ctf_new_symbol before swap_symbol_out.
(elf_link_output_extsym): Call ctf_new_dynsym before
swap_symbol_out.
(bfd_elf_final_link): Likewise.
* elf.c (swap_out_syms): Pass in bfd_link_info. Call
ctf_new_symbol before swap_symbol_out.
(_bfd_elf_compute_section_file_positions): Adjust.
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* readelf.c (dump_section_as_ctf): Use .dynsym and .dynstr, not
.symtab and .strtab.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* bfdlink.h (struct elf_sym_strtab): Replace with...
(struct elf_internal_sym): ... this.
(struct bfd_link_callbacks) <examine_strtab>: Take only a
symstrtab argument.
<ctf_new_symbol>: New.
<ctf_new_dynsym>: Likewise.
* ctf-api.h (struct ctf_link_sym) <st_symidx>: New.
<st_nameidx>: Likewise.
<st_nameidx_set>: Likewise.
(ctf_link_iter_symbol_f): Removed.
(ctf_link_shuffle_syms): Remove most parameters, just takes a
ctf_dict_t now.
(ctf_link_add_linker_symbol): New, split from
ctf_link_shuffle_syms.
* ctf.h (CTF_F_DYNSTR): New.
(CTF_F_MAX): Adjust.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldelfgen.c (struct ctf_strsym_iter_cb_arg): Rename to...
(struct ctf_strtab_iter_cb_arg): ... this, changing fields:
<syms>: Remove.
<symcount>: Remove.
<symstrtab>: Rename to...
<strtab>: ... this.
(ldelf_ctf_strtab_iter_cb): Adjust.
(ldelf_ctf_symbols_iter_cb): Remove.
(ldelf_new_dynsym_for_ctf): New, tell libctf about a single
symbol.
(ldelf_examine_strtab_for_ctf): Rename to...
(ldelf_acquire_strings_for_ctf): ... this, only doing the strtab
portion and not symbols.
* ldelfgen.h: Adjust declarations accordingly.
* ldemul.c (ldemul_examine_strtab_for_ctf): Rename to...
(ldemul_acquire_strings_for_ctf): ... this.
(ldemul_new_dynsym_for_ctf): New.
* ldemul.h: Adjust declarations accordingly.
* ldlang.c (ldlang_ctf_apply_strsym): Rename to...
(ldlang_ctf_acquire_strings): ... this.
(ldlang_ctf_new_dynsym): New.
(lang_write_ctf): Call ldemul_new_dynsym_for_ctf with NULL to do
the actual symbol shuffle.
* ldlang.h (struct elf_strtab_hash): Adjust accordingly.
* ldmain.c (bfd_link_callbacks): Wire up new/renamed callbacks.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_link_shuffle_syms): Adjust.
(ctf_link_add_linker_symbol): New, unimplemented stub.
* libctf.ver: Add it.
* ctf-create.c (ctf_serialize): Set CTF_F_DYNSTR on newly-serialized
dicts.
* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Check for the flag: open the
symtab/strtab if not present, dynsym/dynstr otherwise.
* ctf-archive.c (ctf_arc_bufpreamble): New, get the preamble from
some arbitrary member of a CTF archive.
* ctf-impl.h (ctf_arc_bufpreamble): Declare it.
|
|
The functions that return ctf_dict_t's given a ctf_archive_t and a name
are very clumsily named. It sounds like they return *archives*, not
dictionaries, and the names are very long and clunky. Why do we
have a ctf_arc_open_by_name when it opens a dictionary, not an archive,
and when there is no way to open a dictionary in any other way? The
answer is purely internal: the function is located in ctf-archive.c,
and everything in there was called ctf_arc_*, and there is another
way to open a dict (by offset in the archive), that is internal to
ctf-archive.c and that nothing else can call.
This is clearly bad naming. The internal organization of the source tree
should not dictate public API names!
So rename things (keeping the old, bad names for compatibility), and
adjust all users. You now open a dict using ctf_dict_open, and
open it giving ELF sections via ctf_dict_open_sections.
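The rename-with-compatibility pattern looks roughly like the sketch
below. The toy_* names and struct toy_archive are hypothetical
stand-ins, not the libctf declarations: the point is only that the old
name survives as a trivial wrapper around the new one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical archive: a few named members, each standing in for a dict.  */
struct toy_archive
{
  const char *names[4];
  int values[4];
  size_t n;
};

/* New, preferred name: look up a member by name.  */
const int *
toy_dict_open (const struct toy_archive *arc, const char *name)
{
  assert (arc != NULL && name != NULL);
  for (size_t i = 0; i < arc->n; i++)
    if (strcmp (arc->names[i], name) == 0)
      return &arc->values[i];
  return NULL;
}

/* Old name, kept so that existing callers still compile and link.  */
const int *
toy_arc_open_by_name (const struct toy_archive *arc, const char *name)
{
  return toy_dict_open (arc, name);
}
```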
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf): Use ctf_dict_open, not
ctf_arc_open_by_name.
* readelf.c (dump_section_as_ctf): Likewise.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c (elfctf_build_psymtabs): Use ctf_dict_open, not
ctf_arc_open_by_name.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_arc_open_by_name): Rename to...
(ctf_dict_open): ... this, keeping compatibility function.
(ctf_arc_open_by_name_sections): Rename to...
(ctf_dict_open_sections): ... this, keeping compatibility function.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-archive.c (ctf_arc_open_by_offset): Rename to...
(ctf_dict_open_by_offset): ... this. Adjust callers.
(ctf_arc_open_by_name_internal): Rename to...
(ctf_dict_open_internal): ... this. Adjust callers.
(ctf_arc_open_by_name_sections): Rename to...
(ctf_dict_open_sections): ... this, keeping compatibility function.
(ctf_arc_open_by_name): Rename to...
(ctf_dict_open): ... this, keeping compatibility function.
* libctf.ver: New functions added.
* ctf-link.c (ctf_link_one_input_archive): Adjusted accordingly.
(ctf_link_deduplicating_open_inputs): Likewise.
|
|
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
|
|
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
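A toy model of these iterator rules, under stated assumptions: struct
toy_dict, toy_open_errors, TOY_NEXT_END and toy_errwarning_next are all
hypothetical and much simpler than the real thing (in particular the
iterator state is just an index). It shows the two cases: with an errp,
the dict's own error code is left alone; with a NULL fp, the errp is the
only place the end-of-iteration code can go.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NEXT_END 1   /* Hypothetical end-of-iteration code.  */

struct toy_dict
{
  const char *msgs[4];
  size_t nmsgs;
  int err;               /* Last error recorded on this dict.  */
};

/* Messages produced at open time, before any dict exists.  A single
   shared list like this is not thread-safe.  */
struct toy_dict toy_open_errors;

/* Return the next message, advancing *IT.  At the end, return NULL and
   report TOY_NEXT_END via *ERRP if given, else via FP->err.  */
const char *
toy_errwarning_next (struct toy_dict *fp, size_t *it, int *errp)
{
  struct toy_dict *src = fp != NULL ? fp : &toy_open_errors;

  assert (it != NULL);
  assert (fp != NULL || errp != NULL);   /* Nowhere else to report.  */

  if (*it < src->nmsgs)
    return src->msgs[(*it)++];

  if (errp != NULL)
    *errp = TOY_NEXT_END;
  else
    fp->err = TOY_NEXT_END;
  return NULL;
}
```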
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
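The split between the two streams can be sketched as follows. This is
only an illustration: toy_err_warn and its buffer-based interface are
hypothetical and do not match the real ctf_err_warn signature. Errors
keep the errno text out of the user-visible message, warnings and the
debug stream include it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format one report into USER (the error/warning stream) and DEBUG (the
   debug stream).  The debug text always carries the message for ERR; the
   user-visible text carries it only for warnings, because an error's
   caller is expected to report the failure itself.  */
void
toy_err_warn (int is_warning, int err, const char *text,
              char *user, size_t user_len, char *debug, size_t debug_len)
{
  assert (text != NULL && user != NULL && debug != NULL);

  snprintf (debug, debug_len, "%s: %s", text, strerror (err));
  if (is_warning)
    snprintf (user, user_len, "%s: %s", text, strerror (err));
  else
    snprintf (user, user_len, "%s", text);
}
```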
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
|
|
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, this traverses each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
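The inputs-and-parents layout can be pictured with a small sketch. The
names are hypothetical (struct toy_archive, toy_flatten_inputs,
TOY_NO_PARENT), ints stand in for open dicts, and the assumption that
member 0 of each archive is the parent is made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NO_PARENT UINT32_MAX

/* Hypothetical archive: member 0 is the parent, the rest are children.  */
struct toy_archive
{
  const int *members;   /* Stand-ins for open dicts.  */
  size_t nmembers;
};

/* Flatten ARCS into INPUTS, recording in PARENTS the index (within
   INPUTS) of each entry's parent, or TOY_NO_PARENT for a parent itself.
   Returns the number of inputs written, or 0 if they would not fit in
   MAX entries.  */
size_t
toy_flatten_inputs (const struct toy_archive *arcs, size_t narcs,
                    int *inputs, uint32_t *parents, size_t max)
{
  size_t n = 0;

  assert (arcs != NULL && inputs != NULL && parents != NULL);
  for (size_t a = 0; a < narcs; a++)
    {
      size_t parent_idx = n;

      if (arcs[a].nmembers > max - n)
        return 0;
      for (size_t m = 0; m < arcs[a].nmembers; m++, n++)
        {
          inputs[n] = arcs[a].members[m];
          parents[n] = m == 0 ? TOY_NO_PARENT : (uint32_t) parent_idx;
        }
    }
  return n;
}
```

The two parallel arrays are then what a deduplicator-style consumer
would walk: inputs[i] is a dict, and parents[i] says which other entry
it imports.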
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than rename the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
|
|
This flag (not used anywhere yet) causes the variables section to be
omitted from the output CTF dict.
include/
* ctf-api.h (CTF_LINK_OMIT_VARIABLES_SECTION): New.
libctf/
* ctf-link.c (ctf_link_one_input_archive_member): Check
CTF_LINK_OMIT_VARIABLES_SECTION.
|
|
The CTF variables section (containing variables that have no
corresponding symtab entries) can cause the string table to get very
voluminous if the names of variables are long. Some callers want to
filter out particular variables they know they won't need.
So add a "variable filter" callback that does that: it's passed the name
of the variable and a corresponding ctf_file_t / ctf_id_t pair, and
should return 1 to filter it out.
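A minimal sketch of such a callback, with hypothetical names throughout
(toy_variable_filter_f, toy_link_variables, toy_drop_prefix): the real
callback is handed a dict / type ID pair, which this toy reduces to a
bare type number plus a caller-supplied argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filter type: return 1 to drop the variable, 0 to keep it.  */
typedef int toy_variable_filter_f (const char *name, unsigned long type,
                                   void *arg);

/* Copy the names that survive FILTER into OUT; return how many were kept.
   A NULL filter keeps everything.  */
size_t
toy_link_variables (const char **names, const unsigned long *types, size_t n,
                    toy_variable_filter_f *filter, void *arg,
                    const char **out)
{
  size_t kept = 0;

  assert (names != NULL && types != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    {
      if (filter != NULL && filter (names[i], types[i], arg))
        continue;
      out[kept++] = names[i];
    }
  return kept;
}

/* Example filter: drop every variable whose name starts with ARG.  */
int
toy_drop_prefix (const char *name, unsigned long type, void *arg)
{
  const char *prefix = arg;

  (void) type;
  return strncmp (name, prefix, strlen (prefix)) == 0;
}
```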
ld doesn't use this machinery yet, but we could easily add it later if
desired. (But see later for a commit that turns off CTF variable-
section linking in ld entirely by default.)
include/
* ctf-api.h (ctf_link_variable_filter_t): New.
(ctf_link_set_variable_filter): Likewise.
libctf/
* libctf.ver (ctf_link_set_variable_filter): Add.
* ctf-impl.h (ctf_file_t) <ctf_link_variable_filter>: New.
<ctf_link_variable_filter_arg>: Likewise.
* ctf-create.c (ctf_serialize): Adjust.
* ctf-link.c (ctf_link_set_variable_filter): New, set it.
(ctf_link_one_variable): Call it if set.
|
|
When we link a CTF variable, we check to see if it already exists in the
parent dict first: if it does, and it has a type the same as the type we
would populate it with, we assume we don't need to do anything:
otherwise, we populate it in a per-CU child.
Or that's what we should be doing. Instead, we check if the type is the
same as the type in the *source dict*, which is going to be a completely
different value! So we end up concluding all variables are conflicting,
bloating up output possibly quite a lot (variables aren't big in and of
themselves, but each drags around a strtab entry, and CTF dicts in a CTF
archive do not share their strtabs -- one of many problems with CTF
archives as presently constituted.)
Fix trivial: check the right type.
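To make the mistake concrete, here is a toy version of the check
(struct toy_var and toy_variable_conflicts are hypothetical). Type IDs
are numbered independently in each dict, so only the ID already mapped
into the output can meaningfully be compared with what the parent holds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_var
{
  const char *name;
  unsigned long type;   /* Type ID in the parent (output) dict.  */
};

/* Return 1 if NAME already exists in the parent with a different type,
   so that it has to go into a per-CU child instead.  DST_TYPE must be the
   variable's type ID after mapping into the output: passing the ID from
   the source dict compares unrelated numbers and reports a conflict
   almost every time.  */
int
toy_variable_conflicts (const struct toy_var *parent_vars, size_t n,
                        const char *name, unsigned long dst_type)
{
  assert (name != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (parent_vars[i].name, name) == 0)
      return parent_vars[i].type != dst_type;
  return 0;
}
```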
libctf/
* ctf-link.c (ctf_link_one_variable): Check the dst_type for
conflicts, not the source type.
|
|
Now a bunch of stuff that doesn't apply to ld or any normal use of
libctf, piled into one commit so that it's easier to ignore.
The cu-mapping machinery associates incoming compilation unit names with
outgoing names of CTF dictionaries that should correspond to them, for
non-gdb CTF consumers that would like to group multiple TUs into a
single child dict if conflicting types are found in it (the existing use
case is one kernel module, one child CTF dict, even if the kernel module
is composed of multiple CUs).
The upcoming deduplicator needs to track not only the mapping from
incoming CU name to outgoing dict name, but the inverse mapping from
outgoing dict name to incoming CU name, so it can work over every CTF
dict we might see in the output and link into it.
So rejig the ctf-link machinery to do that. Simultaneously (because
they are closely associated and were written at the same time), we add a
new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the
ctf_link machinery to create empty child dicts for each outgoing CU
mapping even if no CUs that correspond to it exist in the link. This is
a bit (OK, quite a lot) of a waste of space, but some existing consumers
require it. (Nobody else should use it.)
Its value is not consecutive with existing CTF_LINK flag values because
we're about to add more flags that are conceptually closer to the
existing ones than this one is.
include/
* ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New.
libctf/
* ctf-impl.h (ctf_file_t): Improve comments.
<ctf_link_cu_mapping>: Split into...
<ctf_link_in_cu_mapping>: ... this...
<ctf_link_out_cu_mapping>: ... and this.
* ctf-create.c (ctf_serialize): Adjust.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Look things up in the
in_cu_mapping instead of the cu_mapping.
(ctf_link_add_cu_mapping): The deduplicating link will define
what happens if many FROMs share a TO.
(ctf_link_add_cu_mapping): Create in_cu_mapping and
out_cu_mapping. Do not create ctf_link_outputs here any more, or
create per-CU dicts here: they are already created when needed.
(ctf_link_one_variable): Log a debug message if we skip a
variable due to its type being concealed in a CU-mapped link.
(This is probably too common a case to make into a warning.)
(ctf_link): Create empty per-CU dicts if requested.
|
|
We were leaking the fd on every invocation.
libctf/
* ctf-link.c (ctf_link_write): Close the fd.
|
|
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name>: Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
|
|
When you link TUs that contain conflicting types together, the resulting
CTF section is an archive containing many CTF dicts. These dicts appear
in ctf_link_outputs of the shared dict, with each ctf_import'ing that
shared dict. ctf_importing a dict bumps its refcount to stop it going
away while it's in use -- but if the shared dict (whose refcount is
bumped) has the child dict (doing the bumping) in its ctf_link_outputs,
we have a refcount loop, since the child dict only un-ctf_imports and
drops the parent's refcount when it is freed, but the child is only
freed when the parent's refcount falls to zero.
(In the future, this will be able to go wrong on the inputs too, when an
ld -r'ed deduplicated output with conflicts is relinked. Right now this
cannot happen because we don't ctf_import such dicts at all. This will
be fixed in a later commit in this series.)
Fix this by introducing an internal-use-only ctf_import_unref function
that imports a parent dict *without* bumping the parent's refcount, and
using it when we create per-CU outputs. This function is only safe to
use if you know the parent cannot go away while the child exists: but if
the parent *owns* the child, as here, this is necessarily true.
Record in the ctf_file_t whether a parent was imported via ctf_import or
ctf_import_unref, so that if you do another ctf_import later on (or a
ctf_import_unref) it can decide whether to drop the refcount of the
existing parent being replaced depending on which function you used to
import that one. Adjust ctf_serialize so that rather than doing a
ctf_import (which is wrong if the original import was
ctf_import_unref'ed), we just copy the parent field and refcount over
and forcibly set the unref flag on the old copy we are going to
discard.
ctf_file_close also needs a bit of tweaking to only close the parent if
it was not imported with ctf_import_unref: while we're at it, guard
against repeated closes with a refcount of zero and stop them causing
double-frees, even if destruction of things freed *inside*
ctf_file_close causes such recursion.
Verified no leaks or accesses to freed memory after all of this with
valgrind. (It was leak-happy before.)
libctf/
* ctf-impl.h (ctf_file_t) <ctf_parent_unreffed>: New.
(ctf_import_unref): New.
* ctf-open.c (ctf_file_close): Drop the refcount all the way to
zero. Don't recurse back in if the refcount is already zero.
(ctf_import): Check ctf_parent_unreffed before deciding whether
to close a pre-existing parent. Set it to zero.
(ctf_import_unref): New, as above, setting
ctf_parent_unreffed to 1.
* ctf-create.c (ctf_serialize): Do not ctf_import into the new
child: use direct assignment, and set unreffed on the new and
old children.
* ctf-link.c (ctf_create_per_cu): Import the parent using
ctf_import_unref.
|
|
The name was just annoyingly long and I kept misspelling it.
It's also a bad name: it's not a mapping. The type might be *used* in a
type mapping, but it is itself a representation of a type (a ctf_file_t
/ ctf_id_t pair), not of a mapping at all.
libctf/
* ctf-impl.h (ctf_link_type_mapping_key): Rename to...
(ctf_link_type_key): ... this, adjusting member prefixes to
match.
(ctf_hash_type_mapping_key): Rename to...
(ctf_hash_type_key): ... this.
(ctf_hash_eq_type_mapping_key): Rename to...
(ctf_hash_eq_type_key): ... this.
* ctf-hash.c (ctf_hash_type_mapping_key): Rename to...
(ctf_hash_type_key): ... this, and adjust for member name
changes.
(ctf_hash_eq_type_mapping_key): Rename to...
(ctf_hash_eq_type_key): ... this, and adjust for member name
changes.
* ctf-link.c (ctf_add_type_mapping): Adjust. Note the lack of
need for out-of-memory checking in this code.
(ctf_type_mapping): Adjust.
|
|
|
|
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently, a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
(say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
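A rough sketch of the idea, not the real implementation: the toy_* names
and the linear-search tables are hypothetical stand-ins (libctf uses real
hashes), but the shape is the same. Each name table has a fixed half and a
dynamic half, and exactly one function decides which half to consult.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_NAMES 8

/* Toy name table: a linear array standing in for a hash.  */
typedef struct toy_table
{
  const char *name[TOY_MAX_NAMES];
  unsigned long type[TOY_MAX_NAMES];
  size_t n;
} toy_table_t;

/* Modelled on the pair described above: a fixed half populated at open
   time, and a growable half used while the dictionary is writable.  */
typedef struct toy_names
{
  toy_table_t fixed;
  toy_table_t dynamic;
} toy_names_t;

typedef struct toy_dict
{
  int writable;
  toy_names_t structs;
} toy_dict_t;

static unsigned long
toy_table_lookup (const toy_table_t *t, const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < t->n; i++)
    if (strcmp (t->name[i], name) == 0)
      return t->type[i];
  return 0;                     /* 0: no such name.  */
}

/* The single place that chooses between the two halves.  */
unsigned long
toy_lookup_by_rawhash (const toy_dict_t *d, const toy_names_t *np,
                       const char *name)
{
  return toy_table_lookup (d->writable ? &np->dynamic : &np->fixed, name);
}

/* Add a name to a writable dictionary.  Returns 0 on success, -1 if the
   dictionary is read-only or the toy table is full.  */
int
toy_add_name (toy_dict_t *d, toy_names_t *np, const char *name,
              unsigned long type)
{
  toy_table_t *t = &np->dynamic;
  if (!d->writable || t->n == TOY_MAX_NAMES)
    return -1;
  t->name[t->n] = name;
  t->type[t->n] = type;
  t->n++;
  return 0;
}
```

Callers never look inside either half themselves, so a newly added name is
visible to lookups immediately, with no serialize-and-reopen step.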
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtbyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_ctl_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
|
|
GCC can emit references to type 0 to indicate that this type is one that
is not representable in the version of CTF it emits (for instance,
version 3 cannot encode vector types). Type 0 is already used in the
function section to indicate padding inserted to skip functions we do
not want to encode the type of, so using zero in this way is a good
extension of the format: but libctf reports such types as ECTF_BADID,
which is indistinguishable from file corruption via links to truly
nonexistent types with IDs like 0xDEADBEEF etc, which we really do want
to stop for.
In particular, this stops all traversals of types dead at this point,
preventing us from even dumping CTF files containing unrepresentable
types to see what's going on!
So add a new error, ECTF_NONREPRESENTABLE, which is returned by
recursive type resolution when a reference to a zero type is found. (No
zero type is ever emitted into the CTF file by GCC, only references to
one). We can't do much with types that are ultimately nonrepresentable,
but we can do enough to keep functioning.
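To illustrate the distinction, here is a small sketch of recursive type
resolution over a toy type table. Everything named toy_* or TOY_* is
hypothetical and only loosely modelled on the behaviour described above;
it is not the code in ctf_type_resolve.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_INTEGER, TOY_TYPEDEF, TOY_CONST };
enum toy_err { TOY_OK = 0, TOY_BADID, TOY_NONREPRESENTABLE };

typedef struct toy_type
{
  enum toy_kind kind;
  unsigned long ref;            /* Referenced type, for typedef and const.  */
} toy_type_t;

/* Resolve ID through typedefs and cv-qualifiers.  Types are numbered
   from 1, so types[id - 1] describes type ID.  Returns the resolved ID,
   or 0 with *ERR saying why.  */
unsigned long
toy_type_resolve (const toy_type_t *types, size_t ntypes, unsigned long id,
                  enum toy_err *err)
{
  assert (err != NULL);

  /* A chain of more than NTYPES references must contain a cycle.  */
  for (size_t steps = 0; steps <= ntypes; steps++)
    {
      if (id == 0)
        {
          /* A reference to type zero: unrepresentable, not corrupt.  */
          *err = TOY_NONREPRESENTABLE;
          return 0;
        }
      if (id > ntypes)
        {
          /* A reference to a type that does not exist: corruption.  */
          *err = TOY_BADID;
          return 0;
        }

      const toy_type_t *t = &types[id - 1];
      if (t->kind != TOY_TYPEDEF && t->kind != TOY_CONST)
        {
          *err = TOY_OK;
          return id;
        }
      id = t->ref;
    }

  *err = TOY_BADID;             /* Cyclic reference chain.  */
  return 0;
}
```

A caller such as a dumper can then print a placeholder for
TOY_NONREPRESENTABLE and carry on, while still stopping on TOY_BADID.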
Adjust ctf_add_type to ensure that top-level types of type zero and
structure and union members of ultimate type zero are simply skipped
without reporting an error, so we can copy structures and unions that
contain nonrepresentable members (skipping them and leaving a hole where
they would be, so no consumers downstream of the linker need to worry
about this): adjust the dumper so that we dump members of
nonrepresentable types in a simple form that indicates
nonrepresentability rather than terminating the dump, and do not falsely
assume all errors to be -ENOMEM: adjust the linker so that types that
fail to get added are simply skipped, so that both nonrepresentable
types and outright errors do not terminate the type addition, which
could skip many valid types and cause further errors when variables of
those types are added.
In future, when we gain the ability to call back to the linker to report
link-time type resolution errors, we should report failures to add all
but nonrepresentable types. But we can't do that yet.
v5: Fix tabdamage.
include/
* ctf-api.h (ECTF_NONREPRESENTABLE): New.
libctf/
* ctf-types.c (ctf_type_resolve): Return ECTF_NONREPRESENTABLE on
type zero.
* ctf-create.c (ctf_add_type): Detect and skip nonrepresentable
members and types.
(ctf_add_variable): Likewise for variables pointing to them.
* ctf-link.c (ctf_link_one_type): Do not warn for nonrepresentable
type link failure, but do warn for others.
* ctf-dump.c (ctf_dump_format_type): Likewise. Do not assume all
errors to be ENOMEM.
(ctf_dump_member): Likewise.
(ctf_dump_type): Likewise.
(ctf_dump_header_strfield): Do not assume all errors to be ENOMEM.
(ctf_dump_header_sectfield): Do not assume all errors to be ENOMEM.
(ctf_dump_header): Likewise.
(ctf_dump_label): Likewise.
(ctf_dump_objts): Likewise.
(ctf_dump_funcs): Likewise.
(ctf_dump_var): Likewise.
(ctf_dump_str): Likewise.
|
|
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
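As an illustration of what a member name changer might look like, here is
a self-contained sketch. The callback type toy_name_changer_f is a
hypothetical simplification and is not the ctf_link_memb_name_changer_f
declared in ctf-api.h; the ".ctf.NAME" to "NAME.ctf" scheme is likewise
only an example. The second function checks for the collision case warned
about above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback shape, for illustration only: return a newly
   allocated replacement name, or NULL to leave NAME unchanged.  */
typedef char *toy_name_changer_f (const char *name, void *arg);

/* Example changer: turn ".ctf.NAME" into "NAME.ctf" and leave every
   other name alone.  An allocation failure also returns NULL, which
   this toy does not distinguish from "no change".  */
char *
toy_module_style_name (const char *name, void *arg)
{
  static const char prefix[] = ".ctf.";
  const size_t plen = sizeof (prefix) - 1;

  (void) arg;                   /* Unused in this example.  */
  if (name == NULL || strncmp (name, prefix, plen) != 0
      || name[plen] == '\0')
    return NULL;

  size_t len = strlen (name + plen) + sizeof (".ctf");
  char *out = malloc (len);
  if (out != NULL)
    snprintf (out, len, "%s.ctf", name + plen);
  return out;
}

/* Return 1 if CHANGER maps two of the N (non-NULL) names to the same
   output name, 0 if not, -1 on allocation failure.  */
int
toy_names_collide (toy_name_changer_f *changer, void *arg,
                   const char *const *names, size_t n)
{
  int collide = 0;
  char **out = calloc (n ? n : 1, sizeof (*out));

  if (out == NULL)
    return -1;
  for (size_t i = 0; i < n; i++)
    out[i] = changer (names[i], arg);

  for (size_t i = 0; i < n && !collide; i++)
    for (size_t j = i + 1; j < n; j++)
      {
        const char *a = out[i] ? out[i] : names[i];
        const char *b = out[j] ? out[j] : names[j];
        if (strcmp (a, b) == 0)
          {
            collide = 1;
            break;
          }
      }

  for (size_t i = 0; i < n; i++)
    free (out[i]);
  free (out);
  return collide;
}
```

Running a check like toy_names_collide over the expected member names
before linking is one way to avoid ending up with an archive that has two
members of the same name.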
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
|
|
The compiler describes the name and type of all file-scope variables in
this section. Merging it at link time requires using the type mapping
added in the previous commit to determine the appropriate type for the
variable in the output, given its type in the input: we check the shared
container first, and if the type doesn't exist there, it must be a
conflicted type in the per-CU child, and the variable should go there
too. We also put the variable in the per-CU child if a variable with
the same name but a different type already exists in the parent: we
ignore any such conflict in the child because CTF cannot represent such
things, nor can they happen unless a third-party linking program has
overridden the mapping of CU to CTF archive member name (using machinery
added in a later commit).
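The placement rule can be summarized as a small decision function. This is
a sketch under simplifying assumptions: toy_place_variable and its
parameters are hypothetical, and type ID 0 is used here to mean "no such
type in the shared container".

```c
#include <assert.h>

enum toy_place { TOY_PLACE_SHARED, TOY_PLACE_PER_CU };

/* Decide where a variable goes.

   TYPE_IN_SHARED: the ID the variable's type maps to in the shared
     container, or 0 if the type is not there.
   NAME_IN_SHARED: nonzero if the shared container already has a variable
     with this name.
   EXISTING_TYPE: the type of that existing variable (ignored if
     NAME_IN_SHARED is zero).  */
enum toy_place
toy_place_variable (unsigned long type_in_shared, int name_in_shared,
                    unsigned long existing_type)
{
  /* The type is conflicted and lives only in the per-CU child, so the
     variable must go there too.  */
  if (type_in_shared == 0)
    return TOY_PLACE_PER_CU;

  /* Same name, different type in the parent: push this one down.  */
  if (name_in_shared && existing_type != type_in_shared)
    return TOY_PLACE_PER_CU;

  /* Either new to the parent or an exact duplicate of what is there.  */
  assert (!name_in_shared || existing_type == type_in_shared);
  return TOY_PLACE_SHARED;
}
```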
v3: rewritten using an algorithm that actually works in the case of
conflicting names. Some code motion from the next commit. Set
the per-CU parent name.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ECTF_INTERNAL): New.
libctf/
* ctf-link.c (ctf_create_per_cu): New, refactored out of...
(ctf_link_one_type): ... here, with parent-name setting added.
(check_variable): New.
(ctf_link_one_variable): Likewise.
(ctf_link_one_input_archive_member): Call it.
* ctf-error.c (_ctf_errlist): Updated with new errors.
|
|
This lets you call ctf_type_mapping (dest_fp, src_fp, src_type_id)
and get told what type ID the corresponding type has in the target
ctf_file_t. This works even if it was added by a recursive call, and
because it is stored in the target ctf_file_t it works even if we
had to add one type to multiple ctf_file_t's as part of conflicting
type handling.
We empty out this mapping after every archive is linked: because it maps
input to output fps, and we only visit each input fp once, its contents
are rendered entirely useless every time the source fp changes.
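A toy version of such a mapping, to show why the key has to include the
source fp as well as the source type ID. The toy_* names and the
fixed-size array are hypothetical; the real mapping is a dynhash stored in
the target ctf_file_t.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAP_MAX 16

/* Key: which input dict the type came from, and its ID there.  */
typedef struct toy_type_key
{
  const void *src_fp;
  unsigned long src_type;
} toy_type_key_t;

/* One of these would live in each target dict.  */
typedef struct toy_type_map
{
  toy_type_key_t key[TOY_MAP_MAX];
  unsigned long dst_type[TOY_MAP_MAX];
  size_t n;
} toy_type_map_t;

/* Record that SRC_TYPE in SRC_FP became DST_TYPE in the dict that owns M.
   Returns 0 on success, -1 if the toy table is full.  */
int
toy_add_type_mapping (toy_type_map_t *m, const void *src_fp,
                      unsigned long src_type, unsigned long dst_type)
{
  assert (m != NULL && dst_type != 0);
  for (size_t i = 0; i < m->n; i++)
    if (m->key[i].src_fp == src_fp && m->key[i].src_type == src_type)
      {
        m->dst_type[i] = dst_type;
        return 0;
      }
  if (m->n == TOY_MAP_MAX)
    return -1;
  m->key[m->n].src_fp = src_fp;
  m->key[m->n].src_type = src_type;
  m->dst_type[m->n] = dst_type;
  m->n++;
  return 0;
}

/* Return the target type ID, or 0 if no mapping is recorded.  */
unsigned long
toy_type_mapping (const toy_type_map_t *m, const void *src_fp,
                  unsigned long src_type)
{
  for (size_t i = 0; i < m->n; i++)
    if (m->key[i].src_fp == src_fp && m->key[i].src_type == src_type)
      return m->dst_type[i];
  return 0;
}

/* Forget everything: called once an input archive has been linked.  */
void
toy_empty_type_mapping (toy_type_map_t *m)
{
  m->n = 0;
}
```

Two inputs can both use type ID 3 for unrelated types, so keying on the
ID alone would conflate them.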
v3: add several missing mapping additions. Add ctf_dynhash_empty, and
empty after every input archive.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_file_t): New field ctf_link_type_mapping.
(struct ctf_link_type_mapping_key): New.
(ctf_hash_type_mapping_key): Likewise.
(ctf_hash_eq_type_mapping_key): Likewise.
(ctf_add_type_mapping): Likewise.
(ctf_type_mapping): Likewise.
(ctf_dynhash_empty): Likewise.
* ctf-open.c (ctf_file_close): Update accordingly.
* ctf-create.c (ctf_update): Likewise.
(ctf_add_type): Populate the mapping.
* ctf-hash.c (ctf_hash_type_mapping_key): Hash a type mapping key.
(ctf_hash_eq_type_mapping_key): Check the key for equality.
(ctf_dynhash_insert): Fix comment typo.
(ctf_dynhash_empty): New.
* ctf-link.c (ctf_add_type_mapping): New.
(ctf_type_mapping): Likewise.
(empty_link_type_mapping): New.
(ctf_link_one_input_archive): Call it.
|
|
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default)). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
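The string handling in the ctf_link_add_strtab step can be pictured with a
toy model. This is only a sketch of the idea: toy_atom_t, toy_elf_str_t
and toy_assign_strtab are hypothetical, the ELF strings are passed as an
array rather than through a callback, and the linear search stands in for
a hash lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One string used by the CTF output.  */
typedef struct toy_atom
{
  const char *str;
  int external;                 /* Nonzero: string lives in the ELF strtab.  */
  uint32_t offset;              /* Offset in whichever strtab holds it.  */
} toy_atom_t;

/* One (string, offset) pair, as the linker might report it.  */
typedef struct toy_elf_str
{
  const char *str;
  uint32_t offset;
} toy_elf_str_t;

/* Point every atom that also appears in the ELF strtab at its ELF offset,
   and lay the rest out in an internal strtab.  Returns the size in bytes
   of that internal strtab.  */
size_t
toy_assign_strtab (toy_atom_t *atoms, size_t natoms,
                   const toy_elf_str_t *elf, size_t nelf)
{
  size_t internal_len = 1;      /* The null string at offset 0.  */

  for (size_t i = 0; i < natoms; i++)
    {
      assert (atoms[i].str != NULL);
      atoms[i].external = 0;
      for (size_t j = 0; j < nelf; j++)
        if (strcmp (atoms[i].str, elf[j].str) == 0)
          {
            atoms[i].external = 1;
            atoms[i].offset = elf[j].offset;
            break;
          }

      if (!atoms[i].external)
        {
          atoms[i].offset = (uint32_t) internal_len;
          internal_len += strlen (atoms[i].str) + 1;
        }
    }
  return internal_len;
}
```

Only the strings that the ELF strtab does not already hold take up space
in the CTF section.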
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
|