Age | Commit message | Author | Files | Lines |
|
We were failing to increment component_idx when acquiring datasec info,
meaning that all our variable population got the datasec info from the
first entry in that datasec.
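
A minimal sketch of this class of bug. It uses hypothetical stand-ins (struct secinfo, collect_offsets), not the real libctf structures: the index into the datasec's components has to advance with each variable, or every variable is populated from the first component.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one datasec component: a variable's offset
   and size within its section.  */
struct secinfo
{
  unsigned int offset;
  unsigned int size;
};

/* Copy the offset of each component into OUT.  The bug described above
   is equivalent to never advancing COMPONENT_IDX, so that every
   variable is populated from components[0].  */
static void
collect_offsets (const struct secinfo *components, size_t n_components,
                 unsigned int *out)
{
  size_t component_idx = 0;
  size_t i;

  for (i = 0; i < n_components; i++)
    {
      assert (component_idx < n_components);
      out[i] = components[component_idx].offset;
      component_idx++;          /* The increment that was missing.  */
    }
}
```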
|
|
SQUASH)
If one TU has an extern definition of a variable or function, and another
has a global definition that is otherwise identical (same type, same name),
we'd like to unify the two: they are not conflicting types. But a similar
static definition *is* conflicting.
This is quite simple for functions: introduce a new cd_linkages that
contains entries giving the "linkage we'd like to emit" for a given hash:
bump it from extern to global-allocated when one is seen, and use that as
the linkage at emission time if present. When hashing, promote extern
to non-extern always.
But for variables this is more complex, because non-extern variables come
with datasec info which is hashed in as well: so externs and the non-externs
we'd like to emit instead have *different hashes* and are seen as
conflicting types.
To fix this, introduce a new cd_replacing_hashes which maps from hashes we'd
like to ignore if seen to hashes that replace them: populate it when a
global-allocated variable is seen, using the initial portion of the hash
(shared with extern vars) as the hash value to replace. Use this to bump up
name counts at conflict-marking time; avoid marking the replaced hash as
conflicting if we are also avoiding marking the hash it replaces as
conflicting.
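
The function half of this (the "linkage we'd like to emit" table) can be sketched roughly as follows. Everything here is hypothetical: the enum values, struct linkages and the two helper functions are illustrative only, and a small linear-search array stands in for the real hash table.

```c
#include <assert.h>

/* Hypothetical linkage values, ordered so that the larger value is the
   one we would prefer to emit.  */
enum linkage { LINK_EXTERN = 0, LINK_GLOBAL = 1 };

#define MAX_ENTRIES 16

/* Hypothetical model of a type hash -> preferred linkage table.  */
struct linkages
{
  unsigned long hash[MAX_ENTRIES];
  enum linkage linkage[MAX_ENTRIES];
  int n;
};

/* Record that a function with type hash HASH was seen with linkage SEEN.
   An extern sighting never downgrades an existing global entry.  */
static void
note_linkage (struct linkages *l, unsigned long hash, enum linkage seen)
{
  int i;

  for (i = 0; i < l->n; i++)
    if (l->hash[i] == hash)
      {
        if (seen > l->linkage[i])
          l->linkage[i] = seen;
        return;
      }

  assert (l->n < MAX_ENTRIES);
  l->hash[l->n] = hash;
  l->linkage[l->n] = seen;
  l->n++;
}

/* Linkage to use at emission time: the recorded one if present,
   otherwise FALLBACK.  */
static enum linkage
emit_linkage (const struct linkages *l, unsigned long hash,
              enum linkage fallback)
{
  int i;

  for (i = 0; i < l->n; i++)
    if (l->hash[i] == hash)
      return l->linkage[i];
  return fallback;
}
```

Static definitions are not modelled: they are meant to hash differently and so never share an entry with the extern/global pair.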
|
|
|
|
When we encounter a conflicting type in cu-mapped mode, it's not enough to
just add it as root-visible if its source type was and then mark it
conflicting later: it might have a conflicting name, in which case addition
will fail (if it's one of those kinds for which conflicting names are a
cause for failure, which is most of them.)
So mark such types non-root-visible explicitly as well.
|
|
When calling a pure function, it helps to assign the result to something!
|
|
The new API function ctf_link_output_is_btf lets you determine
whether the result of a ctf_link is likely to be written out as
CTF or BTF before the write takes place: the new is_btf argument
to ctf_link_write lets you find out whether it actually was (things
like compression that only ctf_link_write is told about may cause
last-minute changes in the decision).
This requires us to split preserialization in two, with the portion that
determines whether a serialized dict is BTF-compatible or not moving into a
new internal function ctf_serialize_output_format, called by
ctf_link_output_is_btf. We move the state used to communicate between
serialization passes into a new sub-struct at the same time.
|
|
We should be looking at the visible kind, not the format-internal
kind: in particular, we want forwards to enums to appear to be
CTF_K_FORWARDs.
|
|
|
|
Datasecs are only used for non-extern variables (those in ELF sections).
Variables added by ctf_add_variable() should not be in any datasec at all.
(This means we have to drop a minor optimization where variables not
mentioned in ctf_var_datasecs were assumed to be in the most-numerous one,
since they might be in none at all; not a great loss.)
|
|
An accidental fallthrough.
|
|
BTF doesn't have a concept of external strings in the ELF strtab, so even if
the ELF linker has reported symbols, we must ignore them when doing BTF
serialization. (We keep the strings around, so that if a later attempt is
made to serialize as CTF we can still use them and generate external string
refs as before, though this will be of limited effectiveness because the
duplicated strings will already have been baked into the strtab. Serializing
a dict in an ELF executable as BTF and then CTF right afterwards is rare in
any case: normally such things get serialized by the linker, which does it
only once per dict.)
|
|
This has its linkage not in the vlen region, but *stuffed into the
actual vlen in the info word* (why this is inconsistent with the
way CTF_K_VAR works, I have no idea).
|
|
CTF_DEDUP_GID() takes input *numbers*, not inputs (which are pointers).
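
To illustrate why the distinction matters, here is a sketch of a global ID that packs an input number and a type ID into one 64-bit value. The encoding and the names (make_gid, gid_input_num, gid_type) are assumptions made for the example, not the real macro's definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding: input number in the top 32 bits, type ID in
   the bottom 32.  */
static uint64_t
make_gid (uint32_t input_num, uint32_t type)
{
  return ((uint64_t) input_num << 32) | type;
}

/* Recover the input number from a global ID.  */
static uint32_t
gid_input_num (uint64_t gid)
{
  return (uint32_t) (gid >> 32);
}

/* Recover the type ID from a global ID.  */
static uint32_t
gid_type (uint64_t gid)
{
  return (uint32_t) (gid & 0xffffffffu);
}
```

With an encoding like this, handing in a pointer where the input number belongs yields a value that still looks like a GID but decodes to a nonsensical input.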
|
|
|
|
|
|
This lets a CU added via ctf_link_add_cu_mapping (fp, foo, "") get emitted
directly into the shared dict, with conflicting types going into hidden members
(like conflicting types in modules in the cu-mapped portions do).
This is the layout in-kernel BTF and pahole want.
|
|
More to come.
|
|
Basically this consists of small tweaks to emit the new stuff, plus
two nasty cases:
- datasecs are skipped: instead, we compute a backmapping from var to
datasec component in cd_var_datasec and chase this to hash the datasec
name and info for each specific variable into the hash for that variable,
then use that to emit both at once when the var is emitted (always into
the same dict).
TODO: this probably gets the offsets wrong: ctf_add_section_variable()
and/or serialization must sort them after all. Dammit.
- decl tags to struct/union members (and to structs/unions in general) have
two separate nasty problems.
The first is that we have to mark them conflicting if their associated
struct is conflicting, but traversal from types to their parents halts at
tagged structs and unions, because the type graph is sharded via stubs at
those points and conflictedness ceases. But we don't want to do that
here: a decl_tag to member 10 of some struct is only valid if that struct
*has* ten members, and if the struct is conflicted, some may have only
one. The decl tag is only valid for the specific struct-with-ten-members
it was originally pointing at, anyway: other structs-with-ten-members may
have entirely different members there, which are not tagged or which are
tagged with something else.
So we track this by keeping track of the only thing that is knowable
about struct/union stubs: their decorated name. The citers graph
gains mappings from decorated SoU names to decl tags, and conflictedness
marking chases that and marks accordingly.
- The second problem is that we have to emit decl tags to struct members
after the members are emitted, but the members are emitted late because
they might refer to any types in the dict, including types added after
the struct was added. So we need to accumulate decl tags to struct
members in cd_emission_struct_decl_tags and add yet *another* pass
that traverses that and emits all the decl tags in it.
ctf_link is not yet revised: at the very least it should stop trying to emit
variables itself -- but it does need to add data symbols if the variable
section is not omitted, in addition to figuring out whether we want to
delete existing variables if it *is* omitted. This stuff can partly wait
until later: the variables section is never omitted for kernel links.
|
|
ctf_dedup_rhash_type always hashes the type it's called for: the possibility
to return some random other hash via an override has been gone for so long
that it never even made it upstream.
|
|
ctf_tag, ctf_tag_next; dropping type/decl tag lookup from ctf_lookup_by_name
(we can't get it for free because the name tables are unusually structured,
and with multiple IDs mapping to a given tag it's not clear what we could
return in any case); supporting void * (almost no changes, just a tweak to
ctf_type_compat() to note that void * is assignment-compatible with itself);
plus a tiny tweak in the deduplicator's error handling spotted while
checking uses of ctf_dynhash_insert.
(Will all be squashed together and split up in different directions in
future anyway.)
|
|
|
|
Parent/child determination is about to become rather more complex, making a
macro impractical. Use the ctf_type_isparent/ischild function calls
everywhere and remove the macro. Make them more const-correct too, to
make them more widely usable.
While we're about it, change several places that hand-implemented
ctf_get_dict() to call it instead, and armour several functions against
the null returns that were always possible in this case (but previously
unprotected-against).
|
|
This was added last year to let us maintain a backpointer to the movable
refs dynhash in movable ref atoms without spending space for the
backpointer on the majority of (non-movable) refs and also without
causing an atom which had some refs movable and some refs not movable to
dereference unallocated storage when freed.
The backpointer's only purpose was to let us locate the
ctf_str_movable_refs dynhash during item freeing, when we had nothing
but a pointer to the atom being freed. Now we have a proper freeing
arg, we don't need the backpointer at all: we can just pass a pointer to
the dict in to the atoms dynhash as a freeing arg for the atom freeing
functions, and throw the whole backpointer and separate movable ref list
complexity away.
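
The general shape of a freeing arg, sketched with hypothetical names (struct table, struct owner, atom_free): the container hands a caller-registered argument to the free callback, so the elements themselves need no backpointer to their owner.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free callback type: ITEM is the element being freed, ARG
   is whatever the container's owner registered.  */
typedef void (*free_fun) (void *item, void *arg);

struct table
{
  void *items[8];
  size_t n;
  free_fun item_free;
  void *free_arg;
};

/* Hypothetical owner: just counts how many items were released.  */
struct owner
{
  int released;
};

/* Free callback: reaches the owner through ARG, not through ITEM.  */
static void
atom_free (void *item, void *arg)
{
  struct owner *o = arg;

  (void) item;          /* A real callback would free ITEM's storage.  */
  o->released++;
}

/* Release every item, passing the registered argument each time.  */
static void
table_destroy (struct table *t)
{
  size_t i;

  for (i = 0; i < t->n; i++)
    t->item_free (t->items[i], t->free_arg);
  t->n = 0;
}
```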
|
|
The distinction between the citer and citers variables in
ctf_dedup_rhash_type is somewhat opaque (it's a micro-optimization to avoid
having to allocate entire sets when we know in advance that we'll only have
to store one value). Add a comment.
libctf/
* ctf-dedup.c (ctf_dedup_rhash_type): Comment on citers variables.
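
A generic sketch of this kind of micro-optimization, using a hypothetical struct citers that is not the deduplicator's actual data structure: the first value is held inline, and storage for a collection is only allocated when a second value turns up.

```c
#include <assert.h>
#include <stdlib.h>

#define CITERS_MAX 8

/* Hypothetical container: one value is stored inline; an array is
   allocated lazily, only once a second value is added.  */
struct citers
{
  const char *single;   /* Valid while exactly one value is stored.  */
  const char **many;    /* NULL until a second value is added.  */
  size_t n;
};

/* Add VALUE.  Returns 0 on success, -1 on allocation failure or
   overflow of the fixed demo capacity.  */
static int
citers_add (struct citers *c, const char *value)
{
  if (c->n == 0)
    {
      c->single = value;        /* No allocation for the common case.  */
      c->n = 1;
      return 0;
    }

  if (c->many == NULL)
    {
      c->many = malloc (CITERS_MAX * sizeof (*c->many));
      if (c->many == NULL)
        return -1;
      c->many[0] = c->single;   /* Migrate the inline value.  */
    }

  if (c->n >= CITERS_MAX)
    return -1;
  c->many[c->n++] = value;
  return 0;
}

/* Release any allocated storage and reset to empty.  */
static void
citers_free (struct citers *c)
{
  free (c->many);
  c->many = NULL;
  c->single = NULL;
  c->n = 0;
}
```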
|
|
This is a pretty simple two-phase process (count duplicates that are
actually going to end up in the strtab and aren't e.g. strings without refs,
strings with external refs etc, and move them into the parent) with one
wrinkle: we sorta-abuse the csa_external_offset field in the deduplicated
child atom (normally used to indicate that this string is located in the ELF
strtab) to indicate that this atom is in the *parent*. If you think of
"external" as meaning simply "is in some other strtab, we don't care which
one", this still makes enough sense to not need to change the name, I hope.
This is still not called from anywhere, so strings are (still!) not
deduplicated, and none of the dedup machinery added in earlier commits does
anything yet.
libctf/
* ctf-dedup.c (ctf_dedup_emit_struct_members): Note that strtab
dedup happens (well) after struct member emission.
(ctf_dedup_strings): New.
* ctf-impl.h (ctf_dedup_strings): Declare.
|
|
|
|
The recent change to detect duplicate enum values and return ECTF_DUPLICATE
when found turns out to perturb a great many callers. In particular, the
pahole-created kernel BTF has the same problem we historically did, and
gleefully emits duplicated enum constants in profusion. Handling the
resulting duplicate errors from BTF -> CTF converters reasonably is
unreasonably difficult (it amounts to forcing them to skip some types or
reimplement the deduplicator).
So let's step back a bit. What we care about mostly is that the
deduplicator treat enums with conflicting enumeration constants as
conflicting types: programs that want to look up enumeration constant ->
value mappings using the new APIs to do so might well want the same checks
to apply to any ctf_add_* operations they carry out (and since they're
*using* the new APIs, added at the same time as this restriction was
imposed, there is likely to be no negative consequence of this).
So we want some way to allow processes that know about duplicate detection
to opt into it, while allowing everyone else to stay clear of it: but we
want ctf_link to get this behaviour even if its caller has opted out.
So add a new concept to the API: dict-wide CTF flags, set via
ctf_dict_set_flag, obtained via ctf_dict_get_flag. They are not bitflags
but simple arbitrary integers and an on/off value, stored in an unspecified
manner (the one current flag, we translate into an LCTF_* flag value in the
internal ctf_dict ctf_flags word). If you pass in an invalid flag or value
you get a new ECTF_BADFLAG error, so the caller can easily tell whether
flags added in future are valid with a particular libctf or not.
We check this flag in ctf_add_enumerator, and set it around the link
(including on child per-CU dicts). The newish enumerator-iteration test is
souped up to check the semantics of the flag as well.
The fact that the flag can be set and unset at any time has curious
consequences. You can unset the flag, insert a pile of duplicates, then set
it and expect the new duplicates to be detected, not only by
ctf_add_enumerator but also by ctf_lookup_enumerator. This means we now
have to maintain the ctf_names and conflicting_enums enum-duplication
tracking as new enums are added, not purely as the dict is opened.
Move that code out of init_static_types_internal and into a new
ctf_track_enumerator function that addition can also call.
(None of this affects the file format or serialization machinery, which has
to be able to handle duplicate enumeration constants no matter what.)
include/
* ctf-api.h (CTF_ERRORS) [ECTF_BADFLAG]: New.
(ECTF_NERR): Update.
(CTF_STRICT_NO_DUP_ENUMERATORS): New flag.
(ctf_dict_set_flag): New function.
(ctf_dict_get_flag): Likewise.
libctf/
* ctf-impl.h (LCTF_STRICT_NO_DUP_ENUMERATORS): New flag.
(ctf_track_enumerator): Declare.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_create_per_cu): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link): Likewise.
(ctf_link_write): Likewise.
* ctf-subr.c (ctf_dict_set_flag): New function.
(ctf_dict_get_flag): New function.
* ctf-open.c (init_static_types_internal): Move enum tracking to...
* ctf-create.c (ctf_track_enumerator): ... this new function.
(ctf_add_enumerator): Call it.
* libctf.ver: Add the new functions.
* testsuite/libctf-lookup/enumerator-iteration.c: Test them.
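
A standalone model of the flag scheme described above. It does not call libctf: all DEMO_* constants, struct demo_dict and the demo_* functions are hypothetical stand-ins for CTF_STRICT_NO_DUP_ENUMERATORS, ECTF_BADFLAG, the LCTF_* bit and the two new API functions. It only illustrates the idea of public flag numbers translated into an internal flags word, with unknown flags or values rejected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical public flag number and error code.  */
#define DEMO_STRICT_NO_DUP_ENUMERATORS 1
#define DEMO_EBADFLAG 1000

/* Hypothetical internal bit in the dict's flags word.  */
#define DEMO_INTERNAL_STRICT 0x4u

struct demo_dict
{
  uint32_t flags;
  int err;
};

/* Turn FLAG on (SET == 1) or off (SET == 0).  Returns 0 on success, or
   -1 with the error set if the flag or the value is not recognized.  */
static int
demo_set_flag (struct demo_dict *d, uint64_t flag, int set)
{
  if (set != 0 && set != 1)
    {
      d->err = DEMO_EBADFLAG;
      return -1;
    }

  switch (flag)
    {
    case DEMO_STRICT_NO_DUP_ENUMERATORS:
      if (set)
        d->flags |= DEMO_INTERNAL_STRICT;
      else
        d->flags &= ~DEMO_INTERNAL_STRICT;
      return 0;
    default:
      d->err = DEMO_EBADFLAG;
      return -1;
    }
}

/* Return 1 or 0 for a known flag, or -1 with the error set.  */
static int
demo_get_flag (struct demo_dict *d, uint64_t flag)
{
  switch (flag)
    {
    case DEMO_STRICT_NO_DUP_ENUMERATORS:
      return (d->flags & DEMO_INTERNAL_STRICT) != 0;
    default:
      d->err = DEMO_EBADFLAG;
      return -1;
    }
}
```

Because unknown flag numbers fail cleanly, a caller can probe at runtime whether a given library build understands a flag introduced later.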
|
|
Drop an unnecessary variable, and fix a buggy comment.
No effect on generated code.
libctf/
* ctf-dedup.c (ctf_dedup_detect_name_ambiguity): Drop unnecessary
variable.
(ctf_dedup_rwalk_output_mapping): Fix comment.
|
|
If you deduplicate non-root-visible types, the resulting type should still
be non-root-visible! We were promoting all such types to root-visible, and
re-demoting them only if their names collided (which might happen on
cu-mapped links if multiple compilation units with conflicting types are
fused into one child dict).
This "worked" before now, in that linking at least didn't fail (if you don't
mind having your non-root flag value destroyed if you're adding
non-root-visible types), but now that conflicting enumerators cause their
containing enums to become conflicted (enums which might have *different
names*), this caused the linker to crash when it hit two enumerators with
conflicting values.
Not testable in ld because cu-mapped links are not exposed to ld, but can be
tested via direct creation of libraries and calls to ctf_link directly.
(This also tests the ctf_dump non-root type printout, which before now
was untested.)
libctf/
* ctf-dedup.c (ctf_dedup_emit_type): Non-root-visible input types
should be emitted as non-root-visible output types.
* testsuite/libctf-writable/ctf-nonroot-linking.c: New test.
* testsuite/libctf-writable/ctf-nonroot-linking.lk: New test.
|
|
The PARENTS arg is carefully passed down through all the layers of hash
functions and then never used for anything. (In the distant past it was
used for cycle detection, but the algorithm eventually committed doesn't
need to do cycle detection...)
The PARENTS arg is still used by ctf_dedup_emit(), but even there we can
loosen the requirements and state that you can just leave entries
corresponding to dicts with no parents at zero (which will be useful
in an upcoming commit).
libctf/
* ctf-dedup.c (ctf_dedup_hash_type): Drop PARENTS arg.
(ctf_dedup_rhash_type): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_emit_struct_members): Mention what you can do to
PARENTS entries for parent dicts.
* ctf-impl.h (ctf_dedup): Adjust accordingly.
* ctf-link.c (ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
|
|
The CTF deduplicator was not considering enumerators inside enum types to be
things that caused type conflicts, so if the following two TUs were linked
together, you would end up with the following in the resulting dict:
1.c:
enum foo { A, B };
2.c:
enum bar { A, B };
linked:
enum foo { A, B };
enum bar { A, B };
This does work -- but it's not something that's valid C, and the general
point of the shared dict is that it is something that you could potentially
get from any valid C TU.
So consider such types to be conflicting, but obviously don't consider
actually identical enums to be conflicting, even though they too have (all)
their identifiers in common. This involves surprisingly little code. The
deduplicator detects conflicting types by counting types in a hash table of
hash tables:
decorated identifier -> (type hash -> count)
where the COUNT is the number of times a given hash has been observed: any
name with more than one hash associated with it is considered conflicting
(the count is used to identify the most common such name for promotion to
the shared dict).
Before now, those identifiers were all the identifiers of types (possibly
decorated with their namespace on the front for enumerator identifiers), but
we can equally well put *enumeration constant names* in there, undecorated
like the identifiers of types in the global namespace, with the type hash
being the hash of each enum containing that enumerator. The existing
conflicting-type-detection code will then accurately identify distinct enums
with enumeration constants in common. The enum that contains the most
commonly-appearing enumerators will be promoted to the shared dict.
libctf/
* ctf-impl.h (ctf_dedup_t) <cd_name_counts>: Extend comment.
* ctf-dedup.c (ctf_dedup_count_name): New, split out of...
(ctf_dedup_populate_mappings): ... here. Call it for all
enumeration constants in an enum as well as types.
ld/
* testsuite/ld-ctf/enum-3.c: New test CTF.
* testsuite/ld-ctf/enum-4.c: Likewise.
* testsuite/ld-ctf/overlapping-enums.d: New test.
* testsuite/ld-ctf/overlapping-enums-2.d: Likewise.
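
The name-counting scheme (decorated identifier -> (type hash -> count)) can be modelled in a few lines. This is a sketch with small fixed arrays and hypothetical names (struct name_counts, count_name, name_is_conflicting), not the deduplicator's real hash tables; hashes are plain strings here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 16
#define MAX_HASHES 4

/* One identifier, with a count for each distinct type hash seen
   under that identifier.  */
struct name_count
{
  const char *name;
  const char *hash[MAX_HASHES];
  int count[MAX_HASHES];
  int n_hashes;
};

struct name_counts
{
  struct name_count names[MAX_NAMES];
  int n;
};

static struct name_count *
find_name (struct name_counts *nc, const char *name)
{
  int i;

  for (i = 0; i < nc->n; i++)
    if (strcmp (nc->names[i].name, name) == 0)
      return &nc->names[i];
  return NULL;
}

/* Note one sighting of NAME attached to the type with hash HASH.  For
   an enum this would be called once for the enum's own name and once
   per enumeration constant, always with the enum's hash.  */
static void
count_name (struct name_counts *nc, const char *name, const char *hash)
{
  struct name_count *e = find_name (nc, name);
  int i;

  if (e == NULL)
    {
      assert (nc->n < MAX_NAMES);
      e = &nc->names[nc->n++];
      e->name = name;
      e->n_hashes = 0;
    }

  for (i = 0; i < e->n_hashes; i++)
    if (strcmp (e->hash[i], hash) == 0)
      {
        e->count[i]++;
        return;
      }

  assert (e->n_hashes < MAX_HASHES);
  e->hash[e->n_hashes] = hash;
  e->count[e->n_hashes] = 1;
  e->n_hashes++;
}

/* A name is conflicting if more than one distinct hash uses it.  */
static int
name_is_conflicting (struct name_counts *nc, const char *name)
{
  struct name_count *e = find_name (nc, name);

  return e != NULL && e->n_hashes > 1;
}
```

Two TUs that both define an identical enum foo { A, B } contribute the same hash twice, so A and B stay unambiguous; a distinct enum that also contains A adds a second hash under A and makes it conflicting.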
|
|
The libctf-internal warning function ctf_err_warn() can be passed a libctf
errno as a parameter, and will add its textual errmsg form to the passed-in
error message. But if there is an error on the fp already, and this is
specifically an error and not a warning, ctf_err_warn() will print the error
out regardless: there's no need to pass in anything but 0.
There are still a lot of places where we do
ctf_err_warn (fp, 0, EFOO, ...);
return ctf_set_errno (fp, EFOO);
I've left all of those alone, because fixing it makes the code a bit longer:
but fixing the cases where no return is involved and the error has just been
set on the fp itself costs nothing and reduces redundancy a bit.
libctf/
* ctf-dedup.c (ctf_dedup_walk_output_mapping): Drop the errno arg.
(ctf_dedup_emit): Likewise.
(ctf_dedup_type_mapping): Likewise.
* ctf-link.c (ctf_create_per_cu): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
* ctf-lookup.c (ctf_lookup_symbol_idx): Likewise.
* ctf-subr.c (ctf_assert_fail_internal): Likewise.
|
|
Adds two new external authors to etc/update-copyright.py to cover
bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then
updates copyright messages as follows:
1) Update cgen/utils.scm emitted copyrights.
2) Run "etc/update-copyright.py --this-year" with an extra external
author I haven't committed, 'Kalray SA.', to cover gas testsuite
files (which should have their copyright message removed).
3) Build with --enable-maintainer-mode --enable-cgen-maint=yes.
4) Check out */po/*.pot which we don't update frequently.
|
|
Made sure there is no implicit conversion between signed and unsigned
return value for functions setting the ctf_errno value.
An example of the problem is that in ctf_member_next, the "offset" value
is either 0L or (ctf_id_t)-1L, but it should have been 0L or -1L.
The issue was discovered while building a 64 bit ld binary to be
executed on the Windows platform.
Example object file that demonstrates the issue is attached in the PR.
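
A small model of the conversion problem. It assumes the trouble arises when the ID type is a 32-bit unsigned integer while the function's return type is a 64-bit signed one (as would be the case for unsigned long and a 64-bit signed size type on 64-bit Windows). demo_id_t and demo_ssize_t are hypothetical fixed-width stand-ins chosen so that the effect shows up on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: a 32-bit unsigned ID type and a 64-bit signed
   return type.  */
typedef uint32_t demo_id_t;
typedef int64_t demo_ssize_t;

/* Buggy pattern: the error value passes through the unsigned ID type
   first, so it is widened as a large positive number.  */
static demo_ssize_t
fail_via_id_type (void)
{
  return (demo_id_t) -1L;
}

/* Fixed pattern: the error value stays signed all the way.  */
static demo_ssize_t
fail_signed (void)
{
  return -1L;
}
```

A caller testing the result with "< 0" never sees the first form as an error.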
libctf/
Affected functions adjusted.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
|
|
ctf_dedup's intern() function does not return a dynamically allocated
string, so I just spent ten minutes auditing for obvious memory leaks
that couldn't actually happen. Update the comment to note what it
actually returns (a pointer into an atoms table: i.e. possibly not
a new string, and not so easily leakable).
libctf/
* ctf-dedup.c (intern): Update comment.
|
|
If no suitable qsort_r is found in libc, we fall back to an
implementation in ctf-qsort.c. But this implementation routinely calls
the comparison function with two identical arguments. The comparison
function that ensures that the order of output types is stable is not
ready for this, misinterprets it as a type appearing more than once (a
can-never-happen condition) and fails with an assertion failure.
Fixed, audited for further instances of the same failure (none found)
and added a no-qsort test to my regular testsuite run.
libctf/:
PR libctf/30013
* ctf-dedup.c (sort_output_mapping): Inputs are always equal to
themselves.
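
A sketch of a comparator that tolerates being handed the same element twice. The names (struct out_type, compare_out_types, duplicates_seen) are hypothetical and the sort keys are simplified; a counter stands in for the assertion so that the example is safe to run against any qsort implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sort key: which input a type came from, and its ID.  */
struct out_type
{
  int input_num;
  unsigned long type_id;
};

/* Number of times two *distinct* elements compared equal on every key:
   the "can never happen" condition.  */
static int duplicates_seen;

static int
compare_out_types (const void *a_, const void *b_)
{
  const struct out_type *a = a_;
  const struct out_type *b = b_;

  /* An element is always equal to itself: this is not a duplicate.  */
  if (a == b)
    return 0;

  if (a->input_num != b->input_num)
    return a->input_num < b->input_num ? -1 : 1;
  if (a->type_id != b->type_id)
    return a->type_id < b->type_id ? -1 : 1;

  duplicates_seen++;
  return 0;
}

/* Sort TYPES by input number, then by type ID.  */
static void
sort_out_types (struct out_type *types, size_t n)
{
  qsort (types, n, sizeof (*types), compare_out_types);
}
```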
|
|
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
|
|
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
|
|
When two types conflict and they are not types which can have forwards
(say, two arrays of different sizes with the same name in two different
TUs) the CTF deduplicator uses a popularity contest to decide what to
do: the type cited by the most other types ends up put into the shared
dict, while the others are relegated to per-CU child dicts.
This works well as long as one type *is* most popular -- but what if
there is a tie? If several types have the same popularity count,
we end up picking the first we run across and promoting it, and
unfortunately since we are working over a dynhash in essentially
arbitrary order, this means we promote a random one. So multiple
runs of ld with the same inputs can produce different outputs!
All the outputs are valid, but this is still undesirable.
Adjust things to use the same strategy used to sort types on the output:
when there is a tie, always put the type that appears in a CU that
appeared earlier on the link line (and if there is somehow still a tie,
which should be impossible, pick the type with the lowest type ID).
Add a testcase -- and since this emerged when trying out extern arrays,
check that those work as well (this requires a newer GCC, but since all
GCCs that can emit CTF at all are unreleased this is probably OK as
well).
Fix up one testcase that has slight type ordering changes as a result
of this change.
libctf/ChangeLog:
* ctf-dedup.c (ctf_dedup_detect_name_ambiguity): Use
cd_output_first_gid to break ties.
ld/ChangeLog:
* testsuite/ld-ctf/array-conflicted-ordering.d: New test, using...
* testsuite/ld-ctf/array-char-conflicting-1.c: ... this...
* testsuite/ld-ctf/array-char-conflicting-2.c: ... and this.
* testsuite/ld-ctf/array-extern.d: New test, using...
* testsuite/ld-ctf/array-extern.c: ... this.
* testsuite/ld-ctf/conflicting-typedefs.d: Adjust for ordering
changes.
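
The tie-breaking order described above (popularity, then earliest CU on the link line, then lowest type ID) amounts to a three-level comparison. A sketch, with hypothetical names (struct candidate, candidate_better, pick_shared) and plain integers standing in for the real bookkeeping:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one of several conflicting types.  */
struct candidate
{
  long count;              /* How many other types cite this one.  */
  int first_input;         /* Link-line position of the CU it came from.  */
  unsigned long type_id;   /* Type ID within that CU.  */
};

/* Return nonzero if A should be promoted in preference to B.  */
static int
candidate_better (const struct candidate *a, const struct candidate *b)
{
  if (a->count != b->count)
    return a->count > b->count;
  if (a->first_input != b->first_input)
    return a->first_input < b->first_input;
  return a->type_id < b->type_id;
}

/* Return the index of the candidate to put in the shared dict.  The
   result depends only on the candidates, not on the order in which
   they happen to be visited.  */
static size_t
pick_shared (const struct candidate *c, size_t n)
{
  size_t best = 0;
  size_t i;

  assert (n > 0);
  for (i = 1; i < n; i++)
    if (candidate_better (&c[i], &c[best]))
      best = i;
  return best;
}
```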
|
|
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.
The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bpf opcodes files.
|
|
* ctf-impl.h (ctf_dynset_eq_string): Don't declare.
* ctf-hash.c (ctf_dynset_eq_string): Delete function.
* ctf-dedup.c (make_set_element): Use htab_eq_string.
(ctf_dedup_atoms_init, ADD_CITER, ctf_dedup_init): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup_walk_output_mapping): Likewise.
|
|
Before now, types that could not be encoded in CTF were represented as
references to type ID 0, which does not itself appear in the
dictionary. This choice is annoying in several ways, principally that it
forces generators and consumers of CTF to grow special cases for types
that are referenced in valid dicts but don't appear.
Allow an alternative representation (which will become the only
representation in format v4) whereby nonrepresentable types are encoded
as actual types with kind CTF_K_UNKNOWN (an already-existing kind
theoretically but not in practice used for padding, with value 0).
This is backward-compatible, because CTF_K_UNKNOWN was not used anywhere
before now: it was used in old-format function symtypetabs, but these
were never emitted by any compiler and the code to handle them in libctf
likely never worked and was removed last year, in favour of new-format
symtypetabs that contain only type IDs, not type kinds.
In order to link this type, we need an API addition to let us add types
of unknown kind to the dict: we let them optionally have names so that
GCC can emit many different unknown types and those types with identical
names will be deduplicated together. There are also small tweaks to the
deduplicator to actually dedup such types, to let opening of dicts with
unknown types with names work, to return the ECTF_NONREPRESENTABLE error
on resolution of such types (like ID 0), and to print their names as
something useful but not a valid C identifier, mostly for the sake of
the dumper.
Tests added in the next commit.
include/ChangeLog
2021-05-06 Nick Alcock <nick.alcock@oracle.com>
* ctf.h (CTF_K_UNKNOWN): Document that it can be used for
nonrepresentable types, not just padding.
* ctf-api.h (ctf_add_unknown): New.
libctf/ChangeLog
2021-05-06 Nick Alcock <nick.alcock@oracle.com>
* ctf-open.c (init_types): Unknown types may have names.
* ctf-types.c (ctf_type_resolve): CTF_K_UNKNOWN is as
non-representable as type ID 0.
(ctf_type_aname): Print unknown types.
* ctf-dedup.c (ctf_dedup_hash_type): Do not early-exit for
CTF_K_UNKNOWN types: they have real hash values now.
(ctf_dedup_rwalk_one_output_mapping): Treat CTF_K_UNKNOWN types
like other types with no referents: call the callback and do not
skip them.
(ctf_dedup_emit_type): Emit via...
* ctf-create.c (ctf_add_unknown): ... this new function.
* libctf.ver (LIBCTF_1.2): Add it.
|
|
Out-of-memory errors initializing the string atoms table were
disregarded (though they would have caused a segfault very shortly
afterwards). Errors hashing types during deduplication were only
reported if they happened on the output dict, which is almost never the
case (most errors are going to be on the dict we're working over, which
is going to be one of the inputs). (The error was detected in both
cases, but the errno was extracted from the wrong dict.)
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-dedup.c (ctf_dedup_rhash_type): Report errors on the input
dict properly.
* ctf-open.c (ctf_bufopen_internal): Report errors initializing
the atoms table.
|
|
This series eliminates a lot of special-case code to handle dynamic
types (types added to writable dicts and not yet serialized).
Historically, when such types have variable-length data in their final
CTF representations, libctf has always worked by adding such types to a
special union (ctf_dtdef_t.dtd_u) in the dynamic type definition
structure, then picking the members out of this structure at
serialization time and packing them into their final form.
This has the advantage that the ctf_add_* code doesn't need to know
anything about the final CTF representation, but the significant
disadvantage that all code that looks up types in any way needs two code
paths, one for dynamic types, one for all others. Historically libctf
"handled" this by not supporting most type lookups on dynamic types at
all until ctf_update was called to do a complete reserialization of the
entire dict (it didn't emit an error, it just emitted wrong results).
Since commit 676c3ecbad6e9c4, which eliminated ctf_update in favour of
the internal-only ctf_serialize function, all the type-lookup paths
grew an extra branch to handle dynamic types.
We can eliminate this branch again by dropping the dtd_u stuff and
simply writing out the vlen in (close to) its final form at ctf_add_*
time: type lookup for types using this approach is then identical for
types in writable dicts and types that are in read-only ones, and
serialization is also simplified (we just need to write out the vlen
we already created).
The only complexity lies in type kinds for which multiple
vlen representations are valid depending on properties of the type,
e.g. structures. But we can start simple, adjusting ints, floats,
and slices to work this way, and leaving everything else as is.
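A minimal sketch of the idea, assuming a much-simplified type record: toy_type, toy_add_int, toy_type_encoding and toy_emit are hypothetical names for illustration and do not reflect the real ctf_dtdef_t layout.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory type record: the variable-length data is kept
   in the same byte layout whether the type was just added or was read
   from a serialized dict.  */
struct toy_type
{
  size_t vlen_size;
  unsigned char *vlen;
};

/* Add an integer type: store its encoding word directly as the vlen.
   Returns 0 on success, -1 on allocation failure.  */
int
toy_add_int (struct toy_type *t, uint32_t encoding)
{
  t->vlen = malloc (sizeof (encoding));
  if (t->vlen == NULL)
    return -1;
  memcpy (t->vlen, &encoding, sizeof (encoding));
  t->vlen_size = sizeof (encoding);
  return 0;
}

/* Look up the encoding: one code path, no dynamic-type special case.  */
uint32_t
toy_type_encoding (const struct toy_type *t)
{
  uint32_t encoding;

  memcpy (&encoding, t->vlen, sizeof (encoding));
  return encoding;
}

/* Serialize: just copy out the vlen that was already built.  BUF must
   have room for t->vlen_size bytes.  Returns the bytes written.  */
size_t
toy_emit (const struct toy_type *t, unsigned char *buf)
{
  memcpy (buf, t->vlen, t->vlen_size);
  return t->vlen_size;
}
```

Because the add function builds the final bytes up front, both the lookup and the emitter reduce to a copy.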
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dtdef_t) <dtd_u.dtu_enc>: Remove.
<dtd_u.dtu_slice>: Likewise.
<dtd_vlen>: New.
* ctf-create.c (ctf_add_generic): Perhaps allocate it. All
callers adjusted.
(ctf_dtd_delete): Free it.
(ctf_add_slice): Use the dtd_vlen, not dtu_enc.
(ctf_add_encoded): Likewise. Assert that this must be an int or
float.
* ctf-serialize.c (ctf_emit_type_sect): Just copy the dtd_vlen.
* ctf-dedup.c (ctf_dedup_rhash_type): Use the dtd_vlen, not
dtu_slice.
* ctf-types.c (ctf_type_reference): Likewise.
(ctf_type_encoding): Remove most dynamic-type-specific code: just
get the vlen from the right place. Report failure to look up the
underlying type's encoding.
|
|
It's formatted like this:
do
  {
    ...
  }
while (...);
Not like this:
do
  {
    ...
  } while (...);
or this:
do {
  ...
} while (...);
We used both in various places in libctf. Fixing it necessitated some
light reindentation.
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-archive.c (ctf_archive_next): GNU style fix for do {} while.
* ctf-dedup.c (ctf_dedup_rhash_type): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
* ctf-dump.c (ctf_dump_format_type): Likewise.
* ctf-lookup.c (ctf_symbol_next): Likewise.
* swap.h (swap_thing): Likewise.
|
|
A transient bug in the preceding change (fixed before commit) exposed a
new failure, of ld/testsuite/ld-ctf/diag-parname.d. This attempts to
ensure that if we link a dict with child type IDs but no attached
parent, we get a suitable ECTF_NOPARENT error. This was happening
before this commit, but only by chance, because ctf_variable_iter and
ctf_variable_next check to see if the dict they're passed is a child
dict without an associated parent. We forgot error-checking on the
ctf_variable_next call, and as a result this was concealed -- and
looking for the problem exposed a new bug.
If any of the lookups beneath ctf_dedup_hash_type fail, the CTF link
does *not* fail, but acts quite bizarrely, skipping the type but
emitting an error to the CTF error/warning log -- so the linker will
report an error, emit a partial CTF dict missing some types, and exit
with exit code 0 as if nothing went wrong. Since ctf_dedup_hash_type is
never expected to fail in normal operation, this is surely wrong:
failures at emission time do not emit partial CTF dicts, so failures
at hashing time should not either.
So propagate the error back up.
Also fix a couple of smaller bugs where we fail to properly free things
and/or propagate error codes on various rare link-time errors and
out-of-memory conditions.
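The change in behaviour can be sketched as follows. This is an illustrative model only: toy_hash_one and toy_dedup are hypothetical, and negative type IDs merely stand in for lookups that fail.

```c
#include <stddef.h>

/* Hypothetical per-type hashing step: returns 0 on success or a nonzero
   error code.  Negative type IDs stand in for failed lookups.  */
static int
toy_hash_one (int type_id)
{
  return type_id < 0 ? 5 : 0;
}

/* Hash every type in IDS.  Stop at the first failure and hand its error
   code back to the caller, rather than logging it, skipping the type and
   carrying on.  *HASHED is set to the number of types hashed.  */
int
toy_dedup (const int *ids, size_t n, size_t *hashed)
{
  size_t i;

  *hashed = 0;
  for (i = 0; i < n; i++)
    {
      int err = toy_hash_one (ids[i]);

      if (err != 0)
        return err;             /* Propagate: no partial output.  */
      (*hashed)++;
    }
  return 0;
}
```

A caller that sees a nonzero return can then fail the whole link instead of emitting a dict with types missing.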
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-dedup.c (ctf_dedup): Pass on errors from ctf_dedup_hash_type.
Call ctf_dedup_fini properly on other errors.
(ctf_dedup_emit_type): Set the errno on dynhash insertion failure.
* ctf-link.c (ctf_link_deduplicating_per_cu): Close outputs beyond
output 0 when asserting because >1 output is found.
(ctf_link_deduplicating): Likewise, when asserting because the
shared output is not the same as the passed-in fp.
|
|
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now that the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
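The on-demand lookup can be sketched roughly like this. It is a simplification under stated assumptions: toy_hashed, toy_emitted and toy_dedup_type_mapping are hypothetical, and plain arrays with linear searches stand in for the deduplicator's hash tables.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record kept while hashing: which hash an input type got.  */
struct toy_hashed
{
  int input_id;
  const char *hash;
};

/* Hypothetical record kept at emission time: which output type ID a
   hash was emitted as.  */
struct toy_emitted
{
  const char *hash;
  int output_id;
};

/* Map an input type ID to an output type ID on demand, via its hash,
   using only tables that were already built.  Returns -1 if the type
   was never hashed or never emitted.  */
int
toy_dedup_type_mapping (const struct toy_hashed *hashed, size_t n_hashed,
                        const struct toy_emitted *emitted, size_t n_emitted,
                        int input_id)
{
  size_t i, j;

  for (i = 0; i < n_hashed; i++)
    if (hashed[i].input_id == input_id)
      for (j = 0; j < n_emitted; j++)
        if (strcmp (hashed[i].hash, emitted[j].hash) == 0)
          return emitted[j].output_id;
  return -1;
}
```

Nothing is copied into a second table: each query goes from input type to hash to output type, which is what removes the repopulation phase.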
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
|
|
The ctf_type_name_raw and ctf_type_aname_raw functions, which return the
raw, unadorned name of CTF types, have one unfortunate wrinkle: they
return NULL not only on error but when returning the name of types
without a name in writable dicts. This was unintended: it not only
makes it impossible to reliably tell if a given call to
ctf_type_name_raw failed (due to a bad string offset say), but also
complicates all its callers, who now have to check for both NULL and "".
The written-out form of CTF has no concept of a NULL pointer instead of
a string: all null strings are strtab offset 0, "". So the more we can
do to remove this distinction from the writable form, the less complex
the rest of our code needs to be.
Armour against NULL in multiple places, arranging to return "" from
ctf_type_name_raw if offset 0 is passed in, and removing a risky
optimization from ctf_str_add* that avoided doing anything if a NULL was
passed in: this added needless irregularity to the functions' API
surface, since "" and NULL should be treated identically, and in the
case of ctf_str_add_ref, we shouldn't skip adding the passed-in REF to
the list of references to be updated no matter what the content of the
string happens to be.
This means we can simplify the deduplicator a tiny bit, also fixing a
bug (latent when used by ld): if the input dict was writable, we failed
to realise when types were nameless and could end up creating deeply
unhelpful synthetic forwards with no name. Such forwards were banned a
few commits ago, so the link failed.
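The intended contract can be sketched as below. This is a toy string table, not libctf's atoms table: toy_strtab, toy_name_raw and toy_str_normalize are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical string table: offset 0 is always the empty string.  */
struct toy_strtab
{
  const char *data;
  size_t len;
};

/* Return the string at OFFSET.  NULL means only "bad offset"; a
   nameless type (offset 0) yields "", so callers need one check for
   failure and one for namelessness, never both for the same thing.  */
const char *
toy_name_raw (const struct toy_strtab *tab, size_t offset)
{
  if (offset == 0)
    return "";
  if (offset >= tab->len)
    return NULL;
  return tab->data + offset;
}

/* Normalize a name before adding it: NULL is treated exactly like "".  */
const char *
toy_str_normalize (const char *name)
{
  return name == NULL ? "" : name;
}
```

With this shape, a NULL return is unambiguous, and a caller that wants to know whether a type is nameless only has to test the first character.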
libctf/ChangeLog
2021-01-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-string.c (ctf_str_add): Treat adding a NULL as adding "".
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Likewise.
* ctf-types.c (ctf_type_name_raw): Always return "" for offset 0.
* ctf-dedup.c (ctf_dedup_multiple_input_dicts): Don't armour
against NULL name.
(ctf_dedup_maybe_synthesize_forward): Likewise.
|
|
libctf has no intrinsic support for the GCC unnamed structure member
extension. This principally means that you can't look up named members
inside unnamed struct or union members via ctf_member_info: you have to
tiresomely find out the type ID of the unnamed members via iteration,
then look in each of these.
This is ridiculous. Fix it by extending ctf_member_info so that it
recurses into unnamed members for you: this is still unambiguous because
GCC won't let you create ambiguously-named members even in the presence
of this extension.
For consistency, and because the release hasn't happened and we can
still do this, break the ctf_member_next API and add flags: we specify
one flag, CTF_MN_RECURSE, which if set causes ctf_member_next to
automatically recurse into unnamed members for you, returning not only
the members themselves but all their contained members, so that you can
use ctf_member_next to identify every member that it would be valid to
call ctf_member_info with.
New lookup tests are added for all of this.
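The recursive lookup can be sketched on a toy member tree. The names here (toy_member, toy_member_info) are hypothetical and the structure is an assumption made for illustration; the real functions work on CTF type IDs, not on a tree like this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical member description.  An unnamed struct/union member has
   an empty name and a non-NULL SUB array of N_SUB contained members.  */
struct toy_member
{
  const char *name;
  const struct toy_member *sub;
  size_t n_sub;
};

/* Find NAME among MEMBERS, descending into unnamed members.  Returns
   the member, or NULL if there is none.  */
const struct toy_member *
toy_member_info (const struct toy_member *members, size_t n,
                 const char *name)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      const struct toy_member *m = &members[i];

      if (m->name[0] != '\0')
        {
          if (strcmp (m->name, name) == 0)
            return m;
        }
      else if (m->sub != NULL)
        {
          const struct toy_member *found
            = toy_member_info (m->sub, m->n_sub, name);

          if (found != NULL)
            return found;
        }
    }
  return NULL;
}
```

Returning the first match is safe for the reason given above: names cannot be ambiguous across a struct and its unnamed members.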
include/ChangeLog
2021-01-05 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_MN_RECURSE): New.
(ctf_member_next): Add flags argument.
libctf/ChangeLog
2021-01-05 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (struct ctf_next) <u.ctn_next>: Move to...
<ctn_next>: ... here.
* ctf-util.c (ctf_next_destroy): Unconditionally destroy it.
* ctf-lookup.c (ctf_symbol_next): Adjust accordingly.
* ctf-types.c (ctf_member_iter): Reimplement in terms of...
(ctf_member_next): ... this. Support recursive unnamed member
iteration (off by default).
(ctf_member_info): Look up members in unnamed sub-structs.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_member_next call.
(ctf_dedup_emit_struct_members): Likewise.
* testsuite/libctf-lookup/struct-iteration-ctf.c: Test empty unnamed
members, and a normal member after the end.
* testsuite/libctf-lookup/struct-iteration.c: Verify that
ctf_member_count is consistent with the number of successful returns
from a non-recursive ctf_member_next.
* testsuite/libctf-lookup/struct-iteration-*: New, test iteration
over struct members.
* testsuite/libctf-lookup/struct-lookup.c: New test.
* testsuite/libctf-lookup/struct-lookup.lk: New test.
|
|
|