riscv-gnu-toolchain/binutils.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2020-08-27	libctf, binutils, include, ld: gettextize and improve error handling	Nick Alcock	1	-65/+68
	This commit follows on from the earlier commit "libctf, ld, binutils: add textual error/warning reporting for libctf" and converts every error in libctf that was reported using ctf_dprintf to use ctf_err_warn instead, gettextizing them in the process, using N_() where necessary to avoid doing gettext calls unless an error message is actually generated, and rephrasing some error messages for ease of translation. This requires a slight change in the ctf_errwarning_next API: this API is public but has not been in a release yet, so can still change freely. The problem is that many errors are emitted at open time (whether opening of a CTF dict, or opening of a CTF archive): the former of these throws away its incompletely-initialized ctf_file_t rather than return it, and the latter has no ctf_file_t at all. So errors and warnings emitted at open time cannot be stored in the ctf_file_t, and have to go elsewhere. We put them in a static local in ctf-subr.c (which is not very thread-safe: a later commit will improve things here): ctf_err_warn with a NULL fp adds to this list, and the public interface ctf_errwarning_next with a NULL fp retrieves from it. We need a slight exception from the usual iterator rules in this case: with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error" which signifies the end of iteration, so we add a new err parameter to ctf_errwarning_next which is used to report such iteration-related errors. (If an fp is provided -- i.e., if not reporting open errors -- this is optional, but even if it's optional it's still an API change. This is actually useful from a usability POV as well, since ctf_errwarning_next is usually called when there's been an error, so overwriting the error code with ECTF_NEXT_END is not very helpful! So, unusually, ctf_errwarning_next now uses the passed fp for its error code only if no errp pointer is passed in, and leaves it untouched otherwise.) ld, objdump and readelf are adapted to call ctf_errwarning_next with a NULL fp to report open errors where appropriate. The ctf_err_warn API also has to change, gaining a new error-number parameter which is used to add the error message corresponding to that error number into the debug stream when LIBCTF_DEBUG is enabled: changing this API is easy at this point since we are already touching all existing calls to gettextize them. We need this because the debug stream should contain the errno's message, but the error reported in the error/warning stream should not, because the caller will probably report it themselves at failure time regardless, and reporting it in every error message that leads up to it leads to a ridiculous chattering on failure, which is likely to end up as ridiculous chattering on stderr (trimmed a bit): CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable' CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable' CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable' ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable' We only need to be told that the parent CTF dictionary is unavailable once, not over and over again! errmsgs are still emitted on warning generation, because warnings do not usually lead to a failure propagated up to the caller and reported there. Debug-stream messages are not translated. If translation is turned on, there will be a mixture of English and translated messages in the debug stream, but rather that than burden the translators with debug-only output. binutils/ChangeLog 2020-08-27 Nick Alcock <nick.alcock@oracle.com> * objdump.c (dump_ctf_archive_member): Move error- reporting... (dump_ctf_errs): ... into this separate function. (dump_ctf): Call it on open errors. * readelf.c (dump_ctf_archive_member): Move error- reporting... (dump_ctf_errs): ... into this separate function. Support calls with NULL fp. Adjust for new err parameter to ctf_errwarning_next. (dump_section_as_ctf): Call it on open errors. include/ChangeLog 2020-08-27 Nick Alcock <nick.alcock@oracle.com> * ctf-api.h (ctf_errwarning_next): New err parameter. ld/ChangeLog 2020-08-27 Nick Alcock <nick.alcock@oracle.com> * ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp. Adjust for new err parameter to ctf_errwarning_next. Only check for assertion failures when fp is non-NULL. (ldlang_open_ctf): Call it on open errors. * testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid breaking the diags tests. libctf/ChangeLog 2020-08-27 Nick Alcock <nick.alcock@oracle.com> * ctf-subr.c (open_errors): New list. (ctf_err_warn): Calls with NULL fp append to open_errors. Add err parameter, and use it to decorate the debug stream with errmsgs. (ctf_err_warn_to_open): Splice errors from a CTF dict into the open_errors. (ctf_errwarning_next): Calls with NULL fp report from open_errors. New err param to report iteration errors (including end-of-iteration) when fp is NULL. (ctf_assert_fail_internal): Adjust ctf_err_warn call for new err parameter: gettextize. * ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter. (LCTF_VBYTES): Adjust. (ctf_err_warn_to_open): New. (ctf_err_warn): Adjust. (ctf_bundle): Used in only one place: move... * ctf-create.c: ... here. (enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number down as needed. Don't emit the errmsg. Gettextize. (membcmp): Likewise. (ctf_add_type_internal): Likewise. (ctf_write_mem): Likewise. (ctf_compress_write): Likewise. Report errors writing the header or body. (ctf_write): Likewise. * ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not ctf_dprintf, and gettextize, as above. (ctf_arc_write): Likewise. (ctf_arc_bufopen): Likewise. (ctf_arc_open_internal): Likewise. * ctf-labels.c (ctf_label_iter): Likewise. * ctf-open-bfd.c (ctf_bfdclose): Likewise. (ctf_bfdopen): Likewise. (ctf_bfdopen_ctfsect): Likewise. (ctf_fdopen): Likewise. * ctf-string.c (ctf_str_write_strtab): Likewise. * ctf-types.c (ctf_type_resolve): Likewise. * ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict. (get_vbytes_v1): Pass down the ctf dict. (get_vbytes_v2): Likewise. (flip_ctf): Likewise. (flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and gettextize, as above. (upgrade_types_v1): Adjust calls. (init_types): Use ctf_err_warn, not ctf_dprintf, as above. (ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors emitted into individual dicts into the open errors if this turns out to be a failed open in the end. * ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err argument. Gettextize. Don't emit the errmsg. (ctf_dump_funcs): Likewise. Collapse err label into its only case. (ctf_dump_type): Likewise. * ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err argument. Gettextize. Don't emit the errmsg. (ctf_link_one_type): Likewise. (ctf_link_lazy_open): Likewise. (ctf_link_one_input_archive): Likewise. (ctf_link_deduplicating_count_inputs): Likewise. (ctf_link_deduplicating_open_inputs): Likewise. (ctf_link_deduplicating_close_inputs): Likewise. (ctf_link_deduplicating): Likewise. (ctf_link): Likewise. (ctf_link_deduplicating_per_cu): Likewise. Add some missed ctf_set_errnos to obscure error cases. * ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new err argument. Gettextize. Don't emit the errmsg. (ctf_dedup_populate_mappings): Likewise. (ctf_dedup_detect_name_ambiguity): Likewise. (ctf_dedup_init): Likewise. (ctf_dedup_multiple_input_dicts): Likewise. (ctf_dedup_conflictify_unshared): Likewise. (ctf_dedup): Likewise. (ctf_dedup_rwalk_one_output_mapping): Likewise. (ctf_dedup_id_to_target): Likewise. (ctf_dedup_emit_type): Likewise. (ctf_dedup_emit_struct_members): Likewise. (ctf_dedup_populate_type_mapping): Likewise. (ctf_dedup_populate_type_mappings): Likewise. (ctf_dedup_emit): Likewise. (ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error status setting. (ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide unknown-type-kind messages (which signify file corruption).
2020-07-22	libctf, link: tie in the deduplicating linker	Nick Alcock	1	-2/+658
	This fairly intricate commit connects up the CTF linker machinery (which operates in terms of ctf_archive_t's on ctf_link_inputs -> ctf_link_outputs) to the deduplicator (which operates in terms of arrays of ctf_file_t's, all the archives exploded). The nondeduplicating linker is retained, but is not called unless the CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have confidence in the much-more-complex deduplicating linker, I hope the nondeduplicating linker can be removed. In brief, what this does is traverses each input archive in ctf_link_inputs, opening every member (if not already open) and tying child dicts to their parents, shoving them into an array and constructing a corresponding parents array that tells the deduplicator which dict is the parent of which child. We then call ctf_dedup and ctf_dedup_emit with that array of inputs, taking the outputs that result and putting them into ctf_link_outputs where the rest of the CTF linker expects to find them, then linking in the variables just as is done by the nondeduplicating linker. It also implements much of the CU-mapping side of things. The problem CU-mapping introduces is that if you map many input CUs into one output, this is saying that you want many translation units to produce at most one child dict if conflicting types are found in any of them. This means you can suddenly have multiple distinct types with the same name in the same dict, which libctf cannot really represent because it's not something you can do with C translation units. The deduplicator machinery already committed does as best it can with these, hiding types with conflicting names rather than making child dicts out of them: but we still need to call it. This is done similarly to the main link, taking the inputs (one CU output at a time), deduplicating them, taking the output and making it an input to the final link. Two (significant) optimizations are done: we share atoms tables between all these links and the final link (so e.g. all type hash values are shared, all decorated type names, etc); and any CU-mapped links with only one input (and no child dicts) doesn't need to do anything other than renaming the CU: the CU-mapped link phase can be skipped for it. Put together, large CU-mapped links can save 50% of their memory usage and about as much time (and the memory usage for CU-mapped links is significant, because all those output CUs have to have all their types stored in memory all at once). include/ * ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the deduplicator. libctf/ * ctf-impl.h (ctf_list_splice): New. * ctf-util.h (ctf_list_splice): Likewise. * ctf-link.c (link_sort_inputs_cb_arg_t): Likewise. (ctf_link_sort_inputs): Likewise. (ctf_link_deduplicating_count_inputs): Likewise. (ctf_link_deduplicating_open_inputs): Likewise. (ctf_link_deduplicating_close_inputs): Likewise. (ctf_link_deduplicating_variables): Likewise. (ctf_link_deduplicating_per_cu): Likewise. (ctf_link_deduplicating): Likewise. (ctf_link): Call it.
2020-07-22	libctf, link: add CTF_LINK_OMIT_VARIABLES_SECTION	Nick Alcock	1	-1/+2
	This flag (not used anywhere yet) causes the variables section to be omitted from the output CTF dict. include/ * ctf-api.h (CTF_LINK_OMIT_VARIABLES_SECTION): New. libctf/ * ctf-link.c (ctf_link_one_input_archive_member): Check CTF_LINK_OMIT_VARIABLES_SECTION.
2020-07-22	libctf, link: add the ability to filter out variables from the link	Nick Alcock	1	-0/+19
	The CTF variables section (containing variables that have no corresponding symtab entries) can cause the string table to get very voluminous if the names of variables are long. Some callers want to filter out particular variables they know they won't need. So add a "variable filter" callback that does that: it's passed the name of the variable and a corresponding ctf_file_t / ctf_id_t pair, and should return 1 to filter it out. ld doesn't use this machinery yet, but we could easily add it later if desired. (But see later for a commit that turns off CTF variable- section linking in ld entirely by default.) include/ * ctf-api.h (ctf_link_variable_filter_t): New. (ctf_link_set_variable_filter): Likewise. libctf/ * libctf.ver (ctf_link_set_variable_filter): Add. * ctf-impl.h (ctf_file_t) <ctf_link_variable_filter>: New. <ctf_link_variable_filter_arg>: Likewise. * ctf-create.c (ctf_serialize): Adjust. * ctf-link.c (ctf_link_set_variable_filter): New, set it. (ctf_link_one_variable): Call it if set.
2020-07-22	libctf, link: fix spurious conflicts of variables in the variable section	Nick Alcock	1	-1/+1
	When we link a CTF variable, we check to see if it already exists in the parent dict first: if it does, and it has a type the same as the type we would populate it with, we assume we don't need to do anything: otherwise, we populate it in a per-CU child. Or that's what we should be doing. Instead, we check if the type is the same as the type in source dict, which is going to be a completely different value! So we end up concluding all variables are conflicting, bloating up output possibly quite a lot (variables aren't big in and of themselves, but each drags around a strtab entry, and CTF dicts in a CTF archive do not share their strtabs -- one of many problems with CTF archives as presently constituted.) Fix trivial: check the right type. libctf/ * ctf-link.c (ctf_link_one_variable): Check the dst_type for conflicts, not the source type.
2020-07-22	libctf, link: redo cu-mapping handling	Nick Alcock	1	-25/+82
	Now a bunch of stuff that doesn't apply to ld or any normal use of libctf, piled into one commit so that it's easier to ignore. The cu-mapping machinery associates incoming compilation unit names with outgoing names of CTF dictionaries that should correspond to them, for non-gdb CTF consumers that would like to group multiple TUs into a single child dict if conflicting types are found in it (the existing use case is one kernel module, one child CTF dict, even if the kernel module is composed of multiple CUs). The upcoming deduplicator needs to track not only the mapping from incoming CU name to outgoing dict name, but the inverse mapping from outgoing dict name to incoming CU name, so it can work over every CTF dict we might see in the output and link into it. So rejig the ctf-link machinery to do that. Simultaneously (because they are closely associated and were written at the same time), we add a new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the ctf_link machinery to create empty child dicts for each outgoing CU mapping even if no CUs that correspond to it exist in the link. This is a bit (OK, quite a lot) of a waste of space, but some existing consumers require it. (Nobody else should use it.) Its value is not consecutive with existing CTF_LINK flag values because we're about to add more flags that are conceptually closer to the existing ones than this one is. include/ * ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New. libctf/ * ctf-impl.h (ctf_file_t): Improve comments. <ctf_link_cu_mapping>: Split into... <ctf_link_in_cu_mapping>: ... this... <ctf_link_out_cu_mapping>: ... and this. * ctf-create.c (ctf_serialize): Adjust. * ctf-open.c (ctf_file_close): Likewise. * ctf-link.c (ctf_create_per_cu): Look things up in the in_cu_mapping instead of the cu_mapping. (ctf_link_add_cu_mapping): The deduplicating link will define what happens if many FROMs share a TO. (ctf_link_add_cu_mapping): Create in_cu_mapping and out_cu_mapping. Do not create ctf_link_outputs here any more, or create per-CU dicts here: they are already created when needed. (ctf_link_one_variable): Log a debug message if we skip a variable due to its type being concealed in a CU-mapped link. (This is probably too common a case to make into a warning.) (ctf_link): Create empty per-CU dicts if requested.
2020-07-22	libctf, link: fix ctf_link_write fd leak	Nick Alcock	1	-0/+1
	We were leaking the fd on every invocation. libctf/ * ctf-link.c (ctf_link_write): Close the fd.
2020-07-22	libctf, link: add lazy linking: clean up input members: err/warn cleanup	Nick Alcock	1	-100/+281
	This rather large and intertwined pile of changes does three things: First, it transitions from dprintf to ctf_err_warn for things the user might care about: this one file is the major impetus for the ctf_err_warn infrastructure, because things like file names are crucial in linker error messages, and errno values are utterly incapable of communicating them Second, it stabilizes the ctf_link APIs: you can now call ctf_link_add_ctf without a CTF argument (only a NAME), to lazily ctf_open the file with the given NAME when needed, and close it as soon as possible, to save memory. This is not an API change because a null CTF argument was prohibited before now. Since getting CTF directly from files uses ctf_open, passing in only a NAME requires use of libctf, not libctf-nobfd. The linker's behaviour is unchanged, as it still passes in a ctf_archive_t as before. This also let us fix a leak: we were opening ctf_archives and their containing ctf_files, then only closing the files and leaving the archives open. Third, this commit restructures the ctf_link_in_member argument used by the CTF linking machinery and adjusts its users accordingly. We drop two members: - arcname, which is difficult to construct and then only used in error messages (that were only dprintf()ed, so never seen!) - share_mode, since we store the flags passed to ctf_link (including the share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold of it We rename others whose existing names were fairly dreadful: - done_main_member -> done_parent, using consistent terminology for .ctf as the parent of all archive members - main_input_fp -> in_fp_parent, likewise - file_name -> in_file_name, likewise We add one new member, cu_mapped. Finally, we move the various frees of things like mapping table data to the top-level ctf_link, since deduplicating links will want to do that too. include/ * ctf-api.h (ECTF_NEEDSBFD): New. (ECTF_NERR): Adjust. (ctf_link): Rename share_mode arg to flags. libctf/ * Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere. * Makefile.in: Regenerated. * ctf-impl.h (ctf_link_input_name): New. (ctf_file_t) <ctf_link_flags>: New. * ctf-create.c (ctf_serialize): Adjust accordingly. * ctf-link.c: Define ctf_open as weak when PIC. (ctf_arc_close_thunk): Remove unnecessary thunk. (ctf_file_close_thunk): Likewise. (ctf_link_input_name): New. (ctf_link_input_t): New value of the ctf_file_t.ctf_link_input. (ctf_link_input_close): Adjust accordingly. (ctf_link_add_ctf_internal): New, split from... (ctf_link_add_ctf): ... here. Return error if lazy loading of CTF is not possible. Change to just call... (ctf_link_add): ... this new function. (ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the ctf_file_close_thunk. (ctf_link_in_member_cb_arg_t) <file_name> Rename to... <in_file_name>: ... this. <arcname>: Drop. <share_mode>: Likewise (migrated to ctf_link_flags). <done_main_member>: Rename to... <done_parent>: ... this. <main_input_fp>: Rename to... <in_fp_parent>: ... this. <cu_mapped>: New. (ctf_link_one_type): Adjuwt accordingly. Transition to ctf_err_warn, removing a TODO. (ctf_link_one_variable): Note a case too common to warn about. Report in the debug stream if a cu-mapped link prevents addition of a conflicting variable. (ctf_link_one_input_archive_member): Adjust. (ctf_link_lazy_open): New, open a CTF archive for linking when needed. (ctf_link_close_one_input_archive): New, close it again. (ctf_link_one_input_archive): Adjust for lazy opening, member renames, and ctf_err_warn transition. Move the empty_link_type_mapping call to... (ctf_link): ... here. Adjut for renamings and thunk removal. Don't spuriously fail if some input contains no CTF data. (ctf_link_write): ctf_err_warn transition. * libctf.ver: Remove not-yet-stable comment.
2020-07-22	libctf: sort out potential refcount loops	Nick Alcock	1	-1/+1
	When you link TUs that contain conflicting types together, the resulting CTF section is an archive containing many CTF dicts. These dicts appear in ctf_link_outputs of the shared dict, with each ctf_import'ing that shared dict. ctf_importing a dict bumps its refcount to stop it going away while it's in use -- but if the shared dict (whose refcount is bumped) has the child dict (doing the bumping) in its ctf_link_outputs, we have a refcount loop, since the child dict only un-ctf_imports and drops the parent's refcount when it is freed, but the child is only freed when the parent's refcount falls to zero. (In the future, this will be able to go wrong on the inputs too, when an ld -r'ed deduplicated output with conflicts is relinked. Right now this cannot happen because we don't ctf_import such dicts at all. This will be fixed in a later commit in this series.) Fix this by introducing an internal-use-only ctf_import_unref function that imports a parent dict witthout bumping the parent's refcount, and using it when we create per-CU outputs. This function is only safe to use if you know the parent cannot go away while the child exists: but if the parent owns the child, as here, this is necessarily true. Record in the ctf_file_t whether a parent was imported via ctf_import or ctf_import_unref, so that if you do another ctf_import later on (or a ctf_import_unref) it can decide whether to drop the refcount of the existing parent being replaced depending on which function you used to import that one. Adjust ctf_serialize so that rather than doing a ctf_import (which is wrong if the original import was ctf_import_unref'fed), we just copy the parent field and refcount over and forcibly flip the unref flag on on the old copy we are going to discard. ctf_file_close also needs a bit of tweaking to only close the parent if it was not imported with ctf_import_unref: while we're at it, guard against repeated closes with a refcount of zero and stop them causing double-frees, even if destruction of things freed inside ctf_file_close cause such recursion. Verified no leaks or accesses to freed memory after all of this with valgrind. (It was leak-happy before.) libctf/ * ctf-impl.c (ctf_file_t) <ctf_parent_unreffed>: New. (ctf_import_unref): New. * ctf-open.c (ctf_file_close) Drop the refcount all the way to zero. Don't recurse back in if the refcount is already zero. (ctf_import): Check ctf_parent_unreffed before deciding whether to close a pre-existing parent. Set it to zero. (ctf_import_unreffed): New, as above, setting ctf_parent_unreffed to 1. * ctf-create.c (ctf_serialize): Do not ctf_import into the new child: use direct assignment, and set unreffed on the new and old children. * ctf-link.c (ctf_create_per_cu): Import the parent using ctf_import_unreffed.
2020-07-22	libctf: rename the type_mapping_key to type_key	Nick Alcock	1	-9/+12
	The name was just annoyingly long and I kept misspelling it. It's also a bad name: it's not a mapping the type might be used in a type mapping, but it is itself a representation of a type (a ctf_file_t / ctf_id_t pair), not of a mapping at all. libctf/ * ctf-impl.h (ctf_link_type_mapping_key): Rename to... (ctf_link_type_key): ... this, adjusting member prefixes to match. (ctf_hash_type_mapping_key): Rename to... (ctf_hash_type_key): ... this. (ctf_hash_eq_type_mapping_key): Rename to... (ctf_hash_eq_type_key): ... this. * ctf-hash.c (ctf_hash_type_mapping_key): Rename to... (ctf_hash_type_key): ... this, and adjust for member name changes. (ctf_hash_eq_type_mapping_key): Rename to... (ctf_hash_eq_type_key): ... this, and adjust for member name changes. * ctf-link.c (ctf_add_type_mapping): Adjust. Note the lack of need for out-of-memory checking in this code. (ctf_type_mapping): Adjust.
2020-01-01	Update year range in copyright notice of binutils files	Alan Modra	1	-1/+1

2019-10-03	libctf: avoid the need to ever use ctf_update	Nick Alcock	1	-17/+3
	The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so their type ID does not change. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t .) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t . (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-10-03	libctf: handle nonrepresentable types at link time	Nick Alcock	1	-4/+14
	GCC can emit references to type 0 to indicate that this type is one that is not representable in the version of CTF it emits (for instance, version 3 cannot encode vector types). Type 0 is already used in the function section to indicate padding inserted to skip functions we do not want to encode the type of, so using zero in this way is a good extension of the format: but libctf reports such types as ECTF_BADID, which is indistinguishable from file corruption via links to truly nonexistent types with IDs like 0xDEADBEEF etc, which we really do want to stop for. In particular, this stops all traversals of types dead at this point, preventing us from even dumping CTF files containing unrepresentable types to see what's going on! So add a new error, ECTF_NONREPRESENTABLE, which is returned by recursive type resolution when a reference to a zero type is found. (No zero type is ever emitted into the CTF file by GCC, only references to one). We can't do much with types that are ultimately nonrepresentable, but we can do enough to keep functioning. Adjust ctf_add_type to ensure that top-level types of type zero and structure and union members of ultimate type zero are simply skipped without reporting an error, so we can copy structures and unions that contain nonrepresentable members (skipping them and leaving a hole where they would be, so no consumers downstream of the linker need to worry about this): adjust the dumper so that we dump members of nonrepresentable types in a simple form that indicates nonrepresentability rather than terminating the dump, and do not falsely assume all errors to be -ENOMEM: adjust the linker so that types that fail to get added are simply skipped, so that both nonrepresentable types and outright errors do not terminate the type addition, which could skip many valid types and cause further errors when variables of those types are added. In future, when we gain the ability to call back to the linker to report link-time type resolution errors, we should report failures to add all but nonrepresentable types. But we can't do that yet. v5: Fix tabdamage. include/ * ctf-api.h (ECTF_NONREPRESENTABLE): New. libctf/ * ctf-types.c (ctf_type_resolve): Return ECTF_NONREPRESENTABLE on type zero. * ctf-create.c (ctf_add_type): Detect and skip nonrepresentable members and types. (ctf_add_variable): Likewise for variables pointing to them. * ctf-link.c (ctf_link_one_type): Do not warn for nonrepresentable type link failure, but do warn for others. * ctf-dump.c (ctf_dump_format_type): Likewise. Do not assume all errors to be ENOMEM. (ctf_dump_member): Likewise. (ctf_dump_type): Likewise. (ctf_dump_header_strfield): Do not assume all errors to be ENOMEM. (ctf_dump_header_sectfield): Do not assume all errors to be ENOMEM. (ctf_dump_header): Likewise. (ctf_dump_label): likewise. (ctf_dump_objts): likewise. (ctf_dump_funcs): likewise. (ctf_dump_var): likewise. (ctf_dump_str): Likewise.
2019-10-03	libctf: add CU-mapping machinery	Nick Alcock	1	-4/+167
	Once the deduplicator is capable of actually detecting conflicting types with the same name (i.e., not yet) we will place such conflicting types, and types that depend on them, into CTF dictionaries that are the child of the main dictionary we usually emit: currently, this will lead to the .ctf section becoming a CTF archive rather than a single dictionary, with the default-named archive member (_CTF_SECTION, or NULL) being the main shared dictionary with most of the types in it. By default, the sections are named after the compilation unit they come from (complete path and all), with the cuname field in the CTF header providing further evidence of the name without requiring the caller to engage in tiresome parsing. But some callers may not wish the mapping from input CU to output sub-dictionary to be purely CU-based. The machinery here allows this to be freely changed, in two ways: - callers can call ctf_link_add_cu_mapping to specify that a single input compilation unit should have its types placed in some other CU if they conflict: the CU will always be created, even if empty, so the consuming program can depend on its existence. You can map multiple input CUs to one output CU to force all their types to be merged together: if some of those types conflict, the behaviour is currently unspecified (the new deduplicator will specify it). - callers can call ctf_link_set_memb_name_changer to provide a function which is passed every CTF sub-dictionary name in turn (including _CTF_SECTION) and can return a new name, or NULL if no change is desired. The mapping from input to output names should not map two input names to the same output name: if this happens, the two are not merged but will result in an archive with two members with the same name (technically valid, but it's hard to access the second same-named member: you have to do an iteration over archive members). This is used by the kernel's ctfarchive machinery (not yet upstream) to encode CTF under member names like {module name}.ctf rather than .ctf.CU, but it is anticipated that other large projects may wish to have their own storage for CTF outside of .ctf sections and may wish to have new naming schemes that suit their special-purpose consumers. New in v3. v4: check for strdup failure. v5: fix tabdamage. include/ * ctf-api.h (ctf_link_add_cu_mapping): New. (ctf_link_memb_name_changer_f): New. (ctf_link_set_memb_name_changer): New. libctf/ * ctf-impl.h (ctf_file_t) <ctf_link_cu_mappping>: New. <ctf_link_memb_name_changer>: Likewise. <ctf_link_memb_name_changer_arg>: Likewise. * ctf-create.c (ctf_update): Update accordingly. * ctf-open.c (ctf_file_close): Likewise. * ctf-link.c (ctf_create_per_cu): Apply the cu mapping. (ctf_link_add_cu_mapping): New. (ctf_link_set_memb_name_changer): Likewise. (ctf_change_parent_name): New. (ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names allocated by the caller's ctf_link_memb_name_changer. <ndynames>: Likewise. (ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer. (ctf_link_write): Likewise (for _CTF_SECTION only): also call ctf_change_parent_name. Free any resulting names.
2019-10-03	libctf: add linking of the variable section	Nick Alcock	1	-24/+134
	The compiler describes the name and type of all file-scope variables in this section. Merging it at link time requires using the type mapping added in the previous commit to determine the appropriate type for the variable in the output, given its type in the input: we check the shared container first, and if the type doesn't exist there, it must be a conflicted type in the per-CU child, and the variable should go there too. We also put the variable in the per-CU child if a variable with the same name but a different type already exists in the parent: we ignore any such conflict in the child because CTF cannot represent such things, nor can they happen unless a third-party linking program has overridden the mapping of CU to CTF archive member name (using machinery added in a later commit). v3: rewritten using an algorithm that actually works in the case of conflicting names. Some code motion from the next commit. Set the per-CU parent name. v4: check for strdup failure. v5: fix tabdamage. include/ * ctf-api.h (ECTF_INTERNAL): New. libctf/ * ctf-link.c (ctf_create_per_cu): New, refactored out of... (ctf_link_one_type): ... here, with parent-name setting added. (check_variable): New. (ctf_link_one_variable): Likewise. (ctf_link_one_input_archive_member): Call it. * ctf-error.c (_ctf_errlist): Updated with new errors.
2019-10-03	libctf: map from old to corresponding newly-added types in ctf_add_type	Nick Alcock	1	-0/+114
	This lets you call ctf_type_mapping (dest_fp, src_fp, src_type_id) and get told what type ID the corresponding type has in the target ctf_file_t. This works even if it was added by a recursive call, and because it is stored in the target ctf_file_t it works even if we had to add one type to multiple ctf_file_t's as part of conflicting type handling. We empty out this mapping after every archive is linked: because it maps input to output fps, and we only visit each input fp once, its contents are rendered entirely useless every time the source fp changes. v3: add several missing mapping additions. Add ctf_dynhash_empty, and empty after every input archive. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_file_t): New field ctf_link_type_mapping. (struct ctf_link_type_mapping_key): New. (ctf_hash_type_mapping_key): Likewise. (ctf_hash_eq_type_mapping_key): Likewise. (ctf_add_type_mapping): Likewise. (ctf_type_mapping): Likewise. (ctf_dynhash_empty): Likewise. * ctf-open.c (ctf_file_close): Update accordingly. * ctf-create.c (ctf_update): Likewise. (ctf_add_type): Populate the mapping. * ctf-hash.c (ctf_hash_type_mapping_key): Hash a type mapping key. (ctf_hash_eq_type_mapping_key): Check the key for equality. (ctf_dynhash_insert): Fix comment typo. (ctf_dynhash_empty): New. * ctf-link.c (ctf_add_type_mapping): New. (ctf_type_mapping): Likewise. (empty_link_type_mapping): New. (ctf_link_one_input_archive): Call it.
2019-10-03	libctf: add the ctf_link machinery	Nick Alcock	1	-0/+528
	This is the start of work on the core of the linking mechanism for CTF sections. This commit handles the type and string sections. The linker calls these functions in sequence: ctf_link_add_ctf: to add each CTF section in the input in turn to a newly-created ctf_file_t (which will appear in the output, and which itself will become the shared parent that contains types that all TUs have in common (in all link modes) and all types that do not have conflicting definitions between types (by default). Input files that are themselves products of ld -r are supported, though this is not heavily tested yet. ctf_link: called once all input files are added to merge the types in all the input containers into the output container, eliminating duplicates. ctf_link_add_strtab: called once the ELF string table is finalized and all its offsets are known, this calls a callback provided by the linker which returns the string content and offset of every string in the ELF strtab in turn: all these strings which appear in the input CTF strtab are eliminated from it in favour of the ELF strtab: equally, any strings that only appear in the input strtab will reappear in the internal CTF strtab of the output. ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab is finalized, this calls a callback provided by the linker which returns information on every symbol in turn as a ctf_link_sym_t. This is then used to shuffle the function info and data object sections in the CTF section into symbol table order, eliminating the index sections which map those sections to symbol names before that point. Currently just returns ECTF_NOTYET. ctf_link_write: Returns a buffer containing either a serialized ctf_file_t (if there are no types with conflicting definitions in the object files in the link) or a ctf_archive_t containing a large ctf_file_t (the common types) and a bunch of small ones named after individual CUs in which conflicting types are found (containing the conflicting types, and all types that reference them). A threshold size above which compression takes place is passed as one parameter. (Currently, only gzip compression is supported, but I hope to add lzma as well.) Lifetime rules for this are simple: don't close the input CTF files until you've called ctf_link for the last time. We do not assume that symbols or strings passed in by the callback outlast the call to ctf_link_add_strtab or ctf_link_shuffle_syms. Right now, the duplicate elimination mechanism is the one already present as part of the ctf_add_type function, and is not particularly good: it misses numerous actual duplicates, and the conflicting-types detection hardly ever reports that types conflict, even when they do (one of them just tends to get silently dropped): it is also very slow. This will all be fixed in the next few weeks, but the fix hardly touches any of this code, and the linker does work without it, just not as well as it otherwise might. (And when no CTF section is present, there is no effect on performance, of course. So only people using a trunk GCC with not-yet-committed patches will even notice. By the time it gets upstream, things should be better.) v3: Fix error handling. v4: check for strdup failure. v5: fix tabdamage. include/ * ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the libctf linking machinery. (CTF_LINK_SHARE_UNCONFLICTED): New. (CTF_LINK_SHARE_DUPLICATED): New. (ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED. (ECTF_NOTYET): New, a 'not yet implemented' message. (ctf_link_add_ctf): New, add an input file's CTF to the link. (ctf_link): New, merge the type and string sections. (ctf_link_strtab_string_f): New, callback for feeding strtab info. (ctf_link_iter_symbol_f): New, callback for feeding symtab info. (ctf_link_add_strtab): New, tell the CTF linker about the ELF strtab's strings. (ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its symbols into symtab order. (ctf_link_write): New, ask the CTF linker to write the CTF out. libctf/ * ctf-link.c: New file, linking of the string and type sections. * Makefile.am (libctf_a_SOURCES): Add it. * Makefile.in: Regenerate. * ctf-impl.h (ctf_file_t): New fields ctf_link_inputs, ctf_link_outputs. * ctf-create.c (ctf_update): Update accordingly. * ctf-open.c (ctf_file_close): Likewise. * ctf-error.c (_ctf_errlist): Updated with new errors.