aboutsummaryrefslogtreecommitdiff
path: root/libctf
AgeCommit message (Collapse)AuthorFilesLines
2021-09-27libctf, lookup: fix bounds of pptrtab lookupNick Alcock4-2/+78
An off-by-one bug in the check for pptrtab lookup meant that we could access the pptrtab past its bounds (*well* past its bounds), particularly if we called ctf_lookup_by_name in a child dict with "*foo" where "foo" is a type that exists in the parent but not the child and no previous lookups by name have been carried out. (Note that "*foo" is not even a valid thing to call ctf_lookup_by_name with: foo * is. Nonetheless, users sometimes do call ctf_lookup_by_name with invalid content, and it should return ECTF_NOTYPE, not crash.) ctf_pptrtab_len, as its name suggests (and as other tests of it in ctf-lookup.c confirm), is one higher than the maximum valid permissible index, so the comparison is wrong. (Test added, which should fail pretty reliably in the presence of this bug on any machine with 4KiB pages.) libctf/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> * ctf-lookup.c (ctf_lookup_by_name_internal): Fix pptrtab bounds. * testsuite/libctf-writable/pptrtab-writable-page-deep-lookup.*: New test.
2021-09-27libctf, testsuite: fix various warnings in testsNick Alcock11-18/+28
These warnings are all off by default, but if they do fire you get spurious ERRORs when running make check-libctf. libctf/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> * testsuite/libctf-lookup/enum-symbol.c: Remove unused label. * testsuite/libctf-lookup/conflicting-type-syms.c: Remove unused variables. * testsuite/libctf-regression/pptrtab.c: Likewise. * testsuite/libctf-regression/type-add-unnamed-struct.c: Likewise. * testsuite/libctf-writable/pptrtab.c: Likewise. * testsuite/libctf-writable/reserialize-strtab-corruption.c: Likewise. * testsuite/libctf-regression/nonstatic-var-section-ld-r.c: Fix format string. * testsuite/libctf-regression/nonstatic-var-section-ld.c: Likewise. * testsuite/libctf-regression/nonstatic-var-section-ld.lk: Adjust. * testsuite/libctf-writable/symtypetab-nonlinker-writeout.c: Fix initializer.
2021-09-27libctf: fix handling of CTF symtypetab sections emitted by older GCCNick Alcock2-3/+11
Older (pre-upstreaming) GCC emits a function symtypetab section of a format never read by any extant libctf. We can detect such CTF dicts by the lack of the CTF_F_NEWFUNCINFO flag in their header, and we do so when reading in the symtypetab section -- but if the set of symbols with types is sufficiently sparse, even an older GCC will emit a function index section. In NEWFUNCINFO-capable compilers, this section will always be the exact same length as the corresponding function section (each is an array of uint32_t, associated 1:1 with each other). But this is not true for the older compiler, for which the sections are different lengths. We check to see if the function symtypetab section and its index are the same length, but we fail to skip this check when this is not a NEWFUNCINFO dict, and emit a spurious corruption error for a CTF dict we could have perfectly well opened and used. Fix trivial: check the flag (and fix the terrible grammar of the error message at the same time). libctf/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> * ctf-open.c (ctf_bufopen_internal): Don't complain about corrupt function index symtypetab sections if this is an old-format function symtypetab section (which should be ignored in any case). Fix bad grammar.
2021-09-27configure: regenerate in all projects that use libtool.m4Nick Alcock3-50/+118
(including sim/, which has no changelog.) bfd/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> * configure: Regenerate. binutils/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> * configure: Regenerate. gas/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> * configure: Regenerate. gprof/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> * configure: Regenerate. ld/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> * configure: Regenerate. libctf/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> * configure: Regenerate. * Makefile.in: Regenerate. opcodes/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> * configure: Regenerate. zlib/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> * configure: Regenerate.
2021-09-27libctf: try several possibilities for linker versioning flagsNick Alcock4-11/+62
Checking for linker versioning by just grepping ld --help output for mentions of --version-script is inadequate now that Solaris 11.4 implements a --version-script with different semantics. Try linking a test program with a small wildcard-using version script with each supported set of flags in turn, to make sure that linker versioning is not only advertised but actually works. The Solaris "GNU-compatible" linker versioning is not quite GNU-compatible enough, but we can work around the differences by generating a new version script that removes the comments from the original (Solaris ld requires #-style comments), and making another version script for libctf-nonbfd in particular which doesn't mention any of the symbols that appear in libctf.la, to avoid Solaris ld introducing corresponding new NOTYPE symbols to match the version script. libctf/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> PR libctf/27967 * configure.ac (VERSION_FLAGS): Replace with... (ac_cv_libctf_version_script): ... this multiple test. (VERSION_FLAGS_NOBFD): Substitute this too. * Makefile.am (libctf_nobfd_la_LDFLAGS): Use it. Split out... (libctf_ldflags_nover): ... non-versioning flags here. (libctf_la_LDFLAGS): Use it. * libctf.ver: Give every symbol not in libctf-nobfd a comment on the same line noting as much.
2021-09-27libctf: link against libiberty before linking in libbfd or libctf-nobfdNick Alcock3-2/+18
This ensures that the CTF_LIBADD, which always contains at least this when doing a shared link: -L`pwd`/../libiberty/pic -liberty appears in the link line before any requirements pulled in by libbfd.la, which include -liberty but because it is install-time do not include the -L`pwd`/../libiberty/pic portion (in an indirect dep like this, the path comes from the libbfd.la file, and is an install path). libiberty also appears after libbfd in the link line by virtue of libctf-nobfd.la, because libctf-nobfd has to follow libbfd in the link line, and that needs symbols from libiberty too. Without this, an installed liberty might well be pulled in by libbfd, and if --enable-install-libiberty is not specified this libiberty might be completely incompatible with what is being installed and break either or boht of libbfd and libctf. (The specific problem observed here is that bsearch_r was not present, but other problems might easily be observed in future too.) Because ld links against libctf, this has a tendency to break the system linker at install time too, if installing with --prefix=/usr. That's quite unpleasant to recover from. libctf/ChangeLog 2021-09-27 Nick Alcock <nick.alcock@oracle.com> PR libctf/27360 * Makefile.am (libctf_la_LIBADD): Link against libiberty before pulling in libbfd.la or pulling in libctf-nobfd.la. * Makefile.in: Regenerate.
2021-09-03CC_FOR_TARGET et alAlan Modra4-15/+17
The top level Makefile, the ld Makefile and others, define CC_FOR_TARGET to be a compiler for the binutils target machine. This is the compiler that should be used for almost all tests with C source. There are _FOR_TARGET versions of CFLAGS, CXX, and CXXFLAGS too. This was all supposed to work with the testsuite .exp files using CC for the target compiler, and CC_FOR_HOST for the host compiler, with the makefiles passing CC=$CC_FOR_TARGET and CC_FOR_HOST=$CC to the runtest invocation. One exception to the rule of using CC_FOR_TARGET is the native-only ld bootstrap test, which uses the newly built ld to link a copy of itself. Since the files being linked were created with the host compiler, the boostrap test should use CC and CFLAGS, in case some host compiler option provides needed libraries automatically. However, bootstrap.exp used CC where it should have used CC_FOR_HOST. I set about fixing that problem, then decided that playing games in the makefiles with CC was a bad idea. Not only is it confusing, but other dejagnu code knows about CC_FOR_TARGET. See dejagnu/target.exp. So this patch gets rid of the makefile variable renaming and changes all the .exp files to use the correct _FOR_TARGET variables. CC_FOR_HOST and CFLAGS_FOR_HOST disappear. A followup patch will correct bootstrap.exp to use CFLAGS, and a number of other things I noticed. binutils/ * testsuite/lib/binutils-common.exp (run_dump_test): Use CC_FOR_TARGET and CFLAGS_FOR_TARGET rather than CC and CFLAGS. ld/ * Makefile.am (check-DEJAGNU): Don't set CC to CC_FOR_TARGET and similar. Pass variables with unchanged names. Don't set CC_FOR_HOST or CFLAGS_FOR_HOST. * Makefile.in: Regenerate. * testsuite/config/default.exp: Update default CC and similar. (compiler_supports, plug_opt): Use CC_FOR_TARGET. * testsuite/ld-cdtest/cdtest.exp: Replace all uses of CC with CC_FOR_TARGET, and similarly for CFLAGS, CXX and CXXFLAGS. * testsuite/ld-auto-import/auto-import.exp: Likewise. * testsuite/ld-cygwin/exe-export.exp: Likewise. * testsuite/ld-elf/dwarf.exp: Likewise. * testsuite/ld-elf/indirect.exp: Likewise. * testsuite/ld-elf/shared.exp: Likewise. * testsuite/ld-elfcomm/elfcomm.exp: Likewise. * testsuite/ld-elfvers/vers.exp: Likewise. * testsuite/ld-elfvsb/elfvsb.exp: Likewise. * testsuite/ld-elfweak/elfweak.exp: Likewise. * testsuite/ld-gc/gc.exp: Likewise. * testsuite/ld-ifunc/ifunc.exp: Likewise. * testsuite/ld-mn10300/mn10300.exp: Likewise. * testsuite/ld-pe/pe-compile.exp: Likewise. * testsuite/ld-pe/pe-run.exp: Likewise. * testsuite/ld-pe/pe-run2.exp: Likewise. * testsuite/ld-pie/pie.exp: Likewise. * testsuite/ld-plugin/lto.exp: Likewise. * testsuite/ld-plugin/plugin.exp: Likewise. * testsuite/ld-scripts/crossref.exp: Likewise. * testsuite/ld-selective/selective.exp: Likewise. * testsuite/ld-sh/sh.exp: Likewise. * testsuite/ld-shared/shared.exp: Likewise. * testsuite/ld-srec/srec.exp: Likewise. * testsuite/ld-undefined/undefined.exp: Likewise. * testsuite/ld-unique/unique.exp: Likewise. * testsuite/ld-x86-64/tls.exp: Likewise. * testsuite/lib/ld-lib.exp: Likewise. libctf/ * Makefile.am (check-DEJAGNU): Don't set CC to CC_FOR_TARGET. Pass CC and CC_FOR_TARGET. Don't set CC_FOR_HOST. * Makefile.in: Regenerate. * testsuite/config/default.exp: Update default CC and similar. * testsuite/lib/ctf-lib.exp (run_native_host_cmd): Use CC rather than CC_FOR_HOST. (run_lookup_test): Use CC_FOR_TARGET and CFLAGS_FOR_TARGET.
2021-09-03ubsan: libctf: applying zero offset to null pointerAlan Modra1-1/+1
* ctf-open.c (init_symtab): Avoid ubsan error.
2021-09-03haiku tidyAlan Modra1-1/+1
--enable-maintainer-mode showed a number of files needing to be regenerated, and in the case of ld/Makefile.in that the file was regenerated by hand. Nothing to see here really. ld/ * Makefile.am (ALL_64_EMULATION_SOURCES): Sort haiku entry. * Makefile.in: Regenerate. * po/BLD-POTFILES.in: Regenerate. libctf/ * configure: Regenerate. zlib/ * configure: Regenerate.
2021-07-03Add markers for 2.37 branchNick Clifton1-0/+4
2021-05-09Use htab_eq_string in libctfAlan Modra4-18/+16
* ctf-impl.h (ctf_dynset_eq_string): Don't declare. * ctf-hash.c (ctf_dynset_eq_string): Delete function. * ctf-dedup.c (make_set_element): Use htab_eq_string. (ctf_dedup_atoms_init, ADD_CITER, ctf_dedup_init): Likewise. (ctf_dedup_conflictify_unshared): Likewise. (ctf_dedup_walk_output_mapping): Likewise.
2021-05-06libctf, ld: fix test results for upstream GCCNick Alcock3-3/+8
The tests currently in binutils are aimed at the original GCC-based implementation of CTF, which emitted CTF directly from GCC's internal representation. The approach now under review emits CTF from DWARF, with an eye to eventually doing this for all non-DWARF debuginfo-like formats GCC supports. It also uses a different flag to enable CTF emission (-gctf rather than -gt). Adjust the testsuite accordingly. Given that the ld testsuite results are dependent on type ordering, which we do not guarantee at all, it's amazing how little changes. We see a few type ordering differences, slices change because the old GCC was buggy (slices were emitted "backwards", from the wrong end of the machine word) and its expected results were wrong, and GCC now emits the underlying integral type for enumerated types, though CTF has no way to record this yet (coming in v4). GCC also now emits even hidden symbols into the symtab (and thus symtypetab), so one symtypetab test changes its expected results slightly to compensate. Also add tests for the CTF_K_UNKNOWN nonrepresentable type: this couldn't be done before now since the only GCC that emits CTF_K_UNKNOWN for nonrepresentable types is the new one. ld/ChangeLog 2021-05-06 Nick Alcock <nick.alcock@oracle.com> * testsuite/ld-ctf/ctf.exp: Use -gctf, not -gt. * testsuite/lib/ld-lib.exp: Likewise. * testsuite/ld-ctf/nonrepresentable-1.c: New test for nonrepresentable types. * testsuite/ld-ctf/nonrepresentable-2.c: Likewise. * testsuite/ld-ctf/nonrepresentable.d: Likewise. * testsuite/ld-ctf/array.d: Larger type section. * testsuite/ld-ctf/data-func-conflicted.d: Likewise. * testsuite/ld-ctf/enums.d: Likewise. * testsuite/ld-ctf/conflicting-enums.d: Don't compare types. * testsuite/ld-ctf/cross-tu-cyclic-conflicting.d: Changed type order. * testsuite/ld-ctf/cross-tu-noncyclic.d: Likewise. * testsuite/ld-ctf/slice.d: Adjust for improved slice emission. libctf/ChangeLog 2021-05-06 Nick Alcock <nick.alcock@oracle.com> * testsuite/lib/ctf-lib.exp: Use -gctf, not -gt. * testsuite/libctf-regression/nonstatic-var-section-ld-r.lk: Hidden symbols now get into the symtypetab anyway.
2021-05-06libctf, include: support an alternative encoding for nonrepresentable typesNick Alcock7-10/+69
Before now, types that could not be encoded in CTF were represented as references to type ID 0, which does not itself appear in the dictionary. This choice is annoying in several ways, principally that it forces generators and consumers of CTF to grow special cases for types that are referenced in valid dicts but don't appear. Allow an alternative representation (which will become the only representation in format v4) whereby nonrepresentable types are encoded as actual types with kind CTF_K_UNKNOWN (an already-existing kind theoretically but not in practice used for padding, with value 0). This is backward-compatible, because CTF_K_UNKNOWN was not used anywhere before now: it was used in old-format function symtypetabs, but these were never emitted by any compiler and the code to handle them in libctf likely never worked and was removed last year, in favour of new-format symtypetabs that contain only type IDs, not type kinds. In order to link this type, we need an API addition to let us add types of unknown kind to the dict: we let them optionally have names so that GCC can emit many different unknown types and those types with identical names will be deduplicated together. There are also small tweaks to the deduplicator to actually dedup such types, to let opening of dicts with unknown types with names work, to return the ECTF_NONREPRESENTABLE error on resolution of such types (like ID 0), and to print their names as something useful but not a valid C identifier, mostly for the sake of the dumper. Tests added in the next commit. include/ChangeLog 2021-05-06 Nick Alcock <nick.alcock@oracle.com> * ctf.h (CTF_K_UNKNOWN): Document that it can be used for nonrepresentable types, not just padding. * ctf-api.h (ctf_add_unknown): New. libctf/ChangeLog 2021-05-06 Nick Alcock <nick.alcock@oracle.com> * ctf-open.c (init_types): Unknown types may have names. * ctf-types.c (ctf_type_resolve): CTF_K_UNKNOWN is as non-representable as type ID 0. (ctf_type_aname): Print unknown types. * ctf-dedup.c (ctf_dedup_hash_type): Do not early-exit for CTF_K_UNKNOWN types: they have real hash values now. (ctf_dedup_rwalk_one_output_mapping): Treat CTF_K_UNKNOWN types like other types with no referents: call the callback and do not skip them. (ctf_dedup_emit_type): Emit via... * ctf-create.c (ctf_add_unknown): ... this new function. * libctf.ver (LIBCTF_1.2): Add it.
2021-03-25libctf: fix ELF-in-BFD checks in the presence of ASANNick Alcock3-13/+18
The address sanitizer contains a redirector that captures dlopen calls, so checks for dlopen with AC_SEARCH_LIBS will always conclude that dlopen is present when the sanitizer is on. This means it won't add -ldl to LIBS even if needed, and the immediately-following attempt to actually link with -lbfd will fail because libbfd also needs dlsym, which ASAN does *not* contain a redirector for. If we check for dlsym instead of dlopen, the check works whether ASAN is on or off. (bfd uses both in close proximity: if it needs one, it will always need the other.) libctf/ChangeLog 2021-03-25 Nick Alcock <nick.alcock@oracle.com> * configure.ac: Check for dlsym, not dlopen. * configure: Regenerate.
2021-03-25libctf: fix memory leak in a testNick Alcock2-0/+6
Harmless, but causes noise that makes it harder to spot other leaks. libctf/ChangeLog 2021-03-25 Nick Alcock <nick.alcock@oracle.com> * testsuite/libctf-writable/symtypetab-nonlinker-writeout.c: Don't leak buf.
2021-03-25libctf: don't dereference out-of-bounds locations in the qualifier hashtabNick Alcock2-3/+13
isqualifier, which is used by ctf_lookup_by_name to figure out if a given word in a type name is a qualifier, takes the address of a possibly out-of-bounds location before checking its bounds. In any reasonable compiler this will just lead to a harmless address computation that is then discarded if out-of-bounds, but it's still undefined behaviour and the sanitizer rightly complains. libctf/ChangeLog 2021-03-25 Nick Alcock <nick.alcock@oracle.com> PR libctf/27628 * ctf-lookup.c (isqualifier): Don't dereference out-of-bounds qhash values.
2021-03-25libctf: make ctf_bfdopen_ctfsect a debugger entry pointNick Alcock2-0/+6
This makes it possible to use LIBCTF_DEBUG to debug things that happen before the ctf_bfdopen_internal call that ctf_bfdopen_ctfsect eventually thunks down to (symtab/strtab lookup, archive opening, etc). This is not important for ctf_open callers, since ctf_fdopen already calls libctf_init_debug, but ctf_bfdopen_ctfsect is a public entry point that can be called directly (e.g. objdump and readelf both do so). libctf/ChangeLog 2021-03-25 Nick Alcock <nick.alcock@oracle.com> * ctf-open-bfd.c (ctf_bfdopen_ctfsect): Initialize debugging.
2021-03-25libctf, serialize: functions with no args have a NULL dtd_vlenNick Alcock2-1/+9
Every place that accesses a function's dtd_vlen accesses it only if the number of args is nonzero, except the serializer, which always tries to memcpy it. The number of bytes it memcpys in this case is zero, but it is still undefined behaviour to copy zero bytes from a null pointer. So check for this case explicitly. libctf/ChangeLog 2021-03-25 Nick Alcock <nick.alcock@oracle.com> PR libctf/27628 * ctf-serialize.c (ctf_emit_type_sect): Allow for a NULL vlen in CTF_K_FUNCTION types.
2021-03-25libctf, dump: do not emit size or alignment if it would errorNick Alcock2-5/+12
When we dump normal types, we emit their size and/or alignment: but size and alignment dumping can return errors if the type is part of a chain that terminates in a forward. Emitting 0xffffffff as a size or alignment is unhelpful, so simply skip emitting this info for any type for which size or alignment checks return an error, no matter what the error is. libctf/ChangeLog 2021-03-25 Nick Alcock <nick.alcock@oracle.com> * ctf-dump.c (ctf_dump_format_type): Don't emit size or alignment on error.
2021-03-21Provide an inline startswith function in bfd.hAlan Modra2-0/+5
bfd/ * bfd-in.h (startswith): New inline. (CONST_STRNEQ): Use startswith. * bfd-in2.h: Regenerate. gdbsupport/ * common-utils.h (startswith): Delete version now supplied by bfd.h. libctf/ * ctf-impl.h: Include string.h.
2021-03-18libctf: support encodings for enumsNick Alcock3-2/+17
The previous commit started to error-check the lookup of ctf_type_encoding for the underlying type that is internally done when carrying out a ctf_type_encoding on a slice. Unfortunately, enums have no encoding, so this has historically been returning an error (which is ignored) and then populating the cte_format with uninitialized data. Now the error is not ignored, this is returning an error, which breaks linking of CTF containing bitfields of enumerated type. CTF format v3 does not record the actual underlying type of a enum, but we can mock up something that is not *too* wrong, and that is at any rate better than uninitialized data. ld/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * testsuite/ld-ctf/slice.c: Check slices of enums too. * testsuite/ld-ctf/slice.d: Results adjusted. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-types.c (ctf_type_encoding): Support, after a fashion, for enums. * ctf-dump.c (ctf_dump_format_type): Do not report enums' degenerate encoding.
2021-03-18libctf: a couple of small error-handling fixesNick Alcock3-10/+25
Out-of-memory errors initializing the string atoms table were disregarded (though they would have caused a segfault very shortly afterwards). Errors hashing types during deduplication were only reported if they happened on the output dict, which is almost never the case (most errors are going to be on the dict we're working over, which is going to be one of the inputs). (The error was detected in both cases, but the errno was extracted from the wrong dict.) libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-dedup.c (ctf_dedup_rhash_type): Report errors on the input dict properly. * ctf-open.c (ctf_bufopen_internal): Report errors initializing the atoms table.
2021-03-18libctf: types: unify code dealing with small-vs-large struct membersNick Alcock3-157/+150
This completes the job of unifying what was once three separate code paths full of duplication for every function dealing with querying the properties of struct and union members. The dynamic code path was already removed: this change removes the distinction between small and large members, by adding a helper that copies out members from the vlen, expanding small members into large ones as it does so. This makes it possible to have *more* representations of things like structure members without needing to change the querying functions at all. It also lets us check for buffer overruns more effectively, verifying that we don't accidentally overrun the end of the vlen in either the dynamic or static type case. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_next_t) <ctn_tp>: New. <u.ctn_mp>: Remove. <u.ctn_lmp>: Remove. <u.ctn_vlen>: New. * ctf-types.c (ctf_struct_member): New. (ctf_member_next): Use it, dropping separate large/small code paths. (ctf_type_align): Likewise. (ctf_member_info): Likewise. (ctf_type_rvisit): Likewise.
2021-03-18libctf: eliminate dtd_u, part 5: structs / unionsNick Alcock8-410/+323
Eliminate the dynamic member storage for structs and unions as we have for other dynamic types. This is much like the previous enum elimination, except that structs and unions are the only types for which a full-sized ctf_type_t might be needed. Up to now, this decision has been made in the individual ctf_add_{struct,union}_sized functions and duplicated in ctf_add_member_offset. The vlen machinery lets us simplify this, always allocating a ctf_lmember_t and setting the dtd_data's ctt_size to CTF_LSIZE_SENT: we figure out whether this is really justified and (almost always) repack things down into a ctf_stype_t at ctf_serialize time. This allows us to eliminate the dynamic member paths from the iterators and query functions in ctf-types.c in favour of always using the large-structure vlen stuff for dynamic types (the diff is ugly but that's just because of the volume of reindentation this calls for). This also means the large-structure vlen stuff gets more heavily tested, which is nice because it was an almost totally unused code path before now (it only kicked in for structures of size >4GiB, and how often do you see those?) The only extra complexity here is ctf_add_type. Back in the days of the nondeduplicating linker this was called a ridiculous number of times for countless identical copies of structures: eschewing the repeated lookups of the dtd in ctf_add_member_offset and adding the members directly saved an amazing amount of time. Now the nondeduplicating linker is gone, this is extreme overoptimization: we can rip out the direct addition and use ctf_member_next and ctf_add_member_offset, just like ctf_dedup_emit does. We augment a ctf_add_type test to try adding a self-referential struct, the only thing the ctf_add_type part of this change really perturbs. This completes the elimination of dtd_u. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dtdef_t) <dtu_members>: Remove. <dtd_u>: Likewise. (ctf_dmdef_t): Remove. (struct ctf_next) <u.ctn_dmd>: Remove. * ctf-create.c (INITIAL_VLEN): New, more-or-less arbitrary initial vlen size. (ctf_add_enum): Use it. (ctf_dtd_delete): Do not free the (removed) dmd; remove string refs from the vlen on struct deletion. (ctf_add_struct_sized): Populate the vlen: do it by hand if promoting forwards. Always populate the full-size lsizehi/lsizelo members. (ctf_add_union_sized): Likewise. (ctf_add_member_offset): Set up the vlen rather than the dmd. Expand it as needed, repointing string refs via ctf_str_move_pending. Add the member names as pending strings. Always populate the full-size lsizehi/lsizelo members. (membadd): Remove, folding back into... (ctf_add_type_internal): ... here, adding via an ordinary ctf_add_struct_sized and _next iteration rather than doing everything by hand. * ctf-serialize.c (ctf_copy_smembers): Remove this... (ctf_copy_lmembers): ... and this... (ctf_emit_type_sect): ... folding into here. Figure out if a ctf_stype_t is needed here, not in ctf_add_*_sized. (ctf_type_sect_size): Figure out the ctf_stype_t stuff the same way here. * ctf-types.c (ctf_member_next): Remove the dmd path and always use the vlen. Force large-structure usage for dynamic types. (ctf_type_align): Likewise. (ctf_member_info): Likewise. (ctf_type_rvisit): Likewise. * testsuite/libctf-regression/type-add-unnamed-struct-ctf.c: Add a self-referential type to this test. * testsuite/libctf-regression/type-add-unnamed-struct.c: Adjusted accordingly. * testsuite/libctf-regression/type-add-unnamed-struct.lk: Likewise.
2021-03-18libctf: eliminate dtd_u, part 4: enumsNick Alcock8-122/+277
This is the first tricky one, the first complex multi-entry vlen containing strings. To handle this in vlen form, we have to handle pending refs moving around on realloc. We grow vlen regions using a new ctf_grow_vlen function, and iterate through the existing enums every time a grow happens, telling the string machinery the distance between the old and new vlen region and letting it adjust the pending refs accordingly. (This avoids traversing all outstanding refs to find the refs that need adjusting, at the cost of having to traverse one enum: an obvious major performance win.) Addition of enums themselves (and also structs/unions later) is a bit trickier than earlier forms, because the type might be being promoted from a forward, and forwards have no vlen: so we have to spot that and create it if needed. Serialization of enums simplifies down to just telling the string machinery about the string refs; all the enum type-lookup code loses all its dynamic member lookup complexity entirely. A new test is added that iterates over (and gets values of) an enum with enough members to force a round of vlen growth. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dtdef_t) <dtd_vlen_alloc>: New. (ctf_str_move_pending): Declare. * ctf-string.c (ctf_str_add_ref_internal): Fix error return. (ctf_str_move_pending): New. * ctf-create.c (ctf_grow_vlen): New. (ctf_dtd_delete): Zero out the vlen_alloc after free. Free the vlen later: iterate over it and free enum name refs first. (ctf_add_generic): Populate dtd_vlen_alloc from vlen. (ctf_add_enum): populate the vlen; do it by hand if promoting forwards. (ctf_add_enumerator): Set up the vlen rather than the dmd. Expand it as needed, repointing string refs via ctf_str_move_pending. Add the enumerand names as pending strings. * ctf-serialize.c (ctf_copy_emembers): Remove. (ctf_emit_type_sect): Copy the vlen into place and ref the strings. * ctf-types.c (ctf_enum_next): The dynamic portion now uses the same code as the non-dynamic. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. * testsuite/libctf-lookup/enum-many-ctf.c: New test. * testsuite/libctf-lookup/enum-many.lk: New test.
2021-03-18libctf: do not corrupt strings across ctf_serializeNick Alcock8-12/+209
The preceding change revealed a new bug: the string table is sorted for better compression, so repeated serialization with type (or member) additions in the middle can move strings around. But every serialization flushes the set of refs (the memory locations that are automatically updated with a final string offset when the strtab is updated), so if we are not to have string offsets go stale, we must do all ref additions within the serialization code (which walks the complete set of types and symbols anyway). Unfortunately, we were adding one ref in another place: the type name in the dynamic type definitions, which has a ref added to it by ctf_add_generic. So adding a type, serializing (via, say, one of the ctf_write functions), adding another type with a name that sorts earlier, and serializing again will corrupt the name of the first type because it no longer had a ref pointing to its dtd entry's name when its string offset was shifted later in the strtab to mae way for the other type. To ensure that we don't miss strings, we also maintain a set of *pending refs* that will be added later (during serialization), and remove entries from that set when the ref is finally added. We always use ctf_str_add_pending outside ctf-serialize.c, ensure that ctf_serialize adds all strtab offsets as refs (even those in the dtds) on every serialization, and mandate that no refs are live on entry to ctf_serialize and that all pending refs are gone before strtab finalization. (Of necessity ctf_serialize has to traverse all strtab offsets in the dtds in order to serialize them, so adding them as refs at the same time is easy.) (Note that we still can't erase unused atoms when we roll back, though we can erase unused refs: members and enums are still not removed by rollbacks and might reference strings added after the snapshot.) libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-hash.c (ctf_dynset_elements): New. * ctf-impl.h (ctf_dynset_elements): Declare it. (ctf_str_add_pending): Likewise. (ctf_dict_t) <ctf_str_pending_ref>: New, set of refs that must be added during serialization. * ctf-string.c (ctf_str_create_atoms): Initialize it. (CTF_STR_ADD_REF): New flag. (CTF_STR_MAKE_PROVISIONAL): Likewise. (CTF_STR_PENDING_REF): Likewise. (ctf_str_add_ref_internal): Take a flags word rather than int params. Populate, and clear out, ctf_str_pending_ref. (ctf_str_add): Adjust accordingly. (ctf_str_add_external): Likewise. (ctf_str_add_pending): New. (ctf_str_remove_ref): Also remove the potential ref if it is a pending ref. * ctf-serialize.c (ctf_serialize): Prohibit addition of strings with ctf_str_add_ref before serialization. Ensure that the ctf_str_pending_ref set is empty before strtab finalization. (ctf_emit_type_sect): Add a ref to the ctt_name. * ctf-create.c (ctf_add_generic): Add the ctt_name as a pending ref. * testsuite/libctf-writable/reserialize-strtab-corruption.*: New test.
2021-03-18libctf: don't lose track of all valid types upon serializationNick Alcock2-0/+6
One pattern which is rarely done in libctf but which is meant to work is this: ctf_create(); ctf_add_*(); // add stuff ctf_type_*() // look stuff up ctf_write_*(); ctf_add_*(); // should still work ctf_type_*() // so should this ctf_write_*(); // and this i.e., writing out a dict should not break it and you should be able to do everything you could do with it before, including writing it out again. Unfortunately this has been broken for a while because the field which indicates the maximum valid type ID was not preserved across serialization: so type additions after serialization would overwrite types (obviously disastrous) and type lookups would just fail. Fix trivial. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-serialize.c (ctf_serialize): Preserve ctf_typemax across serialization.
2021-03-18libctf: eliminate dtd_u, part 3: functionsNick Alcock5-35/+28
One more member vanishes from the dtd_u, leaving only the member for struct/union/enum members. There's not much to do here, since as of commit afd78bd6f0a30ba5 we use the same representation (type sizes, etc) in the dtu_argv as we will use in the final vlen, with one exception: the vlen has alignment padding, and the dtu_argv did not. Simplify things by adding suitable padding in both cases. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dtdef_t) <dtd_u.dtu_argv>: Remove. * ctf-create.c (ctf_dtd_delete): No longer free it. (ctf_add_function): Use the dtd_vlen, not dtu_argv. Properly align. * ctf-serialize.c (ctf_emit_type_sect): Just copy the dtd_vlen. * ctf-types.c (ctf_func_type_info): Just use the vlen. (ctf_func_type_args): Likewise.
2021-03-18libctf: eliminate dtd_u, part 2: arraysNick Alcock5-16/+27
This is even simpler than ints, floats and slices, with the only extra complication being the need to manually transfer the array parameter in the rarely-used function ctf_set_array. (Arrays are unique in libctf in that they can be modified post facto, not just created and appended to. I'm not sure why they got this exemption, but it's easy to maintain.) libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dtdef_t) <dtd_u.dtu_arr>: Remove. * ctf-create.c (ctf_add_array): Use the dtd_vlen, not dtu_arr. (ctf_set_array): Likewise. * ctf-serialize.c (ctf_emit_type_sect): Just copy the dtd_vlen. * ctf-types.c (ctf_array_info): Just use the vlen.
2021-03-18libctf: eliminate dtd_u, part 1: int/float/sliceNick Alcock6-78/+92
This series eliminates a lot of special-case code to handle dynamic types (types added to writable dicts and not yet serialized). Historically, when such types have variable-length data in their final CTF representations, libctf has always worked by adding such types to a special union (ctf_dtdef_t.dtd_u) in the dynamic type definition structure, then picking the members out of this structure at serialization time and packing them into their final form. This has the advantage that the ctf_add_* code doesn't need to know anything about the final CTF representation, but the significant disadvantage that all code that looks up types in any way needs two code paths, one for dynamic types, one for all others. Historically libctf "handled" this by not supporting most type lookups on dynamic types at all until ctf_update was called to do a complete reserialization of the entire dict (it didn't emit an error, it just emitted wrong results). Since commit 676c3ecbad6e9c4, which eliminated ctf_update in favour of the internal-only ctf_serialize function, all the type-lookup paths grew an extra branch to handle dynamic types. We can eliminate this branch again by dropping the dtd_u stuff and simply writing out the vlen in (close to) its final form at ctf_add_* time: type lookup for types using this approach is then identical for types in writable dicts and types that are in read-only ones, and serialization is also simplified (we just need to write out the vlen we already created). The only complexity lies in type kinds for which multiple vlen representations are valid depending on properties of the type, e.g. structures. But we can start simple, adjusting ints, floats, and slices to work this way, and leaving everything else as is. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dtdef_t) <dtd_u.dtu_enc>: Remove. <dtd_u.dtu_slice>: Likewise. <dtd_vlen>: New. * ctf-create.c (ctf_add_generic): Perhaps allocate it. All callers adjusted. (ctf_dtd_delete): Free it. (ctf_add_slice): Use the dtd_vlen, not dtu_enc. (ctf_add_encoded): Likewise. Assert that this must be an int or float. * ctf-serialize.c (ctf_emit_type_sect): Just copy the dtd_vlen. * ctf-dedup.c (ctf_dedup_rhash_type): Use the dtd_vlen, not dtu_slice. * ctf-types.c (ctf_type_reference): Likewise. (ctf_type_encoding): Remove most dynamic-type-specific code: just get the vlen from the right place. Report failure to look up the underlying type's encoding.
2021-03-18libctf: fix GNU style for do {} whileNick Alcock6-46/+63
It's formatted like this: do { ... } while (...); Not like this: do { ... } while (...); or this: do { ... } while (...); We used both in various places in libctf. Fixing it necessitated some light reindentation. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-archive.c (ctf_archive_next): GNU style fix for do {} while. * ctf-dedup.c (ctf_dedup_rhash_type): Likewise. (ctf_dedup_rwalk_one_output_mapping): Likewise. * ctf-dump.c (ctf_dump_format_type): Likewise. * ctf-lookup.c (ctf_symbol_next): Likewise. * swap.h (swap_thing): Likewise.
2021-03-18libctf: split up ctf_serializeNick Alcock2-317/+424
ctf_serialize and its various pieces may be split out into a separate file now, but ctf_serialize is still far too long and disordered, mixing header initialization, sizing of multiple CTF sections, sorting and emission of multiple CTF sections, strtab construction and ctf_dict_t copying into a single ugly organically-grown mess. Fix the worst of this by migrating all section sizing and emission into separate functions, two per section (or class of section in the case of the symtypetabs). Only the variable section is now sized and emitted directly in ctf_serialize (because it only takes about three lines to do so). The section sizes themselves are still maintained by ctf_serialize so that it can work out the header offsets, but ctf_symtypetab_sect_sizes and ctf_emit_symtypetab_sects share a lot of extra state: migrate that into a shared structure, emit_symtypetab_state_t. (Test results unchanged.) libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-serialize.c: General reshuffling, and... (emit_symtypetab_state_t): New, migrated from local variables in ctf_serialize. (ctf_serialize): Split out most section sizing and emission. (ctf_symtypetab_sect_sizes): New (split out). (ctf_emit_symtypetab_sects): Likewise. (ctf_type_sect_size): Likewise. (ctf_emit_type_sect): Likewise.
2021-03-18libctf: fix comment above ctf_dict_tNick Alcock2-5/+10
It is perfectly possible to have dynamically allocated data owned by a specific dict: you just have to teach ctf_serialize about it. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dict_t): Fix comment.
2021-03-18libctf: split serialization and file writeout into its own fileNick Alcock5-1324/+1391
The code to serialize CTF dicts just gets bigger and bigger as the dictionary's complexity grows: adding symtypetabs almost doubled it on its own. It's long past time to split this out into its own source file, accompanied by the functions that do the actual writeout. This leaves ctf-create.c populated exclusively by functions related to actual writable dict creation (ctf_add_*, ctf_create etc), and leaves both files a much more reasonable size. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-create.c (symtypetab_delete_nonstatic_vars): Move into ctf-serialize.c. (ctf_symtab_skippable): Likewise. (CTF_SYMTYPETAB_EMIT_FUNCTION): Likewise. (CTF_SYMTYPETAB_EMIT_PAD): Likewise. (CTF_SYMTYPETAB_FORCE_INDEXED): Likewise. (symtypetab_density): Likewise. (emit_symtypetab): Likewise. (emit_symtypetab_index): Likewise. (ctf_copy_smembers): Likewise. (ctf_copy_lmembers): Likewise. (ctf_copy_emembers): Likewise. (ctf_sort_var): Likewise. (ctf_serialize): Likewise. (ctf_gzwrite): Likewise. (ctf_compress_write): Likewise. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-serialize.c: New file. * Makefile.am (libctf_nobfd_la_SOURCES): Add it. * Makefile.in: Regenerate.
2021-03-18libctf: fix some tabdamage and move some code aroundNick Alcock5-56/+65
ctf-link.c is unnecessarily confusing because ctf_link_lazy_open is positioned near functions that have nothing to do with opening files. Move it around, and fix some tabdamage that's crept in lately. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-link.c (ctf_link_lazy_open): Move up in the file, to near ctf_link_add_ctf. * ctf-lookup.c (ctf_lookup_symbol_idx): Repair tabdamage. (ctf_lookup_by_sym_or_name): Likewise. * testsuite/libctf-lookup/struct-iteration.c: Likewise. * testsuite/libctf-regression/type-add-unnamed-struct.c: Likewise.
2021-03-02bfd, ld, libctf: skip zero-refcount strings in CTF string reportingNick Alcock3-7/+20
This is a tricky one. BFD, on the linker's behalf, reports symbols to libctf via the ctf_new_symbol and ctf_new_dynsym callbacks, which ultimately call ctf_link_add_linker_symbol. But while this happens after strtab offsets are finalized, it happens before the .dynstr is actually laid out, so we can't iterate over it at this stage and it is not clear what the reported symbols are actually called. So a second callback, examine_strtab, is called after the .dynstr is finalized, which calls ctf_link_add_strtab and ultimately leads to ldelf_ctf_strtab_iter_cb being called back repeatedly until the offsets of every string in the .dynstr is passed to libctf. libctf can then use this to get symbol names out of the input (which usually stores symbol types in the form of a name -> type mapping at this stage) and extract the types of those symbols, feeding them back into their final form as a 1:1 association with the real symtab's STT_OBJ and STT_FUNC symbols (with a few skipped, see ctf_symtab_skippable). This representation is compact, but has one problem: if libctf somehow gets confused about the st_type of a symbol, it'll stick an entry into the function symtypetab when it should put it into the object symtypetab, or vice versa, and *every symbol from that one on* will have the wrong CTF type because it's actually looking up the type for a different symbol. And we have just such a bug. ctf_link_add_strtab was not taking the refcounts of strings into consideration, so even strings that had been eliminated from the strtab by virtue of being in objects eliminated via --as-needed etc were being reported. This is harmful because it can lead to multiple strings with the same apparent offset, and if the last duplicate to be reported relates to an eliminated symbol, we look up the wrong symbol from the input and gets its type wrong: if it's unlucky and the eliminated symbol is also of the wrong st_type, we will end up with a corrupted symtypetab. Thankfully the wrong-st_type case is already diagnosed by a this-can-never-happen paranoid warning: CTF warning: Symbol 61a added to CTF as a function but is of type 1 or the converse * CTF warning: Symbol a3 added to CTF as a data object but is of type 2 so at least we can tell when the corruption has spread to more than one symbol's type. Skipping zero-refcounted strings is easy: teach _bfd_elf_strtab_str to skip them, and ldelf_ctf_strtab_iter_cb to loop over skipped strings until it falls off the end or finds one that isn't skipped. bfd/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * elf-strtab.c (_bfd_elf_strtab_str): Skip strings with zero refcount. ld/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * ldelfgen.c (ldelf_ctf_strtab_iter_cb): Skip zero-refcount strings. libctf/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * ctf-create.c (symtypetab_density): Report the symbol name as well as index in the name != object error; note the likely consequences. * ctf-link.c (ctf_link_shuffle_syms): Report the symbol index as well as name.
2021-03-02libctf: free ctf_dynsyms properlyNick Alcock2-1/+5
In the "no symbols" case (commonplace for executables), we were freeing the ctf_dynsyms using free(), instead of ctf_dynhash_destroy(), leaking a little memory. (This is harmless in the common case of ld usage, but libctf might be used by persistent processes too.) libctf/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * ctf-link.c (ctf_link_shuffle_syms): Free ctf_dynsyms properly.
2021-03-02libctf: fix signed/unsigned comparison confusionNick Alcock2-2/+6
Comparing an encoding's cte_bits to a ctf_type_size needs a cast: one is a uint32_t and the other is an ssize_t. libctf/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * ctf-dump.c (ctf_dump_format_type): Fix signed/unsigned confusion.
2021-03-02libctf: minor error-handling fixesNick Alcock3-8/+32
A transient bug in the preceding change (fixed before commit) exposed a new failure, of ld/testsuite/ld-ctf/diag-parname.d. This attempts to ensure that if we link a dict with child type IDs but no attached parent, we get a suitable ECTF_NOPARENT error. This was happening before this commit, but only by chance, because ctf_variable_iter and ctf_variable_next check to see if the dict they're passed is a child dict without an associated parent. We forgot error-checking on the ctf_variable_next call, and as a result this was concealed -- and looking for the problem exposed a new bug. If any of the lookups beneath ctf_dedup_hash_type fail, the CTF link does *not* fail, but acts quite bizarrely, skipping the type but emitting an error to the CTF error/warning log -- so the linker will report an error, emit a partial CTF dict missing some types, and exit with exitcode 0 as if nothing went wrong. Since ctf_dedup_hash_type is never expected to fail in normal operation, this is surely wrong: failures at emission time do not emit partial CTF dicts, so failures at hashing time should not either. So propagate the error back up. Also fix a couple of smaller bugs where we fail to properly free things and/or propagate error codes on various rare link-time errors and out-of-memory conditions. libctf/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * ctf-dedup.c (ctf_dedup): Pass on errors from ctf_dedup_hash_type. Call ctf_dedup_fini properly on other errors. (ctf_dedup_emit_type): Set the errno on dynhash insertion failure. * ctf-link.c (ctf_link_deduplicating_per_cu): Close outputs beyond output 0 when asserting because >1 output is found. (ctf_link_deduplicating): Likewise, when asserting because the shared output is not the same as the passed-in fp.
2021-03-02libctf: add a deduplicator-specific type mapping tableNick Alcock5-328/+344
When CTF linking is done, the linker has to track the association between types in the inputs and types in the outputs. The deduplicator does this via the cd_output_emission_hashes, which maps from hashes of types (valid in both the input and output) to the IDs of types in the specific dict in which the cd_emission_hashes is held. However, the nondeduplicating linker and ctf_add_type used a different mechanism, a dedicated hashtab stored in the ctf_link_type_mapping, populated via ctf_add_type_mapping and queried via the ctf_type_mapping function. To allow the same functions to be used for variable and symbol population in both the deduplicating and nondeduplicating linker, the deduplicator carefully transferred all its input->output mappings into this hashtab before returning. This is *expensive*. The number of entries in this hashtab scales as the number of input types, and unlike the hashing machinery the type mapping machinery (the only other thing which scales that way) has not been much optimized. Now the nondeduplicating linker is gone, we can throw this out, move the existing type mapping machinery to ctf-create.c and dedicate it to ctf_add_type alone, and add a new function ctf_dedup_type_mapping which uses the deduplicator's built-in knowledge of type mappings directly, without requiring an expensive repopulation phase. This speeds up a test link of nouveau.ko (a good worst-case candidate with a lot of types in each of a lot of input files) from 9.11s to 7.15s in my testing, a speedup of over 20%. libctf/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used by the nondeduplicating linker. (ctf_add_type_mapping): Removed, now static. (ctf_type_mapping): Likewise. (ctf_dedup_type_mapping): New. (ctf_dedup_t) <cd_input_nums>: New. * ctf-dedup.c (ctf_dedup_init): Populate it. (ctf_dedup_fini): Free it again. Emphasise that this has to be the last thing called. (ctf_dedup): Populate it. (ctf_dedup_populate_type_mapping): Removed. (ctf_dedup_populate_type_mappings): Likewise. (ctf_dedup_emit): No longer call it. No longer call ctf_dedup_fini either. (ctf_dedup_type_mapping): New. * ctf-link.c (ctf_unnamed_cuname): New. (ctf_create_per_cu): Arguments must be non-null now. (ctf_in_member_cb_arg): Removed. (ctf_link): No longer populate it. No longer discard the mapping table. (ctf_link_deduplicating_one_symtypetab): Use ctf_dedup_type_mapping, not ctf_type_mapping. Use ctf_unnamed_cuname. (ctf_link_one_variable): Likewise. Pass in args individually: no longer a ctf_variable_iter callback. (empty_link_type_mapping): Removed. (ctf_link_deduplicating_variables): Use ctf_variable_next, not ctf_variable_iter. No longer pack arguments to ctf_link_one_variable into a struct. (ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once all link phases are done. (ctf_link_deduplicating): Likewise. (ctf_link_intern_extern_string): Improve comment. (ctf_add_type_mapping): Migrate... (ctf_type_mapping): ... these functions... * ctf-create.c (ctf_add_type_mapping): ... here... (ctf_type_mapping): ... and make static, for the sole use of ctf_add_type.
2021-03-02libctf: remove reference to "unconflicted link mode".Nick Alcock2-3/+8
There is no such thing, and the comment makes no sense, and doesn't match what the code is doing. We always want to put variables in the same dicts as the types they relate to if at all possible. libctf/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * ctf-link.c (ctf_link_one_variable): Remove reference to "unconflicted link mode".
2021-03-02libctf, include: remove the nondeduplicating CTF linkerNick Alcock2-230/+39
The nondeduplicating CTF linker was kept around when the deduplicating one was added so that people had something to fall back to in case the deduplicating linker turned out to be buggy. It's now much more stable than the nondeduplicating linker, in addition to much faster, using much less memory and producing much better output. In addition, while libctf has a linker flag to invoke the nondeduplicating linker, ld does not expose it: the only way to turn it on within ld is an intentionally- undocumented environment variable. So we can remove it without any ABI or user-visibility concerns (the only thing we leave around is the CTF_LINK_NONDEDUP flag, which can easily be interpreted as "deduplicate less", though right now it does nothing). This lets us remove a lot of complexity associated with tracking filenames and CU names separately (something the deduplcating linker never bothered with, since the cunames are always reliable and ld never hands us useful filenames anyway) The biggest lacuna left behind is the ctf_type_mapping machinery, which slows down deduplicating links quite a lot. We can't just ditch it because ctf_add_type uses it: removing the slowdown from the deduplicating linker is a job for another commit. include/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * ctf-api.h (CTF_LINK_SHARE_DUPLICATED): Note that this might merely change how much deduplication is done. libctf/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * ctf-link.c (ctf_create_per_cu): Drop FILENAME now that it is always identical to CUNAME. (ctf_link_deduplicating_one_symtypetab): Adjust. (ctf_link_one_type): Remove. (ctf_link_one_input_archive_member): Likewise. (ctf_link_close_one_input_archive): Likewise. (ctf_link_one_input_archive): Likewise. (ctf_link): No longer call it. Drop CTF_LINK_NONDEDUP path. Improve header comment a bit (dicts, not files). Adjust ctf_create_per_cu call. (ctf_link_deduplicating_variables): Simplify. (ctf_link_in_member_cb_arg_t) <cu_name>: Remove. <in_input_cu_file>: Likewise. <in_fp_parent>: Likewise. <done_parent>: Likewise. (ctf_link_one_variable): Turn uses of in_file_name to in_cuname.
2021-03-02libctf: fix ChangeLog dateNick Alcock1-1/+1
I pushed this change without fixing up the date by mistake.
2021-03-02libctf: reimplement many _iter iterators in terms of _nextNick Alcock3-124/+69
Ever since the generator-style _next iterators were introduced, there have been separate implementations of the functional-style _iter iterators that do the same thing as _next. This is annoying and adds more dependencies on the internal guts of the file format. Rip them all out and replace them with the corresponding _next iterators. Only ctf_archive_raw_iter and ctf_label_iter survive, the former because there is no access to the raw binary data of archives via any _next iterator, and the latter because ctf_label_next hasn't been implemented (because labels are currently not used for anything). Tested by reverting the change (already applied) that reimplemented ctf_member_iter in terms of ctf_member_next, then verifying that the _iter and _next iterators produced the same results for every iterable entity within a large type archive. libctf/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * ctf-types.c (ctf_member_iter): Move 'rc' to an inner scope. (ctf_enum_iter): Reimplement in terms of ctf_enum_next. (ctf_type_iter): Reimplement in terms of ctf_type_next. (ctf_type_iter_all): Likewise. (ctf_variable_iter): Reimplement in terms of ctf_variable_next. * ctf-archive.c (ctf_archive_iter_internal): Remove. (ctf_archive_iter): Reimplement in terms of ctf_archive_next.
2021-03-02libctf: ctf_archive_next should set the parent name consistentlyNick Alcock2-0/+7
The top level of CTF containers is a "CTF archive", which contains a collection of named members (each a CTF dictionary). In the serialized file format, this is optional and skipped if the archive would have only one member, as when no ambiguous types are present: so it is commonplace to have a simple ctf_dict_t written out, with no archive container wrapped around it. But, unlike ctf_archive_iter, ctf_archive_next didn't quite handle this case right. It should set the name of this fake "member" to _CTF_SECTION, i.e. ".ctf", but it was failing to do so, so callers got an unintialized variable back instead and were understandably confused. So set the name properly. libctf/ChangeLog 2021-03-02 Nick Alcock <nick.alcock@oracle.com> * ctf-archive.c (ctf_archive_next): Set the name of parents in single-member archives.
2021-02-26libctf regen for NEWSAlan Modra2-1/+5
The previous regen was done on a tree without the new NEWS file. * Makefile.in: Regenerate.
2021-02-21libctf AC_CANONICAL_TARGETAlan Modra4-81/+142
AC_CANONICAL_TARGET is needed for @target@ substitution in the makefile. AC_CANONICAL_HOST and AC_CANONICAL_BUILD are alread invoked indirectly, make them explicit. * configure.ac: Invoke AC_CANONICAL_TARGET, AC_CANONICAL_HOST and AC_CANONICAL_BUILD. * configure: Regenerate. * Makefile.in: Regenerate.
2021-02-20libctf: add a NEWSNick Alcock1-0/+26
I'm dividing this into three groups for now: new features, bugfixes, and bugfixes also present on a stable branch. Only user-visible bugfixes, not build-system fixes, are listed.
2021-02-20libctf, include: find types of symbols by nameNick Alcock13-179/+567
The existing ctf_lookup_by_symbol and ctf_arc_lookup_symbol functions suffice to look up the types of symbols if the caller already has a symbol number. But the caller often doesn't have one of those and only knows the name of the symbol: also, in object files, the caller might not have a useful symbol number in any sense (and neither does libctf: the 'symbol number' we use in that case literally starts at 0 for the lexicographically first-sorted symbol in the symtypetab and counts those symbols, so it corresponds to nothing useful). This means that even though object files have a symtypetab (generated by the compiler or by ld -r), the only way we can look up anything in it is to iterate over all symbols in turn with ctf_symbol_next until we find the one we want. This is unhelpful and pointlessly inefficient. So add a pair of functions to look up symbols by name in a dict and in a whole archive: ctf_lookup_by_symbol_name and ctf_arc_lookup_symbol_name. These are identical to the existing functions except that they take symbol names rather than symbol numbers. To avoid insane repetition, we do some refactoring in the process, so that both ctf_lookup_by_symbol and ctf_arc_lookup_symbol turn into thin wrappers around internal functions that do both lookup by symbol index and lookup by name. This massively reduces code duplication because even the existing lookup-by-index stuff wants to use a name sometimes (when looking up in indexed sections), and the new lookup-by-name stuff has to turn it into an index sometimes (when looking up in non-indexed sections): doing it this way lets us share most of that. The actual name->index lookup is done by ctf_lookup_symbol_idx. We do not anticipate this lookup to be as heavily used as ld.so symbol lookup by many orders of magnitude, so using the ELF symbol hashes would probably take more time to read them than is saved by using the hashes, and it adds a lot of complexity. Instead, do a linear search for the symbol name, caching all the name -> index mappings as we go, so that future searches are likely to hit in the cache. To avoid having to repeat this search over and over in a CTF archive when ctf_arc_lookup_symbol_name is used, have cached archive lookups (the sort done by ctf_arc_lookup_symbol* and the ctf_archive_next iterator) pick out the first dict they cache in a given archive and store it in a new ctf_archive field, ctfi_crossdict_cache. This can be used to store cross-dictionary cached state that depends on things like the ELF symbol table rather than the contents of any one dict. ctf_lookup_symbol_idx then caches its name->index mappings in the dictionary named in the crossdict cache, if any, so that ctf_lookup_symbol_idx in other dicts in the same archive benefit from the previous linear search, and the symtab only needs to be scanned at most once. (Note that if you call ctf_lookup_by_symbol_name in one specific dict, and then follow it with a ctf_arc_lookup_symbol_name, the former will not use the crossdict cache because it's only populated by the dict opens in ctf_arc_lookup_symbol_name. This is harmless except for a small one-off waste of memory and time: it's only a cache, after all. We can fix this later by using the archive caching machinery more aggressively.) In ctf-archive, we do similar things, turning ctf_arc_lookup_symbol into a wrapper around a new function that does both index -> ID and name -> ID lookups across all dicts in an archive. We add a new ctfi_symnamedicts cache that maps symbol names to the ctf_dict_t * that it was found in (so that linear searches for symbols don't need to be repeated): but we also *remove* a cache, the ctfi_syms cache that was memoizing the actual ctf_id_t returned from every call to ctf_arc_lookup_symbol. This is pointless: all it saves is one call to ctf_lookup_by_symbol, and that's basically an array lookup and nothing more so isn't worth caching. (Equally, given that symbol -> index mappings are cached by ctf_lookup_by_symbol_name, those calls are nearly free after the first call, so there's no point caching the ctf_id_t in that case either.) We fix up one test that was doing manual symbol lookup to use ctf_arc_lookup_symbol instead, and enhance it to check that the caching layer is not totally broken: we also add a new test to do lookups in a .o file, and another to do lookups in an archive with conflicted types and make sure that sort of multi-dict lookup is actually working. include/ChangeLog 2021-02-17 Nick Alcock <nick.alcock@oracle.com> * ctf-api.h (ctf_arc_lookup_symbol_name): New. (ctf_lookup_by_symbol_name): Likewise. libctf/ChangeLog 2021-02-17 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dict_t) <ctf_symhash>: New. <ctf_symhash_latest>: Likewise. (struct ctf_archive_internal) <ctfi_crossdict_cache>: New. <ctfi_symnamedicts>: New. <ctfi_syms>: Remove. (ctf_lookup_symbol_name): Remove. * ctf-lookup.c (ctf_lookup_symbol_name): Propagate errors from parent properly. Make static. (ctf_lookup_symbol_idx): New, linear search for the symbol name, cached in the crossdict cache's ctf_symhash (if available), or this dict's (otherwise). (ctf_try_lookup_indexed): Allow the symname to be passed in. (ctf_lookup_by_symbol): Turn into a wrapper around... (ctf_lookup_by_sym_or_name): ... this, supporting name lookup too, using ctf_lookup_symbol_idx in non-writable dicts. Special-case name lookup in dynamic dicts without reported symbols, which have no symtab or dynsymidx but where name lookup should still work. (ctf_lookup_by_symbol_name): New, another wrapper. * ctf-archive.c (enosym): Note that this is present in ctfi_symnamedicts too. (ctf_arc_close): Adjust for removal of ctfi_syms. Free the ctfi_symnamedicts. (ctf_arc_flush_caches): Likewise. (ctf_dict_open_cached): Memoize the first cached dict in the crossdict cache. (ctf_arc_lookup_symbol): Turn into a wrapper around... (ctf_arc_lookup_sym_or_name): ... this. No longer cache ctf_id_t lookups: just call ctf_lookup_by_symbol as needed (but still cache the dicts those lookups succeed in). Add lookup-by-name support, with dicts of successful lookups cached in ctfi_symnamedicts. Refactor the caching code a bit. (ctf_arc_lookup_symbol_name): New, another wrapper. * ctf-open.c (ctf_dict_close): Free the ctf_symhash. * libctf.ver (LIBCTF_1.2): New version. Add ctf_lookup_by_symbol_name, ctf_arc_lookup_symbol_name. * testsuite/libctf-lookup/enum-symbol.c (main): Use ctf_arc_lookup_symbol rather than looking up the name ourselves. Fish it out repeatedly, to make sure that symbol caching isn't broken. (symidx_64): Remove. (symidx_32): Remove. * testsuite/libctf-lookup/enum-symbol-obj.lk: Test symbol lookup in an unlinked object file (indexed symtypetab sections only). * testsuite/libctf-writable/symtypetab-nonlinker-writeout.c (try_maybe_reporting): Check symbol types via ctf_lookup_by_symbol_name as well as ctf_symbol_next. * testsuite/libctf-lookup/conflicting-type-syms.*: New test of lookups in a multi-dict archive.
2021-02-20Include ld-lib.exp from ctf-lib.expAlan Modra3-172/+11
* testsuite/config/default.exp (ld_L_opt): Define. * testsuite/lib/ctf-lib.exp (load_common_lib): Delete. Instead load ld-lib.exp. (run_host_cmd, run_host_cmd_yesno, check_compiler_available): Delete. (compile_one_cc, check_ctf_available): Delete.