aboutsummaryrefslogtreecommitdiff
path: root/libctf
AgeCommit message (Collapse)AuthorFilesLines
2019-07-18libctf: introduce ctf_func_type_{info,args}, ctf_type_aname_rawNick Alcock4-2/+95
The first two of these allow you to get function type info and args out of the types section give a type ID: astonishingly, this was missing from libctf before now: so even though types of kind CTF_K_FUNCTION were supported, you couldn't find out anything about them. (The existing ctf_func_info and ctf_func_args only allow you to get info about functions in the function section, i.e. given symbol table indexes, not type IDs.) The second of these allows you to get the raw undecorated name out of the CTF section (strdupped for safety) without traversing subtypes to build a full C identifier out of it. It's useful for things that are already tracking the type kind etc and just need an unadorned name. include/ * ctf-api.h (ECTF_NOTFUNC): Fix description. (ctf_func_type_info): New. (ctf_func_type_args): Likewise. libctf/ * ctf-types.c (ctf_type_aname_raw): New. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-error.c (_ctf_errlist): Fix description.
2019-07-01libctf: fix spurious error when rolling back to the first snapshotNick Alcock2-1/+5
The first ctf_snapshot called after CTF file creation yields a snapshot handle that always yields a spurious ECTF_OVERROLLBACK error ("Attempt to roll back past a ctf_update") on ctf_rollback(), even if ctf_update has never been called. The fix is to start with a ctf_snapshot value higher than the zero value that ctf_snapshot_lu ("last update CTF snapshot value") is initialized to. libctf/ * ctf-create.c (ctf_create): Fix off-by-one error.
2019-07-01libctf: deduplicate and sort the string tableNick Alcock8-138/+520
ctf.h states: > [...] the CTF string table does not contain any duplicated strings. Unfortunately this is entirely untrue: libctf has before now made no attempt whatsoever to deduplicate the string table. It computes the string table's length on the fly as it adds new strings to the dynamic CTF file, and ctf_update() just writes each string to the table and notes the current write position as it traverses the dynamic CTF file's data structures and builds the final CTF buffer. There is no global view of the strings and no deduplication. Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding a new dynhash table ctf_str_atoms that maps unique strings to a list of references to those strings: a reference is a simple uint32_t * to some value somewhere in the under-construction CTF buffer that needs updating to note the string offset when the strtab is laid out. Adding a string is now a simple matter of calling ctf_str_add_ref(), which adds a new atom to the atoms table, if one doesn't already exist, and adding the location of the reference to this atom to the refs list attached to the atom: this works reliably as long as one takes care to only call ctf_str_add_ref() once the final location of the offset is known (so you can't call it on a temporary structure and then memcpy() that structure into place in the CTF buffer, because the ref will still point to the old location: ctf_update() changes accordingly). Generating the CTF string table is a matter of calling ctf_str_write_strtab(), which counts the length and number of elements in the atoms table using the ctf_dynhash_iter() function we just added, populating an array of pointers into the atoms table and sorting it into order (to help compressors), then traversing this table and emitting it, updating the refs to each atom as we go. The only complexity here is arranging to keep the null string at offset zero, since a lot of code in libctf depends on being able to leave strtab references at 0 to indicate 'no name'. Once the table is constructed and the refs updated, we know how long it is, so we can realloc() the partial CTF buffer we allocated earlier and can copy the table on to the end of it (and purge the refs because they're not needed any more and have been invalidated by the realloc() call in any case). The net effect of all this is a reduction in uncompressed strtab sizes of about 30% (perhaps a quarter to a half of all strings across the Linux kernel are eliminated as duplicates). Of course, duplicated strings are highly redundant, so the space saving after compression is only about 20%: when the other non-strtab sections are factored in, CTF sizes shrink by about 10%. No change in externally-visible API or file format (other than the reduction in pointless redundancy). libctf/ * ctf-impl.h: (struct ctf_strs_writable): New, non-const version of struct ctf_strs. (struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated. (struct ctf_str_atom): New, disambiguated single string. (struct ctf_str_atom_ref): New, points to some other location that references this string's offset. (struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs. Remove member ctf_dtvstrlen: we no longer track the total strlen as we add strings. (ctf_str_create_atoms): Declare new function in ctf-string.c. (ctf_str_free_atoms): Likewise. (ctf_str_add): Likewise. (ctf_str_add_ref): Likewise. (ctf_str_purge_refs): Likewise. (ctf_str_write_strtab): Likewise. (ctf_realloc): Declare new function in ctf-util.c. * ctf-open.c (ctf_bufopen): Create the atoms table. (ctf_file_close): Destroy it. * ctf-create.c (ctf_update): Copy-and-free it on update. No longer special-case the position of the parname string. Construct the strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the rest of each buffer element is constructed, not via open-coding: realloc the CTF buffer and append the strtab to it. No longer maintain ctf_dtvstrlen. Sort the variable entry table later, after strtab construction. (ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members. (ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref after buffer element construction instead. (ctf_copy_lmembers): Likewise. (ctf_copy_emembers): Likewise. (ctf_create): No longer maintain the ctf_dtvstrlen. (ctf_dtd_delete): Likewise. (ctf_dvd_delete): Likewise. (ctf_add_generic): Likewise. (ctf_add_enumerator): Likewise. (ctf_add_member_offset): Likewise. (ctf_add_variable): Likewise. (membadd): Likewise. * ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts if there are active ctf_str_num_refs. (ctf_strraw): Move to ctf-string.c. (ctf_strptr): Likewise. * ctf-string.c: New file, strtab manipulation. * Makefile.am (libctf_a_SOURCES): Add it. * Makefile.in: Regenerate.
2019-07-01libctf: add hash traversal helpersNick Alcock3-0/+67
There are two, ctf_dynhash_iter and ctf_dynhash_iter_remove: the latter lets you return a nonzero value to remove the element being iterated over. Used in the next commit. libctf/ * ctf-impl.h (ctf_hash_iter_f): New. (ctf_dynhash_iter): New declaration. (ctf_dynhash_iter_remove): New declaration. * ctf-hash.c (ctf_dynhash_iter): Define. (ctf_dynhash_iter_remove): Likewise. (ctf_hashtab_traverse): New. (ctf_hashtab_traverse_remove): Likewise. (struct ctf_traverse_cb_arg): Likewise. (struct ctf_traverse_remove_cb_arg): Likewise.
2019-07-01libctf: fix hash removalNick Alcock2-1/+6
We must call htab_remove_elt with an element (in this case, a mocked-up one with only the key populated, since no reasonable hash function will need the other fields), not with the key alone. libctf/ * ctf-hash.c (ctf_dynhash_remove): Call with a mocked-up element.
2019-07-01libctf: disambiguate hex output in dumpsNick Alcock2-3/+8
We were sometimes printing hex values without prefixing them with '0x', leading to confusion about what base the numbers were actually in. libctf/ * ctf-dump.c (ctf_dump_format_type): Prefix hex strings with 0x. (ctf_dump_funcs): Likewise.
2019-06-21libctf: fix ctf_open endianness problems with raw CTF filesNick Alcock2-9/+22
ctf_open (or, rather, ctf_fdopen, which underlies it) has several endianness problems, even though it was written after the endian-swapping code was implemented, so should have been endian-aware. Even though the comment right above the relevant check says that it wil check for CTF magic in any endianness, it only checks in the native endianness, so opening raw LE CTF files on BE, or vice-versa, will fail. It also checks the CTF version by hand, without ever endianness-swapping the header, so that too will fail, and is entirely redundant because ctf_simple_open does the job properly in any case. We have a similar problem in the next if block, which checks for raw CTF archives: we are checking in the native endianness while we should be doing a le64toh() on it to check in little-endian form only: so opening CTF archives created on the local machine will fail if the local machine is big-endian. Adding insult to injury, if ctf_simple_open then fails, we go on and try to turn it into a single-element CTF archive regardless, throwing the error away. Since this involves dereferencing null pointers it is not likely to work very well. libctf/ * ctf-open-bfd.c: Add swap.h and ctf-endian.h. (ctf_fdopen): Check for endian-swapped raw CTF magic, and little-endian CTF archive magic. Do not check the CTF version: ctf_simple_open does that in endian-safe ways. Do not dereference null pointers on open failure.
2019-06-21libctf: endianness fixesNick Alcock2-7/+18
Testing of the first code to generate CTF_K_SLICEs on big-endian revealed a bunch of new problems in this area. Most importantly, the trick we did earlier to avoid wasting two bytes on padding in the ctf_slice_t is best avoided: because it leads to the whole file after that point no longer being naturally aligned, all multibyte accesses from then on must use memmove() to avoid unaligned access on platforms where that is fatal. In future, this is planned, but for now we are still doing direct access in many places, so we must revert to making ctf_slice_t properly aligned for storage in an array. Rather than wasting bytes on padding, we boost the size of cts_offset and cts_bits. This is still a waste of space (we cannot have offsets or bits in bitfields > 256) but it cannot be avoided for now, and slices are not so common that this will be a serious problem. A possibly-worse endianness problem fixed at the same time involves a codepath used only for foreign-endian, uncompressed CTF files, where we were not copying the actual CTF data into the buffer, leading to libctf reading only zeroes (or, possibly, uninitialized garbage). Finally, when we read in a CTF file, we copy the header and work from the copy. We were flipping the endianness of the header copy, and of the body of the file buffer, but not of the header in the file buffer itself: so if we write the file back out again we end up with an unreadable frankenfile with header and body of different endiannesses. Fix by flipping both copies of the header. include/ * ctf.h (ctf_slice_t): Make cts_offset and cts_bits unsigned short, so following structures are properly aligned. libctf/ * ctf-open.c (get_vbytes_common): Return the new slice size. (ctf_bufopen): Flip the endianness of the CTF-section header copy. Remember to copy in the CTF data when opening an uncompressed foreign-endian CTF file. Prune useless variable manipulation.
2019-06-21libctf: unidentified type kinds on open are a sign of file corruptionNick Alcock2-0/+9
If we see a CTF type with a kind we do not recognize in its ctt_info during opening, we cannot skip it and continue opening the file: if the type kind is unknown, we do not know how long its vlen is, and we cannot have skipped past it: so if we continue reading we will almost certainly read in part of the vlen as if it were a new ctf_type_t. Avoid this trouble by considering unknown type kinds to be a reason to return ECTF_CORRUPT, just like everything else that reads in type kinds does. libctf/ * ctf-open.c (ctf_types): Fail when unidentified type kinds are seen.
2019-06-21libctf: dump header offsets into the debugging outputNick Alcock2-0/+8
This is an essential first piece of info needed to debug both libctf writing and reading problems, and we weren't recording it anywhere! (This is a short-term fix: fairly soon, we will record all of this in a form that outlives ctf_bufopen, and then ctf_dump() will be able to dump it like it can everything else.) libctf/ * ctf-open.c (ctf_bufopen): Dump header offsets into the debugging output.
2019-06-21libctf: drop mmap()-based CTF data allocatorNick Alcock5-89/+32
This allocator has the ostensible benefit that it lets us mprotect() the memory used for CTF storage: but in exchange for this it adds considerable complexity, since we have to track allocation sizes ourselves for use at freeing time, note whether the data we are storing was ctf_data_alloc()ed or not so we know if we can safely mprotect() it... and while the mprotect()ing has found few bugs, it *has* been the cause of more than one due to errors in all this tracking leading to us mprotect()ing bits of the heap and stuff like that. We are about to start composing CTF buffers from pieces so that we can do usage-based optimizations on the strtab. This means we need realloc(), which needs nonportable mremap() and *more* tracking of the *original* allocation size, and the complexity and bureaucracy of all of this is just too high for its negligible benefits. Drop the whole thing and just use malloc() like everyone else. It knows better than we do when it is safe to use mmap() under the covers, anyway. While we're at it, don't leak the entire buffer if ctf_compress_write() fails to compress it. libctf/ * ctf-subr.c (_PAGESIZE): Remove. (ctf_data_alloc): Likewise. (ctf_data_free): Likewise. (ctf_data_protect): Likewise. * ctf-impl.h: Remove declarations. * ctf-create.c (ctf_update): No longer call ctf_data_protect: use ctf_free, not ctf_data_free. (ctf_compress_write): Use ctf_data_alloc, not ctf_alloc. Free the buffer again on compression error. * ctf-open.c (ctf_set_base): No longer track the size: call ctf_free, not ctf_data_free. (upgrade_types): Likewise. Call ctf_alloc, not ctf_data_alloc. (ctf_bufopen): Likewise. No longer call ctf_data_protect.
2019-06-21libctf: handle errors on dynhash insertion betterNick Alcock3-12/+35
We were missing several cases where dynhash insertion might fail, likely due to OOM but possibly for other reasons. Pass the errors on. libctf/ * ctf-create.c (ctf_dtd_insert): Pass on error returns from ctf_dynhash_insert. (ctf_dvd_insert): Likewise. (ctf_add_generic): Likewise. (ctf_add_variable): Likewise. * ctf-impl.h: Adjust declarations.
2019-06-14Regenerate with approved autotools versionAlan Modra2-1/+5
bfd/ * Makefile.in: Regenerate. * configure: Regenerate. binutils/ * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * doc/Makefile.in: Regenerate. gas/ * Makefile.in: Regenerate. * configure: Regenerate. * doc/Makefile.in: Regenerate. ld/ * Makefile.in: Regenerate. * configure: Regenerate. libctf/ * configure: Regenerate.
2019-06-07libctf: avoid strndupNick Alcock3-1/+7
Not all platforms have it. Use libiberty xstrndup() instead. (The include of libiberty.h happens in an unusual place due to the requirements of synchronization of most source files between this project and another that does not use libiberty. It serves to pull libiberty.h in for all source files in libctf/, which does the trick.) Tested on x86_64-pc-linux-gnu, x86_64-unknown-freebsd12.0, sparc-sun-solaris2.11, i686-pc-cygwin, i686-w64-mingw32. libctf/ * ctf-decls.h: Include <libiberty.h>. * ctf-lookup.c (ctf_lookup_by_name): Call xstrndup(), not strndup().
2019-06-07libctf: explicitly cast more size_t types used in printf()sNick Alcock2-7/+17
Unsigned long will always be adequate (the only cases involving an ssize_t are cases in which no error can be generated, or in which negative output would require a seriously corrupted file: the latter has been rewritten on a branch in any case). Tested on x86_64-pc-linux-gnu, x86_64-unknown-freebsd12.0, sparc-sun-solaris2.11, i686-pc-cygwin, i686-w64-mingw32. libctf/ * ctf-dump.c (ctf_dump_format_type): Cast size_t's used in printf()s. (ctf_dump_objts): Likewise. (ctf_dump_funcs): Likewise. (ctf_dump_member): Likewise. (ctf_dump_str): Likewise.
2019-06-07libctf: mark various args as unused in the !HAVE_MMAP caseNick Alcock3-2/+7
Tested on x86_64-pc-linux-gnu, x86_64-unknown-freebsd12.0, sparc-sun-solaris2.11, i686-pc-cygwin, i686-w64-mingw32. libctf/ * ctf-archive.c (arc_mmap_header): Mark fd as potentially unused. * ctf-subr.c (ctf_data_protect): Mark both args as potentially unused.
2019-06-05libctf: eschew %zi format specifierNick Alcock3-6/+14
Too many platforms don't support it, and we can always safely use %lu or %li anyway, because the only uses are in debugging output. libctf/ * ctf-archive.c (ctf_arc_write): Eschew %zi format specifier. (ctf_arc_open_by_offset): Likewise. * ctf-create.c (ctf_add_type): Likewise.
2019-06-04Use CHAR_BIT instead of NBBY in libctfTom Tromey2-6/+13
On x86-64 Fedora 29, I tried to build a mingw-hosted gdb that targets ppc-linux. You can do this with: ../binutils-gdb/configure --host=i686-w64-mingw32 --target=ppc-linux \ --disable-{binutils,gas,gold,gprof,ld} The build failed with these errors in libctf: In file included from ../../binutils-gdb/libctf/ctf-create.c:20: ../../binutils-gdb/libctf/ctf-create.c: In function 'ctf_add_encoded': ../../binutils-gdb/libctf/ctf-create.c:803:59: error: 'NBBY' undeclared (first use in this function) dtd->dtd_data.ctt_size = clp2 (P2ROUNDUP (ep->cte_bits, NBBY) / NBBY); ^~~~ ../../binutils-gdb/libctf/ctf-impl.h:254:42: note: in definition of macro 'P2ROUNDUP' #define P2ROUNDUP(x, align) (-(-(x) & -(align))) ^~~~~ ../../binutils-gdb/libctf/ctf-create.c:803:59: note: each undeclared identifier is reported only once for each function it appears in dtd->dtd_data.ctt_size = clp2 (P2ROUNDUP (ep->cte_bits, NBBY) / NBBY); ^~~~ ../../binutils-gdb/libctf/ctf-impl.h:254:42: note: in definition of macro 'P2ROUNDUP' #define P2ROUNDUP(x, align) (-(-(x) & -(align))) ^~~~~ ../../binutils-gdb/libctf/ctf-create.c: In function 'ctf_add_slice': ../../binutils-gdb/libctf/ctf-create.c:862:59: error: 'NBBY' undeclared (first use in this function) dtd->dtd_data.ctt_size = clp2 (P2ROUNDUP (ep->cte_bits, NBBY) / NBBY); ^~~~ ../../binutils-gdb/libctf/ctf-impl.h:254:42: note: in definition of macro 'P2ROUNDUP' #define P2ROUNDUP(x, align) (-(-(x) & -(align))) ^~~~~ ../../binutils-gdb/libctf/ctf-create.c: In function 'ctf_add_member_offset': ../../binutils-gdb/libctf/ctf-create.c:1341:21: error: 'NBBY' undeclared (first use in this function) off += lsize * NBBY; ^~~~ ../../binutils-gdb/libctf/ctf-create.c: In function 'ctf_add_type': ../../binutils-gdb/libctf/ctf-create.c:1822:16: warning: unknown conversion type character 'z' in format [-Wformat=] ctf_dprintf ("Conflict for type %s against ID %lx: " ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../binutils-gdb/libctf/ctf-create.c:1823:35: note: format string is defined here "union size differs, old %zi, new %zi\n", ^ ../../binutils-gdb/libctf/ctf-create.c:1822:16: warning: unknown conversion type character 'z' in format [-Wformat=] ctf_dprintf ("Conflict for type %s against ID %lx: " ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../binutils-gdb/libctf/ctf-create.c:1823:44: note: format string is defined here "union size differs, old %zi, new %zi\n", ^ ../../binutils-gdb/libctf/ctf-create.c:1822:16: warning: too many arguments for format [-Wformat-extra-args] ctf_dprintf ("Conflict for type %s against ID %lx: " ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This patch fixes the actual errors in here. I did not try to fix the printf warnings, though I think someone ought to. Ok? libctf/ChangeLog 2019-06-04 Tom Tromey <tromey@adacore.com> * ctf-create.c (ctf_add_encoded, ctf_add_slice) (ctf_add_member_offset): Use CHAR_BIT, not NBBY.
2019-06-04libctf: work on platforms without O_CLOEXEC.Nick Alcock5-0/+67
(Not tested on any such platforms, since I don't have access to any at the moment. Testing encouraged.) libctf/ * configure.ac: Check for O_CLOEXEC. * ctf-decls.h (O_CLOEXEC): Define (to 0), if need be. * config.h.in: Regenerate.
2019-06-04libctf: look for BSD versus GNU qsort_r signaturesNick Alcock10-85/+236
We cannot just look for any declaration of qsort_r, because some operating systems have a qsort_r that has a different prototype but which still has a pair of pointers in the right places (the last two args are interchanged): so use AC_LINK_IFELSE to check for both known variants of qsort_r(), and swap their args into a consistent order in a suitable inline function. (The code for this is taken almost unchanged from gnulib.) (Now we are not using AC_LIBOBJ any more, we can use a better name for the qsort_r replacement as well.) libctf/ * qsort_r.c: Rename to... * ctf-qsort_r.c: ... this. (_quicksort): Define to ctf_qsort_r. * ctf-decls.h (qsort_r): Remove. (ctf_qsort_r): Add. (struct ctf_qsort_arg): New, transport the real ARG and COMPAR. (ctf_qsort_compar_thunk): Rearrange the arguments to COMPAR. * Makefile.am (libctf_a_LIBADD): Remove. (libctf_a_SOURCES): New, add ctf-qsort_r.c. * ctf-archive.c (ctf_arc_write): Call ctf_qsort_r, not qsort_r. * ctf-create.c (ctf_update): Likewise. * configure.ac: Check for BSD versus GNU qsort_r signature. * Makefile.in: Regenerate. * config.h.in: Likewise. * configure: Likewise.
2019-06-04libctf: fix use-after-free in function dumpingNick Alcock2-1/+5
This is actually a free-before-initializing (i.e. a free of garbage). libctf/ * ctf-dump.c (ctf_dump_funcs): Free in the right place.
2019-05-31libctf: fix a number of build problems found on Solaris and NetBSDJose E. Marchesi21-102/+865
- Use of nonportable <endian.h> - Use of qsort_r - Use of zlib without appropriate magic to pull in the binutils zlib - Use of off64_t without checking (fixed by dropping the unused fields that need off64_t entirely) - signedness problems due to long being too short a type on 32-bit platforms: ctf_id_t is now 'unsigned long', and CTF_ERR must be used only for functions that return ctf_id_t - One lingering use of bzero() and of <sys/errno.h> All fixed, using code from gnulib where possible. Relatedly, set cts_size in a couple of places it was missed (string table and symbol table loading upon ctf_bfdopen()). binutils/ * objdump.c (make_ctfsect): Drop cts_type, cts_flags, and cts_offset. * readelf.c (shdr_to_ctf_sect): Likewise. include/ * ctf-api.h (ctf_sect_t): Drop cts_type, cts_flags, and cts_offset. (ctf_id_t): This is now an unsigned type. (CTF_ERR): Cast it to ctf_id_t. Note that it should only be used for ctf_id_t-returning functions. libctf/ * Makefile.am (ZLIB): New. (ZLIBINC): Likewise. (AM_CFLAGS): Use them. (libctf_a_LIBADD): New, for LIBOBJS. * configure.ac: Check for zlib, endian.h, and qsort_r. * ctf-endian.h: New, providing htole64 and le64toh. * swap.h: Code style fixes. (bswap_identity_64): New. * qsort_r.c: New, from gnulib (with one added #include). * ctf-decls.h: New, providing a conditional qsort_r declaration, and unconditional definitions of MIN and MAX. * ctf-impl.h: Use it. Do not use <sys/errno.h>. (ctf_set_errno): Now returns unsigned long. * ctf-util.c (ctf_set_errno): Adjust here too. * ctf-archive.c: Use ctf-endian.h. (ctf_arc_open_by_offset): Use memset, not bzero. Drop cts_type, cts_flags and cts_offset. (ctf_arc_write): Drop debugging dependent on the size of off_t. * ctf-create.c: Provide a definition of roundup if not defined. (ctf_create): Drop cts_type, cts_flags and cts_offset. (ctf_add_reftype): Do not check if type IDs are below zero. (ctf_add_slice): Likewise. (ctf_add_typedef): Likewise. (ctf_add_member_offset): Cast error-returning ssize_t's to size_t when known error-free. Drop CTF_ERR usage for functions returning int. (ctf_add_member_encoded): Drop CTF_ERR usage for functions returning int. (ctf_add_variable): Likewise. (enumcmp): Likewise. (enumadd): Likewise. (membcmp): Likewise. (ctf_add_type): Likewise. Cast error-returning ssize_t's to size_t when known error-free. * ctf-dump.c (ctf_is_slice): Drop CTF_ERR usage for functions returning int: use CTF_ERR for functions returning ctf_type_id. (ctf_dump_label): Likewise. (ctf_dump_objts): Likewise. * ctf-labels.c (ctf_label_topmost): Likewise. (ctf_label_iter): Likewise. (ctf_label_info): Likewise. * ctf-lookup.c (ctf_func_args): Likewise. * ctf-open.c (upgrade_types): Cast to size_t where appropriate. (ctf_bufopen): Likewise. Use zlib types as needed. * ctf-types.c (ctf_member_iter): Drop CTF_ERR usage for functions returning int. (ctf_enum_iter): Likewise. (ctf_type_size): Likewise. (ctf_type_align): Likewise. Cast to size_t where appropriate. (ctf_type_kind_unsliced): Likewise. (ctf_type_kind): Likewise. (ctf_type_encoding): Likewise. (ctf_member_info): Likewise. (ctf_array_info): Likewise. (ctf_enum_value): Likewise. (ctf_type_rvisit): Likewise. * ctf-open-bfd.c (ctf_bfdopen): Drop cts_type, cts_flags and cts_offset. (ctf_simple_open): Likewise. (ctf_bfdopen_ctfsect): Likewise. Set cts_size properly. * Makefile.in: Regenerate. * aclocal.m4: Likewise. * config.h: Likewise. * configure: Likewise.
2019-05-29Fix libctf build on non-ELF targets.Nick Alcock5-1/+156
All machinery works as on ELF, except for automatic loading of ELF string and symbol tables in the BFD-style open machinery. * Makefile.def (dependencies): configure-libctf depends on all-bfd and all its deps. * Makefile.in: Regenerated. libctf/ * configure.in: Check for bfd_section_from_elf_index. * configure: Regenerate. * config.h.in [HAVE_BFD_ELF]: Likewise. * libctf/ctf_open_bfd (ctf_bfdopen_ctfsect): Use it. abfd is potentially unused now.
2019-05-28libctf: build systemNick Alcock7-0/+9677
This ties libctf into the build system, and makes binutils depend on it (used by the next commits). * Makefile.def (host_modules): Add libctf. * Makefile.def (dependencies): Likewise. libctf depends on zlib, libiberty, and bfd. * Makefile.in: Regenerated. * configure.ac (host_libs): Add libctf. * configure: Regenerated. libctf/ * Makefile.am: New. * Makefile.in: Regenerated. * config.h.in: Likewise. * aclocal.m4: Likewise. * configure: Likewise.
2019-05-28libctf: debug dumpingNick Alcock2-0/+599
This introduces ctf_dump(), an iterator which returns a series of strings, each representing a debugging dump of one item from a given section in the CTF file. The items may be multiline: a callback is provided to allow the caller to decorate each line as they desire before the line is returned. libctf/ * ctf-dump.c: New. include/ * ctf-api.h (ctf_dump_decorate_f): New. (ctf_dump_state_t): new. (ctf_dump): New.
2019-05-28libctf: labelsNick Alcock2-0/+142
This facility allows you to associate regions of type IDs with *labels*, a labelled tiling of the type ID space. You can use these to define CTF containers with distinct parents for distinct ranges of the ID space, or to assist with parallelization of CTF processing, or for any other purpose you can think of. Notably absent from here (though declared in the API header) is any way to define new labels: this will probably be introduced soon, as part of the linker deduplication work. (One existed in the past, but was deeply tied to the Solaris CTF file generator and had to be torn out.) libctf/ * ctf-labels.c: New. include/ * ctf-api.h (ctf_label_f): New. (ctf_label_set): New. (ctf_label_get): New. (ctf_label_topmost): New. (ctf_label_info): New. (ctf_label_iter): New.
2019-05-28libctf: library version enforcementNick Alcock3-0/+34
This old Solaris standard allows callers to specify that they are expecting one particular API and/or CTF file format from the library. libctf/ * ctf-impl.h (_libctf_version): New declaration. * ctf-subr.c (_libctf_version): Define it. (ctf_version): New. include/ * ctf-api.h (ctf_version): New.
2019-05-28libctf: type copyingNick Alcock2-0/+493
ctf_add_type() allows you to copy types, and all the types they depend on, from one container to another (writable) container. This lets a program maintaining multiple distinct containers (not in a parent-child relationship) introduce types that depend on types in one container in another writable one, by copying the necessary types. libctf/ * ctf-create.c (enumcmp): New. (enumadd): Likewise. (membcmp): Likewise. (membadd): Likewise. (ctf_add_type): Likewise.
2019-05-28libctf: lookups by name and symbolNick Alcock3-0/+377
These functions allow you to look up types given a name in a simple subset of C declarator syntax (no function pointers), to look up the types of variables given a name, and to look up the types of data objects and the type signatures of functions given symbol table offsets. (Despite its name, one function in this commit, ctf_lookup_symbol_name(), is for the internal use of libctf only, and does not appear in any public header files.) libctf/ * ctf-lookup.c (isqualifier): New. (ctf_lookup_by_name): Likewise. (struct ctf_lookup_var_key): Likewise. (ctf_lookup_var): Likewise. (ctf_lookup_variable): Likewise. (ctf_lookup_symbol_name): Likewise. (ctf_lookup_by_symbol): Likewise. (ctf_func_info): Likewise. (ctf_func_args): Likewise. include/ * ctf-api.h (ctf_func_info): New. (ctf_func_args): Likewise. (ctf_lookup_by_symbol): Likewise. (ctf_lookup_by_symbol): Likewise. (ctf_lookup_variable): Likewise.
2019-05-28libctf: core type lookupNick Alcock4-0/+1232
Finally we get to the functions used to actually look up and enumerate properties of types in a container (names, sizes, members, what type a pointer or cv-qual references, determination of whether two types are assignment-compatible, etc). With a very few exceptions these do not work for types newly added via ctf_add_*(): they only work on types in read-only containers, or types added before the most recent call to ctf_update(). This also adds support for lookup of "variables" (string -> type ID mappings) and for generation of C type names corresponding to a type ID. libctf/ * ctf-decl.c: New file. * ctf-types.c: Likewise. * ctf-impl.h: New declarations. include/ * ctf-api.h (ctf_visit_f): New definition. (ctf_member_f): Likewise. (ctf_enum_f): Likewise. (ctf_variable_f): Likewise. (ctf_type_f): Likewise. (ctf_type_isparent): Likewise. (ctf_type_ischild): Likewise. (ctf_type_resolve): Likewise. (ctf_type_aname): Likewise. (ctf_type_lname): Likewise. (ctf_type_name): Likewise. (ctf_type_sizee): Likewise. (ctf_type_align): Likewise. (ctf_type_kind): Likewise. (ctf_type_reference): Likewise. (ctf_type_pointer): Likewise. (ctf_type_encoding): Likewise. (ctf_type_visit): Likewise. (ctf_type_cmp): Likewise. (ctf_type_compat): Likewise. (ctf_member_info): Likewise. (ctf_array_info): Likewise. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_member_iter): Likewise. (ctf_enum_iter): Likewise. (ctf_type_iter): Likewise. (ctf_variable_iter): Likewise.
2019-05-28libctf: ELF file opening via BFDNick Alcock4-0/+376
These functions let you open an ELF file with a customarily-named CTF section in it, automatically opening the CTF file or archive and associating the symbol and string tables in the ELF file with the CTF container, so that you can look up the types of symbols in the ELF file via ctf_lookup_by_symbol(), and so that strings can be shared between the ELF file and CTF container, to save space. It uses BFD machinery to do so. This has now been lightly tested and seems to work. In particular, if you already have a bfd you can pass it in to ctf_bfdopen(), and if you want a bfd made for you you can call ctf_open() or ctf_fdopen(), optionally specifying a target (or try once without a target and then again with one if you get ECTF_BFD_AMBIGUOUS back). We use a forward declaration for the struct bfd in ctf-api.h, so that ctf-api.h users are not required to pull in <bfd.h>. (This is mostly for the sake of readelf.) libctf/ * ctf-open-bfd.c: New file. * ctf-open.c (ctf_close): New. * ctf-impl.h: Include bfd.h. (ctf_file): New members ctf_data_mmapped, ctf_data_mmapped_len. (ctf_archive_internal): New members ctfi_abfd, ctfi_data, ctfi_bfd_close. (ctf_bfdopen_ctfsect): New declaration. (_CTF_SECTION): likewise. include/ * ctf-api.h (struct bfd): New forward. (ctf_fdopen): New. (ctf_bfdopen): Likewise. (ctf_open): Likewise. (ctf_arc_open): Likewise.
2019-05-28libctf: mmappable archivesNick Alcock4-0/+786
If you need to store a large number of CTF containers somewhere, this provides a dedicated facility for doing so: an mmappable archive format like a very simple tar or ar without all the system-dependent format horrors or need for heavy file copying, with built-in compression of files above a particular size threshold. libctf automatically mmap()s uncompressed elements of these archives, or uncompresses them, as needed. (If the platform does not support mmap(), copying into dynamically-allocated buffers is used.) Archive iteration operations are partitioned into raw and non-raw forms. Raw operations pass thhe raw archive contents to the callback: non-raw forms open each member with ctf_bufopen() and pass the resulting ctf_file_t to the iterator instead. This lets you manipulate the raw data in the archive, or the contents interpreted as a CTF file, as needed. It is not yet known whether we will store CTF archives in a linked ELF object in one of these (akin to debugdata) or whether they'll get one section per TU plus one parent container for types shared between them. (In the case of ELF objects with very large numbers of TUs, an archive of all of them would seem preferable, so we might just use an archive, and add lzma support so you can assume that .gnu_debugdata and .ctf are compressed using the same algorithm if both are present.) To make usage easier, the ctf_archive_t is not the on-disk representation but an abstraction over both ctf_file_t's and archives of many ctf_file_t's: users see both CTF archives and raw CTF files as ctf_archive_t's upon opening, the only difference being that a raw CTF file has only a single "archive member", named ".ctf" (the default if a null pointer is passed in as the name). The next commit will make use of this facility, in addition to providing the public interface to actually open archives. (In the future, it should be possible to have all CTF sections in an ELF file appear as an "archive" in the same fashion.) This machinery is also used to allow library-internal creators of ctf_archive_t's (such as the next commit) to stash away an ELF string and symbol table, so that all opens of members in a given archive will use them. This lets CTF archives exploit the ELF string and symbol table just like raw CTF files can. (All this leads to somewhat confusing type naming. The ctf_archive_t is a typedef for the opaque internal type, struct ctf_archive_internal: the non-internal "struct ctf_archive" is the on-disk structure meant for other libraries manipulating CTF files. It is probably clearest to use the struct name for struct ctf_archive_internal inside the program, and the typedef names outside.) libctf/ * ctf-archive.c: New. * ctf-impl.h (ctf_archive_internal): New type. (ctf_arc_open_internal): New declaration. (ctf_arc_bufopen): Likewise. (ctf_arc_close_internal): Likewise. include/ * ctf.h (CTFA_MAGIC): New. (struct ctf_archive): New. (struct ctf_archive_modent): Likewise. * ctf-api.h (ctf_archive_member_f): New. (ctf_archive_raw_member_f): Likewise. (ctf_arc_write): Likewise. (ctf_arc_close): Likewise. (ctf_arc_open_by_name): Likewise. (ctf_archive_iter): Likewise. (ctf_archive_raw_iter): Likewise. (ctf_get_arc): Likewise.
2019-05-28libctf: openingNick Alcock3-0/+1741
This fills in the other half of the opening/creation puzzle: opening of already-existing CTF files. Such files are always read-only: if you want to add to a CTF file opened with one of the opening functions in this file, use ctf_add_type(), in a later commit, to copy appropriate types into a newly ctf_create()d, writable container. The lowest-level opening functions are in here: ctf_bufopen(), which takes ctf_sect_t structures akin to ELF section headers, and ctf_simple_open(), which can be used if you don't have an entire ELF section header to work from. Both will malloc() new space for the buffers only if necessary, will mmap() directly from the file if requested, and will mprotect() it afterwards to prevent accidental corruption of the types. These functions are also used by ctf_update() when converting types in a writable container into read-only types that can be looked up using the lookup functions (in later commits). The files are always of the native endianness of the system that created them: at read time, the endianness of the header magic number is used to determine whether or not the file needs byte-swapping, and the entire thing is aggressively byte-swapped. The agggressive nature of this swapping avoids complicating the rest of the code with endianness conversions, while the native endianness introduces no byte-swapping overhead in the common case. (The endianness-independence code is also much newer than everything else in this file, and deserves closer scrutiny.) The accessors at the top of the file are there to transparently support older versions of the CTF file format, allowing translation from older formats that have different sizes for the structures in ctf.h: currently, these older formats are intermingled with the newer ones in ctf.h: they will probably migrate to a compatibility header in time, to ease readability. The ctf_set_base() function is split out for the same reason: when conversion code to a newer format is written, it would need to malloc() new storage for the entire ctf_file_t if a file format change causes it to grow, and for that we need ctf_set_base() to be a separate function. One pair of linked data structures supported by this file has no creation code in libctf yet: the data and function object sections read by init_symtab(). These will probably arrive soon, when the linker comes to need them. (init_symtab() has hardly been changed since 2009, but if any code in libctf has rotted over time, this will.) A few simple accessors are also present that can even be called on read-only containers because they don't actually modify them, since the relevant things are not stored in the container but merely change its operation: ctf_setmodel(), which lets you specify whether a container is LP64 or not (used to statically determine the sizes of a few types), ctf_import(), which is the only way to associate a parent container with a child container, and ctf_setspecific(), which lets the caller associate an arbitrary pointer with the CTF container for any use. If the user doesn't call these functions correctly, libctf will misbehave: this is particularly important for ctf_import(), since a container built against a given parent container will not be able to resolve types that depend on types in the parent unless it is ctf_import()ed with a parent container with the same set of types at the same IDs, or a superset. Possible future extensions (also noted in the ctf-hash.c file) include storing a count of things so that we don't need to do one pass over the CTF file counting everything, and computing a perfect hash at CTF creation time in some compact form, storing it in the CTF file, and using it to hash things so we don't need to do a second pass over the entire CTF file to set up the hashes used to go from names to type IDs. (There are multiple such hashes, one for each C type namespace: types, enums, structs, and unions.) libctf/ * ctf-open.c: New file. * swap.h: Likewise. include/ * ctf-api.h (ctf_file_close): New declaration. (ctf_getdatasect): Likewise. (ctf_parent_file): Likewise. (ctf_parent_name): Likewise. (ctf_parent_name_set): Likewise. (ctf_import): Likewise. (ctf_setmodel): Likewise. (ctf_getmodel): Likewise. (ctf_setspecific): Likewise. (ctf_getspecific): Likewise.
2019-05-28libctf: creation functionsNick Alcock3-0/+1615
The CTF creation process looks roughly like (error handling elided): int err; ctf_file_t *foo = ctf_create (&err); ctf_id_t type = ctf_add_THING (foo, ...); ctf_update (foo); ctf_*write (...); Some ctf_add_THING functions accept other type IDs as arguments, depending on the type: cv-quals, pointers, and structure and union members all take other types as arguments. So do 'slices', which let you take an existing integral type and recast it as a type with a different bitness or offset within a byte, for bitfields. One class of THING is not a type: "variables", which are mappings of names (in the internal string table) to types. These are mostly useful when encoding variables that do not appear in a symbol table but which some external user has some other way to figure out the address of at runtime (dynamic symbol lookup or querying a VM interpreter or something). You can snapshot the creation process at any point: rolling back to a snapshot deletes all types and variables added since that point. You can make arbitrary type queries on the CTF container during the creation process, but you must call ctf_update() first, which translates the growing dynamic container into a static one (this uses the CTF opening machinery, added in a later commit), which is quite expensive. This function must also be called after adding types and before writing the container out. Because addition of types involves looking up existing types, we add a little of the type lookup machinery here, as well: only enough to look up types in dynamic containers under construction. libctf/ * ctf-create.c: New file. * ctf-lookup.c: New file. include/ * ctf-api.h (zlib.h): New include. (ctf_sect_t): New. (ctf_sect_names_t): Likewise. (ctf_encoding_t): Likewise. (ctf_membinfo_t): Likewise. (ctf_arinfo_t): Likewise. (ctf_funcinfo_t): Likewise. (ctf_lblinfo_t): Likewise. (ctf_snapshot_id_t): Likewise. (CTF_FUNC_VARARG): Likewise. (ctf_simple_open): Likewise. (ctf_bufopen): Likewise. (ctf_create): Likewise. (ctf_add_array): Likewise. (ctf_add_const): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_enum): Likewise. (ctf_add_float): Likewise. (ctf_add_forward): Likewise. (ctf_add_function): Likewise. (ctf_add_integer): Likewise. (ctf_add_slice): Likewise. (ctf_add_pointer): Likewise. (ctf_add_type): Likewise. (ctf_add_typedef): Likewise. (ctf_add_restrict): Likewise. (ctf_add_struct): Likewise. (ctf_add_union): Likewise. (ctf_add_struct_sized): Likewise. (ctf_add_union_sized): Likewise. (ctf_add_volatile): Likewise. (ctf_add_enumerator): Likewise. (ctf_add_member): Likewise. (ctf_add_member_offset): Likewise. (ctf_add_member_encoded): Likewise. (ctf_add_variable): Likewise. (ctf_set_array): Likewise. (ctf_update): Likewise. (ctf_snapshot): Likewise. (ctf_rollback): Likewise. (ctf_discard): Likewise. (ctf_write): Likewise. (ctf_gzwrite): Likewise. (ctf_compress_write): Likewise.
2019-05-28libctf: implementation definitions related to file creationNick Alcock2-0/+219
We now enter a series of commits that are sufficiently tangled that avoiding forward definitions is almost impossible: no attempt is made to make individual commits compilable (which is why the build system does not reference any of them yet): the only important thing is that they should form something like conceptual groups. But first, some definitions, including the core ctf_file_t itself. Uses of these definitions will be introduced in later commits. libctf/ * ctf-impl.h: New definitions and declarations for type creation and lookup.
2019-05-28libctf: hashingNick Alcock3-0/+311
libctf maintains two distinct hash ADTs, one (ctf_dynhash) for wrapping dynamically-generated unknown-sized hashes during CTF file construction, one (ctf_hash) for wrapping unchanging hashes whose size is known at creation time for reading CTF files that were previously created. In the binutils implementation, these are both fairly thin wrappers around libiberty hashtab. Unusually, this code is not kept synchronized with libdtrace-ctf, due to its dependence on libiberty hashtab. libctf/ * ctf-hash.c: New file. * ctf-impl.h: New declarations.
2019-05-28libctf: error handlingNick Alcock2-0/+97
CTF functions return zero on success or an extended errno value which can be translated into a string via the functions in this commit. The errno numbers start at -CTF_BASE. libctf/ * ctf-error.c: New file. include/ * ctf-api.h (ctf_errno): New declaration. (ctf_errmsg): Likewise.
2019-05-28libctf: low-level list manipulation and helper utilitiesNick Alcock4-0/+276
These utilities are a bit of a ragbag of small things needed by more than one TU: list manipulation, ELF32->64 translators, routines to look up strings in string tables, dynamically-allocated string appenders, and routines to set the specialized errno values previously committed in <ctf-api.h>. We do still need to dig around in raw ELF symbol tables in places, because libctf allows the caller to pass in the contents of string and symbol sections without telling it where they come from, so we cannot use BFD to get the symbols (BFD reasonably demands the entire file). So extract minimal ELF definitions from glibc into a private header named libctf/elf.h: later, we use those to get symbols. (The start-of- copyright range on elf.h reflects this glibc heritage.) libctf/ * ctf-util.c: New file. * elf.h: Likewise. * ctf-impl.h: Include it, and add declarations.
2019-05-28libctf: lowest-level memory allocation and debug-dumping wrappersNick Alcock3-0/+322
The memory-allocation wrappers are simple things to allow malloc interposition: they are only used inconsistently at present, usually where malloc debugging was required in the past. These provide a default implementation that is environment-variable triggered (initialized on the first call to the libctf creation and file-opening functions, the first functions people will use), and a ctf_setdebug()/ctf_getdebug() pair that allows the caller to explicitly turn debugging off and on. If ctf_setdebug() is called, the automatic setting from an environment variable is skipped. libctf/ * ctf-impl.h: New file. * ctf-subr.c: New file. include/ * ctf-api.h (ctf_setdebug): New. (ctf_getdebug): Likewise.