aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-04-19libctf: make ctf_lookup of symbols by name work in more casesNick Alcock1-1/+3
In particular, we don't need a symbol table if we're looking up a symbol by name and that type of symbol has an indexed symtypetab, since in that case we get the name from the symtypetab index, not from the symbol table. This lets you do symbol lookups in unlinked object files and unlinked dicts written out via libctf's writeout functions. libctf/ * ctf-lookup.c (ctf_lookup_by_sym_or_name): Allow lookups by index even when there is no symtab.
2024-04-19libctf: improve handling of type dumping errorsNick Alcock1-1/+2
When dumping a type fails with an error, we want to emit a warning noting this: a warning because it's not fatal and we can continue. But warnings don't automatically print out the ctf_errno (because not all cases causing warnings set the errno at all), so we must do it at warning-emission time or lose track of what's gone wrong. libctf/ * ctf-dump.c (ctf_dump_format_type): Dump the underlying error on type dump failure.
2024-04-19libctf: fix tiny dumping errorNick Alcock1-3/+2
Without this, you might get things like this in the output: Flags: 0xa (CTF_F_NEWFUNCINFO, , CTF_F_DYNSTR) Note the spurious comma. libctf/ * ctf-dump.c (ctf_dump_header): Fix comma emission.
2024-04-19libctf: make ctf_serialize() actually serializeNick Alcock7-348/+137
ctf_serialize() evolved from the old ctf_update(), which mutated the in-memory CTF dict to make all the dynamic in-memory types into static, unchanging written-to-the-dict types (by deserializing and reserializing it): back in the days when you could only do type lookups on static types, this meant you could see all the types you added recently, at the small, small cost of making it impossible to change those older types ever again and inducing an amortized O(n^2) cost if you actually wanted to add references to types you added at arbitrary times to later types. It also reset things so that ctf_discard() would throw away only types you added after the most recent ctf_update() call. Some time ago this was all changed so that you could look up dynamic types just as easily as static types: ctf_update() changed so that only its visible side-effect of affecting ctf_discard() remained: the old ctf_update() was renamed to ctf_serialize(), made internal to libctf, and called from the various functions that wrote files out. ... but it was still working by serializing and deserializing the entire dict, swapping out its guts with the newly-serialized copy in an invasive and horrible fashion that coupled ctf_serialize() to almost every field in the ctf_dict_t. This is totally useless, and fixing it is easy: just rip all that code out and have ctf_serialize return a serialized representation, and let everything use that directly. This simplifies most of its callers significantly. (It also points up another bug: ctf_gzwrite() failed to call ctf_serialize() at all, so it would only ever work for a dict you just ctf_write_mem()ed yourself, just for its invisible side-effect of serializing the dict!) This lets us simplify away a bunch of internal-only open-side functionality for overriding the syn_ext_strtab and some just-added functionality for forcing in an existing atoms table, without loss of functionality, and lets us lift the restriction on reserializing a dict that was ctf_open()ed rather than being ctf_create()d: it's now perfectly OK to open a dict, modify it (except for adding members to existing structs, unions, or enums, which fails with -ECTF_RDONLY), and write it out again, just as one would expect. libctf/ * ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos. (ctf_type_sect_size): Add static type sizes too. (ctf_serialize): Return the new dict rather than updating the existing dict. No longer fail for dicts with static types; copy them onto the start of the new types table. (ctf_gzwrite): Actually serialize before gzwriting. (ctf_write_mem): Improve forced (test-mode) endian-flipping: flip dicts even if they are too small to be compressed. Improve confusing variable naming. * ctf-archive.c (arc_write_one_ctf): Don't bother to call ctf_serialize: both the functions we call do so. * ctf-string.c (ctf_str_create_atoms): Drop serializing case (atoms arg). * ctf-open.c (ctf_simple_open): Call ctf_bufopen directly. (ctf_simple_open_internal): Delete. (ctf_bufopen_internal): Delete/rename to ctf_bufopen: no longer bother with syn_ext_strtab or forced atoms table, serialization no longer needs them. * ctf-create.c (ctf_create): Call ctf_bufopen directly. * ctf-impl.h (ctf_str_create_atoms): Drop atoms arg. (ctf_simple_open_internal): Delete. (ctf_bufopen_internal): Likewise. (ctf_serialize): Adjust. * testsuite/libctf-lookup/add-to-opened.c: Adjust now that this is supposed to work.
2024-04-19libctf: rethink strtab writeoutNick Alcock5-150/+292
This commit finally adjusts strtab writeout so that repeated writeouts, or writeouts of a dict that was read in earlier, only sorts the portion of the strtab that was newly added. There are three intertwined changes here: - pull the contents of strtabs from newly ctf_bufopened dicts into the atoms table, so that future additions will reuse the existing offset etc rather than adding new identical strings - allow the internal ctf_bufopen done by serialization to contribute its existing atoms table, so that existing atoms can be used for the remainder of the open process (like name table construction): this atoms table currente gets thrown away in the mass reassignment done later in ctf_serialize in any case, but it needs to be there during the open. - rewrite ctf_str_write_strtab so that a) it uses iterators rather than ctf_*_iter, reducing pointless structures which serve no other purpose than to implement ordinary variable scope, but more clunkily, and b) retains the existing strtab on the front of the new one, with its sort retained, rather than resorting, so all existing already-written strtab offsets remain valid across the call. This latter change finally permits repeated serializations, and reserializations of ctf_open()ed dicts, to work, but for now we keep the code that prevents that because serialization is about to change again in a way that will make it more obvious that doing such things is safe, and we can take it out then. (There are also some smaller changes like moving the purge of the refs table into ctf_str_write_strtab(), since that's where the changes happen that invalidate it, rather than doing it in ctf_serialize(). We also prohibit something that has never worked, opening a dict and then reporting symbols to it via ctf_link_add_strtab() et al: you must do that to newly-created dicts which have had stuff ctf_link()ed into them. This is very unlikely ever to be a problem in practice: linkers just don't do that sort of thing.) libctf/ * ctf-create.c (ctf_create): Add (temporary) atoms arg. * ctf-impl.h (struct ctf_dict.ctf_dynstrtab): New. (ctf_str_create_atoms): Adjust. (ctf_str_write_strtab): Likewise. (ctf_simple_open_internal): Likewise. * ctf-open.c (ctf_simple_open_internal): Add atoms arg. (ctf_bufopen): Likewise. (ctf_bufopen_internal): Initialize just enough of an atoms table: pre-init from the atoms arg if supplied. (ctf_simple_open): Adjust. * ctf-serialize.c (ctf_serialize): Constify the strtab. Move ref list purging into ctf_str_write_strtab. Initialize the new dict with the old dict's atoms table. Accept the new strtab from ctf_str_write_strtab. Adjust for addition of ctf_dynstrtab. * ctf-string.c (ctf_strraw_explicit): Improve comments. (ctf_str_create_atoms): Prepopulate from an existing atoms table, or alternatively pull in all strings from the strtab and turn them into atoms. (ctf_str_free_atoms): Free the dynstrtab and its strtab. (struct ctf_strtab_write_state): Remove. (ctf_str_count_strtab): Fold this... (ctf_str_populate_sorttab): ... and this... (ctf_str_write_strtab): ... into this. Prepend existing strings to the strtab rather than resorting them (and wrecking their offsets). Keep the dynstrtab updated. Update refs for all atoms with refs, whether or not they are strings newly added to the strtab.
2024-04-19libctf: replace 'pending refs' abstractionNick Alcock5-63/+224
A few years ago we introduced a 'pending refs' abstraction to fix one problem: serializing a dict, then changing it would tend to corrupt the dict because the strtab sort we do on strtab writeout (to improve compression efficiency) would modify the offset of any strings that sorted lexicographically earlier in the strtab: so we added a new restriction that all strings are added only at serialization time, and maintained a set of 'pending' refs that were added earlier, whose offsets we could update (like other refs) at writeout time. This was in hindsight seriously problematic for maintenance (because serialization has to traverse all strings in all datatypes in the entire dict), and has become impossible to sustain now that we can read in existing dicts, modify them, and reserialize them again. We really don't want to have to dig through the entire dict we jut read in just in order to dig out all its strtab offsets, then *change* it, just for the sake of a sort that adds a frankly trivial amount of compression efficiency. Sorting *is* still worthwhile -- but it sacrifices very little to only sort newly-added portions of the strtab, reusing older portions as necessary. As a first stage in this, discard the whole "pending refs" abstraction and replace it with "movable" refs, which are exactly like all other refs (addresses containing the strtab offset of some string, which are updated wiht the final strtab offset on serialization) except that we track them in a reverse dict so that we can move the refs around (which we do whenever we realloc() a buffer containing a bunch of structure members or something when we add members to the structure). libctf/ * ctf-create.c (ctf_add_enumerator): Call ctf_str_move_refs; add a movable ref. (ctf_add_member_offset): Likewise. * ctf-util.c (ctf_realloc): Delete. * ctf-serialize.c (ctf_serialize): No longer use it. Adjust to new fields. * ctf-string.c (ctf_str_purge_atom_refs): Purge movable refs. (ctf_str_free_atom): Free freeable atoms' strings. (ctf_str_create_atoms): Create the movable refs dynhash if needed. (ctf_str_free_atoms): Destroy it. (CTF_STR_MOVABLE): Switch (back) from ints to flags (see previous reversion). Add new flag. (aref_create): New, populate movable refs if need be. (ctf_str_add_ref_internal): Switch back to flags, update refs directly for nonprovisional strings (with already-known fixed offsets); create refs via aref_create. Allocate strings only if not within an mmapped strtab. (ctf_str_add_movable_ref): New. (ctf_str_add): Adjust to CTF_STR_* reintroduction. (ctf_str_add_external): LIkewise. (ctf_str_move_refs): New, move refs via ctf_str_movable_refs backpointer. (ctf_str_purge_refs): Drop ctf_str_num_refs. (ctf_str_update_refs): Fix indentation. * ctf-impl.h (struct ctf_str_atom_movable): New. (struct ctf_dict.ctf_str_num_refs): Drop. (struct ctf_dict.ctf_str_movable_refs): New. (ctf_str_add_movable_ref): Declare. (ctf_str_move_refs): Likewise. (ctf_realloc): Drop.
2024-04-19Revert "libctf: do not corrupt strings across ctf_serialize"Nick Alcock5-125/+14
This reverts commit 986e9e3aa03f854bedacef7fac38fe8f009a416c. (We do not revert the testcase -- it remains valid -- but we are taking a different, less complex and more robust approach.) This also deletes the pending refs abstraction without (yet) replacing it, so some tests will fail for a commit or two.
2024-04-19libctf: rename ctf_dict.ctf_{symtab,strtab}Nick Alcock4-25/+25
These two fields are constantly confusing because CTF dicts contain both a symtypetab and strtab, but these fields are not that: they are the symtab and strtab from the ELF file. We have enough string tables now (internal, external, synthetic external, dynamic) that we need to at least name them better than this to avoid getting totally confused. Rename them to ctf_ext_symtab and ctf_ext_strtab. libctf/ * ctf-dump.c (ctf_dump_objts): Rename ctf_symtab -> ctf_ext_symtab. * ctf-impl.h (struct ctf_dict.ctf_symtab): Rename to... (struct ctf_dict.ctf_ext_strtab): ... this. (struct ctf_dict.ctf_strtab): Rename to... (struct ctf_dict.ctf_ext_strtab): ... this. * ctf-lookup.c (ctf_lookup_symbol_name): Adapt. (ctf_lookup_symbol_idx): Adapt. (ctf_lookup_by_sym_or_name): Adapt. * ctf-open.c (ctf_bufopen_internal): Adapt. (ctf_dict_close): Adapt. (ctf_getsymsect): Adapt. (ctf_getstrsect): Adapt. (ctf_symsect_endianness): Adapt.
2024-04-19libctf: fix a comment typoNick Alcock1-3/+3
ctf_update has been called ctf_serialize for years now. libctf/ * ctf-impl.h: Fix comment typo.
2024-04-19libctf: delete LCTF_DIRTYNick Alcock4-24/+1
This flag was meant as an optimization to avoid reserializing dicts unnecessarily. It was critically necessary back when serialization was done by ctf_update() and you had to call that every time you wanted any new modifications to the type table to be usable by other types, but that has been unnecessary for years now, and serialization is only done once when writing out, which one would naturally assume would always serialize the dict. Worse, it never really worked: it only tracked newly-added types, not things like added symbols which might equally well require reserialization, and it gets in the way of an upcoming change. Delete entirely. libctf/ * ctf-create.c (ctf_create): Drop LCTF_DIRTY. (ctf_discard): Likewise. (ctf_rollback): Likewise. (ctf_add_generic): Likewise. (ctf_set_array): Likewise. (ctf_add_enumerator): Likewise. (ctf_add_member_offset): Likewise. (ctf_add_variable_forced): Likewise. * ctf-link.c (ctf_link_intern_extern_string): Likewise. (ctf_link_add_strtab): Likewise. * ctf-serialize.c (ctf_serialize): Likewise. * ctf-impl.h (LCTF_DIRTY): Likewise. (LCTF_LINKING): Renumber.
2024-04-19libctf: fix a commentNick Alcock1-1/+1
A mistaken "not" in ctf_err_warn made it seem like we only extracted error messages if this was not an error. libctf/ * ctf-subr.c (ctf_err_warn): Fix comment.
2024-04-19libctf: support addition of types to dicts read via ctf_open()Nick Alcock10-268/+619
libctf has long declared deserialized dictionaries (out of files or ELF sections or memory buffers or whatever) to be read-only: back in the furthest prehistory this was not the case, in that you could add a few sorts of type to such dicts, but attempting to do so often caused horrible memory corruption, so I banned the lot. But it turns out real consumers want it (notably DTrace, which synthesises pointers to types that don't have them and adds them to the ctf_open()ed dicts if it needs them). Let's bring it back again, but without the memory corruption and without the massive code duplication required in days of yore to distinguish between static and dynamic types: the representation of both types has been identical for a few years, with the only difference being that types as a whole are stored in a big buffer for types read in via ctf_open and per-type hashtables for newly-added types. So we discard the internally-visible concept of "readonly dictionaries" in favour of declaring the *range of types* that were already present when the dict was read in to be read-only: you can't modify them (say, by adding members to them if they're structs, or calling ctf_set_array on them), but you can add more types and point to them. (The API remains the same, with calls sometimes returning ECTF_RDONLY, but now they do so less often.) This is a fairly invasive change, mostly because code written since the ban was introduced didn't take the possibility of a static/dynamic split into account. Some of these irregularities were hard to define as anything but bugs. Notably: - The symbol handling was assuming that symbols only needed to be looked for in dynamic hashtabs or static linker-laid-out indexed/ nonindexed layouts, but now we want to check both in case people added more symbols to a dict they opened. - The code that handles type additions wasn't checking to see if types with the same name existed *at all* (so you could do ctf_add_typedef (fp, "foo", bar) repeatedly without error). This seems reasonable for types you just added, but we probably *do* want to ban addition of types with names that override names we already used in the ctf_open()ed portion, since that would probably corrupt existing type relationships. (Doing things this way also avoids causing new errors for any existing code that was doing this sort of thing.) - ctf_lookup_variable entirely failed to work for variables just added by ctf_add_variable: you had to write the dict out and read it back in again before they appeared. - The symbol handling remembered what symbols you looked up but didn't remember their types, so you could look up an object symbol and then find it popping up when you asked for function symbols, which seems less than ideal. Since we had to rejig things enough to be able to distinguish function and object symbols internally anyway (in order to give suitable errors if you try to add a symbol with a name that already existed in the ctf_open()ed dict), this bug suddenly became more visible and was easily fixed. We do not (yet) support writing out dicts that have been previously read in via ctf_open() or other deserializer (you can look things up in them, but not write them out a second time). This never worked, so there is no incompatibility; if it is needed at a later date, the serializer is a little bit closer to having it work now (the only table we don't deal with is the types table, and that's because the upcoming CTFv4 changes are likely to make major changes to the way that table is represented internally, so adding more code that depends on its current form seems like a bad idea). There is a new testcase that tests much of this, in particular that modification of existing types is still banned and that you can add new ones and chase them without error. libctf/ * ctf-impl.h (struct ctf_dict.ctf_symhash): Split into... (ctf_dict.ctf_symhash_func): ... this and... (ctf_dict.ctf_symhash_objt): ... this. (ctf_dict.ctf_stypes): New, counts static types. (LCTF_INDEX_TO_TYPEPTR): Use it instead of CTF_RDWR. (LCTF_RDWR): Deleted. (LCTF_DIRTY): Renumbered. (LCTF_LINKING): Likewise. (ctf_lookup_variable_here): New. (ctf_lookup_by_sym_or_name): Likewise. (ctf_symbol_next_static): Likewise. (ctf_add_variable_forced): Likewise. (ctf_add_funcobjt_sym_forced): Likewise. (ctf_simple_open_internal): Adjust. (ctf_bufopen_internal): Likewise. * ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with. (ctf_create): Migrate a bunch of initializations into bufopen. Force recreation of name tables. Do not forcibly override the model, let ctf_bufopen do it. (ctf_static_type): New. (ctf_update): Drop LCTF_RDWR check. (ctf_dynamic_type): Likewise. (ctf_add_function): Likewise. (ctf_add_type_internal): Likewise. (ctf_rollback): Check ctf_stypes, not LCTF_RDWR. (ctf_set_array): Likewise. (ctf_add_struct_sized): Likewise. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enumerator): Likewise (only on the target dict). (ctf_add_member_offset): Likewise. (ctf_add_generic): Drop LCTF_RDWR check. Ban addition of types with colliding names. (ctf_add_forward): Note safety under the new rules. (ctf_add_variable): Split all but the existence check into... (ctf_add_variable_forced): ... this new function. (ctf_add_funcobjt_sym): Likewise... (ctf_add_funcobjt_sym_forced): ... for this new function. * ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts with any stypes. (ctf_link_add_strtab): Likewise. (ctf_link_shuffle_syms): Likewise. (ctf_link_intern_extern_string): Note pre-existing prohibition. * ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check. (ctf_lookup_variable): Split out looking in a dict but not its parent into... (ctf_lookup_variable_here): ... this new function. (ctf_lookup_symbol_idx): Track whether looking up a function or object: cache them separately. (ctf_symbol_next): Split out looking in non-dynamic symtypetab entries to... (ctf_symbol_next_static): ... this new function. Don't get confused by the simultaneous presence of static and dynamic symtypetab entries. (ctf_try_lookup_indexed): Don't waste time looking up symbols by index before there can be any idea how symbols are numbered. (ctf_lookup_by_sym_or_name): Distinguish between function and data object lookups. Drop LCTF_RDWR. (ctf_lookup_by_symbol): Adjust. (ctf_lookup_by_symbol_name): Likewise. * ctf-open.c (init_types): Rename to... (init_static_types): ... this. Drop LCTF_RDWR. Populate ctf_stypes. (ctf_simple_open): Drop writable arg. (ctf_simple_open_internal): Likewise. (ctf_bufopen): Likewise. (ctf_bufopen_internal): Populate fields only used for writable dicts. Drop LCTF_RDWR. (ctf_dict_close): Cater for symhash cache split. * ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR. * ctf-types.c (ctf_variable_next): Drop LCTF_RDWR. * testsuite/libctf-lookup/add-to-opened*: New test.
2024-04-19libctf: fix name lookup in dicts containing base-type bitfieldsNick Alcock3-25/+188
The intent of the name lookup code was for lookups to yield non-bitfield basic types except if none existed with a given name, and only then return bitfield types with that name. Unfortunately, the code as written only does this if the base type has a type ID higher than all bitfield types, which is most unlikely (the opposite is almost always the case). Adjust it so that what ends up in the name table is the highest-width zero-offset type with a given name, if any such exist, and failing that the first type with that name we see, no matter its offset. (We don't define *which* bitfield type you get, after all, so we might as well just stuff in the first we find.) Reported by Stephen Brennan <stephen.brennan@oracle.com>. libctf/ * ctf-open.c (init_types): Modify to allow some lookups during open; detect bitfield name reuse and prefer less bitfieldy types. * testsuite/libctf-writable/libctf-bitfield-name-lookup.*: New test.
2024-04-19libctf: remove static/dynamic name lookup distinctionNick Alcock8-201/+157
libctf internally maintains a set of hash tables for type name lookups, one for each valid C type namespace (struct, union, enum, and everything else). Or, rather, it maintains *two* sets of hash tables: one, a ctf_hash *, is meant for lookups in ctf_(buf)open()ed dicts with fixed content; the other, a ctf_dynhash *, is meant for lookups in ctf_create()d dicts. This distinction was somewhat valuable in the far pre-binutils past when two different hashtable implementations were used (one expanding, the other fixed-size), but those days are long gone: the hash table implementations are almost identical, both wrappers around the libiberty hashtab. The ctf_dynhash has many more capabilities than the ctf_hash (iteration, deletion, etc etc) and has no downsides other than starting at a fixed, arbitrary small size. That limitation is easy to lift (via a new ctf_dynhash_create_sized()), following which we can throw away nearly all the ctf_hash implementation, and all the code to choose between readable and writable hashtabs; the few convenience functions that are still useful (for insertion of name -> type mappings) can also be generalized a bit so that the extra string verification they do is potentially available to other string lookups as well. (libctf still has two hashtable implementations, ctf_dynhash, above, and ctf_dynset, which is a key-only hashtab that can avoid a great many malloc()s, used for high-volume applications in the deduplicator.) libctf/ * ctf-create.c (ctf_create): Eliminate ctn_writable. (ctf_dtd_insert): Likewise. (ctf_dtd_delete): Likewise. (ctf_rollback): Likewise. (ctf_name_table): Eliminate ctf_names_t. * ctf-hash.c (ctf_dynhash_create): Comment update. Reimplement in terms of... (ctf_dynhash_create_sized): ... this new function. (ctf_hash_create): Remove. (ctf_hash_size): Remove. (ctf_hash_define_type): Remove. (ctf_hash_destroy): Remove. (ctf_hash_lookup_type): Rename to... (ctf_dynhash_lookup_type): ... this. (ctf_hash_insert_type): Rename to... (ctf_dynhash_insert_type): ... this, moving validation to... * ctf-string.c (ctf_strptr_validate): ... this new function. * ctf-impl.h (struct ctf_names): Extirpate. (struct ctf_lookup.ctl_hash): Now a ctf_dynhash_t. (struct ctf_dict): All ctf_names_t fields are now ctf_dynhash_t. (ctf_name_table): Now returns a ctf_dynhash_t. (ctf_lookup_by_rawhash): Remove. (ctf_hash_create): Likewise. (ctf_hash_insert_type): Likewise. (ctf_hash_define_type): Likewise. (ctf_hash_lookup_type): Likewise. (ctf_hash_size): Likewise. (ctf_hash_destroy): Likewise. (ctf_dynhash_create_sized): New. (ctf_dynhash_insert_type): New. (ctf_dynhash_lookup_type): New. (ctf_strptr_validate): New. * ctf-lookup.c (ctf_lookup_by_name_internal): Adapt. * ctf-open.c (init_types): Adapt. (ctf_set_ctl_hashes): Adapt. (ctf_dict_close): Adapt. * ctf-serialize.c (ctf_serialize): Adapt. * ctf-types.c (ctf_lookup_by_rawhash): Remove.
2024-04-19libctf: don't leak the symbol name in the name->type cacheNick Alcock1-1/+1
This cache replaced a cache of symbol index->ctf_id_t. That cache was just an array, so it could get away with just being free()d, but the ctfi_symnamedicts cache that replaced it is a full dynhash with a dynamically-allocated string as the key. As such, it needs freeing with ctf_dynhash_destroy(), not just free(), or we leak parts of the underlying hashtab, and all the keys. libctf/ChangeLog: * ctf-archive.c (ctf_arc_flush_caches): Fix leak.
2024-04-19binutils, objdump: Add --ctf-parent-sectionNick Alcock2-8/+61
This lets you examine CTF where the parent and child dicts are in entirely different sections, rather than in a CTF archive with members with different names. The linker doesn't emit ELF objects structured like this, but some third-party linkers may; it's also useful for objcopy-constructed files in some cases. (This is what the objdump --ctf-parent option used to do before commit 80b56fad5c99a8c9 in 2021. The new semantics of that option are much more useful, but that doesn't mean the old ones are never useful at all, so let's bring them back.) (I was specifically driven to add this by DTrace's obscure "ctypes" and "dtypes" options, which dump its internal, dynamically-generated dicts out to files for debugging purposes: there are two, one the parent of the other. Since they're in two separate files rather than a CTF archive and we have no tools that paste files together into archives, objdump wouldn't show them -- and even pasting them together into an ELF executable with objcopy didn't help, since objdump had no options that could be used to look in specific sections for the parent dict. With --ctf-parent-section, this sort of obscure use case becomes possible again. You'll never need it for the output of the normal linker.) binutils/ * doc/ctf.options.texi: Add --ctf-parent-section=. * objdump.c (dump_ctf): Implement it. (dump_bfd): Likewise. (main): Likewise.
2024-04-19gdb: Document qIsAddressTagged packetGustavo Romero2-3/+44
This commit documents the qIsAddressTagged packet. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org> Reviewed-by: Eli Zaretskii <eliz@gnu.org> Approved-By: Eli Zaretskii <eliz@gnu.org>
2024-04-19gdb/testsuite: Add unit tests for qIsAddressTagged packetGustavo Romero1-0/+67
Add unit tests for testing qIsAddressTagged packet request creation and reply checks. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org> Approved-By: Luis Machado <luis.machado@arm.com> Tested-By: Luis Machado <luis.machado@arm.com>
2024-04-19gdb: Add qIsAddressTagged packetGustavo Romero1-0/+75
This commit adds a new packet, qIsAddressTagged, allowing GDB remote targets to use it to query the stub if a given address is tagged. Currently, the memory tagging address check is done via a read query, where the contents of /proc/<PID>/smaps is read and the flags are inspected for memory tagging-related flags that indicate the address is in a memory tagged region. This is not ideal, for example, for QEMU stub and other cases, such as on bare-metal, where there is no notion of an OS file like 'smaps.' Hence, the introduction of qIsAddressTagged packet allows checking if an address is tagged in an agnostic way. The is_address_tagged target hook in remote.c attempts to use the qIsAddressTagged packet first for checking if an address is tagged and if the stub does not support such a packet (reply is empty) it falls back to using the current mechanism that reads the contents of /proc/<PID>/smaps via vFile requests. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org> Approved-By: Luis Machado <luis.machado@arm.com> Tested-By: Luis Machado <luis.machado@arm.com>
2024-04-19gdb: Introduce is_address_tagged target hookGustavo Romero13-25/+100
This commit introduces a new target hook, target_is_address_tagged, which is used instead of the gdbarch_tagged_address_p gdbarch hook in the upper layer (printcmd.c). This change enables easy specialization of memory tagging address check per target in the future. As target_is_address_tagged continues to utilize the gdbarch_tagged_address_p hook, there is no change in behavior for all the targets that use the new target hook (i.e., the remote.c, aarch64-linux-nat.c, and corelow.c targets). Just the gdbarch_tagged_address_p signature is changed for convenience, since target_is_address_tagged takes the address to be checked as a CORE_ADDR type. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org> Approved-By: Luis Machado <luis.machado@arm.com> Tested-By: Luis Machado <luis.machado@arm.com>
2024-04-19gdb: Use passed gdbarch instead of calling current_inferiorGustavo Romero1-1/+1
In do_examine function, use passed gdbarch when checking if an address is tagged instead of calling current_inferior()->arch() to make the code more localized and help modularity by not calling a current_* function, which disguises the use of a global state/variable. There is no change in the code behavior. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org> Suggested-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Approved-By: Luis Machado <luis.machado@arm.com> Tested-By: Luis Machado <luis.machado@arm.com>
2024-04-19gdb: aarch64: Remove MTE address checking from memtag_matches_pGustavo Romero1-4/+0
This commit removes aarch64_linux_tagged_address_p from aarch64_linux_memtag_matches_p. aarch64_linux_tagged_address_p checks if an address is tagged (MTE) or not. The check is redundant because aarch64_linux_memtag_matches_p is always called from the upper layers (i.e. from printcmd.c via gdbarch hook gdbarch_memtag_matches_p) after either gdbarch_tagged_address_p (that already points to aarch64_linux_tagged_address_p) has been called or after should_validate_memtags (that calls gdbarch_tagged_address_p at the end) has been called, so the address is already checked. Hence: a) in print_command_1, gdbarch_memtag_matches_p is called only after should_validate_memtags is called, which checks the address at its end; b) in memory_tag_check_command, gdbarch_memtag_matches_p is called only after gdbarch_tagged_address_p is called directly. Also, because after this change the address checking only happens at the upper layer it now allows the address checking to be specialized easily per target, via a target hook. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org> Approved-By: Luis Machado <luis.machado@arm.com> Tested-By: Luis Machado <luis.machado@arm.com>
2024-04-19gdb: aarch64: Move MTE address check out of set_memtagGustavo Romero2-9/+5
Remove check in parse_set_allocation_tag_input as it is redundant: currently the check happens at the end of parse_set_allocation_tag_input and also in set_memtag (called after parse_set_allocation_tag_input). After it, move MTE address check out of set_memtag and add this check to the upper layer, before set_memtag is called. This is a preparation for using a target hook instead of a gdbarch hook on MTE address checks. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org> Approved-By: Luis Machado <luis.machado@arm.com>
2024-04-19gdb: aarch64: Remove MTE address checking from get_memtagGustavo Romero1-4/+0
This commit removes aarch64_linux_tagged_address_p from aarch64_linux_get_memtag. aarch64_linux_tagged_address_p checks if an address is tagged (MTE) or not. The check is redundant because aarch64_linux_get_memtag is always called from the upper layers (i.e. from printcmd.c via gdbarch hook gdbarch_get_memtag) after either gdbarch_tagged_address_p (that already points to aarch64_linux_tagged_address_p) has been called or after should_validate_memtags (that calls gdbarch_tagged_address_p at the end) has been called, so the address is already checked. Hence: a) in print_command_1, aarch64_linux_get_memtag (via gdbarch_get_memtag hook) is called but only after should_validate_memtags, which calls gdbarch_tagged_address_p; b) in do_examine, aarch64_linux_get_memtag is also called only after gdbarch_tagged_address_p is directly called; c) in memory_tag_check_command, gdbarch_get_memtag is called -- tags matching or not -- after the initial check via direct call to gdbarch_tagged_address_p; d) in memory_tag_print_tag_command, address is checked directly via gdbarch_tagged_address_p before gdbarch_get_memtag is called. Also, because after this change the address checking only happens at the upper layer it now allows the address checking to be specialized easily per target, via a target hook. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org> Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Approved-By: Luis Machado <luis.machado@arm.com> Tested-By: Luis Machado <luis.machado@arm.com>
2024-04-19Re: elf: Strip unreferenced weak undefined symbolsAlan Modra1-2/+3
PR ld/31652 * elflink.c (_bfd_elf_link_output_relocs): Don't segfault on NULL rel_hash.
2024-04-18elf: Strip unreferenced weak undefined symbolsH.J. Lu11-28/+134
Linker will resolve an undefined symbol only if it is referenced by relocation. Unreferenced weak undefined symbols serve no purpose. Weak undefined symbols appear in the dynamic symbol table only when they are referenced by dynamic relocation. Mark symbols with relocation and strip undefined weak symbols if they don't have relocation and aren't in the dynamic symbol table. bfd/ PR ld/31652 * elf-bfd.h (elf_link_hash_entry): Add has_reloc. * elf-vxworks.c (elf_vxworks_emit_relocs): Set has_reloc. * elflink.c (_bfd_elf_link_output_relocs): Likewise. (elf_link_output_extsym): Strip undefined weak symbols if they don't have relocation and aren't in the dynamic symbol table. ld/ PR ld/31652 * testsuite/ld-elf/elf.exp: Run undefweak tests. * testsuite/ld-elf/undefweak-1.rd: New file. * testsuite/ld-elf/undefweak-1a.s: Likewise. * testsuite/ld-elf/undefweak-1b.s: Likewise. * testsuite/ld-x86-64/weakundef-1.nd: Likewise. * testsuite/ld-x86-64/weakundef-1a.s: Likewise. * testsuite/ld-x86-64/weakundef-1b.s: Likewise. * testsuite/ld-x86-64/x86-64.exp: Run undefweak tests.
2024-04-19memory leak in bfd/dwarf2.cAlan Modra1-0/+2
* dwarf2.c (_bfd_dwarf2_cleanup_debug_info): Free dwarf_addr_buffer and dwarf_str_offsets_buffer.
2024-04-19mmix disassemble memory leakAlan Modra1-0/+1
It's a once-off and of no consequence, but fix it anyway. The mmix caonoicalize_syms array is an array of pointers. Freeing it won't lose symbol names. * mmix-dis.c (initialize_mmix_dis_info): Free syms.
2024-04-19Automatic date update in version.inGDB Administrator1-1/+1
2024-04-18[gdb/testsuite] Use find_gnatmake instead of gdb_find_gnatmakeTom de Vries2-1/+16
On SLE-11, with an older dejagnu version, I ran into: ... Running gdb.ada/mi_prot.exp ... UNRESOLVED: gdb.ada/mi_prot.exp: \ testcase aborted due to invalid command name: gdb_find_gnatmake ERROR: tcl error sourcing gdb.ada/mi_prot.exp. ERROR: invalid command name "gdb_find_gnatmake" while executing "::gdb_tcl_unknown gdb_find_gnatmake" ("uplevel" body line 1) invoked from within "uplevel 1 ::gdb_tcl_unknown $args" (procedure "::unknown" line 5) invoked from within "gdb_find_gnatmake" (procedure "gnatmake_version_at_least" line 2) invoked from within ... Proc gdb_find_gnatmake is actually a backup for find_gnatmake: ... if {[info procs find_gnatmake] == ""} { rename gdb_find_gnatmake find_gnatmake ... so gnatmake_version_at_least should use find_gnatmake instead. For a recent dejagnu with find_gnatmake, gdb_find_gnatmake is kept, and we don't run into this error. For an older dejagnu without find_gnatmake, gdb_find_gnatmake is renamed to find_gnatmake, and we do run into the error. It's confusing that we're using the gdb_ prefix for gdb_find_gnatmake, it seems something legitimate to use. Maybe we should use future_ or gdb_future_ prefix instead to make this more clear, but I've left that alone for now. Fix this by: - triggering the same error with a recent dejagnu by removing gdb_find_gnatmake unless used (and likewise for other procs in future.exp), and - using find_gnatmake in gnatmake_version_at_least. Tested on x86_64-linux. Approved-By: Tom Tromey <tom@tromey.com> PR testsuite/31647 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31647
2024-04-18[gdb/testsuite] Use allocator_may_return_null=1 in two test-casesTom de Vries3-6/+33
Simon reported [1] that recent commit 06e967dbc9b ("[gdb/python] Throw MemoryError in inferior.read_memory if malloc fails") introduced AddressSanitizer allocation-size-too-big errors in the two test-cases affected by this commit. Fix this by suppressing the error in the two test-cases using allocator_may_return_null=1. Tested on aarch64-linux. Approved-By: Tom Tromey <tom@tromey.com> [1] https://sourceware.org/pipermail/gdb-patches/2024-April/208171.html
2024-04-18gdbsupport: constify some return values in print-utils.{h,cc}Simon Marchi6-22/+22
There is no reason the callers of these functions need to change the returned string, so change the `char *` return types to `const char *`. Update a few callers to also use `const char *`. Change-Id: I94adff574d5e1b326e8cc688cf1817a15b408b96 Approved-By: Tom Tromey <tom@tromey.com>
2024-04-18HPPA64 linker: Do not force the generation of DT_FLAGS for Linux targets.Nick Clifton1-4/+12
PR 30743
2024-04-18Add DW_TAG_compile_unit DIE to Dummy CUsWill Hawkins1-0/+1
Dummy CUs help detect errors and are very helpful. However, the DWARF spec seems to indicate the CUs need a DW_TAG_compile_unit in addition to the header. This patch adds that. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31650 Signed-off-by: Will Hawkins <hawkinsw@obs.cr> Approved-By: Tom de Vries <tdevries@suse.de> Tested-By: Tom de Vries <tdevries@suse.de>
2024-04-18Re: Fix address violations when reading corrupt VMS recordsAlan Modra1-9/+15
Fixes error reports about the length of EEOM records produced by gas. PR 21618 * vms-alpha.c (evax_bfd_print_emh): Don't read subtyp if short record. Consolidate error messages. (evax_bfd_print_eeom): Allow length 10 record.
2024-04-18alpha_vms_get_section_contents vs. fuzzed filesAlan Modra1-24/+26
This patch is in response to an oss-fuzz report regarding use-of-uninitialized-value in bfd_is_section_compressed_info from section contents provided by alpha_vms_get_section_contents. That hole is covered by using bfd_zalloc rather than bfd_alloc. The rest of the patch is mostly a tidy. In a function returning section contents, I tend to prefer a test on the section properties over a test on file properties. That's why I've changed the file flags test to one on section filepos and flags before calling _bfd_generic_get_section_contents. Also, fuzzed objects can easily have sections with file backing in relocatable objects, or sections without file backing in images. Possible confusion is avoided by testing each section. Note that we are always going to run into out-of-memory with fuzzed alpha-vms object files due to sections with contents via ETIR records. eg. ETIR__C_STO_IMMR stores a number of bytes repeatedly, with a 32-bit repeat count. So section contents can be very large from a relatively small file. I'm inclined to think that an out-of-memory error is fine for such files. * vms-alpha.c (alpha_vms_get_section_contents): Handle sections with non-zero filepos or without SEC_HAS_CONTENTS via _bfd_generic_get_section_contents. Zero memory allocated for sections filled by ETIR records.
2024-04-18Tidy objdump opb expressionsAlan Modra1-5/+5
I don't think any of these can overflow, but since all of the expressions I'm editing here are inside a while loop with condition addr_offset < stop_offset, this change makes it more obvious that they can't overflow. * objdump.c (disassemble_bytes): Calculate octet expressions involving both addr_offset and stop_offset by first subtracting addr_offset from stop_offset.
2024-04-18Automatic date update in version.inGDB Administrator1-1/+1
2024-04-17Remove a copy from c-exp.y:parse_numberTom Tromey1-18/+13
parse_number copies its input string, but there is no need to do this. This patch removes the copy. Regression tested on x86-64 Fedora 38. Reviewed-by: Keith Seitz <keiths@redhat.com>
2024-04-17gdb/Windows: Fix detach while runningPedro Alves3-8/+267
While testing a WIP Cygwin GDB that supports non-stop, I noticed that gdb.threads/attach-non-stop.exp exposes that this: (gdb) attach PID& ... (gdb) detach ... hangs. And it turns out that it hangs in all-stop as well. This commits fixes that. After "attach &", the target is set running, we've called ContinueDebugEvent and the process_thread thread is waiting for WaitForDebugEvent events. It is the equivalent of "attach; c&". In windows_nat_target::detach, the first thing we do is unconditionally call windows_continue (for ContinueDebugEvent), which blocks in do_synchronously, until the process_thread sees an event out of WaitForDebugEvent. Unless the inferior happens to run into a breakpoint, etc., then this hangs indefinitely. If we've already called ContinueDebugEvent earlier, then we shouldn't be calling it again in ::detach. Still in windows_nat_target::detach, we have an interesting issue that ends up being the bulk of the patch -- only the process_thread thread can call DebugActiveProcessStop, but if it is blocked in WaitForDebugEvent, we need to somehow force it to break out of it. The only way to do that, is to force the inferior to do something that causes WaitForDebugEvent to return some event. This patch uses CreateRemoteThread to do it, which results in WaitForDebugEvent reporting CREATE_THREAD_DEBUG_EVENT. We then terminate the injected thread before it has a chance to run any userspace code. Note that Win32 functions like DebugBreakProcess and GenerateConsoleCtrlEvent would also inject a new thread in the inferior. I first used DebugBreakProcess, but that is actually more complicated to use, because we'd have to be sure to consume the breakpoint event before detaching, otherwise the inferior would likely die due a breakpoint exception being raised with no debugger around to intercept it. See the new break_out_process_thread method. So the fix has two parts: - Keep track of whether we've called ContinueDebugEvent and the process_thread thread is waiting for events, or whether WaitForDebugEvent already returned an event. - In windows_nat_target::detach, if the process_thread thread is waiting for events, unblock out of its WaitForDebugEvent, before proceeding with the actual detach. New test included. Passes cleanly on GNU/Linux native and gdbserver, and also passes cleanly on Cygwin and MinGW, with the fix. Before the fix, it would hang and fail with a timeout. Tested-By: Hannes Domani <ssbssa@yahoo.de> Reviewed-By: Tom Tromey <tom@tromey.com> Change-Id: Ifb91c58c08af1a9bcbafecedc93dfce001040905
2024-04-17gdb+gdbserver/Linux: Remove USE_SIGTRAP_SIGINFO fallbackPedro Alves3-95/+8
It's been over 9 years (since commit faf09f0119da) since Linux GDB and GDBserver started relying on SIGTRAP si_code to tell whether a breakpoint triggered, which is important for non-stop mode. When that then-new code was added, I had left the then-old code as fallback, in case some architectured still needed it. Given AFAIK there haven't been complaints since, this commit finally removes the fallback code, along with USE_SIGTRAP_SIGINFO. Change-Id: I140a5333a9fe70e90dbd186aca1f081549b2e63d
2024-04-17Use section name in DWARF error messageTom Tromey1-2/+3
A bug points out that a certain error message in read_str_index uses a hard-coded section name. This patch changes it to use dwarf2_section_info::get_name instead, like the other errors in the function. No test because it didn't seem worthwhile. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31639 Approved-By: Simon Marchi <simon.marchi@efficios.com>
2024-04-17gdbsupport, gdbserver, gdb: use -Wno-vla-cxx-extensionSimon Marchi4-0/+4
When building with clang 18, I see: CXX aarch64-linux-tdep.o /home/smarchi/src/binutils-gdb/gdb/aarch64-linux-tdep.c:1299:26: error: variable length arrays in C++ are a Clang extension [-Werror,-Wvla-cxx-extension] 1299 | gdb_byte za_zeroed[za_bytes]; | ^~~~~~~~ /home/smarchi/src/binutils-gdb/gdb/aarch64-linux-tdep.c:1299:26: note: read of non-const variable 'za_bytes' is not allowed in a constant expression /home/smarchi/src/binutils-gdb/gdb/aarch64-linux-tdep.c:1282:10: note: declared here 1282 | size_t za_bytes = std::pow (sve_vl_from_vg (svg), 2); | ^ Since we are using VLAs right now, that warning doesn't make sense for us. add `-Wno-vla-cxx-extension` to the list of warning flags we try to enable. If we ever choose to disallow VLAs, we can remove that flag. Change-Id: Ie41feafc50c343f6e75333d4f836ce32fbeb6d8c
2024-04-17Fix include guard typoMatt Wozniski1-1/+1
Signed-off-by: Matt Wozniski <godlygeek@gmail.com> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31645 Approved-By: Tom Tromey <tom@tromey.com>
2024-04-17gdb/record: minor clean, remove some unneeded argumentsAndrew Burgess3-6/+37
I spotted that the two functions: record_full_open_1 record_full_core_open_1 both took two arguments, neither of which are used. I stumbled onto this while reviewing how filename_completer is used. The 'record full restore' command uses filename_completer and invokes the cmd_record_full_restore function. The cmd_record_full_restore function calls core_file_command and then record_full_open, which then calls one of the above functions. As 'record full restore' takes a filename, this is passed to cmd_record_full_restore, which forwards the filename to both core_file_command and record_full_open. However, record_full_open never actually uses the filename that is passed in. The record_full_open function is also used for 'target record-full'. I propose that record_full_open should no longer expect to see any user supplied arguments passed in (it doesn't use any). In fact, I've added a check that if we do get any user supplied arguments we'll throw an error. Now that we know record_full_open isn't being passed any user arguments we can stop passing the arguments to record_full_open_1 and record_full_core_open_1, this will make no user visible difference as these arguments were not used. It is possible that a user was previously doing: (gdb) target record-full blah blah blah And this previously would work fine, the 'blah blah blah' was ignored. Now this will give an error. Other than this case there should be no user visible changes after this commit. Approved-By: Tom Tromey <tom@tromey.com>
2024-04-17gdb/record: add an assert in cmd_record_startAndrew Burgess1-1/+6
The 'record' command is both a prefix command AND an alias for 'target record-full'. As it is a prefix command, if a user types: (gdb) record blah Then GDB will look for, and try to invoke the 'blah' sub-command. This will either succeed (if blah is found) or throw an error (if blah is not found). As such, the only way to invoke the 'record' command is like: (gdb) record It is impossible to pass arguments to the 'record' command. As we know this is true then we can assert this in cmd_record_start. I added this assert because initially I was going to try forwarding ARGS from cmd_record_start to the 'target record-full' command, but then I realised passing arguments to 'record' was impossible. There should be no user visible changes after this commit. Approved-By: Tom Tromey <tom@tromey.com>
2024-04-17gdb/record: remove unnecessary use of filename_completerAndrew Burgess1-2/+0
Spotted that the 'record' command has its completer set to filename_completer. The problem is that 'record' is a prefix command, as such, its completer is hard-coded to complete on sub-commands. The attempt to use filename_completer is irrelevant. The 'record' command is itself a command though, that is, a user can do this: (gdb) record which is really just an alias for: (gdb) target record-full Nowhere does cmd_record_start (which is called when 'record' is used) expect a filename, and 'target record-full' doesn't expect a filename either. So lets just drop the line which sets filename_completer as the completer for 'record'. Approved-By: Tom Tromey <tom@tromey.com>
2024-04-17[gdb/testsuite] Require DW_LNE_end_sequenceTom de Vries4-0/+43
The dwarf standard requires that every line number program sequence ends with a DW_LNE_end_sequence instruction. Enforce this in the dwarf assembler for the last sequence in a line number program (we have no means to enforce this for earlier sequences), and fix a few test-case that don't have it. Tested on aarch64-linux. PR testsuite/31618 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31618
2024-04-17[gdb/testsuite] Fix end_sequence addressesTom de Vries10-11/+16
I noticed in test-case gdb.reverse/map-to-same-line.exp, that the end of main: ... 00000000004102c4 <end_of_sequence>: 4102c4: 52800000 mov w0, #0x0 // #0 4102c8: 9100c3ff add sp, sp, #0x30 4102cc: d65f03c0 ret ... is not described by the line table: ... File name Line number Starting address View Stmt ... map-to-same-line.c 54 0x4102ac x map-to-same-line.c - 0x4102c4 ... Fix this by ending the line table at $main_end. Likewise in a few other test-cases, found using: ... $ find gdb/testsuite/ -type f \ | xargs grep -B1 DW_LNE_end_sequence \ | grep set_address \ | egrep -v "_end|_len" ... Tested on aarch64-linux.
2024-04-17[gdb/testsuite] Require address update for DW_LNS_copyTom de Vries7-0/+16
No address update before a DW_LNS_copy might mean an incorrect dwarf assembly test-case. Try to catch such incorrect dwarf assembly test-cases by: - requiring an explicit address update for each DW_LNS_copy, and - handling the cases where an update is indeed not needed, by adding "DW_LNS_advance_pc 0". Tested on aarch64-linux.