rocket-tools/fsf-binutils-gdb.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message	Author	Files	Lines
2020-07-22	section_table_xfer_memory: Replace section name with callback predicate	Kevin Buettner	6	-14/+33
	This patch is motivated by the need to be able to select sections that section_table_xfer_memory_partial should consider for memory transfers. I'll use this facility in the next patch in this series. section_table_xfer_memory_partial() can currently be passed a section name which may be used to make name-based selections. This is similar to what I want to do, except that I want to be able to consider section flags instead of the name. I'm replacing the section name parameter with a predicate that, when passed a pointer to a target_section struct, will return true if that section should be further considered, or false if it shouldn't. I've converted the one existing use where a non-NULL section name is passed to section_table_xfer_memory_partial(). Instead of passing the section name, it now looks like this: auto match_cb = [=] (const struct target_section *s) { return (strcmp (section_name, s->the_bfd_section->name) == 0); }; return section_table_xfer_memory_partial (readbuf, writebuf, memaddr, len, xfered_len, table->sections, table->sections_end, match_cb); The other callers all passed NULL; they've been simplified somewhat in that they no longer need to pass NULL. gdb/ChangeLog: * exec.h (section_table_xfer_memory): Revise declaration, replacing section name parameter with an optional callback predicate. * exec.c (section_table_xfer_memory): Likewise. * bfd-target.c, exec.c, target.c, corelow.c: Adjust all callers of section_table_xfer_memory.
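The predicate idea described above can be sketched in isolation. This is only an illustration under stated assumptions: `section_sketch`, `section_predicate` and `select_sections` are hypothetical names, not GDB's types or functions, and the flag values are made up.

```cpp
#include <cassert>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for GDB's target_section; it carries only the
// fields this sketch needs.
struct section_sketch
{
  const char *name;
  unsigned flags;
};

// Optional predicate: an empty std::function means "consider every section".
using section_predicate = std::function<bool (const section_sketch *)>;

// Return pointers to the sections that the predicate accepts.
std::vector<const section_sketch *>
select_sections (const std::vector<section_sketch> &table,
                 const section_predicate &match_cb = nullptr)
{
  std::vector<const section_sketch *> selected;
  for (const section_sketch &s : table)
    if (!match_cb || match_cb (&s))
      selected.push_back (&s);
  return selected;
}
```

A caller that used to pass a name would pass a lambda comparing names, a caller that cares about flags would pass a lambda testing `s->flags`, and a caller that previously passed NULL would simply omit the argument.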
2020-07-22	Adjust corefile.exp test to show regression after bfd hack removal	Kevin Buettner	2	-0/+10
	In his review of my BZ 25631 patch series, Pedro was unable to reproduce the regression which should occur after patch #1, "Remove hack for GDB which sets the section size to 0", is applied. Pedro was using an ld version older than 2.30. Version 2.30 introduced the linker option -z separate-code. Here's what the man page has to say about it: Create separate code "PT_LOAD" segment header in the object. This specifies a memory segment that should contain only instructions and must be in wholly disjoint pages from any other data. In ld version 2.31, use of separate-code became the default for Linux/x86. So, really, 2.31 or later is required in order to see the regression that occurs in recent Linux distributions when only the bfd hack removal patch is applied. For the test case in question, use of the separate-code linker option means that the global variable "coremaker_ro" ends up in a separate load segment (though potentially with other read-only data). The upshot of this is that when only patch #1 is applied, GDB won't be able to correctly access coremaker_ro. The reason for this is due to the fact that this section will now have a non-zero size, but will not have contents from the core file to find this data. So GDB will ask BFD for the contents and BFD will respond with zeroes for anything from those sections. GDB should instead be looking in the executable for this data. Failing that, it can then ask BFD for a reasonable value. This is what a later patch in this series does. When using ld versions earlier than 2.31 (or 2.30 w/ the -z separate-code option explicitly provided to the linker), there is the possibility that coremaker_ro ends up being placed near other data which is recorded in the core file. That means that the correct value will end up in the core file, simply because it resides on a page that the kernel chooses to put in the core file. This is why Pedro wasn't able to reproduce the regression that should occur after fixing the BFD hack. 
This patch places a big chunk of memory, two pages worth on x86, in front of "coremaker_ro" to attempt to force it onto another page without requiring use of that new-fangled linker switch. Speaking of which, I considered changing the test to use -z separate-code, but this won't work because it didn't exist prior to version 2.30. The linker would probably complain of an unrecognized switch. Also, it likely won't be available in other linkers not based on current binutils. I.e. it probably won't work in FreeBSD, NetBSD, etc. To make this more concrete, this is what should happen when attempting to access coremaker_ro when only patch #1 is applied: Core was generated by `/mesquite2/sourceware-git/f28-coresegs/bld/gdb/testsuite/outputs/gdb.base/coref'. Program terminated with signal SIGABRT, Aborted. #0 0x00007f68205deefb in raise () from /lib64/libc.so.6 (gdb) p coremaker_ro $1 = 0 Note that this result is wrong; 201 should have been printed instead. But that's the point of the rest of the patch series. However, without this commit, or when using an old Linux distro with a pre-2.31 ld, this is what you might see instead: Core was generated by `/mesquite2/sourceware-git/f28-coresegs/bld/gdb/testsuite/outputs/gdb.base/coref'. Program terminated with signal SIGABRT, Aborted. #0 0x00007f63dd658efb in raise () from /lib64/libc.so.6 (gdb) p coremaker_ro $1 = 201 I.e. it prints the right answer, which sort of makes it seem like the rest of the series isn't required. Now, back to the patch itself... what should be the size of the memory chunk placed before coremaker_ro? It needs to be at least as big as the page size (PAGE_SIZE) from the kernel. For x86 and several other architectures this value is 4096. I used MAPSIZE which is defined to be 8192 in coremaker.c. So it's twice as big as what's currently needed for most Linux architectures. The constant PAGE_SIZE is available from <sys/user.h>, but this isn't portable either. 
In the end, it seemed simpler to just pick a value and hope that it's big enough. (Running a separate program which finds the page size via sysconf(_SC_PAGESIZE) and then passes it to the compilation via a -D switch seemed like overkill for a case which is rendered moot by recent linker versions.) Further information can be found here: https://sourceware.org/pipermail/gdb-patches/2020-May/168168.html https://sourceware.org/pipermail/gdb-patches/2020-May/168170.html Thanks to H.J. Lu for telling me about the '-z separate-code' linker switch. gdb/testsuite/ChangeLog: * gdb.base/coremaker.c (filler_ro): New global constant.
2020-07-22	Remove hack for GDB which sets the section size to 0	Kevin Buettner	2	-8/+4
	This commit removes a hack for GDB which was introduced in 2007. See: https://sourceware.org/ml/binutils/2007-08/msg00044.html That hack mostly allowed GDB's handling of core files to continue to work without any changes to GDB. The problem with setting the section size to zero is that GDB won't know how big that section is/was. Often, this doesn't matter because the data in question are found in the exec file. But it can happen that the section describes memory that had been allocated, but never written to. In this instance, the contents of that memory region are not written to the core file. Also, since the region in question was dynamically allocated, it won't appear in the exec file. We don't want these regions to appear as inaccessible to GDB (since they were accessible when the process was live), so it's important that GDB know the size of the region. I've made changes to GDB which correctly handle this case. When attempting to access memory, GDB will first consider core file data for which both SEC_ALLOC and SEC_HAS_CONTENTS are set. Next, if that fails, GDB will attempt to find the data in the exec file. Finally, if that also fails, GDB will attempt to access memory in the sections which are flagged as SEC_ALLOC, but not SEC_HAS_CONTENTS. bfd/ChangeLog: * elf.c (_bfd_elf_make_section_from_phdr): Remove hack for GDB.
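The three-step lookup order described above can be modelled roughly as follows. This is a sketch, not GDB's code: `region`, `read_from`, `read_byte` and the `SKETCH_*` flag values are hypothetical, and the third step is modelled as reading zero, on the assumption that a region with no stored contents reads back as zeroes.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical flag values; BFD's real SEC_ALLOC and SEC_HAS_CONTENTS
// are defined differently.
constexpr unsigned SKETCH_ALLOC = 1u << 0;
constexpr unsigned SKETCH_HAS_CONTENTS = 1u << 1;

struct region
{
  std::uint64_t start;
  std::uint64_t size;
  unsigned flags;
  std::uint8_t fill;   // Byte value returned for any address in the region.
};

// Find a region containing ADDR whose flags include all of REQUIRED and
// none of FORBIDDEN, and return its byte.
static std::optional<std::uint8_t>
read_from (const std::vector<region> &regions, std::uint64_t addr,
           unsigned required, unsigned forbidden)
{
  for (const region &r : regions)
    if (addr >= r.start && addr - r.start < r.size
        && (r.flags & required) == required
        && (r.flags & forbidden) == 0)
      return r.fill;
  return std::nullopt;
}

std::optional<std::uint8_t>
read_byte (const std::vector<region> &core, const std::vector<region> &exec,
           std::uint64_t addr)
{
  // 1. Core file data flagged both ALLOC and HAS_CONTENTS.
  if (auto b = read_from (core, addr, SKETCH_ALLOC | SKETCH_HAS_CONTENTS, 0))
    return b;
  // 2. The exec file.
  if (auto b = read_from (exec, addr, 0, 0))
    return b;
  // 3. Core regions that were allocated but have no contents; modelled
  //    here as zero-filled (an assumption of this sketch).
  if (read_from (core, addr, SKETCH_ALLOC, SKETCH_HAS_CONTENTS))
    return std::uint8_t {0};
  return std::nullopt;
}
```

The ordering matters: if step 3 ran before step 2, read-only data that lives only in the executable would read back as zero instead of its real value.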
2020-07-22	Fix crash in -stack-list-arguments	Tom Tromey	7	-3/+148
	-stack-list-arguments will crash when stopped in an Ada procedure that has an argument with a certain name ("_objectO" -- which can only be generated by the compiler). The bug occurs because lookup_symbol will fail in this case. This patch changes -stack-list-arguments to mirror what is done with arguments elsewhere. (As an aside, I don't understand why this lookup is even needed, but I assume it is some stabs thing?) In the longer term I think it would be good to share this code between MI and the CLI. However, due to the upcoming release, I preferred a more local fix. gdb/ChangeLog 2020-07-22 Tom Tromey <tromey@adacore.com> * mi/mi-cmd-stack.c (list_args_or_locals): Use lookup_symbol_search_name. gdb/testsuite/ChangeLog 2020-07-22 Tom Tromey <tromey@adacore.com> * gdb.ada/mi_prot.exp: New file. * gdb.ada/mi_prot/pkg.adb: New file. * gdb.ada/mi_prot/pkg.ads: New file. * gdb.ada/mi_prot/prot.adb: New file.
2020-07-22	libctf: fixes for systems on which sizeof (void *) > sizeof (long)	Nick Alcock	4	-8/+21
	Systems like mingw64 have pointers that can only be represented by 'long long'. Consistently cast integers stored in pointers through uintptr_t to cater for this. libctf/ * ctf-create.c (ctf_dtd_insert): Add uintptr_t casts. (ctf_dtd_delete): Likewise. (ctf_dtd_lookup): Likewise. (ctf_rollback): Likewise. * ctf-hash.c (ctf_hash_lookup_type): Likewise. * ctf-types.c (ctf_lookup_by_rawhash): Likewise.
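The casting pattern mentioned above can be illustrated with a small sketch. The helper names `id_to_ptr` and `ptr_to_id` are hypothetical and are not libctf functions; the point is only that the integer goes through `uintptr_t` on its way into and out of the pointer.

```cpp
#include <cassert>
#include <cstdint>

// Store an integer in a pointer-sized slot.  Going through uintptr_t keeps
// the conversion well-defined in width even where long is narrower than a
// pointer (as on 64-bit Windows targets).
inline void *
id_to_ptr (unsigned long id)
{
  return reinterpret_cast<void *> (static_cast<std::uintptr_t> (id));
}

// Recover the integer that id_to_ptr stored.
inline unsigned long
ptr_to_id (const void *p)
{
  return static_cast<unsigned long> (reinterpret_cast<std::uintptr_t> (p));
}
```

Casting directly between `long` and `void *` tends to draw size-mismatch warnings on such targets; the intermediate `uintptr_t` avoids that.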
2020-07-22	libctf: fix isspace casts	Nick Alcock	2	-3/+7
	isspace() notoriously takes an int, not a char. Cast uses appropriately. libctf/ * ctf-lookup.c (ctf_lookup_by_name): Adjust.
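As a brief illustration of the cast in question: `isspace` expects an `int` whose value is representable as `unsigned char` (or EOF), so a plain `char` that happens to be negative must be converted first. The function `skip_leading_space` below is a hypothetical helper, not part of libctf.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Return the index of the first non-whitespace character in S.
std::size_t
skip_leading_space (const std::string &s)
{
  std::size_t i = 0;
  // Convert to unsigned char before calling isspace: passing a negative
  // char value (other than EOF) is undefined behaviour.
  while (i < s.size () && std::isspace (static_cast<unsigned char> (s[i])))
    ++i;
  return i;
}
```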
2020-07-22	libctf, binutils: fix big-endian libctf archive opening	Nick Alcock	2	-1/+6
	The recent commit "libctf, binutils: support CTF archives like objdump" broke opening of CTF archives on big-endian platforms. This didn't affect anyone much before now because the linker never emitted CTF archives because it wasn't detecting ambiguous types properly: now it does, and this bug becomes obvious. Fix trivial. libctf/ * ctf-archive.c (ctf_arc_bufopen): Endian-swap the archive magic number if needed.
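One endian-independent way to check a magic number is sketched below. It assumes the magic is stored on disk as a little-endian 64-bit value; the constant `SKETCH_MAGIC` and the helpers `read_le64` and `has_magic` are hypothetical and do not reproduce the actual fix in ctf_arc_bufopen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical magic value, chosen only so the byte order is visible.
constexpr std::uint64_t SKETCH_MAGIC = 0x0123456789abcdefULL;

// Decode eight little-endian bytes into a host integer.  Because this works
// byte by byte, it gives the same answer on big- and little-endian hosts.
std::uint64_t
read_le64 (const unsigned char *p)
{
  std::uint64_t v = 0;
  for (int i = 7; i >= 0; --i)
    v = (v << 8) | p[i];
  return v;
}

// True if BUF is long enough and starts with the magic number.
bool
has_magic (const unsigned char *buf, std::size_t len)
{
  return len >= 8 && read_le64 (buf) == SKETCH_MAGIC;
}
```

Comparing the raw in-memory field against the constant without any conversion is the kind of check that only works when the host byte order matches the file's.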
2020-07-22	ld, testsuite: do not run CTF tests at all on non-ELF for now	Nick Alcock	2	-0/+9
	Right now, the linker is not emitting CTF sections on (at least some) non-ELF platforms, because work similar to that done for ELF needs to be done to each platform in turn to emit linker-generated sections whose contents are programmatically derived. (Or something better needs to be done.) So, for now, the CTF tests will fail on non-ELF for lack of a .ctf section in the output: so skip the CTF tests there temporarily. (This is not the same as the permanent skip of the diags tests, which is done because the input for those is assembler that depends on the ELF syntax of pseudos like .section: this is only a temporary skip, until the linker grows support for CTF on more targets.) ld/ * testsuite/ld-ctf/ctf.exp: Skip on non-ELF for now.
2020-07-22	ld: do not produce one empty output .ctf section for every input .ctf	Nick Alcock	2	-1/+9
	The trick we use to prevent ld doing as it does for almost all other sections and copying the input CTF section into the output has recently broken, causing output to be produced with a valid CTF section followed by massive numbers of CTF sections, one per .ctf in the input (minus one, for the one that was filled out by ctf_link). Their size is being forcibly set to zero, but they're still present, wasting space and looking ridiculous. This is not right: ld/ld-new : section size addr .interp 28 4194984 [...] .bss 21840 6788544 .comment 92 0 .ctf 87242 0 .ctf 0 0 .ctf 0 0 [snip 131 more empty sections] .gnu.build.attributes 7704 6818576 .debug_aranges 6592 0 .debug_info 4488859 0 .debug_abbrev 150099 0 .debug_line 796759 0 .debug_str 237926 0 .debug_loc 2247302 0 .debug_ranges 237920 0 Total 10865285 The fix is to exclude these unwanted input sections from being present in the output. We tried this before and it broke things, because if you exclude all the .ctf sections there isn't going to be one in the output so there is nowhere to put the deduplicated CTF. The solution to that is really simple: set SEC_EXCLUDE on all but one CTF section. We don't care which one (they're all the same once their size has been zeroed), so just pick the first we see. ld/ * ldlang.c (ldlang_open_ctf): Set SEC_EXCLUDE on all but the first input .ctf section.
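The "all but the first" rule can be sketched as a single pass over the input sections. The types and the `SKETCH_EXCLUDE` value here are hypothetical stand-ins, not ld's data structures or BFD's SEC_EXCLUDE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flag bit standing in for SEC_EXCLUDE.
constexpr unsigned SKETCH_EXCLUDE = 1u << 0;

struct input_section
{
  std::string name;
  unsigned flags;
};

// Mark every .ctf input section except the first one seen as excluded,
// and return how many were marked.
std::size_t
exclude_all_but_first_ctf (std::vector<input_section> &sections)
{
  bool kept_one = false;
  std::size_t excluded = 0;
  for (input_section &s : sections)
    {
      if (s.name != ".ctf")
        continue;
      if (!kept_one)
        {
          // Keep one section so the deduplicated output has somewhere to go.
          kept_one = true;
          continue;
        }
      s.flags |= SKETCH_EXCLUDE;
      ++excluded;
    }
  return excluded;
}
```

Excluding every .ctf section would reproduce the earlier breakage the message mentions, since no output section would remain to hold the deduplicated CTF.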
2020-07-22	ld, testsuite: only run CTF tests when ld and GCC support CTF	Nick Alcock	7	-6/+91
	The CTF testsuite runs GCC to generate CTF that it knows matches the input .c files before doing a run_dump_test over it. So we need a GCC capable of doing that, and we need to always avoid running those tests if libctf was disabled because the linker will never be capable of it. ld/ * configure.ac (enable_libctf): Substitute it. * Makefile.am (enablings.exp): New. (EXTRA_DEJAGNU_SITE_CONFIG): Add it. (DISTCLEANFILES): Likewise. * Makefile.in: Regenerate. * configure: Likewise. * testsuite/lib/ld-lib.exp (compile_one_cc): New. (check_ctf_available): Likewise. (skip_ctf_tests): Likewise. * testsuite/ld-ctf/ctf.exp: Call skip_ctf_tests.
2020-07-22	ld: new CTF testsuite	Egeyar Bagcioglu	74	-0/+1833
	Uses the new cc option to run_dump_test to compile most tests from C code, ensuring that the types in the C code accurately describe what the .d file is testing. (Some tests, mostly those testing malformed CTF, run directly from .s, or include both .s and .c.) ld/ * testsuite/ld-ctf/ctf.exp: New file. * testsuite/ld-ctf/A-2.c: New file. * testsuite/ld-ctf/A.c: New file. * testsuite/ld-ctf/B-2.c: New file. * testsuite/ld-ctf/B.c: New file. * testsuite/ld-ctf/C-2.c: New file. * testsuite/ld-ctf/C.c: New file. * testsuite/ld-ctf/array-char.c: New file. * testsuite/ld-ctf/array-int.c: New file. * testsuite/ld-ctf/array.d: New file. * testsuite/ld-ctf/child-float.c: New file. * testsuite/ld-ctf/child-int.c: New file. * testsuite/ld-ctf/conflicting-cycle-1.B-1.d: New file. * testsuite/ld-ctf/conflicting-cycle-1.B-2.d: New file. * testsuite/ld-ctf/conflicting-cycle-1.parent.d: New file. * testsuite/ld-ctf/conflicting-cycle-2.A-1.d: New file. * testsuite/ld-ctf/conflicting-cycle-2.A-2.d: New file. * testsuite/ld-ctf/conflicting-cycle-2.parent.d: New file. * testsuite/ld-ctf/conflicting-cycle-3.C-1.d: New file. * testsuite/ld-ctf/conflicting-cycle-3.C-2.d: New file. * testsuite/ld-ctf/conflicting-cycle-3.parent.d: New file. * testsuite/ld-ctf/conflicting-enums.d: New file. * testsuite/ld-ctf/conflicting-typedefs.d: New file. * testsuite/ld-ctf/cross-tu-1.c: New file. * testsuite/ld-ctf/cross-tu-2.c: New file. * testsuite/ld-ctf/cross-tu-conflicting-2.c: New file. * testsuite/ld-ctf/cross-tu-cyclic-1.c: New file. * testsuite/ld-ctf/cross-tu-cyclic-2.c: New file. * testsuite/ld-ctf/cross-tu-cyclic-3.c: New file. * testsuite/ld-ctf/cross-tu-cyclic-4.c: New file. * testsuite/ld-ctf/cross-tu-cyclic-conflicting.d: New file. * testsuite/ld-ctf/cross-tu-cyclic-nonconflicting.d: New file. * testsuite/ld-ctf/cross-tu-into-cycle.d: New file. * testsuite/ld-ctf/cross-tu-noncyclic.d: New file. * testsuite/ld-ctf/cycle-1.c: New file. * testsuite/ld-ctf/cycle-1.d: New file. 
* testsuite/ld-ctf/cycle-2.A.d: New file. * testsuite/ld-ctf/cycle-2.B.d: New file. * testsuite/ld-ctf/cycle-2.C.d: New file. * testsuite/ld-ctf/diag-ctf-version-0.d: New file. * testsuite/ld-ctf/diag-ctf-version-0.s: New file. * testsuite/ld-ctf/diag-ctf-version-2-unsupported-feature.d: New file. * testsuite/ld-ctf/diag-ctf-version-2-unsupported-feature.s: New file. * testsuite/ld-ctf/diag-ctf-version-f.d: New file. * testsuite/ld-ctf/diag-ctf-version-f.s: New file. * testsuite/ld-ctf/diag-cttname-invalid.d: New file. * testsuite/ld-ctf/diag-cttname-invalid.s: New file. * testsuite/ld-ctf/diag-cttname-null.d: New file. * testsuite/ld-ctf/diag-cttname-null.s: New file. * testsuite/ld-ctf/diag-cuname.d: New file. * testsuite/ld-ctf/diag-cuname.s: New file. * testsuite/ld-ctf/diag-decompression-failure.d: New file. * testsuite/ld-ctf/diag-decompression-failure.s: New file. * testsuite/ld-ctf/diag-parlabel.d: New file. * testsuite/ld-ctf/diag-parlabel.s: New file. * testsuite/ld-ctf/diag-parname.d: New file. * testsuite/ld-ctf/diag-parname.s: New file. * testsuite/ld-ctf/diag-unsupported-flag.d: New file. * testsuite/ld-ctf/diag-unsupported-flag.s: New file. * testsuite/ld-ctf/diag-wrong-magic-number-mixed.d: New file. * testsuite/ld-ctf/diag-wrong-magic-number.d: New file. * testsuite/ld-ctf/diag-wrong-magic-number.s: New file. * testsuite/ld-ctf/enum-2.c: New file. * testsuite/ld-ctf/enum.c: New file. * testsuite/ld-ctf/function.c: New file. * testsuite/ld-ctf/function.d: New file. * testsuite/ld-ctf/slice.c: New file. * testsuite/ld-ctf/slice.d: New file. * testsuite/ld-ctf/super-sub-cycles.c: New file. * testsuite/ld-ctf/super-sub-cycles.d: New file. * testsuite/ld-ctf/typedef-int.c: New file. * testsuite/ld-ctf/typedef-long.c: New file. * testsuite/ld-ctf/union-1.c: New file.
2020-07-22	binutils, testsuite: allow compilation before doing run_dump_test	Nick Alcock	2	-5/+58
	The CTF assembler emitted by GCC has architecture-dependent pseudos in it, and is (obviously) tightly tied to a particular set of C source files with specific types in them. The CTF tests do run_dump_test on some candidate input, link it using the run_dump_test ld machinery, and compare objdump --ctf output. To avoid skew, we'd like to be able to easily regenerate the .s being scanned so that the .c doesn't get out of sync with it, but since GCC emits arch-dependent pseudos, we are forced to hand-hack the output every time (quite severely on some arches, like x86-32 and -64, where every single pseudo used is not only arch-dependent but undocumented). To avoid this, teach run_dump_test how to optionally compile things given new, optional additional flags passed in in the cc option. Only sources with the .c suffix are compiled, so there is no effect on any existing tests. The .s files go into the tmpdir, from which existing run_dump_test code picks them up as usual. binutils/ * testsuite/lib/binutils-common.exp (run_dump_test): Add 'cc' option.
2020-07-22	ld: new options --ctf-variables and --ctf-share-types	Nick Alcock	7	-1/+107
	libctf recently changed to make it possible to not emit the CTF variables section. Make this the default for ld: the variables section is a simple name -> type mapping, and the names can be quite voluminous. Nothing in the variables section appears in the symbol table, by definition, so GDB cannot make use of them: special-purpose projects that implement their own analogues of symbol table lookup can do so, but they'll need to tell the linker to emit the variables section after all. The new --ctf-variables option does this. The --ctf-share-types option (valid values "share-duplicated" and "share-unconflicted") allows the caller to specify the CTF link mode. Most users will want share-duplicated, since it allows for more convenient debugging: but very large projects composed of many decoupled components may want to use share-unconflicted mode, which places types that appear in only one TU into per-TU dicts. (They may also want to relink the CTF using the ctf_link API and cu-mapping, to make their "components" larger than a single TU. Right now the linker does not expose the CU-mapping machinery. Perhaps it should in future to make this use case easier.) For now, giving the linker the ability to emit share-duplicated CTF lets us add testcases for that mode to the testsuite. ld/ * ldlex.h (option_values) <OPTION_CTF_VARIABLES, OPTION_NO_CTF_VARIABLES, OPTION_CTF_SHARE_TYPES>: New. * ld.h (ld_config_type) <ctf_variables, ctf_share_duplicated>: New fields. * ldlang.c (lang_merge_ctf): Use them. * lexsup.c (ld_options): Add ctf-variables, no-ctf-variables, ctf-share-types. (parse_args) <OPTION_CTF_VARIABLES, OPTION_NO_CTF_VARIABLES, OPTION_CTF_SHARE_TYPES>: New cases. * ld.texi: Document new options. * NEWS: Likewise.
2020-07-22	ld: Reformat CTF errors into warnings.	Egeyar Bagcioglu	2	-10/+19
	ld/ * ldlang.c (lang_merge_ctf): Turn errors into warnings. Fix a comment typo. (lang_write_ctf): Turn an error into a warning. (ldlang_open_ctf): Reformat warnings. Fix printing file names. Reviewed-by: Nick Alcock <nick.alcock@oracle.com>
2020-07-22	binutils: objdump: ctf: drop incorrect linefeeds	Nick Alcock	2	-4/+9
	The CTF objdumping code is adding linefeeds in calls to non_fatal, which is wrong and looks ugly. binutils/ * objdump.c (dump_ctf_archive_member): Remove linefeeds. (dump_ctf): Likewise.
2020-07-22	libctf, link: tie in the deduplicating linker	Nick Alcock	6	-2/+701
	This fairly intricate commit connects up the CTF linker machinery (which operates in terms of ctf_archive_t's on ctf_link_inputs -> ctf_link_outputs) to the deduplicator (which operates in terms of arrays of ctf_file_t's, all the archives exploded). The nondeduplicating linker is retained, but is not called unless the CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have confidence in the much-more-complex deduplicating linker, I hope the nondeduplicating linker can be removed. In brief, this traverses each input archive in ctf_link_inputs, opening every member (if not already open) and tying child dicts to their parents, shoving them into an array and constructing a corresponding parents array that tells the deduplicator which dict is the parent of which child. We then call ctf_dedup and ctf_dedup_emit with that array of inputs, taking the outputs that result and putting them into ctf_link_outputs where the rest of the CTF linker expects to find them, then linking in the variables just as is done by the nondeduplicating linker. It also implements much of the CU-mapping side of things. The problem CU-mapping introduces is that if you map many input CUs into one output, this is saying that you want many translation units to produce at most one child dict if conflicting types are found in any of them. This means you can suddenly have multiple distinct types with the same name in the same dict, which libctf cannot really represent because it's not something you can do with C translation units. The deduplicator machinery already committed does as best it can with these, hiding types with conflicting names rather than making child dicts out of them: but we still need to call it. This is done similarly to the main link, taking the inputs (one CU output at a time), deduplicating them, taking the output and making it an input to the final link.
Two (significant) optimizations are done: we share atoms tables between all these links and the final link (so e.g. all type hash values are shared, all decorated type names, etc); and any CU-mapped link with only one input (and no child dicts) doesn't need to do anything other than rename the CU: the CU-mapped link phase can be skipped for it. Put together, large CU-mapped links can save 50% of their memory usage and about as much time (and the memory usage for CU-mapped links is significant, because all those output CUs have to have all their types stored in memory all at once). include/ * ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the deduplicator. libctf/ * ctf-impl.h (ctf_list_splice): New. * ctf-util.h (ctf_list_splice): Likewise. * ctf-link.c (link_sort_inputs_cb_arg_t): Likewise. (ctf_link_sort_inputs): Likewise. (ctf_link_deduplicating_count_inputs): Likewise. (ctf_link_deduplicating_open_inputs): Likewise. (ctf_link_deduplicating_close_inputs): Likewise. (ctf_link_deduplicating_variables): Likewise. (ctf_link_deduplicating_per_cu): Likewise. (ctf_link_deduplicating): Likewise. (ctf_link): Call it.
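The "explode the archives into an inputs array plus a parallel parents array" step might look roughly like the sketch below. Everything in it is hypothetical: the types, the convention that the first member of each archive is the parent, and the convention that a parent is recorded as its own parent are assumptions of this illustration, not libctf's actual encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct dict_sketch
{
  std::string name;
};

// Assumption of this sketch: members[0] is the parent dict and the
// remaining members are its children.
struct archive_sketch
{
  std::vector<dict_sketch> members;
};

struct flat_inputs
{
  std::vector<const dict_sketch *> inputs;
  std::vector<std::size_t> parents;   // parents[i] is an index into inputs.
};

// Flatten every archive member into one array, recording for each entry
// the index of its parent (a parent refers to itself).
flat_inputs
flatten (const std::vector<archive_sketch> &archives)
{
  flat_inputs out;
  for (const archive_sketch &arc : archives)
    {
      if (arc.members.empty ())
        continue;
      const std::size_t parent_idx = out.inputs.size ();
      for (const dict_sketch &member : arc.members)
        {
          out.inputs.push_back (&member);
          out.parents.push_back (parent_idx);
        }
    }
  assert (out.inputs.size () == out.parents.size ());
  return out;
}
```

Keeping the parent relationship as indices in a parallel array lets a consumer work over one flat array without needing to know which archive each dict came from.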
2020-07-22	libctf, link: add CTF_LINK_OMIT_VARIABLES_SECTION	Nick Alcock	4	-1/+14
	This flag (not used anywhere yet) causes the variables section to be omitted from the output CTF dict. include/ * ctf-api.h (CTF_LINK_OMIT_VARIABLES_SECTION): New. libctf/ * ctf-link.c (ctf_link_one_input_archive_member): Check CTF_LINK_OMIT_VARIABLES_SECTION.
2020-07-22	libctf, dedup: add deduplicator	Nick Alcock	14	-24/+3401
	This adds the core deduplicator that the ctf_link machinery calls (possibly repeatedly) to link the CTF sections: it takes an array of input ctf_file_t's and another array that indicates which entries in the input array are parents of which other entries, and returns an array of outputs. The first output is always the ctf_file_t on which ctf_link/ctf_dedup/etc was called: the other outputs are child dicts that have the first output as their parent. include/ * ctf-api.h (CTF_LINK_SHARE_DUPLICATED): No longer unimplemented. libctf/ * ctf-impl.h (ctf_type_id_key): New, the key in the cd_id_to_file_t. (ctf_dedup): New, core deduplicator state. (ctf_file_t) <ctf_dedup>: New. <ctf_dedup_atoms>: New. <ctf_dedup_atoms_alloc>: New. (ctf_hash_type_id_key): New prototype. (ctf_hash_eq_type_id_key): Likewise. (ctf_dedup_atoms_init): Likewise. * ctf-hash.c (ctf_hash_eq_type_id_key): New. (ctf_dedup_atoms_init): Likewise. * ctf-create.c (ctf_serialize): Adjusted. (ctf_add_encoded): No longer static. (ctf_add_reftype): Likewise. * ctf-open.c (ctf_file_close): Destroy the ctf_dedup_atoms_alloc. * ctf-dedup.c: New file. * ctf-decls.h [!HAVE_DECL_STPCPY]: Add prototype. * configure.ac: Check for stpcpy. * Makefile.am: Add it. * Makefile.in: Regenerate. * config.h.in: Regenerate. * configure: Regenerate.
2020-07-22	libctf, dedup: add new configure option --enable-libctf-hash-debugging	Nick Alcock	7	-2/+65
	Add a new debugging configure option, --enable-libctf-hash-debugging, off by default, which lets you configure in expensive internal consistency checks and enable the printing of debugging output when LIBCTF_DEBUG=t before type deduplication has happened. In this commit we just add the option and cause it to turn ctf_assert into a real, hard assert for easier debugging. libctf/ * configure.ac: Add --enable-libctf-hash-debugging. * aclocal.m4: Pull in enable.m4, for GCC_ENABLE. * Makefile.in: Regenerated. * configure: Likewise. * config.h.in: Likewise. * ctf-impl.h [ENABLE_LIBCTF_HASH_DEBUGGING] (ctf_assert): Define to assert.
2020-07-22	libctf: add SHA-1 support for libctf	Nick Alcock	6	-12/+139
	This very thin abstraction layer provides SHA-1ing facilities to all of libctf, almost all inlined wrappers around the libiberty functionality other than ctf_sha1_fini. The deduplicator will use this to recursively hash types to prove their identity. libctf/ * ctf-sha1.h: New, inline wrappers around sha1_init_ctx and sha1_process_bytes. * ctf-impl.h: Include it. (ctf_sha1_init): New. (ctf_sha1_add): Likewise. (ctf_sha1_fini): Likewise. * ctf-sha1.c: New, non-inline wrapper around sha1_finish_ctx producing strings. * Makefile.am: Add file. * Makefile.in: Regenerate.
2020-07-22	libctf, link: add the ability to filter out variables from the link	Nick Alcock	7	-1/+48
	The CTF variables section (containing variables that have no corresponding symtab entries) can cause the string table to get very voluminous if the names of variables are long. Some callers want to filter out particular variables they know they won't need. So add a "variable filter" callback that does that: it's passed the name of the variable and a corresponding ctf_file_t / ctf_id_t pair, and should return 1 to filter it out. ld doesn't use this machinery yet, but we could easily add it later if desired. (But see later for a commit that turns off CTF variable- section linking in ld entirely by default.) include/ * ctf-api.h (ctf_link_variable_filter_t): New. (ctf_link_set_variable_filter): Likewise. libctf/ * libctf.ver (ctf_link_set_variable_filter): Add. * ctf-impl.h (ctf_file_t) <ctf_link_variable_filter>: New. <ctf_link_variable_filter_arg>: Likewise. * ctf-create.c (ctf_serialize): Adjust. * ctf-link.c (ctf_link_set_variable_filter): New, set it. (ctf_link_one_variable): Call it if set.
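The filter convention described above (a callback that returns 1 to drop a variable) can be sketched as follows. The signature is deliberately simplified: the real callback also receives a ctf_file_t / ctf_id_t pair and a user argument, which this hypothetical `variable_filter` omits, and the sketch treats any nonzero return as "filter out".

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified filter: returns nonzero to drop the variable.
using variable_filter = std::function<int (const std::string &name)>;

// Return the variable names that survive the filter.  An empty filter
// keeps everything.
std::vector<std::string>
link_variables (const std::vector<std::string> &names,
                const variable_filter &filter = nullptr)
{
  std::vector<std::string> kept;
  for (const std::string &name : names)
    {
      if (filter && filter (name) != 0)
        continue;   // The caller asked for this one to be left out.
      kept.push_back (name);
    }
  return kept;
}
```

A caller that knows a naming pattern for variables it will never look up can pass a filter matching that pattern and so keep those names out of the string table.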
2020-07-22	libctf, link: fix spurious conflicts of variables in the variable section	Nick Alcock	2	-1/+6
	When we link a CTF variable, we check to see if it already exists in the parent dict first: if it does, and it has a type the same as the type we would populate it with, we assume we don't need to do anything: otherwise, we populate it in a per-CU child. Or that's what we should be doing. Instead, we check if the type is the same as the type in source dict, which is going to be a completely different value! So we end up concluding all variables are conflicting, bloating up output possibly quite a lot (variables aren't big in and of themselves, but each drags around a strtab entry, and CTF dicts in a CTF archive do not share their strtabs -- one of many problems with CTF archives as presently constituted.) Fix trivial: check the right type. libctf/ * ctf-link.c (ctf_link_one_variable): Check the dst_type for conflicts, not the source type.
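The corrected check can be illustrated with a small model. The names below are hypothetical, and the `add_to_parent` outcome for a variable that is not yet in the parent is an assumption of this sketch; the message itself only spells out the "already present with the same type" and "per-CU child" cases.

```cpp
#include <cassert>
#include <map>
#include <string>

using type_id = long;

enum class placement
{
  already_in_parent,   // Same name and same destination type: nothing to do.
  add_to_parent,       // Assumed behaviour when the name is not present yet.
  per_cu_child         // Name exists with a different type: conflict.
};

// PARENT_VARS maps variable names to type IDs in the *destination* dict.
// DST_TYPE must therefore also be a destination-dict type ID; comparing
// against the source dict's ID would almost never match.
placement
place_variable (const std::map<std::string, type_id> &parent_vars,
                const std::string &name, type_id dst_type)
{
  auto it = parent_vars.find (name);
  if (it == parent_vars.end ())
    return placement::add_to_parent;
  return it->second == dst_type ? placement::already_in_parent
                                : placement::per_cu_child;
}
```

Passing the source dict's type ID by mistake makes an identical variable look like a conflict, which is the bloat the message describes.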
2020-07-22	libctf, link: redo cu-mapping handling	Nick Alcock	7	-32/+135
	Now a bunch of stuff that doesn't apply to ld or any normal use of libctf, piled into one commit so that it's easier to ignore. The cu-mapping machinery associates incoming compilation unit names with outgoing names of CTF dictionaries that should correspond to them, for non-gdb CTF consumers that would like to group multiple TUs into a single child dict if conflicting types are found in it (the existing use case is one kernel module, one child CTF dict, even if the kernel module is composed of multiple CUs). The upcoming deduplicator needs to track not only the mapping from incoming CU name to outgoing dict name, but the inverse mapping from outgoing dict name to incoming CU name, so it can work over every CTF dict we might see in the output and link into it. So rejig the ctf-link machinery to do that. Simultaneously (because they are closely associated and were written at the same time), we add a new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the ctf_link machinery to create empty child dicts for each outgoing CU mapping even if no CUs that correspond to it exist in the link. This is a bit (OK, quite a lot) of a waste of space, but some existing consumers require it. (Nobody else should use it.) Its value is not consecutive with existing CTF_LINK flag values because we're about to add more flags that are conceptually closer to the existing ones than this one is. include/ * ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New. libctf/ * ctf-impl.h (ctf_file_t): Improve comments. <ctf_link_cu_mapping>: Split into... <ctf_link_in_cu_mapping>: ... this... <ctf_link_out_cu_mapping>: ... and this. * ctf-create.c (ctf_serialize): Adjust. * ctf-open.c (ctf_file_close): Likewise. * ctf-link.c (ctf_create_per_cu): Look things up in the in_cu_mapping instead of the cu_mapping. (ctf_link_add_cu_mapping): The deduplicating link will define what happens if many FROMs share a TO. (ctf_link_add_cu_mapping): Create in_cu_mapping and out_cu_mapping. 
Do not create ctf_link_outputs here any more, or create per-CU dicts here: they are already created when needed. (ctf_link_one_variable): Log a debug message if we skip a variable due to its type being concealed in a CU-mapped link. (This is probably too common a case to make into a warning.) (ctf_link): Create empty per-CU dicts if requested.
2020-07-22	libctf, link: fix ctf_link_write fd leak	Nick Alcock	2	-0/+5
	We were leaking the fd on every invocation. libctf/ * ctf-link.c (ctf_link_write): Close the fd.
2020-07-22	libctf, link: add lazy linking: clean up input members: err/warn cleanup	Nick Alcock	9	-133/+611
	This rather large and intertwined pile of changes does three things: First, it transitions from dprintf to ctf_err_warn for things the user might care about: this one file is the major impetus for the ctf_err_warn infrastructure, because things like file names are crucial in linker error messages, and errno values are utterly incapable of communicating them. Second, it stabilizes the ctf_link APIs: you can now call ctf_link_add_ctf without a CTF argument (only a NAME), to lazily ctf_open the file with the given NAME when needed, and close it as soon as possible, to save memory. This is not an API change because a null CTF argument was prohibited before now. Since getting CTF directly from files uses ctf_open, passing in only a NAME requires use of libctf, not libctf-nobfd. The linker's behaviour is unchanged, as it still passes in a ctf_archive_t as before. This also let us fix a leak: we were opening ctf_archives and their containing ctf_files, then only closing the files and leaving the archives open. Third, this commit restructures the ctf_link_in_member argument used by the CTF linking machinery and adjusts its users accordingly. We drop two members: - arcname, which is difficult to construct and then only used in error messages (that were only dprintf()ed, so never seen!) - share_mode, since we store the flags passed to ctf_link (including the share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold of it We rename others whose existing names were fairly dreadful: - done_main_member -> done_parent, using consistent terminology for .ctf as the parent of all archive members - main_input_fp -> in_fp_parent, likewise - file_name -> in_file_name, likewise We add one new member, cu_mapped. Finally, we move the various frees of things like mapping table data to the top-level ctf_link, since deduplicating links will want to do that too. include/ * ctf-api.h (ECTF_NEEDSBFD): New. (ECTF_NERR): Adjust. (ctf_link): Rename share_mode arg to flags.
libctf/ * Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere. * Makefile.in: Regenerated. * ctf-impl.h (ctf_link_input_name): New. (ctf_file_t) <ctf_link_flags>: New. * ctf-create.c (ctf_serialize): Adjust accordingly. * ctf-link.c: Define ctf_open as weak when PIC. (ctf_arc_close_thunk): Remove unnecessary thunk. (ctf_file_close_thunk): Likewise. (ctf_link_input_name): New. (ctf_link_input_t): New value of the ctf_file_t.ctf_link_input. (ctf_link_input_close): Adjust accordingly. (ctf_link_add_ctf_internal): New, split from... (ctf_link_add_ctf): ... here. Return error if lazy loading of CTF is not possible. Change to just call... (ctf_link_add): ... this new function. (ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the ctf_file_close_thunk. (ctf_link_in_member_cb_arg_t) <file_name> Rename to... <in_file_name>: ... this. <arcname>: Drop. <share_mode>: Likewise (migrated to ctf_link_flags). <done_main_member>: Rename to... <done_parent>: ... this. <main_input_fp>: Rename to... <in_fp_parent>: ... this. <cu_mapped>: New. (ctf_link_one_type): Adjuwt accordingly. Transition to ctf_err_warn, removing a TODO. (ctf_link_one_variable): Note a case too common to warn about. Report in the debug stream if a cu-mapped link prevents addition of a conflicting variable. (ctf_link_one_input_archive_member): Adjust. (ctf_link_lazy_open): New, open a CTF archive for linking when needed. (ctf_link_close_one_input_archive): New, close it again. (ctf_link_one_input_archive): Adjust for lazy opening, member renames, and ctf_err_warn transition. Move the empty_link_type_mapping call to... (ctf_link): ... here. Adjut for renamings and thunk removal. Don't spuriously fail if some input contains no CTF data. (ctf_link_write): ctf_err_warn transition. * libctf.ver: Remove not-yet-stable comment.
2020-07-22	libctf: drop error-prone ctf_strerror	Nick Alcock	4	-8/+9
	This utility function is almost useless (all it does is cast the result of a strerror) but has a seriously confusing name. Over and over again I have accidentally called it instead of ctf_errmsg, and hidden a time-bomb for myself in a hard-to-test error-handling path: since ctf_strerror is just a strerror wrapper, it cannot handle CTF errnos, unlike ctf_errmsg. It's astonishingly lucky that none of these errors have crept into any commits to date. Fuse it into ctf_errmsg and drop it. libctf/ * ctf-impl.h (ctf_strerror): Delete. * ctf-subr.c (ctf_strerror): Likewise. * ctf-error.c (ctf_errmsg): Stop using ctf_strerror: just use strerror directly.
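The hazard described above can be sketched with a toy error-message function that handles both system errnos and library-specific codes. This is only an illustration: toy_errmsg, TOY_EBASE, and the message strings are hypothetical names invented here, not libctf's definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical library-specific error codes, numbered above any system
   errno.  These are illustrative only, not libctf's ECTF_* values.  */
#define TOY_EBASE 1000
enum { TOY_EFORMAT = TOY_EBASE, TOY_ECORRUPT, TOY_ENERR };

static const char *const toy_errlist[] = {
  "File is not in the expected format",   /* TOY_EFORMAT */
  "File data structure corruption",       /* TOY_ECORRUPT */
};

/* Return a message for ERR.  Library-specific codes are looked up in
   the table; everything else is handed to strerror.  A bare strerror
   wrapper would have no sensible text for the first kind.  */
const char *
toy_errmsg (int err)
{
  if (err >= TOY_EBASE && err < TOY_ENERR)
    return toy_errlist[err - TOY_EBASE];

  const char *msg = strerror (err);
  return msg != NULL ? msg : "Unknown error";
}
```

Having a single entry point means there is no second, similarly named function that silently mishandles the library's own codes.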
2020-07-22	libctf: sort out potential refcount loops	Nick Alcock	5	-8/+67
	When you link TUs that contain conflicting types together, the resulting CTF section is an archive containing many CTF dicts. These dicts appear in ctf_link_outputs of the shared dict, with each ctf_import'ing that shared dict. ctf_importing a dict bumps its refcount to stop it going away while it's in use -- but if the shared dict (whose refcount is bumped) has the child dict (doing the bumping) in its ctf_link_outputs, we have a refcount loop, since the child dict only un-ctf_imports and drops the parent's refcount when it is freed, but the child is only freed when the parent's refcount falls to zero. (In the future, this will be able to go wrong on the inputs too, when an ld -r'ed deduplicated output with conflicts is relinked. Right now this cannot happen because we don't ctf_import such dicts at all. This will be fixed in a later commit in this series.) Fix this by introducing an internal-use-only ctf_import_unref function that imports a parent dict without bumping the parent's refcount, and using it when we create per-CU outputs. This function is only safe to use if you know the parent cannot go away while the child exists: but if the parent owns the child, as here, this is necessarily true. Record in the ctf_file_t whether a parent was imported via ctf_import or ctf_import_unref, so that if you do another ctf_import later on (or a ctf_import_unref) it can decide whether to drop the refcount of the existing parent being replaced depending on which function you used to import that one. Adjust ctf_serialize so that rather than doing a ctf_import (which is wrong if the original import was done with ctf_import_unref), we just copy the parent field and refcount over and forcibly set the unref flag on the old copy we are going to discard.
ctf_file_close also needs a bit of tweaking to only close the parent if it was not imported with ctf_import_unref: while we're at it, guard against repeated closes with a refcount of zero and stop them causing double-frees, even if destruction of things freed inside ctf_file_close causes such recursion. Verified no leaks or accesses to freed memory after all of this with valgrind. (It was leak-happy before.) libctf/ * ctf-impl.c (ctf_file_t) <ctf_parent_unreffed>: New. (ctf_import_unref): New. * ctf-open.c (ctf_file_close): Drop the refcount all the way to zero. Don't recurse back in if the refcount is already zero. (ctf_import): Check ctf_parent_unreffed before deciding whether to close a pre-existing parent. Set it to zero. (ctf_import_unreffed): New, as above, setting ctf_parent_unreffed to 1. * ctf-create.c (ctf_serialize): Do not ctf_import into the new child: use direct assignment, and set unreffed on the new and old children. * ctf-link.c (ctf_create_per_cu): Import the parent using ctf_import_unreffed.
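The refcount loop and its fix can be sketched with a toy model. Everything here (toy_dict_t, toy_import, toy_close, toy_demo) is hypothetical and far simpler than libctf's ctf_file_t: it only assumes a parent that owns one child and a child that imports its parent.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy dict, not libctf's ctf_file_t.  */
typedef struct toy_dict
{
  int refcnt;
  struct toy_dict *parent;
  int parent_unreffed;     /* Nonzero if parent was imported without a ref.  */
  struct toy_dict *child;  /* Output owned by this dict, if any.  */
} toy_dict_t;

int toy_live;              /* Dicts allocated and not yet freed.  */

toy_dict_t *
toy_create (void)
{
  toy_dict_t *d = calloc (1, sizeof (*d));
  assert (d != NULL);
  d->refcnt = 1;
  toy_live++;
  return d;
}

void toy_close (toy_dict_t *d);

/* Import PARENT into CHILD, taking a reference unless UNREF is set.  */
void
toy_import (toy_dict_t *child, toy_dict_t *parent, int unref)
{
  if (child->parent != NULL && !child->parent_unreffed)
    toy_close (child->parent);
  if (!unref)
    parent->refcnt++;
  child->parent = parent;
  child->parent_unreffed = unref;
}

void
toy_close (toy_dict_t *d)
{
  if (d == NULL || d->refcnt == 0)   /* Guard against recursive closes.  */
    return;
  if (--d->refcnt > 0)
    return;
  toy_close (d->child);
  if (!d->parent_unreffed)
    toy_close (d->parent);
  free (d);
  toy_live--;
}

/* Build a parent that owns a child, close the parent once, and return
   how many dicts were left alive (0 means no leak).  */
int
toy_demo (int unref)
{
  int before = toy_live;
  toy_dict_t *parent = toy_create ();
  toy_dict_t *child = toy_create ();
  parent->child = child;
  toy_import (child, parent, unref);
  toy_close (parent);

  int leaked = toy_live - before;
  if (leaked != 0)
    {
      /* Break the loop by hand so the demo itself does not leak.  */
      free (child);
      free (parent);
      toy_live -= 2;
    }
  return leaked;
}
```

In this model toy_demo (0) reports two dicts still alive after the close, because the child's reference keeps the parent's count above zero, while toy_demo (1) reports none.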
2020-07-22	libctf: rename the type_mapping_key to type_key	Nick Alcock	4	-28/+50
	The name was just annoyingly long and I kept misspelling it. It's also a bad name: the type might be used in a type mapping, but it is itself a representation of a type (a ctf_file_t / ctf_id_t pair), not of a mapping at all. libctf/ * ctf-impl.h (ctf_link_type_mapping_key): Rename to... (ctf_link_type_key): ... this, adjusting member prefixes to match. (ctf_hash_type_mapping_key): Rename to... (ctf_hash_type_key): ... this. (ctf_hash_eq_type_mapping_key): Rename to... (ctf_hash_eq_type_key): ... this. * ctf-hash.c (ctf_hash_type_mapping_key): Rename to... (ctf_hash_type_key): ... this, and adjust for member name changes. (ctf_hash_eq_type_mapping_key): Rename to... (ctf_hash_eq_type_key): ... this, and adjust for member name changes. * ctf-link.c (ctf_add_type_mapping): Adjust. Note the lack of need for out-of-memory checking in this code. (ctf_type_mapping): Adjust.
2020-07-22	libctf: check for vasprintf	Nick Alcock	4	-12/+32
	We've been using this for all of libctf's history in binutils: we should check for it in configure. libctf/ * configure.ac: Check for vasprintf. * configure: Regenerated. * config.h.in: Likewise.
2020-07-22	libctf, archive: fix bad error message	Nick Alcock	2	-1/+5
	Get the function name right. libctf/ * ctf-archive.c (ctf_arc_bufopen): Fix message.
2020-07-22	libctf, open: fix opening CTF in binaries with no symtab	Nick Alcock	4	-26/+70
	This is a perfectly possible case, and half of ctf_bfdopen_ctfsect handled it fine. The other half hit a divide by zero or two before we got that far, and had no code path to load the strtab from anywhere in the absence of a symtab to point at it in any case. So, as a fallback, if there is no symtab, try loading ".strtab" explicitly by name, like we used to before we started looking for the strtab the symtab used. Of course, such a strtab is not kept hold of by BFD, so this means we have to bring back the code to possibly explicitly free the strtab that we read in. libctf/ * ctf-impl.h (struct ctf_archive_internal) <ctfi_free_strsect>: New. * ctf-open-bfd.c (ctf_bfdopen_ctfsect): Explicitly open a strtab if the input has no symtab, rather than dividing by zero. Arrange to free it later via ctfi_free_ctfsect. * ctf-archive.c (ctf_new_archive_internal): Do not ctfi_free_strsect by default. (ctf_arc_close): Possibly free it here.
2020-07-22	libctf, dump: fix slice dumping	Nick Alcock	2	-35/+65
	Now that we can have slices of anything terminating in an int, we must dump things accordingly, or slices of typedefs appear as c5b: __u8 -> 16c: __u8 -> 78: short unsigned int (size 0x2) which is unhelpful. If things are printed as slices, the name is missing: a15: [slice 0x8:0x4]-> 16c: __u8 -> 78: short unsigned int (size 0x2) And struct members give no clue they're a slice at all, which is a shame since bitfields are the major use of this type kind: [0x8] (ID 0xa15) (kind 10) __u8 dst_reg Fix things so that everything slicelike or integral gets its encoding printed, and everything with a name gets the name printed: a15: __u8 [slice 0x8:0x4] (size 0x1) -> 1ff: __u8 (size 0x1) -> 37: unsigned char [0x0:0x8] (size 0x1) [0x0] (ID 0xa15) (kind 10) __u8:4 (aligned at 0x1, format 0x2, offset:bits 0x8:0x4) Bitfield struct members get a technically redundant but much easier-to-understand dumping now: [0x0] (ID 0x80000005) (kind 6) struct bpf_insn (aligned at 0x1) [0x0] (ID 0x222) (kind 10) __u8 code (aligned at 0x1) [0x8] (ID 0x1e9e) (kind 10) __u8 dst_reg:4 (aligned at 0x1, format 0x2, offset:bits 0x8:0x4) [0xc] (ID 0x1e46) (kind 10) __u8 src_reg:4 (aligned at 0x1, format 0x2, offset:bits 0xc:0x4) [0x10] (ID 0xf35) (kind 10) __s16 off (aligned at 0x2) [0x20] (ID 0x1718) (kind 10) __s32 imm (aligned at 0x4) This also fixes one place where a failure to format a type would be erroneously considered an out-of-memory condition. libctf/ * ctf-dump.c (ctf_is_slice): Delete, unnecessary. (ctf_dump_format_type): improve slice formatting. Always print the type size, even of slices. (ctf_dump_member): Print slices (-> bitfields) differently from non-slices. Failure to format a type is not an OOM.
2020-07-22	libctf, dump: migrate towards dumping errors rather than truncation	Nick Alcock	2	-10/+19
	If we get an error emitting a single type, variable, or label, right now we emit the error into the ctf_dprintf stream and propagate the error all the way up the stack, causing the entire output to be silently truncated (unless libctf debugging is on). Instead, emit an error and keep going. (This makes sense for this use case: if you're dumping types and a type is corrupted, you want to know!) Not all instances of this are fixed in this commit, only ones associated with type formatting: more fixes will come. libctf/ * ctf-dump.c (ctf_dump_format_type): Emit a warning. (ctf_dump_label): Swallow errors from ctf_dump_format_type. (ctf_dump_objts): Likewise. (ctf_dump_var): Likewise. (ctf_dump_type): Do not emit a duplicate message. Move to ctf_err_warning, and swallow all errors.
2020-07-22	libctf, decl: avoid leaks of the formatted string on error	Nick Alcock	2	-1/+9
	ctf_decl_sprintf builds up a formatted string in the ctf_decl_t's cd_buf, but then on error this is hardly ever freed: we assume that ctf_decl_fini frees it, but it leaks it instead. Make it free it like any decent ADT should. libctf/ * ctf-decl.c (ctf_decl_fini): Free the cd_buf. (ctf_decl_buf): Once it escapes, don't try to free it later.
2020-07-22	libctf, types: enhance ctf_type_aname to print function arg types	Nick Alcock	3	-50/+89
	Somehow this never got implemented, which makes debugging any kind of bug that has to do with argument types fantastically confusing, because it looks like the func type takes no arguments though in fact it does. This also lets us simplify the dumper slightly (and introduces our first uses of ctf_assert and ctf_err_warn: there will be many more). ctf_type_aname dumps function types without including the function pointer name itself: ctf_dump search-and-replaces it in. This seems to give the nicest-looking results for existing users of both, even if it is a bit fiddly. libctf/ * ctf-types.c (ctf_type_aname): Print arg types here... * ctf-dump.c (ctf_dump_funcs): ... not here: but do substitute in the type name here.
2020-07-22	libctf, ld, binutils: add textual error/warning reporting for libctf	Nick Alcock	14	-3/+235
	This commit adds a long-missing piece of infrastructure to libctf: the ability to report errors and warnings using all the power of printf, rather than being restricted to one errno value. Internally, libctf calls ctf_err_warn() to add errors and warnings to a list: a new iterator ctf_errwarning_next() then consumes this list one by one and hands it to the caller, which can free it. New errors and warnings are added until the list is consumed by the caller or the ctf_file_t is closed, so you can dump them at intervals. The caller can of course choose to print only those warnings it wants. (I am not sure whether we want objdump, readelf or ld to print warnings or not: right now I'm printing them, but maybe we only want to print errors? This entirely depends on whether warnings are voluminous things describing e.g. the inability to emit single types because of name clashes or something. There are no users of this infrastructure yet, so it's hard to say.) There is no internationalization here yet, but this at least adds a place where internationalization can be added, to one of ctf_errwarning_next or ctf_err_warn. We also provide a new ctf_assert() function which uses this infrastructure to provide non-fatal assertion failures while emitting an assert-like string to the caller: to save space and avoid needlessly duplicating unchanging strings, the assertion test is inlined but the print-things-out failure case is not. All assertions in libctf will be converted to use this machinery in future commits and propagate assertion-failure errors up, so that the linker in particular cannot be killed by libctf assertion failures when it could perfectly well just print warnings and drop the CTF section. include/ * ctf-api.h (ECTF_INTERNAL): Adjust error text. (ctf_errwarning_next): New. libctf/ * ctf-impl.h (ctf_assert): New. (ctf_err_warning_t): Likewise. (ctf_file_t) <ctf_errs_warnings>: Likewise. (ctf_err_warn): New prototype. (ctf_assert_fail_internal): Likewise. 
* ctf-inlines.h (ctf_assert_internal): Likewise. * ctf-open.c (ctf_file_close): Free ctf_errs_warnings. * ctf-create.c (ctf_serialize): Copy it on serialization. * ctf-subr.c (ctf_err_warn): New, add an error/warning. (ctf_errwarning_next): New iterator, free and pass back errors/warnings in succession. * libctf.ver (ctf_errwarning_next): Add. ld/ * ldlang.c (lang_ctf_errs_warnings): New, print CTF errors and warnings. Assert when libctf asserts. (lang_merge_ctf): Call it. (lang_write_ctf): Likewise. binutils/ * objdump.c (ctf_archive_member): Print CTF errors and warnings. * readelf.c (dump_ctf_archive_member): Likewise.
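The queue-then-consume design described in this commit can be sketched as follows. This is a toy under stated assumptions: toy_msglist_t, toy_err_warn, and toy_errwarning_next are hypothetical names with invented signatures, and the sketch uses vsnprintf rather than whatever formatting routine the real code uses.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical message list; not libctf's internal representation.  */
typedef struct toy_msg
{
  struct toy_msg *next;
  int is_warning;
  char *text;
} toy_msg_t;

typedef struct toy_msglist
{
  toy_msg_t *head, *tail;
} toy_msglist_t;

/* Format a message printf-style and append it to the list.
   Returns 0 on success, -1 on failure.  */
int
toy_err_warn (toy_msglist_t *l, int is_warning, const char *fmt, ...)
{
  va_list ap;

  va_start (ap, fmt);
  int len = vsnprintf (NULL, 0, fmt, ap);   /* Measure first.  */
  va_end (ap);
  if (len < 0)
    return -1;

  toy_msg_t *m = malloc (sizeof (*m));
  char *text = malloc ((size_t) len + 1);
  if (m == NULL || text == NULL)
    {
      free (m);
      free (text);
      return -1;
    }

  va_start (ap, fmt);
  vsnprintf (text, (size_t) len + 1, fmt, ap);
  va_end (ap);

  m->next = NULL;
  m->is_warning = is_warning;
  m->text = text;
  if (l->tail != NULL)
    l->tail->next = m;
  else
    l->head = m;
  l->tail = m;
  return 0;
}

/* Remove the oldest message and return its text, which the caller must
   free.  Returns NULL once the list is empty.  */
char *
toy_errwarning_next (toy_msglist_t *l, int *is_warning)
{
  toy_msg_t *m = l->head;
  if (m == NULL)
    return NULL;

  l->head = m->next;
  if (l->head == NULL)
    l->tail = NULL;
  if (is_warning != NULL)
    *is_warning = m->is_warning;

  char *text = m->text;
  free (m);
  return text;
}
```

A caller drains the list in a loop, printing or discarding each message according to the is_warning flag, which is where a tool could choose to show errors only.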
2020-07-22	libctf, types: ensure the emission of ECTF_NOPARENT	Egeyar Bagcioglu	2	-1/+5
	ctf_variable_iter was returning a (positive!) error code rather than setting the error in the passed-in ctf_file_t. Reviewed-by: Nick Alcock <nick.alcock@oracle.com> libctf/ * ctf-types.c (ctf_variable_iter): Fix error return.
2020-07-22	libctf: error out on corrupt CTF with invalid header flags	Nick Alcock	5	-3/+18
	If corrupt CTF with invalid header flags is passed in, return the new error ECTF_FLAGS. include/ * ctf-api.h (ECTF_FLAGS): New. (ECTF_NERR): Adjust. * ctf.h (CTF_F_MAX): New. libctf/ * ctf-open.c (ctf_bufopen_internal): Diagnose invalid flags.
2020-07-22	libctf: pass the thunk down properly when wrapping qsort_r	Nick Alcock	2	-1/+5
	When wrapping qsort_r on a system like FreeBSD on which the compar argument comes first, we wrap the passed arg in a thunk so we can pass down both the caller-supplied comparator function and its argument. We should pass the argument down to the comparator, not the thunk, which is basically random nonsense on the stack from the point of view of the caller of qsort_r. libctf/ * ctf-decls.h (ctf_qsort_compar_thunk): Fix arg passing.
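The thunk technique can be sketched portably without calling any platform's qsort_r. Here toy_bsd_sort is a hypothetical stand-in (a small insertion sort) for a sort routine whose comparator receives the extra pointer first, and toy_qsort_r is a hypothetical wrapper offering the argument-last comparator convention on top of it.

```c
#include <assert.h>
#include <stddef.h>

/* Comparator convention the caller wants: extra argument last.  */
typedef int (*toy_compar_f) (const void *, const void *, void *arg);

/* Bundles the caller's comparator with the caller's argument.  */
typedef struct toy_thunk
{
  toy_compar_f compar;
  void *arg;
} toy_thunk_t;

static void
toy_swap (unsigned char *a, unsigned char *b, size_t size)
{
  while (size--)
    {
      unsigned char t = *a;
      *a++ = *b;
      *b++ = t;
    }
}

/* Hypothetical stand-in for a sort whose comparator gets the extra
   pointer FIRST.  Implemented as an insertion sort for brevity.  */
void
toy_bsd_sort (void *base, size_t n, size_t size, void *thunk,
              int (*cmp) (void *, const void *, const void *))
{
  unsigned char *p = base;
  for (size_t i = 1; i < n; i++)
    for (size_t j = i;
         j > 0 && cmp (thunk, p + (j - 1) * size, p + j * size) > 0;
         j--)
      toy_swap (p + (j - 1) * size, p + j * size, size);
}

/* Adapter: unpack the thunk and hand the caller's own argument to the
   caller's comparator.  Passing the thunk itself here would be the bug
   the commit describes.  */
static int
toy_thunk_compar (void *thunk_, const void *a, const void *b)
{
  toy_thunk_t *thunk = thunk_;
  return thunk->compar (a, b, thunk->arg);
}

void
toy_qsort_r (void *base, size_t n, size_t size,
             toy_compar_f compar, void *arg)
{
  toy_thunk_t thunk = { compar, arg };
  toy_bsd_sort (base, n, size, &thunk, toy_thunk_compar);
}

/* Example comparator: ARG points to +1 (ascending) or -1 (descending).  */
int
toy_cmp_int (const void *a, const void *b, void *arg)
{
  int x = *(const int *) a, y = *(const int *) b;
  int dir = *(const int *) arg;
  return dir * ((x > y) - (x < y));
}
```

The example comparator dereferences its argument as an int, so it only behaves if it really receives the caller's pointer rather than the thunk.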
2020-07-22	libctf, next, hash: add dynhash and dynset _next iteration	Nick Alcock	5	-1/+305
	This lets you iterate over dynhashes and dynsets using the _next API. dynhashes can be iterated over in sorted order, which works by populating an array of key/value pairs using ctf_dynhash_next itself, then sorting it with qsort. Convenience inline functions named ctf_dyn{hash,set}_cnext are also provided that take (-> return) const keys and values. libctf/ * ctf-impl.h (ctf_next_hkv_t): New, kv-pairs passed to sorting functions. (ctf_next_t) <u.ctn_sorted_hkv>: New, sorted kv-pairs for ctf_dynhash_next_sorted. <cu.ctn_h>: New, pointer to the dynhash under iteration. <cu.ctn_s>: New, pointer to the dynset under iteration. (ctf_hash_sort_f): Sorting function passed to... (ctf_dynhash_next_sorted): ... this new function. (ctf_dynhash_next): New. (ctf_dynset_next): New. * ctf-inlines.h (ctf_dynhash_cnext_sorted): New. (ctf_dynhash_cnext): New. (ctf_dynset_cnext): New. * ctf-hash.c (ctf_dynhash_next_sorted): New. (ctf_dynhash_next): New. (ctf_dynset_next): New. * ctf-util.c (ctf_next_destroy): Free the u.ctn_sorted_hkv if needed. (ctf_next_copy): Alloc-and-copy the u.ctn_sorted_hkv if needed.
2020-07-22	libctf, next: introduce new class of easier-to-use iterators	Nick Alcock	9	-10/+606
	The libctf machinery currently only provides one way to iterate over its data structures: ctf_*_iter functions that take a callback and an arg and repeatedly call it. This works, but if you are doing a lot of iteration it is really quite inconvenient: you have to package up your local variables into structures over and over again and spawn lots of little functions even if it would be clearer in a single run of code. Look at ctf-string.c for an extreme example of how unreadable this can get, with three-line-long functions proliferating wildly.
	The deduplicator takes this to the Nth level. It iterates over a whole bunch of things: if we'd had to use _iter-class iterators for all of them there would be twenty additional functions in the deduplicator alone, for no other reason than that the iterator API requires it.
	Let's do something better. strtok_r gives us half the design: generators in a number of other languages give us the other half. The _next API allows you to iterate over CTF-like entities in a single function using a normal while loop. e.g. here we are iterating over all the types in a dict:
	    ctf_next_t *i = NULL;
	    int hidden;
	    ctf_id_t id;

	    while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR)
	      {
	        /* do something with 'hidden' and 'id' */
	      }
	    if (ctf_errno (fp) != ECTF_NEXT_END)
	        /* iteration error */
	Here we are walking through the members of a struct with CTF ID 'struct_type':
	    ctf_next_t *i = NULL;
	    ssize_t offset;
	    const char *name;
	    ctf_id_t membtype;

	    while ((offset = ctf_member_next (fp, struct_type, &i, &name, &membtype)) >= 0)
	      {
	        /* do something with offset, name, and membtype */
	      }
	    if (ctf_errno (fp) != ECTF_NEXT_END)
	        /* iteration error */
	Like every other while loop, this means you have access to all the local variables outside the loop while inside it, with no need to tiresomely package things up in structures, move the body of the loop into a separate function, etc, as you would with an iterator taking a callback.
	ctf_*_next allocates 'i' for you on first entry (when it must be NULL), and frees and NULLs it and returns a _next-dependent flag value when the iteration is over: the fp errno is set to ECTF_NEXT_END when the iteration ends normally. If you want to exit early, call ctf_next_destroy on the iterator. You can copy iterators using ctf_next_copy, which copies their current iteration position so you can remember loop positions and go back to them later (or ctf_next_destroy them if you don't need them after all).
	Each _next function returns an always-likely-to-be-useful property of the thing being iterated over, and takes pointers to parameters for the others: with very few exceptions all those parameters can be NULLs if you're not interested in them, so e.g. you can iterate over only the offsets of members of a structure this way:
	    while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0)
	If you pass an iterator in use by one iteration function to another one, you get the new error ECTF_NEXT_WRONGFUN back; if you try to change ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back.
	Internally the ctf_next_t remembers the iteration function in use, various sizes and increments useful for almost all iterations, then uses unions to overlap the actual entities being iterated over to keep ctf_next_t size down.
	Iterators available in the public API so far (all tested in actual use in the deduplicator):
	    /* Iterate over the members of a STRUCT or UNION, returning each member's
	       offset and optionally name and member type in turn.  On end-of-iteration,
	       returns -1.  */
	    ssize_t ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, const char **name, ctf_id_t *membtype);

	    /* Iterate over the members of an enum TYPE, returning each enumerand's
	       NAME or NULL at end of iteration or error, and optionally passing
	       back the enumerand's integer VALue.  */
	    const char *ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, int *val);

	    /* Iterate over every type in the given CTF container (not including
	       parents), optionally including non-user-visible types, returning
	       each type ID and optionally the hidden flag in turn.  Returns CTF_ERR
	       on end of iteration or error.  */
	    ctf_id_t ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag, int want_hidden);

	    /* Iterate over every variable in the given CTF container, in arbitrary
	       order, returning the name and type of each variable in turn.  The
	       NAME argument is not optional.  Returns CTF_ERR on end of iteration
	       or error.  */
	    ctf_id_t ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name);

	    /* Iterate over all CTF files in an archive, returning each dict in turn
	       as a ctf_file_t, and NULL on error or end of iteration.  It is the
	       caller's responsibility to close it.  Parent dicts may be skipped.
	       Regardless of whether they are skipped or not, the caller must
	       ctf_import the parent if need be.  */
	    ctf_file_t *ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it, const char **name, int skip_parent, int *errp);
	ctf_label_next is prototyped but not implemented yet.
	include/ * ctf-api.h (ECTF_NEXT_END): New error. (ECTF_NEXT_WRONGFUN): Likewise. (ECTF_NEXT_WRONGFP): Likewise. (ECTF_NERR): Adjust. (ctf_next_t): New. (ctf_next_create): New prototype. (ctf_next_destroy): Likewise. (ctf_next_copy): Likewise. (ctf_member_next): Likewise. (ctf_enum_next): Likewise. (ctf_type_next): Likewise. (ctf_label_next): Likewise. (ctf_variable_next): Likewise. libctf/ * ctf-impl.h (ctf_next): New. (ctf_get_dict): New prototype. * ctf-lookup.c (ctf_get_dict): New, split out of... (ctf_lookup_by_id): ... here. * ctf-util.c (ctf_next_create): New. (ctf_next_destroy): New. (ctf_next_copy): New. * ctf-types.c (includes): Add <assert.h>. (ctf_member_next): New. (ctf_enum_next): New. (ctf_type_iter): Document the lack of iteration over parent types. (ctf_type_next): New. (ctf_variable_next): New.
* ctf-archive.c (ctf_archive_next): New. * libctf.ver: Add new public functions.
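The iterator life cycle described in this commit (allocate on first entry, free and NULL at the end, report normal termination through an error value) can be sketched with a self-contained toy. The names toy_set_t, toy_val_next, toy_next_destroy, and the TOY_* constants are hypothetical; this is not how libctf's ctf_next_t is laid out.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_ERR (-1)
enum { TOY_OK = 0, TOY_NEXT_END, TOY_ENOMEM };

/* Hypothetical iterator state: just a position.  */
typedef struct toy_next
{
  size_t pos;
} toy_next_t;

/* Hypothetical container of non-negative ints with an errno-like field.  */
typedef struct toy_set
{
  const int *vals;
  size_t n;
  int err;
} toy_set_t;

/* Return each value in turn.  *IT must be NULL on first entry: it is
   allocated here, and freed and reset to NULL when iteration ends.
   Returns TOY_ERR at end of iteration (err == TOY_NEXT_END) or on
   error (any other err).  */
int
toy_val_next (toy_set_t *s, toy_next_t **it)
{
  toy_next_t *i = *it;

  if (i == NULL)
    {
      if ((i = calloc (1, sizeof (*i))) == NULL)
        {
          s->err = TOY_ENOMEM;
          return TOY_ERR;
        }
      *it = i;
    }

  if (i->pos >= s->n)
    {
      free (i);
      *it = NULL;
      s->err = TOY_NEXT_END;
      return TOY_ERR;
    }
  return s->vals[i->pos++];
}

/* For callers that leave the loop early.  */
void
toy_next_destroy (toy_next_t *i)
{
  free (i);
}

/* Sum every value using a plain while loop: locals such as TOTAL stay
   in scope with no callback or context structure.  Returns -1 if the
   iteration stopped for any reason other than reaching the end.  */
long
toy_sum (toy_set_t *s)
{
  toy_next_t *i = NULL;
  long total = 0;
  int v;

  while ((v = toy_val_next (s, &i)) != TOY_ERR)
    total += v;
  if (s->err != TOY_NEXT_END)
    return -1;
  return total;
}
```

A caller that breaks out of the loop before the end would pass its iterator to toy_next_destroy, mirroring the early-exit rule in the text.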
2020-07-22	libctf: add ctf_ref	Nick Alcock	5	-0/+22
	This allows you to bump the refcount on a ctf_file_t, so that you can smuggle it out of iterators which open and close the ctf_file_t for you around the loop body (like ctf_archive_iter). You still can't use this to preserve a ctf_file_t for longer than the lifetime of its containing entity (e.g. ctf_archive). include/ * ctf-api.h (ctf_ref): New. libctf/ * libctf.ver (ctf_ref): New. * ctf-open.c (ctf_ref): Implement it.
2020-07-22	libctf: add ctf_forwardable_kind	Nick Alcock	3	-1/+12
	The internals of the deduplicator want to know if something is a type that can have a forward to it fairly often, often enough that inlining it brings a noticeable performance gain. Convert the one place in libctf that can already benefit, even though it doesn't bring any sort of performance gain there. libctf/ * ctf-inlines.h (ctf_forwardable_kind): New. * ctf-create.c (ctf_add_forward): Use it.
2020-07-22	libctf: move existing inlines into ctf-inlines.h	Nick Alcock	3	-8/+14
	Just housekeeping. libctf/ * ctf-impl.h (ctf_get_ctt_size): Move definition from here... * ctf-inlines.h (ctf_get_ctt_size): ... to here.
2020-07-22	libctf, hash: introduce the ctf_dynset	Nick Alcock	4	-11/+203
	There are many places in the deduplicator which use hashtables as tiny sets: keys with no value (and usually, but not always, no freeing function) often with only one or a few members. For each of these, even after the last change to not store the freeing functions, we are storing a little malloced block for each item just to track the key/value pair, and a little malloced block for the hash table itself just to track the freeing function, since libiberty hashtab's own freeing function is already in use freeing the little malloced per-item block. If we only have a key, we don't need any of that: we can ditch the per-item malloced block because we don't have a value, and we can ditch the per-hashtab structure because we don't need to independently track the freeing functions since libiberty hashtab is doing it for us. That means we don't need an owner field in the (now nonexistent) item block either. Roughly speaking, this datatype saves about 25% in time and 20% in peak memory usage for normal links, even fairly big ones. So this might seem redundant, but it's really worth it. Instead of a _lookup function, a dynset has two distinct functions: ctf_dynset_exists, which returns true or false and an optional pointer to the set member, and ctf_dynset_lookup_any, which is used if all members of the set are expected to be equivalent and we just want any member and we don't care which one. There is no iterator in this set of functions, not because we don't iterate over dynset members -- we do, a lot -- but because the iterator here is a member of an entirely new family of much more convenient iteration functions, introduced in the next commit. libctf/ * ctf-hash.c (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (DYNSET_EMPTY_ENTRY_REPLACEMENT): New. (DYNSET_DELETED_ENTRY_REPLACEMENT): New. (key_to_internal): New. (internal_to_key): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New.
(ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. (ctf_hash_insert_type): Coding style. (ctf_hash_define_type): Likewise. * ctf-impl.h (ctf_dynset_t): New. (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New. (ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. * ctf-inlines.h (ctf_dynset_cinsert): New.
2020-07-22	libctf, hash: save per-item space when no key/item freeing function	Nick Alcock	2	-21/+62
	The libctf dynhash hashtab abstraction supports per-hashtab arbitrary key/item freeing functions -- but it also has a constant slot type that holds both key and value requested by the user, so it needs to use its own freeing function to free that -- and it has nowhere to store the freeing functions the caller requested. So it copies them into every hash item, bloating every slot, even though all items in a given hash table must have the same key and value freeing functions. So point back to the owner using a back-pointer, but don't even spend space in the item or the hashtab allocating those freeing functions unless necessary: if none are needed, we can simply arrange to not pass in ctf_dynhash_item_free as a del_f to hashtab_create_alloc, and none of those fields will ever be accessed. The only downside is that this makes the code sensitive to the order of fields in the ctf_helem_t and ctf_hashtab_t: but the deduplicator allocates so many hash tables that doing this alone cuts memory usage during deduplication by about 10%. (libiberty hashtab itself has a lot of per-hashtab bloat: in the future we might trim that down, or make a trimmer version.) libctf/ * ctf-hash.c (ctf_helem_t) <key_free>: Remove. <value_free>: Likewise. <owner>: New. (ctf_dynhash_item_free): Indirect through the owner. (ctf_dynhash_create): Only pass in ctf_dynhash_item_free and allocate space for the key_free and value_free fields if necessary. (ctf_hashtab_insert): Likewise. Fix OOM errno value. (ctf_dynhash_insert): Only access ctf_hashtab's key_free and value_free if they will exist. Set the slot's owner, but only if it exists. (ctf_dynhash_remove): Adjust.
2020-07-22	libctf, hash: improve insertion of existing keys into dynhashes	Nick Alcock	2	-2/+7
	Right now, if you insert a key/value pair into a dynhash, the old slot's key is freed and the new one always assigned. This seemed sane to me when I wrote it, but I got it wrong time and time again. It's much less confusing to free the key passed in: if a key-freeing function was passed, you are asserting that the dynhash owns the key in any case, so if you pass in a key it is always buggy to assume it sticks around. Freeing the old key means that you can't even safely look up a key from out of a dynhash and hold on to it, because some other matching key might force it to be freed at any time. In the new model, you can always get a key out of a dynhash with ctf_dynhash_lookup_kv and hang on to it until the kv-pair is actually deleted from the dynhash. In the old model the pointer to the key might be freed at any time if a matching key was inserted. libctf/ * ctf-hash.c (ctf_hashtab_insert): Free the key passed in if there is a key-freeing function and the key already exists.
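The ownership rule adopted here (keep the stored key, free the duplicate that was passed in) can be sketched with a toy map. toy_map_t and its functions are hypothetical, use a linear array rather than a hash table, and assume that the value is overwritten on re-insertion, which the message does not spell out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX 8

/* Hypothetical toy map that owns its string keys.  */
typedef struct toy_map
{
  char *keys[TOY_MAX];
  int vals[TOY_MAX];
  size_t n;
} toy_map_t;

char *
toy_strdup (const char *s)
{
  char *p = malloc (strlen (s) + 1);
  assert (p != NULL);
  return strcpy (p, s);
}

/* Insert KEY, whose ownership passes to the map.  If an equal key is
   already present, the stored key is kept and the one passed in is
   freed, so pointers previously obtained from the map stay valid.
   Returns 0 on success, -1 if the map is full.  */
int
toy_insert (toy_map_t *m, char *key, int val)
{
  for (size_t i = 0; i < m->n; i++)
    if (strcmp (m->keys[i], key) == 0)
      {
        free (key);
        m->vals[i] = val;
        return 0;
      }

  if (m->n == TOY_MAX)
    {
      free (key);
      return -1;
    }
  m->keys[m->n] = key;
  m->vals[m->n++] = val;
  return 0;
}

/* Look up KEY, optionally passing back the stored key pointer and the
   value.  Returns 1 if found, 0 otherwise.  */
int
toy_lookup_kv (const toy_map_t *m, const char *key,
               const char **orig_key, int *val)
{
  for (size_t i = 0; i < m->n; i++)
    if (strcmp (m->keys[i], key) == 0)
      {
        if (orig_key != NULL)
          *orig_key = m->keys[i];
        if (val != NULL)
          *val = m->vals[i];
        return 1;
      }
  return 0;
}

void
toy_destroy (toy_map_t *m)
{
  for (size_t i = 0; i < m->n; i++)
    free (m->keys[i]);
  m->n = 0;
}
```

Under this rule a key pointer fetched with the lookup function remains usable until the pair itself is removed, which is the property the message argues for.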
2020-07-22	libctf: add new dynhash functions	Nick Alcock	4	-0/+122
	Future commits will use these. ctf_dynhash_elements: count elements in a dynhash ctf_dynhash_lookup_kv: look up and return pointers to the original key and value in a dynhash (the only way of getting a reference to the original key) ctf_dynhash_iter_find: iterate until an item is found, then return its key ctf_dynhash_cinsert: insert a const key / value into a dynhash (a thin wrapper in a new header dedicated to inline functions). As with the rest of ctf_dynhash, this is not public API. No impact on existing callers is expected. libctf/ * ctf-inlines.h: New file. * ctf-impl.h: Include it. (ctf_hash_iter_find_f): New typedef. (ctf_dynhash_elements): New. (ctf_dynhash_lookup_kv): New. (ctf_dynhash_iter_find): New. * ctf-hash.c (ctf_dynhash_lookup_kv): New. (ctf_traverse_find_cb_arg_t): New. (ctf_hashtab_traverse_find): New. (ctf_dynhash_iter_find): New. (ctf_dynhash_elements): New.
2020-07-22	libctf: fix __extension__ with non-GNU C compilers	Nick Alcock	2	-0/+5
	We forgot to #define __extension__ to nothing in this case. libctf/ * ctf-impl.h [!__GNUC__] (__extension__): Define to nothing.
2020-07-22	libctf: add ctf_archive_count	Nick Alcock	5	-0/+21
	Another count that was otherwise unavailable without doing expensive operations. include/ * ctf-api.h (ctf_archive_count): New. libctf/ * ctf-archive.c (ctf_archive_count): New. * libctf.ver: New public function.