path: root/lld/ELF/SyntheticSections.cpp
Age | Commit message | Author | Files | Lines
2022-02-18 | [ELF] Fix .strtab corruption when a symbol name is empty (origin/release/14.x) | Fangrui Song | 1 | -0/+1
This is a simplified version of c12d49c4e286fa108d4d69f1c6d2b8d691993ffd from main that just fixes the bug and does not affect the -O2 deduplication.
2022-02-01 | [ELF] Deduplicate names of local symbols only with -O2 | Fangrui Song | 1 | -2/+5
The deduplication requires a DenseMap of the same size as the local part of .strtab. I optimized it in e20544543478b259eb09fa0a253d4fb1a5525d9e but it is still quite slow. For a Release build of clang, deduplication makes .strtab 1.1% smaller and makes the link 3% slower. For chrome, deduplication makes .strtab 0.1% smaller and makes the link 6% slower. I suggest that we only perform the optimization with -O2 (default is -O1). Not deduplicating local symbol names will simplify parallel symbol table write.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D118577
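The trade-off described above can be illustrated with a minimal sketch. This is not lld's `StringTableSection`: the `StringTable` class, its `dedup` flag, and the use of `std::unordered_map` in place of `DenseMap` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical stand-in for a .strtab builder; not lld's actual class.
class StringTable {
public:
  // `dedup` plays the role of "-O2 was given" in this sketch.
  explicit StringTable(bool dedup) : dedup(dedup) {}

  // Returns the offset of `s` in the table, appending it if needed.
  std::size_t add(std::string_view s) {
    assert(s.find('\0') == std::string_view::npos && "names cannot contain NUL");
    if (dedup) {
      auto it = offsets.find(std::string(s));
      if (it != offsets.end())
        return it->second; // Reuse the earlier copy.
    }
    std::size_t off = data.size();
    data.append(s);
    data.push_back('\0');
    if (dedup)
      offsets.emplace(std::string(s), off); // The map grows with the table.
    return off;
  }

  std::size_t size() const { return data.size(); }

private:
  bool dedup;
  std::string data = std::string(1, '\0'); // ELF string tables start with NUL.
  std::unordered_map<std::string, std::size_t> offsets;
};
```

Without deduplication no map is built at all, which is where the link-time saving would come from; the cost is a slightly larger table when names repeat.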
2022-02-01 | [ELF] Change vector<InputSection *> to SmallVector. NFC | Fangrui Song | 1 | -1/+1
My x86-64 lld executable is 8KiB smaller.
2022-01-30 | [ELF] Change splitSections to objectFiles based parallelForEach. NFC | Fangrui Song | 1 | -7/+13
The work is more balanced.
2022-01-29 | [ELF] Add some Mips*Section to InStruct and change make<Mips*Section> to std::make_unique | Fangrui Song | 1 | -6/+9
Similar to D116143. My x86-64 lld executable is 20+KiB smaller.
2022-01-29 | [ELF] --gdb-index: switch to SmallVector. NFC | Fangrui Song | 1 | -10/+10
2022-01-29 | [ELF] Refactor -z combreloc | Fangrui Song | 1 | -37/+37
* `RelocationBaseSection::addReloc` increases `numRelativeRelocs`, which duplicates the work done by RelocationSection<ELFT>::writeTo.
* --pack-dyn-relocs=android has an inappropriate DT_RELACOUNT. AndroidPackedRelocationSection does not necessarily place relative relocations in the front, and DT_RELACOUNT might cause a semantic error (though our implementation doesn't and Android bionic doesn't use DT_RELACOUNT anyway).
Move `llvm::partition` to a new function `partitionRels` and compute `numRelativeRelocs` there. Now `RelocationBaseSection::addReloc` is trivial and can be moved to the header to enable inlining.
The rest of DynamicReloc and `-z combreloc` handling is moved to the non-template `RelocationBaseSection::computeRels` to decrease code size. My x86-64 lld executable is 44+KiB smaller.
While here, rename `sort` to `combreloc`.
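The idea of partitioning once and deriving the relative-relocation count from the partition point might look roughly like the following. Only the name `partitionRels` comes from the commit message; the `DynReloc` record and the signature are simplified assumptions, and `std::partition` stands in for `llvm::partition`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified dynamic relocation record.
struct DynReloc {
  std::uint64_t offset;
  bool isRelative; // e.g. an R_X86_64_RELATIVE relocation
};

// Moves relative relocations to the front and returns how many there are.
// The returned count is what a DT_RELACOUNT-style entry would report.
std::size_t partitionRels(std::vector<DynReloc> &relocs) {
  auto isRel = [](const DynReloc &r) { return r.isRelative; };
  auto mid = std::partition(relocs.begin(), relocs.end(), isRel);
  assert(std::is_partitioned(relocs.begin(), relocs.end(), isRel));
  return static_cast<std::size_t>(mid - relocs.begin());
}
```

Because the count falls out of the partition, a function that appends relocations no longer needs to maintain a running counter.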
2022-01-25 | [ELF] --gdb-index: replace vector<uint8_t> with unique_ptr<uint8_t[]>. NFC | Fangrui Song | 1 | -5/+8
2022-01-25 | [ELF] Optimize .relr.dyn to not grow vector<uint64_t>. NFC | Fangrui Song | 1 | -5/+5
2022-01-25 | [ELF] Simplify and optimize .relr.dyn NFC | Fangrui Song | 1 | -17/+5
2022-01-20 | Re-land [LLD] Remove global state in lldCommon | Alexandre Ganea | 1 | -4/+3
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext. See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html The previous land f860fe362282ed69b9d4503a20e5d20b9a041189 caused issues in https://lab.llvm.org/buildbot/#/builders/123/builds/8383, fixed by 22ee510dac9440a74b2e5b3fe3ff13ccdbf55af3. Differential Revision: https://reviews.llvm.org/D108850
2022-01-17 | [ELF] Change std::vector<InputSectionBase *> to SmallVector | Fangrui Song | 1 | -6/+5
There is no remaining std::vector<InputSectionBase> now. My x86-64 lld executable is 2KiB smaller.
2022-01-17 | [ELF] GnuHashTableSection: replace stable_sort with 2-key sort. NFC | Fangrui Song | 1 | -2/+3
strTabOffset stabilizes llvm::sort. My x86-64 executable is 5+KiB smaller.
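A sketch of why a second key can replace a stable sort: if the tie-breaker is unique per entry, an unstable sort still yields a deterministic order. The `Entry` layout and the `bucketIdx` primary key are assumptions for illustration; `strTabOffset` is the tie-breaker named in the commit message, and `std::sort` stands in for `llvm::sort`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical entry; the field set is assumed, not copied from lld.
struct Entry {
  std::uint32_t bucketIdx;    // primary key (assumed)
  std::uint32_t strTabOffset; // tie-breaker, assumed unique per entry
};

// Unstable sort on (bucketIdx, strTabOffset). The result does not depend on
// the input order as long as no two entries share both keys.
void sortEntries(std::vector<Entry> &entries) {
  auto key = [](const Entry &e) { return std::tie(e.bucketIdx, e.strTabOffset); };
  std::sort(entries.begin(), entries.end(),
            [&](const Entry &a, const Entry &b) { return key(a) < key(b); });
  // Documents the uniqueness assumption; compiled out with NDEBUG.
  assert(std::adjacent_find(entries.begin(), entries.end(),
                            [&](const Entry &a, const Entry &b) {
                              return key(a) == key(b);
                            }) == entries.end());
}
```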
2022-01-16 | [ELF] Remove unneeded SyntheticSection memset(*, 0, *) | Fangrui Song | 1 | -8/+2
After the D33630 fallout was properly fixed by a4c5db30be4e216834b44e31b47304ea1b92635f. Tested by D37462/D44986 tests, the new --no-rosegment test in build-id.s, and a few --rosegment/--no-rosegment programs.
2022-01-16 | [ELF] Remove redundant fillTrap and memset(*, 0, *). NFC | Fangrui Song | 1 | -9/+0
The new tests in build-id.s would catch problems if we made a mistake here.
2022-01-16 | [ELF] RelocationSection<ELFT>::writeTo: use unstable partition | Fangrui Song | 1 | -2/+1
2022-01-16 | [ELF] StringTableSection: Use DenseMap<CachedHashStringRef> to avoid redundant hash computation | Fangrui Song | 1 | -1/+1
5~6% speedup when linking clang and chrome.
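The general technique of carrying a precomputed hash alongside the key can be sketched in standard C++ as follows. This is an analogue, not LLVM's `CachedHashStringRef`; the `CachedHashString`, `UseCachedHash`, and `lookupOrInsert` names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string_view>
#include <unordered_map>

// Hypothetical key type: a string view plus its hash, computed once.
// The viewed characters must outlive the key.
struct CachedHashString {
  std::string_view str;
  std::size_t hash;

  explicit CachedHashString(std::string_view s)
      : str(s), hash(std::hash<std::string_view>{}(s)) {}

  bool operator==(const CachedHashString &o) const {
    return hash == o.hash && str == o.str; // Cheap hash check first.
  }
};

// Hasher that returns the cached value instead of rehashing the bytes.
struct UseCachedHash {
  std::size_t operator()(const CachedHashString &k) const { return k.hash; }
};

using OffsetMap = std::unordered_map<CachedHashString, std::size_t, UseCachedHash>;

// Returns the stored offset for `key`, inserting `newOffset` if absent.
std::size_t lookupOrInsert(OffsetMap &map, const CachedHashString &key,
                           std::size_t newOffset) {
  // Sanity check only; it rehashes, but is compiled out with NDEBUG.
  assert(key.hash == std::hash<std::string_view>{}(key.str));
  return map.try_emplace(key, newOffset).first->second;
}
```

A caller that already has the hash (for example from an earlier pass over the same names) pays for hashing once, however many lookups follow.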
2022-01-16 | Revert [LLD] Remove global state in lldCommon | Alexandre Ganea | 1 | -3/+4
It seems to be causing issues on https://lab.llvm.org/buildbot/#/builders/123/builds/8383
2022-01-16 | [LLD] Remove global state in lldCommon | Alexandre Ganea | 1 | -4/+3
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext. See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html Differential Revision: https://reviews.llvm.org/D108850
2022-01-15 | [ELF] Optimize -z combreloc | Fangrui Song | 1 | -3/+8
Sorting dynamic relocations is a bottleneck. Simplifying the comparator improves performance. Linking clang is 4~5% faster with --threads=8. This change may shuffle R_MIPS_REL32 for Mips and is NFC for non-Mips.
2022-01-12 | [ELF] Refactor how .gnu.hash and .hash are discarded | Fangrui Song | 1 | -2/+2
Switch to the D114180 approach which is simpler and allows gnuHashTab/hashTab to switch to unique_ptr.
2022-01-12 | [ELF] Support discarding .relr.dyn | Fangrui Song | 1 | -1/+2
db08df0570b6dfaf00d7b1b8555c1d2d4effb224 does not work because part.relrDyn is a unique_ptr and `reset` destroys the object which may still be referenced. This commit uses the D114180 approach. Also improve the test to check that there is no R_X86_64_RELATIVE.
2022-01-10 | [ELF] Support mixed TLSDESC and TLS GD | Fangrui Song | 1 | -0/+15
We only support both TLSDESC and TLS GD for x86 so this is an x86-specific problem. If both are used, only one R_X86_64_TLSDESC is produced and TLS GD accesses will incorrectly reference R_X86_64_TLSDESC. Fix this by introducing SymbolAux::tlsDescIdx. Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D116900
2022-01-09 | [ELF] Move gotIndex/pltIndex/globalDynIndex to SymbolAux | Fangrui Song | 1 | -17/+25
to decrease sizeof(SymbolUnion) by 8 on ELF64 platforms. Symbols needing such information are typically 1% or fewer (5134 out of 560520 when linking clang, 19898 out of 5550705 when linking chrome). Storing them elsewhere can decrease memory usage and symbol initialization time. There is a ~0.8% saving on max RSS when linking a large program.
Future direction:
* Move some of dynsymIndex/verdefIndex/versionId to SymbolAux
* Support mixed TLSDESC and TLS GD without increasing sizeof(SymbolUnion)
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D116281
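The side-table pattern described here can be sketched as below. The field names, the `AuxTable` class, and the `kNoIndex` sentinel are assumptions for illustration; only the idea of moving rarely used indices out of the symbol into a `SymbolAux` record comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

constexpr std::uint32_t kNoIndex = std::numeric_limits<std::uint32_t>::max();

// Rarely needed per-symbol data, stored out of line (layout assumed).
struct SymbolAux {
  std::uint32_t gotIdx = kNoIndex;
  std::uint32_t pltIdx = kNoIndex;
  std::uint32_t globalDynIdx = kNoIndex;
};

// Every symbol pays for one 32-bit index instead of three.
struct Symbol {
  std::uint32_t auxIdx = kNoIndex;
};

// Hypothetical owner of the side table.
class AuxTable {
public:
  // Returns the aux record for `sym`, allocating it on first use.
  // The reference is invalidated by the next allocation.
  SymbolAux &get(Symbol &sym) {
    if (sym.auxIdx == kNoIndex) {
      sym.auxIdx = static_cast<std::uint32_t>(entries.size());
      entries.emplace_back();
    }
    assert(sym.auxIdx < entries.size());
    return entries[sym.auxIdx];
  }

  std::size_t size() const { return entries.size(); }

private:
  std::vector<SymbolAux> entries;
};
```

Since only a small fraction of symbols ever need a GOT or PLT slot, the table stays short while every symbol shrinks.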
2021-12-27 | [ELF] Change InStruct/Partition pointers to unique_ptr | Fangrui Song | 1 | -12/+37
and remove associated make<XXX> calls. gnuHash and sysvHash are unchanged, otherwise LinkerScript::discard would destroy the objects which may be referenced by input section descriptions. My x86-64 lld executable is 121+KiB smaller.
2021-12-27 | [ELF] Use const reference. NFC | Fangrui Song | 1 | -5/+6
2021-12-27 | [ELF] Simplify and optimize SymbolTableSection<ELFT>::writeTo | Fangrui Song | 1 | -29/+26
2021-12-25 | [ELF] Remove one redundant computeBinding | Fangrui Song | 1 | -9/+4
This does resolve the redundancy in includeInDynsym().
2021-12-25 | [ELF] sortSymTabSymbols: change vector to SmallVector | Fangrui Song | 1 | -2/+2
This function may take ~1% time. SmallVector<SymbolTableEntry, 0> is smaller (16 bytes instead of 24) and more efficient.
2021-12-24 | [ELF] Avoid referencing SectionBase::repl after ICF | Fangrui Song | 1 | -1/+1
It is fairly easy to forget SectionBase::repl after ICF. Let ICF rewrite a Defined symbol's `section` field to avoid references to SectionBase::repl in subsequent passes. This slightly improves the --icf=none performance due to less indirection (maybe for --icf={safe,all} as well if most symbols are Defined). With this change, there is only one reference to `repl` (--gdb-index D89751). We can undo f4fb5fd7523f8e3c3b3966d43c0a28457b59d1d8 (`Move Repl to SectionBase.`) but move `repl` to `InputSection` instead. Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D116093
2021-12-23 | [ELF][PPC32] Support .got2 in an output section description | Fangrui Song | 1 | -5/+3
I added `PPC32Got2Section` in D62464 to support .got2 but did not implement .got2 in another output section. PR52799 has a linker script placing .got2 in .rodata, which causes a null pointer dereference because a MergeSyntheticSection's file is nullptr. Add the support.
2021-12-22 | Revert "[ELF] Make Partition/InStruct members unique_ptr and remove associate make<XXX>" | Fangrui Song | 1 | -19/+17
This reverts commit e48b1c8a27f0fbd791edc8e45756b268caadfa66. This reverts commit d019de23a1d761225fdaf0c47394ba58143aea9a. The changes caused memory leaks (non-final classes cannot use unique_ptr).
2021-12-22 | [ELF] Change nonnull pointer parameters to references | Fangrui Song | 1 | -8/+8
2021-12-22 | [ELF] Make Partition members unique_ptr and remove associate make<XXX> | Fangrui Song | 1 | -14/+15
See D116143 for benefits. My lld executable (x86-64) is 103+KiB smaller.
2021-12-22 | [ELF] Make InStruct members unique_ptr and remove associate make<XXX> | Fangrui Song | 1 | -4/+5
See D116143 for benefits. My lld executable (x86-64) is 24+KiB smaller.
2021-12-22 | [ELF] Change nonnull pointer parameters to references. NFC | Fangrui Song | 1 | -6/+6
2021-12-22 | [ELF] Change some non-null pointer parameters to references. NFC | Fangrui Song | 1 | -24/+24
2021-12-21 | [ELF] Change mipsGotIndex to uint32_t | Fangrui Song | 1 | -9/+8
This does not decrease sizeof(InputSection) (important for memory usage) on ELF64 by itself but allows us to add another uint32_t.
2021-12-21 | [ELF] Optimize RelocationSection<ELFT>::writeTo | Fangrui Song | 1 | -15/+26
When linking a 1.2G output (nearly no debug info, 2846621 dynamic relocations) using `--threads=8`, I measured
```
9.131462 Total ExecuteLinker
1.449913 Total Write output file
1.445784 Total Write sections
0.657152 Write sections {"detail":".rela.dyn"}
```
This change decreases the .rela.dyn time to 0.25, leading to a 4% speedup in the total time.
* The parallelSort is slow because of expensive r_sym/r_offset computation. Cache the values.
* The iteration is slow. Move r_sym/r_addend computation ahead of time and parallelize it.
With the change, the new encodeDynamicReloc is cheap (0.05s), so there is no need to parallelize it.
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D115993
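The "compute the keys once, then sort on cached values" step can be sketched as follows. All types here are hypothetical simplifications, the (r_sym, r_offset) ordering is an assumption, and the precomputation loop is sequential where the commit describes a parallel one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical inputs: resolving them stands in for the expensive
// r_sym/r_offset computation mentioned in the commit message.
struct Symbol {
  std::uint32_t dynsymIndex;
};
struct PendingReloc {
  const Symbol *sym; // null when no symbol is referenced
  std::uint64_t sectionAddr;
  std::uint64_t offsetInSection;
};

// Resolved form: the comparator only touches plain integers.
struct ResolvedReloc {
  std::uint64_t r_offset;
  std::uint32_t r_sym;
};

std::vector<ResolvedReloc> resolveAndSort(const std::vector<PendingReloc> &in) {
  std::vector<ResolvedReloc> out;
  out.reserve(in.size());
  // Pass 1: compute each key exactly once (sequential in this sketch).
  for (const PendingReloc &r : in)
    out.push_back({r.sectionAddr + r.offsetInSection,
                   r.sym ? r.sym->dynsymIndex : 0u});
  assert(out.size() == in.size());
  // Pass 2: sort on the cached values; no recomputation per comparison.
  std::sort(out.begin(), out.end(),
            [](const ResolvedReloc &a, const ResolvedReloc &b) {
              return std::tie(a.r_sym, a.r_offset) <
                     std::tie(b.r_sym, b.r_offset);
            });
  return out;
}
```

A sort performs many more comparisons than there are elements, so moving the costly part out of the comparator pays off even before any parallelism is added.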
2021-12-17 | [ELF] Parallelize MergeNoTailSection::writeTo | Fangrui Song | 1 | -2/+2
With this patch, writing .debug_str is significantly faster for a program with 1.5G .debug_str:
* .debug_info 1.22s
* .debug_str 2.57s decreases to 0.66s
2021-12-17 | [ELF] Use SmallVector for many SyntheticSections. NFC | Fangrui Song | 1 | -7/+8
This decreases struct sizes and usually decreases the lld executable size (39KiB for my x86-64 executable) (unless in some cases smaller SmallVector leads to more inlining, e.g. StringTableBuilder). For --gdb-index, there may be memory usage saving.
2021-12-16 | [ELF] Internalize createMergeSynthetic. NFC | Fangrui Song | 1 | -9/+0
Only called once. Moving to OutputSections.cpp can make it inlined. finalizeInputSections can be very hot, especially in -O1 links with much debug info.
2021-12-15 | [ELF] Replace make<Defined> with makeDefined. NFC | Fangrui Song | 1 | -2/+2
This removes SpecificAlloc<Defined> and makes my lld executable 1.5k smaller. This drops the small memory waste due to the separate BumpPtrAllocator.
2021-12-14 | Reland D114783/D115603 [ELF] Split scanRelocations into scanRelocations/postScanRelocations | Fangrui Song | 1 | -3/+4
(Fixed an issue about GOT on a copy relocated alias.)
(Fixed an issue about not creating r_addend=0 IRELATIVE for unreferenced non-preemptible ifunc.)
The idea is to make scanRelocations mark that some actions are needed (GOT/PLT/etc) and postpone the real work to postScanRelocations. It gives some flexibility:
* Make it feasible to support .plt.got (PR32938): we need to know whether GLOB_DAT and JUMP_SLOT are both needed.
* Make non-preemptible IFUNC handling slightly cleaner: avoid setting/clearing sym.gotInIgot
* -z nocopyrel: report all copy relocation places for one symbol
* Make GOT deduplication feasible
* Make parallel relocation scanning feasible (if we can avoid all stateful operations and make Symbol attributes atomic), but parallelism may not be the appealing choice
Since this patch moves a large chunk of code out of ELFT templates, my x86-64 executable is actually a few hundred bytes smaller.
For ppc32-ifunc-nonpreemptible-pic.s: I remove absolute relocation references to non-preemptible ifunc because absolute relocation references are incorrect in -fpie mode.
Reviewed By: peter.smith, ikudrin
Differential Revision: https://reviews.llvm.org/D114783
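The two-phase split can be sketched in miniature: a scan phase that only sets flags, and a post-scan phase that allocates table slots once per symbol. The `Symbol` fields, `RelKind`, `Tables`, and both function names are hypothetical and far simpler than lld's relocation scanning.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

constexpr std::uint32_t kNoIndex = std::numeric_limits<std::uint32_t>::max();

// Hypothetical symbol with "needs" flags and the slots assigned later.
struct Symbol {
  bool needsGot = false;
  bool needsPlt = false;
  std::uint32_t gotIdx = kNoIndex;
  std::uint32_t pltIdx = kNoIndex;
};

enum class RelKind { GotRef, PltCall };

// Scan phase: record what the symbol will need. No table is touched, so
// seeing the same kind of reference twice is harmless.
void scanReloc(Symbol &sym, RelKind kind) {
  if (kind == RelKind::GotRef)
    sym.needsGot = true;
  else
    sym.needsPlt = true;
}

struct Tables {
  std::uint32_t numGot = 0;
  std::uint32_t numPlt = 0;
};

// Post-scan phase: all flags are known, so each entry is allocated once and
// decisions that depend on several flags can be made in one place.
void postScan(const std::vector<Symbol *> &syms, Tables &tables) {
  for (Symbol *s : syms) {
    assert(s != nullptr);
    if (s->needsGot)
      s->gotIdx = tables.numGot++;
    if (s->needsPlt)
      s->pltIdx = tables.numPlt++;
  }
}
```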
2021-12-14 | Revert D114783 [ELF] Split scanRelocations into scanRelocations/postScanRelocations | Fangrui Song | 1 | -4/+3
May cause a failure for non-preemptible `bcmp` in a glibc -static link.
2021-12-14 | [ELF] Remove needsPltAddr in favor of needsCopy | Fangrui Song | 1 | -3/+4
needsPltAddr is equivalent to `needsCopy && isFunc`. In many places, it is equivalent to `needsCopy` because the non-STT_FUNC cases are ruled out. Reviewed By: ikudrin, peter.smith Differential Revision: https://reviews.llvm.org/D115603
2021-12-12 | [ELF] Use parallelSort for .rela.dyn | Fangrui Song | 1 | -1/+1
An unstable sort suffices. In a large link (11.06s), this decreases .rela.dyn writeTo time from 1.52s to 0.81s, resulting in a 6% total time speedup (the benefit will be greatly diluted if --pack-dyn-relocs=relr becomes prevalent). Encoding the dynamic relocations and then sorting raw Elf_Rel/Elf_Rela doesn't seem to improve much (doing that would require code duplication because of Elf_Rel/Elf_Rela plus the unfortunate mips64le), so don't do that.
2021-11-26 | [ELF] Rename fetch to extract | Fangrui Song | 1 | -1/+1
The canonical term is "extract" (GNU ld documentation, Solaris's `-z *extract` options). Avoid inventing a term and match --why-extract. (ld64 prefers "load" but the word is overloaded too much.) Mostly NFC, except for --help messages and the header row in --print-archive-stats output.
2021-11-25 | [ELF] Rename BaseCommand to SectionCommand. NFC | Fangrui Song | 1 | -7/+7
BaseCommand was picked when PHDRS/INSERT/etc were not implemented. Rename it to SectionCommand to match `sectionCommands` and make it clear that the commands are used in SECTIONS (except a special case for SymbolAssignment). Also, improve naming of some BaseCommand variables (base -> cmd).
2021-11-25 | [ELF] Rename OutputSection::sectionCommands to commands. NFC | Fangrui Song | 1 | -3/+3
This partially reverts r315409: the description applies to LinkerScript, but not to OutputSection. The name "sectionCommands" is used in both LinkerScript::sectionCommands and OutputSection::sectionCommands, which may lead to confusion. "commands" in OutputSection has no ambiguity because there are no other types of commands.