path: root/lld/MachO/UnwindInfoSection.cpp
Age | Commit message | Author | Files | Lines
2026-01-13 | [LLD][MachO][NFC] Rename Reloc to Relocation (#175586) | Alexis Engelke | 1 | -2/+2
Due to heavy use of using namespace llvm, Reloc is often ambiguous with llvm::Reloc, the relocation model. Previously, this was sometimes disambiguated with macho::Reloc. This ambiguity is even more problematic when using pre-compiled headers, where it's no longer "obvious" whether it should be Reloc or macho::Reloc. Therefore, rename Reloc to Relocation. This is also consistent with lld/ELF, where the type is also named Relocation.
2025-12-02 | [lld-macho] Remove cuIndices indirection in UnwindInfoSection. NFC (#170252) | Fangrui Song | 1 | -43/+32
cuEntries was sorted indirectly through a separate `cuIndices`. Eliminate cuIndices for simplicity. Linking chromium_framework from `#48001` with `-no_uuid` gives an identical executable using this patch.
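The difference between the two approaches can be sketched as follows. This is a minimal illustration, not the code from the patch: the `Cue` struct, both function names, and the choice of `functionAddress` as the sort key are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical, simplified entry; the real entry type has more fields.
struct Cue {
  uint64_t functionAddress;
  uint32_t encoding;
};

// Indirect form: sort a vector of indices and leave the entries in place.
std::vector<size_t> sortedIndices(const std::vector<Cue> &entries) {
  std::vector<size_t> indices(entries.size());
  std::iota(indices.begin(), indices.end(), size_t{0});
  std::sort(indices.begin(), indices.end(), [&](size_t a, size_t b) {
    return entries[a].functionAddress < entries[b].functionAddress;
  });
  return indices;
}

// Direct form: sort the entries themselves, so there is no second vector
// that has to be kept in sync with the first.
void sortEntries(std::vector<Cue> &entries) {
  std::sort(entries.begin(), entries.end(), [](const Cue &a, const Cue &b) {
    return a.functionAddress < b.functionAddress;
  });
}
```

Both forms visit the entries in the same order; the direct form simply drops one level of lookup at each use site.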
2025-06-11 | [lld] Use std::tie to implement comparison operators (NFC) (#143726) | Kazu Hirata | 1 | -5/+3
std::tie facilitates lexicographical comparisons through std::tuple's built-in operator< and operator>.
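As an illustration of the idiom, here is a minimal sketch. The `Entry` struct and its fields are hypothetical and are not taken from UnwindInfoSection.cpp.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical record; the field names are illustrative only.
struct Entry {
  uint32_t encoding;
  uint64_t address;
};

// Lexicographic comparison: encoding first, then address as the tie-breaker.
// std::tie builds tuples of references, so nothing is copied.
inline bool operator<(const Entry &a, const Entry &b) {
  return std::tie(a.encoding, a.address) < std::tie(b.encoding, b.address);
}
```

The hand-written equivalent needs one branch per field, which is where ordering bugs tend to creep in as fields are added.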
2024-12-14 | [lld] Migrate away from PointerUnion::{is,get} (NFC) (#119993) | Kazu Hirata | 1 | -1/+1
Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.
2024-06-11 | [lld-macho] Fix duplicate GOT entries for personality functions (#95054) | Daniel Bertalan | 1 | -11/+23
As stated in `UnwindInfoSectionImpl::prepareRelocations`'s comments, the unwind info uses section+addend relocations for personality functions defined in the same file as the function itself. As personality functions are always accessed via the GOT, we need to resolve those to a symbol. Previously, we did this by keeping a map which resolves these to symbols, creating a synthetic symbol if we didn't find it in the map. This approach has an issue: if we process the object file containing the personality function before any external uses, the entry in the map remains unpopulated, so we create a synthetic symbol and a corresponding GOT entry. If we encounter a relocation to it in a later file which requires GOT (such as in `__eh_frame`), we add that symbol to the GOT, too, effectively creating two entries which point to the same piece of code. This commit fixes that by searching the personality function's section for a symbol at that offset which already has a GOT entry, and only creating a synthetic symbol if there is none. As all non-unwind sections are already processed by this point, it ensures no duplication. This should only really affect our tests (and make them clearer), as personality functions are usually defined in platform runtime libraries. Or even if they are local, they are likely not in the first object file to be linked.
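The lookup described above might look roughly like this. It is a sketch under stated assumptions: `Sym`, its fields, and `findGotSymbolAt` are hypothetical stand-ins, not lld's actual types or functions.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified symbol record.
struct Sym {
  std::string name;
  uint64_t offset; // offset within its section
  bool inGot;      // whether a GOT entry already exists for it
};

// Returns a symbol at `offset` that already has a GOT entry, or nullptr.
// Only when this returns nullptr would a synthetic symbol be created.
const Sym *findGotSymbolAt(const std::vector<Sym> &sectionSyms,
                           uint64_t offset) {
  for (const Sym &s : sectionSyms)
    if (s.offset == offset && s.inGot)
      return &s;
  return nullptr;
}
```

Reusing the existing symbol is what keeps the GOT from gaining a second entry for the same code.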
2024-04-18 | [lld-macho][NFC] Preserve original symbol isec, unwindEntry and size (#88357) | alx32 | 1 | -19/+20
Currently, when moving symbols from one `InputSection` to another (like in ICF) we directly update the symbol's `isec`, `unwindEntry` and `size`. By doing this we lose the original information. This information will be needed in a future change. Since when moving symbols we always set the symbol's `wasCoalesced` and `isec->replacement`, we can just use this info to conditionally get the information we need at access time.
2023-07-24 | [Support] Change MapVector's default template parameter to SmallVector<*, 0> | Fangrui Song | 1 | -1/+1
SmallVector<*, 0> is often a better replacement for std::vector: both the object size and the code size are smaller. (SmallMapVector uses SmallVector as well, but it is not common.) clang size decreases by 0.0226%. instructions:u decreases 0.037% when compiling a sqlite3 amalgamation. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D156016
2023-06-07 | Reland "D144999 [MC][MachO]Only emits compact-unwind format for "canonical" personality symbols. For the rest, use DWARFs." | Vy Nguyen | 1 | -2/+5
Reasons for rolling forward:
- the crash reported from Chromium was fixed in D151824 (not related to this patch at all)
- since D152824 was committed, it should now be safe to roll this forward.
New change:
- add an additional _ in name check
This reverts commit 4980eead4d0b4666d53dad07afb091375b3a13a0.
2023-05-20 | [lld-macho] Remove partially supported 32-bit ARM arch | Vincent Lee | 1 | -1/+1
We never fully supported the 32-bit ARM arch; partial support was added only for very specific features. Regardless, it fails to link even the most basic applications, so at this point it is better to mark this arch as unsupported. Given that Apple will be moving towards arm64 long term, I don't see any reason for anyone to invest time in supporting it either, and those who still need it should use Apple's ld64 linker. Fixes #62691 Reviewed By: #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D150544
2023-05-19 | Revert "[RFC][MC][MachO]Only emits compact-unwind format for "canonical" personality symbols. For the rest, use DWARFs." | Nico Weber | 1 | -5/+2
This reverts commit 09aaf53a05e3786eea374f3ce57574225036412d. Causes toolchain asserts building libc++ for x86_64, see https://reviews.llvm.org/D144999#4356215
2023-05-18 | [RFC][MC][MachO]Only emits compact-unwind format for "canonical" personality symbols. For the rest, use DWARFs. | Vy Nguyen | 1 | -2/+5
Details: https://github.com/rust-lang/rust/issues/102754 The MachO format uses 2 bits to encode these personality functions, with 0 reserved for "no-personality". This means we can only have up to 3 personalities. There are already three popular personalities: __gxx_personality_v0, __gcc_personality_v0, and __objc_personality_v0. As a result, any system that needs a custom personality will run into a problem. This patch implements jyknight's proposal to simply force DWARF for all non-canonical personality functions. Differential Revision: https://reviews.llvm.org/D144999
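To make the 2-bit limit concrete, here is a small sketch. The mask value `0x30000000` is assumed to match `UNWIND_PERSONALITY_MASK` in compact_unwind_encoding.h, and `withPersonality` is a hypothetical helper written for this example.

```cpp
#include <cstdint>
#include <optional>

// Assumed position of the 2-bit personality field within the 32-bit encoding.
constexpr uint32_t PERSONALITY_MASK = 0x30000000;
constexpr unsigned PERSONALITY_SHIFT = 28;

// Index 0 means "no personality"; 1..3 select a personality slot.
// Anything above 3 does not fit in two bits, so it is rejected.
std::optional<uint32_t> withPersonality(uint32_t encoding, unsigned index) {
  if (index > 3)
    return std::nullopt;
  return (encoding & ~PERSONALITY_MASK) |
         (static_cast<uint32_t>(index) << PERSONALITY_SHIFT);
}
```

A fourth distinct personality has no representable index, which is why non-canonical personalities are pushed to DWARF instead.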
2023-04-05 | [lld-macho][nfc] Clean up a bunch of clang-tidy issues | Jez Ng | 1 | -3/+3
2023-04-04 | [lld-macho] Check if DWARF offset is too large for compact unwind | Jez Ng | 1 | -1/+20
For functions that use DWARF encodings, their compact unwind entry will contain a hint about the offset of their DWARF entry from the start of the `__eh_frame` section. The encoding only has 3 bytes to encode this hint. Previously, I neglected to check for overflow (and didn't realize that the value was merely a hint without needing to be exact.) So for large `__eh_frame` sections, the hint would overflow and cause the compact unwind MODE flag to be corrupted, leading to uncaught exceptions at runtime.
This diff fixes things by encoding zero as the hint for offsets that are too large. The unwinder will start a linear search at the hint location for the matching CFI record. The only requirement is that the hint points to a valid CFI record start, and the start of the section is always the start of a CFI record (in well-formed programs).
I'm not adding a test for this because generating the test inputs takes a bit too much time. However, I have been testing locally with this lit file, which takes about 15s to run on my machine:
```
# RUN: rm -rf %t; mkdir %t
# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-macos11.0 %s -o %t/test.o
# RUN: %lld -dylib -lSystem %t/test.o -o %t/test

.subsections_via_symbols
.text
.p2align 2
_f:
  .cfi_startproc
  .rept 0x7fffff
  .cfi_escape 0x2e, 0x10
  .endr
  ret
  .cfi_endproc

_g:
  .cfi_startproc
  .cfi_escape 0x2e, 0x10
  ret
  .cfi_endproc
```
Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D147505
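The overflow and the fix can be sketched as below. The constants follow the 3-byte hint described above and the x86-64 mode values quoted elsewhere in this log; both function names are hypothetical, and the unchecked variant is only an assumption about how the corruption could arise.

```cpp
#include <cstdint>

constexpr uint32_t MODE_MASK = 0x0F000000;
constexpr uint32_t MODE_DWARF = 0x04000000;
// The hint occupies the low 3 bytes of the encoding.
constexpr uint32_t DWARF_SECTION_OFFSET_MASK = 0x00FFFFFF;

// Unchecked form: an offset wider than 24 bits spills into the mode field.
constexpr uint32_t encodeDwarfHintUnchecked(uint64_t ehFrameOffset) {
  return MODE_DWARF | static_cast<uint32_t>(ehFrameOffset);
}

// Checked form: offsets that do not fit are replaced by 0, which still points
// at a valid CFI record (the start of the section).
constexpr uint32_t encodeDwarfHint(uint64_t ehFrameOffset) {
  uint32_t hint = ehFrameOffset <= DWARF_SECTION_OFFSET_MASK
                      ? static_cast<uint32_t>(ehFrameOffset)
                      : 0;
  return MODE_DWARF | hint;
}
```

Because the value is only a starting point for a linear search, falling back to 0 costs lookup time but never correctness.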
2023-03-30 | [lld-macho][re-land] Warn on method name collisions from category definitions | Jez Ng | 1 | -45/+19
This implements ld64's checks for duplicate method names in categories & classes. In addition, this sets us up for implementing Obj-C category merging. This diff handles most of the parsing work; what's left is rewriting those category / class structures.
Numbers for chromium_framework:
             base           diff           difference (95% CI)
sys_time     2.182 ± 0.027  2.200 ± 0.047  [ -0.2% .. +1.8%]
user_time    6.451 ± 0.034  6.479 ± 0.062  [ -0.0% .. +0.9%]
wall_time    6.841 ± 0.048  6.885 ± 0.105  [ -0.1% .. +1.4%]
samples      33             22
Fixes https://github.com/llvm/llvm-project/issues/54912. Issues seen with the previous land will be fixed in the next commit. Reviewed By: #lld-macho, thevinster, oontvoo Differential Revision: https://reviews.llvm.org/D142916
2023-03-08 | Revert "[lld-macho] Warn on method name collisions from category definitions" | Jez Ng | 1 | -19/+45
This reverts commit ef122753db7fe8e9a0b7bedd46d2f3668a780fcb. Apparently it is causing some crashes: https://reviews.llvm.org/D142916#4178869
2023-03-07 | [lld-macho] Warn on method name collisions from category definitions | Jez Ng | 1 | -45/+19
This implements ld64's checks for duplicate method names in categories & classes. In addition, this sets us up for implementing Obj-C category merging. This diff handles most of the parsing work; what's left is rewriting those category / class structures.
Numbers for chromium_framework:
             base           diff           difference (95% CI)
sys_time     2.182 ± 0.027  2.200 ± 0.047  [ -0.2% .. +1.8%]
user_time    6.451 ± 0.034  6.479 ± 0.062  [ -0.0% .. +0.9%]
wall_time    6.841 ± 0.048  6.885 ± 0.105  [ -0.1% .. +1.4%]
samples      33             22
Fixes https://github.com/llvm/llvm-project/issues/54912. Reviewed By: #lld-macho, thevinster, oontvoo Differential Revision: https://reviews.llvm.org/D142916
2023-01-28 | Use llvm::count{lr}_{zero,one} (NFC) | Kazu Hirata | 1 | -1/+1
2022-12-23 | [reland][lld-macho] Private label aliases to weak symbols should not retain section data | Jez Ng | 1 | -1/+1
This reverts commit a650f2ec7a37cf1f495108bbb313e948c232c29c. The crashes it was causing will be fixed by the stacked diff {D140606}.
2022-12-15 | Revert "[lld-macho] Private label aliases to weak symbols should not retain section data" | Jez Ng | 1 | -1/+1
This reverts commit 6736bce6db5fe15bcb765b976c99fff34500d1eb. It's causing Swift-related crashes in e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=1400716 and elsewhere.
2022-12-05 | [lld-macho] Canonicalize LSDA pointers | Jez Ng | 1 | -0/+5
This was causing an uncaught exception issue in one of our programs. The issue was fairly subtle / rare as it required two identical LSDAs that were referenced by a pair of non-identical compact unwind encodings. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D139269
2022-12-01 | [lld-macho] Private label aliases to weak symbols should not retain section data | Jez Ng | 1 | -1/+1
If we have two files with the same weak symbol like so:
```
ltmp0:
_weak:
  <contents>
```
and
```
ltmp1:
_weak:
  <contents>
```
Linking them together should leave only one copy of `<contents>`, not two. Previously, we would keep around both copies because of the private-label `ltmp<N>` symbols (i.e. symbols that start with `l`) -- we would not coalesce those, so we would treat them as retaining the contents.
This matters for more than just size -- we are depending upon this behavior internally for emitting a certain file format. This file format's header is repeated in each object file, but we want it to appear just once in our output.
Why can't we not emit those aliases to `_weak`, or reference the `ltmp<N>` symbols instead of `_weak`? Well, MC actually adds `ltmp<N>` symbols as part of the assembly-to-binary translation step. So any codegen at the clang level can't access them.
All that said... this solution is actually kind of hacky. Here, we avoid creating the private-label symbols at parse time. This is acceptable since we never emit those symbols in our output. However, in ld64, any aliasing temporary symbols (ignored or otherwise) won't retain coalesced data. But implementing this is harder -- we would have to create those symbols first (so we can emit their names later), but we would have to ensure the linker correctly shuffles them around when their aliasees get coalesced.
Additionally, ld64 treats these temporary symbols as functionally equivalent to the weak symbols themselves -- that is, it will emit weak binds when those non-weak temporary aliases are referenced. We have imitated this behavior for private-label symbols, but implementing it for local aliases in general seems substantially more difficult. I'm not sure if any programs actually depend on this behavior though, so maybe it's a moot point.
Finally, ld64 does all this regardless of whether `.subsections_via_symbols` is specified. We don't. But again, given how rare the lack of that directive is (I've only seen it from hand-written assembly inputs), I don't think we need to worry about it.
Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D139069
2022-11-22 | [lld-macho] Fix bug in CUE folding that resulted in wrong unwind table. | Vy Nguyen | 1 | -3/+14
PR/59070 Differential Revision: https://reviews.llvm.org/D138320
2022-11-08 | [lld] Fix duplicate word typos. NFC | Fangrui Song | 1 | -5/+4
Based on lld/ part of D137338 but reflowed comments.
2022-10-19 | [lld-macho][nfc] Clean up includes | Vy Nguyen | 1 | -2/+2
- remove unused/duplicate includes
- reformatting/whitespaces
Differential Revision: https://reviews.llvm.org/D136266
2022-10-14 | [lld-macho][nfc] define command UNWIND_MODE_MASK for convenience and rewrite mode-mask checking logic for clarity | Vy Nguyen | 1 | -3/+1
The previous form is currently "harmless" and happened to work, but may not in the future. Consider the enum (for x86-64, but the same issue applies to the ARM/64 families):
```
UNWIND_X86_64_MODE_MASK = 0x0F000000,
UNWIND_X86_64_MODE_RBP_FRAME = 0x01000000,
UNWIND_X86_64_MODE_STACK_IMMD = 0x02000000,
UNWIND_X86_64_MODE_STACK_IND = 0x03000000,
UNWIND_X86_64_MODE_DWARF = 0x04000000,
```
Previously, we were doing: `(encoding & MODE_DWARF) == MODE_DWARF`. As soon as a new `UNWIND_X86_64_MODE_FOO = 0x05000000` is defined, the check above would always return true for encoding=MODE_FOO (because `(0b0101 & 0b0100) == 0b0100`).
Differential Revision: https://reviews.llvm.org/D135359
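The pitfall is easy to reproduce in isolation. In this sketch the constants are the values quoted above with shortened names; `MODE_FOO` is the hypothetical future mode from the message, and the two function names are made up for the example.

```cpp
#include <cstdint>

constexpr uint32_t MODE_MASK = 0x0F000000;
constexpr uint32_t MODE_DWARF = 0x04000000;
constexpr uint32_t MODE_FOO = 0x05000000; // hypothetical future mode

// Old form: only tests that the bits of MODE_DWARF are set.
constexpr bool isDwarfOld(uint32_t encoding) {
  return (encoding & MODE_DWARF) == MODE_DWARF;
}

// New form: isolate the whole mode field first, then compare.
constexpr bool isDwarfNew(uint32_t encoding) {
  return (encoding & MODE_MASK) == MODE_DWARF;
}

// The old check misclassifies MODE_FOO as DWARF; the new one does not.
static_assert(isDwarfOld(MODE_FOO), "old check wrongly matches MODE_FOO");
static_assert(!isDwarfNew(MODE_FOO), "new check rejects MODE_FOO");
```

Masking with the full field width treats the mode as an enumerated value rather than a set of independent flags, which is what it is.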
2022-10-11 | [lld-macho] Canonicalize personality pointers in EH frames | Jez Ng | 1 | -5/+30
We already do this for personality pointers referenced from compact unwind entries; this patch extends that behavior to personalities referenced via EH frames as well. This reduces the number of distinct personalities we need in the final binary, and helps us avoid hitting the "too many personalities" error. I renamed `UnwindInfoSection::prepareRelocations()` to simply `prepare` since we now do some non-reloc-specific stuff within. Fixes #58277. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D135728
2022-09-03 | Drop empty string literals from static_assert (NFC) | Kazu Hirata | 1 | -4/+2
Identified with modernize-unary-static-assert.
2022-08-30 | [MachO] Don't fold compact unwind entries with LSDA | Shoaib Meenai | 1 | -12/+33
Folding them will cause the unwinder to compute the incorrect function start address for the folded entries, which in turn will cause the personality function to interpret the LSDA incorrectly and break exception handling. You can verify the end-to-end flow by creating a simple C++ file:
```
void h();
int main() { h(); }
```
and then linking this file against the liblsda.dylib produced by the test case added here. Before this change, running the resulting program would result in a program termination with an uncaught exception. Afterwards, it works correctly. Reviewed By: #lld-macho, thevinster Differential Revision: https://reviews.llvm.org/D132845
2022-08-29 | [MachO] Remove stale comments | Shoaib Meenai | 1 | -8/+0
https://reviews.llvm.org/D93267 implemented handling more than 127 compact unwind encodings, and https://reviews.llvm.org/D123435 and https://reviews.llvm.org/D124561 implemented stripping redundant __eh_frame entries.
2022-08-03 | [LLD] [MachO] Fix GCC build warnings | Martin Storsjö | 1 | -2/+6
This fixes the following warnings produced by GCC 9:
../tools/lld/MachO/Arch/ARM64.cpp: In member function ‘void {anonymous}::OptimizationHintContext::applyAdrpLdr(const lld::macho::OptimizationHint&)’:
../tools/lld/MachO/Arch/ARM64.cpp:448:18: warning: comparison of integer expressions of different signedness: ‘int64_t’ {aka ‘long int’} and ‘uint64_t’ {aka ‘long unsigned int’} [-Wsign-compare]
  448 |   if (ldr.offset != (rel1->referentVA & 0xfff))
      |       ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../tools/lld/MachO/UnwindInfoSection.cpp: In function ‘bool canFoldEncoding(compact_unwind_encoding_t)’:
../tools/lld/MachO/UnwindInfoSection.cpp:404:44: warning: comparison between ‘enum<unnamed>’ and ‘enum<unnamed>’ [-Wenum-compare]
  404 |   static_assert(UNWIND_X86_64_MODE_MASK == UNWIND_X86_MODE_MASK, "");
      |                                            ^~~~~~~~~~~~~~~~~~~~
../tools/lld/MachO/UnwindInfoSection.cpp:405:49: warning: comparison between ‘enum<unnamed>’ and ‘enum<unnamed>’ [-Wenum-compare]
  405 |   static_assert(UNWIND_X86_64_MODE_STACK_IND == UNWIND_X86_MODE_STACK_IND, "");
      |                                                 ^~~~~~~~~~~~~~~~~~~~~~~~~
Differential Revision: https://reviews.llvm.org/D130970
2022-07-21 | [lld-macho] Fix assertion when two symbols at same addr have unwind info | Jez Ng | 1 | -1/+1
If there are multiple symbols at the same address, our unwind info implementation assumes that we always register unwind entries to a single canonical symbol. This assumption was violated by the `registerEhFrame` code. Fixes #56570. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D130208
2022-07-11 | [lld-macho] Fix compact unwind output for 32 bit builds | David Spickett | 1 | -1/+1
This test was failing on our 32 bit build bot: https://lab.llvm.org/buildbot/#/builders/178/builds/2463 This happened because in UnwindInfoSectionImpl::finalize a decision is made whether to write out regular or compressed unwind info. One check in this does:
```
if (cuPtr->functionAddress >= functionAddressMax) {
  break;
```
Where cuPtr->functionAddress was uint64_t and functionAddressMax was uintptr_t, which is 4 bytes on a 32 bit system. Using uint64_t for functionAddressMax fixes this problem. Presumably, at only 4 bytes, the max was much lower than we expect. We're targeting 64 bit though, so the size of the max should match the size of the addresses. Reviewed By: #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D129363
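The effect of the narrower type can be reproduced on any host by truncating explicitly. In this sketch `uint32_t` stands in for a 4-byte `uintptr_t`; the function names and the sample addresses used in the note below are illustrative assumptions, not values from the failing test.

```cpp
#include <cstdint>

// What the comparison sees when the bound is stored in a 4-byte uintptr_t:
// the upper 32 bits of the 64-bit bound are silently dropped.
constexpr bool reachedMaxNarrow(uint64_t functionAddress, uint64_t maxAddr) {
  uint32_t narrowed = static_cast<uint32_t>(maxAddr);
  return functionAddress >= narrowed;
}

// With a 64-bit bound the comparison behaves the same on every host.
constexpr bool reachedMaxWide(uint64_t functionAddress, uint64_t maxAddr) {
  return functionAddress >= maxAddr;
}
```

For example, with a function at `0x100000F00` and a bound of `0x101000000`, the narrow form compares against `0x01000000` and reports that the bound was reached, while the wide form does not.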
2022-06-19 | Rename parallelForEachN to just parallelFor | Nico Weber | 1 | -1/+1
Patch created by running: rg -l parallelForEachN | xargs sed -i '' -c 's/parallelForEachN/parallelFor/' No behavior change. Differential Revision: https://reviews.llvm.org/D128140
2022-06-14 | [lld-macho] Print the name of functions containing undefined references | Daniel Bertalan | 1 | -1/+1
The error used to look like this:
ld64.lld: error: undefined symbol: _foo
>>> referenced by /path/to/bar.o
Now it displays the name of the function that contains the undefined reference as well:
ld64.lld: error: undefined symbol: _foo
>>> referenced by /path/to/bar.o:(symbol _baz+0x4)
Differential Revision: https://reviews.llvm.org/D127696
2022-06-13 | [lld-macho][reland] Initial support for EH Frames | Jez Ng | 1 | -1/+15
This reverts commit 942f4e3a7cc9a9f8b2654817cff12907d1276031. The additional change required to avoid the assertion errors seen previously is:
--- a/lld/MachO/ICF.cpp
+++ b/lld/MachO/ICF.cpp
@@ -443,7 +443,9 @@ void macho::foldIdenticalSections() {
                         /*relocVA=*/0);
       isec->data = copy;
     }
-  } else {
+  } else if (!isEhFrameSection(isec)) {
+    // EH frames are gathered as hashables from unwindEntry above; give a
+    // unique ID to everything else.
     isec->icfEqClass[0] = ++icfUniqueID;
   }
 }
Differential Revision: https://reviews.llvm.org/D123435
2022-06-09 | Revert "[lld-macho] Initial support for EH Frames" | Douglas Yung | 1 | -15/+1
This reverts commit 826be330af9c0a8553a5b32718ecd2d97e10438e. This was causing a test failure on build bots:
- https://lab.llvm.org/buildbot/#/builders/36/builds/21770
- https://lab.llvm.org/buildbot/#/builders/58/builds/23913
2022-06-08 | [lld-macho] Initial support for EH Frames | Jez Ng | 1 | -1/+15
== Background ==
`llvm-mc` generates unwind info in both compact unwind and DWARF formats. LLD already handles the compact unwind format; this diff gets us close to handling the DWARF format properly.
== Caveats ==
It's not quite done yet, but I figure it's worth getting this reviewed and landed first as it's shaping up to be a fairly large code change.
**Known limitations of the current code:**
* Only works for x86_64, for which `llvm-mc` emits "abs-ified" relocations as described in https://github.com/llvm/llvm-project/commit/618def651b59bd42c05bbd91d825af2fb2145683. `llvm-mc` emits regular relocations for ARM EH frames, which we do not yet handle correctly.
Since the feature is not ready for real use yet, I've gated it behind a flag that only gets toggled on during test suite runs. With most of the new code disabled, we see just a hint of perf regression, so I don't think it'd be remiss to land this as-is:
             base           diff           difference (95% CI)
sys_time     1.926 ± 0.168  1.979 ± 0.117  [ -1.2% .. +6.6%]
user_time    3.590 ± 0.033  3.606 ± 0.028  [ +0.0% .. +0.9%]
wall_time    7.104 ± 0.184  7.179 ± 0.151  [ -0.2% .. +2.3%]
samples      30             31
== Design ==
Like compact unwind entries, EH frames are also represented as regular ConcatInputSections that get pointed to via `Defined::unwindEntry`. This allows them to be handled generically by e.g. the MarkLive and ICF code. (But note that unlike compact unwind subsections, EH frame subsections do end up in the final binary.)
In order to make EH frames "look like" a regular ConcatInputSection, some processing is required. First, we need to split the `__eh_frame` section along EH frame boundaries rather than along symbol boundaries. We do this by decoding the length field of each EH frame. Second, the abs-ified relocations need to be turned into regular Relocs.
== Next Steps ==
In order to support EH frames on ARM targets, we will either have to teach LLD how to handle EH frames with explicit relocs, or we can try to make `llvm-mc` emit abs-ified relocs for ARM as well. I'm hoping to do the latter as I think it will make the LLD implementation both simpler and faster to execute.
== Misc ==
The `obj-file-with-stabs.s` test had to be updated as the previous version would trip assertion errors in the code. It appears that in our attempt to produce a minimal YAML test input, we created a file with invalid EH frame data. I've fixed this by re-generating the YAML and not doing any hand-pruning of it.
Reviewed By: #lld-macho, Roger Differential Revision: https://reviews.llvm.org/D123435
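The first step in the Design notes above, splitting along EH frame boundaries by decoding each length field, can be sketched as follows. This is a simplified illustration and not lld's parser: `FrameRange` and `splitEhFrames` are hypothetical names, the length is read in host byte order, and the 64-bit extended-length form (a leading `0xffffffff`) is deliberately not handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Byte range of one record, including its 4-byte length field.
struct FrameRange {
  size_t offset;
  size_t size;
};

// Walks the buffer record by record using each record's leading length.
std::vector<FrameRange> splitEhFrames(const std::vector<uint8_t> &data) {
  std::vector<FrameRange> out;
  size_t off = 0;
  while (off + 4 <= data.size()) {
    uint32_t len;
    std::memcpy(&len, data.data() + off, 4); // host byte order (assumption)
    if (len == 0 || len == 0xffffffffu)
      break; // terminator or extended length: out of scope for this sketch
    size_t total = 4 + static_cast<size_t>(len);
    if (total > data.size() - off)
      break; // truncated record
    out.push_back({off, total});
    off += total;
  }
  return out;
}
```

Each resulting range could then back its own subsection, which is what lets generic passes treat a frame like any other input section.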
2022-05-20 | [lld-macho] Stop crash when emitting personalities with -dead_strip | Alex Brachet | 1 | -0/+1
The <internal> symbol was tripping an assertion in getVA() because it was not marked as used. Per the comment above that symbol's creation, dead stripping has already occurred, so marking this symbol as used is accurate. Fixes https://github.com/llvm/llvm-project/issues/55565 Differential revision: https://reviews.llvm.org/D126072
2022-04-13 | [lld-macho][nfc] De-templatize UnwindInfoSection | Jez Ng | 1 | -38/+68
Follow-on to {D123276}. Now that we work with an internal representation of compact unwind entries, we no longer need to template our UnwindInfoSectionImpl code based on the pointer size of the target architecture. I've still kept the split between `UnwindInfoSectionImpl` and `UnwindInfoSection`. I'd introduced that split in order to do type erasure, but I think it's still useful to have in order to keep `UnwindInfoSection`'s definition in the header file clean. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D123277
2022-04-11 | [lld-macho][nfc] Use includeInSymtab for all symtab-skipping logic | Jez Ng | 1 | -0/+1
{D123302} got me looking deeper at `includeInSymtab`. I thought it was a little odd that there were excluded (live) symbols for which `includeInSymtab` was false; we shouldn't have so many different ways to exclude a symbol. As such, this diff makes the `L`-prefixed-symbol exclusion code use `includeInSymtab` too. (Note that as part of our support for `__eh_frame`, we will also be excluding all `__eh_frame` symbols from the symtab in a future diff.) Another thing I noticed is that the `emitStabs` code never has to deal with excluded symbols because `SymtabSection::finalize()` already filters them out. As such, I've updated the comments and asserts from {D123302} to reflect this. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D123433
2022-04-08 | [lld-macho] Use fewer indirections in UnwindInfo implementation | Jez Ng | 1 | -101/+44
The previous implementation of UnwindInfoSection materialized all the compact unwind entries & applied their relocations, then parsed the resulting data to generate the final unwind info. This design had some unfortunate consequences: since relocations can only be applied after their referents have had addresses assigned, operations that need to happen before address assignment must contort themselves. (See {D113582} and observe how this diff greatly simplifies it.) Moreover, it made synthesizing new compact unwind entries awkward. Handling PR50956 will require us to do this synthesis, and is the main motivation behind this diff. Previously, instead of generating a new CompactUnwindEntry directly, we would have had to generate a ConcatInputSection with a number of `Reloc`s that would then get "flattened" into a CompactUnwindEntry. This diff introduces an internal representation of `CompactUnwindEntry` (the former `CompactUnwindEntry` has been renamed to `CompactUnwindLayout`). The new CompactUnwindEntry stores references to its personality symbol and LSDA section directly, without the use of `Reloc` structs. In addition to being easier to work with, this diff also allows us to handle unwind info whose personality symbols are located in sections placed after the `__unwind_info`. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D123276
2022-02-07 | [lld-macho] Include address offsets in error messages | Jez Ng | 1 | -5/+6
This makes it easier to pinpoint the source of the problem. TODO: Have more relocation error messages make use of this functionality. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D118798
2022-02-01 | [lld-macho][nfc] Comments and style fixes | Jez Ng | 1 | -1/+1
Added some comments (particularly around finalize() and finalizeContents()) as well as doing some rephrasing / grammar fixes for existing comments. Also did some minor style fixups, such as by putting methods together in a class definition and having fields of similar types next to each other. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D118714
2022-01-19 | [lld-macho] Add --start-lib --end-lib | Fangrui Song | 1 | -1/+1
In ld.lld, when an ObjFile/BitcodeFile is read in --start-lib state, the file is given archive semantics. --end-lib closes the previous --start-lib. A build system can use this feature as an alternative to archives. This patch ports the feature to lld-macho. --start-lib and --end-lib are positional, unlike usual ld64 options. I think the slight drawback does not matter as (a) reusing option names make build systems convenient (b) `--start-lib a.o b.o --end-lib` conveys more information than an alternative design: `-objlib a.o -objlib b.o` because --start-lib makes it clear which objects are in the same conceptual archive. This provides flexibility (c) `-objlib`/`-filelist` interaction may be weird. Close https://github.com/llvm/llvm-project/issues/52931 Reviewed By: #lld-macho, Jez Ng, oontvoo Differential Revision: https://reviews.llvm.org/D116913
2022-01-11 | [lld-macho] Rename LazySymbol to LazyArchive. NFC | Fangrui Song | 1 | -1/+1
D116913 will add LazyObject. Rename LazySymbol to LazyArchive to avoid confusion and mirror ELF. Reviewed By: #lld-macho, Jez Ng Differential Revision: https://reviews.llvm.org/D116914
2021-11-22 | [lld-macho] Don't replace local personality symbol with LazySymbol | Vy Nguyen | 1 | -2/+3
Follow-up to D107533, where we replaced local syms with non-local ones. It doesn't make sense to replace a local symbol with a lazy one. Differential Revision: https://reviews.llvm.org/D110040
2021-11-17 | [lld-macho][nfc] Factor-out NFC changes from main __eh_frame diff | Greg McGary | 1 | -14/+22
In order to keep signal:noise high for the `__eh_frame` diff, I have teased-out the NFC changes and put them here. Differential Revision: https://reviews.llvm.org/D114017
2021-11-12 | [lld-macho] Fix symbol relocs handling for LSDAs | Jez Ng | 1 | -6/+10
Similar to D113702, but for the LSDAs. Clang seems to emit all LSDA relocs as section relocs, but ld -r can turn those relocs into symbol ones. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D113721
2021-11-12 | [lld-macho] Teach ICF to dedup functions with identical unwind info | Jez Ng | 1 | -4/+4
Dedup'ing unwind info is tricky because each CUE contains a different function address. If ICF operated naively and compared the entire contents of each CUE, entries with identical unwind info but belonging to different functions would never be considered identical. To work around this problem, we slice away the function address before performing ICF. We rely on `relocateCompactUnwind()` to correctly handle these truncated input sections.
Here are the numbers before and after D109944, D109945, and this diff were applied, as tested on my 3.2 GHz 16-Core Intel Xeon W:
Without any optimizations:
             base           diff           difference (95% CI)
sys_time     0.849 ± 0.015  0.896 ± 0.012  [ +4.8% .. +6.2%]
user_time    3.357 ± 0.030  3.512 ± 0.023  [ +4.3% .. +5.0%]
wall_time    3.944 ± 0.039  4.032 ± 0.031  [ +1.8% .. +2.6%]
samples      40             38
With `-dead_strip`:
             base           diff           difference (95% CI)
sys_time     0.847 ± 0.010  0.896 ± 0.012  [ +5.2% .. +6.5%]
user_time    3.377 ± 0.014  3.532 ± 0.015  [ +4.4% .. +4.8%]
wall_time    3.962 ± 0.024  4.060 ± 0.030  [ +2.1% .. +2.8%]
samples      47             30
With `-dead_strip` and `--icf=all`:
             base           diff           difference (95% CI)
sys_time     0.935 ± 0.013  0.957 ± 0.018  [ +1.5% .. +3.2%]
user_time    3.472 ± 0.022  6.531 ± 0.046  [ +87.6% .. +88.7%]
wall_time    4.080 ± 0.040  5.329 ± 0.060  [ +30.0% .. +31.2%]
samples      37             30
Unsurprisingly, ICF is now a lot slower, likely due to the much larger number of input sections it needs to process. But the rest of the linker only suffers a mild slowdown.
Note that the compact-unwind-bad-reloc.s test was expanded because we now handle the relocation for CUE's function address in a separate code path from the rest of the CUE relocations. The extended test covers both code paths.
Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D109946
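The slicing idea can be illustrated with raw byte buffers. This is a sketch, not the ICF code: it assumes a 64-bit target where the function address is the leading 8 bytes of each entry, and `sameUnwindInfo` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: the function address occupies the first 8 bytes of an entry.
constexpr size_t kFunctionAddressSize = 8;

// Compares two entries while ignoring the leading function address, so that
// entries for different functions can still be recognized as equivalent.
bool sameUnwindInfo(const std::vector<uint8_t> &a,
                    const std::vector<uint8_t> &b) {
  if (a.size() != b.size() || a.size() < kFunctionAddressSize)
    return false;
  return std::equal(a.begin() + kFunctionAddressSize, a.end(),
                    b.begin() + kFunctionAddressSize);
}
```

Comparing whole entries would fail on the very first field, since no two functions share an address.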
2021-11-10 | [lld-macho] Support renaming of LSDA section | Jez Ng | 1 | -82/+131
Previously, our unwind info finalization logic assumed that the LSDA section referenced by `__compact_unwind` was already finalized before `__TEXT,__unwind_info` itself. However, that assumption could be broken by the use of `-rename_section` -- it could be (and is) used to move `__gcc_except_tab` into a different segment later in the file. (__TEXT is always the first non-zerofill segment, so any rename basically guarantees that the section will be ordered after `__unwind_info`.)
To handle this case, we compare LSDA relocations instead of their final values in `UnwindInfoSection::finalize()`, and we actually relocate those LSDAs in `UnwindInfoSection::writeTo()`. In order to do this, we need an easy way to track which Symbol a given CUE corresponds to. My solution was to change our `cuPtrVector` into a vector of indices, with each index used for both the symbols vector (`symbolsVec`) as well as the CUE vector (`cuVector`).
This change seems perf neutral. Numbers for linking chromium_framework on my 16 core Mac Pro:
             base           diff           difference (95% CI)
sys_time     1.248 ± 0.025  1.245 ± 0.026  [ -1.3% .. +0.8%]
user_time    3.588 ± 0.045  3.587 ± 0.037  [ -0.6% .. +0.5%]
wall_time    4.605 ± 0.069  4.595 ± 0.069  [ -1.0% .. +0.5%]
samples      42             26
Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D113582