path: root/lld/COFF/InputFiles.cpp
Age | Commit message | Author | Files | Lines
2026-01-21 | [NFC][LTO] Move isPreservedName out of IRSymtab into LTO's Symbol as isLibcall (#177046) | Daniel Thornburgh | 1 | -1/+4
This resolves the FIXME in IRSymtab and cleans up the semantics of the IRSymtab. The list of preserved symbols really shouldn't be seen as a property of the IR symbol table, since it's an LTO-specific concern, and it's very tenuous to claim that this information is actually present in the bitcode file to be exposed through its symbol table. Instead, this PR moves this logic into LTO's view of the symbol, which allows consumers to determine preserved-ness themselves. This was broken out of #164916; this prevents that PR from introducing a circular dependency, but it still seems like an independently good idea by virtue of the above.
2025-12-31 | [DTLTO][ELF][COFF] Add archive support for DTLTO. (#157043) | Konstantin Belochapka | 1 | -0/+1
This patch implements support for handling archive members in DTLTO. Unlike ThinLTO, where archive members are passed as in-memory buffers, DTLTO requires archive members to be materialized as individual files on the filesystem. This is necessary because DTLTO invokes clang externally, which expects file-based inputs. To support this, this implementation identifies archive members among the input files, saves them to the filesystem, and updates their module_id to match their file paths.
2025-07-28 | [LLD][COFF] Discard .llvmbc and .llvmcmd sections (#150897) | Haohai Wen | 1 | -0/+5
Those sections are generated by -fembed-bitcode and do not need to be kept in executable files.
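A minimal sketch of the filtering idea, not LLD's actual code: `isEmbeddedBitcodeSection` and `keptSections` are hypothetical names introduced for illustration, and only the two section names come from the commit message.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: true for the sections that -fembed-bitcode emits
// and that, per the commit message, need not be kept in the executable.
bool isEmbeddedBitcodeSection(std::string_view name) {
  return name == ".llvmbc" || name == ".llvmcmd";
}

// Hypothetical filter: returns the section names that would survive
// into the output image, in their original order.
std::vector<std::string> keptSections(const std::vector<std::string> &input) {
  std::vector<std::string> kept;
  for (const std::string &name : input)
    if (!isEmbeddedBitcodeSection(name))
      kept.push_back(name);
  // Filtering can only ever shrink the list.
  assert(kept.size() <= input.size());
  return kept;
}
```

Given `.text`, `.llvmbc`, `.data` and `.llvmcmd` as input, only `.text` and `.data` would remain.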
2025-07-21 | [LLD][COFF] Follow up comments on pr146610 (#147152) | Alexandre Ganea | 1 | -11/+19
This is a follow-up PR for post-commit comments in https://github.com/llvm/llvm-project/pull/146610 - Changed "exporteddllmain" references to "importeddllmain". - Add support for x86 target and test coverage. - Changed a comment to better express why we're skipping importing `DllMain`.
2025-07-02 | [LLD][COFF] Disallow importing DllMain from import libraries (#146610) | Alexandre Ganea | 1 | -1/+39
This is a workaround for https://github.com/llvm/llvm-project/issues/82050 that skips the `DllMain` symbol if it is seen in an import library. When this happens, a warning is also displayed. The warning can be silenced with `/ignore:exporteddllmain`.
2025-05-29 | [LLD][COFF] Add support for DLL imports on ARM64EC (#141587) | Jacek Caban | 1 | -0/+11
Define additional `__imp_aux_` and mangled lazy symbols. Also allow overriding EC aliases with lazy symbols, as we do for other lazy symbol types.
2025-05-25 | [lld] Remove unused includes (NFC) (#141421) | Kazu Hirata | 1 | -4/+0
2025-05-15 | [LLD][COFF] Add support for including native ARM64 objects in ARM64EC images (#137653) | Jacek Caban | 1 | -3/+1
MSVC linker accepts native ARM64 object files as input with `-machine:arm64ec`, similar to `-machine:arm64x`. Its usefulness is very limited; for example, both exports and imports are not reflected in the PE structures and can't work. However, their symbol tables are otherwise functional. Since we already have handling of multiple symbol tables implemented for ARM64X, the required changes are mostly about adjusting relevant checks to account for them on the ARM64EC target. Delay-load helper handling is a bit of a shortcut. The patch never pulls it for native object files and just ensures that the code is fine with that. In general, I think it would be nice to adjust the driver to pull it only when it's actually referenced, which would allow applying the same logic to the native symbol table on ARM64EC without worrying about pulling too much.
2025-04-11 | [LLD][COFF] Remove no longer needed symtabEC from COFFLinkerContext (NFC) (#135094) | Jacek Caban | 1 | -2/+2
With #135093, we may just use `symtab` instead.
2025-04-11 | [LLD][COFF] Swap the meaning of symtab and hybridSymtab in hybrid images (#135093) | Jacek Caban | 1 | -0/+1
Originally, the intent behind symtab was to represent the symbol table seen in the PE header (without applying ARM64X relocations). However, in most cases outside of `writeHeader()`, the code references either both symbol tables or only the EC one, for example, `mainSymtab` in `linkerMain()` maps to `hybridSymtab` on ARM64X. MSVC's link.exe allows pure ARM64EC images to include native ARM64 files. This patch prepares LLD to support the same, which will require `hybridSymtab` to be available even for ARM64EC. At that point, `writeHeader()` will need to use the EC symbol table, and the original reasoning for keeping it in `hybridSymtab` no longer applies. Given this, it seems cleaner to treat the EC symbol table as the “main” one, assigning it to `symtab`, and use `hybridSymtab` for the native symbol table instead. Since `writeHeader()` will need to be conditional anyway, this change simplifies the rest of the code by allowing other parts to consistently treat `ctx.symtab` as the main symbol table. As a further simplification, this also allows us to eliminate `symtabEC` and use `symtab` directly; I’ll submit that as a separate PR. The map file now uses the EC symbol table for printed entry points and exports, matching MSVC behavior.
2025-03-15 | [LLD][COFF] Clarify EC vs. native symbols in diagnostics on ARM64X (#130857) | Jacek Caban | 1 | -3/+3
On ARM64X, symbol names alone are ambiguous as they may refer to either a native or an EC symbol. Append '(EC symbol)' or '(native symbol)' in diagnostic messages to distinguish them.
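The effect on a diagnostic can be sketched as follows. This is an illustration only: `SymbolKind` and `describeSymbol` are hypothetical names, and the single space before the qualifier is an assumption; only the two qualifier strings come from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum: which namespace a symbol belongs to on a hybrid image.
enum class SymbolKind { Native, EC };

// Hypothetical formatter: on ARM64X the bare name is ambiguous, so the
// qualifier from the commit message is appended. On other targets the
// name is returned unchanged.
std::string describeSymbol(const std::string &name, bool isArm64X,
                           SymbolKind kind) {
  assert(!name.empty() && "a diagnostic needs a symbol name");
  if (!isArm64X)
    return name;
  return name +
         (kind == SymbolKind::EC ? " (EC symbol)" : " (native symbol)");
}
```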
2025-02-22 | [LLD][COFF] Add support for x86_64 archives on ARM64X (#128241) | Jacek Caban | 1 | -1/+42
If the ECSYMBOLS section is missing in the archive, the archive could be either a native-only ARM64 or x86_64 archive. Check the machine type of the object containing a symbol to determine which symbol table to use.
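A rough sketch of the selection described above, under the assumption that x86_64 members share the EC symbol table with ARM64EC members while ARM64 members use the native one. `Machine`, `Symtab` and `symtabForMember` are hypothetical names, not LLD's API.

```cpp
#include <cassert>

// Hypothetical machine types an archive member may report on ARM64X.
enum class Machine { ARM64, ARM64EC, AMD64 };

// Hypothetical tags for the two symbol tables of a hybrid link.
enum class Symtab { Native, EC };

// Chooses a symbol table from the machine type of the object that
// contains the symbol (assumed mapping, see the note above).
Symtab symtabForMember(Machine machine) {
  switch (machine) {
  case Machine::ARM64:
    return Symtab::Native;
  case Machine::ARM64EC:
  case Machine::AMD64:
    return Symtab::EC;
  }
  assert(false && "unhandled machine type");
  return Symtab::Native;
}
```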
2025-01-26 | [LLD][COFF] Implement support for hybrid IAT on ARM64X (#124189) | Jacek Caban | 1 | -4/+10
In hybrid images, the PE header references a single IAT for both native and EC views, merging entries where possible. When merging isn't feasible, different imports are grouped together, and ARM64X relocations are emitted as needed.
2025-01-24 | [lld/COFF] Fix -start-lib / -end-lib more after reviews.llvm.org/D116434 (#124294) | Nico Weber | 1 | -0/+2
This is a follow-up to #120452 in a way. Since lld/COFF does not yet insert all defined symbols in an obj file before all undefined ones (ELF and MachO do this, see #67445 and things linked from there), it's possible that: 1. We add an obj file a.obj. 2. a.obj contains an undefined that's in b.obj, causing b.obj to be added. 3. b.obj contains an undefined that's in a part of a.obj that's not yet in the symbol table, causing a recursive load of a.obj, which adds the symbols in there twice, leading to duplicate symbol errors. For normal archives, `ArchiveFile::addMember()` has a `seen` check to prevent this. For start-lib lazy objects, we can just check if the archive is still lazy at the recursive call. This bug is similar to issue #59162. (Eventually, we'll probably want to do what the MachO and ELF ports do.) Includes a test that caused duplicate symbol diagnostics before this code change.
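The three steps and the "still lazy" check can be modelled in a few lines. This is a simplified sketch, not LLD's code: `LazyObject`, `whileLoading` and `load` are hypothetical, and the callback merely stands in for an undefined symbol that pulls in another file partway through loading.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for a -start-lib lazy object file.
struct LazyObject {
  std::string name;
  bool lazy = true;     // still waiting to be pulled in
  int timesLoaded = 0;  // how often its symbols were added
  // Runs in the middle of loading, like an undefined symbol that
  // triggers the load of another file.
  std::function<void()> whileLoading;
};

// Loads the file unless it is no longer lazy. The flag is cleared
// before any nested load can run, so a recursive request for the same
// file returns immediately instead of adding its symbols again.
void load(LazyObject &obj) {
  if (!obj.lazy)
    return;
  obj.lazy = false;
  ++obj.timesLoaded;
  assert(obj.timesLoaded == 1 && "symbols must be added only once");
  if (obj.whileLoading)
    obj.whileLoading();
}
```

With `a.whileLoading` calling `load(b)` and `b.whileLoading` calling `load(a)`, a single `load(a)` leaves both files loaded exactly once.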
2025-01-23 | Reland [LLD] [COFF] Fix linking MSVC generated implib header objects (#123916) | Martin Storsjö | 1 | -8/+35
ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b tried to fix cases when LLD links what seems to be import library header objects from MSVC. However, the fix seems incorrect; the review at https://reviews.llvm.org/D133627 concluded that if this (treating this kind of symbol as a common symbol) is what link.exe does, it's fine. However, this is most probably not what link.exe does. The symbol mentioned in the commit message of ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b would be a common symbol with a size of around 3 GB; this is not what might have been intended. That commit tried to avoid running into the error ".idata$4 should not refer to special section 0"; that issue is fixed for a similar style of section symbols in 4a4a8a1476b1386b523dc5b292ba9a5a6748a9cf. Therefore, revert ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b and extend the fix from 4a4a8a1476b1386b523dc5b292ba9a5a6748a9cf to also work for the section symbols in MSVC generated import libraries. The main detail about them, is that for symbols of type IMAGE_SYM_CLASS_SECTION, the Value field is not an offset, but it is an optional set of flags, corresponding to the Characteristics of the section header (although it may be empty). This is a reland of a previous version of this commit, earlier merged in 9457418e66766d8fafc81f85eb8045986220ca3e / #122811. The previous version failed tests when run with address sanitizer. The issue was that the synthesized coff_symbol_generic object actually will be used to access a full coff_symbol16 or coff_symbol32 struct, see DefinedCOFF::getCOFFSymbol. Therefore, we need to make a copy of the full size of either of them.
2025-01-21 | Revert "[LLD] [COFF] Fix linking MSVC generated implib header objects" (#123877) | Thurston Dang | 1 | -23/+8
Reverts llvm/llvm-project#122811 due to buildbot breakage, e.g. https://lab.llvm.org/buildbot/#/builders/52/builds/5421/steps/11/logs/stdio
ASan output from local re-run:
```
==2780289==ERROR: AddressSanitizer: use-after-poison on address 0x7e0b87e28d28 at pc 0x55a979a99e7e bp 0x7ffe4b18f0b0 sp 0x7ffe4b18f0a8
READ of size 1 at 0x7e0b87e28d28 thread T0
    #0 0x55a979a99e7d in getStorageClass /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:344
    #1 0x55a979a99e7d in isSectionDefinition /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:429:9
    #2 0x55a979a99e7d in getSymbols /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:54:42
    #3 0x55a979a99e7d in lld::coff::writeLLDMapFile(lld::coff::COFFLinkerContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:103:40
    #4 0x55a979a16879 in (anonymous namespace)::Writer::run() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:810:3
    #5 0x55a979a00aac in lld::coff::writeResult(lld::coff::COFFLinkerContext&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:354:15
    #6 0x55a97985f7ed in lld::coff::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:2826:3
    #7 0x55a97984cdd3 in lld::coff::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:97:15
    #8 0x55a9797f9793 in lld::unsafeLldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:163:12
    #9 0x55a9797fa3b6 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:188:15
    #10 0x55a9797fa3b6 in void llvm::function_ref<void ()>::callback_fn<lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>)::$_0>(long) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
    #11 0x55a97966cb93 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:69:12
    #12 0x55a97966cb93 in llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:426:3
    #13 0x55a9797f9dc3 in lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:187:14
    #14 0x55a979627512 in lld_main(int, char**, llvm::ToolContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/tools/lld/lld.cpp:103:14
    #15 0x55a979628731 in main /usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/tools/lld/tools/lld/lld-driver.cpp:17:10
    #16 0x7ffb8b202c89 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #17 0x7ffb8b202d44 in __libc_start_main csu/../csu/libc-start.c:360:3
    #18 0x55a97953ef60 in _start (/usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/bin/lld+0x8fd1f60)
```
2025-01-21 | [LLD] [COFF] Fix linking MSVC generated implib header objects (#122811) | Martin Storsjö | 1 | -8/+23
ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b tried to fix cases when LLD links what seems to be import library header objects from MSVC. However, the fix seems incorrect; the review at https://reviews.llvm.org/D133627 concluded that if this (treating this kind of symbol as a common symbol) is what link.exe does, it's fine. However, this is most probably not what link.exe does. The symbol mentioned in the commit message of ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b would be a common symbol with a size of around 3 GB; this is not what might have been intended. That commit tried to avoid running into the error ".idata$4 should not refer to special section 0"; that issue is fixed for a similar style of section symbols in 4a4a8a1476b1386b523dc5b292ba9a5a6748a9cf. Therefore, revert ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b and extend the fix from 4a4a8a1476b1386b523dc5b292ba9a5a6748a9cf to also work for the section symbols in MSVC generated import libraries. The main detail about them, is that for symbols of type IMAGE_SYM_CLASS_SECTION, the Value field is not an offset, but it is an optional set of flags, corresponding to the Characteristics of the section header (although it may be empty).
2025-01-17 | [LLD][COFF] Process bitcode files separately for each symbol table on ARM64X (#123194) | Jacek Caban | 1 | -6/+13
2025-01-16 | [LLD] [COFF] Fix linking import libraries with -wholearchive: (#122806) | Martin Storsjö | 1 | -0/+20
When LLD links against an import library (for the regular, short import libraries), it doesn't actually link in the header/trailer object files at all, but synthesizes new corresponding data structures into the right sections. If the whole of such an import library is forced to be linked, e.g. with the -wholearchive: option, we actually end up linking in those header/trailer objects. The header objects contain a construct which LLD fails to handle; previously we'd error out with the error ".idata$4 should not refer to special section 0". Within the import library header object, in the import directory we have relocations towards the IAT (.idata$4 and .idata$5), but the header object itself doesn't contain any data for those sections. In the case of GNU generated import libraries, the header objects contain zero length sections .idata$4 and .idata$5, with relocations against them. However in the case of LLVM generated import libraries, the sections .idata$4 and .idata$5 are not included in the list of sections. The symbol table does contain section symbols for these sections, but without any actual associated section. This can probably be seen as a declaration of an empty section. If the header/trailer objects of a short import library are linked forcibly and we also reference other functions in the library, we end up with two import directory entries for this DLL, one that gets synthesized by LLD, and one from the actual header object file. This is inelegant, but should be acceptable. While it would seem unusual to link import libraries with the -wholearchive: option, this can happen in certain scenarios. Rust builds libraries that contain relevant import libraries bundled along with compiled Rust code as regular object files, all within one single archive. Such an archive can then end up linked with the -wholearchive: option, if build systems decide to use such an option for including static libraries.
This should fix https://github.com/msys2/MINGW-packages/issues/21017. This works for the header/trailer object files in import libraries generated by LLVM; import libraries generated by MSVC are slightly different. ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b attempted to fix the issue for MSVC generated libraries, but it's not entirely correct, and isn't enough for making things work for that case.
2025-01-14 | [LLD][COFF] Skip sections marked as IMAGE_SCN_LNK_INFO in the output image (#122752) | Jacek Caban | 1 | -1/+1
Fixes #106275.
2025-01-03 | [lld/COFF] Support thin archives in /reproduce: files (#121512) | Nico Weber | 1 | -0/+8
This already worked without /wholearchive; now it works with it too. (Only for thin archives containing relative file names, matching the ELF and Mach-O ports.)
2025-01-01 | [LLD][COFF] Move addFile implementation to LinkerDriver (NFC) (#121342) | Jacek Caban | 1 | -1/+1
The addFile implementation does not rely on the SymbolTable object. With #119294, the symbol table for input files is determined during the construction of the objects representing them. To clarify that relationship, this change moves the implementation from the SymbolTable class to the LinkerDriver class.
2024-12-19 | [lld/coff] Fix assert on /start-lib foo.obj /end-lib during eager loads (#120292) | Nico Weber | 1 | -1/+6
If foo.obj is eagerly loaded (due to a prior undef referencing one of its symbols) and has more than one symbol, we used to assert: SymbolTable::addLazyObject() for the first symbol would set `lazy` to false and load all symbols from the file, but the outer ObjFile::parseLazy() loop would continue to run and call addLazyObject() for the second symbol, which would assert. Instead, just stop adding lazy symbols if the file got loaded for real while adding a symbol. (The ELF port has a similar early exit in `ObjFile<ELFT>::parseLazy()`.)
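The early exit can be illustrated with a small model. This is a sketch under stated assumptions, not the real `ObjFile::parseLazy()`: `LazyFile` and the `addLazy` callback are hypothetical, with the callback standing in for `SymbolTable::addLazyObject()` and possibly clearing `lazy` when it loads the file eagerly.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a lazy object file and its symbol names.
struct LazyFile {
  std::vector<std::string> symbols;
  bool lazy = true;
};

// Registers the file's symbols as lazy one by one and returns how many
// were handed to addLazy. If addLazy loads the file for real (clearing
// file.lazy), the loop stops instead of registering further symbols.
int parseLazy(
    LazyFile &file,
    const std::function<void(LazyFile &, const std::string &)> &addLazy) {
  int registered = 0;
  for (const std::string &sym : file.symbols) {
    // This is the condition the old code violated on the second symbol.
    assert(file.lazy && "must not add lazy symbols for a loaded file");
    addLazy(file, sym);
    ++registered;
    if (!file.lazy)
      break; // the file got loaded while adding this symbol
  }
  return registered;
}
```

Without the `break`, the assertion would fire on the second symbol of an eagerly loaded file, which mirrors the failure the commit describes.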
2024-12-17 | [LLD][COFF] Introduce hybrid symbol table for EC input files on ARM64X (#119294) | Jacek Caban | 1 | -4/+5
2024-12-17 | [LLD][COFF] Create COFFObjectFile instance when constructing ObjFile (NFC) (#120144) | Jacek Caban | 1 | -17/+18
This change moves the creation of COFFObjectFile to the construction of ObjFile, instead of delaying it until parsing.
2024-12-15 | Revert "[LLD][COFF] Introduce hybrid symbol table for EC input files on ARM64X (#119294)" | Jacek Caban | 1 | -10/+8
This reverts commit a8206e7b37929f4754806667680ffba0206eef95 due to sanitizer failures.
2024-12-15 | [LLD][COFF] Introduce hybrid symbol table for EC input files on ARM64X (#119294) | Jacek Caban | 1 | -8/+10
On hybrid ARM64X targets, ARM64 and ARM64EC input files operate in separate namespaces and cannot reference each other. This change introduces separate `SymbolTable` instances and associates each `InputFile` with the appropriate table to reflect this behavior.
2024-12-15 | [LLD][COFF] Store machine type in SymbolTable (NFC) (#119298) | Jacek Caban | 1 | -4/+4
This change prepares for hybrid ARM64X support, which requires two `SymbolTable` instances: one for native symbols and one for EC symbols. In such cases, `config.machine` will remain ARM64X, while the `SymbolTable` instances will store ARM64 and ARM64EC machine types.
2024-12-15 | [LLD][COFF] Store reference to SymbolTable instead of COFFLinkerContext in InputFile (NFC) (#119296) | Jacek Caban | 1 | -92/+100
This change prepares for the introduction of separate hybrid namespaces. Hybrid images will require two `SymbolTable` instances, making it necessary to associate `InputFile` objects with the relevant one.
2024-12-05 | [lld-link] Simplify some << toString | Fangrui Song | 1 | -1/+1
2024-12-05 | [lld-link] Replace fatal(...) with Fatal | Fangrui Song | 1 | -14/+18
2024-12-05 | [lld-link] Replace error(...) with Err | Fangrui Song | 1 | -6/+6
2024-12-04 | [lld-link] Replace log(...) with Log | Fangrui Song | 1 | -6/+5
2024-12-03 | [lld-link] Replace warn(...) with Warn(ctx) | Fangrui Song | 1 | -1/+1
2024-12-03 | [lld-link] Add context-aware diagnostic functions (#118430) | Fangrui Song | 1 | -0/+5
Similar to #112319 for ELF. While there is some initial boilerplate, it can simplify some call sites that use Twine, especially when a printed element uses `ctx` or toString.
2024-11-09 | [LLD][COFF] Support ARM64EC in BitcodeFile::getMachineType (#115474) | Jacek Caban | 1 | -2/+3
2024-11-04 | [LLD][COFF] Add EC alias symbols for undefined x86_64 symbols on ARM64EC target (#114466) | Jacek Caban | 1 | -1/+16
2024-10-23 | [LLD][COFF] Allow overriding EC alias symbols with lazy archive symbols (#113283) | Jacek Caban | 1 | -6/+30
On ARM64EC, external function calls emit a pair of weak-dependency aliases: `func` to `#func` and `#func` to the `func` guest exit thunk (instead of a single undefined `func` symbol, which would be emitted on other targets). Allow such aliases to be overridden by lazy archive symbols, just as we would for undefined symbols.
2024-10-21 | [LLD][COFF] Support anti-dependency symbols (#112542) | Jacek Caban | 1 | -15/+24
Co-authored-by: Billy Laws <blaws05@gmail.com> Anti-dependency symbols are allowed to be duplicated, with the first definition taking precedence. If a regular weak alias is present, it is preferred over an anti-dependency definition. Chaining anti-dependencies is not allowed.
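The precedence rules in this message can be sketched as a tiny alias table. Everything here is hypothetical (`Alias`, `AliasTable`, `addAlias`, `resolve`), and two points are assumptions rather than statements from the commit: a second regular weak alias is ignored in favour of the first, and a forbidden chain is reported by returning no target.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical record: `target` is what the alias points to.
struct Alias {
  std::string target;
  bool antiDependency;
};

using AliasTable = std::map<std::string, Alias>;

// Records an alias for `name`. Duplicate anti-dependencies keep the
// first definition; a regular weak alias replaces an anti-dependency.
void addAlias(AliasTable &table, const std::string &name,
              const Alias &incoming) {
  assert(!incoming.target.empty() && "an alias needs a target");
  auto [it, inserted] = table.try_emplace(name, incoming);
  if (inserted)
    return;
  if (it->second.antiDependency && !incoming.antiDependency)
    it->second = incoming; // regular weak alias is preferred
  // Otherwise the first definition stays (assumed for weak vs. weak).
}

// Follows one alias. An anti-dependency whose target is itself an
// anti-dependency is a chain, which is not allowed, so nothing is
// returned in that case.
std::optional<std::string> resolve(const AliasTable &table,
                                   const std::string &name) {
  auto it = table.find(name);
  if (it == table.end())
    return std::nullopt;
  const Alias &alias = it->second;
  if (alias.antiDependency) {
    auto next = table.find(alias.target);
    if (next != table.end() && next->second.antiDependency)
      return std::nullopt;
  }
  return alias.target;
}
```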
2024-09-19 | [LLD][COFF] Process all ARM64EC import symbols in MapFile's getSymbols (#109118) | Jacek Caban | 1 | -1/+2
2024-09-17 | [LLD][COFF] Add Support for auxiliary IAT copy (#108610) | Jacek Caban | 1 | -0/+6
In addition to the auxiliary IAT, ARM64EC modules also contain a copy of it. At runtime, the auxiliary IAT is filled with the addresses of actual ARM64EC functions when possible. If patching is detected, the OS may use the IAT copy to revert the auxiliary IAT, ensuring that the call checker is used for calls to imported functions.
2024-09-13 | [LLD][COFF] Add Support for ARM64EC Import Thunks (#108460) | Jacek Caban | 1 | -2/+9
ARM64EC import thunks function similarly to regular ARM64 thunks but use a mangled name and perform the call through the auxiliary IAT.
2024-09-13 | [LLD][COFF][NFC] Store live flag in ImportThunkChunk. (#108459) | Jacek Caban | 1 | -1/+1
Instead of ImportFile. This is a preparation for ARM64EC support, which has both x86 and ARM64EC thunks and each of them needs a separate flag.
2024-09-12 | [LLD][COFF] Add support for ARM64EC auxiliary IAT (#108304) | Jacek Caban | 1 | -3/+23
In addition to the regular IAT, ARM64EC also includes an auxiliary IAT. At runtime, the regular IAT is populated with the addresses of imported functions, which may be x86_64 functions or the export thunks of ARM64EC functions. The auxiliary IAT contains versions of functions that are guaranteed to be directly callable by ARM64 code. The linker fills the auxiliary IAT with the addresses of `__impchk_` thunks. These thunks perform a call on the IAT address using `__icall_helper_arm64ec` with the target address from the IAT. If the imported function is an ARM64EC function, the OS may replace the address in the auxiliary IAT with the address of the ARM64EC version of the function (not its export thunk), avoiding the runtime call checker for better performance.
2024-09-11 | [LLD][COFF] Add support for ARM64EC import call thunks. (#107931) | Jacek Caban | 1 | -0/+7
These thunks can be accessed using `__impchk_*` symbols, though they are typically not called directly. Instead, they are used to populate the auxiliary IAT. When the imported function is x86_64 (or an ARM64EC function with a patched export thunk), the thunk is used to call it. Otherwise, the OS may replace the thunk at runtime with a direct pointer to the ARM64EC function to avoid the overhead.
2024-09-11 | [LLD][COFF][NFC] Create import thunks in ImportFile::parse. (#107929) | Jacek Caban | 1 | -2/+17
2024-09-04 | [LLD][COFF] Initial support for ARM64EC importlibs. (#107164) | Jacek Caban | 1 | -4/+19
Use demangled symbol name for __imp_ symbols and define demangled thunk symbol as AMD64 thunk.
2024-09-04 | [LLD][COFF][NFC] Store impSym as DefinedImportData in ImportFile. (#107162) | Jacek Caban | 1 | -2/+1
2024-09-02 | [LLD][COFF] Use archive's ECSYMBOLS on ARM64EC target when available. (#106904) | Jacek Caban | 1 | -0/+14
2024-08-26 | [LLD][COFF] Use parentName for import files in toString. (#106104) | Jacek Caban | 1 | -1/+1
Improves diagnostic messages.