path: root/lld/COFF/SymbolTable.cpp
Age | Commit message | Author | Files | Lines
2026-01-21 | [lld][COFF] Use `.contains` rather than `.count` for set membership. NFC (#177067) | Sam Clegg | 1 | -2/+2
Also converted a couple of `std::set` to `llvm::StringSet`/`llvm::SmallSet`. This matches the usage in the other linker backends. See #176610
2025-09-05 | [LLD][COFF] Add more `--time-trace` tags for ThinLTO linking (#156471) | Alexandre Ganea | 1 | -3/+5
In order to better see what's going on during ThinLTO linking, this PR adds more profile tags when using `--time-trace` on a `lld-link.exe` invocation. After this PR, linking `clang.exe`: [screenshot 2025-09-02 082021]. Linking a custom (Unreal Engine game) binary gives a completely different picture, probably because of the use of Unity files and the sheer number of input files (here, over 60 GB of .OBJs/.LIBs): [screenshot 2025-09-02 102048].
2025-08-22 | [LLD][COFF] Set isUsedInRegularObj for target symbols in resolveAlternateNames (#154837) | Jacek Caban | 1 | -0/+1
Fixes: #154595. Prior to commit bbc8346e6bb543b0a87f52114fed7d766446bee1, this flag was set by `insert()` from `addUndefined()`. Set it explicitly now.
2025-08-05 | [LLD][COFF] Don't resolve weak aliases when performing local import (#152000) | Jacek Caban | 1 | -12/+5
Fixes crashes reported in #151255. The alias may have already been stored for later resolution, which can lead to treating a resolved alias as if it were still undefined. Instead, use the alias target directly for the import. Also extended the test to make reproducing the problem more likely, and added an assert that catches the issue.
2025-07-31 | [LLD][COFF] Add support for ARM64X same-address thunks (#151255) | Jacek Caban | 1 | -5/+37
Fixes MSVC CRT thread-local constructors support on hybrid ARM64X targets. `-arm64xsameaddress` is an undocumented option that ensures the specified function has the same address in both native and EC views of hybrid images. To achieve this, the linker emits additional thunks and replaces the symbols of those functions with the thunk symbol (the same thunk is used in both views). The thunk code jumps to the native function (similar to range extension thunks), but additional ARM64X relocations are emitted to replace the target with the EC function in the EC view. MSVC appears to generate thunks even for non-hybrid ARM64EC images. As a side effect, the native symbol is pulled in. Since this is used in the CRT for thread-local constructors, it results in the image containing unnecessary native code. Because these thunks do not appear to be useful in that context, we limit this behavior to actual hybrid targets. This may change if compatibility requires it. The tricky part is that thunks should be skipped if the symbol is not live in either view, and symbol replacement must be reflected in weak aliases. This requires thunk generation to happen before resolving weak aliases but after the GC pass. To enable this, the `markLive` call was moved earlier, and the final weak alias resolution was postponed until afterward. This requires more code to be aware of weak aliases, which previously could assume they were already resolved.
2025-07-28 | [LLD][COFF] Avoid resolving symbols with -alternatename if the target is undefined (#149496) | Jacek Caban | 1 | -1/+13
This change fixes an issue with the use of `-alternatename` in the MSVC CRT on ARM64EC, where both mangled and demangled symbol names are specified. Without this patch, the demangled name could be resolved to an anti-dependency alias of the target. Since chaining anti-dependency aliases is not allowed, this results in an undefined symbol. The root cause isn't specific to ARM64EC; it can affect other targets as well, even when anti-dependency aliases aren't involved. The accompanying test case demonstrates a scenario where the symbol could be resolved from an archive. However, because the archive member is pulled in after the first pass of alternate name resolution, and archive members don't override weak aliases, eager resolution would incorrectly skip it.
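The rule described above (leave an alternate name unresolved while its target is still undefined) can be sketched as follows. This is a minimal illustration and not lld's code: `State`, `Sym`, `Table`, `NamePair` and this free-standing `resolveAlternateNames` are hypothetical simplifications of the real symbol table.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified symbol states (not lld's Symbol hierarchy).
enum class State { Undefined, Defined, WeakAlias };

struct Sym {
  State state = State::Undefined;
  std::string aliasTarget; // only meaningful when state == WeakAlias
};

using Table = std::map<std::string, Sym>;
using NamePair = std::pair<std::string, std::string>; // (from, to)

// Applies -alternatename:from=to pairs. 'from' becomes a weak alias of 'to'
// only if 'from' is still undefined and 'to' is already defined. Pairs whose
// target is still undefined are returned so a later pass can retry them.
std::vector<NamePair> resolveAlternateNames(Table &table,
                                            const std::vector<NamePair> &pairs) {
  std::vector<NamePair> pending;
  for (const NamePair &p : pairs) {
    const std::string &from = p.first;
    const std::string &to = p.second;
    assert(from != to && "an alternate name must differ from its target");
    auto f = table.find(from);
    if (f == table.end() || f->second.state != State::Undefined)
      continue; // nothing to resolve for this name
    auto t = table.find(to);
    if (t == table.end() || t->second.state == State::Undefined) {
      pending.push_back(p); // target unknown so far: do not alias eagerly
      continue;
    }
    f->second.state = State::WeakAlias;
    f->second.aliasTarget = to;
  }
  return pending;
}
```

Deferring the pair instead of aliasing eagerly leaves room for an archive member to define the symbol before the final resolution pass.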
2025-07-28 | [LLD][COFF] Move resolving alternate names to SymbolTable (NFC) (#149495) | Jacek Caban | 1 | -0/+25
2025-05-29 | [LLD][COFF] Add support for DLL imports on ARM64EC (#141587) | Jacek Caban | 1 | -1/+1
Define additional `__imp_aux_` and mangled lazy symbols. Also allow overriding EC aliases with lazy symbols, as we do for other lazy symbol types.
2025-05-29 | [LLD][COFF] Avoid forcing lazy symbols in loadMinGWSymbols during symbol table enumeration (#141593) | Jacek Caban | 1 | -0/+9
Forcing lazy symbols at this point may introduce new entries into the symbol table. Avoid mutating `symTab` while iterating over it.
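One common way to avoid mutating a container during enumeration is to collect the work first and perform it after the loop. The sketch below illustrates that pattern only; `Entry`, `Table`, `forceLazy` and `forceAllLazy` are hypothetical names, and the actual change in `loadMinGWSymbols` may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

struct Entry {
  bool lazy = false;
};

using Table = std::unordered_map<std::string, Entry>;

// Hypothetical stand-in for forcing a lazy symbol: loading the archive member
// defines the symbol and may add new entries to the table.
void forceLazy(Table &table, const std::string &name) {
  assert(table.count(name) == 1);
  table[name].lazy = false;
  table["dep_of_" + name]; // the loaded member references a new symbol
}

// Phase 1 only reads the table; phase 2 mutates it. Inserting while iterating
// could rehash the map and invalidate the iterator used by the loop.
std::size_t forceAllLazy(Table &table) {
  std::vector<std::string> pending;
  for (const auto &kv : table)
    if (kv.second.lazy)
      pending.push_back(kv.first);
  for (const std::string &name : pending)
    forceLazy(table, name);
  return pending.size();
}
```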
2025-05-15 | [LLD][COFF] Add support for including native ARM64 objects in ARM64EC images (#137653) | Jacek Caban | 1 | -1/+1
MSVC linker accepts native ARM64 object files as input with `-machine:arm64ec`, similar to `-machine:arm64x`. Its usefulness is very limited; for example, both exports and imports are not reflected in the PE structures and can't work. However, their symbol tables are otherwise functional. Since we already have handling of multiple symbol tables implemented for ARM64X, the required changes are mostly about adjusting relevant checks to account for them on the ARM64EC target. Delay-load helper handling is a bit of a shortcut. The patch never pulls it for native object files and just ensures that the code is fine with that. In general, I think it would be nice to adjust the driver to pull it only when it's actually referenced, which would allow applying the same logic to the native symbol table on ARM64EC without worrying about pulling too much.
2025-04-07 | [LLD][COFF] Don't dllimport from static libraries (#134443) | Alexandre Ganea | 1 | -8/+12
This reverts commit 6a1bdd9 and reinstates the behavior that matches MSVC link.exe: error out when trying to dllimport a symbol from a static library. A hint is now displayed in stdout, suggesting that the symbol be dllimported from an import library instead. Fixes https://github.com/llvm/llvm-project/issues/131807
2025-03-15 | [LLD][COFF] Clarify EC vs. native symbols in diagnostics on ARM64X (#130857) | Jacek Caban | 1 | -10/+17
On ARM64X, symbol names alone are ambiguous as they may refer to either a native or an EC symbol. Append '(EC symbol)' or '(native symbol)' in diagnostic messages to distinguish them.
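A minimal sketch of the suffixing idea, assuming a hypothetical helper `describeSymbol` and a simplified `Machine` enum; the exact spacing and call sites in lld's diagnostics are not taken from the source.

```cpp
#include <cassert>
#include <string>

enum class Machine { AMD64, ARM64, ARM64EC, ARM64X };

// Hypothetical helper: on ARM64X the same name can exist in both symbol
// tables, so the diagnostic text says which one is meant. On other targets
// the name is unambiguous and is returned unchanged.
std::string describeSymbol(const std::string &name, Machine imageMachine,
                           bool isEC) {
  assert(!name.empty());
  if (imageMachine != Machine::ARM64X)
    return name;
  return name + (isEC ? " (EC symbol)" : " (native symbol)");
}
```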
2025-03-03 | [LLD][COFF] Support -aligncomm directives on ARM64X (#129513) | Jacek Caban | 1 | -0/+15
2025-02-21 | [LLD][COFF] Support alternate names in both symbol tables on ARM64X (#127619) | Jacek Caban | 1 | -0/+11
The `.drectve` directive applies only to the namespace in which it is defined, while the command-line argument applies only to the EC namespace.
2025-01-24 | [lld/COFF] Fix -start-lib / -end-lib more after reviews.llvm.org/D116434 (#124294) | Nico Weber | 1 | -0/+4
This is a follow-up to #120452 in a way. Since lld/COFF does not yet insert all defined symbols in an obj file before all undefined ones (ELF and MachO do this, see #67445 and things linked from there), it's possible that:
1. We add an obj file a.obj
2. a.obj contains an undefined that's in b.obj, causing b.obj to be added
3. b.obj contains an undefined that's in a part of a.obj that's not yet in the symbol table, causing a recursive load of a.obj, which adds the symbols in there twice, leading to duplicate symbol errors.
For normal archives, `ArchiveFile::addMember()` has a `seen` check to prevent this. For start-lib lazy objects, we can just check if the archive is still lazy at the recursive call. This bug is similar to issue #59162. (Eventually, we'll probably want to do what the MachO and ELF ports do.) Includes a test that caused duplicate symbol diagnostics before this code change.
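The recursion in steps 1 to 3 and the "still lazy?" guard can be sketched as below. This is an illustration under simplifying assumptions: `LazyObj`, its `deps` list and this free-standing `forceLazy` are hypothetical, and real symbol resolution is reduced to a list of objects that get pulled in while parsing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a -start-lib lazy object file.
struct LazyObj {
  std::string name;
  bool lazy = true;
  int timesParsed = 0;
  std::vector<LazyObj *> deps; // objects pulled in while parsing this one
};

// Loads 'obj' unless a load already started. Clearing 'lazy' before parsing
// means a recursive request for the same object becomes a no-op instead of
// adding its symbols a second time.
void forceLazy(LazyObj &obj, std::vector<std::string> &loadOrder) {
  if (!obj.lazy)
    return; // recursive or repeated request: already being loaded
  obj.lazy = false;
  ++obj.timesParsed;
  loadOrder.push_back(obj.name);
  for (LazyObj *dep : obj.deps) {
    assert(dep != nullptr);
    forceLazy(*dep, loadOrder);
  }
}
```

With a.obj and b.obj referencing each other, each object is parsed exactly once.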
2025-01-22 | [LLD][COFF] Use EC symbol table for exports defined in module definition files (#123849) | Jacek Caban | 1 | -0/+63
2025-01-21 | [LLD][COFF] Separate EC and native exports for ARM64X (#123652) | Jacek Caban | 1 | -0/+135
Store exports in SymbolTable instead of Configuration.
2025-01-17 | [LLD][COFF] Process bitcode files separately for each symbol table on ARM64X (#123194) | Jacek Caban | 1 | -9/+8
2025-01-16 | [LLD][COFF] Move getChunk to LinkerDriver (NFC) (#123103) | Jacek Caban | 1 | -9/+0
The `getChunk` function returns all chunks, not just those specific to a symbol table. Move it out of the `SymbolTable` class to clarify its scope.
2025-01-15 | [LLD][COFF] Move symbol mangling and lookup helpers to SymbolTable class (NFC) (#122836) | Jacek Caban | 1 | -0/+106
This refactor prepares for further ARM64X hybrid support, where these helpers will need to work with either the native or EC symbol table based on context.
2025-01-13 | [LLD][COFF] Use appropriate symbol table for -include argument on ARM64X (#122554) | Jacek Caban | 1 | -0/+29
Move `LinkerDriver::addUndefined` to `SymbolTable` to allow its use with both symbol tables on ARM64X and rename it to `addGCRoot` to clarify its distinct role compared to the existing `SymbolTable::addUndefined`. Command-line `-include` arguments now apply to the EC symbol table, with `mainSymtab` introduced in `linkerMain`. There will be more similar cases. For `.drectve` sections, the corresponding symbol table is used based on the context.
2025-01-02 | [LLD][COFF] Emit warnings for missing load config on EC targets (#121339) | Jacek Caban | 1 | -0/+9
ARM64EC and ARM64X images require a load configuration to be valid.
2025-01-01 | [LLD][COFF] Move addFile implementation to LinkerDriver (NFC) (#121342) | Jacek Caban | 1 | -68/+2
The addFile implementation does not rely on the SymbolTable object. With #119294, the symbol table for input files is determined during the construction of the objects representing them. To clarify that relationship, this change moves the implementation from the SymbolTable class to the LinkerDriver class.
2024-12-29 | [LLD][COFF] Store and validate load config in SymbolTable (#120324) | Jacek Caban | 1 | -0/+47
Improve diagnostics for invalid load configurations.
2024-12-19 | [lld/COFF] Fix -start-lib / -end-lib after reviews.llvm.org/D116434 (#120452) | Nico Weber | 1 | -0/+1
That change forgot to set `lazy` to false before calling `addFile()` in `forceLazy()`. As a result, `addFile()` added the file we wanted to force-load as a lazy object again instead of adding it to `ctx.objFileInstances`. This is caught by a pretty simple test (included).
2024-12-17 | [lld/COFF] Remove needless indirection | Nico Weber | 1 | -1/+1
`symtab.ctx.symtab` is just `symtab`. Looks like #119296 added this using a global find-and-replace. This was the only instance of `symtab.ctx.symtab` in lld/. No behavior change.
2024-12-15 | [LLD][COFF] Store machine type in SymbolTable (NFC) (#119298) | Jacek Caban | 1 | -12/+9
This change prepares for hybrid ARM64X support, which requires two `SymbolTable` instances: one for native symbols and one for EC symbols. In such cases, `config.machine` will remain ARM64X, while the `SymbolTable` instances will store ARM64 and ARM64EC machine types.
2024-12-15 | [LLD][COFF] Factor out LinkerDriver::setMachine (NFC) (#119297) | Jacek Caban | 1 | -2/+1
2024-12-15 | [LLD][COFF] Store reference to SymbolTable instead of COFFLinkerContext in InputFile (NFC) (#119296) | Jacek Caban | 1 | -3/+3
This change prepares for the introduction of separate hybrid namespaces. Hybrid images will require two `SymbolTable` instances, making it necessary to associate `InputFile` objects with the relevant one.
2024-12-05 | [lld-link] Simplify some << toString | Fangrui Song | 1 | -28/+19
2024-12-05 | [lld-link] Use COFFSyncStream | Fangrui Song | 1 | -9/+5
Add an `operator<<` overload for `Symbol *`.
2024-12-05 | [lld-link] Replace error(...) with Err | Fangrui Song | 1 | -8/+10
2024-12-04 | [lld-link] Replace log(...) with Log | Fangrui Song | 1 | -11/+12
2024-12-03 | [lld-link] Replace warn(...) with Warn(ctx) | Fangrui Song | 1 | -11/+13
2024-11-24 | [LLD][COFF] Require explicit specification of ARM64EC target (#116281) | Jacek Caban | 1 | -4/+18
Inferring the ARM64EC target can lead to errors. The `-machine:arm64ec` option may include x86_64 input files, and any valid ARM64EC input is also valid for `-machine:arm64x`. MSVC requires an explicit `-machine` argument with informative diagnostics; this patch adopts the same behavior.
2024-11-15 | [LLD][COFF] Fix handling of invalid ARM64EC function names (#116252) | Jacek Caban | 1 | -1/+4
Since these symbols cannot be mangled or demangled, there is no symbol to check for conflicts in `checkLazyECPair`, nor is there an alias to create in `addUndefined`. Attempting to create an import library with such symbols results in an error; the patch includes a test to ensure the error is handled correctly. This is a follow-up to #115567.
2024-11-06 | [LLD][COFF] Add support for locally imported EC symbols (#114985) | Jacek Caban | 1 | -7/+24
Allow imported symbols to be recognized in both mangled and demangled forms. Support __imp_aux_ symbols in addition to __imp_ symbols.
2024-10-23 | [LLD][COFF] Check both mangled and demangled symbols before adding a lazy archive symbol to the symbol table on ARM64EC (#113284) | Jacek Caban | 1 | -0/+42
On ARM64EC, a function symbol may appear in both mangled and demangled forms:
- ARM64EC archives contain only the mangled name, while the demangled symbol is defined by the object file as an alias.
- x86_64 archives contain only the demangled name (the mangled name is usually defined by an object referencing the symbol as an alias to a guest exit thunk).
- ARM64EC import files contain both the mangled and demangled names for thunks.
If more than one archive defines the same function, this could lead to different libraries being used for the same function depending on how they are referenced. Avoid this by checking if the paired symbol is already defined before adding a symbol to the table.
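The pairing check can be sketched as follows. This is a simplification under stated assumptions: only C-style names are handled (`func` paired with `#func`), C++ mangled names are ignored, and `pairedName` and `addLazyArchiveSymbol` are hypothetical helpers operating on a plain set of names rather than lld's symbol table.

```cpp
#include <cassert>
#include <set>
#include <string>

// Simplified ARM64EC pairing for C-style function names only:
// the demangled form "func" pairs with the mangled form "#func".
std::string pairedName(const std::string &name) {
  assert(!name.empty());
  if (name[0] == '#')
    return name.substr(1);
  return "#" + name;
}

// Adds a lazy archive symbol unless its paired form is already in the table,
// so the first archive that provides the function wins regardless of which
// spelling a later archive uses. Returns true if the symbol was added.
bool addLazyArchiveSymbol(std::set<std::string> &table,
                          const std::string &name) {
  if (table.count(pairedName(name)) != 0)
    return false; // paired symbol already provided elsewhere
  return table.insert(name).second;
}
```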
2024-10-23 | [LLD][COFF] Allow overriding EC alias symbols with lazy archive symbols (#113283) | Jacek Caban | 1 | -4/+6
On ARM64EC, external function calls emit a pair of weak-dependency aliases: `func` to `#func` and `#func` to the `func` guest exit thunk (instead of a single undefined `func` symbol, which would be emitted on other targets). Allow such aliases to be overridden by lazy archive symbols, just as we would for undefined symbols.
2024-10-21 | [LLD][COFF] Support anti-dependency symbols (#112542) | Jacek Caban | 1 | -1/+1
Co-authored-by: Billy Laws <blaws05@gmail.com> Anti-dependency symbols are allowed to be duplicated, with the first definition taking precedence. If a regular weak alias is present, it is preferred over an anti-dependency definition. Chaining anti-dependencies is not allowed.
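The three rules above (first anti-dependency definition wins, a regular weak alias is preferred, anti-dependencies do not chain) can be sketched like this. `Alias`, `Aliases`, `addAlias` and `resolve` are hypothetical names; keeping the first of two regular aliases and returning an empty string for an unresolvable name are assumptions of the sketch, not behavior taken from the source.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

struct Alias {
  std::string target;
  bool antiDep;
};

using Aliases = std::map<std::string, Alias>;

// Records an alias. Duplicate anti-dependency definitions are allowed and the
// first one is kept; a regular weak alias replaces an anti-dependency one.
void addAlias(Aliases &aliases, const std::string &name,
              const std::string &target, bool antiDep) {
  assert(name != target && "a symbol cannot alias itself");
  auto it = aliases.find(name);
  if (it == aliases.end()) {
    aliases.emplace(name, Alias{target, antiDep});
    return;
  }
  if (it->second.antiDep && !antiDep)
    it->second = Alias{target, false};
}

// Follows aliases from 'name'. Returns the final non-alias name, or an empty
// string if two anti-dependency aliases are chained or a cycle is found.
std::string resolve(const Aliases &aliases, const std::string &name) {
  std::string cur = name;
  bool cameFromAntiDep = false;
  for (std::size_t hops = 0; hops <= aliases.size(); ++hops) {
    auto it = aliases.find(cur);
    if (it == aliases.end())
      return cur; // reached a name that is not an alias
    if (cameFromAntiDep && it->second.antiDep)
      return ""; // chaining anti-dependencies is not allowed
    cameFromAntiDep = it->second.antiDep;
    cur = it->second.target;
  }
  return ""; // cycle
}
```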
2024-10-15 | [lld] Avoid repeated hash lookups (NFC) (#112299) | Kazu Hirata | 1 | -5/+3
2024-10-03 | [LLD][COFF] Do as many passes of resolveRemainingUndefines as necessary for undefined lazy symbols (#109082) | Mike Hommey | 1 | -1/+8
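Repeating a pass "as many times as necessary" is a fixed-point loop: keep going while a pass loads something new, because each loaded archive member may introduce further undefined symbols. The sketch below shows that loop shape only; `Archive` and this free-standing `resolveRemainingUndefines` are hypothetical and model an archive as a map from a symbol to the symbols its member references.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical archive model: symbol name -> symbols its member references.
using Archive = std::map<std::string, std::vector<std::string>>;

// Runs passes until one of them loads nothing new. Returns the pass count.
std::size_t resolveRemainingUndefines(std::set<std::string> &undefined,
                                      std::set<std::string> &defined,
                                      const Archive &lazy) {
  std::size_t passes = 0;
  bool loadedAny = true;
  while (loadedAny) {
    loadedAny = false;
    ++passes;
    // Snapshot the set: loading a member adds new undefined symbols, and
    // those are handled by the next pass.
    std::vector<std::string> snapshot(undefined.begin(), undefined.end());
    for (const std::string &name : snapshot) {
      auto it = lazy.find(name);
      if (it == lazy.end())
        continue; // genuinely undefined, left for error reporting
      undefined.erase(name);
      defined.insert(name);
      assert(undefined.count(name) == 0);
      loadedAny = true;
      for (const std::string &ref : it->second)
        if (defined.count(ref) == 0)
          undefined.insert(ref);
    }
  }
  return passes;
}
```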
2024-09-18 | [LLD][COFF] Store __imp_ symbols as Defined in InputFile (#109115) | Jacek Caban | 1 | -3/+3
2024-09-18 | [LLD][COFF] Handle imported weak aliases consistently (#109105) | Mike Hommey | 1 | -0/+8
symTab being a DenseMap, the order in which a symbol and its corresponding import symbol are processed is not guaranteed, and when the latter comes first, it is left undefined.
2024-09-17 | [LLD][COFF] Redirect __imp_ Symbols to __imp_aux_ on ARM64EC for x64 object files (#108608) | Jacek Caban | 1 | -0/+17
On ARM64EC, __imp_ symbols reference the auxiliary IAT, while __imp_aux_ symbols reference the regular IAT. However, x86_64 code expects both to reference the regular IAT. This change adjusts the symbols accordingly, matching the behavior observed in the MSVC linker.
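The name redirection can be sketched as a pure string mapping. `importNameFor` and the `fromX64Object` flag are hypothetical; the sketch only shows which name an x86_64 object's `__imp_` reference would be looked up under, not how lld rewrites symbols internally.

```cpp
#include <cassert>
#include <string>

// On ARM64EC, __imp_foo refers to the auxiliary IAT and __imp_aux_foo to the
// regular IAT. x86_64 code expects the regular IAT, so its __imp_ references
// are looked up under the __imp_aux_ name. Other names pass through.
std::string importNameFor(const std::string &name, bool fromX64Object) {
  assert(!name.empty());
  const std::string imp = "__imp_";
  const std::string aux = "__imp_aux_";
  if (!fromX64Object)
    return name;
  if (name.compare(0, aux.size(), aux) == 0)
    return name; // already refers to the regular IAT
  if (name.compare(0, imp.size(), imp) == 0)
    return aux + name.substr(imp.size());
  return name;
}
```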
2024-09-15 | [lld] Nits on uses of raw_string_ostream (NFC) | JOE1994 | 1 | -4/+4
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Strip calls to raw_string_ostream::str(), to avoid an excess layer of indirection.
2024-09-12 | [LLD][COFF] Add support for ARM64EC auxiliary IAT (#108304) | Jacek Caban | 1 | -3/+4
In addition to the regular IAT, ARM64EC also includes an auxiliary IAT. At runtime, the regular IAT is populated with the addresses of imported functions, which may be x86_64 functions or the export thunks of ARM64EC functions. The auxiliary IAT contains versions of functions that are guaranteed to be directly callable by ARM64 code. The linker fills the auxiliary IAT with the addresses of `__impchk_` thunks. These thunks perform a call on the IAT address using `__icall_helper_arm64ec` with the target address from the IAT. If the imported function is an ARM64EC function, the OS may replace the address in the auxiliary IAT with the address of the ARM64EC version of the function (not its export thunk), avoiding the runtime call checker for better performance.
2024-09-11 | [LLD][COFF] Add support for ARM64EC import call thunks. (#107931) | Jacek Caban | 1 | -1/+15
These thunks can be accessed using `__impchk_*` symbols, though they are typically not called directly. Instead, they are used to populate the auxiliary IAT. When the imported function is x86_64 (or an ARM64EC function with a patched export thunk), the thunk is used to call it. Otherwise, the OS may replace the thunk at runtime with a direct pointer to the ARM64EC function to avoid the overhead.
2024-09-11 | [LLD][COFF][NFC] Create import thunks in ImportFile::parse. (#107929) | Jacek Caban | 1 | -2/+2
2024-09-04 | [LLD][COFF][NFC] Store impSym as DefinedImportData in ImportFile. (#107162) | Jacek Caban | 1 | -2/+2