path: root/lld/include
Age  Commit message  Author  Files  Lines
2025-06-24[lld][BP] Fix duplicate section size measurement (#145384)Ellis Hoag1-2/+2
2025-06-23[lld][BP] Print total size of startup symbols (#145106)Ellis Hoag1-19/+35
A good proxy to estimate the number of page faults during startup is the total size of startup functions. Assuming profiles are up-to-date, we can measure this total size pretty easily. Note that if profile data is old, this number could be wrong.
2025-06-03[lld][macho] Strip .__uniq. and .llvm. hashes in -order_file (#140670)SharonXSharon2-16/+34
```
/// Symbols can be appended with "(.__uniq.xxxx)?.llvm.yyyy" where "xxxx" and
/// "yyyy" are numbers that could change between builds. We need to use the root
/// symbol name before this suffix so these symbols can be matched with profiles
/// which may have different suffixes.
```

Just like what we are doing in BP (https://github.com/llvm/llvm-project/blob/main/lld/MachO/BPSectionOrderer.cpp#L127), the patch removes the suffixes when parsing the order file and getting the symbol priority to have a better symbol match.

---------

Co-authored-by: Sharon Xu <sharonxu@fb.com>
Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>
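The suffix stripping described in this commit can be sketched as follows. This is only an illustration, not the code in lld: the helper names `isAllDigits`, `dropNumericSuffix`, and `stripHashSuffixes` are hypothetical, and the sketch assumes the hashes are plain decimal digits, as the quoted comment suggests.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: true if s is non-empty and contains only decimal digits.
static bool isAllDigits(std::string_view s) {
  if (s.empty())
    return false;
  for (unsigned char c : s)
    if (!std::isdigit(c))
      return false;
  return true;
}

// Hypothetical helper: drops a trailing "<marker><digits>" from name, if present.
static std::string_view dropNumericSuffix(std::string_view name,
                                          std::string_view marker) {
  std::size_t pos = name.rfind(marker);
  if (pos == std::string_view::npos)
    return name;
  if (!isAllDigits(name.substr(pos + marker.size())))
    return name;
  return name.substr(0, pos);
}

// Strips an optional ".llvm.<digits>" and then an optional ".__uniq.<digits>".
std::string_view stripHashSuffixes(std::string_view name) {
  name = dropNumericSuffix(name, ".llvm.");
  name = dropNumericSuffix(name, ".__uniq.");
  return name;
}
```

With this sketch, an order-file entry such as `foo.__uniq.123.llvm.456` and a profile entry `foo.llvm.789` both reduce to `foo`, so they can be matched even when the hashes differ between builds.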
2025-04-19[lld] Use llvm::unique (NFC) (#136453)Kazu Hirata1-1/+1
2025-02-13[lld][BP] Order .Tgm symbols for startup (#126328)Ellis Hoag1-4/+7
The Global Function Merger (https://discourse.llvm.org/t/rfc-global-function-merging/82608) pass optimistically creates merged instances of functions and suffixes their names with `.Tgm`. Then in the linker, ICF will (hopefully) fold these `.Tgm` functions. For example, a function `foo` might become a thunk `foo` that calls a merged function `foo.Tgm`. Since IRPGO runs before the global merger, we will only have a profile for `foo`. We want to correlate this profile to both `foo` and `foo.Tgm` so they can both be ordered to improve startup time.

I built a large binary and found that it increased the number of functions ordered for startup, as expected.

```
Functions for startup: 12049 -> 12697
Functions for compression: 34733 -> 34707
```

The reason why we don't see a larger improvement is because there are some cases where the code was accidentally working: `getRootSymbol("foo.llvm.5555.Tgm")` already returns `foo`.
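To make the correlation concrete, here is a rough sketch of grouping linked symbols by a root name so that one profile entry for `foo` covers both `foo` and `foo.Tgm`. The functions `rootName` and `groupByRoot` are hypothetical and are not lld's `getRootSymbol`; the sketch assumes that cutting at the last `.llvm.` marker and dropping a trailing `.Tgm` is enough.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical root-name function: cut at the last ".llvm." marker (which also
// discards anything after it), then drop a trailing ".Tgm" if one remains.
std::string rootName(std::string_view name) {
  if (std::size_t pos = name.rfind(".llvm."); pos != std::string_view::npos)
    name = name.substr(0, pos);
  constexpr std::string_view tgm = ".Tgm";
  if (name.size() > tgm.size() &&
      name.substr(name.size() - tgm.size()) == tgm)
    name.remove_suffix(tgm.size());
  return std::string(name);
}

// Groups symbols by root name so a profile entry for "foo" can be correlated
// with every linked symbol that shares that root.
std::map<std::string, std::vector<std::string>>
groupByRoot(const std::vector<std::string> &symbols) {
  std::map<std::string, std::vector<std::string>> groups;
  for (const std::string &sym : symbols)
    groups[rootName(sym)].push_back(sym);
  return groups;
}
```

In this sketch `foo.llvm.5555.Tgm` maps to `foo` through the `.llvm.` cut alone, which mirrors the "accidentally working" case the commit message mentions, while `foo.Tgm` needs the explicit `.Tgm` handling.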
2025-02-04[ELF] Add BPSectionOrderer options (#125559)Fangrui Song1-1/+6
Reland #120514 after 2f6e3df08a8b7cd29273980e47310cf09c6fdbd8 fixed iteration order issue and libstdc++/libc++ differences.

---

Both options instruct the linker to optimize section layout with the following goals:

* `--bp-compression-sort=[data|function|both]`: Improve Lempel-Ziv compression by grouping similar sections together, resulting in a smaller compressed app size.
* `--bp-startup-sort=function --irpgo-profile=<file>`: Utilize a temporal profile file to reduce page faults during program startup.

The linker determines the section order by considering three groups:

* Function sections ordered according to the temporal profile (`--irpgo-profile=`), prioritizing early-accessed and frequently accessed functions.
* Function sections. Sections containing similar functions are placed together, maximizing compression opportunities.
* Data sections. Similar data sections are placed together.

Within each group, the sections are ordered using the Balanced Partitioning algorithm. The linker constructs a bipartite graph with two sets of vertices: sections and utility vertices.

* For profile-guided function sections:
  + The number of utility vertices is determined by the symbol order within the profile file.
  + If `--bp-compression-sort-startup-functions` is specified, extra utility vertices are allocated to prioritize nearby function similarity.
* For sections ordered for compression: Utility vertices are determined by analyzing k-mers of the section content and relocations.

The call graph profile is disabled during this optimization.

When `--symbol-ordering-file=` is specified, sections described in that file are placed earlier.

Co-authored-by: Pengying Xu <xpy66swsry@gmail.com>
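As an illustration of the k-mer idea for compression ordering, the sketch below derives a set of utility identifiers from section bytes and counts how many two sections share. It is a simplification under stated assumptions: `k` is fixed at 4, each window is read as a little-endian 32-bit value, relocations are ignored, and the function names `contentKmers` and `sharedKmers` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Collects the distinct 4-byte windows (k-mers with k = 4, an assumption) of a
// section's content, each interpreted as a little-endian 32-bit value.
std::set<std::uint32_t> contentKmers(const std::vector<std::uint8_t> &data) {
  std::set<std::uint32_t> kmers;
  for (std::size_t i = 0; i + 4 <= data.size(); ++i) {
    std::uint32_t v = std::uint32_t(data[i]) |
                      std::uint32_t(data[i + 1]) << 8 |
                      std::uint32_t(data[i + 2]) << 16 |
                      std::uint32_t(data[i + 3]) << 24;
    kmers.insert(v);
  }
  return kmers;
}

// Counts k-mers common to two sections; sections with many shared k-mers are
// candidates for being placed next to each other.
std::size_t sharedKmers(const std::vector<std::uint8_t> &a,
                        const std::vector<std::uint8_t> &b) {
  std::set<std::uint32_t> ka = contentKmers(a), kb = contentKmers(b);
  std::size_t n = 0;
  for (std::uint32_t k : ka)
    n += kb.count(k);
  return n;
}
```

Each distinct k-mer plays the role of a utility vertex connected to every section containing it. Balanced Partitioning itself, which is not shown here, then tries to keep sections that share many utility vertices close together.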
2025-02-03BPSectionOrderer: stabilize iteration order and node orderFangrui Song1-6/+7
Exposed by the test added in the reverted #120514

* Fix libstdc++/libc++ differences due to nth_element. https://github.com/llvm/llvm-project/pull/125450#issuecomment-2631404178
* Fix LLVM_ENABLE_REVERSE_ITERATION=1 differences
* Fix potential issue in `currentSize += D::getSize(*sections[*sectionIdxs.begin()])` where DenseSet was used, though not covered by a test
2025-02-03Revert "[ELF] Add BPSectionOrderer options (#120514)"Hans Wennborg1-14/+8
The ELF/bp-section-orderer.s test is failing on some buildbots due to what seems like non-determinism issues, see comments on the original PR and #125450 Reverting to green the build. This reverts commit 0154dce8d39d2688b09f4e073fe601099a399365 and follow-up commits 046dd4b28b9c1a75a96cf63465021ffa9fe1a979 and c92f20416e6dbbde9790067b80e75ef1ef5d0fa4.
2025-02-02[lld] BPSectionOrderer: stabilize iteration orderFangrui Song1-6/+6
2025-02-02[lld] BPSectionOrderer: stabilize iteration order with MapVectorFangrui Song1-1/+2
2025-02-02[ELF] Add BPSectionOrderer options (#120514)Pengying Xu1-1/+6
Add new ELF linker options for profile-guided section ordering optimizations:

- `--irpgo-profile=<file>`: Read IRPGO profile data for use with startup and compression optimizations
- `--bp-startup-sort={none,function}`: Order sections based on profile data to improve startup time
- `--bp-compression-sort={none,function,data,both}`: Order sections using balanced partitioning to improve compressed size
- `--bp-compression-sort-startup-functions`: Additionally optimize startup functions for compression
- `--verbose-bp-section-orderer`: Print statistics about balanced partitioning section ordering

Thanks to @ellishg, @thevinster, and their team for their work.

---------

Co-authored-by: Fangrui Song <i@maskray.me>
2025-02-02[lld] BPSectionOrderer: replace Symbol with Defined and optimize getSymbols. NFCFangrui Song1-8/+8
2025-01-27[lld-macho] Refactor BPSectionOrderer with CRTP. NFCFangrui Song2-70/+407
PR #117514 refactored BPSectionOrderer to be used by the ELF port but introduced some inefficiency:

* BPSectionBase/BPSymbol are wrappers around a single pointer. The numbers of sections and symbols could be huge, and the extra allocations are memory inefficient.
* Reconstructing the returned DenseMap (since BPSectionBase != InputSection) is wasteful.

This patch refactors BPSectionOrderer with the Curiously Recurring Template Pattern and eliminates the inefficiency. In addition, `symbolToSectionIdxs` is removed and `rootSymbolToSectionIdxs` building is moved to lld/MachO: while getting sections for symbols is cheap in Mach-O, it is awkward and inefficient in the ELF port.

While here, add a file-level comment and replace some `StringMap<*>` (which copies strings) with `DenseMap<CachedHashStringRef, *>`.

Pull Request: https://github.com/llvm/llvm-project/pull/124482
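For readers unfamiliar with the pattern, here is a minimal sketch of how CRTP lets a shared base call format-specific code without wrapper objects or virtual dispatch. All names (`OrdererBase`, `ToySection`, `ToyOrderer`, `totalSize`) are hypothetical; only the `D::getSize(...)` call shape is borrowed from the log.

```cpp
#include <cstdint>
#include <vector>

// Generic base: holds the shared algorithm and asks the derived class D for
// format-specific details through static calls.
template <class D> struct OrdererBase {
  template <class Section>
  std::uint64_t totalSize(const std::vector<Section> &sections) const {
    std::uint64_t total = 0;
    for (const Section &s : sections)
      total += D::getSize(s); // resolved at compile time, no virtual call
    return total;
  }
};

// Stand-in for a port-specific section type.
struct ToySection {
  std::uint64_t size;
};

// A port supplies its own accessors by deriving from the base with itself as
// the template argument.
struct ToyOrderer : OrdererBase<ToyOrderer> {
  static std::uint64_t getSize(const ToySection &s) { return s.size; }
};
```

Because the base works directly on the port's own section type, no per-section wrapper needs to be allocated, which is the inefficiency the commit message describes.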
2025-01-16[lld-macho,BalancedPartition] Simplify relocation hash and avoid xxHashFangrui Song1-9/+0
xxHash, inferior to xxh3, is discouraged. We try not to use xxhash in lld. Switch to read32le for content hash and xxh3/stable_hash_combine for relocation hash. Remove the intermediate std::string for relocation hash. Change the tail hashing scheme to consider individual bytes instead. This helps group 0102 and 0201 together. The benefit is negligible, though. Pull Request: https://github.com/llvm/llvm-project/pull/121729
2025-01-10[lld-macho,NFC] Switch to increasing prioritiesFangrui Song1-4/+4
--order_file, call graph profile, and BalancedPartitioning currently build the section order vector by decreasing priority (from SIZE_MAX to 0). However, it's conventional to use an increasing key (see OutputSection::inputOrder). Switch to increasing priorities, remove the global variable highestAvailablePriority, and remove the highestAvailablePriority parameter from BPSectionOrderer. Change size_t to int. This improves consistency with the ELF and COFF ports. The ELF port utilizes negative priorities for --symbol-ordering-file and call graph profile, and non-negative priorities for --shuffle-sections (no Mach-O counterpart yet). Pull Request: https://github.com/llvm/llvm-project/pull/121727
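The increasing-key convention can be illustrated with a small sketch. It assumes, purely for illustration, that explicitly ordered sections receive negative `int` priorities and all other sections keep priority 0; the type `PrioritizedSection` and the function `sortByPriority` are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical section record with a signed priority key.
struct PrioritizedSection {
  std::string name;
  int priority; // smaller values are placed earlier
};

// Sorts by increasing priority. stable_sort keeps the input order among
// sections that share the same priority.
void sortByPriority(std::vector<PrioritizedSection> &sections) {
  std::stable_sort(sections.begin(), sections.end(),
                   [](const PrioritizedSection &a, const PrioritizedSection &b) {
                     return a.priority < b.priority;
                   });
}
```

With a signed key there is no need to count down from `SIZE_MAX`: sections with an explicit order sort ahead of the default-priority ones, which stay in their original relative order.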
2025-01-05[lld-macho] Remove redundant hasValidData. NFCFangrui Song1-1/+0
lld::macho::runBalancedPartitioning ensures that all sections satisfy `hasValidData`.
2024-12-18[lld] Move BPSectionOrderer from MachO to Common for reuse in ELF (#117514)Max1-0/+80
Add lld/Common/BPSectionOrdererBase from MachO for reuse in ELF
2024-11-29[ELF] Change getSrcMsg to use ELFSyncStream. NFCFangrui Song1-0/+1
2024-11-24[ELF] Simplify reportUndefinedSymbol. NFCFangrui Song1-2/+4
2024-11-23[ELF] Simplify reportMissingFeature. NFCFangrui Song1-1/+1
2024-11-16[lld] Use context-aware outs() and errs()Fangrui Song2-4/+0
For COFF and ELF that are mostly free of global states, lld::errs() and lld::outs() should not be used. This migration change allows us to remove lld::errs, which uses the global errorHandler().
2024-11-16[ELF] Make checkError context-awareFangrui Song1-0/+1
2024-11-16[ELF] Replace message(...) with Msg(ctx)Fangrui Song1-1/+1
2024-11-16[ELF] Replace internalLinkerError(getErrorLoc(ctx, buf) + ...) with InternalErr(ctx, buf)Fangrui Song1-0/+1
and simplify `+ toStr(ctx, x)` to `<< x`. The trailing '\n' << llvm::getBugReportMsg() is not very useful and therefore removed.
2024-11-06[ELF] Add context-aware diagnostic functions (#112319)Fangrui Song1-1/+16
The current diagnostic functions log/warn/error/fatal lack a context argument and call the global `lld::errorHandler()`, which prevents multiple lld instances in one process. This patch introduces context-aware replacements:

* log => Log(ctx)
* warn => Warn(ctx)
* errorOrWarn => Err(ctx)
* error => ErrAlways(ctx)
* fatal => Fatal(ctx)

Example: `errorOrWarn(toString(f) + "xxx")` => `Err(ctx) << f << "xxx"`. (`toString(f)` is shortened to `f` as a bonus and may access `ctx` without accessing the global variable (see `Target.cpp`)).

`ctx.e = &context->e;` can be replaced with a non-global ErrorHandler when `ctx` becomes a local variable.

(For the ELF port, the long term goal is to eliminate `error`. Most can be straightforwardly converted to `Err(ctx)`.)
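The `Err(ctx) << f << "xxx"` style can be approximated with a temporary object that buffers the message and hands it to the context when it is destroyed. The sketch below is a toy model, not lld's implementation: `ToyCtx` and `ToyErr` are hypothetical stand-ins, and the "handler" is just a vector of strings.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Stand-in for a linker context that owns its diagnostics (no global state).
struct ToyCtx {
  std::vector<std::string> errors;
};

// Stream-style diagnostic: collects pieces with operator<< and reports the
// complete message to the context when the temporary is destroyed.
class ToyErr {
  ToyCtx &ctx;
  std::ostringstream os;

public:
  explicit ToyErr(ToyCtx &c) : ctx(c) {}
  ToyErr(const ToyErr &) = delete;
  ToyErr &operator=(const ToyErr &) = delete;
  ~ToyErr() { ctx.errors.push_back(os.str()); }

  template <class T> ToyErr &operator<<(const T &v) {
    os << v;
    return *this;
  }
};
```

A call such as `ToyErr(ctx) << "a.o" << ": undefined symbol: " << 42;` records a single message at the end of the statement. Since every diagnostic goes through the `ctx` it was given, two contexts in the same process do not interfere with each other.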
2024-09-15[ELF] Rename unique_saver to uniqueSaver. NFCFangrui Song1-6/+2
and remove an unneeded FIXME.
2024-09-05[LTO][ELF][lld] Use unique string saver in ELF bitcode symbol parsing (#106670)Mingming Liu1-1/+7
lld ELF [BitcodeFile](https://github.com/llvm/llvm-project/blob/a527248a3c2d638b0c92a06992f3f1c1f80842ad/lld/ELF/InputFiles.h#L328) uses [string saver](https://github.com/llvm/llvm-project/blob/a527248a3c2d638b0c92a06992f3f1c1f80842ad/lld/include/lld/Common/CommonLinkerContext.h#L57) to keep copies of bitcode symbols. Symbol duplication is very common when compiling application binaries. This change introduces a UniqueStringSaver in the lld context and uses it for bitcode symbol parsing. The implementation covers ELF only. Similar opportunities should exist on other (COFF, MachO, wasm) formats. For an internal production binary where LTO indexing originally takes ~10GiB, this change optimizes away ~800MiB (~7.8%), measured by https://github.com/google/pprof. The flame graph breaks down memory by usage call stacks and agrees with this measurement.
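The saving comes from storing each distinct symbol name once. The sketch below shows the general interning idea with standard containers. It is not LLVM's `UniqueStringSaver`; the class name `InternPool` and its methods are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical string interner: each distinct string is stored once, and
// repeated saves return a view of the same storage.
class InternPool {
  // unordered_set is node-based, so references to stored strings stay valid
  // when the table grows.
  std::unordered_set<std::string> pool;

public:
  std::string_view save(std::string_view s) {
    return *pool.emplace(s).first;
  }
  std::size_t size() const { return pool.size(); }
};
```

Saving the same name from many input files then costs one copy in total rather than one copy per occurrence, which is the duplication the commit targets.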
2023-09-28[NFC][LLD] Refactor some copy-paste into the Common library (#67598)Matheus Izvekov1-0/+4
2023-06-19Re-land [LLD] Allow usage of LLD as a libraryAlexandre Ganea1-29/+42
This reverts commit aa495214b39d475bab24b468de7a7c676ce9e366. As discussed in https://github.com/llvm/llvm-project/issues/53475 this patch allows for using LLD-as-a-lib. It also lets clients link only the drivers that they want (see unit tests). This also adds the unit test infra as in the other LLVM projects. Among the test coverage, I've added the original issue from @krzysz00, see: https://github.com/ROCmSoftwarePlatform/D108850-lld-bug-reproduction Important note: this doesn't allow (yet) linking in parallel. This will come a bit later hopefully, in subsequent patches, for COFF at least. Differential revision: https://reviews.llvm.org/D119049
2023-06-14Revert "[LLD] Allow usage of LLD as a library"Leonard Chan1-42/+29
This reverts commit 2700da5fe28d8b17c66e5c960d2188276a6ced39. Reverting since this causes some test failures on our builders: https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8778372807208184913/overview
2023-06-13[LLD] Allow usage of LLD as a libraryAlexandre Ganea1-29/+42
As discussed in https://github.com/llvm/llvm-project/issues/53475 this patch allows using LLD-as-a-lib. It also lets clients link only the drivers that they want (see unit tests). This also adds the unit test infra as in the other LLVM projects. Among the test coverage, I've added the original issue from @krzysz00, see: https://github.com/ROCmSoftwarePlatform/D108850-lld-bug-reproduction Important note: this doesn't allow (yet) linking in parallel. This will come a bit later, in subsequent patches, for COFF at least. Differential revision: https://reviews.llvm.org/D119049
2023-02-15[LLD] Add --lto-CGO[0-3] optionScott Linder1-1/+1
Allow controlling the CodeGenOpt::Level independent of the LTO optimization level in LLD via new options for the COFF, ELF, MachO, and wasm frontends to lld. Most are spelled as --lto-CGO[0-3], but COFF is spelled as -opt:lldltocgo=[0-3]. See D57422 for discussion surrounding the issue of how to set the CG opt level. The ultimate goal is to let each function control its CG opt level, but until then the current default means it is impossible to specify a CG opt level lower than 2 while using LTO. This option gives the user a means to control it for as long as it is not handled on a per-function basis. Reviewed By: MaskRay, #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D141970
2022-12-28[lld] Fix iwyu problems after 83d59e05b201760e3f364ff6316301d347cbad95Fangrui Song1-1/+0
The commit transitively includes lld/include/lld/Common/ErrorHandler.h into lld/include/lld/Common/Driver.h, which is not intended.
2022-12-05Remove unused #include "llvm/ADT/Optional.h"Fangrui Song1-1/+0
2022-12-03CodeGen/CommandFlags: Convert Optional to std::optionalFangrui Song1-2/+1
2022-12-03Convert Optional<CodeModel> to std::optional<CodeModel>Krzysztof Parzyszek1-1/+2
2022-11-27[lld] Change Optional to std::optionalFangrui Song1-3/+3
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-10-14[lld][nfc] Remove lld::demangle() (partial revert of D116279)Jez Ng1-8/+0
{D116279}, in addition to adding support for other demanglers, also factored out some of the demangling logic. However, I don't think the abstraction really carries its weight -- after {D135942}, only the ELF and WASM backends call it with anything other than a non-constant `shouldDemangle` argument. The COFF and Mach-O backends were already doing the should-demangle check before calling `demangle()`. Reviewed By: MaskRay, #lld-macho Differential Revision: https://reviews.llvm.org/D135943
2022-08-04[ELF] Add makeThreadLocal/makeThreadLocalN and remove InputFile::localSymStorageFangrui Song1-0/+26
makeThreadLocal/makeThreadLocalN are moved from D130810 ([ELF] Parallelize input section initialization) here to make D130810 more focused on the refactor:

* COFF has some needs for multiple linker contexts. D108850 partially removed global states from lldCommon but left the global variable `lctx`.
* To the best of my knowledge, all multiple-linker-context feature requests to ELF are more from user convenience, with no very strong argument.
* In practice, ELF port is very difficult to remove global states without introducing significant performance regression/hurting code readability.
* Per-thread allocators from D122922/D123879 are too expensive and will not really benefit ELF.

This patch adds a simple thread_local based makeThreadLocal to lld/Common/Memory.h. It will enable further optimization in ELF.
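The general shape of a `thread_local` based allocation helper can be sketched as below. This is an assumption-laden toy, not the code added to lld/Common/Memory.h: the name `makePerThread` is hypothetical, and it uses a per-thread vector of owning pointers where a real linker would more likely use a bump allocator.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical per-thread factory: each thread appends to its own arena, so
// no lock is needed. In this sketch, objects are destroyed when the thread
// that created them exits.
template <class T, class... Args> T *makePerThread(Args &&...args) {
  thread_local std::vector<std::unique_ptr<T>> arena;
  arena.push_back(std::make_unique<T>(std::forward<Args>(args)...));
  return arena.back().get();
}
```

Because the arena is `thread_local`, parallel tasks can allocate without synchronization. The trade-off in this toy version is lifetime: a pointer must not be used after its creating thread has exited.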
2022-07-30[lld] Change vector to SmallVector. NFCFangrui Song2-2/+4
My lld executable is 1.6KiB smaller and some functions are now more efficient.
2022-06-24[NFC][lld] Fix typos to test commit accessDaniel Bertalan1-1/+1
2022-06-19[lld] Remove lld/include/lld/CoreNico Weber20-2460/+0
This is all dead code that we forgot to delete in https://reviews.llvm.org/D114842 Differential Revision: https://reviews.llvm.org/D128147
2022-06-11[lld-macho] Add support for -wKeith Smiley1-0/+1
This flag suppresses warnings produced by the linker. In ld64 this has an interesting interaction with -fatal_warnings: it silences the warnings but the link still fails. Instead of doing that, we still print the warning and eagerly fail the link when both flags are passed. This seems more reasonable, since users can then understand why the link fails. Differential Revision: https://reviews.llvm.org/D127564
2022-02-17[lld] Make error handling functions opaqueFangrui Song1-11/+7
The inline `lld::error` expands to two function calls `errorHandler` and `error` where the latter is opaque. Move the functions to .cpp files to decrease code size. My x86-64 lld executable is 9KiB smaller. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D120002
2022-02-16[lld-macho] Don't include CommandFlags.h in CommonLinkerContext.hJez Ng1-4/+0
Main motivation: including `llvm/CodeGen/CommandFlags.h` in `CommonLinkerContext.h` means that the declaration of `llvm::Reloc` is visible in any file that includes `CommonLinkerContext.h`. Since our cpp files have both `using namespace llvm` and `using namespace lld::macho`, this results in conflicts with `lld::macho::Reloc`. I suppose we could put `llvm::Reloc` into a nested namespace, but in general, I think we should avoid transitively including too many header files in a very widely used header like `CommonLinkerContext.h`. RegisterCodeGenFlags' ctor initializes a bunch of function-`static` structures and does nothing else, so it should be fine to "initialize" it as a temporary stack variable rather than as a file static. Reviewed By: aganea Differential Revision: https://reviews.llvm.org/D119913
2022-02-10[MLIR][GPU][lld] Use LLD bundled in ROCm, removing workaroundKrzysztof Drewniak1-5/+0
Having clarified that executing the SerializeToHsaco pass can depend on a ROCm installation, switch from calling lld as a library to using the copy of lld guaranteed to be included in a ROCm install. This removes the workaround introduced in D119277 Reviewed By: whchung Differential Revision: https://reviews.llvm.org/D119463
2022-02-08[MLIR] Temporary workaround for calling the LLD ELF driver as-a-libAlexandre Ganea1-1/+6
This fixes the situation described in https://github.com/llvm/llvm-project/issues/53475 with a repro exposed by https://github.com/ROCmSoftwarePlatform/D108850-lld-bug-reproduction This is purposely just a workaround to unblock users. This could be transplanted to the release/14.x branch if need be. A proper fix will later be provided in https://reviews.llvm.org/D119049. Differential Revision: https://reviews.llvm.org/D119277
2022-01-20Re-land [LLD] Remove global state in lldCommonAlexandre Ganea5-43/+121
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext. See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html The previous land f860fe362282ed69b9d4503a20e5d20b9a041189 caused issues in https://lab.llvm.org/buildbot/#/builders/123/builds/8383, fixed by 22ee510dac9440a74b2e5b3fe3ff13ccdbf55af3. Differential Revision: https://reviews.llvm.org/D108850
2022-01-16Revert [LLD] Remove global state in lldCommonAlexandre Ganea5-121/+43
It seems to be causing issues on https://lab.llvm.org/buildbot/#/builders/123/builds/8383
2022-01-16[LLD] Supplement with more comments. Clarify the intention in f860fe362282ed69b9d4503a20e5d20b9a041189.Alexandre Ganea1-1/+3