path: root/lld/ELF/ScriptParser.cpp
Age  Commit message  Author  Files  Lines
2025-07-06[LLD] Fix crash on parsing ':ALIGN' in linker script (#146723)Parth1-0/+2
The linker was crashing due to stack overflow when parsing ':ALIGN' in an output section description. This commit fixes the linker script parser so that the crash does not happen.

The root cause of the stack overflow is how we parse expressions (readExpr) in linker script and the behavior of the ScriptLexer::expect(...) utility. ScriptLexer::expect does not do anything if errors have already been encountered during linker script parsing. In particular, it never increments the current token position in the script file, even if the current token is the same as the expected token. This causes an infinite call cycle on parsing an expression such as '(4096)' when an error has already been encountered.

    readExpr() calls readPrimary()
    readPrimary() calls readParenExpr()

    readParenExpr():
      expect("("); // no-op, current token still points to '('
      Expression *E = readExpr(); // The cycle continues...

Closes #146722

Signed-off-by: Parth Arora <partaror@qti.qualcomm.com>
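The call cycle described above can be reproduced in a toy model. The sketch below is not LLD's code: `ToyParser`, its members, the depth cap of 1000 and the `bailOnError` guard are all hypothetical names chosen for illustration, and the guard is only one possible way to break the cycle, not necessarily what the commit does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: none of these names are LLD's real classes.
struct ToyParser {
  std::vector<std::string> tokens;
  bool bailOnError = false; // hypothetical guard, see readParenExpr()
  std::size_t pos = 0;
  int errors = 0;
  int depth = 0, maxDepth = 0; // tracked only to observe the recursion

  std::string peek() const { return pos < tokens.size() ? tokens[pos] : ""; }

  // Mirrors the described behavior: once an error exists, do nothing at all.
  void expect(const std::string &tok) {
    if (errors)
      return;
    if (peek() == tok)
      ++pos;
    else
      ++errors;
  }

  long readExpr() {
    maxDepth = std::max(maxDepth, ++depth);
    // The cap stands in for the stack overflow so the toy terminates.
    long v = depth > 1000 ? 0 : readPrimary();
    --depth;
    assert(depth >= 0);
    return v;
  }

  long readPrimary() {
    if (peek() == "(")
      return readParenExpr();
    if (pos >= tokens.size()) {
      ++errors;
      return 0;
    }
    return std::stol(tokens[pos++]); // toy: every other token is a number
  }

  long readParenExpr() {
    expect("("); // no-op when errors != 0, so peek() is still "("
    if (bailOnError && errors)
      return 0; // hypothetical guard: stop instead of recursing
    long v = readExpr();
    expect(")");
    return v;
  }
};
```

With `errors` preset to 1 and the guard off, the recursion runs until the cap is hit; with the guard on, it stops at depth 1 without consuming any token.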
2025-05-28[ELF] Postpone ASSERT errorFangrui Song1-2/+2
assignAddresses is executed more than once. When an ASSERT expression evaluates to zero, we should only report an error for the last assignAddresses. Make a change similar to #66854 and #96361. This change might help https://github.com/ClangBuiltLinux/linux/issues/2094
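A minimal sketch of the idea, assuming a design where each pass discards the failures recorded by the previous one and only the final set is reported. `ToyLayout`, `pendingAsserts`, `finalize` and the `0x1000` limit are hypothetical; the real change may be structured differently.

```cpp
#include <string>
#include <vector>

// Toy model: names are illustrative, not LLD's.
struct ToyLayout {
  std::vector<std::string> pendingAsserts; // failures seen in the current pass
  std::vector<std::string> reported;       // what the user finally sees

  // One layout pass; `dot` stands in for the location counter.
  void assignAddresses(unsigned long dot) {
    pendingAsserts.clear(); // forget failures from earlier passes
    // Stand-in for evaluating ASSERT(. <= 0x1000, "image too large").
    if (!(dot <= 0x1000))
      pendingAsserts.push_back("image too large");
  }

  // Called once after the last pass: only now do failures become errors.
  void finalize() { reported = pendingAsserts; }
};
```

An assertion that fails in an early pass but holds in the last one produces no diagnostic, which is the behavior the commit message asks for.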
2025-05-25[lld] Remove unused includes (NFC) (#141421)Kazu Hirata1-2/+0
2025-04-02[LLD][ELF] Support OVERLAY NOCROSSREFS (#133807)Daniel Thornburgh1-0/+7
This allows NOCROSSREFS to be specified in OVERLAY linker script descriptions. This is a particularly useful part of the OVERLAY syntax, since it's very rarely possible for one overlay section to sensibly reference another. Closes #128790
2025-03-31[LLD][ELF] Allow memory region in OVERLAY (#133540)Daniel Thornburgh1-10/+13
This allows the contents of OVERLAYs to be attributed to memory regions. This is the only clean way to overlap VMAs in linker scripts that choose to primarily use memory regions to lay out addresses. This also simplifies OVERLAY expansion to better match GNU LD. Expressions for the first section's LMA and VMA are not generated if the user did not provide them. This allows the LMA/VMA offset to be preserved across multiple overlays in the same region, as with regular sections. Closes #129816
2025-03-11[ELF] Allow KEEP within OVERLAY (#130661)Nathan Chancellor1-14/+2
When attempting to add KEEP within an OVERLAY description, which the Linux kernel would like to do for ARCH=arm to avoid dropping the .vectors sections with '--gc-sections' [1], ld.lld errors with:

    ld.lld: error: ./arch/arm/kernel/vmlinux.lds:37: section pattern is expected
    >>> __vectors_lma = .; OVERLAY 0xffff0000 : AT(__vectors_lma) { .vectors { KEEP(*(.vectors)) } ...
    >>> ^

readOverlaySectionDescription() does not handle all input section description keywords, despite GNU ld's documentation stating that "The section definitions within the OVERLAY construct are identical to those within the general SECTIONS construct, except that no addresses and no memory regions may be defined for sections within an OVERLAY."

Reuse the existing parsing in readInputSectionDescription(), which handles KEEP, allowing the Linux kernel's use case to work properly.

[1]: https://lore.kernel.org/20250221125520.14035-1-ceggers@arri.de/
2025-02-21[LLD][ELF][AArch64] Add support for SHF_AARCH64_PURECODE ELF section flag (3/3) (#125689)Csanád Hajdú1-0/+1
Add support for the new SHF_AARCH64_PURECODE ELF section flag: https://github.com/ARM-software/abi-aa/pull/304

The general implementation follows the existing one for ARM targets. The output section only has the `SHF_AARCH64_PURECODE` flag set if all input sections have it set.

Related PRs:
* LLVM: https://github.com/llvm/llvm-project/pull/125687
* Clang: https://github.com/llvm/llvm-project/pull/125688
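The "set only if all input sections have it" rule can be sketched as follows. This is an illustration, not LLD's code: `mergeFlags`, `kPurecode` and `kAlloc` are hypothetical names, the bit values are placeholders rather than the ABI's constants, and OR-ing the remaining flags is an assumption made to keep the example small.

```cpp
#include <cstdint>
#include <vector>

// Placeholder bit values for the sketch; not taken from the ABI document.
constexpr std::uint64_t kAlloc = 1u << 1;
constexpr std::uint64_t kPurecode = 1u << 29;

// Combine input section flags into output section flags.
// Ordinary flags are OR-ed (an assumption of this sketch); the purecode
// bit survives only if every input section carries it.
std::uint64_t mergeFlags(const std::vector<std::uint64_t> &inputs) {
  std::uint64_t merged = 0;
  bool allPure = !inputs.empty();
  for (std::uint64_t f : inputs) {
    merged |= f & ~kPurecode;
    allPure = allPure && (f & kPurecode) != 0;
  }
  return allPure ? (merged | kPurecode) : merged;
}
```

A single input section without the bit is enough to clear it on the output section.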
2025-02-01[ELF] Replace inExpr with lexState. NFCFangrui Song1-13/+12
We may add another state, State::Wild, to behave more like GNU ld.
2025-01-21[LLD] [ELF] Add support for linker script unary plus operator (#121508)Parth Arora1-0/+2
This commit adds support for linker script unary plus ('+') operator. It is helpful for improving compatibility between LLD and GNU LD. Closes #118047
2024-11-16[ELF] Replace functions bAlloc/saver/uniqueSaver with member accessFangrui Song1-3/+3
2024-11-16[ELF] Pass ctx to bAlloc/saver/uniqueSaverFangrui Song1-3/+3
2024-11-14[ELF] Migrate away from global ctxFangrui Song1-6/+6
2024-11-07[ELF] Replace errorCount with errCount(ctx)Fangrui Song1-8/+8
to reduce reliance on the global context.
2024-11-06[ELF] Replace errorOrWarn(...) with ErrFangrui Song1-2/+2
2024-11-06[ELF] Replace warn(...) with WarnFangrui Song1-1/+2
2024-11-06[ELF] Replace error(...) with ErrAlways or ErrFangrui Song1-9/+9
Most are migrated to ErrAlways mechanically. In the future we should change most to Err.
2024-10-06[ELF] Move static nextGroupId isInGroup to LinkerDriverFangrui Song1-5/+3
2024-10-06[ELF] Pass Ctx & to InputFilesFangrui Song1-3/+3
2024-10-03[ELF] Pass Ctx & to OutputSectionsFangrui Song1-1/+1
2024-09-23[ELF] Move elf::symtab into CtxFangrui Song1-1/+1
Remove the global variable `symtab` and add a member variable (`std::unique_ptr<SymbolTable>`) to `Ctx` instead. This is one step toward eliminating global states. Pull Request: https://github.com/llvm/llvm-project/pull/109612
2024-09-21[ELF] ScriptParser: make Ctx & a member variable. NFCFangrui Song1-44/+48
Lambda captures need adjusting.
2024-09-21[ELF] ScriptParser: pass Ctx to ScriptParser and ScriptLexer. NFCFangrui Song1-50/+51
2024-08-21[ELF] Move target to Ctx. NFCFangrui Song1-1/+1
Ctx was introduced in March 2022 as a more suitable place for such singletons. Follow-up to driver (2022-10) and script (2024-08).
2024-08-21[ELF] Move script into Ctx. NFCFangrui Song1-48/+52
Ctx was introduced in March 2022 as a more suitable place for such singletons. We now use default-initialization for `LinkerScript` and should pay attention to non-class types (e.g. `dot` is initialized by commit 503907dc505db1e439e7061113bf84dd105f2e35).
2024-08-05[LLD] Add CLASS syntax to SECTIONS (#95323)Daniel Thornburgh1-5/+52
This allows the input section matching algorithm to be separated from output section descriptions. This allows a group of sections to be assigned to multiple output sections, providing an explicit version of --enable-non-contiguous-regions's spilling that doesn't require altering global linker script matching behavior with a flag. It also makes the linker script language more expressive even if spilling is not intended, since input section matching can be done in a different order than sections are placed in an output section. The implementation reuses the backend mechanism provided by --enable-non-contiguous-regions, so it has roughly similar semantics and limitations. In particular, sections cannot be spilled into or out of INSERT, OVERWRITE_SECTIONS, or /DISCARD/. The former two aren't intrinsic, so it may be possible to relax those restrictions later.
2024-07-28[ELF] --defsym: support quoted LHSFangrui Song1-6/+6
and move = splitting from Driver.cpp to ScriptParser.cpp.
2024-07-28[ELF] Respect --sysroot for INCLUDEFangrui Song1-14/+1
If an included script is under the sysroot directory, when it opens an absolute path file (`INPUT` or `GROUP`), add sysroot before the absolute path. When the included script ends, the `isUnderSysroot` state is restored.
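The rule can be sketched as a small path helper plus a scope guard that restores the flag when the included script ends. Both `resolveInput` and `IncludeScope` are hypothetical names for this illustration, and the sketch assumes POSIX-style absolute paths that start with `/`.

```cpp
#include <string>

// Hypothetical helper: decide which path INPUT/GROUP should actually open.
std::string resolveInput(const std::string &path, const std::string &sysroot,
                         bool isUnderSysroot) {
  bool absolute = !path.empty() && path[0] == '/';
  if (isUnderSysroot && absolute)
    return sysroot + path; // prepend sysroot to absolute paths only
  return path;
}

// Hypothetical scope guard: sets the flag while an included script is being
// read and restores the previous value when that script ends.
struct IncludeScope {
  bool &flag;
  bool saved;
  IncludeScope(bool &f, bool value) : flag(f), saved(f) { flag = value; }
  ~IncludeScope() { flag = saved; }
};
```

Relative paths are left untouched in this sketch, and the outer script's state comes back as soon as the guard goes out of scope.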
2024-07-27[ELF] Output section phdr: support quoted namesFangrui Song1-1/+1
2024-07-27[ELF] INSERT [AFTER|BEFORE]: support quoted namesFangrui Song1-1/+1
2024-07-27[ELF] Fix INCLUDE cycle detectionFangrui Song1-5/+1
Fix #93947: the cycle detection mechanism added by https://reviews.llvm.org/D37524 also disallowed including a file twice, which is an unnecessary limitation. Now that we have an include stack (#100493), supporting multiple inclusion is trivial. Note: a filename can be referenced with many different paths, e.g. a.lds, ./a.lds, ././a.lds. We don't attempt to detect the cycle at the earliest point.
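The difference between "included twice" and "included cyclically" can be sketched with an explicit stack of currently open files. The names below (`Scripts`, `readScript`, `visits`) are hypothetical, and file names are compared exactly as spelled, so `a.lds` and `./a.lds` count as different files, in line with the note above.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// file name -> files it INCLUDEs, in order (toy stand-in for script contents)
using Scripts = std::map<std::string, std::vector<std::string>>;

// Returns false if a cycle is found. `stack` holds the files that are
// currently open; `visits` counts how many times each file was read.
bool readScript(const Scripts &scripts, const std::string &file,
                std::vector<std::string> &stack,
                std::map<std::string, int> &visits) {
  // A cycle means the file is already open somewhere up the include stack.
  if (std::find(stack.begin(), stack.end(), file) != stack.end())
    return false;
  ++visits[file];
  stack.push_back(file);
  bool ok = true;
  auto it = scripts.find(file);
  if (it != scripts.end()) {
    for (const std::string &inc : it->second) {
      if (!readScript(scripts, inc, stack, visits)) {
        ok = false;
        break;
      }
    }
  }
  stack.pop_back(); // the file is closed, so including it again is fine
  return ok;
}
```

Because a file leaves the stack once it has been fully read, a second `INCLUDE` of the same file succeeds, while a file that includes itself through any chain is rejected.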
2024-07-27[ELF] OUTPUT_ARCH: report unclosed errorFangrui Song1-1/+1
2024-07-27[ELF] Replace unquote(next()) with readName. NFCFangrui Song1-15/+15
2024-07-27[ELF] Memory region: support quoted namesFangrui Song1-2/+2
2024-07-27[ELF] OVERLAY: support quoted output section namesFangrui Song1-1/+2
2024-07-27[ELF] REGION_ALIAS: support quoted namesFangrui Song1-1/+1
2024-07-27[ELF] Replace unquote(next()) with readName. NFCFangrui Song1-19/+18
2024-07-27[ELF] PROVIDE: allow quoted names to be discardedFangrui Song1-4/+6
Extend commit ebb326a51fec37b5a47e5702e8ea157cd4f835cd for (#74771) to support quoted names, e.g. `PROVIDE("f1" = f2 + f3);`.
2024-07-27[ELF] Simplify readAssignmentFangrui Song1-15/+14
After #100493, the `=` support from fe0de25b2195b66d1ebac5d3ebdb18f9e1e776da can be simplified.
2024-07-27[ELF] Updated some while conditions with till (#100893)Hongyu Chen1-10/+8
This change is based on [commit](https://github.com/llvm/llvm-project/commit/b32c38ab5b4cf5c66469180ba3594e98eff2c124) for cleaner API usage. Thanks to @MaskRay!
2024-07-26[ELF] Replace some while (peek() != ")" && !atEOF()) with tillFangrui Song1-8/+4
2024-07-26[ELF] Replace some while (peek() != ")" && !atEOF()) with tillFangrui Song1-14/+9
2024-07-26[ELF] Add till and rewrite while (... consume("}"))Fangrui Song1-11/+7
After #100493, the idiom `while (!errorCount() && !consume("}"))` could lead to inaccurate diagnostics or dead loops. Introduce till to change the code pattern.
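A minimal sketch of a `till`-style loop condition, assuming a toy lexer over a token vector. `ToyLexer`, `readBlock` and the exact order of checks inside `till` are assumptions for illustration and may differ from LLD's implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy lexer; the names mirror the commit message but this is not LLD's code.
struct ToyLexer {
  std::vector<std::string> tokens;
  std::size_t pos = 0;
  int errors = 0;

  bool atEOF() const { return pos >= tokens.size(); }
  std::string peek() const { return atEOF() ? "" : tokens[pos]; }
  std::string next() { return atEOF() ? "" : tokens[pos++]; }

  // Returns true while there is another item to read before `tok`.
  // At EOF it records an error and stops the loop, so the caller cannot spin.
  bool till(const std::string &tok) {
    if (atEOF()) {
      ++errors; // stand-in for an "unexpected EOF" diagnostic
      return false;
    }
    if (peek() == tok) {
      ++pos; // consume the terminator
      return false;
    }
    return true;
  }
};

// Reads items up to the closing "}".
std::vector<std::string> readBlock(ToyLexer &lex) {
  std::vector<std::string> items;
  while (lex.till("}"))
    items.push_back(lex.next());
  return items;
}
```

The loop ends either on the terminator or on EOF, and the EOF case yields exactly one error instead of a dead loop.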
2024-07-26[ELF] ScriptLexer: generate tokens lazilyFangrui Song1-24/+48
The current tokenize-whole-file approach has a few limitations.

* Lack of state information: `maybeSplitExpr` is needed to parse expressions. It's infeasible to add new states to behave more like GNU ld.
* `readInclude` may insert tokens in the middle, leading to a time complexity issue with N-nested `INCLUDE`.
* Line/column information for diagnostics is inaccurate, especially after an `INCLUDE`.
* `getLineNumber` cannot be made more efficient without significant code complexity and memory consumption. https://reviews.llvm.org/D104137

The patch switches to a traditional lexer that generates tokens lazily.

* `atEOF` behavior is modified: we need to call `peek` to determine EOF.
* `peek` and `next` cannot call `setError` upon `atEOF`.
* Since `consume` no longer reports an error upon `atEOF`, the idiom `while (!errorCount() && !consume(")"))` would cause a dead loop. Use `while (peek() != ")" && !atEOF()) { ... } expect(")")` instead.
* An include stack is introduced to handle `readInclude`. This can be utilized to address #93947 properly.
* `tokens` and `pos` are removed.
* `commandString` is reimplemented. Since it is used in -Map output, `\n` needs to be replaced with space.

Pull Request: https://github.com/llvm/llvm-project/pull/100493
2024-07-23[ELF] Remove `consumeLabel` in ScriptLexer (#99567)Hongyu Chen1-8/+8
This commit removes `consumeLabel`, since the `consume` function provides the same functionality.
2024-07-20 [ELF] Delete peek2 in Lexer (#99790)Hongyu Chen1-14/+14
Thanks to Fangrui's change https://github.com/llvm/llvm-project/commit/28045ceab08d41a8a42d93ebc445e8fe906f884c, peek2 can be removed.
2024-07-20[ELF] Simplify readExpr. NFCFangrui Song1-5/+3
2024-07-20[ELF] Support (TYPE=<value>) beside output section addressFangrui Song1-13/+15
Support `preinit_array . (TYPE=SHT_PREINIT_ARRAY) : { QUAD(16) }`.

Follow-up to https://reviews.llvm.org/D118840

peek2() could be eliminated by a future change.
2024-07-17[ELF] Support NOCROSSREFS and NOCROSSREFS_TOFangrui Song1-0/+16
Implement the two commands described by https://sourceware.org/binutils/docs/ld/Miscellaneous-Commands.html

After `outputSections` is available, check each output section described by at least one `NOCROSSREFS`/`NOCROSSREFS_TO` command. For each checked output section, scan relocations from its input sections. This step is slow, therefore utilize `parallelForEach(isd->sections, ...)`.

To support non SHF_ALLOC sections, `InputSectionBase::relocations` (empty) cannot be used. In addition, we may explore eliminating this member to speed up relocation scanning.

Some parse code is adapted from #95714.

Close #41825

Pull Request: https://github.com/llvm/llvm-project/pull/98773
2024-07-16[lld] Add emulation support for hexagon (#98857)Brian Cain1-0/+1
2024-07-16[ELF] OUTPUT_FORMAT: support "binary" and ignore extra OUTPUT_FORMAT commandsFangrui Song1-7/+15
This patch improves GNU ld compatibility. Close #87891: Support `OUTPUT_FORMAT(binary)`, which is like --oformat=binary. --oformat=binary takes precedence over an ELF `OUTPUT_FORMAT`. In addition, if more than one OUTPUT_FORMAT command is specified, only check the first one. Pull Request: https://github.com/llvm/llvm-project/pull/98837
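The precedence rules can be sketched with a small state object. `ToyConfig`, `readOutputFormat` and their fields are hypothetical, and the sketch only tracks which format wins; it does not model how an ELF format name is validated.

```cpp
#include <string>

// Toy configuration; field names are illustrative, not LLD's.
struct ToyConfig {
  bool oformatBinary = false;    // true after --oformat=binary
  bool seenOutputFormat = false; // has an OUTPUT_FORMAT command been handled?
  std::string scriptFormat;      // name given by the first OUTPUT_FORMAT
};

// Handle one OUTPUT_FORMAT(name) command from a linker script.
void readOutputFormat(ToyConfig &cfg, const std::string &name) {
  if (cfg.seenOutputFormat)
    return; // extra OUTPUT_FORMAT commands are ignored
  cfg.seenOutputFormat = true;
  cfg.scriptFormat = name;
  if (name == "binary")
    cfg.oformatBinary = true; // behaves like --oformat=binary
  // An ELF name never clears oformatBinary, so a command-line
  // --oformat=binary keeps precedence over the script.
}
```

Only the first command is inspected, and a binary request from the command line is never overridden by an ELF format named in the script.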