path: root/llvm/lib/Object/WasmObjectFile.cpp
Age  Commit message  Author  Files  Lines
2025-03-04[lld][WebAssembly] Support for the custom-page-sizes WebAssembly proposal ↵Nick Fitzgerald1-0/+6
(#128942) This commit adds support for WebAssembly's custom-page-sizes proposal to `wasm-ld`. An overview of the proposal can be found [here](https://github.com/WebAssembly/custom-page-sizes/blob/main/proposals/custom-page-sizes/Overview.md). In a sentence, it allows customizing a Wasm memory's page size, enabling Wasm to target environments with less than 64KiB of memory (the default Wasm page size) available for Wasm memories.
This commit contains the following:
* Adds a `--page-size=N` CLI flag to `wasm-ld` for configuring the linked Wasm binary's linear memory's page size.
* When the page size is configured to a non-default value, then the final Wasm binary will use the encodings defined in the custom-page-sizes proposal to declare the linear memory's page size.
* Defines a `__wasm_first_page_end` symbol, whose address points to the end of the first page in the Wasm linear memory, i.e. its address equals the Wasm memory's page size. This allows writing code that is compatible with any page size, and doesn't require re-compiling its object code. At the same time, because it just lowers to a constant rather than a memory access, it enables link-time optimization.
* Adds tests for these new features.
r? @sbc100 cc @sunfishcode
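The page-size-agnostic arithmetic that such a symbol makes possible can be sketched in plain C++. This is only an illustration: the helper name `pagesForBytes` is hypothetical, and the page size is passed in as an ordinary parameter, whereas code linked with `wasm-ld` would presumably obtain it from the address of `__wasm_first_page_end`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM or wasm-ld): number of pages needed
// to hold `Bytes` bytes when each page is `PageSize` bytes long.
// Written as divide-then-round-up so that it cannot overflow.
inline uint64_t pagesForBytes(uint64_t Bytes, uint64_t PageSize) {
  assert(PageSize != 0 && "page size must be nonzero");
  return Bytes / PageSize + (Bytes % PageSize != 0 ? 1 : 0);
}
```

Because the page size is an input rather than a hard-coded 65536, the same object code gives correct results for the default page size and for a custom one.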
2025-02-24[object][WebAssembly] Add support for RUNTIME_PATH to yaml2obj and obj2yaml ↵Hood Chatham1-0/+7
(#126080) This is the first step of adding RPATH support for wasm. See corresponding update to the WebAssembly/tool-conventions repo on dynamic linking: https://github.com/WebAssembly/tool-conventions/pull/246
2025-02-04Revert "[Object][WebAssembly] Fix data segment offsets higher than 2^31 ↵Sam Clegg1-2/+2
(#125739)" (#125786) This reverts commit c798a5c4d5c3c8cb21e6001f505d8f44217c2244. This broke a bunch of tests on the emscripten side. Reverting while we investigate.
2025-02-04[Object][WebAssembly] Fix data segment offsets higher than 2^31 (#125739)Sam Clegg1-2/+2
Fixes: #58555
2025-01-17[WebAssembly][Object] Support more elem segment flags (#123427)Derek Schuff1-14/+27
Some tools (e.g. Rust tooling) produce element segment descriptors with neither elemkind nor element type descriptors, but with init exprs instead of func indices (this is the case with a flags value of 4 in https://webassembly.github.io/spec/core/binary/modules.html#element-section). LLVM doesn't fully model reference types or the various ways to initialize element segments, but we do want to correctly parse and skip over all type sections, so this change updates the object parser to handle that case, and refactors for more clarity. The test file is updated to include one additional elem segment with a flags value of 4, an initializer value of (i32.const 0) and an empty vector. Also support parsing files that export imported (undefined) functions.
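How the flags value selects the layout of an element segment can be sketched as follows. This follows the bit meanings in the WebAssembly binary format for element segments, but the struct, function and constant names are local to this sketch and are not LLVM's identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Names below are illustrative only; they are not taken from LLVM.
constexpr uint32_t ElemPassiveOrDeclarative = 0x1; // bit 0
constexpr uint32_t ElemTableOrDeclarative = 0x2;   // bit 1 (meaning depends on bit 0)
constexpr uint32_t ElemUsesInitExprs = 0x4;        // bit 2

struct ElemSegmentLayout {
  bool IsActive;         // segment has an offset expression
  bool HasExplicitTable; // an explicit table index is encoded
  bool HasKindOrType;    // an elemkind / reftype descriptor is encoded
  bool UsesInitExprs;    // items are init exprs rather than func indices
};

inline ElemSegmentLayout describeElemFlags(uint32_t Flags) {
  assert(Flags <= 7 && "only flag values 0..7 are defined");
  ElemSegmentLayout L;
  L.IsActive = (Flags & ElemPassiveOrDeclarative) == 0;
  L.HasExplicitTable = L.IsActive && (Flags & ElemTableOrDeclarative) != 0;
  // The descriptor is omitted only when both low bits are clear (flags 0 and 4).
  L.HasKindOrType =
      (Flags & (ElemPassiveOrDeclarative | ElemTableOrDeclarative)) != 0;
  L.UsesInitExprs = (Flags & ElemUsesInitExprs) != 0;
  return L;
}
```

For a flags value of 4 this yields an active segment on the implicit table 0, with no elemkind or type descriptor and with init exprs as items, which is the case described in the commit message.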
2024-11-29[Support][Error] Add ErrorAsOutParameter constructor that takes an Error by ref.Lang Hames1-1/+1
ErrorAsOutParameter's Error* constructor supports cases where an Error might not be passed in (because in the calling context it's known that this call won't fail). Most clients always have an Error present however, and for them an Error& overload is more convenient.
2024-11-19[Object] Remove unused includes (NFC) (#116750)Kazu Hirata1-3/+0
Identified with misc-include-cleaner.
2024-11-12[llvm] Remove redundant control flow statements (NFC) (#115831)Kazu Hirata1-1/+0
Identified with readability-redundant-control-flow.
2024-11-04[WebAssembly] Remove WASM_FEATURE_PREFIX_REQUIRED (NFC) (#113729)Heejin Ahn1-1/+0
This has not been emitted since https://github.com/llvm/llvm-project/commit/3f34e1b883351c7d98426b084386a7aa762aa366. The corresponding proposed tool-conventions change: https://github.com/WebAssembly/tool-conventions/pull/236
2024-07-12[lld][WebAssembly] Report undefined symbols in -shared/-pie builds (#75242)Sam Clegg1-5/+8
Previously we would ignore all undefined symbols when using `-shared` or `-pie`. All undefined symbols would be treated as imports regardless of whether those symbols were defined in any shared library. With this change we now track symbols in shared libraries and report undefined symbols in the main program by default. The old behavior is still available via the `--unresolved-symbols=import-dynamic` command line flag. The rationale for allowing this type of breaking change is that `-pie` and `-shared` are both still experimental and will warn as such, unless `--experimental-pic` is passed. As part of this change the linker now models shared library symbols via new SharedFunctionSymbol and SharedDataSymbol types. I've also added a new `--no-shlib-sigcheck` option that bypasses the checking of function signatures in shared libraries. This is specifically required by emscripten in the case where the imports/exports of shared libraries have been modified via JS type legalization (this is only needed when targeting old JS engines where bigint is not yet available). See https://github.com/emscripten-core/emscripten/issues/18198
2024-05-28[WebAssembly] Add exnref type (#93586)Heejin Ahn1-2/+6
This adds (back) the exnref type restored in the new EH proposal adopted in Oct 2023 CG meeting: https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md:x
2024-02-15[Object][Wasm] Use offset instead of index for Global address and store size ↵Derek Schuff1-12/+20
(#81781) Currently the address reported by binutils for a global is its index; but its offset (in the file or section) is more useful for binary size attribution. This PR treats globals similarly to functions, and tracks their offset and size. It also centralizes the logic differentiating linked from object and dylib files (where section addresses are 0).
2024-02-09[llvm-nm][WebAssembly] Print function symbol sizes (#81315)Derek Schuff1-0/+14
nm already prints sizes for data symbols. Do that for function symbols too, and update objdump to also print size information. Implements item 3 from https://github.com/llvm/llvm-project/issues/76107
2024-02-08[Object][WebAssembly] Improve error on invalid relocation (#81203)Sam Clegg1-22/+18
See https://github.com/emscripten-core/emscripten/issues/21140
2024-02-08[Object][Wasm] Generate symbol info from name section names (#81063)Derek Schuff1-4/+45
Currently symbol info is generated from a linking section or from export names. This PR generates symbols in a WasmObjectFile from the name section as well, which allows tools like objdump and nm to show useful information for more linked binaries. There are some limitations: most notably that we don't assume any particular ABI, so we don't get detailed information about data symbols if the segments are merged (which is the default). Covers most of the desired functionality from #76107
2024-02-07[Object][Wasm] Use file offset for section addresses in linked wasm files ↵Derek Schuff1-1/+7
(#80529) Wasm has no unified virtual memory space as other object formats and architectures do, so previously WasmObjectFile reported 0 for all section addresses, and until 428cf71ff used section offsets for function symbols. Now we use file offsets for function symbols, and this change switches section addresses to do the same (in linked files). The main result of this is that objdump now reports VMAs in section listings, and also uses file offsets rather than section offsets when disassembling linked binaries (matching the behavior of other disassemblers and stack traces produced by browsers). To make this work, this PR also updates objdump's generation of synthetic fallback symbols to match lib/Object and also correctly plumbs symbol types for regular and dummy symbols through to the backend to avoid needing special knowledge of address 0. This also paves the way for generating symbols from name sections rather than symbol tables or imports (see #76107) by allowing the disassembler's synthetic fallback symbols to match the name-section generated symbols (in a followup PR).
2024-02-02[Object][Wasm] Move WasmSymbolInfo directly into WasmSymbol (NFC) (#80219)Derek Schuff1-9/+2
Move the WasmSymbolInfos from their own vector on the WasmLinkingData directly into the WasmSymbol object. Removing the const-ref to an external object allows the vector of WasmSymbols to be safely expanded/reallocated; generating symbol info from the name section will require this, as the numbers of function and data segment names are stored separately. This is a step toward generating symbol information from name sections for #76107
2024-01-25[Object][Wasm] Allow parsing of GC types in type and table sections (#79235)Derek Schuff1-28/+145
This change allows a WasmObjectFile to be created from a wasm file even if it uses typed funcrefs and GC types. It does not significantly change how lib/Object models its various internal types (e.g. WasmSignature, WasmElemSegment), so LLVM does not really "support" or understand such files, but it is sufficient to parse the type, global and element sections, discarding types that are not understood. This is useful for low-level binary tools such as nm and objcopy, which use only limited aspects of the binary (such as function definitions) or deal with sections as opaque blobs. This is done by allowing `WasmValType` to have a value of `OTHERREF` (representing any unmodeled reference type), and adding a field to `WasmSignature` indicating it's a placeholder for an unmodeled reference type (since there is a 1:1 correspondence between WasmSignature objects and types in the type section). Then the object file parsers for the type and element sections are expanded to parse encoded reference types and discard any unmodeled fields.
2024-01-17[WebAssembly] Use ValType instead of integer types to model wasm tables (#78012)Derek Schuff1-11/+12
LLVM models some features found in the binary format with raw integers and others with nested or enumerated types. This PR switches modeling of tables and segments to use wasm::ValType rather than uint32_t. This NFC change is in preparation for modeling more reference types, but IMO is also cleaner and closer to the spec.
2024-01-03Reland "[WebAssembly][Object]Use file offset as function symbol address for ↵Derek Schuff1-4/+13
linked files (#76198)" WebAssembly doesn't have a single virtual memory space the way other object formats or architectures do, so "addresses" mean different things depending on the context. Function symbol addresses in object files are offsets from the start of the code section. This is good for linking and relocation. However when dealing with linked binaries, offsets from the start of the file/module are more often used (e.g. for stack traces in browsers), and are more useful for use cases like binary size attribution. This PR changes Object to use the file offset instead of the section offset for function symbols, but only for linked (non-DSO) files. This is a reland of fc5f51cf with a fix for the MSan failure (it was not caused by this change, but it was revealed by the new tests).
2024-01-03Revert "[WebAssembly][Object]Use file offset as function symbol address for ↵Mitch Phillips1-12/+4
linked files (#76198)" This reverts commit fc5f51cf5af4364b38bf22e491d46e1e892ade0c. Reason: Broke the sanitizer buildbot - https://lab.llvm.org/buildbot/#/builders/5/builds/39751/steps/12/logs/stdio
2024-01-02[WebAssembly][Object]Use file offset as function symbol address for linked ↵Derek Schuff1-4/+12
files (#76198) WebAssembly doesn't have a single virtual memory space the way other object formats or architectures do, so "addresses" mean different things depending on the context. Function symbol addresses in object files are offsets from the start of the code section. This is good for linking and relocation. However when dealing with linked binaries, offsets from the start of the file/module are more often used (e.g. for stack traces in browsers), and are more useful for use cases like binary size attribution. This PR changes Object to use the file offset instead of the section offset for function symbols, but only for linked (non-DSO) files. This implements item number 4 from #76107
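The address choice described here amounts to one conditional. The following is a simplified model rather than the code in WasmObjectFile.cpp; the struct, field and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of a function symbol.
struct FunctionInfo {
  uint32_t OffsetInCodeSection; // offset from the start of the code section
};

// For linked (non-DSO) files, report the offset from the start of the file;
// for object files (and DSOs), keep the offset within the code section.
inline uint64_t functionSymbolAddress(const FunctionInfo &F,
                                      bool IsLinkedNonDSO,
                                      uint64_t CodeSectionFileOffset) {
  assert(CodeSectionFileOffset <= UINT64_MAX - F.OffsetInCodeSection &&
         "file offset must not overflow");
  return IsLinkedNonDSO ? CodeSectionFileOffset + F.OffsetInCodeSection
                        : F.OffsetInCodeSection;
}
```

With a code section starting at file offset 0x100 and a function at section offset 0x20, the sketch reports 0x120 for a linked file and 0x20 for an object file.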
2023-12-26[WebAssembly] Add bounds check in parseCodeSection (#76407)DavidKorczynski1-0/+5
This is needed as otherwise `Ctx.Ptr` will be incremented to a position outside its available buffer, which is being used to read values e.g. https://github.com/llvm/llvm-project/blob/966d564e43e650b9c34f9c67829d3947f52add91/llvm/lib/Object/WasmObjectFile.cpp#L1469 Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=28856 Signed-off-by: David Korczynski <david@adalogics.com>
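The kind of check involved can be sketched as below. This is a minimal illustration, not the actual patch: `ReadContext` here is a stand-in struct with only `Ptr` and `End` fields, and `skipBytes` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a parser read context; only the fields needed for the sketch.
struct ReadContext {
  const uint8_t *Ptr; // current read position
  const uint8_t *End; // one past the last readable byte
};

// Advance Ptr by Size bytes only if that stays inside the buffer.
// Returns false (leaving Ptr untouched) when Size exceeds the remaining bytes.
inline bool skipBytes(ReadContext &Ctx, uint32_t Size) {
  assert(Ctx.Ptr <= Ctx.End && "context invariant violated");
  if (Size > static_cast<size_t>(Ctx.End - Ctx.Ptr))
    return false;
  Ctx.Ptr += Size;
  return true;
}
```

Comparing the requested size against the remaining length, rather than computing `Ptr + Size` first, avoids forming an out-of-range pointer at all.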
2023-12-21[WebAssembly][Object] Record section start offsets at start of payload (#76188)Derek Schuff1-1/+1
LLVM ObjectFile currently records the start offsets of sections as the start of the section header, whereas most other tools (WABT, emscripten, wasm-tools) record it as the start of the section content, after the header. This affects binutils tools such as objdump and nm, but not compilation/assembly (since that is driven by symbols and assembler labels which already have their values inside the section payload rather than in the header). This patch updates LLVM to match the other tools.
2023-12-20[WebAssembly] Add symbol information for shared libraries (#75238)Sam Clegg1-3/+47
The current (experimental) spec for WebAssembly shared libraries does not include a full symbol table like the object format. This change extracts symbol information from the normal wasm exports. This is the first step in having the linker report undefined symbols when linking with shared libraries. The current behaviour is to ignore all undefined symbols when linking with `-pie` or `-shared`. See https://github.com/emscripten-core/emscripten/issues/18198
2023-12-11[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)Kazu Hirata1-1/+1
This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.
2023-10-03[WebAssembly] Allow absolute symbols in the linking section (symbol table) ↵Sam Clegg1-9/+13
(#67493) Fixes a crash in `-Wl,-emit-relocs` where the linker was not able to write linker-synthetic absolute symbols to the symbol table. This change adds a new symbol flag (`WASM_SYMBOL_ABS`), which means that the symbol's offset is absolute and not relative to a given segment. Such symbols include `__stack_low` and `__stack_high`. Note that wasm object files never contain such symbols, only binaries linked with `-Wl,-emit-relocs`. Fixes: #67111
2023-08-25[llvm-nm][WebAssembly] Report the size of data symbolsSam Clegg1-1/+1
Fixes: https://github.com/llvm/llvm-project/issues/58839 Differential Revision: https://reviews.llvm.org/D158799
2023-07-27[WebAssembly][Objcopy] Write output section headers identically to inputsDerek Schuff1-0/+4
Previously when objcopy generated section headers, it padded the LEB that encodes the section size out to 5 bytes, matching the behavior of clang. This is correct, but results in a binary that differs from the input. This can sometimes have undesirable consequences (e.g. breaking source maps). This change makes the object reader remember the size of the LEB encoding in the section header, so that llvm-objcopy can reproduce it exactly. For sections not read from an object file (e.g. that llvm-objcopy is adding itself), pad to 5 bytes. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D155535
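The difference between a minimal and a padded LEB encoding, and why remembering the encoded length allows an exact round trip, can be shown with a small self-contained sketch. The function and struct names are illustrative and are not LLVM's LEB128 API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode Value as unsigned LEB128. If PadTo is nonzero, emit redundant
// continuation bytes so the result is at least PadTo bytes long.
inline std::vector<uint8_t> encodeULEB128(uint32_t Value, unsigned PadTo = 0) {
  assert(PadTo <= 5 && "a 32-bit ULEB128 occupies at most 5 bytes");
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0 || Out.size() + 1 < PadTo)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
  while (Out.size() < PadTo) // padding bytes carry no value bits
    Out.push_back(Out.size() + 1 < PadTo ? 0x80 : 0x00);
  return Out;
}

struct DecodedULEB {
  uint32_t Value;
  unsigned Length; // number of bytes the encoding occupied
};

// Decode one ULEB128 value and report how many bytes it used.
// Input without a terminating byte is not diagnosed in this sketch.
inline DecodedULEB decodeULEB128(const std::vector<uint8_t> &Bytes) {
  uint32_t Value = 0;
  unsigned Shift = 0, Length = 0;
  for (uint8_t Byte : Bytes) {
    ++Length;
    if (Shift < 32)
      Value |= static_cast<uint32_t>(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0)
      break;
  }
  return {Value, Length};
}
```

A section size of 3 encodes minimally as the single byte `03` and padded to 5 bytes as `83 80 80 80 00`; both decode to 3, so only the recorded length tells a writer which form to reproduce.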
2023-07-11[WebAssembly] Support `annotate` clang attributes for marking functions.Brendan Dahl1-0/+2
Annotation attributes may be attached to a function to mark it with custom data that will be contained in the final Wasm file. The annotation causes a custom section named "func_attr.annotate.<name>.<arg0>.<arg1>..." to be created that will contain each function's index value that was marked with the annotation. A new patchable relocation type for function indexes had to be created so the custom section could be updated during linking. Reviewed By: sbc100 Differential Revision: https://reviews.llvm.org/D150803
2023-06-26Move SubtargetFeature.h from MC to TargetParserJob Noorman1-1/+1
SubtargetFeature.h is currently part of MC while it doesn't depend on anything in MC. Since some LLVM components might have the need to work with target features without necessarily needing MC, it might be worthwhile to move SubtargetFeature.h to a different location. This will reduce the dependencies of said components. Note that I chose TargetParser as the destination because that's where Triple lives and SubtargetFeatures feels related to that. This issue came up during a JITLink review (D149522). JITLink would like to avoid a dependency on MC while still needing to store target features. Reviewed By: MaskRay, arsenm Differential Revision: https://reviews.llvm.org/D150549
2023-02-27[lld][WebAssembly] Fix handling of mixed strong and weak referencesSam Clegg1-1/+12
When adding undefined symbols to the symbol table, if the existing reference is weak, replace the symbol flags with (potentially) non-weak binding. Fixes: https://github.com/llvm/llvm-project/issues/60829 Differential Revision: https://reviews.llvm.org/D144747
2023-02-07[NFC][TargetParser] Remove llvm/ADT/Triple.hArchibald Elliott1-1/+1
I also ran `git clang-format` to get the headers in the right order for the new location, which has changed the order of other headers in two files.
2023-01-16[llvm-objdump][RISCV] Use new common method to parse ARCH RISCV attributeElena Lepilkina1-1/+1
Differential Revision: https://reviews.llvm.org/D139553
2022-12-09Revert D139098 "[Alignment] Use Align for ObjectFile::getSectionAlignment"Guillaume Chatelet1-2/+2
This breaks lld. This reverts commit 10c47465e2505ddfee4e62a2ab2e535abea3ec56.
2022-12-09[Alignment] Use Align for ObjectFile::getSectionAlignmentGuillaume Chatelet1-2/+2
Differential Revision: https://reviews.llvm.org/D139098
2022-08-31[lld][WebAssemby] Allow import module names to be empty strings.Dan Gohman1-12/+4
The component-model [canonical ABI] is currently using import names with empty strings. Remove the special cases for empty strings from WasmObjectFile.cpp so that they can pass through as-is. [canonical ABI]: https://github.com/WebAssembly/component-model/blob/main/design/mvp/CanonicalABI.md Differential Revision: https://reviews.llvm.org/D133037
2022-07-17[llvm] Modernize bool literals (NFC)Kazu Hirata1-1/+1
Identified with modernize-use-bool-literals.
2022-06-23[WebAssembly][Object] Remove requirement that objects must have code sectionsDerek Schuff1-13/+3
When parsing name and linking sections, we currently require that the object must have a code section (it seems that this was intended to verify section ordering). However it can be useful for binaries to have their code sections stripped out (e.g. if we just want the debug info). In that case we need the rest of the known sections (so e.g. we know how many functions there are, to verify the name section) but not the actual code. I've removed the restriction completely. I think this is OK because the section-parsing code already checks function and global indices in many places for validity and will return appropriate errors if the relevant sections are missing. Also we can't just replace the requirement of seeing a code section with a requirement that we see a function or global section, because a binary may just not have any functions or globals. But there's only a problem if the name or linking section tries to name a nonexistent function. Part of a fix for https://github.com/emscripten-core/emscripten/issues/13084 Differential Revision: https://reviews.llvm.org/D128094
2022-06-20[llvm] Don't use Optional::getValue (NFC)Kazu Hirata1-1/+1
2022-06-07[WebAssembly] Add WASM_SEC_LAST_KNOWN to BinaryFormat section types list [NFC]Derek Schuff1-1/+1
There are 3 places where we were using WASM_SEC_TAG as the "last" known section type, which requires updating (or leaves a bug) when a new known section type is added. Instead add a "last type" to the enum for this purpose. Differential Revision: https://reviews.llvm.org/D127164
2022-05-27[WebAssembly] Consolidate sectionTypeToString in BinaryFormat [NFC]Derek Schuff1-21/+3
Currently there are 2 duplicate implementations, and I want to add a use in a 3rd place. Combine them in lib/BinaryFormat so they can be shared. Also update toString for symbol and reloc types to use StringRef. Differential Revision: https://reviews.llvm.org/D126553
2022-03-15[WebAssembly] Fix asan issue from https://reviews.llvm.org/D121349Sam Clegg1-0/+1
2022-03-14[WebAssembly] Fix asan issue from https://reviews.llvm.org/D121349Sam Clegg1-0/+1
2022-03-14[WebAssembly] Second phase of implemented extended const proposalSam Clegg1-21/+56
This change continues to lay the ground work for supporting extended const expressions in the linker. The included test covers object file reading and writing and the YAML representation. Differential Revision: https://reviews.llvm.org/D121349
2022-02-10Cleanup LLVMObject headersserge-sans-paille1-2/+0
Most notably,
llvm/Object/Binary.h no longer includes llvm/Support/MemoryBuffer.h
llvm/Object/MachOUniversal*.h no longer include llvm/Object/Archive.h
llvm/Object/TapiUniversal.h no longer includes llvm/Object/TapiFile.h
llvm-project preprocessed size:
before: 1068185081
after: 1068324320
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119457
2021-11-05[NFC] Inclusive language: Remove instances of master in URLsQuinn Pham1-1/+1
[NFC] This patch fixes URLs containing "master". Old URLs were either broken or redirecting to the new URL. Reviewed By: #libc, ldionne, mehdi_amini Differential Revision: https://reviews.llvm.org/D113186
2021-10-15[WebAssembly] Add import info to `dylink` section of shared librariesSam Clegg1-0/+8
See https://github.com/WebAssembly/tool-conventions/pull/175 Differential Revision: https://reviews.llvm.org/D111345
2021-10-12[WebAssembly] Make EH work with dynamic linkingHeejin Ahn1-2/+4
This makes Wasm EH work with dynamic linking. So far we were only able to handle destructors, which do not use any tags or LSDA info.
1. This uses `TargetExternalSymbol` for `GCC_except_tableN` symbols, which points to the address of per-function LSDA info. It is more convenient to use than `MCSymbol` because it can take additional target flags.
2. When lowering `wasm_lsda` intrinsic, if PIC is enabled, make the symbol relative to `__memory_base` and generate the `add` node. If PIC is disabled, continue to use the absolute address.
3. Make tag symbols (`__cpp_exception` and `__c_longjmp`) undefined in the backend, because it is hard to make it work with dynamic linking's loading order. Instead, we make all tag symbols undefined in the LLVM backend and import it from JS.
4. Add support for undefined tags to the linker.
Companion patches:
- https://github.com/WebAssembly/binaryen/pull/4223
- https://github.com/emscripten-core/emscripten/pull/15266
Reviewed By: sbc100 Differential Revision: https://reviews.llvm.org/D111388
2021-10-05[WebAssembly] Remove WasmTagTypeHeejin Ahn1-10/+21
This removes `WasmTagType`. `WasmTagType` contained an attribute and a signature index:
```
struct WasmTagType {
  uint8_t Attribute;
  uint32_t SigIndex;
};
```
Currently the attribute field is not used and reserved for future use, and always 0. And that this class contains `SigIndex` as its property is a little weird, because the tag type's signature index is not an inherent property of a tag but rather a reference to another section that changes after linking. This also makes tag handling in the linker weird, in that tag-related methods take both `WasmTagType` and `WasmSignature` even though `WasmTagType` contains a signature index. This is because the signature index changes in linking so it doesn't have any info at this point.
This instead moves `SigIndex` to `struct WasmTag` itself, as we did for `struct WasmFunction` in D111104. In this CL, in lib/MC and lib/Object, this now treats tag types in the same way as function types. Also in YAML, this removes `struct Tag`, because now it only contains the tag index. Also tags set `SigIndex` in `WasmImport` union, as functions do.
I think this makes things simpler and makes tag handling more in line with function handling. These two share similar properties in that both of them have signatures, but they are kind of nominal so having the same signature doesn't mean they are the same element.
Also a drive-by fix: the reserved 'attribute' part's encoding changed from uleb32 to uint8 a while ago. This was fixed in lib/MC and lib/Object but not in YAML. This doesn't change object files because the field's value is always 0 and its encoding is the same for both encodings.
This is effectively NFC; I didn't mark it as such just because it changed YAML test results.
Reviewed By: sbc100, tlively Differential Revision: https://reviews.llvm.org/D111086