aboutsummaryrefslogtreecommitdiff
path: root/lldb/source/Plugins/ObjectFile
AgeCommit message (Collapse)AuthorFilesLines
8 hoursRevert "[lldb][MachO][NFC] Extract ObjC metadata symbol parsing into helper ↵Augusto Noronha1-54/+124
function (#161536)" This reverts commit 23e081524fd9f64fb3430822e879b6dc36a1d3f1.
9 hours[lldb][MachO][NFC] Extract ObjC metadata symbol parsing into helper function ↵Michael Buch1-124/+54
(#161536) Just a simple de-duplication of the same code. We saw a bug here recently (https://github.com/llvm/llvm-project/pull/161521). Might as well isolate this all in one place. rdar://158159242
10 hours[lldb][MachO] Fix inspection of global variables that start with 'O' (#161521)Michael Buch1-48/+42
On Darwin C-symbols are prefixed with a '_'. The LLDB Macho-O parses handles Objective-C metadata symbols starting with '_OBJC' specially. Previously global symbols starting with a '_O' prefix were lost because of incorrectly scoped if-guards. This patch removes those checks. There is more cleanup that can be done in this file because there's a bunch of duplicated checks for these ObjC symbols. I decided to leave that for an NFC follow-up. Depends on https://github.com/llvm/llvm-project/pull/161520 rdar://158159242
6 daysModify ObjectFileELF so it can load notes from PT_NOTE segments. (#160652)Greg Clayton1-0/+18
The ObjectFileELF parser was not able to load ELF notes from PT_NOTE program headers. This patch fixes ObjectFileELF::GetUUID() to check the program header and parse the notes in any PT_NOTE segments. This will allow memory ELF files to extract the UUID from an in memory image that has no section headers. Added a test that creates an ELF file, strips all section headers, and then makes sure that LLDB can see the UUID value.
12 days[lldb][MachO] Local structs for larger VA offsets (#159849)Jason Molenda2-43/+171
The Mach-O file format has several load commands which specify the location of data in the file in UInt32 offsets. lldb uses these same structures to track the offsets of the binary in virtual address space when it is running. Normally a binary is loaded in memory contiguously, so this is fine, but on Darwin systems there is a "system shared cache" where all system libraries are combined into one region of memory and pre-linked. The shared cache has the TEXT segments for every binary loaded contiguously, then the DATA segments, and finally a shared common LINKEDIT segment for all binaries. The virtual address offset from the TEXT segment for a libray to the LINKEDIT may exceed 4GB of virtual address space depending on the structure of the shared cache, so this use of a UInt32 offset will not work. There was an initial instance of this issue that I fixed last November in https://github.com/llvm/llvm-project/pull/117832 where I fixed this issue for the LC_SYMTAB / `symtab_command` structure. But we have the same issue now with three additional structures; `linkedit_data_command`, `dyld_info_command`, and `dysymtab_command`. For all of these we can see the pattern of `dyld_info.export_off += linkedit_slide` applied to the offset fields in ObjectFileMachO. This defines local structures that mirror the Mach-O structures, except that it uses UInt64 offset fields so we can reuse the same field for a large virtual address offset at runtime. I defined ctor's from the genuine structures, as well as operator= methods so the structures can be read from the Mach-O binary into the standard object, then copied into our local expanded versions of them. These structures are ABI in Mach-O and cannot change their layout. The alternative is to create local variables alongside these Mach-O load command objects for the offsets that we care about, adjust those by the correct VA offsets, and only use those local variables instead of the fields in the objects. I took the approach of the local enhanced structure in November and I think it is the cleaner approach. rdar://160384968
2025-09-09[lldb] Unwind through ARM Cortex-M exceptions automatically (#153922)Jason Molenda2-0/+38
When a processor faults/is interrupted/gets an exception, it will stop running code and jump to an exception catcher routine. Most processors will store the pc that was executing in a system register, and the catcher functions have special instructions to retrieve that & possibly other registers. It may then save those values to stack, and the author can add .cfi directives to tell lldb's unwinder where to find those saved values. ARM Cortex-M (microcontroller) processors have a simpler mechanism where a fixed set of registers are saved to the stack on an exception, and a unique value is put in the link register to indicate to the caller that this has taken place. No special handling needs to be written into the exception catcher, unless it wants to inspect these preserved values. And it is possible for a general stack walker to walk the stack with no special knowledge about what the catch function does. This patch adds an Architecture plugin method to allow an Architecture to override/augment the UnwindPlan that lldb would use for a stack frame, given the contents of the return address register. It resembles a feature where the LanguageRuntime can replace/augment the unwind plan for a function, but it is doing it at offset by one level. The LanguageRuntime is looking at the local register context and/or symbol name to decide if it will override the unwind rules. For the Cortex-M exception unwinds, we need to modify THIS frame's unwind plan if the CALLER's LR had a specific value. RegisterContextUnwind has to retrieve the caller's LR value before it has completely decided on the UnwindPlan it will use for THIS stack frame. This does mean that we will need one additional read of stack memory than we currently do when unwinding, on Armv7 Cortex-M targets. The unwinder walks the stack lazily, as stack frames are requested, and so now if you ask for 2 stack frames, we will read enough stack to walk 2 frames, plus we will read one extra word of memory, the spilled LR value from the stack. In practice, with 512-byte memory cache reads, this is unlikely to be a real performance hit. This PR includes a test with a yaml corefile description and a JSON ObjectFile, incorporating all of the necessary stack memory and symbol names from a real debug session I worked on. The architectural default unwind plans are used for all stack frames except the 0th because there's no instructions for the functions, and no unwind info. I may need to add an encoding of unwind fules to ObjectFileJSON in the future as we create more test cases like this. This PR depends on the yaml2macho-core utility from https://github.com/llvm/llvm-project/pull/153911 to run its API test. rdar://110663219
2025-08-27[LLDB] Omit loading local symbols in LLDB symbol table (#154809)barsolo20001-1/+40
https://discourse.llvm.org/t/rfc-should-we-omit-local-symbols-in-eekciihgtfvflvnbieicunjlrtnufhuelf-files-from-the-lldb-symbol-table/87384 Improving symbolication by excluding local symbols that are typically not useful for debugging or symbol lookups. This aligns with the discussion that local symbols, especially those with STB_LOCAL binding and STT_NOTYPE type (including .L-prefixed symbols), often interfere with symbol resolution and can be safely omitted. --------- Co-authored-by: Bar Soloveychik <barsolo@fb.com>
2025-08-26[lldb] Corretly parse Wasm segments (#154727)Jonas Devlieghere2-128/+214
My original implementation for parsing Wasm segments was wrong in two related ways. I had a bug in calculating the file vm address and I didn't fully understand the difference between active and passive segments and how that impacted their file vm address. With this PR, we now support parsing init expressions for active segments, rather than just skipping over them. This is necessary to determine where they get loaded. Similar to llvm-objdump, we currently only support simple opcodes (i.e. constants). We also currently do not support active segments that use a non-zero memory index. However this covers all segments for a non-trivial Swift binary compiled to Wasm.
2025-08-26[lldb] Do not use LC_FUNCTION_STARTS data to determine symbol size as ↵Alex Langford1-122/+0
symbols are created (#155282) Note: This is a resubmission of #106791. I had to revert this a year ago for a failing test that I could not understand. I have time now to try and get this in again. Summary: This improves the performance of ObjectFileMacho::ParseSymtab by removing eager and expensive work in favor of doing it later in a less-expensive fashion. Experiment: My goal was to understand LLDB's startup time. First, I produced a Debug build of LLDB (no dSYM) and a Release+NoAsserts build of LLDB. The Release build debugged the Debug build as it debugged a small C++ program. I found that ObjectFileMachO::ParseSymtab accounted for somewhere between 1.2 and 1.3 seconds consistently. After applying this change, I consistently measured a reduction of approximately 100ms, putting the time closer to 1.1s and 1.2s on average. Background: ObjectFileMachO::ParseSymtab will incrementally create symbols by parsing nlist entries from the symtab section of a MachO binary. As it does this, it eagerly tries to determine the size of symbols (e.g. how long a function is) using LC_FUNCTION_STARTS data (or eh_frame if LC_FUNCTION_STARTS is unavailable). Concretely, this is done by performing a binary search on the function starts array and calculating the distance to the next function or the end of the section (whichever is smaller). However, this work is unnecessary for 2 reasons: 1. If you have debug symbol entries (i.e. STABs), the size of a function is usually stored right after the function's entry. Performing this work right before parsing the next entry is unnecessary work. 2. Calculating symbol sizes for symbols of size 0 is already performed in `Symtab::InitAddressIndexes` after all the symbols are added to the Symtab. It also does this more efficiently by walking over a list of symbols sorted by address, so the work to calculate the size per symbol is constant instead of O(log n).
2025-08-19[lldb] Improve error handling in ObjectFileWasm (#154433)Jonas Devlieghere1-77/+102
Improve error handling in ObjectFileWasm by using helpers that wrap their result in an llvm::Expected. The helper to read a Wasm string now return an Expected<std::string> and I created a helper to parse 32-bit ULEBs that returns an Expected<uint32_t>.
2025-08-19[lldb] Create sections for Wasm segments (#153634)Jonas Devlieghere1-19/+65
This is a continuation of #153494. In a WebAssembly file, the "name" section contains names for the segments in the data section (WASM_NAMES_DATA_SEGMENT). We already parse these as symbols, and with this PR, we now also create sub-sections for each of the segments.
2025-08-14[lldb] Support parsing data symbols from the Wasm name section (#153494)Jonas Devlieghere1-22/+92
This PR adds support for parsing the data symbols from the WebAssembly name section, which consists of a name and address range for the segments in the Wasm data section. Unlike other object file formats, Wasm has no symbols for referencing items within those segments (i.e. symbols the user has defined).
2025-08-13[lldb] Use numeric_limits for all overflow checks in ObjectFileWasm (#153332)Jonas Devlieghere1-6/+6
Use std::numeric_limits<uint32_t>::max() for all overflow checks in ObjectFileWasm and fix a few locations where I incorrectly used `>=` instead of `>`.
2025-08-12[lldb] Support parsing the Wasm symbol table (#153093)Jonas Devlieghere2-10/+136
This PR adds support for parsing the WebAssembly symbol table. The symbol table is encoded in the "names" section and contains names and indexes into other sections. For now we only support parsing function (code) symbols. The result is that you can set breakpoints by symbol name, while previously breakpoints by name required debug info (DWARF). This is also necessary for Swift, which checks for the presence of `swift_release` as a heuristic to determine if there's a static Swift stdlib.
2025-08-11[Minidump] Update Minidump file builder to continue when the Module's ↵barsolo20001-22/+31
section cannot be found (#152009) Instead of returning an error when: - it can't obtain section information from a module. - there are other issues calculating the size. when we encounter such an error we log the error and continue with the other modules. tested with lldb/test/API/functionalities/process_save_core_minidump/TestProcessSaveCoreMinidump.py --------- Co-authored-by: Bar Soloveychik <barsolo@fb.com>
2025-07-30[lldb] Support DW_OP_WASM_location in DWARFExpression (#151010)Jonas Devlieghere1-1/+5
Add support for DW_OP_WASM_location in DWARFExpression. This PR rebases #78977 and cleans up the unit test. The DWARF extensions are documented at https://yurydelendik.github.io/webassembly-dwarf/ and supported by LLVM-based toolchains such as Clang, Swift, Emscripten, and Rust.
2025-07-24[lldb] Fix uninitialized memory access. (#150544)Jorge Gorbe Moya1-3/+3
lldb/test/API/functionalities/process_save_core_minidump/TestProcessSaveCoreMinidump64b.py fails under msan with uninitialized memory access errors. The problem is that a few structs are written to the dump without having been fully initialized. This change makes them default-initialized so dumping the fields that aren't explicitly written to won't trigger UB.
2025-07-18[LLDB] Fix Memory64 BaseRVA, move all non-stack memory to Mem64. (#146777)Jacob Lalonde1-39/+39
### Context Over a year ago, I landed support for 64b Memory ranges in Minidump (#95312). In this patch we added the Memory64 list stream, which is effectively a Linked List on disk. The layout is a sixteen byte header and then however many Memory descriptors. ### The Bug This is a classic off-by one error, where I added 8 bytes instead of 16 for the header. This caused the first region to start 8 bytes before the correct RVA, thus shifting all memory reads by 8 bytes. We are correctly writing all the regions to disk correctly, with no physical corruption but the RVA is defined wrong, meaning we were incorrectly reading memory ![image](https://github.com/user-attachments/assets/049ef55d-856c-4f3c-9376-aeaa3fe8c0e1) ### Why wasn't this caught? One problem we've had is forcing Minidump to actually use the 64b mode, it would be a massive waste of resources to have a test that actually wrote >4.2gb of IO to validate the 64b regions, and so almost all validation has been manual. As a weakness of manual testing, this issue is psuedo non-deterministic, as what regions end up in 64b or 32b is handled greedily and iterated in the order it's laid out in /proc/pid/maps. We often validated 64b was written correctly by hexdumping the Minidump itself, which was not corrupted (other than the BaseRVA) ![image](https://github.com/user-attachments/assets/b599e3be-2d59-47e2-8a2d-75f182bb0b1d) ### Why is this showing up now? During internal usage, we had a bug report that the Minidump wasn't displaying values. I was unable to repro the issue, but during my investigation I saw the variables were in the 64b regions which resulted in me identifying the bug. ### How do we prevent future regressions? To prevent regressions, and honestly to save my sanity for figuring out where 8 bytes magically came from, I've added a new API to SBSaveCoreOptions. ```SBSaveCoreOptions::GetMemoryRegionsToSave()``` The ability to get the memory regions that we intend to include in the Coredump. I added this so we can compare what we intended to include versus what was actually included. Traditionally we've always had issues comparing regions because Minidump includes `/proc/pid/maps` and it can be difficult to know what memoryregion read failure was a genuine error or just a page that wasn't meant to be included. We are also leveraging this API to choose the memory regions to be generated, as well as for testing what regions should be bytewise 1:1. After much debate with @clayborg, I've moved all non-stack memory to the Memory64 List. This list doesn't incur us any meaningful overhead and Greg originally suggested doing this in the original 64b PR. This also means we're exercising the 64b path every single time we save a Minidump, preventing regressions on this feature from slipping through testing in the future. Snippet produced by [minidump.py](https://github.com/clayborg/scripts) ``` MINIDUMP_MEMORY_LIST: NumberOfMemoryRanges = 0x00000002 MemoryRanges[0] = [0x00007f61085ff9f0 - 0x00007f6108601000) @ 0x0003f655 MemoryRanges[1] = [0x00007ffe47e50910 - 0x00007ffe47e52000) @ 0x00040c65 MINIDUMP_MEMORY64_LIST: NumberOfMemoryRanges = 0x000000000000002e BaseRva = 0x0000000000042669 MemoryRanges[0] = [0x00005584162d8000 - 0x00005584162d9000) MemoryRanges[1] = [0x00005584162d9000 - 0x00005584162db000) MemoryRanges[2] = [0x00005584162db000 - 0x00005584162dd000) MemoryRanges[3] = [0x00005584162dd000 - 0x00005584162ff000) MemoryRanges[4] = [0x00007f6100000000 - 0x00007f6100021000) MemoryRanges[5] = [0x00007f6108800000 - 0x00007f6108828000) MemoryRanges[6] = [0x00007f6108828000 - 0x00007f610899d000) MemoryRanges[7] = [0x00007f610899d000 - 0x00007f61089f9000) MemoryRanges[8] = [0x00007f61089f9000 - 0x00007f6108a08000) MemoryRanges[9] = [0x00007f6108bf5000 - 0x00007f6108bf7000) ``` ### Misc As a part of this fix I had to look at LLDB logs a lot, you'll notice I added `0x` to many of the PRIx64 `LLDB_LOGF`. This is so the user (or I) can directly copy paste the address in the logs instead of adding the hex prefix themselves. Added some SBSaveCore tests for the new GetMemoryAPI, and Docstrings. CC: @DavidSpickett, @da-viper @labath because we've been working together on save-core plugins, review it optional and I didn't tag you but figured you'd want to know
2025-07-02[lldb] remove do-nothing defaults in case statements,Jason Molenda1-7/+0
unbreak gcc CI bots.
2025-07-02[lldb] Fix warningsKazu Hirata1-0/+3
This patch fixes: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:415:7: error: label at end of compound statement is a C++23 extension [-Werror,-Wc++23-extensions] lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:536:7: error: label at end of compound statement is a C++23 extension [-Werror,-Wc++23-extensions] lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:672:7: error: label at end of compound statement is a C++23 extension [-Werror,-Wc++23-extensions]
2025-07-02[lldb][NFC][MachO] Clean up LC_THREAD reading code, remove i386 corefile ↵Jason Molenda1-177/+13
(#146480) While fixing bugs in the x86_64 LC_THREAD parser in ObjectFileMachO, I noticed that the other LC_THREAD parsers are all less clear than they should be. To recap, a Mach-O LC_THREAD load command has a byte size for the entire payload. Within the payload, there will be one or more register sets provided. A register set starts with a UInt32 "flavor", the type of register set defined in the system headers, and a UInt32 "count", the number of UInt32 words of memory for this register set. After one register set, there may be additional sets. A parser can skip an unknown register set flavor by using the count field to get to the next register set. When the total byte size of the LC_THREAD load command has been parsed, it is completed. This patch fixes the riscv/arm/arm64 LC_THREAD parsers to use the total byte size as the exit condition, and to skip past unrecognized register sets, instead of stopping parsing. Instead of fixing the i386 corefile support, I removed it. The last macOS that supported 32-bit Intel code was macOS 10.14 in 2018. I also removed i386 KDP support, 32-bit intel kernel debugging hasn't been supported for even longer than that. It would be preferable to do these things separately, but I couldn't bring myself to update the i386 LC_THREAD parser, and it required very few changes to remove this support entirely.
2025-06-30[lldb][Mach-O] Fix several bugs in x86_64 Mach-O corefile (#146460)Jason Molenda1-40/+26
reading, and one bug in the new RegisterContextUnifiedCore class. The PR I landed a few days ago to allow Mach-O corefiles to augment their registers with additional per-thread registers in metadata exposed a few bugs in the x86_64 corefile reader when running under different CI environments. It also showed a bug in my RegisterContextUnifiedCore class where I wasn't properly handling lookups of unknown registers (e.g. the LLDB_GENERIC_RA when debugging an intel target). The Mach-O x86_64 corefile support would say that it had fpu & exc registers available in every corefile, regardless of whether they were actually present. It would only read the bytes for the first register flavor in the LC_THREAD, the GPRs, but it read them incorrectly, so sometimes you got more register context than you'd expect. The LC_THREAD register context specifies a flavor and the number of uint32_t words; the ObjectFileMachO method would read that number of uint64_t's, exceeding the GPR register space, but it was followed by FPU and then EXC register space so it didn't crash. If you had a corefile with GPR and EXC register bytes, it would be written into the GPR and then FPU register areas, with zeroes filling out the rest of the context.
2025-06-27[lldb][Mach-O] Allow "process metadata" LC_NOTE to supply registers (#144627)Jason Molenda2-20/+46
The "process metadata" LC_NOTE allows for thread IDs to be specified in a Mach-O corefile. This extends the JSON recognzied in that LC_NOTE to allow for additional registers to be supplied on a per-thread basis. The registers included in a Mach-O corefile LC_THREAD load command can only be one of the register flavors that the kernel (xnu) defines in <mach/arm/thread_status.h> for arm64 -- the general purpose registers, floating point registers, exception registers. JTAG style corefile producers may have access to many additional registers beyond these that EL0 programs typically use, for instance TCR_EL1 on AArch64, and people developing low level code need access to these registers. This patch defines a format for including these registers for any thread. The JSON in "process metadata" is a dictionary that must have a `threads` key. The value is an array of entries, one per LC_THREAD in the Mach-O corefile. The number of entries must match the LC_THREADs so they can be correctly associated. Each thread's dictionary must have two keys, `sets`, and `registers`. `sets` is an array of register set names. If a register set name matches one from the LC_THREAD core registers, any registers that are defined will be added to that register set. e.g. metadata can add a register to the "General Purpose Registers" set that lldb shows users. `registers` is an array of dictionaries, one per register. Each register must have the keys `name`, `value`, `bitsize`, and `set`. It may provide additional keys like `alt-name`, that `DynamicRegisterInfo::SetRegisterInfo` recognizes. This `sets` + `registers` formatting is the same that is used by the `target.process.python-os-plugin-path` script interface uses, both are parsed by `DynamicRegisterInfo`. The one addition is that in this LC_NOTE metadata, each register must also have a `value` field, with the value provided in big-endian base 10, as usual with JSON. In RegisterContextUnifiedCore, I combine the register sets & registers from the LC_THREAD for a specific thread, and the metadata sets & registers for that thread from the LC_NOTE. Even if no LC_NOTE is present, this class ingests the LC_THREAD register contexts and reformats it to its internal stores before returning itself as the RegisterContex, instead of shortcutting and returning the core's native RegisterContext. I could have gone either way with that, but in the end I decided if the code is correct, we should live on it always. I added a test where we process save-core to create a userland corefile, then use a utility "add-lcnote" to strip the existing "process metadata" LC_NOTE that lldb put in it, and adds a new one from a JSON string. rdar://74358787 --------- Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
2025-06-24Reapply "[lldb/cmake] Plugin layering enforcement mechanism (#144543)" (#145305)Pavel Labath1-0/+2
The only difference from the original PR are the added BRIEF and FULL_DOCS arguments to define_property, which are required for cmake<3.23.
2025-06-23Revert "[lldb/cmake] Plugin layering enforcement mechanism (#144543)"Pavel Labath1-2/+0
Causes failures on several bots. This reverts commits 714b2fdf3a385e5b9a95c435f56b1696ec3ec9e8 and e7c1da7c8ef31c258619c1668062985e7ae83b70.
2025-06-23[lldb/cmake] Plugin layering enforcement mechanism (#144543)Pavel Labath1-0/+2
Some inter-plugin dependencies are okay, others are not. Yet others not, but we're sort of stuck with them. The idea is to be able to prevent backsliding while making sure that acceptable dependencies are.. accepted. For context, see https://github.com/llvm/llvm-project/pull/139170 and the attached changes to the documentation.
2025-06-17[lldb][AIX] Added XCOFF ParseSymtab handling (#141577)Dhruv Srivastava1-1/+101
This PR is in reference to porting LLDB on AIX. Link to discussions on llvm discourse and github: 1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640 2. https://github.com/llvm/llvm-project/issues/101657 The complete changes for porting are present in this draft PR: https://github.com/llvm/llvm-project/pull/102601 **Description:** Adding ParseSymtab logic after creating sections. It is able to handle both 32 and 64 bit symbols, without the need to add template logic. This is an incremental PR on top of my previous couple of XCOFF support commits.
2025-06-09[lldb][Mach-O] Fix DWARF5 debugging regression for Mach-OJason Molenda1-0/+7
A unification of the DWARF section names, https://github.com/llvm/llvm-project/pull/141344 broke dwarf5 debugging with Mach-O files. The str_offset and str_offset.dwo names are different in Mach-O from other object files.
2025-06-09[LLDB] Unify DWARF section name matching (#141344)nerix5-175/+20
Different object file formats support DWARF sections (COFF, ELF, MachO, PE/COFF, WASM). COFF and PE/COFF only matched a subset. This caused some GCC executables produced on MinGW to have issue later on when debugging. One example is that `.debug_rnglists` was not matched, which caused range-extraction to fail when printing a backtrace. This unifies the parsing of section names in `ObjectFile::GetDWARFSectionTypeFromName`, so all file formats can use the same naming convention. Since the prefixes are different, `GetDWARFSectionTypeFromName` only matches the suffixes (i.e. `.debug_` needs to be stripped before). I added two tests to ensure the sections are correctly identified on Windows executables.
2025-06-05Revert "[lldb] Set default object format to `MachO` in `ObjectFileMachO` ↵Jason Molenda1-1/+0
(#142704)" This reverts commit d4d2f069dec4fb8b13447f52752d4ecd08d976d6. Temporarily reverting until we can find a way to get the correct ObjectFile set in Module's Triples without adding "-macho" to the triple string for each Module. This is breaking TestUniversal.py on the x86_64 macOS CI bots.
2025-06-04[lldb] Set default object format to `MachO` in `ObjectFileMachO` (#142704)royitaqi1-0/+1
# The Change This patch sets the **default** object format of `ObjectFileMachO` to be `MachO` (instead of what currently ends up to be `ELF`, see below). This should be **the correct thing to do**, because the code before the line of change has already verified the Mach-O header. The existing logic: * In `ObjectFileMachO`, the object format is unassigned by default. So it's `UnknownObjectFormat` (see [code](https://github.com/llvm/llvm-project/blob/54d544b83141dc0b20727673f68793728ed54793/llvm/lib/TargetParser/Triple.cpp#L1024)). * The code then looks at load commands like `LC_VERSION_MIN_*` ([code](https://github.com/llvm/llvm-project/blob/54d544b83141dc0b20727673f68793728ed54793/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp#L5180-L5217)) and `LC_BUILD_VERSION` ([code](https://github.com/llvm/llvm-project/blob/54d544b83141dc0b20727673f68793728ed54793/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp#L5231-L5252)) and assign the Triple's OS and Environment if they exist. * If the above sets the Triple's OS to macOS, then the object format defaults to `MachO`; otherwise it is `ELF` ([code](https://github.com/llvm/llvm-project/blob/54d544b83141dc0b20727673f68793728ed54793/llvm/lib/TargetParser/Triple.cpp#L936-L937)) # Impact For **production usage** where Mach-O files have the said load commands (which is [expected](https://www.google.com/search?q=Are+mach-o+files+expected+to+have+the+LC_BUILD_VERSION+load+command%3F)), this patch won't change anything. * **Important note**: It's not clear if there are legitimate production use cases where the Mach-O files don't have said load commands. If there is, the exiting code think they are `ELF`. This patch changes it to `MachO`. This is considered a fix for such files. For **unit tests**, this patch will simplify the yaml data by not requiring the said load commands. # Test See PR.
2025-06-04[lldb/cmake] Implicitly pass arguments to llvm_add_library (#142583)Pavel Labath11-34/+34
If we're not touching them, we don't need to do anything special to pass them along -- with one important caveat: due to how cmake arguments work, the implicitly passed arguments need to be specified before arguments that we handle. This isn't particularly nice, but the alternative is enumerating all arguments that can be used by llvm_add_library and the macros it calls (it also relies on implicit passing of some arguments to llvm_process_sources).
2025-06-02[lldb][AIX] Added support to load DW_ranges section (#142356)Hemang Gadhavi1-0/+1
This PR is in reference to porting LLDB on AIX. Link to discussions on llvm discourse and github: 1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640 2. https://github.com/llvm/llvm-project/issues/101657 The complete changes for porting are present in this draft PR: https://github.com/llvm/llvm-project/pull/102601 - [lldb] [AIX] Added support to load Dwarf Ranges(.dwranges) section.
2025-05-29[LLDB][Minidump] Fix bug in generating 64b memory minidumps (#141995)Jacob Lalonde1-1/+2
In #129307, we introduced read write in chunks, and during the final revision of the PR I changed the behavior for 64b memory regions and did not test an actual 64b memory range. This caused LLDB to crash whenever we generated a 64b memory region. 64b regions has been a problem in testing for some time as it's a waste of test resources to generation a 5gb+ Minidump. I will work with @clayborg and @labath to come up with a way to specify creating a 64b list instead of a 32b list (likely via the yamilizer).
2025-05-24[lldb] Use llvm::stable_sort (NFC) (#141352)Kazu Hirata1-2/+1
2025-05-22[lldb] Disable some unwind plans for discontinuous functions (#140927)Pavel Labath2-22/+27
Basically, disable everything except the eh_frame unwind plan, as that's the only one which supports this right now. The other plans are working with now trying the interpret everything in between the function parts as a part of the function, which is more likely to produce wrong results than correct ones. I changed the interface for object file plans, to give the implementations a chance to implement this correctly, but I haven't actually converted PECallFrameInfo (its only implementation) to handle that. (from the looks of things, it should be relatively easy to do, if it becomes necessary) I'm also deleting UnwindPlan::GetFirstNonPrologueInsn, as it's not used, and it doesn't work for discontinuous functions.
2025-05-21[lldb] Remove unused local variables (NFC) (#140989)Kazu Hirata1-1/+0
2025-05-16[lldb][AIX] Added 32-bit XCOFF Executable support (#139875)Dhruv Srivastava2-17/+52
This PR is in reference to porting LLDB on AIX. Link to discussions on llvm discourse and github: 1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640 2. https://github.com/llvm/llvm-project/issues/101657 The complete changes for porting are present in this draft PR: https://github.com/llvm/llvm-project/pull/102601 **Description:** Adding support for XCOFF 32 bit file format as well in lldb, up to the point where 64-bit support is implemented. Added a new test case for the same. This is an incremental PR on top of the previous couple of XCOFF support commits.
2025-05-12[lldb][AIX] Support for XCOFF Sections (#131304)Dhruv Srivastava1-1/+49
This PR is in reference to porting LLDB on AIX. Link to discussions on llvm discourse and github: 1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640 2. https://github.com/llvm/llvm-project/issues/101657 The complete changes for porting are present in this draft PR: https://github.com/llvm/llvm-project/pull/102601 Incremental PR on ObjectFileXCOFF.cpp This PR is intended to handle XCOFF sections.
2025-05-08[lldb] Fix asan failure in MinidumpFileBuilderFelipe de Azevedo Piovezan1-1/+1
As per comment in https://github.com/llvm/llvm-project/pull/138698#issuecomment-2860369432
2025-05-07[LLDB][Minidump] Add some buffer directories (#138943)Jacob Lalonde1-0/+6
Add a generous amount of buffer directories. I found out some LLDB forks (internal and external) had custom ranges that could fail because we didn't pre-account for those. To prevent this from being a problem, I've added a large number of buffer directories at the cost of 240 bytes.
2025-04-23[lldb][MachO] MachO corefile support for riscv32 binaries (#137092)Jason Molenda1-0/+152
Add support for reading a macho corefile with CPU_TYPE_RISCV and the riscv32 general purpose register file. I added code for the floating point and exception registers too, but haven't exercised this. If we start putting the full CSR register bank in a riscv corefile, it'll be in separate 4k byte chunks, but I don't have a corefile to test against that so I haven't written the code to read it. The RegisterContextDarwin_riscv32 is copied & in the style of the other RegisterContextDarwin classes; it's not the first choice I would make for representing this, but it wasn't worth changing for this cputype. rdar://145014653
2025-04-09[NFC][LLDB] Clean up comments/some code in MinidumpFileBuilder (#134961)Jacob Lalonde1-19/+9
I've recently been working on Minidump File Builder again, and some of the comments are out of date, and many of the includes are no longer used. This patch removes unneeded includes and rephrases some comments to better fit with the current state after the read write chunks pr.
2025-04-08[lldb][Minidump] Fix MAX_WRITE_CHUNK_SIZE typeJason Molenda1-1/+1
MinidumpFileBuilder calls std::min(MAX_WRITE_CHUNK_SIZE, func_returning_uint64_t) and on Darwin this errors out with unsigned long long & unsigned long not being the same type.
2025-04-08[LLDB][Minidump]Update MinidumpFileBuilder to read and write in chunks (#129307)Jacob Lalonde2-34/+86
I recently received an internal error report that LLDB was OOM'ing when creating a Minidump. In my 64b refactor we made a decision to acquire buffers the size of the largest memory region so we could read all of the contents in one call. This made error handling very simple (and simpler coding for me!) but had the trade off of large allocations if huge pages were enabled. This patch is one I've had on the back burner for awhile, but we can read and write the Minidump memory sections in discrete chunks which we already do for writing to disk. I had to refactor the error handling a bit, but it remains the same. We make a best effort attempt to read as much of the memory region as possible, but fail immediately if we receive an error writing to disk. I did not add new tests for this because our existing test suite is quite good, but I did manually verify a few Minidumps couldn't read beyond the red_zone. ``` (lldb) reg read $sp rsp = 0x00007fffffffc3b0 (lldb) p/x 0x00007fffffffc3b0 - 128 (long) 0x00007fffffffc330 (lldb) memory read 0x00007fffffffc330 0x7fffffffc330: 60 c3 ff ff ff 7f 00 00 60 cd ff ff ff 7f 00 00 `.......`....... 0x7fffffffc340: 60 c3 ff ff ff 7f 00 00 65 e6 26 00 00 00 00 00 `.......e.&..... (lldb) memory read 0x00007fffffffc329 error: could not parse memory info (Success!) ``` I'm not sure how to quantify the memory improvement other than we would allocate the largest size regardless of the size. So a 2gb unreadable region would cause a 2gb allocation even if we were reading 4096 kb. Now we will take the range size or the max chunk size of 128 mb.
2025-03-21 [lldb] s/ValidRange/ValidRanges in UnwindPlan (#127661)Pavel Labath1-2/+2
To be able to describe discontinuous functions, this patch changes the UnwindPlan to accept more than one address range. I've also squeezed in a couple improvements/modernizations, for example using the lower_bound function instead of a linear scan.
2025-03-19[lldb] Use UnwindPlan::Row as values (#131150)Pavel Labath1-13/+13
In most places, the rows are copied anyway (because they are generated by cumulating modifications) immediately after adding them to the unwind plans. In others, they can be moved into the unwind plan. This lets us remove some backflip copies and make `const UnwindPlan` actually mean something. I've split this patch into two (and temporarily left both APIs) as this patch was getting a bit big. This patch covers all the interesting cases. Part two all about converting "architecture default" unwind plans from ABI and InstructionEmulation plugins.
2025-03-07Add complete ObjectFileJSON support for sections. (#129916)Greg Clayton1-22/+21
Sections now support specifying: - user IDs - file offset/size - alignment - flags - bool values for fake, encrypted and thread specific sections
2025-03-06[lldb][Mach-O] Don't read symbol table of specially marked binary (#129967)Jason Molenda2-24/+45
We have a binary image on Darwin that has no code, only metadata. It has a large symbol table with many external symbol names that will not be needed in the debugger. And it is possible to not have this binary on the debugger system - so lldb must read all of the symbol names out of memory, one at a time, which can be quite slow. We're adding a section __TEXT,__lldb_no_nlist, to this binary to indicate that lldb should not read the nlist symbols for it when we are reading out of memory. If lldb is run with an on-disk version of the binary, we will load the symbol table as we normally would, there's no benefit to handling this binary differently. I added a test where I create a dylib with this specially named section, launch the process. The main binary deletes the dylib from the disk so lldb is forced to read it out of memory. lldb attaches to the binary, confirms that the dylib is present in the process and is a memory Module. If the binary is not present, or lldb found the on-disk copy because it hasn't been deleted yet, we delete the target, flush the Debugger's module cache, sleep and retry, up to ten times. I create the specially named section by compiling an assembly file that puts a byte in the section which makes for a bit of a messy Makefile (the pre-canned actions to build a dylib don't quite handle this case) but I don't think it's much of a problem. This is a purely skipUnlessDarwin test case. Relanding this change with a restructured Makefiles for the test case that should pass on the CI bots. rdar://146167816
2025-03-06Revert "[lldb][Mach-O] Don't read symbol table of specially marked binary ↵Jason Molenda2-45/+24
(#129967)" This reverts commit 397696bb3d26c1298bf265e4907b0b6416ad59c9. This breaks the macOS CI bots, I need to use $LDFLAGS in the $LD invocation when building the dylib to get the dylibs to build on the CI bots. But I've added "-lno-nlists -lhas-nlists" to the LDFLAGS for the main binary in the same directory, so using LDFLAGS will result in a compile error for the dylibs. I'll need to build the dylibs in a subdir with a different Makefile, will reland with that change in a bit.