aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/CodeGen/AsmPrinter
AgeCommit message (Collapse)AuthorFilesLines
23 hoursReland "[DebugInfo][DwarfDebug] Separate creation and population of abstract ↵Vladislav Dzhidzhoev8-58/+162
subprogram DIEs" (#160786) This is an attempt to reland https://github.com/llvm/llvm-project/pull/159104 with the fix for https://github.com/llvm/llvm-project/issues/160197. The original patch had the following problem: when an abstract subprogram DIE is constructed from within `DwarfDebug::endFunctionImpl()`, `DwarfDebug::constructAbstractSubprogramScopeDIE()` acknowledges `unit:` field of DISubprogram. But an abstract subprogram DIE constructed from `DwarfDebug::beginModule()` was put in the same compile unit to which global variable referencing the subprogram belonged, regardless of subprogram's `unit:`. This is fixed by adding `DwarfDebug::getOrCreateAbstractSubprogramCU()` used by both`DwarfDebug:: constructAbstractSubprogramScopeDIE()` and `DwarfCompileUnit::getOrCreateSubprogramDIE()` when abstract subprogram is queried during the creation of DIEs for globals in `DwarfDebug::beginModule()`. The fix and the already-reviewed code from https://github.com/llvm/llvm-project/pull/159104 are two separate commits in this PR. ===== The original commit message follows: With this change, construction of abstract subprogram DIEs is split in two stages/functions: creation of DIE (in DwarfCompileUnit::getOrCreateAbstractSubprogramDIE) and its population with children (in DwarfCompileUnit::constructAbstractSubprogramScopeDIE). With that, abstract subprograms can be created/referenced from DwarfDebug::beginModule, which should solve the issue with static local variables DIE creation of inlined functons with optimized-out definitions. It fixes https://github.com/llvm/llvm-project/issues/29985. LexicalScopes class now stores mapping from DISubprograms to their corresponding llvm::Function's. It is supposed to be built before processing of each function (so, now LexicalScopes class has a method for "module initialization" alongside the method for "function initialization"). It is used by DwarfCompileUnit to determine whether a DISubprogram needs an abstract DIE before DwarfDebug::beginFunction is invoked. DwarfCompileUnit::getOrCreateSubprogramDIE method is added, which can create an abstract or a concrete DIE for a subprogram. It accepts llvm::Function* argument to determine whether a concrete DIE must be created. This is a temporary fix for https://github.com/llvm/llvm-project/issues/29985. Ideally, it will be fixed by moving global variables and types emission to DwarfDebug::endModule (https://reviews.llvm.org/D144007, https://reviews.llvm.org/D144005). Some code proposed by Ellis Hoag <ellis.sparky.hoag@gmail.com> in https://github.com/llvm/llvm-project/pull/90523 was taken for this commit.
5 days[TII] Split isTrivialReMaterializable into two versions [nfc] (#160377)Philip Reames1-5/+1
This change builds on https://github.com/llvm/llvm-project/pull/160319 which tries to clarify which *callers* (not backends) assume that the result is actually trivial. This change itself should be NFC. Essentially, I'm just renaming the existing isTrivialRematerializable to the non-trivial version and then adding a new trivial version (with the same name as the prior function) and simplifying a few callers which want that semantic. This change does *not* enable non-trivial remat any more broadly than was already done for our targets which were lying through the old APIs; that will come separately. The goal here is simply to make the code easier to follow in terms of what assumptions are being made where. --------- Co-authored-by: Luke Lau <luke_lau@icloud.com>
6 days[Debug][AArch64] Do not crash on unknown subreg register sizes. (#160442)David Green1-1/+1
The AArch64 zsub regs are scalable, so defined with a size of -1 (which comes through as 65535). The RegisterSize is only 128, so code to try and find overlapping regs of a z30_z31 in DwarfEmitter can crash on trying to access out of range bits in a BitVector. Hexagon and x86 also contain subregs with unknown sizes. Ideally most of these would be scalable values but in the meantime add a check that the register are small enough to overlap with the current register size, to prevent us from crashing. This fixes the issue reported on #153810.
7 daysUpdate callers of isTriviallyReMaterializable to check trivialness (#160319)Philip Reames1-1/+5
This is a preparatory change for an upcoming reorganization of our rematerialization APIs. Despite the interface being documented as "trivial" (meaning no virtual register uses on the instruction being considered for remat), our actual implementation inconsistently supports non-trivial remat, and certain backends (AMDGPU and RISC-V mostly) lie about instructions being trivial to abuse that. We want to allow non-triial remat more broadly, but first we need to do some cleanup to make it understandable what's going on. These three call sites are ones which appear to actually want the trivial definition, and appear fairly low risk to change. p.s. I'm deliberately *not* updating any APIs in this change, I'm going to do that as a followup once it's clear which category each callsite fits in.
7 daysRevert "[DebugInfo][DwarfDebug] Separate creation and population of abstract ↵Vladislav Dzhidzhoev7-141/+52
subprogram DIEs" (#160349) Reverts llvm/llvm-project#159104 due to the issues reported in https://github.com/llvm/llvm-project/issues/160197.
8 days[llvm][AsmPrinter] Add direct calls to callgraph section (#155706)Prabhu Rajasekaran1-16/+44
Extend CallGraphSection to include metadata about direct calls. This simplifies the design of tools that must parse .callgraph section to not require dependency on MC layer.
8 days[Remarks] Restructure bitstream remarks to be fully standalone (#156715)Tobias Stadler1-10/+9
Currently there are two serialization modes for bitstream Remarks: standalone and separate. The separate mode splits remark metadata (e.g. the string table) from actual remark data. The metadata is written into the object file by the AsmPrinter, while the remark data is stored in a separate remarks file. This means we can't use bitstream remarks with tools like opt that don't generate an object file. Also, it is confusing to post-process bitstream remarks files, because only the standalone files can be read by llvm-remarkutil. We always need to use dsymutil to convert the separate files to standalone files, which only works for MachO. It is not possible for clang/opt to directly emit bitstream remark files in standalone mode, because the string table can only be serialized after all remarks were emitted. Therefore, this change completely removes the separate serialization mode. Instead, the remark string table is now always written to the end of the remarks file. This requires us to tell the serializer when to finalize remark serialization. This automatically happens when the serializer goes out of scope. However, often the remark file goes out of scope before the serializer is destroyed. To diagnose this, I have added an assert to alert users that they need to explicitly call finalizeLLVMOptimizationRemarks. This change paves the way for further improvements to the remark infrastructure, including more tooling (e.g. #159784), size optimizations for bitstream remarks, and more. Pull Request: https://github.com/llvm/llvm-project/pull/156715
11 days[WebAssembly] Require tags for Wasm EH and Wasm SJLJ to be defined ↵Sam Clegg2-24/+1
externally (#159143) Rather then defining these tags in each object file that requires them we can can declare them as undefined and require that they defined externally in, for example, compiler-rt or libcxxabi.
12 days[DebugInfo] Emit skeleton to avoid mismatching inlining flags (#153568)Qiu Chaofan1-12/+8
This actually reverts 418120556398c01550d42500d56e6d328290185b. The original commit omits unit with all symbols inlined into current one, which leads to crash when a module using split-dwarf inlined a function from another module with mismatched split-dwarf-inlining option. This revert guarantees that DIEs are created in both DWO and the skeleton sections whenever split-dwarf is active.
13 days[DebugInfo][DwarfDebug] Separate creation and population of abstract ↵Vladislav Dzhidzhoev7-52/+141
subprogram DIEs (#159104) With this change, construction of abstract subprogram DIEs is split in two stages/functions: creation of DIE (in DwarfCompileUnit::getOrCreateAbstractSubprogramDIE) and its population with children (in DwarfCompileUnit::constructAbstractSubprogramScopeDIE). With that, abstract subprograms can be created/referenced from DwarfDebug::beginModule, which should solve the issue with static local variables DIE creation of inlined functons with optimized-out definitions. It fixes https://github.com/llvm/llvm-project/issues/29985. LexicalScopes class now stores mapping from DISubprograms to their corresponding llvm::Function's. It is supposed to be built before processing of each function (so, now LexicalScopes class has a method for "module initialization" alongside the method for "function initialization"). It is used by DwarfCompileUnit to determine whether a DISubprogram needs an abstract DIE before DwarfDebug::beginFunction is invoked. DwarfCompileUnit::getOrCreateSubprogramDIE method is added, which can create an abstract or a concrete DIE for a subprogram. It accepts llvm::Function* argument to determine whether a concrete DIE must be created. This is a temporary fix for https://github.com/llvm/llvm-project/issues/29985. Ideally, it will be fixed by moving global variables and types emission to DwarfDebug::endModule (https://reviews.llvm.org/D144007, https://reviews.llvm.org/D144005). Some code proposed by Ellis Hoag <ellis.sparky.hoag@gmail.com> in https://github.com/llvm/llvm-project/pull/90523 was taken for this commit.
2025-09-08[llvm][DebugInfo] Emit DW_OP_lit0/1 for constant boolean values (#157167)Laxman Sole3-2/+16
Backends like NVPTX use -1 to indicate `true` and 0 to indicate `false` for boolean values. Machine instruction `#DBG_VALUE` also uses -1 to indicate a `true` boolean constant. However, during the DWARF generation, booleans are treated as unsigned variables, and the debug_loc expression, like `DW_OP_lit0; DW_OP_not` is emitted for the `true` value. This leads to the debugger printing `255` instead of `true` for constant boolean variables. This change emits `DW_OP_lit1` instead of `DW_OP_lit0; DW_OP_not`.
2025-09-01[llvm][DebugInfo] Support DW_AT_linkage_names that are different between ↵Michael Buch1-5/+2
declaration and definition (#154137) This patch is motivated by https://github.com/llvm/llvm-project/pull/149827, where we plan on using mangled names on structor declarations to find the exact structor definition that LLDB's expression evaluator should call. So far LLVM expects the declaration and definition linkage names to be identical (or the declaration to just not have a linkage name). But we plan on attaching the GCC-style "unified" mangled name to declarations, which will be different to linkage name on the definition. This patch relaxes this restriction.
2025-08-30Revert "Emit DW_OP_lit0/1 for constant boolean values" (#156172)Michael Buch3-16/+2
Reverts llvm/llvm-project#155539 Failing on buildbots with: ``` Step 7 (test-build-stage1-unified-tree-check-all) failure: test (failure) ******************** TEST 'LLVM :: DebugInfo/debug-bool-const-location.ll' FAILED ******************** Exit Code: 1 Command Output (stderr): -- /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/llc /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll -O3 -filetype=obj -o - | /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/llvm-dwarfdump - | /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll # RUN: at line 2 + /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/llc /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll -O3 -filetype=obj -o - + /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/llvm-dwarfdump - + /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll:7:10: error: CHECK: expected string not found in input ; CHECK: {{.*}} DW_OP_lit0 ^ <stdin>:27:54: note: scanning from here [0x0000000000000018, 0x0000000000000020): DW_OP_lit1, DW_OP_stack_value ^ <stdin>:28:41: note: possible intended match here [0x0000000000000020, 0x0000000000000034): DW_OP_reg3 X3) ^ Input file: <stdin> Check file: /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll -dump-input=help explains the following input dump. Input was: <<<<<< . . . 22: DW_AT_decl_line (5) 23: DW_AT_external (true) 24: 25: 0x0000003f: DW_TAG_variable 26: DW_AT_location (0x00000000: 27: [0x0000000000000018, 0x0000000000000020): DW_OP_lit1, DW_OP_stack_value check:7'0 X~~~~~~~~~~~~~~~~~~~ error: no match found 28: [0x0000000000000020, 0x0000000000000034): DW_OP_reg3 X3) check:7'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ check:7'1 ? possible intended match 29: DW_AT_name ("arg") check:7'0 ~~~~~~~~~~~~~~~~~~~~ 30: DW_AT_decl_file ("test") check:7'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 31: DW_AT_decl_line (5) check:7'0 ~~~~~~~~~~~~~~~~~~~~~ 32: DW_AT_type (0x0000004f "bool") check:7'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 33: check:7'0 ~ . ```
2025-08-30Emit DW_OP_lit0/1 for constant boolean values (#155539)Laxman Sole3-2/+16
Backends like NVPTX use -1 to indicate `true` and 0 to indicate `false` for boolean values. Machine instruction `#DBG_VALUE` also uses -1 to indicate a `true` boolean constant. However, during the DWARF generation, booleans are treated as unsigned variables, and the debug_loc expression, like `DW_OP_lit0; DW_OP_not` is emitted for the `true` value. This leads to the debugger printing `255` instead of `true` for constant boolean variables. This change emits `DW_OP_lit1` instead of `DW_OP_lot0; DW_OP_not`.
2025-08-29[DwarfDebug] Avoid generating extra DW_TAG_subprogram entries (#154636)Vladislav Dzhidzhoev1-0/+7
The test llvm/test/DebugInfo/X86/pr12831.ll was added in 4d358b55fa to fix the issue with emission of empty DW_TAG_subprogram tags (https://bugs.llvm.org/show_bug.cgi?id=12831). However, the test output is not checked properly, and it contains: ``` 0x00000206: DW_TAG_subprogram 0x00000207: DW_TAG_reference_type DW_AT_type (0x00000169 "class ") ``` The reason is that the DIE for the definition DISubprogram "writeExpr" is created during the call to `getOrCreateSubprogramDIE(declaration of writeExpr)`. Therefore, when `getOrCreateSubprogramDIE(definition of writeExpr)` is first called, we get a recursive chain of calls: ``` getOrCreateSubprogramDIE(definition of writeExpr) getOrCreateSubprogramDIE(declaration of writeExpr) ... getOrCreateSubprogramDIE(definition of writeExpr) ``` The outer call doesn't expect that the DIE for the definition of writeExpr will be created during the creation of declaration DIE. So, another DIE is created for the same subprogram. In this PR, a check is added to fix that.
2025-08-25[SHT_LLVM_BB_ADDR_MAP] Change the callsite feature to emit end of callsites. ↵Rahman Lavaee1-11/+12
(#155041) This PR simply moves the callsite anchors from the beginning of callsites to their end. Emitting the end of callsites is more sensible as it allows breaking the basic block into subblocks which end with control transfer instructions.
2025-08-08[IR] Introduce the `ptrtoaddr` instructionAlexander Richardson1-0/+1
This introduces a new `ptrtoaddr` instruction which is similar to `ptrtoint` but has two differences: 1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance 2) `ptrtoaddr` only extracts (and then extends/truncates) the low index-width bits of the pointer For most architectures, difference 2) does not matter since index (address) width and pointer representation width are the same, but this does make a difference for architectures that have pointers that aren't just plain integer addresses such as AMDGPU fat pointers or CHERI capabilities. This commit introduces textual and bitcode IR support as well as basic code generation, but optimization passes do not handle the new instruction yet so it may result in worse code than using ptrtoint. Follow-up changes will update capture tracking, etc. for the new instruction. RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54 Reviewed By: nikic Pull Request: https://github.com/llvm/llvm-project/pull/139357
2025-08-06[DebugInfo][DWARF] Add heapallocsite information (#132073)Jann Horn3-18/+22
LLVM currently stores heapallocsite information in CodeView debuginfo, but not in DWARF debuginfo. Plumb it into DWARF as an LLVM-specific extension. heapallocsite debug information is useful when it is combined with allocator instrumentation that stores caller addresses; I've used a previous version of this patch for: - analyzing memory usage by object type - analyzing the distributions of values of class members Other possible uses might be: - attributing memory access profiles (for example, on Intel CPUs, from PEBS records with Linear Data Address) to object types or specific object members - adding type information to crash/ASAN reports
2025-08-05[DebugInfo][DWARF] Don't emit bogus DW_AT_call_target for complex calls ↵Jann1-3/+12
(#151378) On X86-64, LLVM currently generates the same DWARF debug info for `call rax` and `call [rax]`; in both cases, the generated DWARF claims that the call goes to address RAX. This bug occurs because the X86 machine instructions CALL64r and CALL64m both receive register operands, but those register operands have different semantics. To fix it, change DwarfDebug::constructCallSiteEntryDIEs() to validate the callee operand's semantics (`OperandType`) and make sure it is not semantically describing a memory location. This fix will result in less DW_TAG_call_site and DW_AT_call_target entries being generated. There is an existing test in dwarf-callsite-related-attrs.ll that asserts the broken behavior; remove the broken check, and instead add a new test dwarf-callsite-related-attrs-indirect.ll that checks behavior for indirect calls. The existing test xray-custom-log.ll is validating something even more broken: It checks the debug info generated by a PATCHABLE_EVENT_CALL. `TII->getCalleeOperand()` assumes that the first argument of a call instruction is always the destination, but the first argument of PATCHABLE_EVENT_CALL is instead the event structure; and so we were emitting debug info claiming the callee was stored in a register that actually contains some kind of xray event descriptor, and the test validates that this happens. I am breaking and deleting this test. I guess the intent there might have been to validate that we emit debuginfo referencing the target of the direct call that LLVM emits (which we don't do)? But I'm not sure.
2025-08-05[AsmPrinter] Remove an unnecessary cast (NFC) (#152085)Kazu Hirata1-1/+1
getValue() already returns uint64_t.
2025-08-03MCSymbolWasm: Remove classofFangrui Song1-1/+2
The object file format specific derived classes are used in context where the type is statically known. We don't use isa/dyn_cast and we want to eliminate MCSymbol::Kind in the base class.
2025-08-03MCSymbolELF: Migrate away from classofFangrui Song1-1/+1
The object file format specific derived classes are used in context where the type is statically known. We don't use isa/dyn_cast and we want to eliminate MCSymbol::Kind in the base class.
2025-08-03MCSymbolELF: Migrate away from classofFangrui Song1-2/+2
The object file format specific derived classes are used in context where the type is statically known. We don't use isa/dyn_cast and we want to eliminate MCSymbol::Kind in the base class.
2025-07-31[llvm][AsmPrinter] Emit call graph sectionPrabhu Rajasekaran1-0/+108
Collect the necessary information for constructing the call graph section, and emit to .callgraph section of the binary. MD5 hash of the callee_type metadata string is used as the numerical type id emitted. Reviewers: ilovepi Reviewed By: ilovepi Pull Request: https://github.com/llvm/llvm-project/pull/87576
2025-07-27[AsmPrinter] Remove an unnecessary cast (NFC) (#150839)Kazu Hirata1-4/+3
getLabelAfterInsn() already returns MCSymbol *.
2025-07-26MCSectionXCOFF: Remove classofFangrui Song1-2/+2
The object file format specific derived classes are used in context like MCStreamer and MCObjectTargetWriter where the type is statically known. We don't use isa/dyn_cast and we want to eliminate MCSection::SectionVariant in the base class.
2025-07-25MCSectionCOFF: Remove classofFangrui Song2-6/+7
The object file format specific derived classes are used in context like MCStreamer and MCObjectTargetWriter where the type is statically known. We don't use isa/dyn_cast and we want to eliminate MCSection::SectionVariant in the base class.
2025-07-24Reapply "Support SFrame command-line and .cfi_section syntax (#150316) (#150509)Sterling-Augustine2-3/+5
This reverts commit ad36e4284d66c3609ef8675ef02ff1844bc1951d, fixing a single uninitialized bit (which cannot be detected with Address Sanitizer). This PR adds support for the llvm-mc command-line flag "--gsframe" and adds ".sframe" to the legal values passed ".cfi_section". It plumbs the option through the cfi handling code a fair amount. Code to support actual section generation follows in a future PR. These options match the gnu-assembler's support syntax for sframes, on both the command line and in assembly files. First in a series of changes that will allow llvm-mc to produce sframe .cfi sections. For more information about sframes, see https://sourceware.org/binutils/docs-2.44/sframe-spec.html and the llvm-RFC here: https://discourse.llvm.org/t/rfc-adding-sframe-support-to-llvm/86900
2025-07-23[AsmPrinter] Remove an unnecessary cast (NFC) (#150257)Kazu Hirata1-1/+1
getTag already returns dwarf::Tag.
2025-07-23Revert "Support SFrame command-line and .cfi_section syntax (#149935)" (#150316)Sterling-Augustine2-5/+3
This reverts commit f9d0bd02d966e5c28aca9a6ceadd5ffec6aa9f78.
2025-07-23Support SFrame command-line and .cfi_section syntax (#149935)Sterling-Augustine2-3/+5
This PR adds support for the llvm-mc command-line flag "--gsframe" and adds ".sframe" to the legal values passed ".cfi_section". It plumbs the option through the cfi handling code a fair amount. Code to support actual section generation follows in a future PR. These options match the gnu-assembler's support syntax for sframes, on both the command line and in assembly files. First in a series of changes that will allow llvm-mc to produce sframe .cfi sections. For more information about sframes, see https://sourceware.org/binutils/docs-2.44/sframe-spec.html and the llvm-RFC here: https://discourse.llvm.org/t/rfc-adding-sframe-support-to-llvm/86900
2025-07-22Fix Windows EH IP2State tables (remove +1 bias) (#144745)sivadeilra3-15/+3
This changes how LLVM constructs certain data structures that relate to exception handling (EH) on Windows. Specifically this changes how IP2State tables for functions are constructed. The purpose of this change is to align LLVM to the requires of the Windows AMD64 ABI, which requires that the IP2State table entries point to the boundaries between instructions. On most Windows platforms (AMD64, ARM64, ARM32, IA64, but *not* x86-32), exception handling works by looking up instruction pointers in lookup tables. These lookup tables are stored in `.xdata` sections in executables. One element of the lookup tables are the `IP2State` tables (Instruction Pointer to State). If a function has any instructions that require cleanup during exception unwinding, then it will have an IP2State table. Each entry in the IP2State table describes a range of bytes in the function's instruction stream, and associates an "EH state number" with that range of instructions. A value of -1 means "the null state", which does not require any code to execute. A value other than -1 is an index into the State table. The entries in the IP2State table contain byte offsets within the instruction stream of the function. The Windows ABI requires that these offsets are aligned to instruction boundaries; they are not permitted to point to a byte that is not the first byte of an instruction. Unfortunately, CALL instructions present a problem during unwinding. CALL instructions push the address of the instruction after the CALL instruction, so that execution can resume after the CALL. If the CALL is the last instruction within an IP2State region, then the return address (on the stack) points to the *next* IP2State region. This means that the unwinder will use the wrong cleanup funclet during unwinding. To fix this problem, compilers should insert a NOP after a CALL instruction, if the CALL instruction is the last instruction within an IP2State region. The NOP is placed within the same IP2State region as the CALL, so that the return address points to the NOP and the unwinder will locate the correct region. This PR modifies LLVM so that it inserts NOP instructions after CALL instructions, when needed. In performance tests, the NOP has no detectable significance. The NOP is rarely inserted, since it is only inserted when the CALL is the last instruction before an IP2State transition or the CALL is the last instruction before the function epilogue. NOP padding is only necessary on Windows AMD64 targets. On ARM64 and ARM32, instructions have a fixed size so the unwinder knows how to "back up" by one instruction. Interaction with Import Call Optimization (ICO): Import Call Optimization (ICO) is a compiler + OS feature on Windows which improves the performance and security of DLL imports. ICO relies on using a specific CALL idiom that can be replaced by the OS DLL loader. This removes a load and indirect CALL and replaces it with a single direct CALL. To achieve this, ICO also inserts NOPs after the CALL instruction. If the end of the CALL is aligned with an EH state transition, we *also* insert a single-byte NOP. **Both forms of NOPs must be preserved.** They cannot be combined into a single larger NOP; nor can the second NOP be removed. This is necessary because, if ICO is active and the call site is modified by the loader, the loader will end up overwriting the NOPs that were inserted for ICO. That means that those NOPs cannot be used for the correct termination of the exception handling region (the IP2State transition), so we still need an additional NOP instruction. The NOPs cannot be combined into a longer NOP (which is ordinarily desirable) because then ICO would split one instruction, producing a malformed instruction after the ICO call.
2025-07-21[PseudoProbe] Warn on illegal guid (#148564)Haohai Wen2-0/+43
Check whether guid exists in pseudo probe desc when emitting pseudo probe.
2025-07-20MC: Rename isVirtualSection to isBssSectionFangrui Song1-1/+1
The term BSS (Block Started by Symbol) is a standard, widely recognized term, available in the a.out object file format and adopted by formats like COFF, XCOFF, Mach-O (called S_ZEROFILL while `__bss` is also used), and ELF. To avoid introducing unfamiliar terms, we should use isBSSSection instead of isVirtualSection.
2025-07-14[llvm] Remove unused includes (NFC) (#148768)Kazu Hirata1-1/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-07-13[AArch64] "Support" debug info for SVE types on Windows. (#147865)Eli Friedman1-2/+9
There isn't any way to encode a variable in an SVE register, and there isn't any way to encode a scalable offset, and as far as I know that's unlikely to change in the near future. So suppress any debug info which would require those encodings. This isn't ideal, but we need to ship something which doesn't crash. Alternatively, for Z registers, we could emit debug info assuming the vector length is 128 bits, but that seems like it would lead to unintuitive results. The change to AArch64FrameLowering is needed to avoid a crash. But we can't actually test that the returned offset is correct: LiveDebugValues performs the query, then discards the result.
2025-07-10[llvm] export private symbols needed by unittests (#145767)Andrew Rogers2-9/+13
## Purpose Export a small number of private LLVM symbols so that unit tests can still build/run when LLVM is built as a Windows DLL or a shared library with default hidden symbol visibility. ## Background The effort to build LLVM as a WIndows DLL is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307). Some LLVM unit tests use internal/private symbols that are not part of LLVM's public interface. When building LLVM as a DLL or shared library with default hidden symbol visibility, the symbols are not available when the unit test links against the DLL or shared library. This problem can be solved in one of two ways: 1. Export the private symbols from the DLL. 2. Link the unit tests against the intermediate static libraries instead of the final LLVM DLL. This PR applies option 1. Based on the discussion of option 2 in #145448, this option is preferable. ## Overview * Adds a new `LLVM_ABI_FOR_TEST` export macro, which is currently just an alias for `LLVM_ABI`. * Annotates the sub-set of symbols under `llvm/lib` that are required to get unit tests building using the new macro.
2025-07-09[SHT_LLVM_BB_ADDR_MAP] Emit callsite offsets in the `SHT_LLVM_BB_ADDR_MAP` ↵Rahman Lavaee1-7/+30
section. (#146563) Callsite offsets will help map addresses to the right position in the basic block (before or after a callsite). This PR also bumps the BBAddrMap version to 3. The encoding/decoding ability is already pushed upstream 8d7a8fcc3ab9f6d4c4a7e4312876fe94ed3d6c4f.
2025-07-08[XRay] xray_fn_idx: fix alignment directiveFangrui Song1-4/+2
Use `emitValueToAlignment` as the section does not contain code. `emitCodeAlignment` would lead to ALIGN relocations on RISC-V and LoongArch with linker relaxation. In addition, change the alignment to wordsize, sufficient for the runtime requirement (`XRayFunctionSledIndex`). Related to #147322
2025-07-07[AsmPrinter] Fix a warningKazu Hirata1-1/+2
This patch fixes: llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp:3575:44: error: missing field 'Cases' initializer [-Werror,-Wmissing-field-initializers]
2025-07-07Add CodeView S_LABEL32 symbols for jump table targets (for Windows ↵sivadeilra2-5/+25
debugging) (#146121) This PR provides more information to debuggers and analysis tools on Windows. It adds `S_LABEL32` symbols for each target BB of each jump table. This allows debuggers to insert symbolic labels when disassembling code. `S_LABEL32` symbol records indicate that a location is definitely code, and can optionally associate a string label with the code location. BBs generated for jump tables may or may not have string labels, so it is acceptable for the "name" field within `S_LABEL32` symbols to be an empty string. More importantly, this PR allows Windows analysis tools, such as those that generate hot-patches for the Windows kernel, to use these labels to distinguish code basic blocks from data blocks. Microsoft's analysis tools (similar to Bolt) rely on being able to identify all code blocks, so that the tools can traverse all instructions and verify that important requirements for hot-patching are met. This PR has no effect on code generation. It only affects the CodeView symbols that are emitted into OBJ files, which the linker then repackages into PDB files.
2025-07-04[NFCI][LLVM] Adopt `ArrayRef::consume_front()` in a few places (#146793)Rahul Joshi1-2/+2
2025-07-04[debuginfo][coro] Emit debug info labels for coroutine resume points (#141937)Adrian Vogelsgesang3-10/+20
RFC on discourse: https://discourse.llvm.org/t/rfc-debug-info-for-coroutine-suspension-locations-take-2/86606 With this commit, we add `DILabel` debug infos to the resume points of a coroutine. Those labels can be used by debugging scripts to figure out the exact line and column at which a coroutine was suspended by looking up current `__coro_index` value inside the coroutines frame, and then searching for the corresponding label inside the coroutine's resume function. The DWARF information generated for such a label looks like: ``` 0x00000f71: DW_TAG_label DW_AT_name ("__coro_resume_1") DW_AT_decl_file ("generator-example.cpp") DW_AT_decl_line (5) DW_AT_decl_column (3) DW_AT_artificial (true) DW_AT_LLVM_coro_suspend_idx (0x01) DW_AT_low_pc (0x00000000000019be) ``` The labels can be mapped to their corresponding `__coro_idx` values either via their naming convention `__coro_resume_<N>` or using the new `DW_AT_LLVM_coro_suspend_idx` attribute. In gdb, those line numebrs can be looked up using `info line -function my_coroutine -label __coro_resume_1`. LLDB unfortunately does not understand DW_TAG_label debug information, yet. Given this is an artificial compiler-generated label, I did apply the DW_AT_artificial tag to it. The DWARFv5 standard only allows that tag on type and variable definitions, but this is a natural extension and was also blessed in the RFC on discourse. Also, this commit adds `DW_AT_decl_column` to labels, not only for coroutines but also for normal C and C++ labels. While not strictly necessary, I am doing so now because it would be harder to do so later without breaking the binary LLVM-IR format Drive-by fixes: While reading the existing test cases to understand how to write my own test case, I did a couple of small typo fixes and comment improvements
2025-07-02[SHT_LLVM_BB_ADDR_MAP] Remove support for versions 1 and 0 ↵Rahman Lavaee1-10/+7
(SHT_LLVM_BB_ADDR_MAP_V0). (#146186) Version 2 was added more than two years ago (https://github.com/llvm/llvm-project/commit/6015a045d768feab3bae9ad9c0c81e118df8b04a). So it should be safe to deprecate older versions.
2025-07-01[DwarfDebug] Slightly optimize computeKeyInstructions() (NFC) (#146357)Nikita Popov1-14/+11
Fetch the DILocation once, instead of many times. This is pretty trivial, but goes through out-of-line code.
2025-06-30[KeyInstr] Fully support mixed key/non-key inlining modes (#144103)Orlando Cazalet-Hyams1-15/+9
Patch 3/4 adding bitcode support, though the final patch doesn't depend on this one. Prior to this patch, a Key Instructions function inlined into a Not-Key-Instructions function fell back to Not-Key-Instructions handling. In order to fully support inlining mixed modes we need to run `computeKeyInstructions` (in case there's a Key Instructions scope) and `findForceIsStmtInstrs` (in case there's a Not-Key-Instructions scope) on all functions. This has a slight performance cost for all configurations - see PR for details.
2025-06-30[KeyInstr] Use DISubprogram's is-key-instructions-on flag at DWARF emission ↵Orlando Cazalet-Hyams1-5/+24
(#144104) Patch 2/4 adding bitcode support. A non-key-instructions function inlined into a key-instructions function uses non-key-instructions is_stmt placement (without `findForceIsStmtInstrs`). A key-instructions function inlined into a non-key-instructions function currently results in falling back to non-key-instructions for the inlined scope too. Both of these concessions (not using `findForceIsStmtInstrs` in the 1st case, and not using Key Instructions for the inlined scope in the 2nd) are for performance reasons; to do the right thing we'd need to run both `findForceIsStmtInstrs` and `computeKeyInstructions` - in case that's controversial I've got a separate PR for that: PR 144103.
2025-06-28CSKY: Replace deprecated MCExpr::print with MCAsmInfo::printExprFangrui Song1-1/+1
2025-06-28MC: Migrate away from operator<< MCExprFangrui Song1-1/+4
MCExpr::print has an optional MCAsmInfo argument, which is error-prone when omitted. MCExpr::print and the convenience helper operator<< are discouraged to use. Switch to MCAsmInfo::printExpr instead. Use the target-specific MCAsmInfo if available.
2025-06-27MCExpr: Remove VK_NoneFangrui Song1-4/+2
`enum VariantKind` is deprecated. Targets are encouraged to use their own relocation specifier constants. MCSymbolRefExpr::create callers with a VK_None argument should switch to the overload with a VariantKind parameter.