path: root/llvm/lib/Transforms/Utils/InlineFunction.cpp
Age | Commit message | Author | Files | Lines
2025-08-17[llvm] Remove unused includes (NFC) (#154051)Kazu Hirata1-1/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-08-08[IR] Remove size argument from lifetime intrinsics (#150248)Nikita Popov1-25/+5
Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).
2025-08-07InlineFunction: Split inlining into predicate and apply functions (#134213)Matt Arsenault1-51/+89
This is to support a new inline function reduction in llvm-reduce, which should pre-filter callsites that are not eligible for inlining. This code was mostly structured as a match and apply, with a few exceptions. The ugliest piece is for propagating and verifying compatible getGC and personalities. Also, collection of EHPad and the convergence token to use are now cached in InlineFunctionInfo. I was initially confused by the split between the checks performed here and isInlineViable, so better document how this system is supposed to work. It turns out this split does make sense, in that isInlineViable checks whether inlining is possible based on the callee content, while the ultimate inline depends on the callsite context. I think more renames of these functions would help, and isInlineViable should probably move out of InlineCost to be with these transforms.
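The predicate/apply split described in this message can be illustrated with a small standalone sketch. It does not use LLVM's API: `CallSite`, `InlineCheck`, `canInline`, `applyInline`, and `tryInline` are hypothetical names invented here, and the two eligibility rules are placeholders rather than the checks InlineFunction.cpp actually performs.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a call site; not an LLVM type.
struct CallSite {
  bool CalleeHasBody;
  bool PersonalitiesCompatible;
  int InlinedCount = 0;
};

// Result of the predicate phase: success, or a reason for refusing.
struct InlineCheck {
  bool Ok;
  std::string Reason;
};

// Phase 1: inspect only, never mutate. A reducer-style client could call
// this on its own to pre-filter call sites.
InlineCheck canInline(const CallSite &CS) {
  if (!CS.CalleeHasBody)
    return {false, "callee is a declaration"};
  if (!CS.PersonalitiesCompatible)
    return {false, "incompatible personality"};
  return {true, ""};
}

// Phase 2: mutate, assuming phase 1 already succeeded.
void applyInline(CallSite &CS) { ++CS.InlinedCount; }

// Wrapper that keeps the combined check-then-transform behaviour.
bool tryInline(CallSite &CS) {
  if (!canInline(CS).Ok)
    return false;
  applyInline(CS);
  return true;
}

// Usage: an eligible site is transformed, an ineligible one is untouched.
void examplePredicateApply() {
  CallSite Good{true, true};
  CallSite Bad{false, true};
  assert(tryInline(Good) && Good.InlinedCount == 1);
  assert(!tryInline(Bad) && Bad.InlinedCount == 0);
}
```

Keeping the predicate free of side effects is what lets a client query eligibility without committing to the transformation.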
2025-07-18[DebugInfo] Shave even more users of DbgVariableIntrinsic from LLVM (#149136)Jeremy Morse1-2/+1
At this stage I'm just opportunistically deleting any code using debug-intrinsic types, largely adjacent to calls to findDbgUsers. I'll get to deleting that in probably one or two more commits.
2025-06-26[llvm] Use llvm::is_contained (NFC) (#145844)Kazu Hirata1-10/+8
llvm::is_contained is shorter than llvm::all_of plus a lambda.
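As a rough illustration of the shortening this message refers to, the sketch below uses a simplified stand-in, `isContained`, rather than the real `llvm::is_contained` from `llvm/ADT/STLExtras.h`. The function names and the "no zero" predicate are assumptions chosen for the example, not code from this commit.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Simplified stand-in for llvm::is_contained (assumption: the real helper
// has more overloads and lives in llvm/ADT/STLExtras.h).
template <typename Range, typename T>
bool isContained(const Range &R, const T &Value) {
  return std::find(std::begin(R), std::end(R), Value) != std::end(R);
}

// Before: all_of plus a lambda that rejects one specific value.
bool lacksZeroVerbose(const std::vector<int> &V) {
  return std::all_of(V.begin(), V.end(), [](int X) { return X != 0; });
}

// After: the same question stated as a negated membership test.
bool lacksZero(const std::vector<int> &V) { return !isContained(V, 0); }

// Usage: both forms agree on inputs with and without the value.
void exampleIsContained() {
  const std::vector<int> WithZero = {3, 0, 7};
  const std::vector<int> WithoutZero = {3, 1, 7};
  assert(lacksZeroVerbose(WithZero) == lacksZero(WithZero));
  assert(lacksZeroVerbose(WithoutZero) == lacksZero(WithoutZero));
}
```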
2025-06-26[Utils] Drop const from a return type (NFC) (#145838)Kazu Hirata1-1/+1
We don't need const on the return type.
2025-06-17[DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389)Jeremy Morse1-7/+2
Seeing how we can't generate any debug intrinsics any more: delete a variety of codepaths where they're handled. For the most part these are plain deletions, in others I've tweaked comments to remain coherent, or added a type to (what was) type-generic-lambdas. This isn't all the DbgInfoIntrinsic call sites but it's most of the simple scenarios. Co-authored-by: Nikita Popov <github@npopov.com>
2025-06-11[DLCov][NFC] Annotate intentionally-blank DebugLocs in existing code (#136192)Stephen Tozer1-0/+9
Following the work in PR #107279, this patch applies the annotative DebugLocs, which indicate that a particular instruction is intentionally missing a location for a given reason, to existing sites in the compiler where their conditions apply. This is NFC in ordinary LLVM builds (each function `DebugLoc::getFoo()` is inlined as `DebugLoc()`), but marks the instruction in coverage-tracking builds so that it will be ignored by Debugify, allowing only real errors to be reported. From a developer standpoint, it also communicates the intentionality and reason for a missing DebugLoc.

Some notes for reviewers:
- The difference between `I->dropLocation()` and `I->setDebugLoc(DebugLoc::getDropped())` is that the former _may_ decide to keep some debug info alive, while the latter will always be empty; in this patch, I always used the latter (even if the former could technically be correct), because the former could result in some (barely) different output, and I'd prefer to keep this patch purely NFC.
- I've generally documented the uses of `DebugLoc::getUnknown()`, with the exception of the vectorizers - in summary, they are a huge cause of dropped source locations, and I don't have the time or the domain knowledge currently to solve that, so I've plastered it all over them as a form of "fixme".
2025-05-28[MemProf] Emit remarks when hinting allocations not needing cloning (#141859)Teresa Johnson1-9/+13
The context disambiguation code already emits remarks when hinting allocations (by adding hotness attributes) during cloning. However, we did not yet emit hints when applying the hotness attributes during building of the metadata (during matching and again after inlining). Add remarks when we apply the hint attributes for these non-context-sensitive allocations.
2025-05-27[Inline] Only consider provenance captures for scoped alias metadata (#138540)Nikita Popov1-1/+3
When determining whether an escape source may alias with a noalias argument, only take provenance captures into account. If only the address of the argument was captured, an access through the escape source is not legal.
2025-05-10[Utils] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#139352)Kazu Hirata1-1/+1
2025-05-07[KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (#133485)Orlando Cazalet-Hyams1-2/+8
RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
2025-05-06[KeyInstr] Inline atom info (#133481)Orlando Cazalet-Hyams1-1/+2
Source atom groups are identified by an atom group number and inlined-at pair, so we simply can copy the atom numbers into the caller when inlining. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
2025-04-26[Inliner] Preserve alignment of byval arguments (#137455)sallto1-11/+16
Previously the inliner always produced a memcpy with alignment 1 for src and destination, leading to potentially suboptimal Codegen. Since the Src ptr alignment is only available through the CallBase it has to be passed to HandleByValArgumentInit. Dst Alignment is already known so it doesn't have to be passed along. If there is no specified Src Alignment my changes cause the ptr to have no align data attached instead of align 1 as before (see inline-tail.ll), I believe this is fine but since I'm a first time contributor, please confirm. My changes are already covered by 4 existing regression tests, so I did not add any additional ones. The example from #45778 now results in:

```C
opt -S -passes=inline,instcombine,sroa,instcombine test.ll
define dso_local i32 @test(ptr %t) {
entry:
  %.sroa.0.0.copyload = load ptr, ptr %t, align 8 # this used to be align 1 in the original issue
  %arrayidx.i = getelementptr inbounds nuw i8, ptr %.sroa.0.0.copyload, i64 24
  %0 = load i32, ptr %arrayidx.i, align 4
  ret i32 %0
}
```

Fixes #45778.
2025-04-25InlineFunction: Use use_empty instead of hasNUses(0) (#137347)Matt Arsenault1-2/+2
2025-04-10Reapply "Inline: Propagate callsite nofpclass attribute" (#135018)Matt Arsenault1-1/+6
This reverts commit 3f38cd07d820248fd2043efb1341fabaac2d84a6. Fix case where inner callsite has nofpclass but callsite does not.
2025-04-09[DebugInfo][Inlining] Propagate inlined `resume` source loc to new br (#134826)Stephen Tozer1-1/+2
As part of inlining an invoke instruction, we may replace an inlined resume instruction with a simple branch to the landing pad block. When this happens, we should also propagate the resume's DILocation to this branch, which this patch enables. Found using https://github.com/llvm/llvm-project/pull/107279.
2025-04-08Revert "Inline: Propagate callsite nofpclass attribute"Matt Arsenault1-9/+1
This reverts commit b0cb672b9968eeee6eb022e98476957dbdf8e6e2. Breaks bot
2025-04-08Inline: Propagate callsite nofpclass attributeMatt Arsenault1-1/+9
(#134800) Fixes #134070
2025-04-07[ctxprof] Use `isInSpecializedModule` as criteria for using contextual profile (#134468)Mircea Trofin1-1/+1
After #134340, the availability of contextual profile isn't in itself an indication of compiling the module containing all the functions covered by that profile.
2025-03-31[IRBuilder] Add new overload for CreateIntrinsic (#131942)Rahul Joshi1-2/+2
Add a new `CreateIntrinsic` overload with no `Types`, useful for creating calls to non-overloaded intrinsics that don't need additional mangling.
2025-03-23[Transforms] Use *Set::insert_range (NFC) (#132652)Kazu Hirata1-2/+1
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(Elem);

down to:

  Set.insert_range(Range);

In some cases, we can further fold that into the set declaration.
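The three forms mentioned in this message can be compared side by side in standard C++. This is only a sketch: `insertRange` is a hypothetical free-function stand-in for the `insert_range` member on LLVM's set types, and `std::set` is used in place of those types.

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical stand-in for the insert_range member function.
template <typename SetT, typename Range>
void insertRange(SetT &S, const Range &R) {
  S.insert(std::begin(R), std::end(R));
}

// Before: an explicit loop inserting one element at a time.
std::set<int> collectWithLoop(const std::vector<int> &Range) {
  std::set<int> Set;
  for (auto Elem : Range)
    Set.insert(Elem);
  return Set;
}

// After: a single range insertion.
std::set<int> collectWithRange(const std::vector<int> &Range) {
  std::set<int> Set;
  insertRange(Set, Range);
  return Set;
}

// Folded further: the set is populated at its declaration.
std::set<int> collectAtDeclaration(const std::vector<int> &Range) {
  std::set<int> Set(Range.begin(), Range.end());
  return Set;
}

// Usage: all three forms build the same set, duplicates included.
void exampleInsertRange() {
  const std::vector<int> Input = {4, 1, 4, 2};
  assert(collectWithLoop(Input) == collectWithRange(Input));
  assert(collectWithLoop(Input) == collectAtDeclaration(Input));
}
```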
2025-03-04[ctxprof][nfc] Prepare CtxProfAnalysis for flat profiles (#129623)Mircea Trofin1-1/+1
Mostly remove the equivalence "no contexts == no CtxProfAnalysis result", and instead check explicitly there are no contextual profiles.
2025-02-24[CaptureTracking] Remove StoreCaptures parameter (NFC)Nikita Popov1-2/+1
The implementation doesn't use it, and is unlikely to use it in the future. The places that do set StoreCaptures=false, do so incorrectly and would be broken if the parameter actually did anything.
2025-02-04[IR][NFC] Switch to use `LifetimeIntrinsic` (#125528)Yingwei Zheng1-3/+2
2025-01-27[NFC][DebugInfo] Use iterators for instruction insertion in more places (#124291)Jeremy Morse1-4/+4
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're replacing methods that use Instruction*'s as positions with iterators. This patch changes some more complex call-sites, those crossing file boundaries and where I've had to perform some minor rewrites.
2025-01-24[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737)Jeremy Morse1-23/+24
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago, but to ensure some type safety we'd like to have all calls to getFirstNonPHI use the iterator-returning version. This patch changes a bunch of call-sites calling getFirstNonPHI to use getFirstNonPHIIt, which returns an iterator. All these call sites are where it's obviously safe to fetch the iterator then dereference it. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer getFirstNonPHI, but not before adding concise documentation of what considerations are needed (very few).
---------
Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
2024-11-09[Inliner] Prevent adding pointer attributes to non-pointer arguments (#115569)Harald van Dijk1-1/+4
Fixes a crash seen after #114311
2024-10-30[OPT] Search whole BB for convergence token. (#112728)Steven Perron1-17/+21
The spec for llvm.experimental.convergence.entry says that it must be in the entry block for a function, and must precede any other convergent operation. It does not have to be the first instruction in the entry block. Inlining assumes that the call to llvm.experimental.convergence.entry will be the first instruction after any phi instructions. This commit modifies inlining to search the entire block for the call.
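The difference between the old assumption and the new search can be shown on a toy model of a basic block. Nothing here is LLVM API: `Kind`, `findEntryFirstNonPhi`, and `findEntryAnywhere` are hypothetical names, and a block is modelled as a plain vector of instruction kinds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy instruction kinds; a block is a vector of these.
enum class Kind { Phi, Other, ConvergenceEntry };

// Old assumption: the entry token is the first instruction after the phis.
std::optional<std::size_t>
findEntryFirstNonPhi(const std::vector<Kind> &BB) {
  auto It = std::find_if(BB.begin(), BB.end(),
                         [](Kind K) { return K != Kind::Phi; });
  if (It != BB.end() && *It == Kind::ConvergenceEntry)
    return static_cast<std::size_t>(It - BB.begin());
  return std::nullopt;
}

// New behaviour: scan the whole block for the entry token.
std::optional<std::size_t> findEntryAnywhere(const std::vector<Kind> &BB) {
  auto It = std::find(BB.begin(), BB.end(), Kind::ConvergenceEntry);
  if (It == BB.end())
    return std::nullopt;
  return static_cast<std::size_t>(It - BB.begin());
}

// Usage: a token preceded by a non-convergent instruction is missed by the
// old lookup but found by the full scan.
void exampleConvergenceSearch() {
  const std::vector<Kind> BB = {Kind::Other, Kind::ConvergenceEntry,
                                Kind::Other};
  assert(!findEntryFirstNonPhi(BB).has_value());
  assert(findEntryAnywhere(BB).has_value());
}
```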
2024-10-17Reapply "[Inliner] Propagate more attributes to params when inlining (#91101)" (2nd Attempt) (#112749)goldsteinn1-16/+74
Root cause of the bug was code hanging onto `range` attr after changing BitWidth. This was fixed in PR #112633.
2024-10-17[SimplifyLibCall][Attribute] Fix bug where we may keep `range` attr with incompatible type (#112649)goldsteinn1-2/+2
In a variety of places we change the bitwidth of a parameter but don't update the attributes. The issue in this case is from the `range` attribute when inlining `__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an `i8`, and if the `i32` had a `range` attr associated it will cause an error. Fixes #112633
2024-10-17[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)Jay Foad1-7/+2
Convert many instances of:

  Fn = Intrinsic::getOrInsertDeclaration(...);
  CreateCall(Fn, ...)

to the equivalent CreateIntrinsic call.
2024-10-16Revert "[Inliner] Propagate more attributes to params when inlining (#91101)"Arthur Eubanks1-74/+16
This reverts commit ae778ae7ce72219270c30d5c8b3d88c9a4803f81. Creates broken IR, see comments in #91101.
2024-10-16[Inliner] Propagate more attributes to params when inlining (#91101)goldsteinn1-16/+74
- **[Inliner] Add tests for propagating more parameter attributes; NFC**
- **[Inliner] Propagate more attributes to params when inlining**

Add support for propagating:
- `dereferenceable`
- `dereferenceable_or_null`
- `align`
- `nonnull`
- `range`

These are only propagated if the parameter to the to-be-inlined callsite matches the exact parameter used in the to-be-inlined function.
2024-10-15 [Inliner] Don't propagate access attr to byval params (#112256)goldsteinn1-1/+1
- **[Inliner] Add tests for bad propagation of access attr for `byval` param; NFC**
- **[Inliner] Don't propagate access attr to `byval` params**

We previously only handled the case where the `byval` attr was in the callbase's param attr list. This PR also handles the case where `byval` is a param attr on the function's param attr list.
2024-10-11[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)Rahul Joshi1-3/+4
Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).
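The naming distinction drawn in this message (lookup that creates on a miss versus lookup only) can be sketched with a toy symbol table. `ToyModule` and both functions below are hypothetical and only mirror the names used in the message; they are not the LLVM `Intrinsic` or `Module` interfaces.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy module: maps a declaration name to an integer id.
struct ToyModule {
  std::map<std::string, int> Decls;
  int NextId = 0;
};

// Lookup only: returns nullptr when the declaration does not exist and
// never modifies the module.
const int *getDeclaration(const ToyModule &M, const std::string &Name) {
  auto It = M.Decls.find(Name);
  return It == M.Decls.end() ? nullptr : &It->second;
}

// Lookup or create: inserts a fresh declaration on a miss, so the name
// should advertise the possible mutation.
int &getOrInsertDeclaration(ToyModule &M, const std::string &Name) {
  auto It = M.Decls.find(Name);
  if (It == M.Decls.end())
    It = M.Decls.emplace(Name, M.NextId++).first;
  return It->second;
}

// Usage: the lookup leaves the module empty; the insert variant does not.
void exampleDeclarations() {
  ToyModule M;
  assert(getDeclaration(M, "toy.intrinsic") == nullptr);
  assert(M.Decls.empty());
  getOrInsertDeclaration(M, "toy.intrinsic");
  assert(M.Decls.size() == 1);
}
```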
2024-09-27[nfc][ctx_prof] Efficient profile traversal and update (#110052)Mircea Trofin1-1/+1
This optimizes profile updates and visits, where we want to access contexts for a specific function. These are all the current update cases. We do so by maintaining a list of contexts for each function, preserving preorder traversal. The list is updated whenever contexts are `std::move`-d or deleted.
2024-09-23[ctx_prof] Handle `select` and its `step` instrumentation (#109185)Mircea Trofin1-1/+15
The `step` instrumentation shouldn't be treated, during use, like an `increment`. The latter is treated as a BB ID. The step isn't that; it's more of a type of value profiling. We need to distinguish between the two when really looking for BB IDs (==increments), and handle `step`s appropriately. In particular, we need to know when to elide them because `select`s may get elided by function cloning, if the condition of the select is statically known.
2024-09-20[Inliner] Fix bug where attributes are propagated incorrectly (#109347)goldsteinn1-4/+16
- **[Inliner] Add tests for incorrect propagation of return attrs; NFC**
- **[Inliner] Fix bug where attributes are propagated incorrectly**

The bug stems from the fact that we assume the new (inlined) callsite is calling the same function as the original (callee) callsite. While this is typically the case, since `VMap` simplifies the new instructions, callee intrinsics callsites can end up not corresponding with the same function. This can lead to buggy propagation.
2024-09-19[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)Jay Foad1-1/+1
It is almost always simpler to use {} instead of std::nullopt to initialize an empty ArrayRef. This patch changes all occurrences I could find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor could be deprecated or removed.
2024-09-04[CtxProf] Replace include with forward declaration (NFC)Nikita Popov1-2/+2
This header is fairly expensive. Forward declare PGOContextualProfile instead.
2024-09-03[ctx_prof] Add Inlining support (#106154)Mircea Trofin1-0/+237
Add an overload of `InlineFunction` that updates the contextual profile. If there is no contextual profile, this overload is equivalent to the non-contextual profile variant.

Post-inlining, the update mainly consists of:
- making the PGO instrumentation of the callee "the caller's": the owner function (the "name" parameter of the instrumentation instructions) becomes the caller, and new index values are allocated for each of the callee's indices (this happens for both increment and callsite instrumentation instructions)
- in the contextual profile:
  - each context corresponding to the caller has its counters updated to incorporate the counters inherited from the callee at the inlined callsite. Counter values are copied as-is because no scaling is required since the profile is contextual.
  - the contexts of the callee (at the inlined callsite) are moved to the caller.
  - the callee context at the inlined callsite is deleted.
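The profile update listed in this message can be approximated on a toy context tree. This is a loose sketch under stated assumptions, not the data structure used by LLVM's contextual profiling: `Context` and `inlineContext` are hypothetical, counters are a flat vector, and "allocating new indices" is modelled by appending to the caller's vector and by numbering moved callsites after the caller's largest existing callsite index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Toy context: per-context counters plus callee contexts keyed by callsite.
struct Context {
  std::vector<std::uint64_t> Counters;
  std::map<unsigned, std::unique_ptr<Context>> Callsites;
};

// Folds the callee context at CallsiteIdx into Caller. Returns the index in
// Caller.Counters at which the callee's counters were placed.
std::size_t inlineContext(Context &Caller, unsigned CallsiteIdx) {
  const std::size_t FirstNew = Caller.Counters.size();
  auto It = Caller.Callsites.find(CallsiteIdx);
  if (It == Caller.Callsites.end())
    return FirstNew; // No profile data for this callsite.

  // Fresh callsite indices start after the largest one in use.
  unsigned NextCallsite = Caller.Callsites.rbegin()->first + 1;

  // Detach the callee context; erasing the entry deletes it from the caller.
  std::unique_ptr<Context> Callee = std::move(It->second);
  Caller.Callsites.erase(It);

  // Counters are copied as-is: the profile is contextual, so no scaling.
  Caller.Counters.insert(Caller.Counters.end(), Callee->Counters.begin(),
                         Callee->Counters.end());

  // The callee's own callee contexts now belong to the caller.
  for (auto &Sub : Callee->Callsites)
    Caller.Callsites.emplace(NextCallsite++, std::move(Sub.second));
  return FirstNew;
}

// Usage: inlining a callee with two counters into a caller with two.
void exampleInlineContext() {
  Context Caller;
  Caller.Counters = {10, 5};
  auto Callee = std::make_unique<Context>();
  Callee->Counters = {7, 3};
  Caller.Callsites.emplace(0u, std::move(Callee));
  assert(inlineContext(Caller, 0) == 2);
  assert(Caller.Counters.size() == 4);
}
```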
2024-08-13[DataLayout] Remove constructor accepting a pointer to Module (#102841)Sergei Barannikov1-2/+2
The constructor initializes `*this` with `M->getDataLayout()`, which is effectively the same as calling the copy constructor. There does not seem to be a case where a copy would be necessary. Pull Request: https://github.com/llvm/llvm-project/pull/102841
2024-07-14[Transforms] Use range-based for loops (NFC) (#98725)Kazu Hirata1-4/+3
2024-07-02[Transforms] Use range-based for loops (NFC) (#97195)Kazu Hirata1-7/+4
2024-07-01Inline: Fix handling of byval using non-alloca addrspace (#97306)Matt Arsenault1-2/+3
Use the address space of the original pointer argument instead of querying the datalayout. This avoids producing a verifier error since this was changing the address space for the user instructions. Fixes #97086
2024-06-29[TypeProf][InstrFDO]Implement more efficient comparison sequence for indirect-call-promotion with vtable profiles. (#81442)Mingming Liu1-4/+22
Clang's `-fwhole-program-vtables` is required for this optimization to take place. If `-fwhole-program-vtables` is not enabled, this change is no-op.

* Function-comparison (before):
```
%vtable = load ptr, ptr %obj
%vfn = getelementptr inbounds ptr, ptr %vtable, i64 1
%func = load ptr, ptr %vfn
%cond = icmp eq ptr %func, @callee
br i1 %cond, label bb1, label bb2:
bb1:
  call @callee
bb2:
  call %func
```

* VTable-comparison (after):
```
%vtable = load ptr, ptr %obj
%cond = icmp eq ptr %vtable, @vtable-address-point
br i1 %cond, label bb1, label bb2:
bb1:
  call @callee
bb2:
  %vfn = getelementptr inbounds ptr, ptr %vtable, i64 1
  %func = load ptr, ptr %vfn
  call %func
```

Key changes:
1. Find out virtual calls and the vtables they come from.
   - The ICP relies on type intrinsic `llvm.type.test` to find out virtual calls and the compatible vtables, and relies on type metadata to find the address point for comparison.
2. ICP pass does cost-benefit analysis and compares vtable only when the number of vtables for a function candidate is within (option specified) threshold.
3. Sink the function addressing and vtable load instruction to indirect fallback.
   - The sink helper functions are simplified versions of `InstCombinerImpl::tryToSinkInstruction`. Currently debug intrinsics are not handled. Ideally `InstCombinerImpl::tryToSinkInstructionDbgValues` and `InstCombinerImpl::tryToSinkInstructionDbgVariableRecords` could be moved into Transforms/Utils/Local.cpp (or another util cpp file) to handle debug intrinsics when moving instructions across basic blocks.
4. Keep value profiles updated
   1) Update vtable value profiles after inline
   2) For either function-based comparison or vtable-based comparison, update both vtable and indirect call value profiles.
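The two comparison sequences in this message can be mimicked in plain C++ with hand-rolled vtables. This is only an analogy to the IR above: `VTable`, `Object`, `hot`, `cold`, and both call helpers are hypothetical, and slot 1 is chosen to match the `i64 1` index in the example.

```cpp
#include <cassert>

using Fn = int (*)(int);

// Hand-rolled vtable and object layout, standing in for the IR above.
struct VTable {
  Fn Slots[2];
};
struct Object {
  const VTable *VPtr;
};

int unusedSlot(int X) { return X; }
int hot(int X) { return X + 1; }  // The promoted (likely) callee.
int cold(int X) { return X * 2; } // Some other override.

const VTable HotVTable = {{&unusedSlot, &hot}};
const VTable ColdVTable = {{&unusedSlot, &cold}};

// Function comparison: the function pointer is loaded before the branch,
// even on the path that ends in a direct call.
int callFunctionCompare(const Object &O, int Arg) {
  const VTable *VT = O.VPtr;
  Fn F = VT->Slots[1];
  if (F == &hot)
    return hot(Arg);
  return F(Arg);
}

// VTable comparison: only the vtable pointer is loaded up front; the slot
// load is sunk into the indirect fallback.
int callVTableCompare(const Object &O, int Arg) {
  const VTable *VT = O.VPtr;
  if (VT == &HotVTable)
    return hot(Arg);
  Fn F = VT->Slots[1];
  return F(Arg);
}

// Usage: both sequences compute the same result for either object.
void exampleVTableCompare() {
  const Object A{&HotVTable};
  const Object B{&ColdVTable};
  assert(callFunctionCompare(A, 4) == callVTableCompare(A, 4));
  assert(callFunctionCompare(B, 4) == callVTableCompare(B, 4));
}
```

The saving in this sketch is the skipped slot load on the promoted path, which corresponds to the sinking described in key change 3.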
2024-06-28[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)Nikita Popov1-5/+5
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.
2024-06-24Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"Stephen Tozer1-4/+4
Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24[IR][NFC] Update IRBuilder to use InsertPosition (#96497)Stephen Tozer1-4/+4
Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.