aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/CodeGen/Analysis.cpp
AgeCommit message (Collapse)AuthorFilesLines
2025-08-13[CodeGen] Make OrigTy in CC lowering the non-aggregate type (#153414)Nikita Popov1-22/+30
https://github.com/llvm/llvm-project/pull/152709 exposed the original IR argument type to the CC lowering logic. However, in SDAG, this used the raw type, prior to aggregate splitting. This PR changes it to use the non-aggregate type instead. (This matches what happened in the GlobalISel case already.) I've also added some more detailed documentation on the InputArg/OutputArg fields, to explain how they differ. In most cases ArgVT is going to be the EVT of OrigTy, so they encode very similar information (OrigTy just preserves some additional information lost in EVTs, like pointer types). One case where they do differ is in post-legalization lowering of libcalls, where ArgVT is going to be a legalized type, while OrigTy is going to be the original non-legalized type.
2024-11-22[Clang] Attribute NoFPClass should not prevent tail call optimization. (#116741)Félix-Antoine Constantin1-4/+4
Fixes #111950
2024-08-29[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)Stephen Tozer1-1/+2
This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-07-17[AArch64] Don't tail call memset if it would convert to a bzero. (#98969)Amara Emerson1-38/+19
Well, not quite that simple. We can tc memset since it returns the first argument but bzero doesn't do that and therefore we can end up miscompiling. This patch also refactors the logic out of isInTailCallPosition() into the callers. As a result memcpy and memmove are also modified to do the same thing for consistency. rdar://131419786
2024-06-28[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)Nikita Popov1-1/+1
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.
2024-05-07[Analysis] Attribute Range should not prevent tail call optimization (#91122)Jinsong Ji1-3/+4
- Remove Range attr when comparing for tailcall - Add test for testcall with range
2024-02-21[LLVM][SelectionDAG] Reduce number of ComputeValueVTs variants. (#75614)Paul Walker1-43/+6
This is another step in the direction of fixing the `Fixed(0) != Scalable(0)` bugbear, although whilst weird I don't believe it's causing us any real issues.
2023-08-21[SelectionDAG][RISCV][SVE] Harden fixed offset version of ComputeValueVTs ↵Craig Topper1-14/+12
against scalable offsets. Use getFixedValue instead of getKnownMinValue to convert TypeSize to uint64_t. I believe this would have caught the bug fixed by D157872. To prevent false failures, I had to treat a scalable 0 as if it is fixed value. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D158115
2023-05-19[1/11][IR] Permit load/store/alloca for struct of the same scalable vector typeeopXD1-6/+57
This patch-set aims to simplify the existing RVV segment load/store intrinsics to use a type that represents a tuple of vectors instead. To achieve this, first we need to relax the current limitation for an aggregate type to be a target of load/store/alloca when the aggregate type contains homogeneous scalable vector types. Then to adjust the prolog of an LLVM function during lowering to clang. Finally we re-define the RVV segment load/store intrinsics to use the tuple types. The pull request under the RVV intrinsic specification is riscv-non-isa/rvv-intrinsic-doc#198 --- This is the 1st patch of the patch-set. This patch is originated from D98169. This patch allows aggregate type (StructType) that contains homogeneous scalable vector types to be a target of load/store/alloca. The RFC of this patch was posted in LLVM Discourse. https://discourse.llvm.org/t/rfc-ir-permit-load-store-alloca-for-struct-of-the-same-scalable-vector-type/69527 The main changes in this patch are: Extend `StructLayout::StructSize` from `uint64_t` to `TypeSize` to accommodate an expression of scalable size. Allow `StructType:isSized` to also return true for homogeneous scalable vector types. Let `Type::isScalableTy` return true when `Type` is `StructType` and contains scalable vectors Extra description is added in the LLVM Language Reference Manual on the relaxation of this patch. Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Co-Authored-by: eop Chen <eop.chen@sifive.com> Reviewed By: craig.topper, nikic Differential Revision: https://reviews.llvm.org/D146872
2023-01-11[NFC] Use TypeSize::geFixedValue() instead of TypeSize::getFixedSize()Guillaume Chatelet1-2/+3
This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.
2023-01-06Revert D141134 "[NFC] Only expose getXXXSize functions in TypeSize"Guillaume Chatelet1-2/+2
The patch should be discussed further. This reverts commit dd56e1c92b0e6e6be249f2d2dd40894e0417223f.
2023-01-06[NFC] Only expose getXXXSize functions in TypeSizeGuillaume Chatelet1-2/+2
Currently 'TypeSize' exposes two functions that serve the same purpose: - getFixedSize / getFixedValue - getKnownMinSize / getKnownMinValue source : https://github.com/llvm/llvm-project/blob/bf82070ea465969e9ae86a31dfcbf94c2a7b4c4c/llvm/include/llvm/Support/TypeSize.h#L337-L338 This patch offers to remove one of the two and stick to a single function in the code base. Differential Revision: https://reviews.llvm.org/D141134
2022-03-16Cleanup codegen includesserge-sans-paille1-3/+0
This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
2022-03-10Revert "Cleanup codegen includes"Nico Weber1-0/+3
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10Cleanup codegen includesserge-sans-paille1-3/+0
after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169
2022-01-31[Analysis] Attribute noundef should not prevent tail call optimizationDávid Bolvanský1-1/+1
Very similar to https://reviews.llvm.org/D101230 Fixes https://github.com/llvm/llvm-project/issues/53501
2022-01-15[AttrBuilder] Remove ctor accepting AttributeList and IndexNikita Popov1-3/+3
Use the AttributeSet constructor instead. There's no good reason why AttrBuilder itself should exact the AttributeSet from the AttributeList. Moving this out of the AttrBuilder generally results in cleaner code.
2022-01-10Use a sorted array instead of a map to store AttrBuilder string attributesSerge Guelton1-2/+2
Using and std::map<SmallString, SmallString> for target dependent attributes is inefficient: it makes its constructor slightly heavier, and involves extra allocation for each new string attribute. Storing the attribute key/value as strings implies extra allocation/copy step. Use a sorted vector instead. Given the low number of attributes generally involved, this is cheaper, as showcased by https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions Differential Revision: https://reviews.llvm.org/D116599
2021-12-20[llvm] Construct SmallVector with iterator ranges (NFC)Kazu Hirata1-2/+2
2021-10-30[NFCI] Introduce `ICmpInst::compare()` and use it where appropriateRoman Lebedev1-3/+27
As noted in https://reviews.llvm.org/D90924#inline-1076197 apparently this is a pretty common pattern, let's not repeat it yet again, but have it in a common place. There may be some more places where it could be used, but these are the most obvious ones.
2021-10-12[CSSPGO] Unblock optimizations with pseudo probe instrumentation part 3.Hongtao Yu1-3/+1
This patch continues unblocking optimizations that are blocked by pseudo probe instrumentation. Not exactly like DbgIntrinsics, PseudoProbe intrinsic has other attributes (such as mayread, maywrite, mayhaveSideEffect) that can block optimizations. The issues fixed are: - Flipped default param of getFirstNonPHIOrDbg API to skip pseudo probes - Unblocked CSE by avoiding pseudo probe from clobbering memory SSA - Unblocked induction variable simpliciation - Allow empty loop deletion by treating probe intrinsic isDroppable - Some refactoring. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D110847
2021-05-17IR/AArch64/X86: add "swifttailcc" calling convention.Tim Northover1-3/+4
Swift's new concurrency features are going to require guaranteed tail calls so that they don't consume excessive amounts of stack space. This would normally mean "tailcc", but there are also Swift-specific ABI desires that don't naturally go along with "tailcc" so this adds another calling convention that's the combination of "swiftcc" and "tailcc". Support is added for AArch64 and X86 for now.
2021-04-24[Analysis] Attribute alignment should not prevent tail call optimizationDávid Bolvanský1-8/+6
Fixes tail folding issue mentioned in D100879. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D101230
2021-02-21[CodeGen] Use range-based for loops (NFC)Kazu Hirata1-7/+5
2021-01-21[CodeGen] Use llvm::append_range (NFC)Kazu Hirata1-2/+1
2021-01-19[noalias.decl] Look through llvm.experimental.noalias.scope.declJeroen Dobbelaere1-3/+4
Just like llvm.assume, there are a lot of cases where we can just ignore llvm.experimental.noalias.scope.decl. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93042
2021-01-17[IR] Allow scalable vectors in structs to support intrinsics returning ↵Craig Topper1-8/+19
multiple values. RISC-V would like to use a struct of scalable vectors to return multiple values from intrinsics. This woud also be needed for target independent intrinsics like llvm.sadd.overflow. This patch removes the existing restriction for this. I've modified StructType::isSized to consider a struct containing scalable vectors as unsized so the verifier won't allow loads/stores/allocas of these structs. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D94142
2020-12-24[CodeGen] Remove unused function hasInlineAsmMemConstraint (NFC)Kazu Hirata1-21/+0
The last use of the function was removed on Sep 13, 2010 in commit 1094c80281e3cdd9e9a9d7ee716da6386b33359b.
2020-11-20[CSSPGO] IR intrinsic for pseudo-probe block instrumentationHongtao Yu1-0/+3
This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues: 1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality. 2. The counter atomics may not be fully cleaned up from the code stream eventually. 3. Extra work is needed for re-targeting. We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. The core flags given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but is does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality. Let's now look at an example. Given the following LLVM IR: ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 br i1 %cmp, label %bb1, label %bb2 bb1: br label %bb3 bb2: br label %bb3 bb3: ret void } ``` The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID. ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86490
2020-06-03[CodeGen] Enable tail call position check for speculatable functionsVictor Huang1-17/+16
In the function "Analysis.cpp:isInTailCallPosition", it only checks whether a call is in a tail call position if the call has side effects, access memory or it is not safe to speculative execute. Therefore, a speculatable function will not go through tail call position check and improperly tail called when it is not in a tail-call position. This patch enables tail call position check for speculatable functions. Differential Revision: https://reviews.llvm.org/D80661
2020-05-23TargetLowering.h - remove unnecessary TargetMachine.h include. NFCSimon Pilgrim1-0/+1
Replace with forward declaration and move dependency down to source files that actually need it. Both TargetLowering.h and TargetMachine.h are 2 of the most expensive headers (top 10) in the ClangBuildAnalyzer report when building llc.
2020-04-13[CallSite removal][CodeGen] Replace ImmutableCallSite with CallBase in ↵Craig Topper1-8/+7
isInTailCallPosition.
2020-04-13[CallSite removal][CodeGen] Use CallBase instead of CallSite in getNoopInput ↵Craig Topper1-2/+2
in Analysis.cpp. NFC
2020-03-18Remove CompositeType class.Eli Friedman1-18/+19
The existence of the class is more confusing than helpful, I think; the commonality is mostly just "GEP is legal", which can be queried using APIs on GetElementPtrInst. Differential Revision: https://reviews.llvm.org/D75660
2020-01-06[NFC] Fix trivial typos in commentsJames Henderson1-1/+1
Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D72143 Patch by Kazuaki Ishizaki.
2019-11-06[Analysis] Attribute deref/deref_or_null should not prevent tail call ↵Dávid Bolvanský1-1/+5
optimization
2019-10-31Fix missing memcpy, memmove and memset tail callsSanne Wouda1-1/+18
Summary: If a wrapper around one of the mem* stdlib functions bitcasts the returned pointer value before returning it (e.g. to a wchar_t*), LLVM does not emit a tail call. Add a check for this scenario so that we emit a tail call. Reviewers: wmi, mkuper, ramred01, dmgreen Reviewed By: wmi, dmgreen Subscribers: hiraditya, sanwou01, javed.absar, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59078
2019-10-08[SVE][IR] Scalable Vector size queries and IR instruction supportGraham Hunter1-1/+2
* Adds a TypeSize struct to represent the known minimum size of a type along with a flag to indicate that the runtime size is a integer multiple of that size * Converts existing size query functions from Type.h and DataLayout.h to return a TypeSize result * Adds convenience methods (including a transparent conversion operator to uint64_t) so that most existing code 'just works' as if the return values were still scalars. * Uses the new size queries along with ElementCount to ensure that all supported instructions used with scalable vectors can be constructed in IR. Reviewers: hfinkel, lattner, rkruppe, greened, rovka, rengolin, sdesmalen Reviewed By: rovka, sdesmalen Differential Revision: https://reviews.llvm.org/D53137 llvm-svn: 374042
2019-10-07[X86] Add new calling convention that guarantees tail call optimizationReid Kleckner1-1/+2
When the target option GuaranteedTailCallOpt is specified, calls with the fastcc calling convention will be transformed into tail calls if they are in tail position. This diff adds a new calling convention, tailcc, currently supported only on X86, which behaves the same way as fastcc, except that the GuaranteedTailCallOpt flag does not need to enabled in order to enable tail call optimization. Patch by Dwight Guth <dwight.guth@runtimeverification.com>! Reviewed By: lebedev.ri, paquette, rnk Differential Revision: https://reviews.llvm.org/D67855 llvm-svn: 373976
2019-08-22IR. Change strip* family of functions to not look through aliases.Peter Collingbourne1-1/+1
I noticed another instance of the issue where references to aliases were being replaced with aliasees, this time in InstCombine. In the instance that I saw it turned out to be only a QoI issue (a symbol ended up being missing from the symbol table due to the last reference to the alias being removed, preventing HWASAN from symbolizing a global reference), but it could easily have manifested as incorrect behaviour. Since this is the third such issue encountered (previously: D65118, D65314) it seems to be time to address this common error/QoI issue once and for all and make the strip* family of functions not look through aliases. Includes a test for the specific issue that I saw, but no doubt there are other similar bugs fixed here. As with D65118 this has been tested to make sure that the optimization isn't load bearing. I built Clang, Chromium for Linux, Android and Windows as well as the test-suite and there were no size regressions. Differential Revision: https://reviews.llvm.org/D66606 llvm-svn: 369697
2019-08-16[CodeGen/Analysis] Intrinsic llvm.assume should not block tail call optimizationGuozhi Wei1-2/+4
In function Analysis.cpp:isInTailCallPosition, instructions between call and ret are checked to see if they block tail call optimization. If an instruction is an intrinsic call, only llvm.lifetime_end is allowed and other intrinsic functions block tail call. When compiling tcmalloc, we found llvm.assume between a hot function call and ret, it blocks the optimization. But llvm.assume doesn't generate instructions, it should not block tail call. Differential Revision: https://reviews.llvm.org/D66096 llvm-svn: 369125
2019-08-02CodeGen: Don't follow aliases when extracting type info.Peter Collingbourne1-1/+1
This fixes a crash in the case where the type info object is an alias pointing to a non-zero offset within a global or is otherwise unanalyzable by the stripPointerCasts() function. Looking through the alias is not the right thing to do anyway for similar reasons as D65118. Differential Revision: https://reviews.llvm.org/D65314 llvm-svn: 367696
2019-05-01DAG: allow DAG pointer size different from memory representation.Tim Northover1-2/+13
In preparation for supporting ILP32 on AArch64, this modifies the SelectionDAG builder code so that pointers are allowed to have a larger type when "live" in the DAG compared to memory. Pointers get zero-extended whenever they are loaded, and truncated prior to stores. In addition, a few not quite so obvious locations need updating: * A GEP that has not been marked inbounds needs to enforce the IR-documented 2s-complement wrapping at the memory pointer size. Inbounds GEPs are undefined if they overflow the address space, so no additional operations are needed. * Signed comparisons would give incorrect results if performed on the zero-extended values. This shouldn't affect CodeGen for now, but will become active when the AArch64 ILP32 support is committed. llvm-svn: 359676
2019-04-10GlobalISel: Move computeValueLLTsMatt Arsenault1-0/+30
Call lowering should use this directly instead of going through the EVT version, but more work is needed to deal with this (mostly the passing of the IR type pointer instead of the relevant properties in ArgInfo). llvm-svn: 358111
2019-01-19Update the file headers across all of the LLVM projects in the monorepoChandler Carruth1-4/+3
to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
2019-01-09[CodeGen] Ignore return sext/zext attributes of unused results for tail callsFrancis Visoiu Mistrih1-0/+15
If the caller's return type does not have a zeroext attribute but the callee does a tail call zeroext, we won't consider the tail call during CodeGenPrepare because the attributes don't match. However, if the result of the tail call has no uses, it makes sense to drop the sext/zext attributes. Differential Revision: https://reviews.llvm.org/D56486 llvm-svn: 350753
2018-10-24[CodeGen] skip lifetime end marker in isInTailCallPositionRobert Lougher1-0/+4
A lifetime end intrinsic between a tail call and the return should not prevent the call from being tail call optimized. Differential Revision: https://reviews.llvm.org/D53519 llvm-svn: 345163
2018-10-15[TI removal] Make variables declared as `TerminatorInst` and initializedChandler Carruth1-1/+1
by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). llvm-svn: 344502
2018-09-26[CodeGen] Enable tail calls for functions with NonNull attributes.David Green1-2/+4
Adding NonNull as attributes to returned pointers has the unfortunate side effect of disabling tail calls. This patch ignores the NonNull attribute when we decide whether to tail merge, in the same way that we ignore the NoAlias attribute, as it has no affect on the call sequence. Differential Revision: https://reviews.llvm.org/D52238 llvm-svn: 343091
2018-08-21[WebAssembly] Add isEHScopeReturn instruction propertyHeejin Ahn1-1/+1
Summary: So far, `isReturn` property is used to mean both a return instruction from a functon and the end of an EH scope, a scope that starts with a EH scope entry BB and ends with a catchret or a cleanupret instruction. Because WinEH uses funclets, all EH-scope-ending instructions are also real return instruction from a function. But for wasm, they only serve as the end marker of an EH scope but not a return instruction that exits a function. This mismatch caused incorrect prolog and epilog generation in wasm EH scopes. This patch fixes this. This patch is in the same vein with rL333045, which splits `MachineBasicBlock::isEHFuncletEntry` into `isEHFuncletEntry` and `isEHScopeEntry`. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D50653 llvm-svn: 340325