aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Analysis/InlineCost.cpp
AgeCommit message (Collapse)AuthorFilesLines
2016-08-29Fix a thinko in r278189.Easwaran Raman1-1/+1
llvm-svn: 280008
2016-08-11Make more fields of InlineParams Optional.Easwaran Raman1-4/+8
Differential revision: https://reviews.llvm.org/D23386 llvm-svn: 278312
2016-08-10Changed sign of LastCallToStaticBounsPiotr Padlewski1-1/+1
Summary: I think it is much better this way. When I firstly saw line: Cost += InlineConstants::LastCallToStaticBonus; I though that this is a bug, because everywhere where the cost is being reduced it is usuing -=. Reviewers: eraman, tejohnson, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23222 llvm-svn: 278290
2016-08-10Do not directly use inline threshold cl options in cost analysis.Easwaran Raman1-57/+98
This adds an InlineParams struct which is populated from the command line options by getInlineParams and passed to getInlineCost for the call analyzer to use. Differential revision: https://reviews.llvm.org/D22120 llvm-svn: 278189
2016-08-05Remove cold callsite heuristic that is not necessary because of cold callee ↵Dehao Chen1-7/+5
heuristic. llvm-svn: 277863
2016-08-05Replace hot-callsite based heuristic to use its own threshold parameter ↵Dehao Chen1-6/+17
instead of share inline-hint parameter Summary: Hot callsites should have higher threshold than inline hints. This patch uses separate threshold parameter for hot callsites. Reviewers: davidxl, eraman Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D22368 llvm-svn: 277860
2016-07-23Avoid using a raw AssumptionCacheTracker in various inliner functions.Sean Silva1-29/+29
This unblocks the new PM part of River's patch in https://reviews.llvm.org/D22706 Conveniently, this same change was needed for D21921 and so these changes are just spun out from there. llvm-svn: 276515
2016-07-11Implement callsite-hotness based inline cost for Sample-based PGODehao Chen1-1/+8
Summary: For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value. In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 llvm-svn: 275073
2016-06-27Fix size computation of array allocation in inline cost analysisEaswaran Raman1-3/+4
Differential revision: http://reviews.llvm.org/D21690 llvm-svn: 273952
2016-06-09Use ProfileSummaryInfo in inline cost analysis.Easwaran Raman1-39/+28
Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo. Differential revision: http://reviews.llvm.org/D21045 llvm-svn: 272321
2016-05-19Allow -inline-threshold to override default threshold.Easwaran Raman1-4/+7
Before r257832, the threshold used by SimpleInliner was explicitly specified or generated from opt levels and passed to the base class Inliner's constructor. There, it was first overridden by explicitly specified -inline-threshold. The refactoring in r257832 did not preserve this behavior for all opt levels. This change brings back the original behavior. Differential Revision: http://reviews.llvm.org/D20452 llvm-svn: 270153
2016-05-10Revert r269131Easwaran Raman1-4/+2
llvm-svn: 269138
2016-05-10Reapply r266477 and r266488Easwaran Raman1-2/+4
llvm-svn: 269131
2016-05-09[Inliner] don't assume that a Constant alloca size is a ConstantInt (PR27277)Sanjay Patel1-4/+4
Differential Revision: http://reviews.llvm.org/D20077 llvm-svn: 268980
2016-04-28[Inliner] Formatting. NFC.Chad Rosier1-36/+41
Patch by Aditya Kumar! Differential Revision: http://reviews.llvm.org/D19047 llvm-svn: 267888
2016-04-22Introduce llvm.load.relative intrinsic.Peter Collingbourne1-0/+5
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant. Differential Revision: http://reviews.llvm.org/D18367 llvm-svn: 267223
2016-04-18Revert "Replace the use of MaxFunctionCount module flag"Eric Liu1-4/+2
This reverts commit r266477. This commit introduces cyclic dependency. This commit has "Analysis" depend on "ProfileData", while "ProfileData" depends on "Object", which depends on "BitCode", which depends on "Analysis". llvm-svn: 266619
2016-04-15Replace the use of MaxFunctionCount module flagEaswaran Raman1-2/+4
Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count. Differential Revision: http://reviews.llvm.org/D18622 llvm-svn: 266477
2016-04-15[TTI] Add getInliningThresholdMultiplier.Justin Lebar1-0/+4
Summary: InlineCost's threshold is multiplied by this value. This lets us adjust the inlining threshold up or down on a per-target basis. For example, we might want to increase the threshold on targets where calls are unusually expensive. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18560 llvm-svn: 266405
2016-04-13Return immediately from analyzeCall if analyzeBlock returns false.Easwaran Raman1-14/+2
This is part of the patch reviewed at http://reviews.llvm.org/D17584 llvm-svn: 266249
2016-04-08Refactor Threshold computation. NFC.Easwaran Raman1-22/+35
This is part of changes reviewed in http://reviews.llvm.org/D17584. llvm-svn: 265852
2016-04-08Don't IPO over functions that can be de-refinedSanjoy Das1-5/+6
Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and the folding the comparision to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762
2016-03-08Revert revisions 262636, 262643, 262679, and 262682.Easwaran Raman1-86/+16
llvm-svn: 262883
2016-03-04Fix a memory leak.Easwaran Raman1-1/+4
llvm-svn: 262682
2016-03-03Fix breakage caused by r262636.Easwaran Raman1-1/+1
Use LLVM_ATTRIBUTE_UNUSED instead of __attribute_((unused)) llvm-svn: 262643
2016-03-03Infrastructure for PGO enhancements in inlinerEaswaran Raman1-16/+83
This patch provides the following infrastructure for PGO enhancements in inliner: Enable the use of block level profile information in inliner Incremental update of block frequency information during inlining Update the function entry counts of callees when they get inlined into callers. Differential Revision: http://reviews.llvm.org/D16381 llvm-svn: 262636
2016-02-05CallAnalyzer::analyzeCall: change the condition back to "Cost < Threshold"Hans Wennborg1-1/+1
In r252595, I inadvertently changed the condition to "Cost <= Threshold", which caused a significant size regression in Chrome. This commit rectifies that. llvm-svn: 259915
2016-02-01Avoid inlining call sites in unreachable-terminated blockJun Bum Lim1-6/+17
Summary: If the normal destination of the invoke or the parent block of the call site is unreachable-terminated, there is little point in inlining the call site unless there is literally zero cost. Unlike my previous change (D15289), this change specifically handle the call sites followed by unreachable in the same basic block for call or in the normal destination for the invoke. This change could be a reasonable first step to conservatively inline call sites leading to an unreachable-terminated block while BFI / BPI is not yet available in inliner. Reviewers: manmanren, majnemer, hfinkel, davidxl, mcrosier, dblaikie, eraman Subscribers: dblaikie, davidxl, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16616 llvm-svn: 259403
2016-01-29Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith ↵Yaron Keren1-1/+1
r259192 post commit comment. clang part in r259232, this is the LLVM part of the patch. llvm-svn: 259240
2016-01-28Lower inlining threshold when the caller has minsize attribute.Easwaran Raman1-8/+8
When the caller has optsize attribute, we reduce the inlinining threshold to OptSizeThreshold (=75) if it is not already lower than that. We don't do the same for minsize and I suspect it was not intentional. This also addresses a FIXME regarding checking optsize attribute explicitly instead of using the right wrapper. Differential Revision: http://reviews.llvm.org/D16493 llvm-svn: 259120
2016-01-21Change ConstantFoldInstOperands to take Instruction instead of opcode and ↵Manuel Jacob1-2/+1
type. NFC. Summary: The previous form, taking opcode and type, is moved to an internal helper and the new form, taking an instruction, is a wrapper around this helper. Although this is a slight cleanup on its own, the main motivation is to refactor the constant folding API to ease migration to opaque pointers. This will be follow-up work. Reviewers: eddyb Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D16383 llvm-svn: 258391
2016-01-14Refactor threshold computation for inline cost analysisEaswaran Raman1-4/+106
Differential Revision: http://reviews.llvm.org/D15401 llvm-svn: 257832
2015-12-28Refactor inline costs analysis by removing the InlineCostAnalysis classEaswaran Raman1-36/+12
InlineCostAnalysis is an analysis pass without any need for it to be one. Once it stops being an analysis pass, it doesn't maintain any useful state and the member functions inside can be made free functions. NFC. Differential Revision: http://reviews.llvm.org/D15701 llvm-svn: 256521
2015-12-22Provide a way to specify inliner's attribute compatibility and merging.Akira Hatanaka1-3/+1
This reapplies r256277 with two changes: - In emitFnAttrCompatCheck, change FuncName's type to std::string to fix a use-after-free bug. - Remove an unnecessary install-local target in lib/IR/Makefile. Original commit message for r252949: Provide a way to specify inliner's attribute compatibility and merging rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 256304
2015-12-22Revert r256277 and r256279.Akira Hatanaka1-1/+3
Some of the bots failed again. llvm-svn: 256280
2015-12-22Provide a way to specify inliner's attribute compatibility and merging.Akira Hatanaka1-3/+1
This reapplies r252990 and r252949. I've added member function getKind to the Attr classes which returns the enum or string of the attribute. Original commit message for r252949: Provide a way to specify inliner's attribute compatibility and merging rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 256277
2015-12-07Use updated threshold for indirect call bonusEaswaran Raman1-2/+2
When considering foo->bar inlining, if there is an indirect call in foo which gets resolved to a direct call (say baz), then we try to inline baz into bar with a threshold T and subtract max(T - Cost(bar->baz), 0) from Cost(foo->bar). This patch uses max(Threshold(bar->baz) - Cost(bar->baz)) instead, where Thresheld(bar->baz) could be different from T due to bonuses or subtractions. Threshold(bar->baz) - Cost(bar->baz) better represents the desirability of inlining baz into bar. Differential Revision: http://reviews.llvm.org/D14309 llvm-svn: 254945
2015-12-03Test commit.Easwaran Raman1-2/+2
Remove blank spaces at the end of comments llvm-svn: 254630
2015-11-13Revert r252990.Akira Hatanaka1-1/+10
Some of the buildbots are still failing. llvm-svn: 252999
2015-11-13Provide a way to specify inliner's attribute compatibility and merging.Akira Hatanaka1-10/+1
This reapplies r252949. I've changed the type of FuncName to be std::string instead of StringRef in emitFnAttrCompatCheck. Original commit message for r252949: Provide a way to specify inliner's attribute compatibility and merging rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 252990
2015-11-12Revert r252949.Akira Hatanaka1-1/+10
It broke some of the bots including clang-x64-ninja-win7. llvm-svn: 252951
2015-11-12Provide a way to specify inliner's attribute compatibility and mergingAkira Hatanaka1-10/+1
rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 252949
2015-11-10Inliner: Do zero-cost inlines even if above a negative threshold (PR24851)Hans Wennborg1-1/+1
Differential Revision: http://reviews.llvm.org/D14499 llvm-svn: 252595
2015-10-10Analysis: Remove implicit ilist iterator conversionsDuncan P. N. Exon Smith1-8/+7
Remove implicit ilist iterator conversions from LLVMAnalysis. I came across something really scary in `llvm::isKnownNotFullPoison()` which relied on `Instruction::getNextNode()` being completely broken (not surprising, but scary nevertheless). This function is documented (and coded to) return `nullptr` when it gets to the sentinel, but with an `ilist_half_node` as a sentinel, the sentinel check looks into some other memory and we don't recognize we've hit the end. Rooting out these scary cases is the reason I'm removing the implicit conversions before doing anything else with `ilist`; I'm not at all surprised that clients rely on badness. I found another scary case -- this time, not relying on badness, just bad (but I guess getting lucky so far) -- in `ObjectSizeOffsetEvaluator::compute_()`. Here, we save out the insertion point, do some things, and then restore it. Previously, we let the iterator auto-convert to `Instruction*`, and then set it back using the `Instruction*` version: Instruction *PrevInsertPoint = Builder.GetInsertPoint(); /* Logic that may change insert point */ if (PrevInsertPoint) Builder.SetInsertPoint(PrevInsertPoint); The check for `PrevInsertPoint` doesn't protect correctly against bad accesses. If the insertion point has been set to the end of a basic block (i.e., `SetInsertPoint(SomeBB)`), then `GetInsertPoint()` returns an iterator pointing at the list sentinel. The version of `SetInsertPoint()` that's getting called will then call `PrevInsertPoint->getParent()`, which explodes horribly. The only reason this hasn't blown up is that it's fairly unlikely the builder is adding to the end of the block; usually, we're adding instructions somewhere before the terminator. llvm-svn: 249925
2015-09-1580-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; ↵Sanjay Patel1-4/+5
NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC llvm-svn: 247700
2015-08-23[WinEH] Require token linkage in EH pad/ret signaturesJoseph Tremoulet1-1/+1
Summary: WinEHPrepare is going to require that cleanuppad and catchpad produce values of token type which are consumed by any cleanupret or catchret exiting the pad. This change updates the signatures of those operators to require/enforce that the type produced by the pads is token type and that the rets have an appropriate argument. The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that restriction, this change adds a notion of an operator constraint to both LLParser and BitcodeReader, allowing appropriate sentinels to be constructed for forward references and appropriate error messages to be emitted for illegal inputs. Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad predecessor must have no other predecessors; this ensures that WinEHPrepare will see the expected linear relationship between sibling catches on the same try. Lastly, remove some superfluous/vestigial casts from instruction operand setters operating on BasicBlocks. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12108 llvm-svn: 245797
2015-08-18[PM/AA] Remove the last relics of the separate IPA library from LLVM,Chandler Carruth1-0/+1451
folding the code into the main Analysis library. There already wasn't much of a distinction between Analysis and IPA. A number of the passes in Analysis are actually IPA passes, and there doesn't seem to be any advantage to separating them. Moreover, it makes it hard to have interactions between analyses that are both local and interprocedural. In trying to make the Alias Analysis infrastructure work with the new pass manager, it becomes particularly awkward to navigate this split. I've tried to find all the places where we referenced this, but I may have missed some. I have also adjusted the C API to continue to be equivalently functional after this change. Differential Revision: http://reviews.llvm.org/D12075 llvm-svn: 245318
2013-01-21Sink InlineCost.cpp into IPA -- it is now officially an interproceduralChandler Carruth1-1237/+0
analysis. How cute that it wasn't previously. ;] Part of this confusion stems from the flattened header file tree. Thanks to Benjamin for pointing out the goof on IRC, and we're considering un-flattening the headers, so speak now if that would bug you. llvm-svn: 173033
2013-01-21Move the inline cost analysis's primary cost query to TTI instead of theChandler Carruth1-4/+4
old CodeMetrics system. TTI has the specific advantage of being extensible and customizable by targets to reflect target-specific cost metrics. llvm-svn: 173032
2013-01-21Now that the inline cost analysis is a pass, we can easily have itChandler Carruth1-12/+20
depend on and use other analyses (as long as they're either immutable passes or CGSCC passes of course -- nothing in the pass manager has been fixed here). Leverage this to thread TargetTransformInfo down through the inline cost analysis. No functionality changed here, this just threads things through. llvm-svn: 173031