riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
2020-03-12	[SVE] Update API ConstantVector::getSplat() to use ElementCount.	Huihui Zhang	1	-3/+2
	Summary: Support ConstantInt::get() and Constant::getAllOnesValue() for scalable vector type, this requires ConstantVector::getSplat() to take in 'ElementCount', instead of 'unsigned' number of element count. This change is needed for D73753. Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74386
2020-03-12	[InstSimplify] simplify FP ops harder with FMF (part 2)	Sanjay Patel	1	-3/+3
	This is part of the IR sibling for: D75576 Related transform committed with: rG8ec71585719d
2020-03-12	[InstSimplify] simplify FP ops harder with FMF	Sanjay Patel	1	-7/+20
	This is part of the IR sibling for: D75576 (I'm splitting part of the transform as a separate commit to reduce risk. I don't know of any bugs that might be exposed by this improved folding, but it's hard to see those in advance...)
2020-03-12	[InstSimplify] reduce code for FP undef/nan folding; NFC	Sanjay Patel	1	-6/+3

2020-03-12	[SCEV] isHighCostExpansionHelper(): use correct TTI hooks	Roman Lebedev	1	-9/+12
	Summary: Cost modelling strikes again. In PR44668 <https://bugs.llvm.org/show_bug.cgi?id=44668> patch series, i've made the same mistake of always using generic `getOperationCost()` that i missed in reviewing D73480/D74495 which was later fixed in 62dd44d76da9aa596fb199bda8b1e8768bb41033. We should be using more specific hooks instead - `getCastInstrCost()`, `getArithmeticInstrCost()`, `getCmpSelInstrCost()`. Evidently, this does not have an effect on the existing testcases, with unchanged default cost budget. But if it does have an effect on some target, we'll have to segregate tests that use this function per-target, much like we already do with other TTI-aware transform tests. There's also an issue that @samparker has brought up in post-commit-review: >>! In D73501#1905171, @samparker wrote: > Hi, > Did you get performance numbers for these patches? We track the performance > of our (Arm) open source DSP library and the cost model fixes were generally > a notable improvement, so many thanks for that! But the final patch > for rewriting exit values has generally been bad, especially considering > the gains from the modelling improvements. I need to look into it further, > but on my current test case I'm seeing +30% increase in stack accesses > with a similar decrease in performance. > I'm just wondering if you observed any negative effects yourself? I don't know if this addresses that, or we need D66450 for that. Reviewers: samparker, spatel, mkazantsev, reames, wmi Reviewed By: reames Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits, samparker Tags: #llvm Differential Revision: https://reviews.llvm.org/D75908
2020-03-11	[InstSimplify][SVE] Fix SimplifyInsert/ExtractElementInst for scalable vector.	Huihui Zhang	2	-8/+12
	Summary: For scalable vectors, an out-of-bounds index cannot be determined at compile time. The same applies to VectorUtil findScalarElement(). Add test cases to check the functionality of SimplifyInsert/ExtractElementInst for scalable vectors. Reviewers: sdesmalen, efriedma, spatel, apazos Reviewed By: efriedma Subscribers: cameron.mcinally, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75782
2020-03-11	[TTI][ARM][MVE] Refine gather/scatter cost model	Anna Welker	1	-9/+14
	Refines the gather/scatter cost model, but also changes the TTI function getIntrinsicInstrCost to accept an additional parameter which is needed for the gather/scatter cost evaluation. This did require trivial changes in some non-ARM backends to adopt the new parameter. Extending gathers and truncating scatters are now priced cheaper. Differential Revision: https://reviews.llvm.org/D75525
2020-03-09	[InstSimplify] Simplify calls with "returned" attribute	Nikita Popov	1	-0/+3
	If a call argument has the "returned" attribute, we can simplify the call to the value of that argument. The "-inst-simplify" pass already handled this for the constant integer argument case via known bits, which is invoked in SimplifyInstruction. However, non-constant (or non-int) arguments are not handled at all right now. This addresses one of the regressions from D75801. Differential Revision: https://reviews.llvm.org/D75815
2020-03-09	[InstSimplify] Don't simplify musttail calls	Nikita Popov	1	-0/+8
	As pointed out by jdoerfert on D75815, we must be careful when simplifying musttail calls: We can only replace the return value if we can eliminate the call entirely. As we can't make this guarantee for all consumers of InstSimplify, this patch disables simplification of musttail calls. Without this patch, musttail simplification currently results in module verification errors. Differential Revision: https://reviews.llvm.org/D75824
2020-03-06	Extend TimeTrace to LLVM's new pass manager	Andrew Monshizadeh	1	-1/+6
	With the addition of the LLD time tracing it made sense to include coverage for LLVM's various passes. Doing so ensures that ThinLTO is also covered with a time trace. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D74516
2020-03-06	[AMDGPU][ConstantFolding] Fold llvm.amdgcn.cube* intrinsics	Jay Foad	1	-0/+68
	Summary: This folds the following family of intrinsics: llvm.amdgcn.cubeid (face id) llvm.amdgcn.cubema (major axis) llvm.amdgcn.cubesc (S coordinate) llvm.amdgcn.cubetc (T coordinate) Reviewers: nhaehnle, arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75187
2020-03-06	[APFloat] Make use of new overloaded comparison operators. NFC.	Jay Foad	3	-20/+12
	Reviewers: ekatz, spatel, jfb, tlively, craig.topper, RKSimon, nikic, scanon Subscribers: arsenm, jvesely, nhaehnle, hiraditya, dexonsmith, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75744
2020-03-06	[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators	Juneyoung Lee	2	-4/+30
	Summary: ``` br i1 c, BB1, BB2: BB1: use1(c) BB2: use2(c) ``` In BB1 and BB2, c is never undef or poison because otherwise the branch would have triggered UB. This is a resubmission of 952ad47 with crash fix of llvm/test/Transforms/LoopRotate/freeze-crash.ll. Checked with Alive2 Reviewers: xbolva00, spatel, lebedev.ri, reames, jdoerfert, nlopes, sanjoy Reviewed By: reames Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75401
2020-03-05	Revert "[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators"	Daniil Suchkov	2	-30/+4
	That commit causes SIGSEGV on some simple tests. This reverts commit 952ad4701cf0d8da79789f6b83ddaa386c60d535.
2020-03-04	[InstSimplify] Constant fold icmp of gep	Nikita Popov	1	-1/+2
	InstSimplify can fold icmps of gep where the base pointers are the same and the offsets are constant. It does so by constructing a constant expression icmp and assumes that it gets folded -- but this doesn't actually happen, because GEP expressions can usually only be folded by the target-dependent constant folding layer. As such, we need to explicitly invoke it here. Differential Revision: https://reviews.llvm.org/D75407
2020-03-04	[ConstantFolding] Always return something from ConstantFoldConstant	Nikita Popov	3	-30/+16
	Spin-off from D75407. As described there, ConstantFoldConstant() currently returns null for non-ConstantExpr/ConstantVector inputs, but otherwise always returns non-null, independently of whether any folding has happened or not. This is confusing and makes consumer code more complicated. I would expect either that ConstantFoldConstant() returns non-null only if it actually folded something, or that it always returns non-null. I'm going with the latter possibility here, which appears to be more useful considering existing usage. Differential Revision: https://reviews.llvm.org/D75543
2020-03-04	[DependenceAnalysis] Dependencies for loads marked with "invariant.load" should not be shared with general accesses (PR42151).	Evgeniy Brevnov	1	-20/+53
	Summary: This is the second attempt to fix the problem with incorrect dependencies reported in the presence of invariant loads. The initial fix (https://reviews.llvm.org/D64405) was reverted due to a regression reported in https://reviews.llvm.org/D70516. The original fix changed caching behavior for invariant loads: such loads are not put into the second level cache (NonLocalDepInfo). The problem with that fix is that the first level cache (CachedNonLocalPointerInfo) still works as if invariant loads were in the second level cache. The solution is, in addition to not putting dependence results into the second level cache, to avoid putting info about invariant loads into the first level cache as well. Reviewers: jdoerfert, reames, hfinkel, efriedma Reviewed By: jdoerfert Subscribers: DaniilSuchkov, hiraditya, bmahjour, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73027
2020-03-04	[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators	Juneyoung Lee	2	-4/+30
	Summary: ``` br i1 c, BB1, BB2: BB1: use1(c) BB2: use2(c) ``` In BB1 and BB2, c is never undef or poison because otherwise the branch would have triggered UB. Checked with Alive2 Reviewers: xbolva00, spatel, lebedev.ri, reames, jdoerfert, nlopes, sanjoy Reviewed By: reames Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75401
2020-03-03	[LoopNest]: Analysis to discover properties of a loop nest.	Whitney Tsang	2	-0/+297
	Summary: This patch adds an analysis pass to collect loop nests and summarize properties of the nest (e.g the nest depth, whether the nest is perfect, what's the innermost loop, etc...). The motivation for this patch was discussed at the latest meeting of the LLVM loop group (https://ibm.box.com/v/llvm-loop-nest-analysis) where we discussed the unimodular loop transformation framework ( “A Loop Transformation Theory and an Algorithm to Maximize Parallelism”, Michael E. Wolf and Monica S. Lam, IEEE TPDS, October 1991). The unimodular framework provides a convenient way to unify legality checking and code generation for several loop nest transformations (e.g. loop reversal, loop interchange, loop skewing) and their compositions. Given that the unimodular framework is applicable to perfect loop nests this is one property of interest we expose in this analysis. Several other utility functions are also provided. In the future other properties of interest can be added in a centralized place. Authored By: etiotto Reviewer: Meinersbur, bmahjour, kbarton, Whitney, dmgreen, fhahn, reames, hfinkel, jdoerfert, ppc-slack Reviewed By: Meinersbur Subscribers: bryanpkc, ppc-slack, mgorny, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D68789
2020-03-03	Revert "[LoopNest]: Analysis to discover properties of a loop nest."	Whitney Tsang	2	-297/+0
	This reverts commit 3a063d68e3c97136d10a2e770f389e6c13c3b317. Broke the build with modules enabled: http://green.lab.llvm.org/green/job/lldb-cmake/10655/console .
2020-03-03	[LoopNest]: Analysis to discover properties of a loop nest.	Whitney Tsang	2	-0/+297
	Summary: This patch adds an analysis pass to collect loop nests and summarize properties of the nest (e.g the nest depth, whether the nest is perfect, what's the innermost loop, etc...). The motivation for this patch was discussed at the latest meeting of the LLVM loop group (https://ibm.box.com/v/llvm-loop-nest-analysis) where we discussed the unimodular loop transformation framework ( “A Loop Transformation Theory and an Algorithm to Maximize Parallelism”, Michael E. Wolf and Monica S. Lam, IEEE TPDS, October 1991). The unimodular framework provides a convenient way to unify legality checking and code generation for several loop nest transformations (e.g. loop reversal, loop interchange, loop skewing) and their compositions. Given that the unimodular framework is applicable to perfect loop nests this is one property of interest we expose in this analysis. Several other utility functions are also provided. In the future other properties of interest can be added in a centralized place. Authored By: etiotto Reviewer: Meinersbur, bmahjour, kbarton, Whitney, dmgreen, fhahn, reames, hfinkel, jdoerfert, ppc-slack Reviewed By: Meinersbur Subscribers: bryanpkc, ppc-slack, mgorny, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D68789
2020-03-02	[PSI] Add the isCold query support with a given percentile value.	Hiroshi Yamauchi	1	-13/+64
	Summary: This follows up D67377 that added the isHot side. Reviewers: davidxl Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75283
2020-03-01	[ValueTracking] Let getGuaranteedNonFullPoisonOp consider assume, remove mentioning about br	Juneyoung Lee	1	-4/+11
	Summary: This patch helps getGuaranteedNonFullPoisonOp handle llvm.assume call. Also, a comment about the semantics of branch is removed to prevent confusion. As llvm.assume does, branching on poison directly raises UB (as LangRef says), and this allows transformations such as introduction of llvm.assume on branch condition at each successor, or freely replacing values after conditional branch (such as at loop exit). Handling br is not addressed in this patch. It makes SCEV more accurate, causing existing LoopVectorize/IndVar/etc tests to fail. Reviewers: spatel, lebedev.ri, nlopes Reviewed By: nlopes Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75397
2020-03-01	[ValueTracking] A value is never undef or poison if it must raise UB	Juneyoung Lee	1	-0/+7
	Summary: This patch helps isGuaranteedNotToBeUndefOrPoison return true if the value makes the program always undefined. According to value tracking functions' comments, there is still no consensus on whether a poison value can be bitwise or not, so conservatively only the i1 case is considered. Reviewers: spatel, lebedev.ri, reames, nlopes, regehr Reviewed By: nlopes Subscribers: uenoku, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75396
2020-02-28	[Inliner] Inlining should honor nobuiltin attributes	Teresa Johnson	1	-5/+21
	Summary: Final patch in series to fix inlining between functions with different nobuiltin attributes/options, which was specifically an issue in LTO. See discussion on D61634 for background. The prior patch in this series (D67923) enabled per-Function TLI construction that identified the nobuiltin attributes. Here I have allowed inlining to proceed if the callee's nobuiltins are a subset of the caller's nobuiltins, but not in the reverse case, which should be conservatively correct. This is controlled by a new option, -inline-caller-superset-nobuiltin, which is enabled by default. Reviewers: hfinkel, gchatelet, chandlerc, davidxl Subscribers: arsenm, jvesely, nhaehnle, mehdi_amini, eraman, hiraditya, haicheng, dexonsmith, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74162
2020-02-28	No longer generate calls to *_finite	serge-sans-paille	1	-0/+3
	According to Joseph Myers, a libm maintainer > They were only ever an ABI (selected by use of -ffinite-math-only or > options implying it, which resulted in the headers using "asm" to redirect > calls to some libm functions), not an API. The change means that ABI has > turned into compat symbols (only available for existing binaries, not for > anything newly linked, not included in static libm at all, not included in > shared libm for future glibc ports such as RV32), so, yes, in any case > where tools generate direct calls to those functions (rather than just > following the "asm" annotations on function declarations in the headers), > they need to stop doing so. As a consequence, we should no longer assume these symbols are available on the target system. Still keep the TargetLibraryInfo for constant folding. Differential Revision: https://reviews.llvm.org/D74712
2020-02-27	[DA] Delinearization of fixed-size multi-dimensional arrays	Bardia Mahjour	1	-34/+124
	Summary: Currently the dependence analysis in LLVM is unable to compute accurate dependence vectors for multi-dimensional fixed size arrays. This is mainly because the delinearization algorithm in scalar evolution relies on parametric terms to be present in the access functions. In the case of fixed size arrays such parametric terms are not present, but we can use the indexes from GEP instructions to recover the subscripts for each dimension of the arrays. This patch adds this ability under the existing option `-da-disable-delinearization-checks`. Authored By: bmahjour Reviewer: Meinersbur, sebpop, fhahn, dmgreen, grosser, etiotto, bollu Reviewed By: Meinersbur Subscribers: hiraditya, arphaman, Whitney, ppc-slack, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72178
2020-02-27	[AMDGPU][ConstantFolding] Fold llvm.amdgcn.fract intrinsic	Jay Foad	1	-1/+15
	Reviewers: nhaehnle, arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75179
2020-02-26	Cost Annotation Writer for InlineCost	Kirill Naumov	1	-0/+78
	Add extra diagnostics for the inline cost analysis under -print-instruction-deltas cl option. When enabled along with -debug-only=inline-cost it prints the IR of inline candidate annotated with cost and threshold change per every instruction. Reviewed By: apilipenko, davidxl, mtrofin Differential Revision: https://reviews.llvm.org/D71501
2020-02-25	[SCEV][IndVars] Always provide insertion point to the SCEVExpander::isHighCostExpansion()	Roman Lebedev	1	-14/+8
	Summary: This addresses the `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` regression from D73728 Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73777
2020-02-25	[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model min/max (PR44668)	Roman Lebedev	1	-19/+16
	Summary: Previously we simply always said that `SCEVMinMaxExpr` is too costly to expand. But this isn't really true, it expands into just a comparison+swap pair. And again much like with add/mul, there will be one less such pair than the number of operands. And we need to count the cost of operands themselves. This does change a number of testcases, and as far as i can tell, all of these changes are improvements, in the sense that we fixed up more latches to do the [in]equality comparison. This concludes cost-modelling changes, no other SCEV expressions exist as of now. This is a part of addressing PR44668 <https://bugs.llvm.org/show_bug.cgi?id=44668>. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73744
2020-02-25	[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model polynomial recurrence	Roman Lebedev	1	-9/+58
	Summary: So, i wouldn't call this obviously correct, but i think i got it right this time :) Roughly, we have ``` Op0x^0 + Op1x^1 + Op2x^2 ... ``` where `Op_{n} x^{n}` is called term, and `n` the degree of term. Due to the way they are stored internally in `SCEVAddRecExpr`, i believe we can have `Op_{n}` to be `0`, so we should not charge for those. I think it is most straight-forward to count the cost in 4 steps: 1. First, count it the same way we counted `scAddExpr`, but be sure to skip terms with zero constants. Much like with `add` expr we will have one less addition than number of terms. 2. Each non-constant term (term degree >= 1) requires a multiplication between the `Op_{n}` and `x^{n}`. But again, only charge for it if it is required - `Op_{n}` must not be 0 (no term) or 1 (no multiplication needed), and obviously don't charge constant terms (`x^0 == 1`). 3. We must charge for all the `x^0`..`x^{poly_degree}` themselves. Since `x^{poly_degree}` is `x * x * ... * x`, i.e. `poly_degree` `x`'es multiplied, for final `poly_degree` term we again require `poly_degree-1` multiplications. Note that all the `x^{0}`..`x^{poly_degree-1}` will be computed for free along the way there. 4. And finally, the operands themselves. Here, much like with add/mul exprs, we really don't look for preexisting instructions. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73741
2020-02-25	[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model add/mul	Roman Lebedev	1	-0/+34
	Summary: While this resolves the regression from D73722 in `llvm/test/Transforms/IndVarSimplify/exit_value_test2.ll`, this now regresses `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` test, we no longer can perform that expansion within default budget of `4`, but require budget of `6`. That regression is being addressed by D73777. The basic idea here is simple. ``` Op0, Op1, Op2 ... \| \| \| \--+--/ \| \| \| \---+---/ ``` I.e. given N operands, we will have N-1 operations, so we have to add cost of an add (mul) for every Op processed, except the first one, plus we need to recurse into every Op. I'm guessing there's already canonicalization that ensures we won't have `1` operand in `scMulExpr`, and no `0` in `scAddExpr`/`scMulExpr`. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73728
2020-02-25	[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model plain UDiv	Roman Lebedev	1	-12/+19
	Summary: If we don't believe this UDiv is actually a LShr in disguise, things are much worse. First, we try to see if this UDiv actually originates from user code, by looking for `S + 1`, and if found considering this UDiv to be free. But otherwise, we always considered this UDiv to be high-cost. However that is no longer the case with TTI-driven cost model: our default budget is 4, which matches the default cost of UDiv, so now we allow a single UDiv to not be counted as high-cost. While that is the case, it is evident this is actually a regression due to the fact that cost-modelling is incomplete - we did not account for the `add`, `mul` costs yet. That is being addressed in D73728. Cost-modelling for UDiv also seems pretty straight-forward: subtract cost of the UDiv itself, and recurse into both the LHS and RHS. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73722
2020-02-25	[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model UDiv by power-of-two as LShr	Roman Lebedev	1	-12/+10
	Summary: Like with casts, we need to subtract the cost of `lshr` instruction from budget, and recurse into LHS operand. Seems "pretty obviously correct" to me? To be noted, there is a number of other shortcuts we //could// cost-model: * `... + (-1 * ...)` -> `... - ...` <- likely very frequent case * `x - (rem x, power-of-2)`, which is currently `(x udiv power-of-2) * power-of-2` -> `x & -log2(power-of-2)` * `rem x, power-of-2`, which is currently `x - ((x udiv power-of-2) * power-of-2)` -> `x & log2(power-of-2)-1` * `... * power-of-2` -> `... << log2(power-of-2)` <- likely not very beneficial Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73718
2020-02-25	[SCEV] SCEVExpander::isHighCostExpansionHelper(): begin cost modelling - model cast cost	Roman Lebedev	1	-11/+23
	Summary: This is not a NFC, although it does not change any of the existing tests. I'm not really sure if we should have specific tests for the cost modelling itself. This is the first patch that actually makes `SCEVExpander::isHighCostExpansionHelper()` account for the cost of the SCEV expression, and consider the budget available, by modelling cast expressions. I believe the logic itself is "pretty obviously correct" - from budget, we need to subtract the cost of the cast expression from inner type `Op->getType()` to the `S->getType()` type, and recurse into the expression we are casting. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: xbolva00, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73716
2020-02-25	[SCEV] SCEVExpander::isHighCostExpansion(): assert if TTI is not provided	Roman Lebedev	1	-4/+7
	Summary: Currently, as per `check-llvm`, we never call `SCEVExpander::isHighCostExpansion()` with null TTI, so this appears to be a safe restriction. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73712
2020-02-25	[NFC][SCEV] SCEVExpander::isHighCostExpansionHelper(): check that we processed expression first	Roman Lebedev	1	-2/+4
	Summary: As far as i can tell this is still NFC. Initially in rL146438 it was added at the top of the function, later rL238507 dethroned it, and rL244474 did it again. I'm not sure if we have already checked the cost of this expansion, we should be doing that again. Reviewers: reames, mkazantsev, wmi, sanjoy, atrick, igor-laevsky Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73706
2020-02-25	[NFC][SCEV] Piping to pass new SCEVCheapExpansionBudget option into SCEVExpander::isHighCostExpansionHelper()	Roman Lebedev	1	-7/+14
	Summary: In future patches `SCEVExpander::isHighCostExpansionHelper()` will respect the budget allocated by performing TTI cost modelling. This is a fully NFC patch to make things reviewable. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, zzheng, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73705
2020-02-25	[NFC][SCEV] Piping to pass TTI into SCEVExpander::isHighCostExpansionHelper()	Roman Lebedev	1	-8/+8
	Summary: Future patches will make use of TTI to perform cost-model-driven `SCEVExpander::isHighCostExpansionHelper()`. This is a fully NFC patch to make things reviewable. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, zzheng, javed.absar, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73704
2020-02-24	[NFC] [DA] Refactoring getIndexExpressionsFromGEP	Bardia Mahjour	1	-0/+45
	Summary: This patch moves the getIndexExpressionsFromGEP function from polly into ScalarEvolution so that both polly and DependenceAnalysis can use it for the purpose of subscript delinearization when the array sizes are not parametric. Authored By: bmahjour Reviewer: Meinersbur, sebpop, fhahn, dmgreen, grosser, etiotto, bollu Reviewed By: Meinersbur Subscribers: hiraditya, arphaman, Whitney, ppc-slack, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73995
2020-02-21	Flags for displaying only hot nodes in CFGPrinter graph	Kirill Naumov	1	-0/+36
	Added two flags to omit uncommon or dead paths in the CFG graphs: -cfg-hide-unreachable-paths and -cfg-hide-deoptimize-paths. The main purpose is performance analysis when such blocks are not "interesting" from the perspective of common path performance. Reviewed By: apilipenko, davidxl Differential Revision: https://reviews.llvm.org/D74346
2020-02-21	[DependenceAnalysis] Memory dependence analysis internal caching mechanism is broken in presence of TBAA (PR42733).	Evgeniy Brevnov	1	-8/+21
	Summary: There is a flaw in memory dependence analysis caching mechanism when memory accesses with TBAA are involved. Assume we first analysed and cached results for access with TBAA. Later we request dependence for the same memory but without TBAA (or different TBAA). By design these two queries should share one entry in the internal cache which corresponds to a general access (without TBAA). Thus upon second request internal cache is cleared and we continue analysis for access as if there is no TBAA. The problem is that even though internal cache is cleared the set of visited nodes is not. That means we won't traverse visited nodes again and populate internal cache with the corresponding dependence results. So we end up with internal cache in an incomplete state. Current implementation tries to signal that situation by resetting CacheInfo->Pair at line 1104. But that doesn't actually help since later code ignores this invalidation and relies on 'Cache->empty()' property to decide on cache completeness. Reviewers: reames, hfinkel, chandlerc, fedor.sergeev, asbirlea, fhahn, john.brawn, Prazek, sunfish Reviewed By: john.brawn Subscribers: DaniilSuchkov, kosarev, jfb, dantrushin, hiraditya, bmahjour, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73032
2020-02-20	[MustExecute] Add backward exploration for must-be-executed-context	Hideto Ueno	1	-8/+139
	Summary: As mentioned in D71974, it is useful for must-be-executed-context to explore CFG backwardly. This patch is ported from parts of D64975. We use a dominator tree to find the previous context if a dominator tree is available. Reviewers: jdoerfert, hfinkel, baziotis, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74817
2020-02-19	[DDG] Data Dependence Graph - Graph Simplification	Bardia Mahjour	2	-2/+149
	Summary: This is the last functional patch affecting the representation of DDG. Here we try to simplify the DDG to reduce the number of nodes and edges by iteratively merging pairs of nodes that satisfy the following conditions, until no such pair can be identified. A pair of nodes consisting of a and b can be merged if: 1. the only edge from a is a def-use edge to b and 2. the only edge to b is a def-use edge from a and 3. there is no cyclic edge from b to a and 4. all instructions in a and b belong to the same basic block and 5. both a and b are simple (single or multi instruction) nodes. These criteria allow us to fold many uninteresting def-use edges that commonly exist in the graph while avoiding the risk of introducing dependencies that didn't exist before. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack Tags: #llvm Differential Revision: https://reviews.llvm.org/D72350
2020-02-19	[ValueTracking] Improve isKnownNonNaN() to recognize zero splats.	Jonas Paulsson	1	-0/+3
	isKnownNonNaN() could not recognize a zero splat because that is a ConstantAggregateZero which is-a ConstantData but not a ConstantDataVector. Patch makes a ConstantAggregateZero return true. Review: Thomas Lively Differential Revision: https://reviews.llvm.org/D74263
2020-02-19	[AMDGPU][ConstantFolding] Fold llvm.amdgcn.fmul.legacy intrinsic	Jay Foad	1	-0/+12
	Reviewers: arsenm, rampitec, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74835
2020-02-18	[LazyCallGraph] Fix ambiguous index value	Brian Gesiak	1	-1/+2
	After having committed https://reviews.llvm.org/D72226, 2 buildbots running GCC 5.4.0 began failing. The cause was the order in which those compilers evaluated the left- and right-hand sides of the expression `RC.SCCIndices[C] = RC.SCCIndices.size();`. This commit splits the expression into multiple statements to avoid ambiguity, and adds a test case that exercises the code that caused the test failures on those older compilers (which was originally included in the reviewed patch, https://reviews.llvm.org/D72226).
2020-02-18	[IR] Lazily number instructions for local dominance queries	Reid Kleckner	7	-167/+18
	Essentially, fold OrderedBasicBlock into BasicBlock, and make it auto-invalidate the instruction ordering when new instructions are added. Notably, we don't need to invalidate it when removing instructions, which is helpful when a pass mostly deletes dead instructions rather than transforming them. The downside is that Instruction grows from 56 bytes to 64 bytes. The resulting LLVM code is substantially simpler and automatically handles invalidation, which makes me think that this is the right speed and size tradeoff. The important change is in SymbolTableTraitsImpl.h, where the numbering is invalidated. Everything else should be straightforward. We probably want to implement a fancier re-numbering scheme so that local updates don't invalidate the ordering, but I plan for that to be future work, maybe for someone else. Reviewed By: lattner, vsk, fhahn, dexonsmith Differential Revision: https://reviews.llvm.org/D51664
2020-02-18	[VectorUtils] Accept IRBuilderBase; NFC	Nikita Popov	1	-7/+8