aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Analysis/ScalarEvolution.cpp
AgeCommit message (Collapse)AuthorFilesLines
2023-05-07[SCEV][reland] More precise trip multiplesJoshua Cao1-53/+114
We currently have getMinTrailingZeros(), from which we can get a SCEV's multiple by computing 1 << MinTrailingZeroes. However, this only gets us multiples that are a power of 2. This patch introduces a way to get max constant multiples that are not just a power of 2. The logic is similar to that of getMinTrailingZeros. getMinTrailingZerosImpl is replaced by computing the max constant multiple, and counting the number of trailing bits. I have so far found this useful in two places: 1) Computing unsigned constant ranges. For example, if we have i8 {10,+,10}<nuw>, we know the max constant it can be is 250. 2) My original intent was to use this in getSmallConstantTripMultiples, but it has no effect right now due to change from D110587. For example, if we have backedge count `(6 * %N) - 1`, the trip count becomes `1 + zext((6 * %N) - 1)`, and we cannot say that 6 is a multiple of the SCEV. I plan to look further into this separately. The implementation assumes the value is unsigned. It can probably be extended to handle signed values as well. If the code sees that a SCEV does not have <nuw>, it will fall back to finding the max multiple that is a power of 2. Multiples that are a power of 2 will still be a multiple even after the SCEV overflows. This does not apply to other values. This is the 1st commit message: --- This relands https://reviews.llvm.org/D141823. The verification fails when expensive checks are turned on. This can occur when: 1. SCEV S's multiple is cached 2. SCEV S's no wrap flags are strengthened, and the multiple changes 3. SCEV verifier finds that S's cached and recomputed multiple are different We eliminate most cases by forgetting SCEVAddRecExpr's cached values when the flags are modified, but there are still cases for other SCEV types. We relax the check by making sure the cached multiple divides the recomputed multiple, ensuring the cached multiple is correct, conservative multiple. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D149529
2023-05-02[SCEV] Reuse Accum variable when handling GEP flagsNikita Popov1-3/+1
The GEP minus the base pointer (which is the pre-inc addrec) is exactly the Accum value that was already calculated.
2023-04-29[SCEV] Use object size for globals to sharpen ranges.Florian Hahn1-0/+26
The highest address the object can start is ObjSize bytes before the end (unsigned max value). If this value is not a multiple of the alignment, the last possible start value is the next lowest multiple of the alignment. Note: The computations cannot overflow, because if they would there's no possible start address for the object. At the moment, this is limited to GlobalVariables, because I could not find a API similar to getObjectSize to also get the alignment of the object. With such an API, this can be generalized to general addresses. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D149483
2023-04-28[SCEV] Skip instrs with non-scevable types in visitAndClearUsers.Florian Hahn1-0/+2
No SCEVs are formed for instructions with non-scevable types, so no other SCEV expressions can depend on them. Skip those instructions and their users when invalidating SCEV expressions. Depends on D144847. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D144848
2023-04-28[SCEV] Don't invalidate past dependency-breaking instructionsNikita Popov1-1/+19
When invalidating a value, we walk all users of that value and invalidate them as well. This can be very expensive for large use graphs. However, we only need to invalidate a user U of instruction I if SCEV(U) can depend on SCEV(I). This is not the case if U is an instruction that always produces a SCEVUnknown, such as a load. If the load pointer operand is invalidated, there is no need to invalidate the load result, which is completely unrelated from a SCEV perspective. Differential Revision: https://reviews.llvm.org/D149323
2023-04-28[SCEV] Replace IsAvailableOnEntry with block dispositionNikita Popov1-88/+2
As far as I understand, the IsAvailableOnEntry() function basically implements the same functionality as the properlyDominates() block disposition. The primary difference (apart from a weaker implementation) seems to be in this comment at the top: // Checks if the SCEV S is available at BB. S is considered available at BB // if S can be materialized at BB without introducing a fault. However, I don't really understand why there would be such a requirement. It's my understanding that SCEV explicitly does not care about trapping udiv instructions itself, and it's the job of SCEVExpander's isSafeToExpand() to make sure these don't get expanded if they may trap. Differential Revision: https://reviews.llvm.org/D149344
2023-04-27[SCEV] Drop LCSSA check in createNodeFromSelectLikePHI()Nikita Popov1-5/+0
SCEV expressions no longer try to preserve LCSSA form. SCEV construction will try to look through LCSSA phi nodes. As such, we also no longer need to limit this special-case fold.
2023-04-27[SCEV] Try simplifying phi before createNodeFromSelectLikePHI()Nikita Popov1-3/+3
Sometimes a phi can both be trivial and match the createNodeFromSelectLikePHI() fold. In that case it is generally more profitable to look through the phi node.
2023-04-27[SCEV] Remove LCSSA special case in getSCEVAtScope() (NFCI)Nikita Popov1-11/+0
We no longer try to preserve LCSSA form in SCEV representation: Nowadays, we look through LCSSA PHI nodes directly during SCEV construction. As such, this separate special case in getSCEVAtScope() is no longer needed.
2023-04-27[SCEV] Check correct binary operator for nowrap flagsNikita Popov1-2/+2
We should be checking the current BO here, not the nested one. If the current BO has nowrap flags (and is UB on poison), then we'll fetch both operand SCEVs of that BO. We'll check the nested BO on the next iteration of the do/while loop.
2023-04-27[SCEV] Check MatchBinaryOp opcode instead of original opcodeNikita Popov1-2/+2
These are not necessarily the same (e.g. or can become add) and this is what we're switching over in the first place.
2023-04-27[SCEV] Fix getOperandsToCreate() for and/orNikita Popov1-1/+1
We can create expressions either for constant operand or i1 and/or. The implementation was inverting the latter check.
2023-04-25[SCEV] Common code for computing trip count in a fixed type [NFC-ish]Philip Reames1-13/+31
This is a follow on to D147117 and D147355. In both cases, we were adding special cases to compute zext(BTC+1) instead of zext(BTC)+1 when the BTC+1 computation was known not to overflow. Differential Revision: https://reviews.llvm.org/D148661
2023-04-25[SCEV] Support sub in and negative constants willNotOverflowMax Kazantsev1-11/+28
This lifts two TODOs from this function, allowing us to prove no-overflow whether it happens through max int (up) or through min int (down) for both and and sub. Differential Revision: https://reviews.llvm.org/D148618 Reviewed By: dmakogon
2023-04-24Revert "[SCEV] Precise trip multiples"Joshua Cao1-105/+51
This reverts commit 027a4c8b96c7f97df8e98b1dac069b956810ab94.
2023-04-24[SCEV] Precise trip multiplesJoshua Cao1-51/+105
We currently have getMinTrailingZeros(), from which we can get a SCEV's multiple by computing 1 << MinTrailingZeroes. However, this only gets us multiples that are a power of 2. This patch introduces a way to get max constant multiples that are not just a power of 2. The logic is similar to that of getMinTrailingZeros. getMinTrailingZeros is replaced by computing the max constant multiple, and counting the number of trailing bits. This is applied in two places: 1) Computing unsigned constant ranges. For example, if we have i8 {10,+,10}<nuw>, we know the max constant it can be is 250. 2) Computing trip multiples as shown in SCEV output. This is useful if for example, we are unrolling a loop by a factor of 5, and we know the trip multiple is 5, then we don't need a loop epilog. If the code sees that a SCEV does not have <nuw>, it will fall back to finding the max multiple that is a power of 2. Multiples that are a power of 2 will still be a multiple even after the SCEV overflows. Differential Revision: https://reviews.llvm.org/D141823
2023-04-21[SCEV] Clarify inference in isAddRecNeverPoison()Nikita Popov1-42/+23
The justification in isAddRecNeverPoison() no longer applies, as it dates back to a time where LLVM had an unconditional forward progress guarantee. However, we also no longer need it, because we can exploit branch on poison UB instead. For a single exit loop (without abnormal exits) we know that all instructions dominating the exit will be executed, so if any of them trigger UB on poison that means that addrec is not poison. This is slightly stronger than the previous code, because a) we don't need the exit to also be the latch and b) we don't need the value to be used in the exit branch in particular, any UB-producing instruction is fine. I don't expect much practical impact from this change, this is mainly to clarify the reasoning behind this logic. Differential Revision: https://reviews.llvm.org/D148633
2023-04-14[SCEV] Preserve NSW for AddRec multiplied by -1 if it cannot be signed minimumDmitry Makogon1-2/+13
This preserves NSW flag for AddRecs multiplied by -1 if we can prove via constant ranges that the AddRec cannot be signed minimum. An explanation: Let M be signed minimum. If AddRec's range contains M, then M * (-1) will stay M and (M + 1) * (-1) will be signed maximum, so we get a signed overflow. In all other cases if an AddRec didn't signed overflow, then AddRec * (-1) wouldn't too. Differential Revision: https://reviews.llvm.org/D148084
2023-04-10[SCEV][NFC] GetMinTrailingZeros switch case and naming cleanupJoshua Cao1-27/+21
* combine zext and sext into the one switch case * combine vscale and udiv into one switch case * renames according to LLVM style
2023-04-10[SCEV] Strengthen huge constant trip multiples.Joshua Cao1-10/+13
SCEV determines that loops with trip count >=2^32 have a trip multiple of 1 to guard against huge multiples. This patch stregthens this to instead find the greatest power of 2 divisor that is less than the threshold. Differential Revision: https://reviews.llvm.org/D147868
2023-04-10[SCEV][NFC] Convert check to assert getSmallConstantTripMultiple()Joshua Cao1-1/+2
2023-04-10[SCEV] When computing trip count, only zext if necessaryJoshua Cao1-3/+8
This patch improves on https://reviews.llvm.org/D110587. To summarize the patch, given backedge-taken count BC, trip count TC is `BC + 1`. However, we don't know if BC we might overflow. So the patch modifies TC computation to `1 + zext(BC)`. This patch only adds the zext if necessary by looking at the constant range. If we can determine that BC cannot be the max value for its bitwidth, then we know adding 1 will not overflow, and the zext is not needed. We apply loop guards before computing TC to get more data. The primary motivation is to support my work on more precise trip multiples in https://reviews.llvm.org/D141823. For example: ``` void test(unsigned n) __builtin_assume(n % 6 == 0); for (unsigned i = 0; i < n; ++i) foo(); ``` Prior to this patch, we had `TC = 1 + zext(-1 + 6 * ((6 umax %n) /u 6))<nuw>`. SCEV range computation is able to determine that the BC cannot be the max value, so the zext is not needed. The result is `TC -> (6 * ((6 umax %n) /u 6))<nuw>`. From here, we would be able to determine that %n is a multiple of 6. There was one change in LoopCacheAnalysis/LoopInterchange required. Before this patch, if a loop has BC = false, it would compute `TC -> 1 + zext(false) -> 1`, which was fine. After this patch, it computes `TC -> 1 + false = true`. CacheAnalysis would then sign extend the `true`, which was not the intended the behavior. I modified CacheAnalysis such that it would only zero extend trip counts. This patch is not NFC, but also does not change any SCEV outputs. I would like to get this patch out first to make work with trip multiples easier. Differential Revision: https://reviews.llvm.org/D147117
2023-04-10[SCEV] Improve AddRecs' range computation in Expensive Range Sharpening modeMax Kazantsev1-1/+1
Apply loop guards to AddRec's start in range computation for non-self-wrapping AddRecs. According to CT measurements, this has a wide negative compile time impact, so we hold it in expensive range sharpening mode where it's not so critical. However, we need to find a way to share benefits of this mode with default mode. Patch by Aleksandr Popov! Differential Revision: https://reviews.llvm.org/D147557 Reviewed By: mkazantsev
2023-04-08[SCEV][NFC] Fix `Do not use 'else' after 'return'`Joshua Cao1-36/+32
Follow LLVM coding standards and make clangd emit less warnings.
2023-03-22[SCEV] Infer no-self-wrap via constant rangesPhilip Reames1-0/+12
Without this, pointer IVs in loops with small constant trip counts couldn't be proven no-self-wrap. This came up in a new LSR transform, but may also benefit other SCEV consumers as well. Differential Revision: https://reviews.llvm.org/D146596
2023-03-20[SCEV] Preserve divisibility and min/max information in applyLoopGuardsAlon Kom1-3/+154
applyLoopGuards doesn't always preserve information when there are multiple assumes. This patch tries to deal with multiple assumes regarding a SCEV's divisibility and min/max values, and rewrite it into a SCEV that still preserves all of the information. For example, let the trip count of the loop be TC. Consider the 3 following assumes: 1. __builtin_assume(TC % 8 == 0); 2. __builtin_assume(TC > 0); 3. __builtin_assume(TC < 100); Before this patch, depending on the assume processing order applyLoopGuards could create the following SCEV: max(min((8 * (TC / 8)) , 99), 1) Looking at this SCEV, it doesn't preserve the divisibility by 8 information. After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV: max(min((8 * (TC / 8)) , 96), 8) By aligning up 1 to 8, and aligning down 99 to 96, the new SCEV still preserves all of the original assumes. Differential Revision: https://reviews.llvm.org/D144947
2023-03-17[SCEV] Recognize vscale intrinsicsNikita Popov1-1/+3
Now that SCEV has a dedicated vscale node type, we should also map vscale intrinsics to it. To make sure this does not regress ranges (which were KnownBits based previously), add support for vscale to getRangeRef() as well. Differential Revision: https://reviews.llvm.org/D146226
2023-03-17[Analysis] Make order of analysis executions more stableBjorn Pettersson1-4/+5
When debugging and using debug-pass-manager (e.g. in regression tests) we prefer a consistent order in which analysis passes are executed. But when for example doing return MyClass(AM.getResult<LoopAnalysis>(F), AM.getResult<DominatorTreeAnalysis>(F)); then the order in which LoopAnalysis and DominatorTreeAnalysis isn't guaranteed, and might for example depend on which compiler that is used when building LLVM. I've not scanned the full source tree, but this fixes some occurances of the above pattern found in lib/Analysis. This problem was discussed briefly in review for D146206.
2023-03-15[SCEV] Do not strengthen nuw/nsw flags during get[Zero,Sign]ExtendedExpr.Florian Hahn1-10/+0
Modifying AddRecs when constructing other expressions can lead to surprising changes. It also seems like it is not really beneficial i most cases. At the moment, there's a single regression, but we still might be able to improve the flags at AddRec construction. Might help with the issue discussed in D143409. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D144051
2023-03-14[SCEV] Rename ControlsExit -> ControlsOnlyExit (NFC)Nikita Popov1-71/+71
As suggested in https://reviews.llvm.org/D145510#4192162.
2023-03-14[SCEV] Fix finite loop non-strict predicate simplification (PR60944)Nikita Popov1-19/+28
There are a number of issues with the current code for converting ule -> ult (etc) predicates for comparisons controlling finite loops: * It sets nowrap flags, which may only hold for that particular comparison, not globally. (PR60944) * It doesn't check that the RHS is invariant. (I'm not sure this can cause practical issues independently of the previous point.) * It runs before simplifications that may be more profitable. (PR54191) This patch moves the handling for this into computeExitLimitFromICmp(), because it is somewhat tightly coupled with assumptions in that code, and addresses the aforementioned issues. Fixes https://github.com/llvm/llvm-project/issues/60944. Fixes https://github.com/llvm/llvm-project/issues/54191. Differential Revision: https://reviews.llvm.org/D145510
2023-03-14[Analysis] Use *{Set,Map}::contains (NFC)Kazu Hirata1-1/+1
2023-03-14[SCEV] Apply loop guards against min/max for its argumentsDmitry Makogon1-43/+83
This replaces several rewriting rules in ScalarEvolution::applyLoopGuards that are applied to min/max expressions with the equivalent ones but applied to its arguments. So previously given we had a loop guard min(a, b) >= c, the min expression would get rewritten as max(c, min(a, b)). With such approach, we were unable to apply the rewrite if min operands were zext for example (min(zext(a), zext(b))), however it's equivalent to the expression zext(min(a, b)) for which we could apply the rewrite. Now we'd rewrite the min operands also with these expressions: a -> max(c, a) and b -> max(c, b). and this would allow us to apply the loop guard in this and similar cases: min(zext(a), zext(b)) would get rewritten as min(zext(max(c, a)), zext(max(c, b))) instead of just being skipped. The list of added rules (omitting predicates signedness for simplicity): 1. Guard: min(a, b) >= c Old rule: min(a, b) -> max(c, min(a, b)) New rules: a -> max(a, c) and b -> max(b, c) 2. Guard: min(a, b) > c Old rule: min(a, b) -> max(c + 1, min(a, b)) New rules: a -> max(a, c + 1) and b -> max(b, c + 1) 3. Guard: max(a, b) <= c Old rule: max(a, b) -> min(c, max(a, b)) New rules: a -> min(a, c) and b -> min(b, c) 4. Guard: max(a, b) < c Old rule: max(a, b) -> min(c - 1, max(a, b)) New rules: a -> min(a, c - 1) and b -> min(b, c - 1) The old rewrites still hold. Differential Revision: https://reviews.llvm.org/D145230
2023-03-14[SCEV] Rename variables in applyLoopGuards (NFC)Dmitry Makogon1-10/+10
2023-03-07[SCEV] Strengthen nowrap flags via ranges for ARs on construction.Florian Hahn1-0/+12
At the moment, proveNoWrapViaConstantRanges is only used when creating SCEV[Zero,Sign]ExtendExprs. We can get significant improvements by strengthening flags after creating the AddRec. I'll also share a follow-up patch that removes the code to strengthen flags when creating SCEV[Zero,Sign]ExtendExprs. Modifying AddRecs while creating those can lead to surprising changes. Compile-time looks neutral: https://llvm-compile-time-tracker.com/compare.php?from=94676cf8a13c511a9acfc24ed53c98964a87bde3&to=aced434e8b103109104882776824c4136c90030d&stat=instructions:u Reviewed By: mkazantsev, nikic Differential Revision: https://reviews.llvm.org/D144050
2023-03-07[IR] Add operator<< overload for CmpInst::Predicate (NFC)Nikita Popov1-2/+1
I regularly try and fail to use this while debugging.
2023-03-07[SCEV] Use fallthoughs in predicate switch when collecting rewrites for loop ↵Dmitry Makogon1-14/+10
guard (NFC)
2023-03-03[SCEV] Fix control flow warning (NFC)Nikita Popov1-0/+1
2023-03-03[SCEV] Extract a helper to create a SCEV with new operands (NFC)Nikita Popov1-27/+36
2023-03-03[SCEV] Remove an unnecessary switch (NFC)Nikita Popov1-36/+13
Just the scevUnconditionallyPropagatesPoisonFromOperands() check is sufficient. Also rename the flag to be more in line with the more general predicate.
2023-03-03[ScalarEvolution] Factor out RewriteMap utilities in applyLoopGuards (NFC)Dmitry Makogon1-9/+22
This factors out two utilities used with RewriteMap in applyLoopGuards: - AddRewrite, which puts a rewrite rule in the map and if needed registers the rewrite in the list of rewritten expressions, - GetMaybeRewritten, which checks whether an expression has already been rewritten, and if so, returns the rewrite. Otherwise, returns the given expression. This may be needed when adding new rewrite rules as not to copy-paste this code.
2023-03-02Revert "Revert "[SCEV] Add SCEVType to represent `vscale`.""Paul Walker1-23/+43
Relanding after fixing Polly related build error. This reverts commit 7b26dcae9eaf8cdcba7fef032fd83d060dffd4b4.
2023-03-02Revert "[SCEV] Add SCEVType to represent `vscale`."Paul Walker1-43/+23
This reverts commit 7912f5cc92f65ad0d3c705f3683a0b69dbedcc57.
2023-03-02[SCEV] Add SCEVType to represent `vscale`.Paul Walker1-23/+43
This is part of an effort to remove ConstantExpr based representations of `vscale` so that its LangRef definiton can be relaxed to accommodate a less strict definition of constant. Differential Revision: https://reviews.llvm.org/D144891
2023-02-27[SCEV] Hoist common cleanup code to function. (NFC)Florian Hahn1-28/+21
This allows for easier updating of common code in follow-on patches. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D144847
2023-02-27[SCEV] Make scalable size representation more explicitNikita Popov1-38/+18
Represent scalable type sizes using C * vscale, where vscale is the vscale constant expression. This exposes a bit more information to SCEV, because the vscale multiplier is explicitly modeled in SCEV (rather than part of the sizeof expression). This is mainly intended as an alternative to D143642. Differential Revision: https://reviews.llvm.org/D144624
2023-02-25[ScalarEvolution] Fix unused variable warnings. NFC.Simon Pilgrim1-3/+3
Replace dyn_cast<> with isa<> as we don't actually need the variable
2023-02-23[SCEV] Ensure SCEV does not replace aliases with their aliaseesLeonard Chan1-5/+1
Passes in general shouldn't replace an alias with the aliasee (see https://reviews.llvm.org/D66606). This can lead to situations where a linkonce_odr symbol (which could be interposable if lowered to weak linkage) can be replaced with a local aliasee which won't be interposable. SVEC does this when the function is invoked by FunctionPass Manager -> Loop Pass Manager -> Induction Variable Users in the codegen pipeline. This was found in hwasan instrumented code where a linonce_odr alias was replaced with its private aliasee. This fixes the bug descriped at https://github.com/llvm/llvm-project/issues/60668. Differential Revision: https://reviews.llvm.org/D144035
2023-02-23Revert "[SCEV] Preserve divisibility and min/max information in applyLoopGuards"komalon11-192/+22
This reverts commit 219ba2fb7b0ae89101f3c81a47fe4fc4aa80dea4.
2023-02-23[SCEV] Preserve divisibility and min/max information in applyLoopGuardsAlon Kom1-22/+192
applyLoopGuards doesn't always preserve information when there are multiple assumes. This patch tries to deal with multiple assumes regarding a SCEV's divisibility and min/max values, and rewrite it into a SCEV that still preserves all of the information. For example, let the trip count of the loop be TC. Consider the 3 following assumes: 1. __builtin_assume(TC % 8 == 0); 2. __builtin_assume(TC > 0); 3. __builtin_assume(TC < 100); Before this patch, depending on the assume processing order applyLoopGuards could create the following SCEV: max(min((8 * (TC / 8)) , 99), 1) Looking at this SCEV, it doesn't preserve the divisibility by 8 information. After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV: max(min((8 * (TC / 8)) , 96), 8) By aligning up 1 to 8, and aligning down 99 to 96, the new SCEV still preserves all of the original assumes. Differential Revision: https://reviews.llvm.org/D141850