aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Transforms/Scalar/Reassociate.cpp
AgeCommit message (Collapse)AuthorFilesLines
2025-08-18[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)Kazu Hirata1-1/+1
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {}; We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.
2025-06-25[Transforms] Use range-based for loops (NFC) (#145252)Kazu Hirata1-5/+5
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-05-19[Local] Move OverflowTracking to Local.h, move logic to helpers (NFC) (#140403)Florian Hahn1-14/+3
Move parts of the logic used by Reassociate to OverflowTracking (mergeFlags & applyFlags) and move the definition to Local.h. For now it just moves the NUW/NSW handling, as this matches the uses in LICM. I'll look into the FP math handling separately, as it looks like there's a difference between Reassociate (takes all flags from I, while LICM takes the intersection of the flags on both instructions). PR: https://github.com/llvm/llvm-project/pull/140403
2025-05-08Reapply "IR: Remove uselist for constantdata (#137313)" (#138961)Matt Arsenault1-1/+2
Reapply "IR: Remove uselist for constantdata (#137313)" This reverts commit 5936c02c8b9c6d1476f7830517781ce8b6e26e75. Fix checking uselists of constants in assume bundle queries
2025-05-07Revert "IR: Remove uselist for constantdata (#137313)"Kirill Stoimenov1-2/+1
Possibly breaks the build: https://lab.llvm.org/buildbot/#/builders/24/builds/8119 This reverts commit 87f312aad6ede636cd2de5d18f3058bf2caf5651.
2025-05-06IR: Remove uselist for constantdata (#137313)Matt Arsenault1-1/+2
This is a resurrected version of the patch attached to this RFC: https://discourse.llvm.org/t/rfc-constantdata-should-not-have-use-lists/42606 In this adaptation, there are a few differences. In the original patch, the Use's use list was replaced with an unsigned* to the reference count in the value. This version leaves them as null and leaves the ref counting only in Value. Remove use-lists from instances of ConstantData (which are shared across modules and have no operands). To continue supporting most of the use-list API, store a ref-count in place of the use-list; this is for API like Value::use_empty and Value::hasNUses. Operations that actually need the use-list -- like Value::use_begin -- will assert. This change has three benefits: 1. The compiler output cannot in any way depend on the use-list order of instances of ConstantData. 2. There's no use-list traffic when adding and removing simple constants from operand lists (although there is ref-count traffic; YMMV). 3. It's cheaper to serialize use-lists (since we're no longer serializing the use-list order of things like i32 0). The downside is that you can't look at all the users of ConstantData, but traversals of users of i32 0 are already ill-advised. Possible follow-ups: - Track if an instance of a ConstantVector/ConstantArray/etc. is known to have all ConstantData arguments, and drop the use-lists to ref-counts in those cases. Callers need to check Value::hasUseList before iterating through the use-list. - Remove even the ref-counts. I'm not sure they have any benefit besides minimizing the scope of this commit, and maintaining the counts is not free. Fixes #58629 Co-authored-by: Duncan P. N. Exon Smith <dexonsmith@apple.com>
2025-05-04[Transforms] Use range-based for loops (NFC) (#138476)Kazu Hirata1-5/+3
2025-04-30[LLVM][Reassociate] Extend ConvertShiftToMul to allow for ConstantInt ↵Paul Walker1-1/+1
vectors. (#137340) This has the side effect of fixing the FIXME for when use-constant-int-for-fixed-length-splat becomes the default.
2025-04-23[Reassociate] Invalidate analysis passes after canonicalizeOperands (#136835)Björn Pettersson1-1/+3
When ranking operands for an expression tree the reassociate pass also perform canonicalization, putting constants on the right hand side. Such transforms was however not registered as modifying the IR. So at the end of the pass, if not having made any other changes, the pass returned that all analyses should be kept. With this patch we make sure to set MadeChange to true when modifying the IR via canonicalizeOperands. This is to make sure analyses such as DemandedBits are properly invalidated when instructions are modified.
2025-04-09[DebugInfo][Reassociate] Propagate source locs when factoring add->mul (#134829)Stephen Tozer1-0/+1
As part of reassociating add instructions, we may factorize some of the adds and produce a mul instruction; this patch propagates the source location of the reassociated tree of instructions to the new mul. Found using https://github.com/llvm/llvm-project/pull/107279.
2025-04-08[DebugInfo][Reassociate] Propagate source loc when negating mul factor (#134679)Stephen Tozer1-3/+9
As part of RemoveFactorFromExpression, we attempt to remove a factor from a mul/fmul expression; this may involve generating new instructions, e.g. to negate the result if the factor was negative in the original expression. When this happens, the new instructions should have a DebugLoc set from the instruction that the factored expression is being used to compute. Found using https://github.com/llvm/llvm-project/pull/107279.
2025-04-08[Reassociate] Apply Debugloc to instrs produced when optimizing add (#134676)Stephen Tozer1-4/+7
Currently in Reassociate we may create a set of new instructions when optimizing an `add`, but we do not set DebugLocs on the new instructions; this patch propagates the add's DebugLoc to the new instructions. Found using #107279.
2025-03-23[Transforms] Use *Set::insert_range (NFC) (#132652)Kazu Hirata1-2/+1
We can use *Set::insert_range to collapse: for (auto Elem : Range) Set.insert(E); down to: Set.insert_range(Range); In some cases, we can further fold that into the set declaration.
2025-02-22[Reassociate] Use a reference to DataLayout instead of copying the ↵cooperp1-1/+1
underlying string data (NFC) (#128269) I noticed this when looking at all allocations by clang. For a medium sized file this was around 6000 calls to operator new, although i suspect there were more allocations in total as the SmallVectors in DataLayout may have their own allocations in some cases. In a follow-up i'm tempted to make the DataLayout copy constructor private, to avoid this in future. There are a few tests which copy the DataLayout, and perhaps need to (I didn't check yet), but we could provide a clone() method for them if needed. Its only accidental copying I think we should consider avoiding, not people who really do need to copy it for reasons.
2025-01-24[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)Jeremy Morse1-2/+2
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety however, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).
2025-01-21[Reassociate] Don't reassociate vXi1 logical expressions (#123329)Simon Pilgrim1-3/+4
Extends what we already do for i1 types and don't serialize vXi1 logical expressions to improve ILP. llvm-test-suite numbers https://github.com/llvm/llvm-project/issues/64840#issuecomment-2053621740 indicate that both reassociations are a net win. Fixes #64840 Fixes #63946
2024-11-08[DebugInfo][Reassociate] Preserve DebugLocs when reassociating subs (#114226)Stephen Tozer1-0/+2
In NegateValue in Reassociate, we return the negation of an existing value in order to break a subtract into an negate + add, potentially creating a new instruction to perform the negation, but we neglect to propagate the DebugLoc of the sub being replaced to the negate instruction if one is created. This patch adds that propagation. Found using https://github.com/llvm/llvm-project/pull/107279.
2024-07-14[Transforms] Use range-based for loops (NFC) (#98725)Kazu Hirata1-6/+6
2024-07-12[Transforms] Use range-based for loops (NFC) (#98465)Kazu Hirata1-2/+2
2024-07-01[Reassociate] Preserve `nuw` and `nsw` on `mul` chainsNoah Goldstein1-4/+13
Basically the same rules as `add` but we also need to ensure all operands a non-zero. Proofs: https://alive2.llvm.org/ce/z/jzsYht Closes #97040
2024-06-27[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)Nikita Popov1-4/+4
This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.
2024-06-25[Reassociate] Use poison instead of undef for dummy operands (NFCI)Nikita Popov1-3/+3
These will be replaced later.
2024-06-18[Reassociate] Avoid use of ConstantExpr::getShl()Nikita Popov1-1/+2
Use the constant folding API instead.
2024-06-17[DebugInfo][Reassociate] Fix missing debug location drop (#95355)Shan Huang1-0/+6
Fix #95343 .
2024-06-14[llvm] Use llvm::unique (NFC) (#95628)Kazu Hirata1-4/+4
2024-06-08[Reassociate] Use uint64_t for repeat count (#94232)Yingwei Zheng1-108/+12
This patch relands #91469 and uses `uint64_t` for repeat count to avoid a miscompilation caused by overflow https://github.com/llvm/llvm-project/pull/91469#discussion_r1623925158.
2024-06-03Revert "[Reassociate] Drop weight reduction to fix issue 91417 (#91469)" ↵Yingwei Zheng1-1/+92
(#94210) Reverts https://github.com/llvm/llvm-project/commit/3bcccb6af685c3132a9ee578b9e11b2503c35a5c and https://github.com/llvm/llvm-project/commit/9a282724a29899e84adc91bdeaf639010408a80d because #91469 causes a miscompilation https://github.com/llvm/llvm-project/pull/91469#discussion_r1623925158.
2024-05-29[Reassociate] Drop weight reduction to fix issue 91417 (#91469)Yingwei Zheng1-111/+1
See the following case: https://alive2.llvm.org/ce/z/A-fBki ``` define i3 @src(i3 %0) { %2 = mul i3 %0, %0 %3 = mul i3 %2, %0 %4 = mul i3 %3, %0 %5 = mul nsw i3 %4, %0 ret i3 %5 } define i3 @tgt(i3 %0) { %2 = mul i3 %0, %0 %5 = mul nsw i3 %2, %0 ret i3 %5 } ``` https://github.com/llvm/llvm-project/commit/d7aeefebd6b049f017711cd7c6ef5f217a17b673 introduced weight reduction during weights combination of the same operand. As the weight of `%0` changes from 5 to 3, the nsw flag in `%5` should be dropped. However, the nsw flag isn't cleared by `RewriteExprTree` since `%5 = mul nsw i3 %0, %4` is not included in the range of `[ExpressionChangedStart, ExpressionChangedEnd)`. ``` Calculated Rank[] = 3 Combine negations for: %2 = mul i3 %0, %0 Calculated Rank[] = 4 Combine negations for: %3 = mul i3 %0, %2 Calculated Rank[] = 5 Combine negations for: %4 = mul i3 %0, %3 Calculated Rank[] = 6 Combine negations for: %5 = mul nsw i3 %0, %4 LINEARIZE: %5 = mul nsw i3 %0, %4 OPERAND: i3 %0 (1) ADD USES LEAF: i3 %0 (1) OPERAND: %4 = mul i3 %0, %3 (1) DIRECT ADD: %4 = mul i3 %0, %3 (1) OPERAND: i3 %0 (1) OPERAND: %3 = mul i3 %0, %2 (1) DIRECT ADD: %3 = mul i3 %0, %2 (1) OPERAND: i3 %0 (1) OPERAND: %2 = mul i3 %0, %0 (1) DIRECT ADD: %2 = mul i3 %0, %0 (1) OPERAND: i3 %0 (1) OPERAND: i3 %0 (1) RAIn: mul i3 [ %0, #3] [ %0, #3] [ %0, #3] RAOut: mul i3 [ %0, #3] [ %0, #3] [ %0, #3] RAOut after CSE reorder: mul i3 [ %0, #3] [ %0, #3] [ %0, #3] RA: %5 = mul nsw i3 %0, %4 TO: %5 = mul nsw i3 %4, %0 RA: %4 = mul i3 %0, %3 TO: %4 = mul i3 %0, %0 ``` The best way to fix this is to inform `RewriteExprTree` to clear flags of the whole expr tree when weight reduction happens. But I find that weight reduction based on Carmichael number never happens in practice. See the coverage result https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/home/dtcxzyw/llvm-project/llvm/lib/Transforms/Scalar/Reassociate.cpp.html#L323 I think it would be better to drop `IncorporateWeight`. Fixes #91417
2024-05-28[Reassociate] Preserve NSW flags after expr tree rewriting (#93105)Akshay Deodhar1-13/+22
We can guarantee NSW on all operands in a reassociated add expression tree when: - All adds in an add operator tree are NSW, AND either - All add operands are guaranteed to be nonnegative, OR - All adds are also NUW - Alive2: - Nonnegative Operands - 3 operands: https://alive2.llvm.org/ce/z/G4XW6Q - 4 operands: https://alive2.llvm.org/ce/z/FWcZ6D - NUW NSW adds: https://alive2.llvm.org/ce/z/vRUxeC --------- Co-authored-by: Nikita Popov <github@npopov.com>
2024-03-05[NFC][RemoveDIs] Insert instruction using iterators in Transforms/Jeremy Morse1-25/+30
As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors. There are two general flavours of update: * Almost all call-sites just call getIterator on an instruction * Several make use of an existing iterator (scenarios where the code is actually significant for debug-info) The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block need to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way. Noteworthy changes: * FindInsertedValue now takes an optional iterator rather than an instruction pointer, as we need to always insert with iterators, * I've added a few iterator-taking versions of some value-tracking and DomTree methods -- they just unwrap the iterator. These are purely convenience methods to avoid extra syntax in some passes. * A few calls to getNextNode become std::next instead (to keep in the theme of using iterators for positions), * SeparateConstOffsetFromGEP has it's insertion-position field changed. Noteworthy because it's not a purely localised spelling change. All this should be NFC.
2024-02-29[NFC][RemoveDIs] Have CreateNeg only accept iterators (#82999)Jeremy Morse1-3/+5
Removing debug-intrinsics requires that we always insert with an iterator, not with an instruction position. To enforce that, we need to eliminate the `Instruction *` taking functions. It's safe to leave the insert-at-end-of-block functions as the intention is clear for debug info purposes (i.e., insert after both instructions and debug-info at the end of the function). This patch demonstrates how that needs to happen. At a variety of call-sites to the `CreateNeg` constructor we need to consider: * Has this instruction been selected because of the operation it performs? In that case, just call `getIterator` and pass an iterator in. * Has this instruction been selected because of it's position? If so, we need to keep the iterator identifying that position (see the 3rd hunk changing Reassociate.cpp, although it's coincidentally not debug-info significant). This also demonstrates what we'll try and do with the constructor methods going forwards: have one fully explicit set of parameters including iterator, and another with default-arguments where the block-to-insert-into argument defaults to nullptr / no-position, creating an instruction that hasn't been inserted yet.
2023-12-09[Reassociate] Preserve NUW flags after expr tree rewriting (#72360)Yingwei Zheng1-9/+22
Alive2: https://alive2.llvm.org/ce/z/38KiC_
2023-12-06Recommit "[Reassociate] Use disjoint flag to convert Or to Add. (#72772)"Craig Topper1-3/+4
Original message: We still have to keep the noCommonBitsSet call to handle multiple reassociations in one pass. We'll lose the flag on the first reassociation.
2023-12-06Revert "[Reassociate] Use disjoint flag to convert Or to Add. (#72772)"Craig Topper1-4/+3
This reverts commit 78964457cf1bafe57a54629fafbd081452a9e528. Looks like I didn't rebase this correctly before commit
2023-12-06[Reassociate] Use disjoint flag to convert Or to Add. (#72772)Craig Topper1-3/+4
We still have to keep the noCommonBitsSet call to handle multiple reassociations in one pass. We'll lose the flag on the first reassociation.
2023-12-04[IR][TRE] Support associative intrinsics (#74226)Joshua Cao1-1/+1
There is support for intrinsics in Instruction::isCommunative, but there is no equivalent implementation for isAssociative. This patch builds support for associative intrinsics with TRE as an application. TRE can now have associative intrinsics as an accumulator. For example: ``` struct Node { Node *next; unsigned val; } unsigned maxval(struct Node *n) { if (!n) return 0; return std::max(n->val, maxval(n->next)); } ``` Can be transformed into: ``` unsigned maxval(struct Node *n) { struct Node *head = n; unsigned max = 0; // Identity of unsigned std::max while (true) { if (!head) return max; max = std::max(max, head->val); head = head->next; } return max; } ``` This example results in about 5x speedup in local runs. We conservatively only consider min/max and as associative for this patch to limit testing scope. There are probably other intrinsics that could be considered associative. There are a few consumers of isAssociative() that could be impacted. Testing has only required to Reassociate pass be updated.
2023-11-30[DebugInfo][RemoveDIs] Have getInsertionPtAfterDef return an iterator (#73149)Jeremy Morse1-5/+9
Part of the "RemoveDIs" project to remove debug intrinsics requires passing block-positions around in iterators rather than as instruction pointers, allowing some debug-info to reside in BasicBlock::iterator. This means getInsertionPointAfterDef has to return an iterator, and as it can return no-instruction that means returning an optional iterator. This patch changes the signature for getInsertionPtAfterDef and then patches up the various places that use it to handle the different type. This would overall be an NFC patch, however in InstCombinerImpl::freezeOtherUses I've started skipping any debug intrinsics at the returned insert-position. This should not have any _meaningful_ effect on the compiler output: at worst it means variable assignments that are skipped will now cover the freeze instruction and anything inserted before it, which should be inconsequential. Sadly: this makes the function signature ugly. This is probably the ugliest piece of fallout for the "RemoveDIs" work, but it serves the overall purpose of improving compile times and not allowing `-g` to affect compiler output, so should be worthwhile in the end.
2023-10-10[ValueTracking] Use SimplifyQuery in haveNoCommonBitsSet() (NFC)Nikita Popov1-2/+2
Pass SimplifyQuery instead of unpacked list of arguments.
2023-07-03[Reassociate] Keep flags for more unchanged operationsDavid Green1-22/+39
Reassociation destroys nsw/nuw flags from BinOps that are changed. But if the expression at the end of a tree that was altered, but didn't change itself, the flags do not need to be removed. For example, if %a, %b and %c are reassociated in %x = add nsw i32 %a, %c %y = add nsw i32 %x, %b %z = add nsw i32 %y, %d The value of %y and so add %y %d remains the same, and %z needn't drop the nsw flags. https://alive2.llvm.org/ce/z/_juAiV Differential Revision: https://reviews.llvm.org/D154289
2023-06-26[Reassociation] Only form CSE expressions for local operandsQuentin Colombet1-3/+85
# TL;DR # This patch constrains how much freedom the heuristic that tries to from CSE expressions has. The added constrain is that the CSE-able expressions must be within the same basic block as the expressions they get moved before. # Details # The reassociation pass currently tweaks the rewrite of the final expression towards surfacing pairs of operands that would be CSE-able. This heuristic applies after the regular ordering of the expression. The regular ordering uses the program structure to choose in which order each subexpression is materialized. That order follows the topological order. Now, to expose more CSE opportunities, this heurisitc effectively bypasses the previous ordering normally defined by the program and pushes up sub-expressions that are arbitrary deep in the CFG. E.g., let's say the program order (top to bottom) gives `((a*b)*c)*d)*e` and `b*e` appears the most in the program. The expression will be reordered in `(((b*e)*a)*c)*d` This reordering implies that all the sub expressions (in this example `xx*a`, then `yy*c`, etc.) will need to appear after the CSE-able expression. This may over-constrain where the (sub) expressions may live and in particular it may create loop-dependent expressions. This patch only allows to move expressions up the expression chain when the related values are definied in the same basic block as the ones they "push-down". This constrain is far for being perfect but at least it avoids accidentally creating loop dependent variables. If we really want to expose CSE-able expressions in a proper way, we would need a profitability metric and also make the decision globally as opposed to one chain at a time. I've put the new constrain behind an option to make comparing the old and new versions easy. However, I believe that even if we find cases where the old version performs better it is probably by accident. What I am aiming for with this change is more predictability, then we can improve if need be. This fixes www.llvm.org/PR61458 Differential Revision: https://reviews.llvm.org/D147457
2023-04-16[Scalar] Use range-based for loops (NFC)Kazu Hirata1-6/+3
2023-03-14[Transforms] Use *{Set,Map}::contains (NFC)Kazu Hirata1-1/+1
2022-09-12[Reassociate] prevent partial undef negation replacementSanjay Patel1-0/+6
As shown in the examples in issue #57683, we allow matching vectors with poison (undef) in this transform (and possibly more), but we can't then use the partially defined value as a replacement value in other expressions blindly. This seems to be avoided in simpler examples of reassociation, and other passes should be able to clean up the redundant op seen in these tests.
2022-09-07[Reassociate] Avoid ConstantExpr::getFNeg() (NFCI)Nikita Popov1-5/+11
Use ConstantFoldUnaryOpOperand() instead. Also make the code below robust against non-instruction users, just in case it doesn't fold.
2022-08-31[Reassociate] Use getInsertionPointerAfterDef()Nikita Popov1-30/+6
This simplifies the code and fixes handling for the callbr case, where the instruction needs to be inserted in the normal destination, rather than after the terminator. Originally part of D129660.
2022-08-27[Transform] Use range-based for loops (NFC)Kazu Hirata1-4/+4
2022-08-07[llvm] Qualify auto (NFC)Kazu Hirata1-1/+1
Identified with readability-qualified-auto.
2022-07-25[Reassociate][NFC] Use an appropriate dyn_cast for BinaryOperatorWarren Ristow1-9/+9
In D129523, it was noted that there is are some questionable naked casts from Instruction to BinaryOperator, which could be addressed by doing a dyn_cast directly to BinaryOperator, avoiding the need for the later cast. This cleans up that casting. Reviewed By: nikic, spatel, RKSimon Differential Revision: https://reviews.llvm.org/D130448
2022-07-24[Reassociate][NFC] Consistent checking for FastMathFlags suitabilityWarren Ristow1-3/+3
In D129523, it was noted that the approach to check whether a value can have FastMathFlags was done in different ways, and they should be made consistent. This patch makes minor changes to fix that. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D130408
2022-07-15[Reassociate] Enable FP reassociation via 'reassoc' and 'nsz'Warren Ristow1-5/+15
Compiling with '-ffast-math' tuns on all the FastMathFlags (FMF), as expected, and that enables FP reassociation. Only the two FMF flags 'reassoc' and 'nsz' are technically required to perform reassociation, but disabling other unrelated FMF bits is needlessly suppressing the optimization. This patch fixes that needless suppression, and makes appropriate adjustments to test-cases, fixing some outstanding TODOs in the process. Fixes: #56483 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D129523