path: root/llvm/lib/Transforms/Scalar/LoopFuse.cpp
Age Commit message Author Files Lines
6 days[LoopFusion] Detecting legal dependencies for fusion using DA info (#146383)Alireza Torabian1-0/+42
Loop fusion pass will use the information provided by the recent DA patch to fuse additional legal loops, including those with forward loop-carried dependencies.
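As a rough illustration of what a forward loop-carried dependence looks like (a sketch under assumptions, not code from the patch): the second loop reads a value that the first loop wrote in an earlier iteration, so after fusion the write still happens before the read. The function names `runUnfused`, `runFused`, and `checkForwardFusion` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration; none of these names come from LoopFuse.cpp.

// Two separate loops: the first writes a[i], the second reads a[i - 1].
std::vector<int> runUnfused(std::size_t n) {
  std::vector<int> a(n, 0), b(n, 0);
  for (std::size_t i = 1; i < n; ++i)
    a[i] = static_cast<int>(i) * 2;
  for (std::size_t i = 1; i < n; ++i)
    b[i] = a[i - 1] + 1;
  return b;
}

// Fused loop: iteration i reads a[i - 1], which iteration i - 1 has already
// written (a[0] is never written in either version), so the read still sees
// the same value as in the unfused code.
std::vector<int> runFused(std::size_t n) {
  std::vector<int> a(n, 0), b(n, 0);
  for (std::size_t i = 1; i < n; ++i) {
    a[i] = static_cast<int>(i) * 2;
    b[i] = a[i - 1] + 1;
  }
  return b;
}

// Sanity check that fusion did not change the observable result.
void checkForwardFusion(std::size_t n) {
  assert(runUnfused(n) == runFused(n));
}
```

If the second loop instead read `a[i + 1]`, the fused loop would read the element before it is written, which is the kind of backward dependence that would make fusion illegal.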
2025-07-28[LoopFusion] Fix sink instructions (#147501)Madhur Amilkanthwar1-0/+25
If we have instructions in the second loop's preheader which can be sunk, we should also adjust PHI nodes to receive values from the fused loop's latch block. Fixes #128600
2025-05-17Reapply "[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)"Florian Hahn1-1/+2
This reverts the revert commit bf92b127d2637948f53d11a187e865aa10e2e74c. This adds missing initialization of PeelLast in gatherPeelingPreferences. Original message: Generalize countToEliminateCompares to also consider peeling off the last iteration if it eliminates a compare. At the moment, codegen for peeling off the last iteration is quite restrictive and callers have to make sure that the exit condition can be adjusted when peeling and that the loop executes at least 2 iterations. Both will be relaxed in follow-ups. PR: https://github.com/llvm/llvm-project/pull/139551
2025-05-16Revert "[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)"Florian Hahn1-2/+1
This reverts commit bb10c3ba7f77d40a7fbfd4ac815015d3a4ae476a. Also reverts 4f663cca15f2b53c2bc6a84d1b1f5bd81679356d: Revert "[LoopPeel] Make sure PeelLast is always initialized." Revert for now to bring msan bots back to green: https://lab.llvm.org/buildbot/#/builders/164/builds/9992 https://lab.llvm.org/buildbot/#/builders/94/builds/7158
2025-05-15[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)Florian Hahn1-1/+2
Generalize countToEliminateCompares to also consider peeling off the last iteration if it eliminates a compare. At the moment, codegen for peeling off the last iteration is quite restrictive and callers have to make sure that the exit condition can be adjusted when peeling and that the loop executes at least 2 iterations. Both will be relaxed in follow-ups. PR: https://github.com/llvm/llvm-project/pull/139551
2025-03-16[LoopFuse] Change placeholder from `undef` to `poison` (#131535)Pedro Lobo1-1/+1
Use `poison` instead of `undef` as a placeholder for phi entries of unreachable predecessors.
2025-02-11[DependenceAnalysis][NFC] Removing PossiblyLoopIndependent parameter (#124615)Alireza Torabian1-6/+6
Parameter PossiblyLoopIndependent has lost its intended purpose. This flag is always set to true in all cases when depends() is called, hence we want to reconsider the utility of this variable and remove it from the function signature entirely. This is an NFC patch.
2025-01-24[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)Jeremy Morse1-2/+2
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).
2024-11-02[Scalar] Remove unused includes (NFC) (#114645)Kazu Hirata1-1/+0
Identified with misc-include-cleaner.
2024-08-13Reland "[Support] Assert that DomTree nodes share parent" (#102782)Vitaly Buka1-2/+6
A dominance query of a block that is in a different function is ill-defined, so assert that getNode() is only called for blocks that are in the same function. There are three cases where this behavior did occur. LoopFuse didn't explicitly do this, but didn't invalidate the SCEV block dispositions, leaving dangling pointers to freed basic blocks behind, causing use-after-free. We do, however, want to be able to dereference basic blocks inside the dominator tree, so that we can refer to them by a number stored inside the basic block. Reverts #102780 Reland #101198 Fixes #102784 Co-authored-by: Alexis Engelke <engelke@in.tum.de>
2024-08-10Revert "[Support] Assert that DomTree nodes share parent" (#102780)Vitaly Buka1-6/+2
Reverts llvm/llvm-project#101198 Breaks multiple bots: https://lab.llvm.org/buildbot/#/builders/72/builds/2103 https://lab.llvm.org/buildbot/#/builders/164/builds/1909 https://lab.llvm.org/buildbot/#/builders/66/builds/2706
2024-08-10[Support] Assert that DomTree nodes share parent (#101198)Alexis Engelke1-2/+6
A dominance query of a block that is in a different function is ill-defined, so assert that getNode() is only called for blocks that are in the same function. There are two cases where this behavior did occur. LoopFuse didn't explicitly do this, but didn't invalidate the SCEV block dispositions, leaving dangling pointers to freed basic blocks behind, causing use-after-free. We do, however, want to be able to dereference basic blocks inside the dominator tree, so that we can refer to them by a number stored inside the basic block.
2024-06-30[LoopFuse] Use poison instead of undef as placeholder for phi entry of unreachable predecessor [NFC]Nuno Lopes1-1/+1
2024-06-28[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)Nikita Popov1-1/+1
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.
2023-09-11[NFC][RemoveDIs] Prefer iterator-insertion over instructionsJeremy Morse1-8/+11
Continuing the patch series to get rid of debug intrinsics [0], instruction insertion needs to be done with iterators rather than instruction pointers, so that we can communicate information in the iterator class. This patch adds an iterator-taking insertBefore method and converts various call sites to take iterators. These are all sites where such debug-info needs to be preserved so that a stage2 clang can be built identically; it's likely that many more will need to be changed in the future. At this stage, this is just changing the spelling of a few operations, which will eventually become significant once the debug-info bearing iterator is used. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152537
2023-09-01[llvm] Fix duplicate word typos. NFCFangrui Song1-3/+3
Those fixes were taken from https://reviews.llvm.org/D137338
2023-04-17Remove several no longer needed includes. NFCIBjorn Pettersson1-3/+0
Mostly removing includes of InitializePasses.h and Pass.h in passes that no longer have support for the legacy PM.
2023-02-14[LoopFuse] Remove legacy passFangrui Song1-61/+0
Following recent changes to remove non-core legacy passes.
2023-01-14Use std::nullopt instead of None in comments (NFC)Kazu Hirata1-2/+2
2023-01-11[LoopFusion] Sorting of undominated FusionCandidates crashesRamkrishnan Narayanan Komala1-4/+30
This patch tries to fix https://github.com/llvm/llvm-project/issues/56263. If two FusionCandidates are at the same level of the dominator tree, neither dominates the other, but they are control-flow equivalent. To sort those FusionCandidates, a nonStrictlyPostDominate check is needed. Reviewed By: Narutoworld Differential Revision: https://reviews.llvm.org/D139993
2023-01-03[LoopFusion] Exit early if one of fusion candidate has guarded branch but the another has notluxufan1-3/+4
Fixes: https://github.com/llvm/llvm-project/issues/59024 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D138269
2022-12-19[LoopPeel] Expose ValueMap of last peeled iteration. NFCAnna Thomas1-1/+2
The value map of last peeled iteration is computed within peelLoop API. This patch exposes it for callers of peelLoop. While this is not currently used by upstream passes, we have a usecase downstream which benefits from this API update. Future users of peelLoop can also use the ValueMap if needed. Similar value maps are exposed by other loop utilities such as loop cloning. Differential Revision: https://reviews.llvm.org/D138228
2022-12-16[SCEV] Return ArrayRef for SCEV operands() (NFC)Nikita Popov1-1/+1
Use a consistent type for the operands() methods of different SCEV types. Also make the API consistent by only providing operands(), rather than also providing op_begin() and op_end() for some of them.
2022-12-13[LoopFusion] sink second loop PHIsJoshua Cao1-0/+5
Fixes https://github.com/llvm/llvm-project/issues/59023 PHI nodes that are in the second loop only have the first loop as their predecessor. These PHI nodes should be sunk to the end of the fused loop. If the second loop uses the PHI, then the loops cannot be fused. I don't think this should happen in typical compilation workflows. The PHI will be in a dedicated exit block of the first loop following LCSSA transformations. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D139812
2022-12-13[Transforms/Scalar] llvm::Optional => std::optionalFangrui Song1-4/+4
2022-12-02[Transforms] Use std::nullopt instead of None (NFC)Kazu Hirata1-4/+4
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
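A minimal sketch of what the spelling change looks like at a call site, assuming a function that previously returned `llvm::Optional` and used `None` for the empty case. The helper `firstNegative` is hypothetical and is not taken from LoopFuse.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: returns the index of the first negative value,
// or an empty optional if there is none.
std::optional<std::size_t> firstNegative(const std::vector<int> &Values) {
  for (std::size_t I = 0; I < Values.size(); ++I)
    if (Values[I] < 0)
      return I;
  // With llvm::Optional this would have been spelled `return None;`.
  return std::nullopt;
}
```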
2022-11-11[LoopFuse] Ensure inner loops are in loop simplified form under new PMMengxuan Cai1-3/+2
LoopInfo doesn't give all loops in a loop nest; it gives top-level loops only, while isLoopSimplifyForm() only checks the outermost loop of a loop nest. As a result, inner loops that are not in simplified form cannot be simplified with the original code. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D137672
2022-10-31[LoopFuse] Ensure loops are in loop simplified form under new PMMengxuan Cai1-1/+14
Loop Fusion (Function Pass) requires loops in simplified form. With legacy-pm, loop-simplify pass is added as a dependency for loop-fusion. But the new pass manager does not always ensure this format. This patch tries to invoke simplifyLoop() on loops that are not in simplified form only for new PM. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D136781
2022-09-19[LoopFuse] Drop loop dispositions before reassigning blocks to other loopMax Kazantsev1-0/+2
This bug was found by recent improvement in SCEV verifier. The code in LoopFuse directly reassigns blocks to be a part of a different loop, which should automatically invalidate all related cached loop dispositions. Differential Revision: https://reviews.llvm.org/D134173 Reviewed By: nikic
2022-09-07Sink/hoist memory instructions between loop fusion candidatesAaron Kogon1-36/+125
Currently, instructions in the preheader of the second of two fusion candidates are sunk and hoisted whenever possible, to try to allow the loops to fuse. Memory instructions are skipped, and are never sunk or hoisted. This change adds memory instructions for sinking/hoisting consideration. This change uses DependenceAnalysis to check if a mem inst in the preheader of FC1 depends on an instruction in FC0's header, across which it will be hoisted, or FC1's header, across which it will be sunk. We reject cases where the dependency is a data hazard. Differential Revision: https://reviews.llvm.org/D131606
2022-08-20Remove redundant initialization of Optional (NFC)Kazu Hirata1-1/+1
2022-08-20[Scalar] Qualify auto in range-based for loops (NFC)Kazu Hirata1-1/+1
Identified with readability-qualified-auto.
2022-08-07[Transforms] Fix comment typos (NFC)Kazu Hirata1-1/+1
2022-08-07[llvm] Fix comment typos (NFC)Kazu Hirata1-1/+1
2022-07-27Sinking or hoisting instructions between loops before fusionAaron Kogon1-15/+143
Instructions between two adjacent loops will be hoisted above the first loop, or sunk below the second to facilitate loop fusion. Hoisting will be attempted for an instruction that dominates the first loop. Otherwise, sinking this instruction will be attempted. Instructions with side effects will not be considered for sinking or hoisting. Hoisting/sinking of any instructions between loops will only be performed if all the instructions can be moved. As well, sinking/hoisting is considered for each instruction in isolation, without taking into account sinking/hoisting decisions for other instructions in the preheader. Differential Revision: https://reviews.llvm.org/D118076
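The all-or-nothing policy described in this message can be sketched roughly as follows. This is a simplified model, not the pass's implementation: `PreheaderInst`, `Move`, and `planMoves` are hypothetical names, and the per-instruction legality flags are assumed to have been computed elsewhere.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical model of an instruction sitting between two loops.
enum class Move { Hoist, Sink };

struct PreheaderInst {
  bool HasSideEffects; // never considered for moving
  bool SafeToHoist;    // assumed precomputed legality of hoisting
  bool SafeToSink;     // assumed precomputed legality of sinking
};

// Decide a move for each instruction in isolation, trying hoisting first.
// If any single instruction cannot be moved, return std::nullopt so that
// nothing is moved at all.
std::optional<std::vector<Move>>
planMoves(const std::vector<PreheaderInst> &Insts) {
  std::vector<Move> Plan;
  for (const PreheaderInst &I : Insts) {
    if (I.HasSideEffects)
      return std::nullopt;
    if (I.SafeToHoist)
      Plan.push_back(Move::Hoist);
    else if (I.SafeToSink)
      Plan.push_back(Move::Sink);
    else
      return std::nullopt;
  }
  return Plan;
}
```

Returning an empty optional rather than a partial plan mirrors the statement that moves are only performed if all instructions can be moved.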
2022-06-05Remove unneeded cl::ZeroOrMore for cl::opt/cl::list optionsFangrui Song1-1/+1
2022-06-04Remove unneeded cl::ZeroOrMore for cl::opt optionsFangrui Song1-1/+1
Similar to 557efc9a8b68628c2c944678c6471dac30ed9e8e. This commit handles options where cl::ZeroOrMore is more than one line below cl::opt.
2022-02-02[LoopFuse] Change DT to reference in FusionCandidate struct. NFCAnna Thomas1-6/+5
Assertion added in f50821cff0 confirms that the DT is indeed nonnull. Change it to a reference instead of a pointer to make this explicit in FusionCandidate. Suggested in D118472.
2022-02-01[LoopFuse] Add assertion for non-null DT in fusion candidateAnna Thomas1-0/+1
The code paths analyzed (all constructor invocations of fusion candidate) pass in a non-null DT. Adding this assert as requested in D118472 before converting this to a reference argument.
2022-02-01[LoopPeel] Use reference instead of pointer for DT argumentAnna Thomas1-1/+1
Cleanup code in peelLoop API. We already have usage of DT without guarding against a null DT, so this change constant folds the remaining null DT checks. Also make the argument a reference so that it is clear the argument is a nonnull DT. Extracted from D118472.
2021-04-26[ADT] Remove StatisticBase and make NoopStatistic emptyFangrui Song1-0/+4
In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic` but has 3 mostly unused pointers. GlobalOpt considers that the pointers can potentially retain allocated objects, so GlobalOpt cannot optimize out the `NoopStatistic` variables (see D69428 for more context), wasting 23KiB for stage 2 clang. This patch makes `NoopStatistic` empty and thus reclaims the wasted space. The clang size is even smaller than applying D69428 (slightly smaller in both .bss and .text).
```
# This means the D69428 optimization on clang is mostly nullified by this patch.
HEAD+D69428: size(.bss) = 0x0725a8
HEAD+D101211: size(.bss) = 0x072238

# bloaty - HEAD+D69428 vs HEAD+D101211
# With D101211, we also save a lot of string table space (.rodata).
    FILE SIZE        VM SIZE
 --------------  --------------
  -0.0%     -32  -0.0%     -24    .eh_frame
  -0.0%    -336  [ = ]       0    .symtab
  -0.0%    -360  [ = ]       0    .strtab
  [ = ]       0  -0.2%    -880    .bss
  -0.0% -2.11Ki  -0.0% -2.11Ki    .rodata
  -0.0% -2.89Ki  -0.0% -2.89Ki    .text
  -0.0% -5.71Ki  -0.0% -5.88Ki    TOTAL
```
Note: LoopFuse is a disabled pass. For now this patch adds `#if LLVM_ENABLE_STATS` so `OptimizationRemarkMissed` is skipped in LLVM_ENABLE_STATS==0 builds. If these `OptimizationRemarkMissed` are useful in LLVM_ENABLE_STATS==0 builds, we can replace `llvm::Statistic` with `llvm::TrackingStatistic`, or use a different abstraction to keep track of the strings. Similarly, skip the code in `mlir/lib/Pass/PassStatistics.cpp` which calls `getName`/`getDesc`/`getValue`. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D101211
2021-04-26Revert "[ADT] Remove StatisticBase and make NoopStatistic empty"Lei Zhang1-4/+0
This reverts commit b5403117814a7c39b944839e10492493f2ceb4ac because it breaks MLIR build: https://buildkite.com/mlir/mlir-core/builds/13299#ad0f8901-dfa4-43cf-81b8-7940e2c6c15b
2021-04-26[ADT] Remove StatisticBase and make NoopStatistic emptyFangrui Song1-0/+4
In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic` but has 3 unused pointers. GlobalOpt considers that the pointers can potentially retain allocated objects, so GlobalOpt cannot optimize out the `NoopStatistic` variables (see D69428 for more context), wasting 23KiB for stage 2 clang. This patch makes `NoopStatistic` empty and thus reclaims the wasted space. The clang size is even smaller than applying D69428 (slightly smaller in both .bss and .text).
```
# This means the D69428 optimization on clang is mostly nullified by this patch.
HEAD+D69428: size(.bss) = 0x0725a8
HEAD+D101211: size(.bss) = 0x072238

# bloaty - HEAD+D69428 vs HEAD+D101211
# With D101211, we also save a lot of string table space (.rodata).
    FILE SIZE        VM SIZE
 --------------  --------------
  -0.0%     -32  -0.0%     -24    .eh_frame
  -0.0%    -336  [ = ]       0    .symtab
  -0.0%    -360  [ = ]       0    .strtab
  [ = ]       0  -0.2%    -880    .bss
  -0.0% -2.11Ki  -0.0% -2.11Ki    .rodata
  -0.0% -2.89Ki  -0.0% -2.89Ki    .text
  -0.0% -5.71Ki  -0.0% -5.88Ki    TOTAL
```
Note: LoopFuse is a disabled pass. This patch adds `#if LLVM_ENABLE_STATS` so `OptimizationRemarkMissed` is skipped in LLVM_ENABLE_STATS==0 builds. If these `OptimizationRemarkMissed` are useful and not noisy, we can replace `llvm::Statistic` with `llvm::TrackingStatistic` in the future. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D101211
2021-04-06[LoopFusion] Bails out if only the second candidate is guarded (PR48060)Ta-Wei Tu1-0/+10
If only the second candidate loop is guarded while the first one is not, fusing the two loops might not be valid, but this check is currently missing. Fixes https://bugs.llvm.org/show_bug.cgi?id=48060 Reviewed By: sidbav Differential Revision: https://reviews.llvm.org/D99716
2021-01-02[Transforms] Construct SmallVector with iterator ranges (NFC)Kazu Hirata1-4/+2
2020-11-15[Loop Fusion] Use pred_empty and succ_empty (NFC)Kazu Hirata1-10/+7
2020-09-22[LoopInfo] empty() -> isInnermost(), add isOutermost()Stefanos Baziotis1-2/+2
Differential Revision: https://reviews.llvm.org/D82895
2020-07-31[Loop Peeling] Separate the Loop Peeling Utilities from the Loop Unrolling UtilitiesSidharth Baveja1-1/+1
Summary: This patch separates the Loop Peeling Utilities from Loop Unrolling. The reason for this change is that Loop Peeling is no longer only being used by loop unrolling; Patch D82927 introduces loop peeling with fusion, such that loops can be modified to have the same trip count, making them legal to be peeled. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D83056
2020-07-23[Loop Fusion] Integrate Loop Peeling into Loop Fusion (re-land after fixing ASAN build failures)Sidharth Baveja1-41/+261
This patch adds the ability to peel off iterations of the first loop in loop fusion. This can allow for both loops to have the same trip count, making it legal for them to be fused together. Here is a simple scenario where peeling can be used in loop fusion:

    for (i = 0; i < 10; ++i)
      a[i] = a[i] + 3;
    for (j = 1; j < 10; ++j)
      b[j] = b[j] + 5;

Here we can make use of peeling, and then fuse the two loops together. We can peel off the 0th iteration of loop i, and then combine loops i and j for i = 1 up to (but not including) 10:

    a[0] = a[0] + 3;
    for (i = 1; i < 10; ++i) {
      a[i] = a[i] + 3;
      b[i] = b[i] + 5;
    }

Currently peeling with loop fusion is only supported for loops with constant trip counts and a single exit point. Both unguarded and guarded loops are supported. Reviewed By: bmahjour (Bardia Mahjour), MaskRay (Fangrui Song) Differential Revision: https://reviews.llvm.org/D82927
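The peel-then-fuse scenario from this commit message can be checked with a small standalone sketch. It is written at the source level purely for illustration (the pass itself works on LLVM IR); `Arr`, `runOriginal`, and `runPeeledAndFused` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <utility>

using Arr = std::array<int, 10>;

// The two original loops: trip counts 10 and 9.
std::pair<Arr, Arr> runOriginal(Arr a, Arr b) {
  for (int i = 0; i < 10; ++i)
    a[i] = a[i] + 3;
  for (int j = 1; j < 10; ++j)
    b[j] = b[j] + 5;
  return {a, b};
}

// Iteration 0 of the first loop is peeled off, so both loops now run for
// i = 1..9 and their bodies can share one fused loop.
std::pair<Arr, Arr> runPeeledAndFused(Arr a, Arr b) {
  a[0] = a[0] + 3; // peeled iteration
  for (int i = 1; i < 10; ++i) {
    a[i] = a[i] + 3;
    b[i] = b[i] + 5;
  }
  return {a, b};
}
```

Both functions take their arrays by value, so the same inputs can be passed to each and the returned pairs compared with `==`.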
2020-07-21Revert D82927 "[Loop Fusion] Integrate Loop Peeling into Loop Fusion"Fangrui Song1-255/+42
This reverts commit bb8850d34d601d4edd75fd30c07821c05a726c42. It broke 3 check-llvm-transforms-loopfusion tests in an ASAN build. LoopFuse.cpp `for (BasicBlock *Pred : predecessors(BB)) {` may operate on a deleted BB.