aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Transforms/Utils/LoopSimplify.cpp
diff options
context:
space:
mode:
authorYaxun (Sam) Liu <yaxun.liu@amd.com>2022-09-17 17:57:35 -0400
committerYaxun (Sam) Liu <yaxun.liu@amd.com>2022-09-18 20:21:14 -0400
commite5581df60a35fffb0c69589777e4e126c849405f (patch)
tree83923a7eff0231df1b1e198bfb1865575a2de95b /llvm/lib/Transforms/Utils/LoopSimplify.cpp
parent930315f6aa587ac962183708844eb2390d5ba55e (diff)
downloadllvm-e5581df60a35fffb0c69589777e4e126c849405f.zip
llvm-e5581df60a35fffb0c69589777e4e126c849405f.tar.gz
llvm-e5581df60a35fffb0c69589777e4e126c849405f.tar.bz2
[SimplifyCFG] accumulate bonus insts cost
SimplifyCFG folds bool foo() { if (cond1) return false; if (cond2) return false; return true; } as bool foo() { if (cond1 | cond2) return false return true; } 'cond2' is called 'bonus insts' in branch folding since they introduce overhead since the original CFG could do early exit but the folded CFG always executes them. SimplifyCFG calculates the costs of 'bonus insts' of a folding a BB into its predecessor BB which shares the destination. If it is below bonus-inst-threshold, SimplifyCFG will fold that BB into its predecessor and cond2 will always be executed. When SimplifyCFG calculates the cost of 'bonus insts', it only consider 'bonus' insts in the current BB to be considered for folding. This causes issue for unrolled loops which share destinations, e.g. bool foo(int *a) { for (int i = 0; i < 32; i++) if (a[i] > 0) return false; return true; } After unrolling, it becomes bool foo(int *a) { if(a[0]>0) return false if(a[1]>0) return false; //... if(a[31]>0) return false; return true; } SimplifyCFG will merge each BB with its predecessor BB, and ends up with 32 'bonus insts' which are always executed, which is much slower than the original CFG. The root cause is that SimplifyCFG does not consider the accumulated cost of 'bonus insts' which are folded from different BB's. This patch fixes that by introducing a ValueMap to track costs of 'bonus insts' coming from different BB's into the same BB, and cuts off if the accumulated cost exceeds a threshold. Reviewed by: Artem Belevich, Florian Hahn, Nikita Popov, Matt Arsenault Differential Revision: https://reviews.llvm.org/D132408
Diffstat (limited to 'llvm/lib/Transforms/Utils/LoopSimplify.cpp')
-rw-r--r--llvm/lib/Transforms/Utils/LoopSimplify.cpp3
1 files changed, 2 insertions, 1 deletions
diff --git a/llvm/lib/Transforms/Utils/LoopSimplify.cpp b/llvm/lib/Transforms/Utils/LoopSimplify.cpp
index 363f3c5..542094f 100644
--- a/llvm/lib/Transforms/Utils/LoopSimplify.cpp
+++ b/llvm/lib/Transforms/Utils/LoopSimplify.cpp
@@ -480,6 +480,7 @@ static bool simplifyOneLoop(Loop *L, SmallVectorImpl<Loop *> &Worklist,
DominatorTree *DT, LoopInfo *LI,
ScalarEvolution *SE, AssumptionCache *AC,
MemorySSAUpdater *MSSAU, bool PreserveLCSSA) {
+ SimplifyCFGCostTracker CostTracker;
bool Changed = false;
if (MSSAU && VerifyMemorySSA)
MSSAU->getMemorySSA()->verifyMemorySSA();
@@ -666,7 +667,7 @@ ReprocessLoop:
// The block has now been cleared of all instructions except for
// a comparison and a conditional branch. SimplifyCFG may be able
// to fold it now.
- if (!FoldBranchToCommonDest(BI, /*DTU=*/nullptr, MSSAU))
+ if (!FoldBranchToCommonDest(BI, CostTracker, /*DTU=*/nullptr, MSSAU))
continue;
// Success. The block is now dead, so remove it from the loop,