aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Analysis/CGSCCPassManager.cpp
diff options
context:
space:
mode:
authorArthur Eubanks <aeubanks@google.com>2024-06-11 10:50:13 -0600
committerGitHub <noreply@github.com>2024-06-11 09:50:13 -0700
commit71497cc7a4695d22fc5dfd64958744816c15a19e (patch)
treed11ee580a32384decfe7fb2845b829128ff06deb /llvm/lib/Analysis/CGSCCPassManager.cpp
parenta03e93e1b2172791085f3f8c293b8e5d6ed8d841 (diff)
downloadllvm-71497cc7a4695d22fc5dfd64958744816c15a19e.zip
llvm-71497cc7a4695d22fc5dfd64958744816c15a19e.tar.gz
llvm-71497cc7a4695d22fc5dfd64958744816c15a19e.tar.bz2
[CGSCC] Fix compile time blowup with large RefSCCs (#94815)
In some modules, e.g. Kotlin-generated IR, we end up with a huge RefSCC and the call graph updates done as a result of the inliner take a long time. This is due to RefSCC::removeInternalRefEdges() getting called many times, each time removing one function from the RefSCC, but each call to removeInternalRefEdges() is proportional to the size of the RefSCC. There are two places that call removeInternalRefEdges(), in updateCGAndAnalysisManagerForPass() and LazyCallGraph::removeDeadFunction(). 1) Since LazyCallGraph can deal with spurious (edges that exist in the graph but not in the IR) ref edges, we can simply not call removeInternalRefEdges() in updateCGAndAnalysisManagerForPass(). 2) LazyCallGraph::removeDeadFunction() still ends up taking the brunt of compile time with the above change for the original reason. So instead we batch all the dead function removals so we can call removeInternalRefEdges() just once. This requires some changes to callers of removeDeadFunction() to not actually erase the function from the module, but defer it to when we batch delete dead functions at the end of the CGSCC run, leaving the function body as "unreachable" in the meantime. We still need to ensure that call edges are accurate. I had also tried deleting dead functions after visiting a RefSCC, but deleting them all at once at the end was simpler. Many test changes are due to not performing unnecessary revisits of an SCC (the CGSCC infrastructure deems ref edge refinements as unimportant when it comes to revisiting SCCs, although that seems to not be consistently true given these changes) because we don't remove some ref edges. Specifically for devirt-invalidated.ll this seems to expose an inlining order issue with the inliner. Probably unimportant for this type of intentionally weird call graph. Compile time: https://llvm-compile-time-tracker.com/compare.php?from=6f2c61071c274a1b5e212e6ad4114641ec7c7fc3&to=b08c90d05e290dd065755ea776ceaf1420680224&stat=instructions:u
Diffstat (limited to 'llvm/lib/Analysis/CGSCCPassManager.cpp')
-rw-r--r--llvm/lib/Analysis/CGSCCPassManager.cpp42
1 files changed, 9 insertions, 33 deletions
diff --git a/llvm/lib/Analysis/CGSCCPassManager.cpp b/llvm/lib/Analysis/CGSCCPassManager.cpp
index 8ae5c3d..2ed1d98 100644
--- a/llvm/lib/Analysis/CGSCCPassManager.cpp
+++ b/llvm/lib/Analysis/CGSCCPassManager.cpp
@@ -158,10 +158,12 @@ ModuleToPostOrderCGSCCPassAdaptor::run(Module &M, ModuleAnalysisManager &AM) {
SmallDenseSet<std::pair<LazyCallGraph::Node *, LazyCallGraph::SCC *>, 4>
InlinedInternalEdges;
+ SmallVector<Function *, 4> DeadFunctions;
+
CGSCCUpdateResult UR = {
- RCWorklist, CWorklist, InvalidRefSCCSet,
- InvalidSCCSet, nullptr, PreservedAnalyses::all(),
- InlinedInternalEdges, {}};
+ RCWorklist, CWorklist, InvalidRefSCCSet,
+ InvalidSCCSet, nullptr, PreservedAnalyses::all(),
+ InlinedInternalEdges, DeadFunctions, {}};
// Request PassInstrumentation from analysis manager, will use it to run
// instrumenting callbacks for the passes later.
@@ -340,6 +342,10 @@ ModuleToPostOrderCGSCCPassAdaptor::run(Module &M, ModuleAnalysisManager &AM) {
} while (!RCWorklist.empty());
}
+ CG.removeDeadFunctions(DeadFunctions);
+ for (Function *DeadF : DeadFunctions)
+ DeadF->eraseFromParent();
+
#if defined(EXPENSIVE_CHECKS)
// Verify that the call graph is still valid.
CG.verify();
@@ -1030,36 +1036,6 @@ static LazyCallGraph::SCC &updateCGAndAnalysisManagerForPass(
return true;
});
- // Now do a batch removal of the internal ref edges left.
- auto NewRefSCCs = RC->removeInternalRefEdge(N, DeadTargets);
- if (!NewRefSCCs.empty()) {
- // The old RefSCC is dead, mark it as such.
- UR.InvalidatedRefSCCs.insert(RC);
-
- // Note that we don't bother to invalidate analyses as ref-edge
- // connectivity is not really observable in any way and is intended
- // exclusively to be used for ordering of transforms rather than for
- // analysis conclusions.
-
- // Update RC to the "bottom".
- assert(G.lookupSCC(N) == C && "Changed the SCC when splitting RefSCCs!");
- RC = &C->getOuterRefSCC();
- assert(G.lookupRefSCC(N) == RC && "Failed to update current RefSCC!");
-
- // The RC worklist is in reverse postorder, so we enqueue the new ones in
- // RPO except for the one which contains the source node as that is the
- // "bottom" we will continue processing in the bottom-up walk.
- assert(NewRefSCCs.front() == RC &&
- "New current RefSCC not first in the returned list!");
- for (RefSCC *NewRC : llvm::reverse(llvm::drop_begin(NewRefSCCs))) {
- assert(NewRC != RC && "Should not encounter the current RefSCC further "
- "in the postorder list of new RefSCCs.");
- UR.RCWorklist.insert(NewRC);
- LLVM_DEBUG(dbgs() << "Enqueuing a new RefSCC in the update worklist: "
- << *NewRC << "\n");
- }
- }
-
// Next demote all the call edges that are now ref edges. This helps make
// the SCCs small which should minimize the work below as we don't want to
// form cycles that this would break.