aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Analysis/LazyCallGraph.cpp
AgeCommit message (Collapse)AuthorFilesLines
2016-09-04[LCG] Clean up and make NDEBUG verify calls more rigorous withChandler Carruth1-32/+38
make_scope_exit now that we have that utility. This makes the code much more clear and readable by isolating the check. It also makes it easy to go through and make sure all the interesting update routines have a start and end verify so we don't slowly let the graph drift into an invalid state. llvm-svn: 280619
2016-09-04[LCG] A NFC refactoring to extract the logic for doingChandler Carruth1-111/+184
a postorder-sequence based update after edge insertion into a generic helper function. This separates the SCC-specific logic into two fairly simple lambdas and extracts the rest into a generic helper template function. I think this is a net win on its own merits because it disentangles different pieces of the algorithm. Now there is one place that does the two-step partition to identify a set of newly connected components and at the same time update the postorder sequence. However, I'm also hoping to re-use this an upcoming patch to update a cached post-order sequence of RefSCCs when doing the analogous update to the RefSCC graph, and I don't want to have two copies. The diff is quite messy but this really is just moving things around and making types generic rather than specific. llvm-svn: 280618
2016-08-24[PM] Introduce basic update capabilities to the new PM's CGSCC passChandler Carruth1-23/+19
manager, including both plumbing and logic to handle function pass updates. There are three fundamentally tied changes here: 1) Plumbing *some* mechanism for updating the CGSCC pass manager as the CG changes while passes are running. 2) Changing the CGSCC pass manager infrastructure to have support for the underlying graph to mutate mid-pass run. 3) Actually updating the CG after function passes run. I can separate them if necessary, but I think its really useful to have them together as the needs of #3 drove #2, and that in turn drove #1. The plumbing technique is to extend the "run" method signature with extra arguments. We provide the call graph that intrinsically is available as it is the basis of the pass manager's IR units, and an output parameter that records the results of updating the call graph during an SCC passes's run. Note that "...UpdateResult" isn't a *great* name here... suggestions very welcome. I tried a pretty frustrating number of different data structures and such for the innards of the update result. Every other one failed for one reason or another. Sometimes I just couldn't keep the layers of complexity right in my head. The thing that really worked was to just directly provide access to the underlying structures used to walk the call graph so that their updates could be informed by the *particular* nature of the change to the graph. The technique for how to make the pass management infrastructure cope with mutating graphs was also something that took a really, really large number of iterations to get to a place where I was happy. Here are some of the considerations that drove the design: - We operate at three levels within the infrastructure: RefSCC, SCC, and Node. In each case, we are working bottom up and so we want to continue to iterate on the "lowest" node as the graph changes. Look at how we iterate over nodes in an SCC running function passes as those function passes mutate the CG. We continue to iterate on the "lowest" SCC, which is the one that continues to contain the function just processed. - The call graph structure re-uses SCCs (and RefSCCs) during mutation events for the *highest* entry in the resulting new subgraph, not the lowest. This means that it is necessary to continually update the current SCC or RefSCC as it shifts. This is really surprising and subtle, and took a long time for me to work out. I actually tried changing the call graph to provide the opposite behavior, and it breaks *EVERYTHING*. The graph update algorithms are really deeply tied to this particualr pattern. - When SCCs or RefSCCs are split apart and refined and we continually re-pin our processing to the bottom one in the subgraph, we need to enqueue the newly formed SCCs and RefSCCs for subsequent processing. Queuing them presents a few challenges: 1) SCCs and RefSCCs use wildly different iteration strategies at a high level. We end up needing to converge them on worklist approaches that can be extended in order to be able to handle the mutations. 2) The order of the enqueuing need to remain bottom-up post-order so that we don't get surprising order of visitation for things like the inliner. 3) We need the worklists to have set semantics so we don't duplicate things endlessly. We don't need a *persistent* set though because we always keep processing the bottom node!!!! This is super, super surprising to me and took a long time to convince myself this is correct, but I'm pretty sure it is... Once we sink down to the bottom node, we can't re-split out the same node in any way, and the postorder of the current queue is fixed and unchanging. 4) We need to make sure that the "current" SCC or RefSCC actually gets enqueued here such that we re-visit it because we continue processing a *new*, *bottom* SCC/RefSCC. - We also need the ability to *skip* SCCs and RefSCCs that get merged into a larger component. We even need the ability to skip *nodes* from an SCC that are no longer part of that SCC. This led to the design you see in the patch which uses SetVector-based worklists. The RefSCC worklist is always empty until an update occurs and is just used to handle those RefSCCs created by updates as the others don't even exist yet and are formed on-demand during the bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and we push new SCCs onto it and blacklist existing SCCs on it to get the desired processing. We then *directly* update these when updating the call graph as I was never able to find a satisfactory abstraction around the update strategy. Finally, we need to compute the updates for function passes. This is mostly used as an initial customer of all the update mechanisms to drive their design to at least cover some real set of use cases. There are a bunch of interesting things that came out of doing this: - It is really nice to do this a function at a time because that function is likely hot in the cache. This means we want even the function pass adaptor to support online updates to the call graph! - To update the call graph after arbitrary function pass mutations is quite hard. We have to build a fairly comprehensive set of data structures and then process them. Fortunately, some of this code is related to the code for building the cal graph in the first place. Unfortunately, very little of it makes any sense to share because the nature of what we're doing is so very different. I've factored out the one part that made sense at least. - We need to transfer these updates into the various structures for the CGSCC pass manager. Once those were more sanely worked out, this became relatively easier. But some of those needs necessitated changes to the LazyCallGraph interface to make it significantly easier to extract the changed SCCs from an update operation. - We also need to update the CGSCC analysis manager as the shape of the graph changes. When an SCC is merged away we need to clear analyses associated with it from the analysis manager which we didn't have support for in the analysis manager infrsatructure. New SCCs are easy! But then we have the case that the original SCC has its shape changed but remains in the call graph. There we need to *invalidate* the analyses associated with it. - We also need to invalidate analyses after we *finish* processing an SCC. But the analyses we need to invalidate here are *only those for the newly updated SCC*!!! Because we only continue processing the bottom SCC, if we split SCCs apart the original one gets invalidated once when its shape changes and is not processed farther so its analyses will be correct. It is the bottom SCC which continues being processed and needs to have the "normal" invalidation done based on the preserved analyses set. All of this is mostly background and context for the changes here. Many thanks to all the reviewers who helped here. Especially Sanjoy who caught several interesting bugs in the graph algorithms, David, Sean, and others who all helped with feedback. Differential Revision: http://reviews.llvm.org/D21464 llvm-svn: 279618
2016-08-12Use the range variant of find/find_if instead of unpacking begin/endDavid Majnemer1-20/+17
If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278469
2016-08-11Use the range variant of find instead of unpacking begin/endDavid Majnemer1-2/+1
If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278433
2016-08-11Use range algorithms instead of unpacking begin/endDavid Majnemer1-2/+1
No functionality change is intended. llvm-svn: 278417
2016-07-07[LCG] Hoist the definitions of the stream operator friends to be inlineChandler Carruth1-42/+0
friend definitions. Based on the experiments Sean Silva and Reid did, this seems the safest course of action and also will work around a questionable warning provided by GCC6 on the old form of the code. Thanks for Davide pointing out the issue and other suggesting ways to fix. llvm-svn: 274740
2016-06-27[PM] Improve the debugging and logging facilities of the CGSCC bits ofChandler Carruth1-0/+53
the new pass manager. This adds operator<< overloads for the various bits of the LazyCallGraph, dump methods for use from the debugger, and debug logging using them to the CGSCC pass manager. Having this was essential for debugging the call graph update patch, and I've extracted what I could from that patch here to minimize the delta. llvm-svn: 273961
2016-06-18Add a super basic LazyCallGraph DOT printer.Sean Silva1-0/+32
Access it through -passes=print-lcg-dot Let me know any suggestions for changing the rendering; I'm not particularly attached to what is implemented here. llvm-svn: 273082
2016-03-11[PM] Make the AnalysisManager parameter to run methods a reference.Chandler Carruth1-2/+2
This was originally a pointer to support pass managers which didn't use AnalysisManagers. However, that doesn't realistically come up much and the complexity of supporting it doesn't really make sense. In fact, *many* parts of the pass manager were just assuming the pointer was never null already. This at least makes it much more explicit and clear. llvm-svn: 263219
2016-03-11[PM] Implement the final conclusion as to how the analysis IDs shouldChandler Carruth1-1/+1
work in the face of the limitations of DLLs and templated static variables. This requires passes that use the AnalysisBase mixin provide a static variable themselves. So as to keep their APIs clean, I've made these private and befriended the CRTP base class (which is the common practice). I've added documentation to AnalysisBase for why this is necessary and at what point we can go back to the much simpler system. This is clearly a better pattern than the extern template as it caught *numerous* places where the template magic hadn't been applied and things were "just working" but would eventually have broken mysteriously. llvm-svn: 263216
2016-02-28[PM] Appease mingw32's auto-import DLL build with minimal tweaks, with fix ↵NAKAMURA Takumi1-0/+2
for clang. char AnalysisBase::ID should be declared as extern and defined in one module. llvm-svn: 262188
2016-02-28Revert r262185, "[PM] Appease mingw32's auto-import DLL build with minimal ↵NAKAMURA Takumi1-2/+0
tweaks." I'll rework soon. llvm-svn: 262186
2016-02-28[PM] Appease mingw32's auto-import DLL build with minimal tweaks.NAKAMURA Takumi1-0/+2
char AnalysisBase::ID should be declared as extern and defined in one module. llvm-svn: 262185
2016-02-26[PM] Introduce CRTP mixin base classes to help define passes andChandler Carruth1-2/+0
analyses in the new pass manager. These just handle really basic stuff: turning a type name into a string statically that is nice to print in logs, and getting a static unique ID for each analysis. Sadly, the format of passes in anonymous namespaces makes using their names in tests really annoying so I've customized the names of the no-op passes to keep tests sane to read. This is the first of a few simplifying refactorings for the new pass manager that should reduce boilerplate and confusion. llvm-svn: 262004
2016-02-17[LCG] Construct an actual call graph with call-edge SCCs nested insideChandler Carruth1-388/+1181
reference-edge SCCs. This essentially builds a more normal call graph as a subgraph of the "reference graph" that was the old model. This allows both to exist and the different use cases to use the aspect which addresses their needs. Specifically, the pass manager and other *ordering* constrained logic can use the reference graph to achieve conservative order of visit, while analyses reasoning about attributes and other properties derived from reachability can reason about the direct call graph. Note that this isn't necessarily complete: it doesn't model edges to declarations or indirect calls. Those can be found by scanning the instructions of the function if desirable, and in fact every user currently does this in order to handle things like calls to instrinsics. If useful, we could consider caching this information in the call graph to save the instruction scans, but currently that doesn't seem to be important. An important realization for why the representation chosen here works is that the call graph is a formal subset of the reference graph and thus both can live within the same data structure. All SCCs of the call graph are necessarily contained within an SCC of the reference graph, etc. The design is to build 'RefSCC's to model SCCs of the reference graph, and then within them more literal SCCs for the call graph. The formation of actual call edge SCCs is not done lazily, unlike reference edge 'RefSCC's. Instead, once a reference SCC is formed, it directly builds the call SCCs within it and stores them in a post-order sequence. This is used to provide a consistent platform for mutation and update of the graph. The post-order also allows for very efficient updates in common cases by bounding the number of nodes (and thus edges) considered. There is considerable common code that I'm still looking for the best way to factor out between the various DFS implementations here. So far, my attempts have made the code harder to read and understand despite reducing the duplication, which seems a poor tradeoff. I've not given up on figuring out the right way to do this, but I wanted to wait until I at least had the system working and tested to continue attempting to factor it differently. This also requires introducing several new algorithms in order to handle all of the incremental update scenarios for the more complex structure involving two edge colorings. I've tried to comment the algorithms sufficiently to make it clear how this is expected to work, but they may still need more extensive documentation. I know that there are some changes which are not strictly necessarily coupled here. The process of developing this started out with a very focused set of changes for the new structure of the graph and algorithms, but subsequent changes to bring the APIs and code into consistent and understandable patterns also ended up touching on other aspects. There was no good way to separate these out without causing *massive* merge conflicts. Ultimately, to a large degree this is a rewrite of most of the core algorithms in the LCG class and so I don't think it really matters much. Many thanks to the careful review by Sanjoy Das! Differential Revision: http://reviews.llvm.org/D16802 llvm-svn: 261040
2016-02-02[LCG] Build an edge abstraction for the LazyCallGraph and use it toChandler Carruth1-133/+160
differentiate between indirect references to functions an direct calls. This doesn't do a whole lot yet other than change the print out produced by the analysis, but it lays the groundwork for a very major change I'm working on next: teaching the call graph to actually be a call graph, modeling *both* the indirect reference graph and the call graph simultaneously. More details on that in the next patch though. The rest of this is essentially a bunch of over-engineering that won't be interesting until the next patch. But this also isolates essentially all of the churn necessary to introduce the edge abstraction from the very important behavior change necessary in order to separately model the two graphs. So it should make review of the subsequent patch a bit easier at the cost of making this patch seem poorly motivated. ;] Differential Revision: http://reviews.llvm.org/D16038 llvm-svn: 259463
2015-12-28[lcg] Fix a few more formatting goofs found by clang-format. NFC.Chandler Carruth1-4/+4
llvm-svn: 256480
2015-01-14Revert r225854: [PM] Move the LazyCallGraph printing functionality toChandler Carruth1-38/+36
a print method. This was formulated on a bad idea, but sadly I didn't uncover how bad this was until I got further down the path. I had hoped that we could provide a low boilerplate way of printing analyses, but it just doesn't seem like this really fits the needs of the analyses. Not all analyses really want to do printing, and those that do don't all use the same interface. Instead, with the new pass manager let's just take advantage of the fact that creating an explicit printer pass like the LCG has is pretty low boilerplate already and rely on that for testing. llvm-svn: 225861
2015-01-13[PM] Move the LazyCallGraph printing functionality to a print method.Chandler Carruth1-36/+38
I'm adding generic analysis printing utility pass support which will require such a method (or a specialization) so this will let the existing printing logic satisfy that. llvm-svn: 225854
2015-01-05[PM] Switch the new pass manager to use a reference-based API for IRChandler Carruth1-3/+2
units. This was debated back and forth a bunch, but using references is now clearly cleaner. Of all the code written using pointers thus far, in only one place did it really make more sense to have a pointer. In most cases, this just removes immediate dereferencing from the code. I think it is much better to get errors on null IR units earlier, potentially at compile time, than to delay it. Most notably, the legacy pass manager uses references for its routines and so as more and more code works with both, the use of pointers was likely to become really annoying. I noticed this when I ported the domtree analysis over and wrote the entire thing with references only to have it fail to compile. =/ It seemed better to switch now than to delay. We can, of course, revisit this is we learn that references are really problematic in the API. llvm-svn: 225145
2014-11-19Update SetVector to rely on the underlying set's insert to return a ↵David Blaikie1-5/+5
pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This lead to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334
2014-05-15Fix typosAlp Toker1-2/+2
llvm-svn: 208839
2014-05-04[LCG] Add the last (and most complex) of the edge insertion mutationChandler Carruth1-0/+119
operations on the call graph. This one forms a cycle, and while not as complex as removing an internal edge from an SCC, it involves a reasonable amount of work to find all of the nodes newly connected in a cycle. Also somewhat alarming is the worst case complexity here: it might have to walk roughly the entire SCC inverse DAG to insert a single edge. This is carefully documented in the API (I hope). llvm-svn: 207935
2014-05-01[LCG] Add the other simple edge insertion API to the call graph. ThisChandler Carruth1-0/+15
just connects an SCC to one of its descendants directly. Not much of an impact. The last one is the hard one -- connecting an SCC to one of its ancestors, and thereby forming a cycle such that we have to merge all the SCCs participating in the cycle. llvm-svn: 207751
2014-05-01[LCG] Don't lookup the child SCC twice. Spotted this by inspection, andChandler Carruth1-2/+2
no functionality changed. llvm-svn: 207750
2014-05-01[LCG] Add some basic methods for querying the parent/child relationshipsChandler Carruth1-0/+15
of SCCs in the SCC DAG. Exercise them in the big graph test case. These will be especially useful for establishing invariants in insertion logic. llvm-svn: 207749
2014-04-30[LCG] Add the really, *really* boring edge insertion case: adding anChandler Carruth1-4/+19
edge entirely within an existing SCC. Shockingly, making the connected component more connected is ... a total snooze fest. =] Anyways, its wired up, and I even added a test case to make sure it pretty much sorta works. =D llvm-svn: 207631
2014-04-30[LCG] Actually test the *basic* edge removal bits (IE, the non-SCCChandler Carruth1-4/+8
bits), and discover that it's totally broken. Yay tests. Boo bug. Fix the basic edge removal so that it works by nulling out the removed edges rather than actually removing them. This leaves the indices valid in the map from callee to index, and preserves some of the locality for iterating over edges. The iterator is made bidirectional to reflect that it now has to skip over null entries, and the skipping logic is layered onto it. As future work, I would like to track essentially the "load factor" of the edge list, and when it falls below a threshold do a compaction. An alternative I considered (and continue to consider) is storing the callees in a doubly linked list where each element of the list is in a set (which is essentially the classical linked-hash-table datastructure). The problem with that approach is that either you need to heap allocate the linked list nodes and use pointers to them, or use a bucket hash table (with even *more* linked list pointer overhead!), etc. It's pretty easy to get 5x overhead for values that are just pointers. So far, I think punching holes in the vector, and periodic compaction is likely to be much more efficient overall in the space/time tradeoff. llvm-svn: 207619
2014-04-28[LCG] Add the most basic of edge insertion to the lazy call graph. ThisChandler Carruth1-0/+15
just handles the pre-DFS case. Also add some test cases for this case to make sure it works. llvm-svn: 207411
2014-04-28[LCG] Make the return of the IntraSCC removal method actually match itsChandler Carruth1-5/+3
contract (and be much more useful). It now provides exactly the post-order traversal a caller might need to perform on newly formed SCCs. llvm-svn: 207410
2014-04-27[LCG] Re-organize the methods for mutating a call graph to make theirChandler Carruth1-76/+78
API requirements much more obvious. The key here is that there are two totally different use cases for mutating the graph. Prior to doing any SCC formation, it is very easy to mutate the graph. There may be users that want to do small tweaks here, and then use the already-built graph for their SCC-based operations. This method remains on the graph itself and is documented carefully as being cheap but unavailable once SCCs are formed. Once SCCs are formed, and there is some in-flight DFS building them, we have to be much more careful in how we mutate the graph. These mutation operations are sunk onto the SCCs themselves, which both simplifies things (the code was already there!) and helps make it obvious that these interfaces are only applicable within that context. The other primary constraint is that the edge being mutated is actually related to the SCC on which we call the method. This helps make it obvious that you cannot arbitrarily mutate some other SCC. I've tried to write much more complete documentation for the interesting mutation API -- intra-SCC edge removal. Currently one aspect of this documentation is a lie (the result list of SCCs) but we also don't even have tests for that API. =[ I'm going to add tests and fix it to match the documentation next. llvm-svn: 207339
2014-04-26[LCG] Rather than removing nodes from the SCC entry set when we processChandler Carruth1-6/+7
them, just skip over any DFS-numbered nodes when finding the next root of a DFS. This allows the entry set to just be a vector as we populate it from a uniqued source. It also removes the possibility for a linear scan of the entry set to actually do the removal which can make things go quadratic if we get unlucky. llvm-svn: 207312
2014-04-26[LCG] Rotate the full SCC finding algorithm to avoid round-trips throughChandler Carruth1-21/+23
the DFS stack for leaves in the call graph. As mentioned in my previous commit, this is particularly interesting for graphs which have high fan out but low connectivity resulting in many leaves. For such graphs, this can remove a large % of the DFS stack traffic even though it doesn't make the stack much smaller. It's a bit easier to formulate this for the full algorithm because that one stops completely for each SCC. For example, I was able to directly eliminate the "Recurse" boolean used to continue an outer loop from the inner loop. llvm-svn: 207311
2014-04-26[LCG] Hoist the main DFS loop out of the edge removal function. ThisChandler Carruth1-74/+70
makes working through the worklist much cleaner, and makes it possible to avoid the 'bool-to-continue-the-outer-loop' hack. Not a huge difference, but I think this is approaching as polished as I can make it. llvm-svn: 207310
2014-04-26[LCG] In the incremental SCC re-formation, lift the node currently beingChandler Carruth1-30/+38
processed in the DFS out of the stack completely. Keep it exclusively in a variable. Re-shuffle some code structure to make this easier. This can have a very dramatic effect in some cases because call graphs tend to look like a high fan-out spanning tree. As a consequence, there are a large number of leaf nodes in the graph, and this technique causes leaf nodes to never even go into the stack. While this only reduces the max depth by 1, it may cause the total number of round trips through the stack to drop by a lot. Now, most of this isn't really relevant for the incremental version. =] But I wanted to prototype it first here as this variant is in ways more complex. As long as I can get the code factored well here, I'll next make the primary walk look the same. There are several refactorings this exposes I think. llvm-svn: 207306
2014-04-26[LCG] Special case the removal of self edges. These don't impact the SCCChandler Carruth1-0/+6
graph in any way because we don't track edges in the SCC graph, just nodes. This also lets us add a nice assert about the invariant that we're working on at least a certain number of nodes within the SCC. llvm-svn: 207305
2014-04-26[LCG] Refactor the duplicated code I added in my last commit here intoChandler Carruth1-23/+14
a helper function. Also factor the other two places where we did the same thing into the helper function. =] Much cleaner this way. NFC. llvm-svn: 207300
2014-04-25[LCG] During the incremental update of an SCC, switch to using theChandler Carruth1-26/+26
SCCMap to test for nodes that have been re-added to the root SCC rather than a set vector. We already have done the SCCMap lookup, we juts need to test it in two different ways. In turn, do most of the processing of these nodes as they go into the root SCC rather than lazily. This simplifies the final loop to just stitch the root SCC into its children's parent sets. No functionlatiy changed. However, this makes a few things painfully obvious, which was my intent. =] There is tons of repeated code introduced here and elsewhere. I'm splitting the refactoring of that code into helpers from this change so its clear that this is the change which switches the datastructures used around, and the other is a pure factoring & deduplication of code change. llvm-svn: 207217
2014-04-25[LCG] During the incremental re-build of an SCC after removing an edge,Chandler Carruth1-4/+5
remove the nodes in the SCC from the SCC map entirely prior to the DFS walk. This allows the SCC map to represent both the state of not-yet-re-added-to-an-SCC and added-back-to-this-SCC independently. The first is being missing from the SCC map, the second is mapping back to 'this'. In a subsequent commit, I'm going to use this property to simplify the new node list for this SCC. In theory, I think this also makes the contract for orphaning a node from the graph slightly less confusing. Now it is also orphaned from the SCC graph. Still, this isn't quite right either, and so I'm not adding test cases here. I'll add test cases for the behavior of orphaning nodes when the code *actually* supports it. The change here is mostly incidental, my goal is simplifying the algorithm. llvm-svn: 207213
2014-04-25[LCG] Rather than doing a linear time SmallSetVector removal of eachChandler Carruth1-6/+7
child from the worklist, wait until we actually need to pop another element off of the worklist and skip over any that were already visited by the DFS. This also enables swapping the nodes of the SCC into the worklist. No functionality changed. llvm-svn: 207212
2014-04-25[LCG] Remove a completely unnecessary loop. It wasn't even doing anyChandler Carruth1-60/+54
thing, just mucking up the code. I feel bad that I even wrote this loop. Very sorry. The diff is huge because of the indent change, but I promise all this is doing is realizing that the outer two loops were actually the exact same loops, and we didn't need two of them. llvm-svn: 207202
2014-04-25[LCG] Now that the loop structure of the core SCC finding routine isChandler Carruth1-1/+8
factored into a more reasonable form, replace the tail call with a simple outer-loop continuation. It's sad that C++ makes this so awkward to write, but it seems more direct and clear than the tail call at this point. llvm-svn: 207201
2014-04-24[LCG] Switch a weird do/while loop that actually couldn't fail itsChandler Carruth1-5/+4
condition into an obviously infinite loop with an assert about the degenerate condition. No functionality changed. llvm-svn: 207147
2014-04-24[LCG] Incorporate the core trick of improvements on the naive Tarjan'sChandler Carruth1-41/+61
algorithm here: http://dl.acm.org/citation.cfm?id=177301. The idea of isolating the roots has even more relevance when using the stack not just to implement the DFS but also to implement the recursive step. Because we use it for the recursive step, to isolate the roots we need to maintain two stacks: one for our recursive DFS walk, and another of the nodes that have been walked. The nice thing is that the latter will be half the size. It also fixes a complete hack where we scanned backwards over the stack to find the next potential-root to continue processing. Now that is always the top of the DFS stack. While this is a really nice improvement already (IMO) it further opens the door for two important simplifications: 1) De-duplicating some of the code across the two different walks. I've actually made the duplication a bit worse in some senses with this patch because the two are starting to converge. 2) Dramatically simplifying the loop structures of both walks. I wanted to do those separately as they'll be essentially *just* CFG restructuring. This patch on the other hand actually uses different datastructures to implement the algorithm itself. llvm-svn: 207098
2014-04-24[LCG] Rotate logic applied to the top of the DFSStack to instead beChandler Carruth1-25/+24
applied prior to pushing a node onto the DFSStack. This is the first step toward avoiding the stack entirely for leaf nodes. It also simplifies things a bit and I think is pointing the way toward factoring some more of the shared logic out of the two implementations. It is also making it more obvious how to restructure the loops themselves to be a bit easier to read (although no different in terms of functionality). llvm-svn: 207095
2014-04-24[LCG] Switch the parent SCC tracking from a SmallSetVector toChandler Carruth1-2/+2
a SmallPtrSet. Currently, there is no need for stable iteration in this dimension, and I now thing there won't need to be going forward. If this is ever re-introduced in any form, it needs to not be a SetVector based solution because removal cannot be linear. There will be many SCCs with large numbers of parents. When encountering these, the incremental SCC update for intra-SCC edge removal was quadratic due to linear removal (kind of). I'm really hoping we can avoid having an ordering property here at all though... llvm-svn: 207091
2014-04-24[LCG] We don't actually need a set in each SCC to track the nodes. WeChandler Carruth1-7/+1
can use the node -> SCC mapping in the top-level graph to test this on the rare occasions we need it. llvm-svn: 207090
2014-04-23[LCG] Normalize the post-order SCC iterator to just iterate over the SCCChandler Carruth1-2/+2
values rather than having pointers in weird places. llvm-svn: 207053
2014-04-23[LCG] Switch the primary node iterator to be a *much* more normal C++Chandler Carruth1-39/+33
iterator, returning a Node by reference on dereference. llvm-svn: 207048