aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/CodeGen/TailDuplication.cpp
AgeCommit message (Collapse)AuthorFilesLines
2019-01-19Update the file headers across all of the LLVM projects in the monorepoChandler Carruth1-4/+3
to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
2018-01-19Split TailDuplicatePass into pre- and post-RA variant; NFCMatthias Braun1-23/+36
Split TailDuplicatePass into EarlyTailDuplicate and TailDuplicate. This avoids playing games with fake pass IDs and using MRI::isSSA() to determine pre-/post-RA state. llvm-svn: 322926
2017-12-15MachineFunction: Return reference from getFunction(); NFCMatthias Braun1-1/+1
The Function can never be nullptr so we can return a reference. llvm-svn: 320884
2017-12-13Remove unnecessary includes; NFCMatthias Braun1-0/+2
llvm-svn: 320545
2017-08-23Add test case for r311511Matthias Braun1-1/+4
This also changes the TailDuplicator to be configured explicitely pre/post regalloc rather than relying on the isSSA() flag. This was necessary to have `llc -run-pass` work reliably. llvm-svn: 311520
2017-06-07[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use ↵Eugene Zelenko1-5/+9
warnings; other minor fixes (NFC). llvm-svn: 304954
2017-05-25CodeGen: Rename DEBUG_TYPE to match passnamesMatthias Braun1-2/+1
Rename the DEBUG_TYPE to match the names of corresponding passes where it makes sense. Also establish the pattern of simply referencing DEBUG_TYPE instead of repeating the passname where possible. llvm-svn: 303921
2016-10-11Codegen: Tail-duplicate during placement.Kyle Butt1-1/+1
The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283934
2016-10-11Revert "Codegen: Tail-duplicate during placement."Daniel Jasper1-1/+1
This reverts commit r283842. test/CodeGen/X86/tail-dup-repeat.ll causes and llc crash with our internal testing. I'll share a link with you. llvm-svn: 283857
2016-10-11Codegen: Tail-duplicate during placement.Kyle Butt1-1/+1
The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283842
2016-10-08Revert "Codegen: Tail-duplicate during placement."Kyle Butt1-1/+1
This reverts commit 71c312652c10f1855b28d06697c08d47e7a243e4. llvm-svn: 283647
2016-10-07Codegen: Tail-duplicate during placement.Kyle Butt1-1/+1
The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283619
2016-10-05Revert "Codegen: Tail-duplicate during placement."Kyle Butt1-1/+1
This reverts commit 062ace9764953e9769142c1099281a345f9b6bdc. Issue with loop info and block removal revealed by polly. I have a fix for this issue already in another patch, I'll re-roll this together with that fix, and a test case. llvm-svn: 283292
2016-10-04Codegen: Tail-duplicate during placement.Kyle Butt1-1/+1
The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283274
2016-10-04Revert "Codegen: Tail-duplicate during placement."Kyle Butt1-1/+1
This reverts commit ff234efbe23528e4f4c80c78057b920a51f434b2. Causing crashes on aarch64 build. llvm-svn: 283172
2016-10-04Codegen: Tail-duplicate during placement.Kyle Butt1-1/+1
The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. llvm-svn: 283164
2016-08-25TailDuplication: Don't pass MMI separately from MF. NFCKyle Butt1-1/+1
MMI must match the function passed, and MF has a handle on MMI. Use that instead of accepting it as separate argument. No Functional Change. llvm-svn: 279701
2016-08-25TailDuplication: Save MF and reduce number of parameters. NFCKyle Butt1-2/+1
Save the function in the class, and then don't pass it around. This reduces the number of parameters and makes calls to member functions simpler. No Functional Change. llvm-svn: 279700
2016-04-22Re-commit optimization bisect support (r267022) without new pass manager ↵Andrew Kaylor1-1/+1
support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231
2016-04-22Revert "Initial implementation of optimization bisect support."Vedant Kumar1-1/+1
This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115
2016-04-21Initial implementation of optimization bisect support.Andrew Kaylor1-1/+1
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations. The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used. The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way. Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267022
2016-04-08Codegen: Factor tail duplication into a utility class. NFCKyle Butt1-949/+19
This is in preparation for tail duplication during block placement. See D18226. This needs to be a utility class for 2 reasons. No passes may run after block placement, and also, tail-duplication affects subsequent layout decisions, so it must be interleaved with placement, and can't be separated out into its own pass. The original pass is still useful, and now runs by delegating to the utility class. llvm-svn: 265842
2016-04-06RegisterScavenger: Take a reference as enterBasicBlock() argument.Matthias Braun1-1/+1
Make it obvious that the argument cannot be nullptr. Remove an unnecessary nullptr check in initRegState. llvm-svn: 265511
2016-04-04Revert "CodeGen: Remove dead code in TailDuplicate"Justin Bogner1-14/+58
It seems this is reachable after all. It hit on 7zip-benchmark in lnt on ppc64: http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/2317 This reverts r265347. llvm-svn: 265352
2016-04-04CodeGen: Remove dead code in TailDuplicateJustin Bogner1-58/+14
I noticed that this isn't covered by our existing tests and spent some time trying to come up with an example it actually hits. I tried hand rolling something based on the explanation in the comment, but couldn't get anything that didn't abort tail duplication earlier for one reason or another. Then, I tried cranking tail-dup-size cranked up so this would fire more and ran a bootstrap of clang and the nightly test suite - those don't hit this either. This reverts r132816 and replaces it with an assert. llvm-svn: 265347
2016-03-23Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed.Cong Hou1-3/+3
Currently, AnalyzeBranch() fails non-equality comparison between floating points on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this function can modify the branch by reversing the conditional jump and removing unconditional jump if there is a proper fall-through. However, in the case of non-equality comparison between floating points, this can turn the branch "unanalyzable". Consider the following case: jne.BB1 jp.BB1 jmp.BB2 .BB1: ... .BB2: ... AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be removed: jne.BB1 jnp.BB2 .BB1: ... .BB2: ... However, AnalyzeBranch() cannot analyze this branch anymore as there are two conditional jumps with different targets. This may disable some optimizations like block-placement: in this case the fall-through behavior is enforced even if the fall-through block is very cold, which is suboptimal. Actually this optimization is also done in block-placement pass, which means we can remove this optimization from AnalyzeBranch(). However, currently X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there is no defined negation conditions for them. In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them. Here only the second conditional jump is reversed. This is valid as we only need them to do this "unconditional jump removal" optimization. Differential Revision: http://reviews.llvm.org/D11393 llvm-svn: 264199
2016-02-22Don't tail-duplicate blocks that contain convergent instructions.Justin Lebar1-0/+5
Summary: Convergent instrs shouldn't be made control-dependent on other values, but this is basically the whole point of tail duplication. So just bail if we see a convergent instruction. Reviewers: iteratee Subscribers: jholewinski, jhen, hfinkel, tra, jingyue, llvm-commits Differential Revision: http://reviews.llvm.org/D17320 llvm-svn: 261540
2016-02-20Don't scan for SSA register operands to update when not in SSA form.Dan Gohman1-22/+24
TailDuplicate can run on either on SSA code or non-SSA code, as indicated to it by MRI->isSSA() ("PreRegAlloc" here). TailDuplicate does extra work to preserve SSA invariants when it duplicates code. This patch makes it skip some of this extra work in the case where the code is not in SSA form. llvm-svn: 261450
2015-12-16Minor change to TailDuplication.cpp to turn on normalization when removing ↵Cong Hou1-1/+1
successor llvm-svn: 255752
2015-12-15Improve the successor list update in TailDuplication.cpp.Cong Hou1-6/+6
This patch improves a temporary fix in r255530 so that we can normalize successor list without trigger assertion failures in tail duplication pass. llvm-svn: 255638
2015-12-14Remove the successor probabilities normalization in tail duplication pass.Cong Hou1-1/+0
The normalization may cause assertion failures on SystemZ and some out-of-tree tests. The root cause is that unknown probabilities are materialized into known ones by calling getSuccProbability(), which is then used to add another successor to the same MBB which results in mixed known and unknown probabilities. But currently those mixed probabilities cannot be normalized. I will compose another patch to fix the root issue. llvm-svn: 255530
2015-12-13Normalize MBB's successors' probabilities in several locations.Cong Hou1-0/+1
This patch adds some missing calls to MBB::normalizeSuccProbs() in several locations where it should be called. Those places are found by checking if the sum of successors' probabilities is approximate one in MachineBlockPlacement pass with some instrumented code (not in this patch). Differential revision: http://reviews.llvm.org/D15259 llvm-svn: 255455
2015-12-01Replace all weight-based interfaces in MBB with probability-based ↵Cong Hou1-3/+3
interfaces, and update all uses of old interfaces. (This is the second attempt to submit this patch. The first caused two assertion failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687) The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254377
2015-12-01Revert r254348: "Replace all weight-based interfaces in MBB with ↵Hans Wennborg1-3/+3
probability-based interfaces, and update all uses of old interfaces." and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction." Asserts were firing in Chromium builds. See PR25687. llvm-svn: 254366
2015-12-01Replace all weight-based interfaces in MBB with probability-based ↵Cong Hou1-3/+3
interfaces, and update all uses of old interfaces. The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254348
2015-10-21Tail duplication can mix incompatible registers in phi nodesKrzysztof Parzyszek1-0/+21
Do not tail duplicate blocks where the successor has a phi node, and the corresponding value in that phi node uses a subregister. http://reviews.llvm.org/D13922 llvm-svn: 250877
2015-10-09CodeGen: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith1-4/+4
Finish removing implicit ilist iterator conversions from LLVMCodeGen. I'm sure there are lots more of these in lib/CodeGen/*/. llvm-svn: 249915
2015-09-17[WinEH] Add and use hasEHPadSuccessor instead of getLandingPadSuccessorReid Kleckner1-1/+1
getLandingPadSuccessor assumes that each invoke can have at most one EH pad successor, but WinEH invokes can have more than one. Two out of three callers of getLandingPadSuccessor don't use the returned landingpad, so we can make them use this simple predicate instead. Eventually we'll have to circle back and fix SplitKit.cpp so that register allocation works. Baby steps. llvm-svn: 247904
2015-09-09Save LaneMask with livein registersMatthias Braun1-2/+2
With subregister liveness enabled we can detect the case where only parts of a register are live in, this is expressed as a 32bit lanemask. The current code only keeps registers in the live-in list and therefore enumerated all subregisters affected by the lanemask. This turned out to be too conservative as the subregister may also cover additional parts of the lanemask which are not live. Expressing a given lanemask by enumerating a minimum set of subregisters is computationally expensive so the best solution is to simply change the live-in list to store the lanemasks as well. This will reduce memory usage for targets using subregister liveness and slightly increase it for other targets Differential Revision: http://reviews.llvm.org/D12442 llvm-svn: 247171
2015-08-24MachineBasicBlock: Add liveins() method returning an iterator_rangeMatthias Braun1-4/+3
llvm-svn: 245895
2015-08-11use range-based for loops; NFCISanjay Patel1-8/+2
llvm-svn: 244545
2015-08-10use range-based for loop; NFCISanjay Patel1-5/+5
llvm-svn: 244531
2015-08-10remove function names from comments; NFCSanjay Patel1-22/+20
llvm-svn: 244509
2015-08-04wrap OptSize and MinSize attributes for easier and consistent access (NFCI)Sanjay Patel1-0/+1
Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994
2015-06-23[MachineBasicBlock] Add getFirstNonDebugInstr to complement getLastNonDebugInstrBenjamin Kramer1-5/+2
Use it in CodeGen where applicable. No functionality change intended. llvm-svn: 240414
2015-06-23Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko1-1/+1
Apparently, the style needs to be agreed upon first. llvm-svn: 240390
2015-06-19Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko1-1/+1
The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
2015-05-07Clear kill flags in tail duplication.Pete Cooper1-0/+3
If we duplicate an instruction then we must also clear kill flags on any uses we rewrite. Otherwise we might be killing a register which was used in other BBs. For example, here the entry BB ended up with these instructions, the ADD having been tail duplicated. %vreg24<def> = t2ADDri %vreg10<kill>, 1, pred:14, pred:%noreg, opt:%noreg; GPRnopc:%vreg24 rGPR:%vreg10 %vreg22<def> = COPY %vreg10; GPR:%vreg22 rGPR:%vreg10 The copy here is inserted after the add and so needs vreg10 to be live. llvm-svn: 236782
2015-02-14CodeGen: Canonicalize access to function attributes, NFCDuncan P. N. Exon Smith1-2/+1
Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) Also, add `Function::getFnStackAlignment()`, and canonicalize: getAttributes().getStackAlignment(AttributeSet::FunctionIndex) => getFnStackAlignment() llvm-svn: 229208
2014-08-05Have MachineFunction cache a pointer to the subtarget to make lookupsEric Christopher1-2/+2
shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily. Update the MIPS subtarget switching machinery to update this pointer at the same time it runs. llvm-svn: 214838