aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/CodeGen/MachineCSE.cpp
AgeCommit message (Collapse)AuthorFilesLines
2016-10-28MachineRegisterInfo: Remove unused arg from isConstantPhysReg(); NFCMatthias Braun1-1/+1
llvm-svn: 285423
2016-09-10[CodeGen] Rename MachineInstr::isInvariantLoad to ↵Justin Lebar1-1/+1
isDereferenceableInvariantLoad. NFC Summary: I want to separate out the notions of invariance and dereferenceability at the MI level, so that they correspond to the equivalent concepts at the IR level. (Currently an MI load is MI-invariant iff it's IR-invariant and IR-dereferenceable.) First step is renaming this function. Reviewers: chandlerc Subscribers: MatzeB, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D23370 llvm-svn: 281125
2016-06-30CodeGen: Use MachineInstr& in TargetInstrInfo, NFCDuncan P. N. Exon Smith1-4/+3
This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `*` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr*` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189
2016-04-22Re-commit optimization bisect support (r267022) without new pass manager ↵Andrew Kaylor1-1/+1
support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231
2016-04-22Revert "Initial implementation of optimization bisect support."Vedant Kumar1-1/+1
This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115
2016-04-21Initial implementation of optimization bisect support.Andrew Kaylor1-1/+1
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations. The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used. The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way. Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267022
2016-04-19[SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARDTim Shen1-0/+6
With this change, ideally IR pass can always generate llvm.stackguard call to get the stack guard; but for now there are still IR form stack guard customizations around (see getIRStackGuard()). Future SSP customization should go through LOAD_STACK_GUARD. There is a behavior change: stack guard values are not CSEed anymore, since we should never reuse the value in case that it has been spilled (and corrupted). See ssp-guard-spill.ll. This also cause the change of stack size and codegen in X86 and AArch64 test cases. Ideally we'd like to know if the guard created in llvm.stackprotector() gets spilled or not. If the value is spilled, discard the value and reload stack guard; otherwise reuse the value. This can be done by teaching register allocator to know how to rematerialize LOAD_STACK_GUARD and force a rematerialization (which seems hard), or check for spilling in expandPostRAPseudo. It only makes sense when the stack guard is a global variable, which requires more instructions to load. Anyway, this seems to go out of the scope of the current patch. llvm-svn: 266806
2016-01-06rangify; NFCISanjay Patel1-24/+14
llvm-svn: 256891
2015-09-09[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatibleChandler Carruth1-3/+3
with the new pass manager, and no longer relying on analysis groups. This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows: - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are *within* a function. - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure. - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager. - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function. All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly. The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation. This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass. Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself. One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loop holes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state. Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller. Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first. Differential Revision: http://reviews.llvm.org/D12080 llvm-svn: 247167
2015-05-09MachineCSE: Add a target query for the LookAheadLimit heurisiticTom Stellard1-2/+3
This is used to determine whether or not to CSE physical register defs. Differential Revision: http://reviews.llvm.org/D9472 llvm-svn: 236923
2015-03-23Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.Benjamin Kramer1-0/+1
llvm-svn: 232998
2015-02-04MachineCSE: Clear dead-def flag on CSE.Matthias Braun1-2/+9
In case CSE reuses a previoulsy unused register the dead-def flag has to be cleared on the def operand, as exposed by the arm64-cse.ll test. This fixes PR22439 and the corresponding rdar://19694987 Differential Revision: http://reviews.llvm.org/D7395 llvm-svn: 228178
2014-12-02[MachineCSE] Clear kill-flag on registers imp-def'd by the CSE'd instruction.Ahmed Bougacha1-0/+31
Go through implicit defs of CSMI and MI, and clear the kill flags on their uses in all the instructions between CSMI and MI. We might have made some of the kill flags redundant, consider: subs ... %NZCV<imp-def> <- CSMI csinc ... %NZCV<imp-use,kill> <- this kill flag isn't valid anymore subs ... %NZCV<imp-def> <- MI, to be eliminated csinc ... %NZCV<imp-use,kill> Since we eliminated MI, and reused a register imp-def'd by CSMI (here %NZCV), that register, if it was killed before MI, should have that kill flag removed, because it's lifetime was extended. Also, add an exhaustive testcase for the motivating example. Reviewed by: Juergen Ributzka <juergen@apple.com> llvm-svn: 223133
2014-08-11In Machine CSE pass, the source register of a COPY machine instruction canJiangning Liu1-11/+19
be propagated to all its users, and this propagation could increase the probability of finding common subexpressions. If the COPY has only one user, the COPY itself can be removed. llvm-svn: 215344
2014-08-05Have MachineFunction cache a pointer to the subtarget to make lookupsEric Christopher1-2/+2
shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily. Update the MIPS subtarget switching machinery to update this pointer at the same time it runs. llvm-svn: 214838
2014-08-04Remove the TargetMachine forwards for TargetSubtargetInfo basedEric Christopher1-2/+3
information and update all callers. No functional change. llvm-svn: 214781
2014-07-29Add TargetInstrInfo interface isAsCheapAsAMove.Jiangning Liu1-1/+1
llvm-svn: 214158
2014-04-22[Modules] Remove potential ODR violations by sinking the DEBUG_TYPEChandler Carruth1-1/+2
define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers. Other sub-trees will follow. llvm-svn: 206837
2014-03-31Disable each MachineFunctionPass for 'optnone' functions, unless thatPaul Robinson1-0/+3
pass normally runs at optimization level None, or is part of the register allocation pipeline. llvm-svn: 205228
2014-03-17Switch a number of loops in lib/CodeGen over to range-based for-loops, now thatOwen Anderson1-17/+9
the MachineRegisterInfo iterators are compatible with it. llvm-svn: 204075
2014-03-13Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changingOwen Anderson1-17/+17
operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost behave like normal iterators! Again, this requires making some existing loops more verbose, but should pave the way for the big range-based for-loop cleanups in the future. llvm-svn: 203865
2014-03-07[C++11] Add 'override' keyword to virtual methods that override their base ↵Craig Topper1-3/+3
class. llvm-svn: 203220
2014-03-07Replace PROLOG_LABEL with a new CFI_INSTRUCTION.Rafael Espindola1-2/+2
The old system was fairly convoluted: * A temporary label was created. * A single PROLOG_LABEL was created with it. * A few MCCFIInstructions were created with the same label. The semantics were that the cfi instructions were mapped to the PROLOG_LABEL via the temporary label. The output position was that of the PROLOG_LABEL. The temporary label itself was used only for doing the mapping. The new CFI_INSTRUCTION has a 1:1 mapping to MCCFIInstructions and points to one by holding an index into the CFI instructions of this function. I did consider removing MMI.getFrameInstructions completelly and having CFI_INSTRUCTION own a MCCFIInstruction, but MCCFIInstructions have non trivial constructors and destructors and are somewhat big, so the this setup is probably better. The net result is that we don't create temporary labels that are never used. llvm-svn: 203204
2014-03-02[C++11] Replace llvm::next and llvm::prior with std::next and std::prev.Benjamin Kramer1-2/+2
Remove the old functions. llvm-svn: 202636
2013-12-17Disabled subregister copy coalescing during MachineCSE.Andrew Trick1-5/+15
This effectively backs out r197465 but leaves some of the general fixes in place. Not all targets are ready to handle this feature. To enable it, some infrastructure work is needed to better handle register class constraints. llvm-svn: 197514
2013-12-17Allow MachineCSE to coalesce trivial subregister copies the same way that it ↵Andrew Trick1-3/+8
coalesces normal copies. Without this, MachineCSE is powerless to handle redundant operations with truncated source operands. This required fixing the 2-addr pass to handle tied subregisters. It isn't clear what combinations of subregisters can legally be tied, but the simple case of truncated source operands is now safely handled: %vreg11<def> = COPY %vreg1:sub_32bit; GR32:%vreg11 GR64:%vreg1 %vreg12<def> = COPY %vreg2:sub_32bit; GR32:%vreg12 GR64:%vreg2 %vreg13<def,tied1> = ADD32rr %vreg11<tied0>, %vreg12<kill>, %EFLAGS<imp-def> Test case: cse-add-with-overflow.ll. This exposed an existing bug in PPCInstrInfo::commuteInstruction. Thanks to Rafael for the test case: PowerPC/crash.ll. llvm-svn: 197465
2013-12-16Revert "Allow MachineCSE to coalesce trivial subregister copies the same way ↵Rafael Espindola1-8/+3
that it coalesces normal copies." This reverts commit r197414. It broke the ppc64 bootstrap. I will post a testcase in a sec. llvm-svn: 197424
2013-12-16Allow MachineCSE to coalesce trivial subregister copies the same wayAndrew Trick1-3/+8
that it coalesces normal copies. Without this, MachineCSE is powerless to handle redundant operations with truncated source operands. This required fixing the 2-addr pass to handle tied subregisters. It isn't clear what combinations of subregisters can legally be tied, but the simple case of truncated source operands is now safely handled: %vreg11<def> = COPY %vreg1:sub_32bit; GR32:%vreg11 GR64:%vreg1 %vreg12<def> = COPY %vreg2:sub_32bit; GR32:%vreg12 GR64:%vreg2 %vreg13<def,tied1> = ADD32rr %vreg11<tied0>, %vreg12<kill>, %EFLAGS<imp-def> llvm-svn: 197414
2013-12-16whitespaceAndrew Trick1-1/+1
llvm-svn: 197413
2013-07-14Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector ↵Craig Topper1-4/+4
size. llvm-svn: 186274
2012-12-03Use the new script to sort the includes of every file under lib.Chandler Carruth1-5/+5
Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
2012-11-27CSE: allow PerformTrivialCoalescing to check copies across basic blockManman Ren1-2/+0
boundaries. Given the following case: BB0 %vreg1<def> = SUBrr %vreg0, %vreg7 %vreg2<def> = COPY %vreg7 BB1 %vreg10<def> = SUBrr %vreg0, %vreg2 We should be able to CSE between SUBrr in BB0 and SUBrr in BB1. rdar://12462006 llvm-svn: 168717
2012-11-26Don't use iterator after being erased.Jakub Staszak1-1/+1
llvm-svn: 168622
2012-11-13Do not consider a machine instruction that uses and defines the sameUlrich Weigand1-16/+44
physical register as candidate for common subexpression elimination in MachineCSE. This fixes a bug on PowerPC in MultiSource/Applications/oggenc/oggenc caused by MachineCSE invalidly merging two separate DYNALLOC insns. llvm-svn: 167855
2012-10-16Remove unused BitVectors from getAllocatableSet().Jakob Stoklund Olesen1-3/+0
llvm-svn: 165999
2012-10-15Switch most getReservedRegs() clients to the MRI equivalent.Jakob Stoklund Olesen1-4/+1
Using the cached bit vector in MRI avoids comstantly allocating and recomputing the reserved register bit vector. llvm-svn: 165983
2012-08-11MachineCSE: Hoist isConstantPhysReg out of the loop, it checks for overlaps ↵Benjamin Kramer1-4/+3
already. llvm-svn: 161729
2012-08-11PR13578: Teach MachineCSE that instructions that use a constant register can ↵Benjamin Kramer1-2/+5
be CSE'd safely. This is common e.g. when doing rip-relative addressing on x86_64. llvm-svn: 161728
2012-08-08X86: enable CSE between CMP and SUBManman Ren1-2/+18
We perform the following: 1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering. 2> Modify MachineCSE to correctly handle implicit defs. 3> Convert SUB back to CMP if possible at peephole. Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled by peephole now. rdar://11873276 llvm-svn: 161462
2012-08-07MachineCSE: Update the heuristics for isProfitableToCSE.Manman Ren1-0/+23
If the result of a common subexpression is used at all uses of the candidate expression, CSE should not increase the live range of the common subexpression. rdar://11393714 and rdar://11819721 llvm-svn: 161396
2012-07-19Remove tabs.Bill Wendling1-1/+1
llvm-svn: 160475
2012-07-05Remove ParentMap. You can just ask the domnode for its parent. No functionalityNick Lewycky1-11/+8
change. Move the "Not profitable, avoid CSE!" debug message next to where we fail the check for profitability and use a different message for avoiding CSE due to being in different register classes. llvm-svn: 159729
2012-06-01Switch some getAliasSet clients to MCRegAliasIterator.Jakob Stoklund Olesen1-3/+2
MCRegAliasIterator can optionally visit the register itself, allowing for simpler code. llvm-svn: 157837
2012-03-04Use uint16_t to store register overlaps to reduce static data.Craig Topper1-1/+1
llvm-svn: 152001
2012-02-28Handle regmasks in MachineCSE.Jakob Stoklund Olesen1-0/+6
Don't attempt to extend physreg live ranges across calls. <rdar://problem/10942095> llvm-svn: 151610
2012-02-17Re-enable 150652 and 150654 - Make FPSCR non-reserved, and make MachineCSE ↵Lang Hames1-3/+9
bail on reserved registers. This *should* be safe as of r150786. llvm-svn: 150769
2012-02-16Oop - r150653 + r150654 broke one of my test cases. Backing out for now...Lang Hames1-9/+3
llvm-svn: 150655
2012-02-16MachineCSE shouldn't extend the live ranges of reserved or allocatable ↵Lang Hames1-3/+9
registers. llvm-svn: 150653
2012-02-08Codegen pass definition cleanup. No functionality.Andrew Trick1-2/+1
Moving toward a uniform style of pass definition to allow easier target configuration. Globally declare Pass ID. Globally declare pass initializer. Use INITIALIZE_PASS consistently. Add a call to the initializer from CodeGen.cpp. Remove redundant "createPass" functions and "getPassName" methods. While cleaning up declarations, cleaned up comments (sorry for large diff). llvm-svn: 150100
2012-02-08whitespaceAndrew Trick1-2/+2
llvm-svn: 150094