path: root/llvm/lib/CodeGen/MachineCombiner.cpp
Age | Commit message | Author | Files | Lines
2016-12-11 | instr-combiner: sum up all latencies of the transformed instructions | Sebastian Pop | 1 | -2/+9
We have found that, when the selected subarchitecture has a scheduling model and we are not optimizing for size, the machine-instruction combiner uses an overly simple algorithm to compute the cost of one of the two alternatives (before and after running a combining pass on a section of code), and therefore it throws away the combination results too often.
This fix can help any ISA that can combine instructions and for which at least one subarchitecture has a scheduling model. As of now, it is only known to definitely affect AArch64 subarchitectures with a scheduling model.
Regression tested on AMD64/GNU-Linux; the new test case was tested to fail on an unpatched compiler and pass on a patched compiler.
Patch by Abe Skolnik and Sebastian Pop.
llvm-svn: 289399
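The idea in the commit title can be illustrated with a minimal, self-contained sketch. It is not the code from the patch: `Latencies`, `rootOnlyLatency`, and `summedLatency` are hypothetical names, the cycle counts are made up, and the "too simple" estimate is assumed here to look at a single instruction only. The message does not say which of the two alternatives was affected, so the sketch applies to any sequence.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical per-instruction latencies (in cycles) of one code sequence.
using Latencies = std::vector<unsigned>;

// Assumed "too simple" estimate: only the last (root) instruction counts.
unsigned rootOnlyLatency(const Latencies &Seq) {
  assert(!Seq.empty() && "sequence must contain at least one instruction");
  return Seq.back();
}

// Estimate in the spirit of the commit title: add up every latency.
// This treats the instructions as one dependent chain.
unsigned summedLatency(const Latencies &Seq) {
  return std::accumulate(Seq.begin(), Seq.end(), 0u);
}
```

For a three-instruction chain with latencies 2, 3, and 4, the root-only estimate sees 4 cycles while the summed estimate sees 9, which can change which alternative looks cheaper.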
2016-10-01 | Use StringRef in Pass/PassManager APIs (NFC) | Mehdi Amini | 1 | -1/+1
llvm-svn: 283004
2016-04-24 | [MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098) | Gerolf Hoflehner | 1 | -1/+11
The original patch caused crashes because it could dereference a null pointer for SelectionDAGTargetInfo on targets that do not define it.
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are:
- DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is to always combine in loops. The target decides whether the machine combiner should optimize for throughput or latency.
- Support for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math
llvm-svn: 267328
2016-04-22 | Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64 | Daniel Sanders | 1 | -11/+1
It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others.
llvm-svn: 267127
2016-04-22 | [MachineCombiner] Support for floating-point FMA on ARM64 | Gerolf Hoflehner | 1 | -1/+11
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are:
- DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is to always combine in loops. The target decides whether the machine combiner should optimize for throughput or latency.
- Support for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math
llvm-svn: 267098
2016-04-18 | [NFC] Header cleanup | Mehdi Amini | 1 | -2/+1
Removed some unused headers, replaced some headers with forward class declarations.
Found using simple scripts like this one:
    clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
Patch by Eugene Kosov <claprix@yandex.ru>
Differential Revision: http://reviews.llvm.org/D19219
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
2016-02-27 | Minor code cleanup. NFC. | Junmo Park | 1 | -1/+1
llvm-svn: 262096
2016-02-22 | Reapply "CodeGen: Use references in MachineTraceMetrics::Trace, NFC" | Duncan P. N. Exon Smith | 1 | -4/+4
This reverts commit r261510, effectively reapplying r261509. The original commit missed a caller in AArch64ConditionalCompares.
Original commit message:
Pass non-null arguments by reference in MachineTraceMetrics::Trace, simplifying future work to remove implicit iterator => pointer conversions.
llvm-svn: 261511
2016-02-22 | Revert "CodeGen: Use references in MachineTraceMetrics::Trace, NFC" | Duncan P. N. Exon Smith | 1 | -4/+4
This reverts commit r261509. I'm not sure how this compiled locally, but something was out of whack.
llvm-svn: 261510
2016-02-22 | CodeGen: Use references in MachineTraceMetrics::Trace, NFC | Duncan P. N. Exon Smith | 1 | -4/+4
Pass non-null arguments by reference in MachineTraceMetrics::Trace, simplifying future work to remove implicit iterator => pointer conversions.
llvm-svn: 261509
2015-11-10 | less indent; NFCI | Sanjay Patel | 1 | -46/+47
llvm-svn: 252643
2015-11-10 | add 'MustReduceDepth' as an objective/cost-metric for the MachineCombiner | Sanjay Patel | 1 | -29/+53
This is one of the problems noted in PR25016:
https://llvm.org/bugs/show_bug.cgi?id=25016
and:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html
The spilling problem is independent and not addressed by this patch.
The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. This is caused by inclusion of the "slack" factor when calculating the critical path of the original code sequence. If we don't add that, then we have a more conservative cost comparison of the old code sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64 MULADD patterns because benchmark regressions were observed without that.
The two failing test cases now have identical asm that does what we want:
    a + b + c + d ---> (a + b) + (c + d)
Differential Revision: http://reviews.llvm.org/D13417
llvm-svn: 252616
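The difference between the liberal and the conservative comparison can be sketched as follows. This is a simplified illustration, not the code from the patch: `PathEstimate` and `improvesCriticalPath` are hypothetical names, and the conservative branch is assumed (from the name 'MustReduceDepth') to require a strictly smaller depth.

```cpp
#include <cassert>

// Hypothetical cycle estimates for the root instruction of a sequence.
struct PathEstimate {
  unsigned Depth;   // cycles before the root instruction can issue
  unsigned Latency; // cycles the root instruction itself takes
};

// Returns true if the new sequence should replace the old one.
// OldSlack is the number of cycles the old root could be delayed without
// lengthening the critical path.
constexpr bool improvesCriticalPath(PathEstimate Old, PathEstimate New,
                                    unsigned OldSlack, bool MustReduceDepth) {
  // Conservative objective (assumed): the new root must issue earlier.
  // Liberal objective: the slack is credited to the old sequence, so a new
  // sequence that is no better, or even worse, can still be accepted.
  return MustReduceDepth
             ? New.Depth < Old.Depth
             : New.Depth + New.Latency <= Old.Depth + Old.Latency + OldSlack;
}
```

With an old root at depth 4, latency 2, and slack 3, a new root at depth 6 and latency 2 passes the liberal check (8 <= 9) even though it issues later, while the conservative check rejects it.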
2015-11-05 | replace MachineCombinerPattern namespace and enum with enum class; NFCI | Sanjay Patel | 1 | -1/+1
Also, remove an enum hack where enum values were used as indexes into an array. We may want to make this a real class to allow pattern-based queries/customization (D13417).
llvm-svn: 252196
2015-10-06 | Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups. | Hans Wennborg | 1 | -4/+3
Patch by Eugene Zelenko!
Differential Revision: http://reviews.llvm.org/D13321
llvm-svn: 249482
2015-10-03 | include equal sign in debug equations; NFC | Sanjay Patel | 1 | -2/+2
llvm-svn: 249248
2015-08-11 | fix minsize detection: minsize attribute implies optimizing for size | Sanjay Patel | 1 | -3/+1
llvm-svn: 244604
2015-08-05 | [MachineCombiner] Don't use the opcode-only form of computeInstrLatency | Hal Finkel | 1 | -1/+1
In r242277, I updated the MachineCombiner to work with itineraries, but I missed a call that is scheduling-model-only (the opcode-only form of computeInstrLatency). Using the form that takes an MI* allows this to work with itineraries (and should be NFC for subtargets with scheduling models).
llvm-svn: 244020
2015-08-04 | wrap OptSize and MinSize attributes for easier and consistent access (NFCI) | Sanjay Patel | 1 | -0/+1
Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os).
Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests.
This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call.
Differential Revision: http://reviews.llvm.org/D11734
llvm-svn: 243994
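A minimal sketch of the "or'ing" that such wrappers are meant to hide. `FunctionAttrs` is a hypothetical stand-in, not the LLVM `Function` class, and the method names `optForMinSize` and `optForSize` are assumptions made for illustration; the commit message does not name the wrappers.

```cpp
#include <cassert>

// Hypothetical stand-in for a function's size-related attributes.
struct FunctionAttrs {
  bool OptimizeForSize = false; // what a front end sets for -Os
  bool MinSize = false;         // what a front end sets for -Oz

  // Wrapper for the stricter attribute alone.
  bool optForMinSize() const { return MinSize; }

  // Wrapper that hides the "or": MinSize implies optimizing for size,
  // even if the front end did not also set OptimizeForSize.
  bool optForSize() const { return OptimizeForSize || optForMinSize(); }
};

// Builds the attribute set for -Oz alone and checks that the wrapper still
// reports "optimize for size".
FunctionAttrs makeMinSizeOnly() {
  FunctionAttrs F;
  F.MinSize = true;
  assert(F.optForSize() && "MinSize should imply optimizing for size");
  return F;
}
```

Callers that query `optForSize()` no longer depend on the front end setting both attributes together.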
2015-07-15 | [MachineCombiner] Work with itineraries | Hal Finkel | 1 | -4/+9
MachineCombiner predicated its use of scheduling-based metrics on hasInstrSchedModel(), but useful conclusions can be drawn from pipeline itineraries as well. Almost all of the logic (except for resource tracking in preservesResourceLen) can be used if we have an itinerary, so enable it in that case as well.
This will be used by the PowerPC backend in an upcoming commit.
llvm-svn: 242277
2015-06-23 | Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) | Alexander Kornienko | 1 | -1/+1
Apparently, the style needs to be agreed upon first.
llvm-svn: 240390
2015-06-23 | [x86] generalize reassociation optimization in machine combiner to 2 instructions | Sanjay Patel | 1 | -18/+31
Currently ( D10321, http://reviews.llvm.org/rL239486 ), we can use the machine combiner pass to reassociate the following sequence to reduce the critical path:
    A = ? op ?
    B = A op X
    C = B op Y
    -->
    A = ? op ?
    B = X op Y
    C = A op B
'op' is currently limited to x86 AVX scalar FP adds (with fast-math on), but in theory, it could be any associative math/logic op (see TODO in code comment).
This patch generalizes the pattern match to ignore the instruction that defines 'A'. So instead of a sequence of 3 adds, we now only need to find 2 dependent adds and decide if it's worth reassociating them.
This generalization has a compile-time cost because we can now match more instruction sequences and we rely more heavily on the machine combiner to discard sequences where reassociation doesn't improve the critical path.
For example, in the new test case:
    A = M div N
    B = A add X
    C = B add Y
We'll match 2 reassociation patterns, but this transform doesn't reduce the critical path:
    A = M div N
    B = A add Y
    C = B add X
We need the combiner to reject that pattern but select this:
    A = M div N
    B = X add Y
    C = B add A
Differential Revision: http://reviews.llvm.org/D10460
llvm-svn: 240361
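The div/add example can be worked through with a small sketch that computes when C becomes available in each of the three sequences. The latencies (20 cycles for div, 3 for add) are made-up numbers, all inputs M, N, X, Y are assumed ready at cycle 0, and `readyAt` and the three depth functions are hypothetical helpers, not combiner code.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical latencies in cycles.
constexpr unsigned DivLatency = 20;
constexpr unsigned AddLatency = 3;

// Cycle at which a two-operand instruction's result is ready, given the
// cycle at which each operand is ready and the instruction's latency.
unsigned readyAt(unsigned LhsReady, unsigned RhsReady, unsigned Latency) {
  assert(Latency > 0 && "latencies are assumed to be nonzero");
  return std::max(LhsReady, RhsReady) + Latency;
}

// Original: A = M div N; B = A add X; C = B add Y
unsigned originalDepth() {
  unsigned A = readyAt(0, 0, DivLatency);
  unsigned B = readyAt(A, 0, AddLatency);
  return readyAt(B, 0, AddLatency);
}

// Match to reject: A = M div N; B = A add Y; C = B add X
unsigned swappedDepth() {
  unsigned A = readyAt(0, 0, DivLatency);
  unsigned B = readyAt(A, 0, AddLatency);
  return readyAt(B, 0, AddLatency);
}

// Match to select: A = M div N; B = X add Y; C = B add A
unsigned reassociatedDepth() {
  unsigned A = readyAt(0, 0, DivLatency);
  unsigned B = readyAt(0, 0, AddLatency); // independent of the slow div
  return readyAt(B, A, AddLatency);
}
```

Under these assumptions the original and the swapped sequence both finish at cycle 26, while the reassociated one finishes at cycle 23 because only one add waits for the div.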
2015-06-19 | name change: hasPattern() -> getMachineCombinerPatterns() ; NFC | Sanjay Patel | 1 | -5/+5
This was suggested as part of D10460, but it's independent of any functional change.
llvm-svn: 240192
2015-06-19 | Fixed/added namespace ending comments using clang-tidy. NFC | Alexander Kornienko | 1 | -1/+1
The patch is generated using this command:
    tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
      -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
      llvm/lib/
Thanks to Eugene Kosov for the original patch!
llvm-svn: 240137
2015-06-13 | hoist loop-invariant; NFCI | Sanjay Patel | 1 | -3/+2
llvm-svn: 239681
2015-06-13 | remove unnecessary casts; NFCI | Sanjay Patel | 1 | -3/+2
llvm-svn: 239678
2015-06-10 | punctuation policing; NFC | Sanjay Patel | 1 | -5/+5
llvm-svn: 239484
2015-06-10 | fix typo in comment; NFC | Sanjay Patel | 1 | -1/+1
llvm-svn: 239478
2015-05-21 | fix typo in comment; NFC | Sanjay Patel | 1 | -1/+1
llvm-svn: 237962
2015-05-21 | use range-based for-loops; NFCI | Sanjay Patel | 1 | -4/+2
llvm-svn: 237918
2015-02-14 | CodeGen: Canonicalize access to function attributes, NFC | Duncan P. N. Exon Smith | 1 | -2/+1
Canonicalize access to function attributes to use the simpler API.
    getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
    => getFnAttribute(Kind)
    getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
    => hasFnAttribute(Kind)
Also, add `Function::getFnStackAlignment()`, and canonicalize:
    getAttributes().getStackAlignment(AttributeSet::FunctionIndex)
    => getFnStackAlignment()
llvm-svn: 229208
2015-01-27 | remove function names from comments; NFC | Sanjay Patel | 1 | -8/+6
llvm-svn: 227256
2015-01-27 | fix typos; NFC | Sanjay Patel | 1 | -4/+4
llvm-svn: 227253
2015-01-27 | The subtarget is cached on the MachineFunction. Access it directly. | Eric Christopher | 1 | -2/+1
llvm-svn: 227173
2014-09-02 | Change MCSchedModel to be a struct of statically initialized data. | Pete Cooper | 1 | -3/+3
This removes static initializers from the backends which generate this data, and also makes this struct match the other Tablegen generated structs in behaviour.
Reviewed by Andy Trick and Chandler C
llvm-svn: 216919
2014-08-13 | [MachineCombiner] Removal of dangling DBG_VALUES after combining [20598] | Gerolf Hoflehner | 1 | -1/+1
This is a cleaner solution to the problem described in r215431. When instructions are combined, a dangling DBG_VALUE is removed. This resolves bug 20598.
llvm-svn: 215587
2014-08-07 | MachineCombiner Pass for selecting faster instruction sequence on AArch64 | Gerolf Hoflehner | 1 | -1/+3
Re-commit of r214832,r21469 with a work-around that avoids the previous problem with gcc build compilers.
The work-around is to use SmallVector instead of ArrayRef of basic blocks in preservesResourceLen()/MachineCombiner.cpp
llvm-svn: 215151
2014-08-04 | Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. | Eric Christopher | 1 | -2/+2
No functional change.
llvm-svn: 214781
2014-08-03 | CodeGen: silence a warning | Saleem Abdulrasool | 1 | -2/+1
GCC 4.8.2 objects to the tautological condition in the assert as the unsigned value is guaranteed to be >= 0. Simplify the assertion by dropping the tautological condition.
llvm-svn: 214671
2014-08-03 | MachineCombiner Pass for selecting faster instruction sequence - target independent framework | Gerolf Hoflehner | 1 | -0/+434
When the DAGCombiner selects instruction sequences, it could increase the critical path or resource length.
For example, on arm64 there are multiply-accumulate instructions (madd, msub). If, e.g., the equivalent multiply-add sequence is not on the critical path, it makes sense to select it instead of the combined, single accumulate instruction (madd/msub). The reason is that the conversion from add+mul to the madd could lengthen the critical path by the latency of the multiply. But the DAGCombiner would always combine and select the madd/msub instruction.
This patch uses machine trace metrics to estimate the critical path length and resource length of an original instruction sequence vs. a combined instruction sequence and picks the faster code based on its estimates.
This patch only commits the target-independent framework that evaluates and selects code sequences. The machine instruction combiner is turned off for all targets and is expected to evolve over time by gradually handling DAGCombiner patterns in the target-specific code.
This framework lays the groundwork for fixing rdar://16319955
llvm-svn: 214666
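The madd trade-off described above can be sketched with a toy timing model. Everything here is an assumption made for illustration: the names `OpLatencies`, `separateReady`, `combinedReady`, and `preferCombined` are hypothetical, the latencies are invented, the combined instruction is assumed to need all three operands before it can issue, and resource length is ignored entirely.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical latencies in cycles for mul, add, and the combined madd.
struct OpLatencies {
  unsigned Mul, Add, MAdd;
};

// Sanity check shared by both estimates.
inline void checkLatencies(OpLatencies L) {
  assert(L.Mul > 0 && L.Add > 0 && L.MAdd > 0 &&
         "latencies are assumed to be nonzero");
}

// Separate mul + add: the multiply starts as soon as its operands are
// ready; the add waits for the product and the accumulator.
unsigned separateReady(unsigned MulOpsReady, unsigned AccReady, OpLatencies L) {
  checkLatencies(L);
  unsigned Product = MulOpsReady + L.Mul;
  return std::max(Product, AccReady) + L.Add;
}

// Combined madd: assumed to wait for all operands before issuing.
unsigned combinedReady(unsigned MulOpsReady, unsigned AccReady, OpLatencies L) {
  checkLatencies(L);
  return std::max(MulOpsReady, AccReady) + L.MAdd;
}

// Pick the combined form only if it does not finish later.
bool preferCombined(unsigned MulOpsReady, unsigned AccReady, OpLatencies L) {
  return combinedReady(MulOpsReady, AccReady, L) <=
         separateReady(MulOpsReady, AccReady, L);
}
```

With mul = 4, add = 1, madd = 4 and an accumulator that arrives at cycle 10, the separate sequence finishes at cycle 11 and the combined one at cycle 14, so the multiply's latency lands on the critical path. If the multiply operands are the late ones instead, the combined form wins.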