path: root/llvm/lib/CodeGen/MachineCombiner.cpp
Age  Commit message  Author  Files  Lines
2025-03-13[MachineCombiner][Targets] Use Register in TII genAlternativeCodeSequence ↵Craig Topper1-7/+7
interface. NFC (#131272)
2024-12-16[NFC] Remove some unnecessary semicolonsDavid Green1-2/+2
All inside LLVM_DEBUG, some of which have been cleaned up by adding block scopes to allow them to format more nicely.
2024-10-28Check hasOptSize() in shouldOptimizeForSize() (#112626)Ellis Hoag1-5/+1
2024-10-16[CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. ↵Christudasan Devadasan1-4/+4
(#108507)
2024-07-09[CodeGen][NewPM] Port `machine-loops` to new pass manager (#97793)paperchalice1-4/+4
- Add `MachineLoopAnalysis`.
- Add `MachineLoopPrinterPass`.
- Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.
2024-06-11[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis ↵paperchalice1-1/+1
result (#94571) Prepare for new pass manager version of `MachineDominatorTreeAnalysis`. We may need a machine dominator tree version of `DomTreeUpdater` to handle `SplitCriticalEdge` in some CodeGen passes.
2024-04-24[CodeGen] Make the parameter TRI required in some functions. (#85968)Xu Zhang1-6/+14
Fixes #82659. Some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, have too many default parameters. As a result, we have run into issues caused by a missing TRI argument, as shown in issue #82411. Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all call sites of these functions have been updated to ensure no additional impact. After this, callers of these functions must decide explicitly whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-04-11[MachineCombiner][NFC] Split target-dependent patternsPengcheng Wang1-46/+26
We split target-dependent MachineCombiner patterns into their target folder. This makes MachineCombiner much more target-independent. Reviewers: davemgreen, asavonic, rotateright, RKSimon, lukel97, LuoYuanke, topperc, mshockwave, asi-sc Reviewed By: topperc, mshockwave Pull Request: https://github.com/llvm/llvm-project/pull/87991
2024-03-14[MachineCombiner] Don't ignore PHI depths (#82025)Jonas Paulsson1-3/+0
The depths of the Root and the NewRoot are to be compared in MachineCombiner::improvesCriticalPathLen(), and while the call to BlockTrace.getInstrCycles(*Root) includes the Depth of a PHI, for some reason PHI nodes have been ignored in getOperandDef(). This patch removes the special handling of PHIs in getOperandDef() so that Root and NewRoot get a fair comparison. This does not affect loop headers as MachineTraceMetrics handles that case by ignoring incoming PHI edges.
2023-06-01[CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI.Jay Foad1-8/+4
Differential Revision: https://reviews.llvm.org/D151424
2023-05-05[Coverity] Big parameter passed by value.Luo, Yuanke1-2/+2
2023-04-29[X86] Fix the vnni machine combine issue.Luo, Yuanke1-21/+3
The previous patch (D148980) didn't set the InstrIdxForVirtReg correctly in genAlternativeDpCodeSequence(). It causes vnni lit test failure when LLVM_ENABLE_EXPENSIVE_CHECKS is on.
2023-04-27[X86] Machine combine vnni instruction.Luo, Yuanke1-5/+29
"vpmaddwd + vpaddd" can be combined to vpdpwssd and the latency is reduced after combination. However when vpdpwssd is in a critical path the combination get less ILP. It happens when vpdpwssd is in a loop, the vpmaddwd can be executed in parallel in multi-iterations while vpdpwssd has data dependency for each iterations. If vpaddd is in a critical path while vpmaddwd is not, it is profitable to split vpdpwssd into "vpmaddwd + vpaddd". This patch is based on the machine combiner framework to acheive decision on "vpmaddwd + vpaddd" combination. The typical example code is as below. ``` __m256i foo(int cnt, __m256i c, __m256i b, __m256i *p) { for (int i = 0; i < cnt; ++i) { __m256i a = p[i]; __m256i m = _mm256_madd_epi16 (b, a); c = _mm256_add_epi32(m, c); } return c; } ``` Differential Revision: https://reviews.llvm.org/D148980
2023-04-20Fix uninitialized class membersAkshay Khadse1-1/+1
Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148692
2023-04-17Fix uninitialized pointer members in CodeGenAkshay Khadse1-9/+9
This change initializes the TSI, LI, DT, PSI, and ORE pointer fields of the SelectOptimize class to nullptr. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148303
2023-02-17[MachineCombiner] Support local strategy for tracesAnton Sidorenko1-5/+10
For in-order cores MachineCombiner makes better decisions when the critical path is calculated only for the current basic block and does not take into account other blocks from the trace. This patch adds a virtual method to TargetInstrInfo to allow each target to decide which strategy to use. Depends on D140541 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D140542
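The hook described above can be pictured with a small, self-contained sketch. The class, enum, and method names below (`ToyInstrInfo`, `ToyStrategy`, `getCombinerTraceStrategy`) are hypothetical stand-ins chosen for illustration; they are not the names used in LLVM.

```cpp
// Hypothetical strategies: trace-wide critical path vs. current block only.
enum class ToyStrategy { MinInstrCount, Local };

// Hypothetical stand-in for a target hook interface.
struct ToyInstrInfo {
  virtual ~ToyInstrInfo() = default;
  // Default: keep the trace-wide behaviour.
  virtual ToyStrategy getCombinerTraceStrategy() const {
    return ToyStrategy::MinInstrCount;
  }
};

// An in-order target opts into the block-local strategy by overriding.
struct InOrderCoreInstrInfo : ToyInstrInfo {
  ToyStrategy getCombinerTraceStrategy() const override {
    return ToyStrategy::Local;
  }
};

// The pass only talks to the base interface.
ToyStrategy pickStrategy(const ToyInstrInfo &TII) {
  return TII.getCombinerTraceStrategy();
}
```

Keeping the default in the base class means targets that do not override the hook see no behaviour change.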
2023-02-16[MachineCombiner][NFC] Rename `MinInstr` to `TraceEnsemble`Anton Sidorenko1-22/+21
We are about to allow different trace strategies for MachineCombiner. Make the name of the ensemble strategy-neutral. Depends on D140540 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D140541
2023-02-14[MachineTraceMetrics][NFC] Move Strategy enum out of the classAnton Sidorenko1-1/+1
Make forward declaration possible to reduce amount of dependencies and reduce re-compilation burden caused by further patches. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D140539
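As a rough illustration of why moving an enum to namespace scope helps: a scoped enum at namespace scope can be declared opaquely, so headers that only pass the value around need not include its definition. The names `TraceStrategy`, `MinInstrCount`, `Local`, and `strategyIndex` are assumptions made for this sketch.

```cpp
#include <cassert>

// Opaque declaration: enough for headers that only pass the value around.
// A nested enum would instead require the enclosing class definition.
enum class TraceStrategy : int;

// An interface can be declared using only the opaque declaration.
int strategyIndex(TraceStrategy S);

// The full definition can live in a different header (here: further down).
enum class TraceStrategy : int { MinInstrCount = 0, Local = 1 };

int strategyIndex(TraceStrategy S) {
  assert((S == TraceStrategy::MinInstrCount || S == TraceStrategy::Local) &&
         "unknown strategy");
  return static_cast<int>(S);
}
```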
2023-01-20[MachineCombiner] Use default latency model when no detailed model availablePhilip Reames1-15/+0
This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost. Previously, we unconditionally applied the first legal pattern, without any consideration of profitability. As a result, this change both prevents some patterns from being applied and changes which patterns are exercised (i.e. previously the first pattern was applied; afterwards, the second one may be applied because the first wasn't profitable). The motivation here is twofold. First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler. Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain and simply move the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree. Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch. For the test diffs, I don't see any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes. Differential Revision: https://reviews.llvm.org/D141017
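To make the "serial chain versus flattened tree" point concrete, here is a minimal sketch that computes the critical path of a toy expression DAG with unit latencies, mirroring the "most instructions are latency 1" default described above. The `Node` layout and function names are illustrative assumptions, not the MachineCombiner data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One instruction in a toy expression DAG. Operand indices refer to earlier
// entries in the vector; -1 marks an input that is available at cycle 0.
struct Node {
  int Lhs;
  int Rhs;
  unsigned Latency;
};

// Returns the cycle at which the last node's result becomes available.
unsigned criticalPath(const std::vector<Node> &Nodes) {
  assert(!Nodes.empty() && "need at least one instruction");
  std::vector<unsigned> Ready(Nodes.size(), 0);
  for (std::size_t I = 0; I < Nodes.size(); ++I) {
    unsigned L = Nodes[I].Lhs < 0 ? 0 : Ready[Nodes[I].Lhs];
    unsigned R = Nodes[I].Rhs < 0 ? 0 : Ready[Nodes[I].Rhs];
    Ready[I] = std::max(L, R) + Nodes[I].Latency;
  }
  return Ready.back();
}

// ((a + b) + c) + d: every add waits for the previous one.
std::vector<Node> serialChain() {
  return {{-1, -1, 1}, {0, -1, 1}, {1, -1, 1}};
}

// (a + b) + (c + d): the first two adds are independent.
std::vector<Node> balancedTree() {
  return {{-1, -1, 1}, {-1, -1, 1}, {0, 1, 1}};
}
```

Both shapes use three adds; only the dependency structure differs, and with it the critical path.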
2023-01-13[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFCCraig Topper1-3/+3
Use isPhysical/isVirtual methods. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D141715
2023-01-05Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ partserge-sans-paille1-2/+2
Use deduction guides instead of helper functions. The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*)).
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase.
3. ADL doesn't seem to work the same for deduction guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as a no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef calls have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955
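A small sketch of the deduction-guide mechanism and of the brace-initialization workaround mentioned in item 2. `Span` and `Holder` are toy types invented for this example; they only imitate the roles of `ArrayRef` and `CVSymbol` and do not reproduce their real interfaces. C++17 or later is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy array view; not llvm::ArrayRef.
template <typename T> class Span {
  const T *Data = nullptr;
  std::size_t Length = 0;

public:
  Span(const T *Ptr, std::size_t N) : Data(Ptr), Length(N) {}
  Span(const std::vector<T> &V) : Data(V.data()), Length(V.size()) {}
  std::size_t size() const { return Length; }
  const T &operator[](std::size_t I) const {
    assert(I < Length && "index out of range");
    return Data[I];
  }
};

// Deduction guide: lets Span(Vec) deduce T without a helper function.
template <typename T> Span(const std::vector<T> &) -> Span<T>;

// Hypothetical consumer of the view.
struct Holder {
  Span<int> View;
  explicit Holder(Span<int> V) : View(V) {}
};

std::size_t viewSize(const std::vector<int> &Storage) {
  // Braces force an object definition. With parentheses, the statement
  // could be parsed as an (ill-formed) function declaration instead.
  Holder H{Span(Storage)};
  return H.View.size();
}
```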
2023-01-04[MachineCombine] Reorganize code for readability and tracing [nfc]Philip Reames1-26/+20
2022-11-17[MachineCombiner][RISCV] Add fmadd/fmsub/fnmsub instructions patternsAnton Sidorenko1-0/+4
This patch adds transformation of fmul+fadd/fsub chains to fused multiply instructions:
* fmul+fadd->fmadd
* fmul+fsub->fmsub/fnmsub
We will also try to combine these instructions if the fmul has more than one use and cannot be deleted. However, removing the dependence between fmul and fadd can still be profitable, and we rely on machine combiner approximations of scheduling. Differential Revision: https://reviews.llvm.org/D136764
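At the source level, the three fused forms can be sketched with `std::fma`. The function names are made up for this example, and the sign conventions in the comments follow the usual RISC-V definitions as I understand them (`fmsub` computes `a*b - c`, `fnmsub` computes `-(a*b) + c`); treat that mapping as an assumption rather than something the commit message states.

```cpp
#include <cmath>

// Unfused form: two dependent operations (multiply, then add).
double mulThenAdd(double A, double B, double C) {
  double Product = A * B; // fmul
  return Product + C;     // fadd, waits for the fmul
}

// fmul + fadd -> one fused operation with a single rounding step.
double fusedMulAdd(double A, double B, double C) { return std::fma(A, B, C); }

// fmul + fsub where the product is the minuend: A * B - C.
double fusedMulSub(double A, double B, double C) { return std::fma(A, B, -C); }

// fmul + fsub where the product is the subtrahend: C - A * B.
double fusedNegMulSub(double A, double B, double C) {
  return std::fma(-A, B, C);
}
```

The fused result can differ from the unfused one in the last bit because the product is not rounded separately; with exactly representable inputs such as `2.0 * 3.0 + 4.0` both forms agree.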
2022-10-13[NFC] Use forward decl of MachineCombinerPattern enum to reduce dependenciesAnton Sidorenko1-0/+1
Differential Revision: https://reviews.llvm.org/D135776
2022-07-17[CodeGen] Qualify auto variables in for loops (NFC)Kazu Hirata1-2/+2
2022-07-14[MachineCombiner] Don't compute the latency of transient instructionsGuozhi Wei1-3/+42
If an MI will not generate a target instruction, we should not compute its latency. We can then compute a more precise instruction sequence cost and get a better result. Differential Revision: https://reviews.llvm.org/D129615
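The idea can be sketched as a cost sum that skips instructions producing no target code. `ToyInstr`, its fields, and the plain summation are simplifying assumptions for illustration; the real pass works on dependency depths rather than a flat sum.

```cpp
#include <vector>

// Toy model of an instruction; the fields are illustrative only.
struct ToyInstr {
  unsigned Latency; // cycles, as a scheduling model might report them
  bool IsTransient; // true if no target instruction will be emitted
};

// Sums latencies over a sequence, ignoring transient instructions.
unsigned sequenceLatency(const std::vector<ToyInstr> &Seq) {
  unsigned Total = 0;
  for (const ToyInstr &I : Seq)
    if (!I.IsTransient)
      Total += I.Latency;
  return Total;
}
```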
2022-06-28[MachineCombiner, AArch64] Add a new pattern A-(B+C) => (A-B)-C to reduce ↵Guozhi Wei1-0/+2
latency Add a new pattern A - (B + C) ==> (A - B) - C to give machine combiner a chance to evaluate which instruction sequence has lower latency. Differential Revision: https://reviews.llvm.org/D124564
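Whether `(A - B) - C` beats `A - (B + C)` depends on which operand arrives last, which is why the choice is left to the combiner's latency evaluation. A minimal sketch, assuming every operation has the same latency and operands become ready at the given cycles (the function names are invented here):

```cpp
#include <algorithm>

// Cycle at which A - (B + C) is available, given operand ready times.
unsigned originalReady(unsigned A, unsigned B, unsigned C, unsigned Lat = 1) {
  unsigned Sum = std::max(B, C) + Lat; // B + C
  return std::max(A, Sum) + Lat;       // A - Sum
}

// Cycle at which (A - B) - C is available.
unsigned reassociatedReady(unsigned A, unsigned B, unsigned C,
                           unsigned Lat = 1) {
  unsigned Diff = std::max(A, B) + Lat; // A - B
  return std::max(Diff, C) + Lat;       // Diff - C
}
```

When `C` is the late operand the reassociated form finishes one cycle earlier; when `A` is the late operand the original form is better.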
2022-03-16Cleanup codegen includesserge-sans-paille1-1/+0
This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
2022-03-10Revert "Cleanup codegen includes"Nico Weber1-0/+1
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10Cleanup codegen includesserge-sans-paille1-1/+0
after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169
2021-12-08[NFC] Rename MachineFunction::deleteMachineInstr (coding style)Mircea Trofin1-1/+1
2021-12-05[GlobalISel] Allow DBG_VALUE to use undefined vregs before LiveDebugValues.Jack Andersen1-1/+1
Expanding on D109750. Since `DBG_VALUE` instructions have final register validity determined in `LDVImpl::handleDebugValue`, there is no apparent reason to immediately prune unused register operands as their defs are erased. Consequently, this renders `MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval` moot; gaining a substantial performance improvement. The only necessary changes involve making relevant passes consider invalid DBG_VALUE vregs uses as valid. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D112852
2021-01-24[PowerPC] support register pressure reduction in machine combiner.Chen Zheng1-0/+3
Reassociating some patterns to generate more fma instructions to reduce register pressure. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92071
2021-01-18Revert "[PowerPC] support register pressure reduction in machine combiner."Tres Popp1-3/+0
This reverts commit 26a396c4ef481cb159bba631982841736a125a9c. See https://reviews.llvm.org/D92071 for a description of the issue.
2021-01-17[PowerPC] support register pressure reduction in machine combiner.Chen Zheng1-0/+3
Reassociating some patterns to generate more fma instructions to reduce register pressure. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92071
2020-12-14[MachineCombiner][NFC] Add MustReduceRegisterPressure goalChen Zheng1-6/+63
Add a new goal, MustReduceRegisterPressure, for the machine combiner pass. PowerPC will use this new goal to do some register-pressure-related optimization. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D92068
2020-06-15[PowerPC] fma chain break to expose more ILPChen Zheng1-0/+2
This patch tries to reassociate two patterns related to FMA to expose more ILP on PowerPC.
// Pattern 1:
//   A = FADD X, Y          (Leaf)
//   B = FMA A, M21, M22    (Prev)
//   C = FMA B, M31, M32    (Root)
// -->
//   A = FMA X, M21, M22
//   B = FMA Y, M31, M32
//   C = FADD A, B
// Pattern 2:
//   A = FMA X, M11, M12    (Leaf)
//   B = FMA A, M21, M22    (Prev)
//   C = FMA B, M31, M32    (Root)
// -->
//   A = FMUL M11, M12
//   B = FMA X, M21, M22
//   D = FMA A, M31, M32
//   C = FADD B, D
Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D80175
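Pattern 1 can be checked numerically with a short sketch. The operand order `FMA(Addend, M1, M2) = Addend + M1 * M2` is an assumption inferred from the rewrite shown in the message, and `fmaToy` deliberately uses plain arithmetic rather than a fused operation. In general, floating-point reassociation can change rounding, so the equality below is only exact for inputs like these small integers.

```cpp
// Assumed operand order: FMA(Addend, M1, M2) = Addend + M1 * M2.
double fmaToy(double Addend, double M1, double M2) { return Addend + M1 * M2; }

// Original chain: three operations, each depending on the previous one.
double pattern1Before(double X, double Y, double M21, double M22, double M31,
                      double M32) {
  double A = X + Y;               // Leaf
  double B = fmaToy(A, M21, M22); // Prev
  return fmaToy(B, M31, M32);     // Root
}

// Rewritten form: A and B are independent, only the final add waits.
double pattern1After(double X, double Y, double M21, double M22, double M31,
                     double M32) {
  double A = fmaToy(X, M21, M22);
  double B = fmaToy(Y, M31, M32);
  return A + B;
}
```

The dependency depth drops from three operations to two, at the same operation count.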
2020-05-31[MachineCombine] add a hook for resource length limitChen Zheng1-2/+4
2019-12-09[PGO][PGSO] Instrument the code gen / target passes.Hiroshi Yamauchi1-4/+19
Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after D71072 was reverted. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149
2019-12-06Revert "[PGO][PGSO] Instrument the code gen / target passes."Hiroshi Yamauchi1-19/+4
This reverts commit 9a0b5e14075a1f42a72eedb66fd4fde7985d37ac. This seems to break buildbots.
2019-12-06[PGO][PGSO] Instrument the code gen / target passes.Hiroshi Yamauchi1-4/+19
Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71072
2019-11-13Sink all InitializePasses.h includesReid Kleckner1-0/+1
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it causes lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout:
recompiles  touches  affected_files  header
342380      95       3604            llvm/include/llvm/ADT/STLExtras.h
314730      234      1345            llvm/include/llvm/InitializePasses.h
307036      118      2602            llvm/include/llvm/ADT/APInt.h
213049      59       3611            llvm/include/llvm/Support/MathExtras.h
170422      47       3626            llvm/include/llvm/Support/Compiler.h
162225      45       3605            llvm/include/llvm/ADT/Optional.h
158319      63       2513            llvm/include/llvm/ADT/Triple.h
140322      39       3598            llvm/include/llvm/ADT/StringRef.h
137647      59       2333            llvm/include/llvm/Support/Error.h
131619      73       1803            llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211
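The sort key in the table is the product of the two middle columns. A tiny compile-time check of a few rows, with `recompiles` as a helper name invented for this sketch:

```cpp
// recompiles = touches * affected_files, as the message describes.
constexpr unsigned long recompiles(unsigned long Touches,
                                   unsigned long AffectedFiles) {
  return Touches * AffectedFiles;
}

static_assert(recompiles(95, 3604) == 342380, "STLExtras.h row");
static_assert(recompiles(234, 1345) == 314730, "InitializePasses.h row");
static_assert(recompiles(59, 3611) == 213049, "MathExtras.h row");
```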
2019-08-01Finish moving TargetRegisterInfo::isVirtualRegister() and friends to ↵Daniel Sanders1-3/+3
llvm::Register as started by r367614. NFC llvm-svn: 367633
2019-06-02[X86] Fix several places that weren't passing what they thought they were to ↵Craig Topper1-2/+4
MachineInstr::print Over a year ago, MachineInstr gained a fourth boolean parameter that occurs before the TII pointer. When this happened, several places started accidentally passing TII into this boolean parameter instead of the TII parameter. llvm-svn: 362312
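This class of bug compiles silently because a pointer converts implicitly to `bool`. The sketch below uses a toy `print` whose parameter list is modelled on the description above (four booleans followed by a pointer); `ToyTII`, `PrintArgs`, and the parameter names are stand-ins for illustration, not the real `MachineInstr::print` declaration.

```cpp
struct ToyTII {};

// Records which arguments a call actually bound.
struct PrintArgs {
  bool AddNewLine;
  bool HasTII;
};

// Toy signature: a newer fourth boolean sits in front of the pointer.
PrintArgs print(bool IsStandalone = true, bool SkipOpers = false,
                bool SkipDebugLoc = false, bool AddNewLine = true,
                const ToyTII *TII = nullptr) {
  (void)IsStandalone;
  (void)SkipOpers;
  (void)SkipDebugLoc;
  return {AddNewLine, TII != nullptr};
}

// Call written for the older three-boolean form: the pointer converts to
// bool and lands in AddNewLine, while TII keeps its nullptr default.
PrintArgs staleCall(const ToyTII *TII) {
  return print(true, false, false, TII);
}

// Corrected call: the fourth boolean is spelled out before the pointer.
PrintArgs fixedCall(const ToyTII *TII) {
  return print(true, false, false, true, TII);
}
```

Adding a parameter in the middle of a defaulted parameter list is what makes the stale call keep compiling; some compilers can warn about the pointer-to-bool conversion, but it is not an error.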
2019-04-04[IR] Refactor attribute methods in Function class (NFC)Evandro Menezes1-1/+1
Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731
2019-02-04[AsmPrinter] Remove hidden flag -print-schedule.Andrea Di Biagio1-7/+4
This patch removes hidden codegen flag -print-schedule effectively reverting the logic originally committed as r300311 (https://llvm.org/viewvc/llvm-project?view=revision&revision=300311). Flag -print-schedule was originally introduced by r300311 to address PR32216 (https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better testing of schedule model instruction latencies/throughputs". These days, we can use llvm-mca to test scheduling models. So there is no longer a need for flag -print-schedule in LLVM. The main use case for PR32216 is now addressed by llvm-mca. Flag -print-schedule is mainly used for debugging purposes, and it is only actually used by x86 specific tests. We already have extensive (latency and throughput) tests under "test/tools/llvm-mca" for X86 processor models. That means, most (if not all) existing -print-schedule tests for X86 are redundant. When flag -print-schedule was first added to LLVM, several files had to be modified; a few APIs gained new arguments (see for example method MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained a couple of getSchedInfoStr() methods. Method getSchedInfoStr() had to originally work for both MCInst and MachineInstr. The original implementation of getSchedInfoStr() introduced a subtle layering violation (reported as PR37160 and then fixed/worked-around by r330615). In retrospect, that new API could have been designed better. We can always query MCSchedModel to get the latency and throughput. More importantly, the "sched-info" string should not have been generated by the subtarget. Note, r317782 fixed an issue where "print-schedule" didn't work very well in the presence of inline assembly. That commit is also reverted by this change. Differential Revision: https://reviews.llvm.org/D57244 llvm-svn: 353043
2019-01-19Update the file headers across all of the LLVM projects in the monorepoChandler Carruth1-4/+3
to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
2019-01-10[MachineCombiner][NFC] Prevent dereferencing past-the-end object in an MRI ↵Gerolf Hoflehner1-0/+2
container llvm-svn: 350896
2018-05-14Rename DEBUG macro to LLVM_DEBUG.Nicola Zaghen1-24/+27
The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually change DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
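The effect of the sed expression can be reproduced with `std::regex`. This is a sketch using the ECMAScript spelling of the same pattern (`\s?` instead of sed's `\s\?`, and an escaped parenthesis); `renameDebugMacro` is a name made up for the example.

```cpp
#include <regex>
#include <string>

// Rewrites DEBUG( or DEBUG ( into LLVM_DEBUG( on one line of text.
std::string renameDebugMacro(const std::string &Line) {
  // Word boundary, DEBUG, at most one whitespace character, then '('.
  static const std::regex Pattern(R"re(\bDEBUG\s?\()re");
  return std::regex_replace(Line, Pattern, "LLVM_DEBUG(");
}
```

Because `_` and letters are word characters, the `\b` anchor keeps `LLVM_DEBUG(` and `NDEBUG(` from matching, so running the rename twice does not change the text again.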
2018-04-08[TargetSchedule] shrink interface for init(); NFCISanjay Patel1-1/+1
The TargetSchedModel is always initialized using the TargetSubtargetInfo's MCSchedModel and TargetInstrInfo, so we don't need to extract those and pass 3 parameters to init(). Differential Revision: https://reviews.llvm.org/D44789 llvm-svn: 329540