path: root/llvm/lib/CodeGen/MachineCopyPropagation.cpp
Age | Commit message | Author | Files | Lines
2024-10-23 | [MCP] Optimize copies when src is used during backward propagation (#111130) | Vladimir Radosavljevic | 1 | -2/+77
Before this patch, a redundant COPY couldn't be removed in the following case:
```
$R0 = OP ...
... // Read of $R0
$R1 = COPY killed $R0
```
This patch adds support for tracking the users of the source register during backward propagation, so that we can remove the redundant COPY in the above case and optimize it to:
```
$R1 = OP ...
... // Replace all uses of $R0 with $R1
```
2024-10-10 | [MCP] Skip invalidating def constant regs during forward propagation (#111129) | Vladimir Radosavljevic | 1 | -2/+5
Before this patch, a redundant COPY couldn't be removed in the following case:
```
%reg1 = COPY %const-reg
... // There is a def of %const-reg
%reg2 = COPY killed %reg1
```
where this can be optimized to:
```
... // There is a def of %const-reg
%reg2 = COPY %const-reg
```
This patch allows for such optimization by not invalidating defined constant registers. This is safe, as architectures like AArch64 and RISCV replace a dead definition of a GPR with a zero constant register for certain instructions.
2024-09-13 | [CodeGen] Use DenseMap::operator[] (NFC) (#108489) | Kazu Hirata | 1 | -4/+4
Once we modernize CopyInfo with default member initializations, Copies.insert({Unit, ...}) becomes equivalent to: Copies.try_emplace(Unit) which we can simplify further down to Copies[Unit].
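The equivalence described in this message can be sketched outside LLVM. The example below is a minimal illustration that uses `std::unordered_map` (C++17) as a stand-in for `DenseMap`; the `CopyInfo` fields shown here and the three helper names are hypothetical and chosen only for the demonstration.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for the pass's per-unit record. The default member
// initializers make a default-constructed value well defined.
struct CopyInfo {
  int CopyId = -1;    // -1 means "no copy recorded" in this toy model
  bool Avail = false;
};

using CopyMap = std::unordered_map<unsigned, CopyInfo>;

// Three spellings of "return the entry for Unit, creating a default one if
// it is absent". None of them overwrites an existing entry.
CopyInfo &viaInsert(CopyMap &Copies, unsigned Unit) {
  return Copies.insert({Unit, CopyInfo{}}).first->second;
}

CopyInfo &viaTryEmplace(CopyMap &Copies, unsigned Unit) {
  return Copies.try_emplace(Unit).first->second;
}

CopyInfo &viaSubscript(CopyMap &Copies, unsigned Unit) {
  return Copies[Unit];
}
```

All three helpers leave an existing entry untouched and value-initialize a missing one, which is why the shortest spelling can replace the other two once the record type has sensible defaults.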
2024-08-28 | [RISCV][MCP] Remove redundant move from tail duplication (#89865) | Piyou Chen | 1 | -1/+1
Tail duplication generates a redundant move before return, because MachineCopyPropagation can't recognize a COPY after post-RA pseudo expansion. This patch makes MachineCopyPropagation recognize `%0 = ADDI %1, 0` as a COPY.
2024-07-11 | [MCP] Use MCRegUnit as the key type of CopyTracker::Copies map. NFC. (#98277) | Kai Luo | 1 | -3/+4
`CopyTracker` is in fact tracking at RegUnit level, not MCRegister.
2024-05-30 | [MCP] Remove unused TII argument. NFC | David Green | 1 | -3/+2
Last used in e35fbf5c04f4719db8ff7c7a993cbf96bb706903.
2024-04-24 | [CodeGen] Make the parameter TRI required in some functions. (#85968) | Xu Zhang | 1 | -1/+1
Fixes #82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.
Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.
After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-03-28 | [MCP] Remove dead copies from basic blocks with successors. (#86973) | Craig Topper | 1 | -4/+28
Previously we wouldn't remove dead copies from basic blocks with successors. The comment said we didn't want to trust the live-in lists. The comment is very old so I'm not sure if that's still a concern today. This patch checks the live-in lists and removes copies from MaybeDeadCopies if they are referenced by any live-ins in any successors. We only do this if the tracksLiveness property is set. If that property is not set, we retain the old behavior.
2024-03-28 | [MCP] Use MachineInstr::all_defs instead of MachineInstr::defs in hasOverlappingMultipleDef. (#86889) | Craig Topper | 1 | -1/+1
defs does not return the defs for inline assembly. We need to use all_defs to find them. Fixes #86880.
2024-01-23 | [MachineCopyPropagation] Make a SmallVector larger (NFC) (#79106) | Kazu Hirata | 1 | -1/+1
This patch makes a SmallVector slightly larger. We encounter quite a few instructions with 3 or 4 defs but very few beyond that on X86. This saves 0.39% of heap allocations during the compilation of a large preprocessed file, namely X86ISelLowering.cpp, for the X86 target.
2023-12-26 | [MCP] Enhance MCP copy Instruction removal for special case (reapply) (#74239) | Vettel | 1 | -2/+40
Machine Copy Propagation Pass may lose some opportunities to further remove the redundant copy instructions during the ForwardCopyPropagateBlock procedure. When we Clobber a "Def" register, we also need to remove the record from the copy maps that indicates "Src" defined "Def" to ensure the correct semantics of the ClobberRegister function. This patch reapplies #70778 and addresses the corner case bug #73512 specific to the AMDGPU backend. Additionally, it refines the criteria for removing empty records from the copy maps, thereby enhancing overall safety. For more information, please see the C++ test case generated code in "vector.body" after the MCP Pass: https://gcc.godbolt.org/z/nK4oMaWv5.
2023-11-27 | Revert "[MCP] Enhance MCP copy Instruction removal for special case (#70778)" | Bjorn Pettersson | 1 | -38/+3
This reverts commit cae46f6210293ba4d3568eb21b935d438934290d. Reverted due to miscompiles. See https://github.com/llvm/llvm-project/issues/73512
2023-11-22 | [MCP] Enhance MCP copy Instruction removal for special case (#70778) | Vettel | 1 | -3/+38
Machine Copy Propagation Pass may lose some opportunities to further remove the redundant copy instructions during the ForwardCopyPropagateBlock procedure. When we Clobber a "Def" register, we also need to remove the record from the copy maps that indicates "Src" defined "Def" to ensure the correct semantics of the ClobberRegister function. For more information, please see the C++ test case generated code in "vector.body" after the MCP Pass: https://gcc.godbolt.org/z/nK4oMaWv5.
2023-09-22 | Use llvm::drop_begin and llvm::drop_end (NFC) | Kazu Hirata | 1 | -2/+2
2023-08-11 | [MCP] Invalidate copy for super register in copy source | Jeffrey Byrnes | 1 | -17/+20
We must also track the super sources of a copy, otherwise we introduce a sort of subtle bug. Consider:
1. DEF r0:r1
2. USE r1
3. r6:r9 = COPY r10:r13
4. r14:r15 = COPY r0:r1
5. USE r6
6. r1:r4 = COPY r6:r9
BackwardCopyPropagateBlock processes the instructions from bottom up. After processing 6., we will have a propagatable copy for r1-r4 and r6-r9. After 5., we invalidate and erase the propagatable copy for r1-r4 and r6 but not for r7-r9.
The issue is that when processing 3., the data structures still say we have valid copies for dest regs r7-r9 (from 6.). The corresponding defs for these registers in 6. are r1:r4, which we mark as registers to invalidate. When invalidating, we find the copy that corresponds to r1 is 4. (this was added when processing 4.), and we say that r1 now maps to unpropagatable copies. Thus, when we process 2., we do not have a valid copy, but when we process 1. we do -- because the mapped copy for subregister r0 was never invalidated.
The net result is to propagate the copy from 4. to 1., and replace DEF r0:r1 with DEF r14:r15. Then, we have a use before def in 2.
The main issue is that we have an inconsistent state between which def regs and which src regs are valid. When processing 5., we mark all the defs in 6. as invalid, but only the subreg use as invalid. Either we must only invalidate the individual subreg for both uses and defs, or the super register for both.
Differential Revision: https://reviews.llvm.org//D157564
Change-Id: I99d5e0b1a0d735e8ea3bd7d137b6464690aa9486
2023-06-29 | [MCP] Optimize copies from undef | pvanhout | 1 | -2/+7
Revert D152502 and instead optimize away copy from undefs, but clear the undef flag on the original copy. Apparently, not optimizing the COPY can cause performance issues in some cases. Fixes SWDEV-405813, SWDEV-405899 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D153838
2023-06-16 | [MC] Use regunits instead of MCRegUnitIterator. NFC. | Jay Foad | 1 | -10/+10
Differential Revision: https://reviews.llvm.org/D153122
2023-06-16 | [MC] Add MCRegisterInfo::regunits for iteration over register units | Sergei Barannikov | 1 | -17/+16
Reviewed By: foad Differential Revision: https://reviews.llvm.org/D152098
2023-06-09 | [MCP] Do not remove redundant copy for COPY from undef | pvanhout | 1 | -1/+2
I don't think we can safely remove the second COPY as redundant in such cases. The first COPY (which has undef src) may be lowered to a KILL instruction instead, resulting in no COPY being emitted at all.
Testcase is X86 so it's in the same place as other testcases for this function, but this was initially spotted on AMDGPU with the following:
```
renamable $vgpr24 = PRED_COPY undef renamable $vgpr25, implicit $exec
renamable $vgpr24 = PRED_COPY killed renamable $vgpr25, implicit $exec
```
The second COPY was removed as redundant, and the first one was lowered to a KILL (= removed too), causing $vgpr24 to not have $vgpr25's value.
Fixes SWDEV-401507
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D152502
2023-04-20 | Fix uninitialized class members | Akshay Khadse | 1 | -1/+1
Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148692
2023-04-17 | Fix uninitialized pointer members in CodeGen | Akshay Khadse | 1 | -3/+3
This change initializes the TSI, LI, DT, PSI, and ORE pointer fields of the SelectOptimize class to nullptr.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D148303
2023-03-30 | [MCP] Do not try forward non-existent sub-register of a copy | Sergei Barannikov | 1 | -12/+13
In this example:
```
$d14 = COPY killed $d18
$s0 = MI $s28
```
$s28 is a sub-register of $d14. However, $d18 does not have sub-registers and thus cannot be forwarded. Previously, this resulted in $noreg being substituted in place of the use of $s28, which later led to an assertion failure.
Fixes https://github.com/llvm/llvm-project/issues/60908, a regression that was introduced in D141747.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D146930
2023-03-25 | [MachineCopyPropagation] Pass DestSourcePair to isBackwardPropagatableCopy. NFC | Craig Topper | 1 | -16/+10
Instead of calling isCopyInstr again, just pass the DestSourcePair from the isCopyInstr call from the caller.
2023-02-08 | [MachineCopyPropagation] Eliminate spillage copies that might be caused by eviction chain | Kai Luo | 1 | -3/+388
Remove spill-reload like copy chains. For example
```
r0 = COPY r1
r1 = COPY r2
r2 = COPY r3
r3 = COPY r4
<def-use r4>
r4 = COPY r3
r3 = COPY r2
r2 = COPY r1
r1 = COPY r0
```
will be folded into
```
r0 = COPY r1
r1 = COPY r4
<def-use r4>
r4 = COPY r1
r1 = COPY r0
```
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D122118
2023-01-25 | Resolve a FIXME in MachineCopyPropagation by allowing propagation to subregister uses. | Owen Anderson | 1 | -6/+13
Reviewed By: barannikov88
Differential Revision: https://reviews.llvm.org/D141747
2022-12-04 | [Target] llvm::Optional => std::optional | Fangrui Song | 1 | -24/+28
The updated functions are mostly internal with a few exceptions (virtual functions in TargetInstrInfo.h, TargetRegisterInfo.h). To minimize changes to LLVMCodeGen, GlobalISel files are skipped. https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 | [CodeGen] Use std::nullopt instead of None (NFC) | Kazu Hirata | 1 | -1/+1
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-06-17 | [MachineCopyPropagation][RISCV] Fix D125335 accidentally change control flow. | Han-Kuan Chen | 1 | -79/+77
D125335 makes regsOverlap skip the following control flow, which was not intended in the original code.
Differential Revision: https://reviews.llvm.org/D128039
2022-05-26 | Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP. | Adrian Tong | 1 | -81/+175
This is then used in AArch64 to remove copy instructions after taildup ran in machine block placement.
Differential Revision: https://reviews.llvm.org/D125335
2022-03-21 | [MachineCopyPropagation] More robust isForwardableRegClassCopy | Jay Foad | 1 | -30/+27
Change the implementation of isForwardableRegClassCopy so that it does not rely on getMinimalPhysRegClass. Instead, iterate over all classes looking for any that satisfy a required property. NFCI on current upstream targets, but this copes better with downstream AMDGPU changes where some new smaller classes have been introduced, which was breaking regclass equality tests in the old code like: if (UseDstRC != CrossCopyRC && CopyDstRC == CrossCopyRC) Differential Revision: https://reviews.llvm.org/D121903
2022-03-16 | Cleanup codegen includes | serge-sans-paille | 1 | -1/+0
This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
2022-03-10 | Revert "Cleanup codegen includes" | Nico Weber | 1 | -0/+1
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 | Cleanup codegen includes | serge-sans-paille | 1 | -1/+0
after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169
2021-12-04 | [CodeGen] Use range-based for loops (NFC) | Kazu Hirata | 1 | -16/+12
2021-10-31 | [CodeGen] Use make_early_inc_range (NFC) | Kazu Hirata | 1 | -25/+22
2021-10-07 | [MachineCopyPropagation] Handle propagation of undef copies | Carl Ritson | 1 | -0/+1
When propagating undefined copies the undef flag must also be propagated. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D111219
2021-08-24 | [MachineCopyPropagation] Check CrossCopyRegClass for cross-class copies | Vang Thao | 1 | -3/+25
On some AMDGPU subtargets, copying to and from AGPR registers using another AGPR register is not possible. An intermediate VGPR register is needed for an AGPR to AGPR copy. This is an issue when machine copy propagation forwards a COPY $agpr, replacing a COPY $vgpr, which results in $agpr = COPY $agpr. It is removing a cross-class copy that may have been optimized by previous passes and potentially creating an unoptimized cross-class copy later on.
To avoid this issue, check CrossCopyRegClass if a different register class will be needed for the copy. If so, avoid forwarding the copy when the destination does not match the desired register class and if the original copy already matches the desired register class.
Issue seen while attempting to optimize another AGPR to AGPR issue:
Live-ins: $agpr0
  $vgpr0 = COPY $agpr0
  $agpr1 = V_ACCVGPR_WRITE_B32 $vgpr0
  $agpr2 = COPY $vgpr0
  $agpr3 = COPY $vgpr0
  $agpr4 = COPY $vgpr0
After machine-cp:
  $vgpr0 = COPY $agpr0
  $agpr1 = V_ACCVGPR_WRITE_B32 $vgpr0
  $agpr2 = COPY $agpr0
  $agpr3 = COPY $agpr0
  $agpr4 = COPY $agpr0
Machine-cp propagated COPY $agpr0 to replace $vgpr0, creating 3 AGPR to AGPR copies. Later this creates a cross-register copy from AGPR->VGPR->AGPR for each copy when the prior VGPR->AGPR copy was already optimal.
Reviewed By: lkail, rampitec
Differential Revision: https://reviews.llvm.org/D108011
2021-07-02 | [MachineCopyPropagation] Fix differences in code gen when compiling with -g | Alexandru Octavian Butiu | 1 | -2/+22
Fixes bugs 50580 (https://bugs.llvm.org/show_bug.cgi?id=50580) and 49446 (https://bugs.llvm.org/show_bug.cgi?id=49446).
When compiling with -g, "DBG_VALUE <reg>" instructions are added to the MIR. If such an instruction is inserted between instructions that use <reg>, MachineCopyPropagation invalidates <reg>, which causes some copies to not be propagated and causes differences in code generation (e.g. bugs 50580 and 49446). DBG_VALUE instructions should be ignored since they don't actually modify the register.
Reviewed By: lkail
Differential Revision: https://reviews.llvm.org/D104394
2021-05-12 | Reapply "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST" | Stephen Tozer | 1 | -3/+7
Previous crashes caused by this patch were the result of machine subregisters being incorrectly handled in updateDbgUsersToReg; this has been fixed by using RegUnits to determine overlapping registers, instead of using the register values directly. Differential Revision: https://reviews.llvm.org/D101523 This reverts commit 7ca26c5fa2df253878cab22e1e2f0d6f1b481218.
2021-05-07 | Revert "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST" | Arthur Eubanks | 1 | -6/+3
This reverts commit 0791f968fee259e5c34523167bd58179b8b081c2. Causing crashes: https://crbug.com/1206764
2021-05-07 | [DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST | Stephen Tozer | 1 | -3/+6
This patch modifies updateDbgUsersToReg to properly handle DBG_VALUE_LIST instructions, by replacing the hard-coded operand indices (i.e. getOperand(0)) with the more general getDebugOperandsForReg(), and updating the register for all matching operands. Differential Revision: https://reviews.llvm.org/D101523
2020-10-13 | [NFC][Regalloc] Use MCRegister in MachineCopyPropagation | Mircea Trofin | 1 | -53/+53
Differential Revision: https://reviews.llvm.org/D89250
2020-09-01 | [MachineCopyPropagation] In isNopCopy, check the destination registers match in addition to the source registers. | Craig Topper | 1 | -3/+1
Previously, if the source matched, we asserted that the destination matched. But GPR <-> mask register copies on X86 can violate this since we use the same K-registers for multiple sizes.
Fixes this ISPC issue https://github.com/ispc/ispc/issues/1851
Differential Revision: https://reviews.llvm.org/D86507
2020-07-29 | [MachineCopyPropagation] BackwardPropagatableCopy: add check for hasOverlappingMultipleDef | Simon Wallis | 1 | -0/+20
In MachineCopyPropagation::BackwardPropagatableCopy(), a check is added for multiple destination registers. The copy propagation is avoided if the copied destination register is the same register as another destination on the same instruction.
A new test is added. This used to fail on ARM like this:
error: unpredictable instruction, RdHi and RdLo must be different
umull r9, r9, lr, r0
Reviewed By: lkail
Differential Revision: https://reviews.llvm.org/D82638
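The check described in this message can be sketched with a toy model. The example below is an assumption-laden illustration, not the pass's code: `ToyInstr` is a hypothetical type, registers are plain integers, and "overlap" is reduced to equality because sub-registers are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction: only its defined registers matter for this check.
struct ToyInstr {
  std::vector<unsigned> Defs;
};

// Returns true if rewriting the def at index DefIdx to NewReg would make the
// instruction define NewReg twice (for example "umull r9, r9, ...").
bool hasOverlappingMultipleDef(const ToyInstr &MI, std::size_t DefIdx,
                               unsigned NewReg) {
  assert(DefIdx < MI.Defs.size() && "DefIdx must name an existing def");
  for (std::size_t I = 0; I < MI.Defs.size(); ++I)
    if (I != DefIdx && MI.Defs[I] == NewReg)
      return true; // another def already writes NewReg
  return false;
}
```

With defs `{r4, r9}` and a following `r9 = COPY r4`, the function reports an overlap for rewriting the first def to `r9`, so a caller in this model would skip the backward propagation.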
2020-06-12 | [NFCI][MachineCopyPropagation] invalidateRegister(): use SmallSet<8> instead of DenseSet. | Roman Lebedev | 1 | -1/+3
This decreases the time consumed by the pass [during RawSpeed unity build] by 25% (0.0586 s -> 0.04388 s). While that isn't really impressive overall, that wasn't the goal here. The memory results here are noticeable. The baseline results are:
```
total runtime: 55.65s.
calls to allocation functions: 19754254 (354960/s)
temporary memory allocations: 4951609 (88974/s)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.79MB
total memory leaked: 198.01MB
```
While with this patch the results are:
```
total runtime: 55.37s.
calls to allocation functions: 19068237 (344403/s) # -3.47 %
temporary memory allocations: 4261772 (76974/s) # -13.93 % (!!!)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.73MB
total memory leaked: 198.01MB
```
So we get rid of *a lot* of temporary allocations. Using `SmallSet<8>` makes sense to me because at least here for x86 BdVer2, the size of that set is *never* more than 3, over all of llvm test-suite + RawSpeed. The story might be different on other targets, not sure if it will ever justify whole DenseSet, but if it does SmallDenseSet might be a compromise.
2019-12-30 | [MCP] Add stats for backward copy propagation. NFC. | Kai Luo | 1 | -1/+5
2019-12-05 | Reland [MachineCopyPropagation] Extend MCP to do trivial copy backward propagation. | Kai Luo | 1 | -5/+217
Fix assertion error
```
bool llvm::MachineOperand::isRenamable() const: Assertion `Register::isPhysicalRegister(getReg()) && "isRenamable should only be checked on physical registers"' failed.
```
by checking if the register is 0 before invoking `isRenamable`.
2019-12-05 | Revert "[MachineCopyPropagation] Extend MCP to do trivial copy backward propagation" | Kai Luo | 1 | -211/+5
This reverts commit 75b3a1c318ccad0f96c38689279bc5db63e2ad05, since it breaks bootstrap build.
2019-12-05 | [MachineCopyPropagation] Extend MCP to do trivial copy backward propagation | Kai Luo | 1 | -5/+211
Summary: This patch mainly performs the following transformation:
```
$R0 = OP ...
... // No read/clobber of $R0 and $R1
$R1 = COPY $R0 // $R0 is killed
```
Replacing $R0 with $R1 and removing the COPY, we have
```
$R1 = OP ...
```
This transformation can also expose more opportunities for existing copy elimination in MCP.
Differential Revision: https://reviews.llvm.org/D67794
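The transformation above can be sketched on a toy instruction list. This is a simplified model under stated assumptions, not the pass's implementation: `Inst` and `propagateBackward` are hypothetical names, registers are plain integers with 0 meaning "none", each instruction has at most one def, and sub-registers, implicit operands, and renamable checks are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction. Register 0 means "no register"; real registers start at 1.
struct Inst {
  bool IsCopy;                // true for "Def = COPY Uses[0]"
  bool KillsSrc;              // for copies: the source is dead after the copy
  unsigned Def;               // defined register, or 0
  std::vector<unsigned> Uses; // registers read
};

static bool reads(const Inst &MI, unsigned Reg) {
  return std::find(MI.Uses.begin(), MI.Uses.end(), Reg) != MI.Uses.end();
}

// Tries to fold the COPY at CopyIdx into the instruction that defines its
// source. On success the def is renamed, the copy is erased, and true is
// returned; otherwise the block is left unchanged.
bool propagateBackward(std::vector<Inst> &Block, std::size_t CopyIdx) {
  assert(CopyIdx < Block.size() && Block[CopyIdx].IsCopy &&
         Block[CopyIdx].Uses.size() == 1 && "expected a COPY");
  const unsigned Dst = Block[CopyIdx].Def;
  const unsigned Src = Block[CopyIdx].Uses[0];
  if (!Block[CopyIdx].KillsSrc || Dst == Src)
    return false;

  // Walk upward from the copy, looking for the instruction that defines Src.
  for (std::size_t I = CopyIdx; I-- > 0;) {
    Inst &MI = Block[I];
    if (MI.Def == Src) {
      MI.Def = Dst; // "$R0 = OP" becomes "$R1 = OP"
      Block.erase(Block.begin() + static_cast<std::ptrdiff_t>(CopyIdx));
      return true;
    }
    // Any read or clobber of Src or Dst in between blocks the rewrite.
    if (MI.Def == Dst || reads(MI, Src) || reads(MI, Dst))
      return false;
  }
  return false; // Src is not defined in this block
}
```

The sketch gives up as soon as an intervening instruction reads the source register, which matches the restriction stated in this message rather than any later relaxation of it.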
2019-11-13 | Sink all InitializePasses.h includes | Reid Kleckner | 1 | -0/+1
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it causes lots of recompilation.
I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout:
  recompiles  touches  affected_files  header
  342380      95       3604            llvm/include/llvm/ADT/STLExtras.h
  314730      234      1345            llvm/include/llvm/InitializePasses.h
  307036      118      2602            llvm/include/llvm/ADT/APInt.h
  213049      59       3611            llvm/include/llvm/Support/MathExtras.h
  170422      47       3626            llvm/include/llvm/Support/Compiler.h
  162225      45       3605            llvm/include/llvm/ADT/Optional.h
  158319      63       2513            llvm/include/llvm/ADT/Triple.h
  140322      39       3598            llvm/include/llvm/ADT/StringRef.h
  137647      59       2333            llvm/include/llvm/Support/Error.h
  131619      73       1803            llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211