aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/CodeGen/MachineCSE.cpp
AgeCommit message (Collapse)AuthorFilesLines
2025-05-22[LLVM][CodeGen] Add convenience accessors for MachineFunctionProperties ↵users/pcc/spr/main.elf-add-branch-to-branch-optimizationRahul Joshi1-2/+1
(#140002) Add per-property has<Prop>/set<Prop>/reset<Prop> functions to MachineFunctionProperties.
2025-05-05[CodeGen] Use range-based for loops (NFC) (#138488)Kazu Hirata1-6/+4
This is a reland of #138434 except that: - the bits for llvm/lib/CodeGen/RenameIndependentSubregs.cpp have been dropped because they caused a test failure under asan, and - the bits for llvm/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp have been improved with structured bindings.
2025-05-04Revert "[CodeGen] Use range-based for loops (NFC) (#138434)"Nico Weber1-4/+6
This reverts commit a9699a334bc9666570418a3bed9520bcdc21518b. Breaks CodeGen/AMDGPU/collapse-endcf.ll in several configs (sanitizer builds; macOS; possibly more), see comments on https://github.com/llvm/llvm-project/pull/138434
2025-05-04[CodeGen] Use range-based for loops (NFC) (#138434)Kazu Hirata1-6/+4
2025-03-02[MachineCSE] Const correct some function arguments. NFCCraig Topper1-4/+4
2025-03-02[CodeGen] Use MCRegister and Register. NFCCraig Topper1-8/+8
2025-01-24MachineCSE: Remove check for subreg on a def operand (#124095)Matt Arsenault1-2/+0
There are no subregister defs in SSA.
2025-01-16[CodeGen] Avoid repeated hash lookups (NFC) (#123160)Kazu Hirata1-4/+3
2024-11-09[Instrumentation] Support `MachineFunction` in `OptNoneInstrumentation` ↵paperchalice1-3/+0
(#115471) Support `MachineFunction` in `OptNoneInstrumentation`, also add `isRequired` to all necessary passes.
2024-09-04[CodeGen][NewPM] Port MachineCSE pass to new pass manager. (#106605)Christudasan Devadasan1-122/+162
2024-09-04[CodeGen][MachineCSE] Remove unused AA results(NFC) (#106604)Christudasan Devadasan1-5/+0
Alias Analysis result is never used in this pass and hence removing it.
2024-08-29[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)Stephen Tozer1-1/+2
This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-07-12[CodeGen][NewPM] Port `machine-block-freq` to new pass manager (#98317)paperchalice1-3/+3
- Add `MachineBlockFrequencyAnalysis`. - Add `MachineBlockFrequencyPrinterPass`. - Use `MachineBlockFrequencyInfoWrapperPass` in legacy pass manager. - `LazyMachineBlockFrequencyInfo::print` is empty, drop it due to new pass manager migration.
2024-07-12[CodeGen] Guard copy propagation in machine CSE against undefs (#97413)Vikram Hegde1-1/+1
2024-06-11[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis ↵paperchalice1-4/+4
result (#94571) Prepare for new pass manager version of `MachineDominatorTreeAnalysis`. We may need a machine dominator tree version of `DomTreeUpdater` to handle `SplitCriticalEdge` in some CodeGen passes.
2024-04-24[CodeGen] Make the parameter TRI required in some functions. (#85968)Xu Zhang1-1/+1
Fixes #82659 There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411. Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact. After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2023-12-03[llvm] Stop including vector (NFC)Kazu Hirata1-1/+0
Identified with clangd.
2023-08-31Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tablesDaniel Paoliello1-1/+1
The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table, it contains the following information: * The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted). Together this information can be used by debuggers and binary analysis tools to understand what an jump table indirect branch is doing and where it might jump to. Documentation for the symbol can be found in the Microsoft PDB library dumper: https://github.com/microsoft/microsoft-pdb/blob/0fe89a942f9a0f8e061213313e438884f4c9b876/cvdump/dumpsym7.cpp#L5518 This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D149367
2023-08-25Revert "Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables"Arthur Eubanks1-1/+1
This reverts commit 8d0c3db388143f4e058b5f513a70fd5d089d51c3. Causes crashes, see comments in https://reviews.llvm.org/D149367. Some follow-up fixes are also reverted: This reverts commit 636269f4fca44693bfd787b0a37bb0328ffcc085. This reverts commit 5966079cf4d4de0285004eef051784d0d9f7a3a6. This reverts commit e7294dbc85d24a08c716d9babbe7f68390cf219b.
2023-08-25Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tablesDaniel Paoliello1-1/+1
The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table, it contains the following information: * The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted). Together this information can be used by debuggers and binary analysis tools to understand what an jump table indirect branch is doing and where it might jump to. Documentation for the symbol can be found in the Microsoft PDB library dumper: https://github.com/microsoft/microsoft-pdb/blob/0fe89a942f9a0f8e061213313e438884f4c9b876/cvdump/dumpsym7.cpp#L5518 This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D149367
2023-08-07[MachineCSE] Add an option to override the profitability heuristicsJingu Kang1-0/+7
Differential Revision: https://reviews.llvm.org/D157002
2023-06-01[CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI.Jay Foad1-8/+4
Differential Revision: https://reviews.llvm.org/D151424
2023-04-17Fix uninitialized pointer members in CodeGenAkshay Khadse1-6/+6
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148303
2023-01-13[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFCCraig Topper1-10/+9
Use isPhysical/isVirtual methods. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D141715
2022-11-14[MachineCSE] Allow CSE for instructions with ignorable operandsGuozhi Wei1-3/+6
Ignorable operands don't impact instruction's behavior, we can safely do CSE on the instruction. It is split from D130919. It has big impact to some AMDGPU test cases. For example in atomic_optimizations_raw_buffer.ll, when trying to check if the following instruction can be CSEed %37:vgpr_32 = V_MOV_B32_e32 0, implicit $exec Function isCallerPreservedOrConstPhysReg is called on operand "implicit $exec", this function is implemented as - return TRI.isCallerPreservedPhysReg(Reg, MF) || + return TRI.isCallerPreservedPhysReg(Reg, MF) || TII.isIgnorableUse(MO) || (MRI.reservedRegsFrozen() && MRI.isConstantPhysReg(Reg)); Both TRI.isCallerPreservedPhysReg and MRI.isConstantPhysReg return false on this operand, so isCallerPreservedOrConstPhysReg is also false, it causes LLVM failed to CSE this instruction. With this patch TII.isIgnorableUse returns true for the operand $exec, so isCallerPreservedOrConstPhysReg also returns true, it causes this instruction to be CSEed with previous instruction %14:vgpr_32 = V_MOV_B32_e32 0, implicit $exec So I got different result from here. AMDGPU's implementation of isIgnorableUse is bool SIInstrInfo::isIgnorableUse(const MachineOperand &MO) const { // Any implicit use of exec by VALU is not a real register read. return MO.getReg() == AMDGPU::EXEC && MO.isImplicit() && isVALU(*MO.getParent()) && !resultDependsOnExec(*MO.getParent()); } Since the operand $exec is not a real register read, my understanding is it's reasonable to do CSE on such instructions. Because more instructions are CSEed, so I get less instructions generated for these tests. Differential Revision: https://reviews.llvm.org/D137222
2022-11-02[MachineCSE] Allow PRE of instructions that read physical registersJohn Brawn1-10/+22
Currently MachineCSE forbids PRE when the instruction reads a physical register. Relax this so that it's allowed when the value being read is the same as what would be read in the place the instruction would be hoisted to. This is being done in preparation for adding FPCR handling to the AArch64 backend, in order to prevent it to from worsening the generated code, but for targets that already have a similar register it should improve things. This patch affects code generation in several tests. The new code looks better except for in Thumb2/LowOverheadLoops/memcall.ll where we perform PRE but the LowOverheadLoops transformation then undoes it. Also in AMDGPU/selectcc-opt.ll the CHECK makes things look worse, but actually the function as a whole is better (as a MOV is PRE'd). Differential Revision: https://reviews.llvm.org/D136675
2022-10-28Revert "[MachineCSE] Allow PRE of instructions that read physical registers"John Brawn1-15/+4
This reverts commit 628467e53f4ceecd2b5f0797f07591c66d9d9d2a. This is causing a miscompile in ffmpeg when compiled for armv7.
2022-10-27[MachineCSE] Allow PRE of instructions that read physical registersJohn Brawn1-4/+15
Currently MachineCSE forbids PRE when the instruction reads a physical register. Relax this so that it's allowed when the value being read is the same as what would be read in the place the instruction would be hoisted to. This is being done in preparation for adding FPCR handling to the AArch64 backend, in order to prevent it to from worsening the generated code, but for targets that already have a similar register it should improve things. This patch affects code generation in several tests. The new code looks better except for in Thumb2/LowOverheadLoops/memcall.ll where we perform PRE but the LowOverheadLoops transformation then undoes it. Also in AMDGPU/selectcc-opt.ll the CHECK makes things look worse, but actually the function as a whole is better (as a MOV is PRE'd). Differential Revision: https://reviews.llvm.org/D136675
2022-09-16[MachineCSE] Add a threshold to avoid spending too much time in ↵Pengxuan Zheng1-3/+16
isProfitableToCSE Currently, it can become extremely costly to compute MayIncreasePressure if the size of CSUses turns out to be very large. In that case, it's no longer cost effective to keep computing MayIncreasePressure. Therefore, to limit the amount of time spent in isProfitableToCSE, we simply conservatively assume MayIncreasePressure if the size of CSUses is too large. This can reduce overall compile time by 30% for some benchmarks. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134003
2022-08-31[MachineCSE] Use TargetInstrInfo::isAsCheapAsAMove in isPRECandidate.Craig Topper1-1/+1
Some targets like RISC-V require operands to be inspected to determine if an instruction is similar to a move. Spotted while investigating code differences between using an ADDI vs an ADDIW. RISC-V has the isAsCheapAsAMove flag for ADDI, but the TII hook checks the immediate is 0 or the register is X0. ADDIW is never generated with X0 or with an immediate of 0 so it doesn't have the isAsCheapAsAMove flag. I don't know enough about the PRE code to write a test for this yet. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D132981
2022-07-18CodeGen: Remove AliasAnalysis from regallocMatt Arsenault1-1/+1
This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable. Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy. Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.
2022-04-14MachineCSE: Report this requires SSAMatt Arsenault1-0/+5
2022-03-16Cleanup codegen includesserge-sans-paille1-1/+0
This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
2022-03-10Revert "Cleanup codegen includes"Nico Weber1-0/+1
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10Cleanup codegen includesserge-sans-paille1-1/+0
after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169
2021-10-30[MachineCSE] Use make_early_inc_range (NFC)Kazu Hirata1-48/+42
2021-05-05[MachineCSE][NFC]: Refactor and comment on preventing CSE for isConvergent ↵Michael Kitzan1-5/+27
instrs - Move the code preventing CSE of `isConvergent` instrs into `ProcessBlockCSE` (from `isProfitableToCSE`) - Add comments explaining why `isConvergent` is used to prevent CSE of non-local instrs in MachineCSE and the new test
2021-04-23[MachineCSE] Prevent CSE of non-local convergent instrsMichael Kitzan1-1/+5
At the moment, MachineCSE allows CSE-ing convergent instrs which are non-local to each other. This can cause illegal codegen as convergent instrs are control flow dependent. The patch prevents non-local CSE of convergent instrs by adding a check in isProfitableToCSE and rejecting CSE-ing if we're considering CSE-ing non-local convergent instrs. We can still CSE convergent instrs which are in the same control flow scope, so the patch purposely does not make all convergent instrs non-CSE candidates in isCSECandidate. https://reviews.llvm.org/D101187
2021-01-21[CodeGen] Use llvm::append_range (NFC)Kazu Hirata1-4/+2
2020-10-28[NFC] Use [MC]Register in CSE & LICMGaurav Jain1-17/+17
Differential Revision: https://reviews.llvm.org/D90327
2020-09-26MachineCSE.cpp - use auto const& iterators in for-range loops to avoid ↵Simon Pilgrim1-2/+2
copies. NFCI.
2020-09-21MachineCSE.cpp - use auto const& iterator in for-range loop to avoid copies. ↵Simon Pilgrim1-2/+2
NFCI.
2020-07-06DomTree: Remove getChildren() accessorNicolai Hähnle1-5/+3
Summary: Avoid exposing details about how children are stored. This will enable subsequent type-erasure changes. New methods are introduced to cover common access patterns. Change-Id: Idb5f4b1b9c84e4cc71ddb39bb52a388682f5674f Reviewers: arsenm, RKSimon, mehdi_amini, courbet Subscribers: qcolombet, sdardis, wdng, hiraditya, jrtc27, zzheng, atanasyan, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83083
2020-04-06[MachineCSE] Don't carry the wrong location when hoistingDavide Italiano1-0/+7
PR: 45425 <rdar://problem/61359768> Differential Revision: https://reviews.llvm.org/D77604
2019-11-13Sink all InitializePasses.h includesReid Kleckner1-0/+1
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211
2019-09-02[DebugInfo] LiveDebugValues: correctly discriminate kinds of variable locationsJeremy Morse1-3/+5
The missing line added by this patch ensures that only spilt variable locations are candidates for being restored from the stack. Otherwise, register or constant-value information can be interpreted as a spill location, through a union. The added regression test replicates a scenario where this occurs: the stack load from [rsp] causes the register-location DBG_VALUE to be "restored" to rsi, when it should be left alone. See PR43058 for details. Un x-fail a test that was suffering from this from a previous patch. Differential Revision: https://reviews.llvm.org/D66895 llvm-svn: 370648
2019-08-15Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVMDaniel Sanders1-9/+9
Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041
2019-08-07[MachineCSE][NFC] Use 'profitable' rather than 'beneficial' to name method.Kai Luo1-8/+8
llvm-svn: 368124
2019-08-01Finish moving TargetRegisterInfo::isVirtualRegister() and friends to ↵Daniel Sanders1-13/+11
llvm::Register as started by r367614. NFC llvm-svn: 367633
2019-07-19[MachineCSE][MachinePRE] Avoid hoisting code from code regions into hot BBs.Kai Luo1-0/+25
Summary: Current PRE hoists common computations into CMBB = DT->findNearestCommonDominator(MBB, MBB1). However, if CMBB is in a hot loop body, we might get performance degradation. Differential Revision: https://reviews.llvm.org/D64394 llvm-svn: 366570