Age | Commit message (Collapse) | Author | Files | Lines |
|
If we have a rematerializable instruction without live virtual register
uses (that we didn't heuristically decide to shrink), but with a physreg
use, we were creating a kill to keep the physreg operand liveness
unchanged. This meant we *weren't* keeping around the remat instruction,
and thus inhibiting future remat within the original live interval. If
we keep the remat instruction, that *also* keeps the physreg use, and
thus we can achieve both objectives at once.
Noticed this via inspection, and we don't currently seem to have any
rematerializable instructions which could observe the difference. This
could in theory happen for both trivial and non-trivial remat, but
requires the rematerialization of an instruction with a physreg use.
|
|
This is an attempt to simplify the rematerialization logic in
InlineSpiller and SplitKit. I'd earlier done the same for
RegisterCoalescer in 57b673.
The basic idea of this change is that we don't need to check whether an
instruction is rematerializable early. Instead, we can defer the check
to the point where we're actually trying to materialize something. We
also don't need to indirect that query through a VNI key, and can
instead just check the instruction directly at the use site.
|
|
[nfc]" (#160897)
Reverts llvm/llvm-project#160765. Failures on buildbot indicate second
assertion does not in fact hold.
|
|
We should always be able to find the VNInfo in the original live
interval which corresponds to the subset we're trying to spill, and the
only cases where we have a VNInfo without a definition instruction are
if the vni is unused, or corresponds to a phi. Adjust the code structure
to explicitly check for PHIDef, and assert the stronger conditions.
|
|
weight discount (#159180)
This aims to fix the issue that caused https://reviews.llvm.org/D106408
to be reverted.
CalcSpillWeights will reduce the weight of an interval by half if it's
considered rematerializable, so it will be evicted before others.
It does this by checking TII.isTriviallyReMaterializable. However
rematerialization may still fail if any of the defining MI's uses aren't
available at the locations it needs to be rematerialized.
LiveRangeEdit::canRematerializeAt calls allUsesAvailableAt to check this
but CalcSpillWeights doesn't, so the two diverge.
This fixes it by also checking allUsesAvailableAt in CalcSpillWeights.
In practice this has zero change AArch64/X86-64/RISC-V as measured on
llvm-test-suite, but prevents weights from being perturbed in an
upcoming patch which enables more rematerialization by re-attempting
https://reviews.llvm.org/D106408
|
|
This change builds on https://github.com/llvm/llvm-project/pull/160319
which tries to clarify which *callers* (not backends) assume that the
result is actually trivial.
This change itself should be NFC. Essentially, I'm just renaming the
existing isTrivialRematerializable to the non-trivial version and then
adding a new trivial version (with the same name as the prior function)
and simplifying a few callers which want that semantic.
This change does *not* enable non-trivial remat any more broadly than
was already done for our targets which were lying through the old APIs;
that will come separately. The goal here is simply to make the code
easier to follow in terms of what assumptions are being made where.
---------
Co-authored-by: Luke Lau <luke_lau@icloud.com>
|
|
(#159839)
LiveRangeEdit's rematerialization checking logic is used in two quite
different ways. For SplitKit and InlineSpiller, we're analyzing all defs
associated with a live interval, doing that analysis up front, and then
using the result a bit later. The RegisterCoalescer, we're analysing
exactly one ValNo at a time, and using the legality result immediately.
LRE had a checkRematerializable which existed basically to adapt the
later into the former usage model.
Instead, this change bypasses the ScannedRemat and Remattable
structures, and directly queries the underlying routines. This is easy
to read, and makes it more clear as to which uses actually need the
deferred analysis. (A following change may try to unwind that too, but
it's not strictly NFC.)
|
|
When rematerialize a virtual register the register may be early
clobbered.
However rematerializeAt(...) just return the slot index of Slot_Register
which cause mis-calculating live range for the register
---------
Co-authored-by: Yuanke Luo <ykluo@birentech.com>
|
|
Only one caller cares about the true case of this parameter, so move
the check to that single caller. Note that RegisterCoalescer seems
like it should care, but it already duplicates the check several
lines above.
|
|
Fixes the "use after poison" issue introduced by #121516 (see
<https://github.com/llvm/llvm-project/pull/121516#issuecomment-2585912395>).
The root cause of this issue is that #121516 introduced "Called Global"
information for call instructions modeling how "Call Site" info is
stored in the machine function, HOWEVER it didn't copy the
copy/move/erase operations for call site information.
The fix is to rename and update the existing copy/move/erase functions
so they also take care of Called Global info.
|
|
(#114407)
When LiveRangeEdit::eliminateDeadDef() converts an MI to a KILL instruction,
it should also call dropMemRefs() in order to erase any MachineMemOperand
present.
This was discovered in testing as the MachineVerifier does not accept an MMO
without the corresponding MI mayLoad/mayStore flag, which the KILL opcode
lacks.
|
|
This `AA` parameter is not used and for most uses they just pass
a nullptr.
The use of `AA` was removed since 8d0383e.
|
|
I noticed this was possibly buggy with implicit operands with the same
dest register, and should maybe be using addRegisterDead. However, this
is never called in a situation where the operand wasn't already marked
dead. This is eliminateDeadDef, implying the def was already known to be
dead.
Add an assert to detect inconsistencies in dead flags. This was
apparently added in 9a16d655c71826bef98b7d6e9590e4494ac0e1a9.
|
|
|
|
It's allowed to rematerialize instructions with implicit-defs of the
same register as the single explicit def. If this happened, it was only
clearing the dead flags on the one main result.
|
|
live range splitting"
This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd.
The workaround in c26dfc81e254c78dc23579cf3d1336f77249e1f6 should work
around the underlying problem with SUBREG_TO_REG.
|
|
live range splitting"
And dependent commits.
Details in D150388.
This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c.
This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e.
This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826.
This reverts commit b7836d856206ec39509d42529f958c920368166b.
No conflicts in the code, few tests had conflicts in autogenerated CHECKs:
llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll
Reviewed By: alexfh
Differential Revision: https://reviews.llvm.org/D156381
|
|
range splitting
Replacing D143754. Right now the LiveRangeSplitting during register allocation uses
TargetOpcode::COPY instruction for splitting. For AMDGPU target that creates a
problem as we have both vector and scalar copies. Vector copies perform a copy over
a vector register but only on the lanes(threads) that are active. This is mostly sufficient
however we do run into cases when we have to copy the entire vector register and
not just active lane data. One major place where we need that is live range splitting.
Allowing targets to use their own copy instructions(if defined) will provide a lot of
flexibility and ease to lower these pseudo instructions to correct MIR.
- Introduce getTargetCopyOpcode() virtual function and use if to generate copy in Live range
splitting.
- Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline.
Reviewed By: arsenm, cdevadas, kparzysz
Differential Revision: https://reviews.llvm.org/D150388
|
|
This was looking for full copies produced by SplitKit, but SplitKit
introduces copy bundles if not all lanes are live. The scan for uses
needs to look at bundles, not individual instructions.
This is a prerequisite to avoiding some redundant spills due to
subregisters which will help avoid an allocation failure in a future
patch.
|
|
|
|
Use isPhysical/isVirtual methods.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D141715
|
|
It is needed to invoke the delegate methods effectively whenever a
virtual register is cloned from an existing register of the same class.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D138517
|
|
object
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D139451
|
|
Add a statistic variable for rematerialization.
Differential Revision: https://reviews.llvm.org/D134907
|
|
This patch uses the API provided by LiveRangeEdit to handle rematerialization.
It will make future maintenance and improvement more easier.
No functional change.
Differential Revision: https://reviews.llvm.org/D133610
|
|
This patch fixes an issue where an instruction reading a whole register would be moved during register allocation into a spot where one of the subregisters was dead.
The code to check whether an instruction can be rematerialized at a given point or not was already checking for subranges to ensure that subregisters are live, but only when the instruction being moved was using a subregister, this patch changes that so the subranges are checked even when the moved instruction uses the full register.
This patch also adds a case to the original test for the subrange checking that trigger the issue described above.
The original subrange checking code was introduced in this revision: https://reviews.llvm.org/D115278
And I've encountered this issue on AMDGPUs while working with DPC++: https://github.com/intel/llvm/issues/6209
Essentially the greedy register allocator attempts to move the following instruction:
```
%3961:vreg_64 = V_LSHLREV_B64_e64 3, %3078:vreg_64, implicit $exec
```
From `@3440` into the body of a loop `@16312`, but `%3078` has the following live ranges:
```
%3078 [2224r,2240r:0)[2240r,3488B:1)[16192B,38336B:1) 0@2224r 1@2240r L0000000000000003 [2224r,3440r:0) 0@2224r L000000000000000C [2240r,3488B:0)[16192B,38336B:0) 0@2240r
```
So `@16312e` `%3078.sub1` is alive but `%3078.sub0` is dead, so this instruction being moved there leads to invalid memory accesses as `3078.sub0` ends up being trashed and the result of this instruction is used as part of an address calculation for a load.
On the original ticket this issue showed up on gfx906 and gfx90a but not on gfx908, this turned out to be because on gfx908 instead of moving the shift instruction into the loop, its value is spilled into an ACC register, gfx906 doesn't have ACC registers and for gfx90a ACC registers are used like regular vector registers and so aren't used for spilling.
With this patch the original application from the DPC++ ticket works properly on gfx906, and the result of the shift instruction is correctly spilled instead of moving the instruction in the loop.
Original Author: npmiller
Reviewed by: rampitec
Submitted by: rampitec
Differential Revision: https://reviews.llvm.org/D131884
|
|
This would create a new interval missing the subrange and hit this
verifier error:
*** Bad machine code: Live interval for subreg operand has no subranges ***
- function: test_remat_subreg_def
- basic block: %bb.0 (0xa568758) [0B;128B)
- instruction: 32B dead undef %4.sub0:vreg_64 = V_MOV_B32_e32 2, implicit $exec
|
|
This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.
Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.
Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
|
|
comments
|
|
LiveRangeEdit::allUsesAvailableAt checks that VNI at use is the same
as at the original use slot. However, the VNI can be the same while
a specific subrange needed for use can be dead at the new index.
This patch adds subrange liveness check if there is a subreg use.
Fixes: SWDEV-312810
Differential Revision: https://reviews.llvm.org/D115278
|
|
|
|
|
|
|
|
|
|
When RA eliminated a dead def it can either immediately delete
the instruction itself or replace it with KILL to defer the
actual removal. If this instruction has a virtual register use
killing the register it will shrink the LI of the use. However,
if the LI covers the instruction and extends beyond it the
shrink will not happen. In fact that is impossible to shrink
such use because of the KILL still using it.
If later the LI of the use will be split at the KILL and the
KILL itself is eliminated after that point the new live segment
ends up at an invalid slot index.
This extremely rare condition was hit after D106408 which has
enabled rematerialization of such instructions. The replacement
with KILL is only done for rematerialized defs which became dead
and such rematerialization did not generally happen before.
The patch deletes an instruction immediately if it is a result
of rematerialization and has such use. An alternative would be
to prohibit a split at a KILL instruction, but it looks like it
is better to split a live range rather then keeping a killed
instruction just in case it can be rematerialized further.
Fixes PR51655.
Differential Revision: https://reviews.llvm.org/D108951
|
|
D106408 enables rematerialization of instructions with virtual
register uses. That has uncovered the bug in the allUsesAvailableAt
implementation: https://bugs.llvm.org/show_bug.cgi?id=51516.
In the majority of cases canRematerializeAt() called to check if
an instruction can be rematerialized before the given UseIdx.
However, SplitEditor::enterIntvAtEnd() calls it to rematerialize
an instruction at the end of a block passing LIS.getMBBEndIdx()
into the check. In the testcase from the bug it has attempted to
rematerialize ADDXri after STRXui in bb.17. The use operand %55
of the ADD is killed by the STRX but that is undetected by the check
because it adjusts passed UseIdx to the reg slot, before the kill.
The value is dead at the index passed to the check however.
This change uses a later of passed UseIdx and its reg slot. This
shall be correct because if are checking an availability of operands
before an instruction that instruction cannot be the one defining
these operands. If we are checking for late rematerialization we
are really interested if operands live past the instruction.
The bug is not exploitable without D106408 but needed to reland
reverted D106408.
Differential Revision: https://reviews.llvm.org/D108475
|
|
Any def of EXEC prevents rematerialization of any VOP instruction
because of the physreg use. Create a callback to check if the
physreg use can be ingored to allow rematerialization.
Differential Revision: https://reviews.llvm.org/D105836
|
|
VirtRegAuxInfo is an extensibility point, so the register allocator's
decision on which implementation to use should be communicated to the
other users - namely, LiveRangeEdit.
Differential Revision: https://reviews.llvm.org/D96898
|
|
|
|
1. Removed #include "...AliasAnalysis.h" in other headers and modules.
2. Cleaned up includes in AliasAnalysis.h.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D92489
|
|
It's never null - the reason it's modeled as a pointer is because the
pass can't init it in its ctor. Passing by ref simplifies the code, too,
as the null checks were unnecessary complexity.
Differential Revision: https://reviews.llvm.org/D89171
|
|
Mostly LiveIntervals, with their effects (users).
Differential Revision: https://reviews.llvm.org/D89018
|
|
Also renamed the fields to follow style guidelines.
Accessors help with readability - weight mutation, in particular,
is easier to follow this way.
Differential Revision: https://reviews.llvm.org/D87725
|
|
|
|
Move include to LiveRangeEdit.cpp and replace legacy AliasAnalysis typedef with AAResults where necessary.
|
|
This will address the issue: P8198 and P8199 (from D73534).
The methods was not handle bundles properly.
Differential Revision: https://reviews.llvm.org/D74904
|
|
Use the isCandidateForCallSiteEntry().
This should mostly be an NFC, but there are some parts ensuring
the moveCallSiteInfo() and copyCallSiteInfo() operate with call site
entry candidates (both Src and Dest should be the call site entry
candidates).
Differential Revision: https://reviews.llvm.org/D74122
|
|
|
|
During the If-Converter optimization pay attention when copying or
deleting call instructions in order to keep call site information in
valid state.
Reviewers: aprantl, vsk, efriedma
Reviewed By: vsk, efriedma
Differential Revision: https://reviews.llvm.org/D66955
llvm-svn: 374068
|
|
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
Depends on D65919
Reviewers: arsenm, bogner, craig.topper, RKSimon
Reviewed By: arsenm
Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65962
llvm-svn: 369041
|