Age | Commit message (Collapse) | Author | Files | Lines |
|
Check RegisterClassInfo if any registers of the new class are
actually available for use. Currently AMDGPU overrides shouldCoalesce
to avoid this situation. The target hook does not have access to the
dynamic register class counts, but ideally the target hook would
only be used for profitability concerns.
The new test doesn't change, due to the AMDGPU shouldCoalesce override,
but would be unallocatable if we dropped the override and switched
to the default implementation. The existing limit-coalesce.mir already
tests the behavior of this override, but it's too conservative and
isn't checking the case where the new class is unallocatable. Add
this check so it can be relaxed.
|
|
|
|
(#159110)
Currently, something like:
```
$eax = MOV32ri -11, implicit-def $rax
%al = COPY $eax
```
Can be rematerialized as:
```
dead $eax = MOV32ri -11, implicit-def $rax
```
Which marks the full $rax as used, not just $al.
With this change, this is rematerialized as:
```
dead $eax = MOV32ri -11, implicit-def dead $rax, implicit-def $al
```
To indicate that only $al is used.
Note: This issue is latent right now, but is exposed when #134408 is
applied, as it results in the register pressure being incorrectly
calculated (unless this patch is applied too).
I think this change is in line with past fixes in this area, notably:
https://github.com/llvm/llvm-project/commit/059cead5ed7aa11ce1eae0bcc751ea0d1e23ea75
https://github.com/llvm/llvm-project/commit/69cd121dd9945429b565b6a5eb8719130de880a7
|
|
weight discount (#159180)
This aims to fix the issue that caused https://reviews.llvm.org/D106408
to be reverted.
CalcSpillWeights will reduce the weight of an interval by half if it's
considered rematerializable, so it will be evicted before others.
It does this by checking TII.isTriviallyReMaterializable. However
rematerialization may still fail if any of the defining MI's uses aren't
available at the locations it needs to be rematerialized.
LiveRangeEdit::canRematerializeAt calls allUsesAvailableAt to check this
but CalcSpillWeights doesn't, so the two diverge.
This fixes it by also checking allUsesAvailableAt in CalcSpillWeights.
In practice this has zero change AArch64/X86-64/RISC-V as measured on
llvm-test-suite, but prevents weights from being perturbed in an
upcoming patch which enables more rematerialization by re-attempting
https://reviews.llvm.org/D106408
|
|
This change builds on https://github.com/llvm/llvm-project/pull/160319
which tries to clarify which *callers* (not backends) assume that the
result is actually trivial.
This change itself should be NFC. Essentially, I'm just renaming the
existing isTrivialRematerializable to the non-trivial version and then
adding a new trivial version (with the same name as the prior function)
and simplifying a few callers which want that semantic.
This change does *not* enable non-trivial remat any more broadly than
was already done for our targets which were lying through the old APIs;
that will come separately. The goal here is simply to make the code
easier to follow in terms of what assumptions are being made where.
---------
Co-authored-by: Luke Lau <luke_lau@icloud.com>
|
|
(#159839)
LiveRangeEdit's rematerialization checking logic is used in two quite
different ways. For SplitKit and InlineSpiller, we're analyzing all defs
associated with a live interval, doing that analysis up front, and then
using the result a bit later. The RegisterCoalescer, we're analysing
exactly one ValNo at a time, and using the legality result immediately.
LRE had a checkRematerializable which existed basically to adapt the
later into the former usage model.
Instead, this change bypasses the ScannedRemat and Remattable
structures, and directly queries the underlying routines. This is easy
to read, and makes it more clear as to which uses actually need the
deferred analysis. (A following change may try to unwind that too, but
it's not strictly NFC.)
|
|
This is a low level utility to parse the MCInstrInfo and should
not depend on the state of the function.
|
|
(#151974)
Currently, when an instruction rematerialized by the register coalescer
defines more subregs of the destination register
than the original COPY instruction did, we only add dead defs for the
newly defined subregs if they were not defined anywhere
else. For example, consider something like this before
rematerialization:
```
%0:reg64 = CONSTANT 1
%1:reg128.sub_lo64_lo32 = COPY %0.lo32
%1:reg128.sub_lo64_hi32 = ...
...
```
that would look like this after rematerializing `%0`:
```
%0:reg64 = CONSTANT 2
%1:reg128.sub_lo64 = CONSTANT 2
%1:reg128.sub_lo64_hi32 = ...
...
```
A dead def would not be added for `%1.sub_lo64_hi32` at the 2nd
instruction because it's subrange wasn't empty beforehand.
|
|
coalescing SUBREG_TO_REG" (#134408)"
This reverts commit bae8f1336db6a7f3288a7dcf253f2d484743b257.
Some issues were found:
* https://github.com/llvm/llvm-project/issues/151768
* https://github.com/llvm/llvm-project/issues/151592
* https://github.com/llvm/llvm-project/pull/134408#issuecomment-3145468321
* https://github.com/llvm/llvm-project/issues/151888#issuecomment-3149286820
I'll revert this for the time being while I investigate.
|
|
coalescing SUBREG_TO_REG" (#134408)
This tries to reland #123632 (previously reverted by commit
6b1db79887df19bc8e8c946108966aa6021c8b87)
This PR aims to fix coalescing of SUBREG_TO_REG when sub-register
liveness tracking is enabled and this is now the so-manieth
reincarnation of this effort :)
This change is needed in order to enable subreg liveness tracking for
AArch64, because without the implicit-def, Machine Copy Propagation
would remove a 'redundant' copy because it doesn't realise that the
top 32-bits of the register are zeroed, which subsequent instructions
rely on.
Changes compared to previous PR:
* Rather than updating all instructions that define the source register
(SrcReg) of the SUBREG_TO_REG, this new approach only updates
instructions
that define SrcReg when they dominate the SUBREG_TO_REG. The live-ranges
are updated accordingly.
|
|
The register coalescer collects the copies forward, the stale comment
wasn't addressed back then by:
https://github.com/tomershafir/llvm-project/commit/75961ecc1a5b0dff6303df886ea9817248b78931.
This change updates the coment and also adds a similar one to the other
branch.
|
|
(#140002)
Add per-property has<Prop>/set<Prop>/reset<Prop> functions to
MachineFunctionProperties.
|
|
|
|
Only one caller cares about the true case of this parameter, so move
the check to that single caller. Note that RegisterCoalescer seems
like it should care, but it already duplicates the check several
lines above.
|
|
|
|
|
|
|
|
I saw that there used to be an overload `operator<<(OS, const Pass &P)`.
Now that it's gone, I still don't find anyone using the print method
except the internal debug printing hidden under `LLVM_DEBUG` in this
function.
|
|
|
|
|
|
coalescing SUBREG_TO_REG" (#123632)"
There's a regression with one of the bootstrap builds for x86.
I'll revert this while I investigate.
This reverts commit 4df6d3df24ae9cff07c70c96a1663cbba6e1dca5.
|
|
coalescing SUBREG_TO_REG" (#123632)
This PR aims to reland work done by @arsenm which was previously
reverted due to some tangentially related scheduler issues as discussed
on #76416.
This PR cherry-picks the original commit (0e46b49de433), and adds
another patch on top with the following changes:
* The code in `updateRegDefsUses` now updates subranges when
subreg-liveness-tracking is enabled.
* When adding an implicit-def operand for the super-register,
the code in `reMaterializeTrivialDef` which tries to remove
undefined subranges should now take into account that the lanes
from the super-reg are no longer undefined.
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
|
|
|
|
The code added in #116191 that updated the lanemasks for rematerialized
values checked if `DefMI`'s destination register had a subreg index.
This seems to have missed the following case:
```
%0:gpr32 = MOVi32imm 1
%1:gpr64 = SUBREG_TO_REG 0, %0:gpr32, %subreg.sub_32
```
which during rematerialization would have the following variables set:
```
DefMI = %0:gpr32 = MOVi32imm 1
NewMI = %3.sub_32:gpr64 = MOVi32imm 1 (rematerialized value)
```
When checking whether the lanemasks need to be generated, considering
whether DefMI's destination has a subreg index is insufficient, we
should look at DefMI's subreg index instead.
The added tests are a bit more involved, because I was not able to
reconstruct the issue without having some control flow in the test.
These tests come from actual reproducers.
|
|
By moving the code a bit later, we can factor out some of the conditions
as those are now already tested.
This will also be useful when adding another fix on top that uses
`NewMI`'s subreg index (to follow as a separate PR).
The change is intended to be NFC.
|
|
Do not try to rematerialize a super-register def used by a subregister
extract copy into a copy to a physical register if the other pieces of
the
full physreg are live at the rematerialization point. It would insert
the
super-register def at the rematerialization point, and assert since the
other half of the register was already live.
This is analagous to the undef subregister def handling above,
which handled the virtual register case.
Fixes #120970
|
|
implicit_def (#118321)
Previously this would delete the IMPLICIT_DEF and not introduce the undef
flag on the use operand.
Fixes sub-issue found while reducing #109294
|
|
(#117936)
|
|
(#116191)"
This patch can now reland after 318c69de52b6 relanded #114827.
This reverts commit 14a58a1390a72ba6c66606e58e86425dcb902763.
|
|
(#116191)" (#117367)
To pass tests with #117307 revert.
This reverts commit 3093b29b597b9a936a3e4d1c8bc4a7ccba8fc848.
|
|
In a situation like the following:
```
undef %2.subreg = INST %1 ; DefMI (rematerializable),
; DefSubIdx = subreg
%3 = COPY %2 ; SrcIdx = DstIdx = 0
.... = SOMEINSTR %3, %2
```
there are no subranges for `%3` because the entire register is copied,
but after rematerialization the subrange of the rematerialized value
must be fixed up with the appropriate subranges for `.subreg`.
(To me this issue seemed a bit similar to the issue fixed by #96839, but
then related to rematerialization)
|
|
This produces far too much terminal output, particularly for the
instruction reduction. Since it doesn't consider the liveness of of
the instructions it's deleting, it produces quite a lot of verifier
errors.
|
|
This will allow the MachineVerifier to check these properties
instead of just asserting
|
|
|
|
This `AA` parameter is not used and for most uses they just pass
a nullptr.
The use of `AA` was removed since 8d0383e.
|
|
(#96839)
The issue with the handling of the SUBREG_TO_REG is that we don't join
the subranges correctly when we join live ranges across the
SUBREG_TO_REG. For example when joining across this:
```
32B %2:gr64_nosp = SUBREG_TO_REG 0, %0:gr32, %subreg.sub_32bit
```
we want to join these live ranges:
```
%0 [16r,32r:0) 0@16r weight:0.000000e+00
%2 [32r,112r:0) 0@32r weight:0.000000e+00
```
Before the fix the range for the resulting merged `%2` is:
```
%2 [16r,112r:0) 0@16r weight:0.000000e+00
```
After the fix it is now this:
```
%2 [16r,112r:0) 0@16r L000000000000000F [16r,112r:0) 0@16r weight:0.000000e+00
```
Two tests are added to this fix. The X86 test fails without the patch.
The PowerPC test passes with and without the patch but is added as a way
track future possible failures when register classes are changed in a
future patch.
|
|
- Add `LiveIntervalsAnalysis`.
- Add `LiveIntervalsPrinterPass`.
- Use `LiveIntervalsWrapperPass` in legacy pass manager.
- Use `std::unique_ptr` instead of raw pointer for `LICalc`, so
destructor and default move constructor can handle it correctly.
This would be the last analysis required by `PHIElimination`.
|
|
- Add `SlotIndexesAnalysis`.
- Add `SlotIndexesPrinterPass`.
- Use `SlotIndexesWrapperPass` in legacy pass.
|
|
- Add `MachineLoopAnalysis`.
- Add `MachineLoopPrinterPass`.
- Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.
|
|
|
|
|
|
2214026e957397cc6385f778b28d570485a31856 didn't fix an unused variable
warning correctly.
|
|
Fixes #82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.
Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.
After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
|
|
`ErasedInstrs` but erased (#79820)"
This reverts commit 8316bf34ac21117f35bc8e6fafa2b3e7da75e1d5.
|
|
`ErasedInstrs` but erased (#79820)"
This reverts commit 95b14da678f4670283240ef4cf60f3a39bed97b4.
|
|
erased (#79820)
Fixes #79718. Fixes #71178.
The same instructions may exist in an iteration. We cannot immediately
delete instructions in `ErasedInstrs`.
|
|
coalescing SUBREG_TO_REG""
This reverts commit 0e46b49de43349f8cbb2a7d4c6badef6d16e31ae.
Causes crashes, see repro on https://github.com/llvm/llvm-project/commit/0e46b49de43349f8cbb2a7d4c6badef6d16e31ae.
|
|
|
|
coalescing SUBREG_TO_REG"
This reverts commit c398fa009a47eb24f88383d5e911e59e70f8db86.
PPC backend was fixed in 2f82662ce901c6666fceb9c6c5e0de216a1c9667
|
|
coalescing SUBREG_TO_REG""
This reverts commit f4b5be1ecdc85ca4257b739afb8d57e23c7a8030.
The above change was breaking the clang-ppc64le-linux-test-suite bot.
|