|
This heuristic was originally added in 40c4aa with the stated purpose of
avoiding global split on long live ranges created by MachineLICM
hoisting trivially rematerializable instructions. In the meantime,
various backends have introduced non-trivial rematerialization cases,
MachineLICM gained an explicit triviality check, and we've reworked
our APIs to match naming-wise. Let's move this heuristic back to truly
trivial remat only.
This is a functional change, though somewhat hard to hit. This change
will cause non-trivially rematerializable instructions to be globally
split more often. This is likely a good thing since non-trivial remat
may not be legal at all possible points in the live interval, but may
cost slightly more compile time.
I don't have a motivating example; I found it when reviewing the callers
of isReMaterializable(MI).
|
|
This change builds on https://github.com/llvm/llvm-project/pull/160319
which tries to clarify which *callers* (not backends) assume that the
result is actually trivial.
This change itself should be NFC. Essentially, I'm just renaming the
existing isTrivialRematerializable to the non-trivial version and then
adding a new trivial version (with the same name as the prior function)
and simplifying a few callers which want that semantic.
This change does *not* enable non-trivial remat any more broadly than
was already done for our targets which were lying through the old APIs;
that will come separately. The goal here is simply to make the code
easier to follow in terms of what assumptions are being made where.
---------
Co-authored-by: Luke Lau <luke_lau@icloud.com>
|
|
Change shouldRewriteCopySrc to return the common register
class and expose it as a utility function. I've found myself
reproducing essentially the same logic in multiple places. The
purpose of this function is to just work through the API constraints
of which combination of register class and subreg indexes you have.
i.e. you need to use a different function if you have 0, 1, or 2
subregister indexes involved in a pair of copy-like operations.
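A hedged illustration of those API constraints using existing TargetRegisterInfo helpers (TRI, DefRC/SrcRC and DefSubReg/SrcSubReg are placeholder names for the classes and subreg indexes involved, not the utility this change adds):
```cpp
// Sketch: which helper applies depends on how many subregister indexes are
// involved in the pair of copy-like operands.
const TargetRegisterClass *NewRC = nullptr;
if (!DefSubReg && !SrcSubReg) {
  // 0 subregister indexes: plain common subclass of the two classes.
  NewRC = TRI->getCommonSubClass(DefRC, SrcRC);
} else if (SrcSubReg && !DefSubReg) {
  // 1 subregister index: subclass of SrcRC whose SrcSubReg subreg is in DefRC.
  NewRC = TRI->getMatchingSuperRegClass(SrcRC, DefRC, SrcSubReg);
} else if (DefSubReg && !SrcSubReg) {
  NewRC = TRI->getMatchingSuperRegClass(DefRC, SrcRC, DefSubReg);
} else {
  // 2 subregister indexes: common super register class of both operands.
  unsigned PreA, PreB; // composed preserved subreg indexes (unused here)
  NewRC = TRI->getCommonSuperRegClass(SrcRC, SrcSubReg, DefRC, DefSubReg,
                                      PreA, PreB);
}
```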
|
|
Currently, the spill weight is only determined by isDef/isUse and
block frequency. However, for registers with different register
classes, the costs of spilling them are different.
For example, for `LMUL>1` registers (where several physical
registers are grouped into one larger logical register), the costs are larger
than in the `LMUL=1` case (where there is only one physical register).
To solve this problem, a new target hook `getSpillWeightScaleFactor`
is added. Targets can override the default factor (which is `1.0`)
according to the register class.
For RISC-V, the factors are set to the `RegClassWeight` which is
used to track register pressure. The values of `RegClassWeight`
happen to be the number of register units.
I believe all targets with such grouped registers can benefit
from this change, but only RISC-V is customized in this patch, since
doing so there has already been widely agreed upon. The other targets
need more performance data to go further.
Partially fixes #113489.
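A minimal sketch of what the RISC-V override might look like, assuming the hook receives the register class being spilled and that the factor comes directly from `RegClassWeight` as described above (the exact signature is an assumption):
```cpp
// Sketch: scale spill weight by the register class weight, which for RISC-V
// happens to equal the number of register units, so an LMUL>1 register group
// costs proportionally more to spill than an LMUL=1 register.
float RISCVRegisterInfo::getSpillWeightScaleFactor(
    const TargetRegisterClass *RC) const {
  return getRegClassWeight(RC).RegWeight;
}
```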
|
|
|
|
NFC (#127968)"
This reverts commit ff99af7ea03b3be46bec7203bd2b74048d29a52a.
|
|
(#127968)
Use nonstatic member instead. This requires explicit conversions, but
many will go away as we continue converting unsigned to Register.
In a few places where it was simple, I changed unsigned to Register.
|
|
Use the nonstatic member instead.
I'm pretty sure the code in SPIRV is a layering violation. MC layer
files are using a CodeGen header.
|
|
In the problematic situation fixed in 61e556d2bdf3fa0a10dbaadd2dd03d01c341bd27,
shouldRewriteCopySrc is called with identical register class arguments,
but one has a subregister index. This was very surprising to me,
and it probably shouldn't be valid for it to occur. It happens in cases
with uncoalescable copies where the register class changes, and further
up the chain there is a subregister operand. We could possibly just
skip over uncoalescable instructions in the chain rather than letting
this query deal with it (or pre-filter the obvious case of a subreg
with the same class).
The generic implementation is supposed to account for checking for
valid subregisters by checking getMatchingSuperRegClass already,
but that was bypassed by the early exit for exact class match.
Also adds a reduced mir test demonstrating the exact problematic
case.
|
|
Register::stackSlot2Index. NFC (#125028)
|
|
|
|
|
|
Here we add two methods, `getCommonMinimalPhysRegClass` and an LLT
version `getCommonMinimalPhysRegClassLLT`, which return the smallest
common sub register class of the right type that contains both input
registers.
We don't overload `getMinimalPhysRegClass` as that would introduce
ambiguities.
We use it to simplify some code in the RISC-V target.
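A hedged usage sketch, assuming the helpers take two physical registers (plus an LLT for the LLT flavor) and return the smallest common class, or null if there is none:
```cpp
// Smallest register class containing both physical registers.
const TargetRegisterClass *RC =
    TRI->getCommonMinimalPhysRegClass(RISCV::X10, RISCV::X11);
// LLT-typed variant, e.g. when the required type is known as an LLT.
const TargetRegisterClass *RC64 =
    TRI->getCommonMinimalPhysRegClassLLT(RISCV::X10, RISCV::X11,
                                         LLT::scalar(64));
```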
|
|
Identified with misc-include-cleaner.
|
|
raw_ostream::operator<< (#106877)
These would implicitly cast the register to `unsigned`. Switching most of
them to use printReg gives more readable output. Change some
others to use Register::id() so we can eventually remove the implicit
cast to `unsigned`.
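A small hedged example of both replacements (VirtReg, PhysReg and TRI are placeholders):
```cpp
// Readable debug output via printReg instead of the implicit unsigned cast.
LLVM_DEBUG(dbgs() << "assigning " << printReg(VirtReg, TRI) << " to "
                  << printReg(PhysReg, TRI) << '\n');
// Where the raw value is genuinely needed, make the conversion explicit.
unsigned RawId = VirtReg.id();
```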
|
|
This hint map is not required whenever a new register is added; in fact,
at -O0 it is not used at all. Growing this map is quite expensive, as
SmallVectors are not trivially copyable.
Grow this map only when hints are actually added, to avoid repeated growth
and to avoid growing it at all when no hints are added.
|
|
shouldRealignStack/canRealignStack are repeatedly called in PEI (through
hasStackRealignment). Checking function attributes is expensive, so
cache this data in the MachineFrameInfo, which already held most of it.
This slightly changes the semantics of `MachineFrameInfo::ForcedRealign`,
which is now also true when the `stackrealign` attribute is set.
|
|
This is needed to provide proper size and offset for the GPRPair subreg
indices on RISC-V. The size of a GPR already uses HwMode. Previously we
said the subreg indices have unknown size and offset, but this stops
DwarfExpression::addMachineReg from being able to find the registers
that make up the pair.
I believe this fixes https://github.com/llvm/llvm-project/issues/85864
but need to verify.
|
|
I'm planning to add HwMode support to SubRegIdxRanges for RISC-V GPR
pairs. The MC layer is currently unaware of the HwMode for registers and
I'd like to keep it that way.
This information is not used by the MC layer so I think it is safe to
move it.
|
|
Finally addresses https://reviews.llvm.org/D148769#4311232 :)
No behavior change.
|
|
(#70881)
…gSizeInBits
This patch changes getRegSizeInBits to return a TypeSize instead of an
unsigned in the case that a virtual register has a scalable LLT. In the
case that the register is physical, a fixed TypeSize is returned.
The MachineVerifier pass is updated to allow copies between fixed and
scalable operands as long as the Src size will fit into the Dest size.
This is a precommit which will be stacked on by a change to GISel to
generate COPYs with a scalable destination but a fixed size source.
This patch is stacked on https://github.com/llvm/llvm-project/pull/70893
for the ability to use scalable vector types in MIR tests.
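A hedged sketch of a TypeSize-aware caller after this change (SrcReg, DstReg, MRI and TRI are placeholders):
```cpp
// getRegSizeInBits now returns a TypeSize, so comparisons between fixed and
// scalable sizes must go through the TypeSize-aware predicates.
TypeSize SrcSize = TRI->getRegSizeInBits(SrcReg, MRI);
TypeSize DstSize = TRI->getRegSizeInBits(DstReg, MRI);
if (TypeSize::isKnownLE(SrcSize, DstSize)) {
  // Per the description above, the verifier accepts the COPY as long as the
  // source size is known to fit into the destination size.
}
```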
|
|
Store it in TargetRegisterInfo instead. Worth 54k on llc size.
|
|
This is a rework of:
- rG13e77db2df94 (r328395; MVT)
Since `LowLevelType.h` has been restored to `CodeGen`, `MachineValueType.h`
can be restored as well.
Depends on D148767
Differential Revision: https://reviews.llvm.org/D149024
|
|
Differential Revision: https://reviews.llvm.org/D148613
|
|
The first member of the pair should be unsigned instead of Register
because it is the hint type: 0 for simple (target-independent) hints and
other values for target-dependent hints.
Differential Revision: https://reviews.llvm.org/D146646
|
|
|
|
If `getCoveringSubRegIndexes` returns a set of subregister indexes where some subregisters overlap others, it can create unsatisfiable copy bundles that eventually cause VirtRegRewriter to error out due to "cycles in copy bundle".
We can simply prevent this by making the algorithm skip over subregister indexes that would cause an overlap with already-covered lanes.
Note that in the case of AMDGPU, this problem is caused by the lack of subregister indexes for 13/14/15-register tuples. We have everything up until 12, then we have 16 and 32 but nothing between 12 and 16.
This means that the best candidate for doing the fewest copies when splitting a 29-register tuple was to copy (e.g.) 0-15 and 14-29, causing an overlap.
With this change, getCoveringSubRegIndexes will now prefer using something like 0-15, 16-28 and 1
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D141576
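A hedged sketch of the overlap check described in this commit, phrased in terms of lane masks (CandidateSubRegIndexes and PickedIndexes are illustrative names, not the actual variables):
```cpp
// Skip any candidate subregister index whose lanes overlap lanes that are
// already covered, so the chosen set of indexes never overlaps itself.
LaneBitmask Covered;                              // lanes covered so far
for (unsigned Idx : CandidateSubRegIndexes) {
  LaneBitmask IdxLanes = TRI->getSubRegIndexLaneMask(Idx);
  if ((IdxLanes & Covered).any())
    continue;                                     // would overlap, skip it
  Covered |= IdxLanes;
  PickedIndexes.push_back(Idx);
}
```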
|
|
Use isPhysical/isVirtual methods.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D141715
|
|
This header is very large (3M lines once expanded) and was included in locations
where DWARF-specific information was not needed.
More specifically, this commit removes the dependency on
llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and
llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used,
this has a decent impact on the number of preprocessed lines generated during
compilation of LLVM, as showcased below.
This is achieved by moving some definitions back to the .cpp file, no
performance impact implied[0].
As a consequence of this patch, downstream users may need to manually include
some extra files:
llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h
llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h
In some situations, code may be relying on the fact that
llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h; this hidden
dependency now needs to be made explicit.
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after: 10978519
before: 11245451
Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
[0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions
Differential Revision: https://reviews.llvm.org/D118781
|
|
Identified with modernize-use-bool-literals.
|
|
|
|
MachineRegisterInfo caches the reserved register set that is computed by
TargetRegisterInfo::getReservedRegs, so call into MRI to get the
reserved regs to avoid recomputing them.
In particular this speeds up AMDGPU's SIFormMemoryClauses pass because
AMDGPU has a particularly complicated reserved set that is expensive to
compute.
Differential Revision: https://reviews.llvm.org/D102318
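A hedged sketch of the swap (MF, MRI and TRI are assumed to be in scope; MRI caches the reserved set once it has been frozen):
```cpp
// Before: recomputes the reserved set, which is expensive on AMDGPU.
BitVector Recomputed = TRI->getReservedRegs(MF);
// After: reuse the set already cached by MachineRegisterInfo.
const BitVector &Reserved = MRI.getReservedRegs();
```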
|
|
This was picking a concrete size for a physical register, and
enforcing an exact match on the virtual register's type size. Some
targets add multiple types to a register class, and some are smaller
than the full bit width. For example x86 adds f32 to 128-bit xmm
registers, and AMDGPU adds i16/f16 to 32-bit registers.
It might be better to represent these cases as a copy of the full
register and an extraction of the subpart, but a lot of code assumes
you can directly copy. This will help fix the current usage of the DAG
calling convention infrastructure which is incompatible with how
GlobalISel is now using it.
The API is somewhat cumbersome here, but I just mirrored the existing
functions, except now with LLTs (and allow returning null on failure,
unlike the MVT version). I think the concept of selecting register
classes based on type is flawed to begin with, but I'm trying to keep
this compatible with the existing handling.
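A hedged usage sketch; I'm assuming the LLT-mirrored helper is what now exists on TargetRegisterInfo as getMinimalPhysRegClassLLT, since the message does not name it:
```cpp
// Unlike the MVT-based getMinimalPhysRegClass, the LLT flavor may return
// null when no register class of the requested type contains PhysReg.
if (const TargetRegisterClass *RC =
        TRI->getMinimalPhysRegClassLLT(PhysReg, LLT::scalar(16))) {
  // Use RC, e.g. when copying a sub-width value that lives in a wider
  // physical register (f32 in an xmm, i16/f16 in a 32-bit AMDGPU register).
}
```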
|
|
Currently needsStackRealignment returns false if canRealignStack returns false.
This means that the behavior of needsStackRealignment does not correspond to
its name and description; a function might need stack realignment, but if it
is not possible then this function returns false. Furthermore,
needsStackRealignment is not virtual and therefore some backends have made use
of canRealignStack to indicate whether a function needs stack realignment.
This patch attempts to clarify the situation by separating them and introducing
new names:
- shouldRealignStack - true if there is any reason the stack should be
realigned
- canRealignStack - true if we are still able to realign the stack (e.g. we
can still reserve/have reserved a frame pointer)
- hasStackRealignment = shouldRealignStack && canRealignStack (not target
customisable)
Targets can now override shouldRealignStack to indicate that stack realignment
is required.
This change will make it easier in a future change to handle the case where we
need to realign the stack but can't do so (for example when the register
allocator creates an aligned spill after the frame pointer has been
eliminated).
Differential Revision: https://reviews.llvm.org/D98716
Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87
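The relationship described above, written out as a sketch (per the message, hasStackRealignment is simply the conjunction of the two hooks and is not target-customisable):
```cpp
// Non-virtual convenience query combining the two target hooks.
bool TargetRegisterInfo::hasStackRealignment(const MachineFunction &MF) const {
  return shouldRealignStack(MF) && canRealignStack(MF);
}
```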
|
|
Return the best covering index, and the additional indexes needed to complete
the mask. This logically belongs in TargetRegisterInfo, although I ended
up not needing it for the reason I originally split this out.
|
|
Reviewed By: nemanjai, jsji
Differential Revision: https://reviews.llvm.org/D92069
|
|
This reverts commit 3bdf4507b66348ad78df4655a8e4f36c3fc10f3c.
Post commit comments need to be addressed first.
|
|
add one use check to lookThruCopyLike.
The root node is safe to delete if we are sure that every
definition in the copy chain has only one use.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D92069
|
|
Extend PEI to emit a DWARF expression for StackOffsets that have
a fixed and scalable component. This means the expression that needs
to be added is either:
<base> + offset
or:
<base> + offset + scalable_offset * scalereg
where for SVE, the scale reg is the Vector Granule Dwarf register, which
encodes the number of 64bit 'granules' in an SVE vector and which
the debugger can evaluate at runtime.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D90020
|
|
Also renamed the fields to follow style guidelines.
Accessors help with readability - weight mutation, in particular,
is easier to follow this way.
Differential Revision: https://reviews.llvm.org/D87725
|
|
|
|
|
|
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jholewinski, arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76348
|
|
the target
RegAllocGreedy uses a fairly compile-time-intensive splitting heuristic
called region splitting. This heuristic was disabled via another heuristic
when it was likely that it wouldn't be worth the compile time. The only way
to control this other heuristic was via a command line option (huge-size-for-split).
This commit gives more control over this heuristic by making it overridable
by the target using a target hook in TargetRegisterInfo called
shouldRegionSplitForVirtReg.
The default implementation of this hook keeps the heuristic as it was
before this patch.
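A hedged sketch of how the default implementation can preserve the prior behaviour, assuming HugeSizeForSplit is the existing huge-size-for-split command-line threshold and that the hook receives the function and the virtual register's live interval:
```cpp
// Default hook: keep skipping region splitting for huge live intervals whose
// single def is trivially rematerializable, as before this patch.
bool TargetRegisterInfo::shouldRegionSplitForVirtReg(
    const MachineFunction &MF, const LiveInterval &VirtReg) const {
  const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
  const MachineRegisterInfo &MRI = MF.getRegInfo();
  if (MachineInstr *MI = MRI.getUniqueVRegDef(VirtReg.reg()))
    if (TII->isTriviallyReMaterializable(*MI) &&
        VirtReg.size() > HugeSizeForSplit)
      return false;
  return true;
}
```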
|
|
This was added to support fp128 on x86-64, but appears to be
unneeded now. This may be because the FR128 register class
added back then was merged with the VR128 register class later.
llvm-svn: 371815
|
|
Summary:
This was mostly an experiment to assess the feasibility of completely
eliminating a problematic implicit conversion case in D61321 in advance of
landing that* but it also happens to align with the goal of propagating the
use of Register/MCRegister instead of unsigned so I believe it makes sense
to commit it.
The overall process for eliminating the implicit conversions from
Register/MCRegister -> unsigned was to:
1. Add an explicit conversion to support genuinely required conversions to
unsigned. For example, using them as an index for IndexedMap. Sadly it's
not possible to have an explicit and implicit conversion to the same
type and only deprecate the implicit one so I called the explicit
conversion get().
2. Temporarily annotate the implicit conversion to unsigned with
LLVM_ATTRIBUTE_DEPRECATED to make them visible
3. Eliminate implicit conversions by propagating Register/MCRegister/
explicit-conversions appropriately
4. Remove the deprecation added in 2.
* My conclusion is that it isn't feasible as there's too much code to
update in one go.
Depends on D65678
Reviewers: arsenm
Subscribers: MatzeB, wdng, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65685
llvm-svn: 368643
|
|
llvm::Register as started by r367614. NFC
llvm-svn: 367633
|
|
The build failure found after the rL365467 has been
resolved.
Differential Revision: https://reviews.llvm.org/D60716
llvm-svn: 367446
|
|
A build failure was found on the SystemZ platform.
This reverts commit 9e7e73578e54cd22b3c7af4b54274d743b6607cc.
llvm-svn: 365886
|
|
Dump the DWARF information about call sites and call site parameters into
debug info sections.
The patch also provides an interface for the interpretation of instructions
that could load values of call site parameters in order to generate DWARF
about the call site parameters.
([13/13] Introduce the debug entry values.)
Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>
Differential Revision: https://reviews.llvm.org/D60716
llvm-svn: 365467
|