Age | Commit message (Collapse) | Author | Files | Lines |
|
This makes CodeGenRegisterClass::contains fast. Use this to simplify
inferMatchingSuperRegClass.
|
|
|
|
BreakFalseDeps picks the best register for undef operands if
instructions have false dependency. The problem is if the instruction is
close to the beginning of the function, ReachingDefAnalysis is over
optimism to the unused registers, which results in collision with
registers just defined in the caller.
This patch changes the selection of undef register in an reverse order,
which reduces the probability of register collisions between caller and
callee. It brings improvement in some of our internal benchmarks with
negligible effect on other benchmarks.
|
|
Give every leaf register a unique regunit, even if it has ad hoc
aliases.
Previously only leaf registers *without* ad hoc aliases would get a
unique regunit, but that caused situations where regunits could not be
used to distinguish a register from its subregs. For example:
- Registers A and B alias. They both get regunit 0 only.
- Register C has subregs A and B. It inherits regunits from its subregs,
so it also gets regunit 0 only.
After this fix, registers A and B will get a unique regunit in addition
to the regunit representing the alias, for example:
- A will get regunits 0 and 1.
- B will get regunits 0 and 2.
- C will get regunits 0, 1 and 2.
|
|
- Also eliminate unneeded std::string() around some literal strings.
|
|
|
|
#125983
|
|
This is necessary to enable composing subregisters in peephole-opt.
For now use a brute force table to find the return value. The worst
case target is AMDGPU with a 399 x 399 entry table.
|
|
|
|
MCRegister to unsigned. NFC
|
|
Also use brace initialization and emplace to avoid explicitly
constructing std::pair, and the same for std::tuple.
|
|
Some clients do not want to emit a terminator after each sub-sequence
(they have other means of determining the length of sub-sequences).
This moves `Term` argument from `emit` method to the constructor and
makes it optional. It couldn't be made optional while still on the
`emit` method because if the terminator wasn't specified, it has to be
taken into account in `layout` method as well.
The fact that `layout` method was called is now recorded in a dedicated
member variable, `IsLaidOut`. `Entries != 0` can no longer be used to
reliably check if `layout` method was called because it may be zero for
a different reason: the terminator wasn't specified and all added
sequences (if any) were empty.
This reduces the size of `*LaneMaskLists` and `*SubRegIdxLists` a bit
and resolves the removed TODO.
|
|
(#119487)
This reapplies #119122 with a fix for UBSAN errors in the X86 backend
related
to incrementing a nullptr.
|
|
tables. (#119122)"
Reverting due to UBSan failures in X86RegisterInfo::getLargestLegalSuperClass
This reverts commit c4873819a98f59ce4e2664f94c73c2dfec3393f8.
|
|
(#119122)
|
|
The issue with slow compile-time was caused by an assert in
AArch64RegisterInfo.cpp. The assert invokes 'checkAllSuperRegsMarked'
after adding all the reserved registers. This call gets very expensive
after adding the _HI registers due to the way the function searches
in the 'Exception' list, which is expected to be a small list but isn't
(the patch added 190 _HI regs).
It was possible to rewrite the code in such a way that the _HI registers
are marked as reserved after the check. This makes the problem go away
entirely and restores compile-time to what it was before (tested for
`check-runtimes`, which previously showed a ~5x slowdown).
This reverts commits:
1434d2ab215e3ea9c5f34689d056edd3d4423a78
2704647fb7986673b89cef1def729e3b022e2607
|
|
(#114827)" (#117307)
Details in #114827
This reverts commit c1c68baf7e0fcaef1f4ee86b527210f1391b55f6.
|
|
This is a step towards enabling subreg liveness tracking for AArch64,
which requires that registers are fully covered by their subregisters,
as covered here #109797.
There are several changes in this patch:
* AArch64RegisterInfo.td and tests: Define the high bits like B0_HI,
H0_HI, S0_HI, D0_HI, Q0_HI. Because the bits must be defined by some
register class, this added a register class which meant that we had to
update 'magic numbers' in several tests.
The use of ComposedSubRegIndex helped 'compress' the number of bits
required for the lanemask. The correctness of the masks is tested by an
explicit unit tests.
* LoadStoreOptimizer: previously 'HasDisjunctSubRegs' was only true for
register tuples, but with this change to describe the high bits, a
register like 'D0' will also have 'HasDisjunctSubRegs' set to true
(because it's fullly covered by S0 and S0_HI). The fix here is to
explicitly test if the register class is one of the known D/Q/Z tuples.
|
|
This is a part of effort to have better const correctness in TableGen
backends:
https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
|
|
Factor out the timer related functionality from `RecordKeeper` to a new
`TGTimer` class in a new file.
|
|
Change RegisterInfoEmitter to use const RecordKeeper.
This is a part of effort to have better const correctness in TableGen
backends:
https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
|
|
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
|
|
Change variable name `o` to `OS` to match definition, and
`ClName` to `ClassName` for better clarity.
Cache RegBank reference in the class and do no pass around
class members to functions.
|
|
register enums. (#109086)
Otherwise, the enum defaults to 'int'. Update a few places that used
'int' for registers that now need to change to avoid a signed/unsigned
compare warning.
I was hoping this would allow us to remove the 'int' comparison
operators in Register.h and MCRegister.h, but compares with literal 0
still need them.
|
|
Change CodeGenTarget to use const RecordKeeper.
This is a part of effort to have better const correctness in TableGen
backends:
https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
|
|
Change SetTheory::RecSet/RecVec to use const Record pointers.
|
|
To address @topperc 's comment at
https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/5?u=kanrobert
|
|
|
|
Constant registers like the zero registers XZR and WZR are treated as
any other register by LLVM-MCA. This can create non existent dependency
chains.
Currently there is no method in MCA to query if a register is constant.
This patch fixes the issue by adding a bool Constant
variable to MCRegisterDesc that is true for constant registers. Since
constant registers do not create dependencies, it makes sense to add
this check to MCA.
|
|
This is needed to provide proper size and offset for the GPRPair subreg
indices on RISC-V. The size of a GPR already uses HwMode. Previously we
said the subreg indices have unknown size and offset, but this stops
DwarfExpression::addMachineReg from being able to find the registers
that make up the pair.
I believe this fixes https://github.com/llvm/llvm-project/issues/85864
but need to verify.
|
|
Refactor of the llvm-tblgen source into:
- a "Basic" library, which contains the bare minimum utilities to build
`llvm-min-tablegen`
- a "Common" library which contains all of the helpers for TableGen
backends. Such helpers can be shared by more than one backend, and even
unit tested (e.g. CodeExpander is, maybe we can add more over time)
Fixes #80647
|
|
I'm planning to add HwMode support to SubRegIdxRanges for RISC-V GPR
pairs. The MC layer is currently unaware of the HwMode for registers and
I'd like to keep it that way.
This information is not used by the MC layer so I think it is safe to
move it.
|
|
For some reason this was missing from STLExtras.
|
|
```
find llvm/utils/TableGen -iname "*.h" -o -iname "*.cpp" | xargs clang-format-16 -i
```
Split from #80847
|
|
reserveRegisterTuples is slow because it uses MCRegAliasIterator and
hence ends up reserving the same aliased registers many times. This
patch changes getReservedRegs not to use it for reserving SGPRs, VGPRs
and AGPRs. Instead it iterates through base register classes, which
should come closer to reserving each register once only.
Overall this speeds up the time to run check-llvm-codegen-amdgpu in my
Release build from 18.4 seconds to 16.9 seconds (all timings +/- 0.2).
|
|
Finally addresses https://reviews.llvm.org/D148769#4311232 :)
No behavior change.
|
|
Store it in TargetRegisterInfo instead. Worth 54k on llc size.
|
|
This simplifies every use of MCRegUnitMaskIterator.
Differential Revision: https://reviews.llvm.org/D157864
|
|
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D156110
|
|
human-readable.
Helps tracking changes in the tables on adding new register classes and
updating BaseClassOrder values.
Also eliminates tables translating base register class indexes into
TargetRegisterClass pointers.
Reviewed By: critson
Differential Revision: https://reviews.llvm.org/D156097
|
|
We don't expect this to be used on RV32 currently so remove it
to reduce number of entries in the isel table.
Teach RegisterInfoEmitter.cpp to allow a type to be missing for
a particular HwMode.
|
|
The lists contain differences between register numbers, not the register
numbers themselves. Since a difference can also be negative, this also
changes its type to signed.
Changing the type to signed exposed a "bug". For AMDGPU, which has many
registers, the first element of a sequence could be as big as ~45k.
The value does not fit into int16_t, but fits into uint16_t. The bug
didn't show up because of unsigned wrapping and truncation of the Val
field in the advance() method.
To fix the issue, I changed the way regunit difflists are encoded. The
4-bit 'scale' field of MCRegisterDesc::RegUnit was replaced by 12-bit
number of the first regunit, and the first element of each of the lists
was removed. The higher 20 bits of RegUnit field contain the initial
offset into DiffLists array.
AMDGPU has 1'409 regunits (2^12 = 4'096), and the biggest offset is
80'041 (2^20 = 1'048'576). That is, there is enough room.
Changing the encoding method also resulted in a smaller array size, the
numbers are below (I omitted targets with less than 100 elements).
```
AMDGPU | 80052 | 78741 | -1,6%
RISCV | 6498 | 6297 | -3,1%
ARM | 4181 | 3966 | -5,1%
AArch64 | 2770 | 2592 | -6,4%
PPC | 1578 | 1441 | -8,7%
Hexagon | 994 | 740 | -25,6%
R600 | 508 | 398 | -21,7%
VE | 471 | 459 | -2,5%
Sparc | 381 | 363 | -4,7%
X86 | 326 | 208 | -36,2%
Mips | 253 | 200 | -20,9%
SystemZ | 186 | 162 | -12,9%
```
Reviewed By: foad, arsenm
Differential Revision: https://reviews.llvm.org/D151036
|
|
This reverts commit fa2827f0796c08e36b0b157fc526dd59cd6368e3.
Causes build bot failres:
https://lab.llvm.org/buildbot/#/builders/38/builds/12037
|
|
The lists contain differences between register numbers, not the register
numbers themselves. Since a difference can also be negative, this also
changes its type to signed.
Changing the type to signed exposed a "bug". For AMDGPU, which has many
registers, the first element of a sequence could be as big as ~45k.
The value does not fit into int16_t, but fits into uint16_t. The bug
didn't show up because of unsigned wrapping and truncation of the Val
field in the advance() method.
To fix the issue, I changed the way regunit difflists are encoded. The
4-bit 'scale' field of MCRegisterDesc::RegUnit was replaced by 12-bit
number of the first regunit, and the first element of each of the lists
was removed. The higher 20 bits of RegUnit field contain the initial
offset into DiffLists array.
AMDGPU has 1'409 regunits (2^12 = 4'096), and the biggest offset is
80'041 (2^20 = 1'048'576). That is, there is enough room.
Changing the encoding method also resulted in a smaller array size, the
numbers are below (I omitted targets with less than 100 elements).
```
AMDGPU | 80052 | 78741 | -1,6%
RISCV | 6498 | 6297 | -3,1%
ARM | 4181 | 3966 | -5,1%
AArch64 | 2770 | 2592 | -6,4%
PPC | 1578 | 1441 | -8,7%
Hexagon | 994 | 740 | -25,6%
R600 | 508 | 398 | -21,7%
VE | 471 | 459 | -2,5%
Sparc | 381 | 363 | -4,7%
X86 | 326 | 208 | -36,2%
Mips | 253 | 200 | -20,9%
SystemZ | 186 | 162 | -12,9%
```
Reviewed By: foad, arsenm
Differential Revision: https://reviews.llvm.org/D151036
|
|
This is rework of;
- rG13e77db2df94 (r328395; MVT)
Since `LowLevelType.h` has been restored to `CodeGen`, `MachinveValueType.h`
can be restored as well.
Depends on D148767
Differential Revision: https://reviews.llvm.org/D149024
|
|
Each emitter became self-contained since it has the registration of option.
Differential Revision: https://reviews.llvm.org/D144351
|
|
"TableGenBackends.h" has declarations of emitters.
|
|
|
|
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef have been removed in the process.
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
|
|
Allow targets to define a mapping from registers to register
classes such that each register has exactly one base class.
As registers may be in multiple register classes the base class
is determined by the container class with the lowest BaseClassOrder.
Only register classes with BaseClassOrder set are considered
when determining the base classes. By default BaseClassOrder is
unset in RegisterClass so no code is generated unless a target
explicit defines one or more base register classes.
Reviewed By: arsenm, foad
Differential Revision: https://reviews.llvm.org/D139616
|