Age | Commit message (Collapse) | Author | Files | Lines |
|
Refactor of the llvm-tblgen source into:
- a "Basic" library, which contains the bare minimum utilities to build
`llvm-min-tablegen`
- a "Common" library which contains all of the helpers for TableGen
backends. Such helpers can be shared by more than one backend, and even
unit tested (e.g. CodeExpander is, maybe we can add more over time)
Fixes #80647
|
|
Mark a variable const. Capitalize a variable name.
I'm going to add HwMode support to this code and wanted to clean it
up a bit beforehand.
|
|
|
|
These are unnecessary since C++17.
|
|
|
|
|
|
This seems to be a trick to avoid copying a RegUnitSet, but it can be
done more simply using std::move.
|
|
|
|
Historically TableGen has used `A.swap(B)` to move containers without
the expense of copying them. Perhaps this predated rvalue references. In
any case `A = std::move(B)` seems like a more direct way to implement
this when only A is required after the operation.
|
|
In a few places we test whether sets (i.e. sorted ranges) intersect by
computing the set_intersection and then testing whether it is empty. For
this purpose it should be more efficient to use a std:vector instead of
a std::set to hold the result of the set_intersection, since insertion
is simpler.
|
|
|
|
```
find llvm/utils/TableGen -iname "*.h" -o -iname "*.cpp" | xargs clang-format-16 -i
```
Split from #80847
|
|
* Introduce field `PositionOrder` for class `Register` and
`RegisterTuples`
* If register A's `PositionOrder` < register B's `PositionOrder`, then A
is placed before B in the enum in X86GenRegisterInfo.inc
* The new order of registers in the enum for X86 will be
1. Registers before AVX512,
2. AVX512 registers (X/YMM16-31, ZMM0-31, K registers)
3. AMX registers (TMM)
4. APX registers (R16-R31)
* Add a new target hook `getNumSupportedRegs()` to return the number of
registers for the function (may overestimate).
* Replace `getNumRegs()` with `getNumSupportedRegs()` in LiveVariables
to eliminate iterations on unsupported registers
This patch can reduce 0.3% instruction count regression for sqlite3
during compile-stage (O3) by not iterating on APX registers
for #67702
|
|
with expensive checks. (#67340)
This is mostly AMDGPU-specific. When the expensive checks are enabled,
generating of AMDGPUGenRegisterInfo.inc currently takes about 20 minutes
on my machine for release+asserts builds, which effectively prevents
such testing from regular use. This patch fixes this by reducing the
time to about 2 minutes.
Generation times for AMDGPUGenRegisterInfo.inc without expensive checks
and other *GenRegisterInfo.inc files with and without the expensive
checks remain approximately the same.
The patch doesn't cause any changes in the contents of the generated
files.
The root cause of the current poor performance is that where glibcxx is
used, enabling the expensive checks defines _GLIBCXX_DEBUG, which
enables various consistency checks in the library. One such check is in
std::binary_search() to make sure the range is ordered. As
CodeGenRegisterClass::contains() relies on std::binary_search() and it
is called very a large number of times from within
CodeGenRegBank::inferMatchingSuperRegClass(), the libcxx checks heavily
affect the runtimes.
|
|
This commit:
TableGen: Try to fix expensive checks failures
d2a9b87fee84766b28bd39b46c913da00e1450f4
fixed one of the sort() calls, but there's another.
Caught on expensive-checks buildbots that started to fail sporadically
after submitting
[AMDGPU] Add True16 register classes.
469b3bfad20550968ac428738eb1f8bb8ce3e96d
|
|
This simplifies every use of MCRegUnitMaskIterator.
Differential Revision: https://reviews.llvm.org/D157864
|
|
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D156110
|
|
To be consistent with RISC-V branding guidelines
https://riscv.org/about/risc-v-branding-guidelines/
Think we should be using RISC-V where possible.
More patches will follow.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D146449
|
|
Differential Revision: https://reviews.llvm.org/D142472
|
|
|
|
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from
llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
|
unknown size
When calculating the size of concatenated subregisters, and at least
one of the subregisters involved has an unknown size (-1), then the
concatenated size should be set to -1 as well.
This bug was found for an out-of-tree target.
Looking at lib/Target the only in-tree target that has a subregister
with unknown size is X86:
X86RegisterInfo.td: def sub_mask_0 : SubRegIndex<-1>;
But it looks like sub_mask_0 don't result in any concatenated subreg
index with faulty size if looking at X86SubRegIdxRanges[].
Differential Revision: https://reviews.llvm.org/D138341
|
|
The class priority is expected to be at most 5 bits before it starts
clobbering bits used for other fields. Also clamp the instruction
distance in case we have millions of instructions.
AMDGPU was accidentally overflowing into the global priority bit in
some cases. I think in principal we would have wanted this, but in the
cases I've looked at, it had the counter intuitive effect and
de-prioritized the large register tuple.
Avoid using weird bit hack PPC uses for global priority. The
AllocationPriority field is really 5 bits, and PPC was relying on
overflowing this to 6-bits to forcibly set the global priority
bit. Split this out as a separate flag to avoid having magic behavior
for values above 31.
|
|
This commit moves the information on whether a register is constant into
the Tablegen files to allow generating the implementaiton of
isConstantPhysReg(). I've marked isConstantPhysReg() as final in this
generated file to ensure that changes are made to tablegen instead of
overriding this function, but if that turns out to be too restrictive,
we can remove the qualifier.
This should be pretty much NFC, but I did notice that e.g. the AMDGPU
generated file also includes the LO16/HI16 registers now.
The new isConstant flag will also be used by D131958 to ensure that
constant registers are marked as call-preserved.
Differential Revision: https://reviews.llvm.org/D131962
|
|
The `hasType` function may be given a type that has been modified from
its original form (in particular made "simple", due to a predicate).
Make sure that such a type is still recognized as associated with a
register class, if the class contains it under any hw-mode.
This is somewhat optimistic though, since there is no information as
to where that simple type originated from.
|
|
This commits removes TableGens reliance on managed static global record state
by moving the RecordContext into the RecordKeeper. The RecordKeeper is now
treated similarly to a (LLVM|MLIR|etc)Context object and is passed to static
construction functions. This is an important step forward in removing TableGens
reliance on global state, and in a followup will allow for users that parse tablegen
to parse multiple tablegen files without worrying about Record lifetime.
Differential Revision: https://reviews.llvm.org/D125276
|
|
Add void casts to mark the variables used, next to the places where
they are used in assert or `LLVM_DEBUG()` expressions.
Differential Revision: https://reviews.llvm.org/D123117
|
|
This also includes a few cleanup from Support.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121331
|
|
Remove ARTIFICIAL_VGPR which only existed to make VReg_1 not empty.
Differential Revision: https://reviews.llvm.org/D119552
|
|
The "-fzero-call-used-regs" option tells the compiler to zero out
certain registers before the function returns. It's also available as a
function attribute: zero_call_used_regs.
The two upper categories are:
- "used": Zero out used registers.
- "all": Zero out all registers, whether used or not.
The individual options are:
- "skip": Don't zero out any registers. This is the default.
- "used": Zero out all used registers.
- "used-arg": Zero out used registers that are used for arguments.
- "used-gpr": Zero out used registers that are GPRs.
- "used-gpr-arg": Zero out used GPRs that are used as arguments.
- "all": Zero out all registers.
- "all-arg": Zero out all registers used for arguments.
- "all-gpr": Zero out all GPRs.
- "all-gpr-arg": Zero out all GPRs used for arguments.
This is used to help mitigate Return-Oriented Programming exploits.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110869
|
|
Tablegen currently expects targets to have at least one
pressure set for every broader register category. AMDGPU's
VGPR or AGPR, for instance, seemed to work correctly without
any pset, though we have forced one for each type to avoid
the assertion in computeRegUnitSets. However, psets can not
be entirely empty. At least one set is mandatory for every
target. This patch bypasses the assertion for the classes
when GeneratePressureSet is zero while ensuring the
RegUnitSets are not empty.
Reviewed By: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D110305
|
|
|
|
Analogous to the TSFlags for machine instructions, this
patch introduces a bit vector for register classes to have
target specific flags that become a tablegened value in
TargetRegisterClass.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D108767
|
|
When setting Allocatable on a generated register class check all
superclasses and set Allocatable true if any superclass is
allocatable.
Without this change generated register classes based on an
allocatable class may end up unallocatable due to the topological
inheritance order.
This change primarily effects AMDGPU backend; however, there are
a few changes in MIPs GlobalISel register constraints as a result.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D105967
|
|
Use range-based for loops in TableGen.
Reviewed By: Paul-C-Anagnostopoulos
Differential Revision: https://reviews.llvm.org/D101994
|
|
Switch some for loops to just use the begin()/end() implementations
in the InfoByHwMode struct.
Add a method to insert into the map for the one case that was
modifying the map directly.
|
|
This patch allows targets to define multiple cost
values for each register so that the cost model
can be more flexible and better used during the
register allocation as per the target requirements.
For AMDGPU the VGPR allocation will be more efficient
if the register cost can be associated dynamically
based on the calling convention.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D86836
|
|
|
|
Identified with readability-const-return-type.
|
|
|
|
Some of these were found by running clang-format over the generated
code, although that complains about far more issues than I have fixed
here.
Differential Revision: https://reviews.llvm.org/D90937
|
|
`getMatchingSubClassWithSubRegs`
Building LLVM with -DEXPENSIVE_CHECKS fails with the following error
message with libstdc++ in debug mode:
Error: comparison doesn't meet irreflexive requirements,
assert(!(a < a)).
The patch fixes the comparison function SizeOrder by returning false
when comparing two equal items.
|
|
This gives a nice error if you accidentally try to use an empty list for
the RegTypes of a RegisterClass.
Differential Revision: https://reviews.llvm.org/D78285
|
|
This patch rewrites the RegisterBankEmitter class to derive
RegisterClassHierarchy from CodeGenTarget::getRegBank() rather than
constructing our own copy. All are now accessed through a const
reference.
Differential Revision: https://reviews.llvm.org/D76006
|
|
Differential Revision: https://reviews.llvm.org/D74649
|
|
Differential Revision: https://reviews.llvm.org/D74744
|
|
Differential Revision: https://reviews.llvm.org/D74509
|
|
Tablegen's DAGISelMatcher emits integers in a VBR format,
so if an integer is below 128 it can fit into a single
byte, otherwise high bit is set, next byte is used etc.
MatcherTable is essentially an unsigned char table. When
SelectionDAGISel parses the table it does a reverse translation.
In a situation when numeric value of an integer to emit is
unknown it can be emitted not as OPC_EmitInteger but as
OPC_EmitStringInteger using a symbolic name of the value.
In this situation the value should not exceed 127.
One of the situations when OPC_EmitStringInteger is used is
if we need to emit a subreg into a matcher table. However,
number of subregs can exceed 127. Currently last defined subreg
for AMDGPU is 192. That results in a silent bug in the ISel
with matcher reading from an invalid offset.
Fixed this bug to emit actual VBR encoded value for a subregs
which value exceeds 127.
Differential Revision: https://reviews.llvm.org/D74368
|
|
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
|
|
|