Age | Commit message (Collapse) | Author | Files | Lines |
|
FieldName.
The word "attribute" has a specific meaning in LLVM. Avoid using it
here to mean something different.
This addresses feedback from D153180.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D153444
|
|
Re-landing the code that was reverted because of the buildbot failure
in https://lab.llvm.org/buildbot#builders/9/builds/27319.
Original commit message
======================
The class `ResourceSegments` is used to keep track of the intervals
that represent resource usage of a list of instructions that are
being scheduled by the machine scheduler.
The collection is made of intervals that are closed on the left and
open on the right (represented by the standard notation `[a, b)`).
These collections of intervals can be extended by `add`ing new
intervals accordingly while scheduling a basic block.
Unit tests are added to verify the possible configurations of
intervals, and the relative possibility of scheduling a new
instruction in these configurations. Specifically, the methods
`getFirstAvailableAtFromBottom` and `getFirstAvailableAtFromTop` are
tested to make sure that both bottom-up and top-down scheduling work
when tracking resource usage across the basic block with
`ResourceSegments`.
Note that the scheduler tracks resource usage with two methods:
1. counters (via `std::vector<unsigned> ReservedCycles;`);
2. intervals (via `std::map<unsigned, ResourceSegments> ReservedResourceSegments;`).
This patch can be considered a NFC test for existing scheduling models
because the tracking system that uses intervals is turned off by
default (field `bit EnableIntervals = false;` in the tablegen class
`SchedMachineModel`).
Reviewed By: andreadb
Differential Revision: https://reviews.llvm.org/D150312
|
|
Reverted because it produces the following builbot failure at https://lab.llvm.org/buildbot#builders/9/builds/27319:
/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/unittests/CodeGen/SchedBoundary.cpp: In member function ‘virtual void ResourceSegments_getFirstAvailableAtFromBottom_empty_Test::TestBody()’:
/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/unittests/CodeGen/SchedBoundary.cpp:395:31: error: call of overloaded ‘ResourceSegments(<brace-enclosed initializer list>)’ is ambiguous
395 | auto X = ResourceSegments({});
| ^
This reverts commit dc312f0331309692e8d6e06e93b3492b6a40989f.
|
|
The class `ResourceSegments` is used to keep track of the intervals
that represent resource usage of a list of instructions that are
being scheduled by the machine scheduler.
The collection is made of intervals that are closed on the left and
open on the right (represented by the standard notation `[a, b)`).
These collections of intervals can be extended by `add`ing new
intervals accordingly while scheduling a basic block.
Unit tests are added to verify the possible configurations of
intervals, and the relative possibility of scheduling a new
instruction in these configurations. Specifically, the methods
`getFirstAvailableAtFromBottom` and `getFirstAvailableAtFromTop` are
tested to make sure that both bottom-up and top-down scheduling work
when tracking resource usage across the basic block with
`ResourceSegments`.
Note that the scheduler tracks resource usage with two methods:
1. counters (via `std::vector<unsigned> ReservedCycles;`);
2. intervals (via `std::map<unsigned, ResourceSegments> ReservedResourceSegments;`).
This patch can be considered a NFC test for existing scheduling models
because the tracking system that uses intervals is turned off by
default (field `bit EnableIntervals = false;` in the tablegen class
`SchedMachineModel`).
Reviewed By: andreadb
Differential Revision: https://reviews.llvm.org/D150312
|
|
Conditions that need to be met:
1. count(StartAtCycle) == count(ReservedCycles);
2. For each i: StartAtCycles[i] < ReservedCycles[i];
3. For each i: StartAtCycles[i] >= 0;
4. If left unspecified, the elements are set to 0.
Differential Revision: https://reviews.llvm.org/D150310
|
|
Each emitter became self-contained since it has the registration of option.
Differential Revision: https://reviews.llvm.org/D144351
|
|
Differential Revision: https://reviews.llvm.org/D144351
|
|
"TableGenBackends.h" has declarations of emitters.
|
|
|
|
|
|
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef have been removed in the process.
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
|
|
I've verified that every change in this patch affects generated files
and would reduce the number of warnings if None were deprecated.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
|
|
|
|
|
Reland of D120906 after sanitizer failures.
This patch aims to reduce a lot of the boilerplate around adding new subtarget
features. From the SubtargetFeatures tablegen definitions, a series of calls to
the macro GET_SUBTARGETINFO_MACRO are generated in
ARM/AArch64GenSubtargetInfo.inc. ARMSubtarget/AArch64Subtarget can then use
this macro to define bool members and the corresponding getter methods.
Some naming inconsistencies have been fixed to allow this, and one unused
member removed.
This implementation only applies to boolean members; in future both BitVector
and enum members could also be generated.
Differential Revision: https://reviews.llvm.org/D120906
|
|
This reverts commit dd8b0fecb95df7689aac26c2ef9ebd1f527f9f46.
|
|
This patch aims to reduce a lot of the boilerplate around adding new subtarget
features. From the SubtargetFeatures tablegen definitions, a series of calls to
the macro GET_SUBTARGETINFO_MACRO are generated in
ARM/AArch64GenSubtargetInfo.inc. ARMSubtarget/AArch64Subtarget can then use
this macro to define bool members and the corresponding getter methods.
Some naming inconsistencies have been fixed to allow this, and one unused
member removed.
This implementation only applies to boolean members; in future both BitVector
and enum members could also be generated.
Differential Revision: https://reviews.llvm.org/D120906
|
|
|
|
Since 65b13610a5226b84889b923bae884ba395ad084d, raw_string_ostream has
been unbuffered by default. Based on an audit of llvm/utils/, this
commit removes every call to `raw_string_ostream::flush()` and any call
to `raw_string_ostream::str()` whose result is ignored or that doesn't
help with clarity.
I left behind a few calls to `str()`. In these cases, the underlying
std::string was declared pretty far away and never used again, whereas
stream recently had its last write. The code is easier to read as-is;
the no-op call to `flush()` inside `str()` isn't harmful, and when
https://reviews.llvm.org/D115421 lands it'll be gone anyway.
|
|
This patch adds a pipeline to support in-order CPUs such as ARM
Cortex-A55.
In-order pipeline implements a simplified version of Dispatch,
Scheduler and Execute stages as a single stage. Entry and Retire
stages are common for both in-order and out-of-order pipelines.
Differential Revision: https://reviews.llvm.org/D94928
|
|
|
|
|
|
|
|
|
|
Differential revision: https://reviews.llvm.org/D92304
|
|
|
|
Patch fixes scheduling of ALU instructions which modify pc register. Patch
also fixes computation of mutually exclusive predicates for sequences of
variants to be properly expanded
Differential revision: https://reviews.llvm.org/D91266
|
|
Some of these were found by running clang-format over the generated
code, although that complains about far more issues than I have fixed
here.
Differential Revision: https://reviews.llvm.org/D90937
|
|
Differential revision: https://reviews.llvm.org/D89553
|
|
Let user able to know which -tune-cpu are used now.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D88951
|
|
And internalize some classes if I noticed them:)
|
|
This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line.
This patch adds MC layer support a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables . These features lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all target as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned.
One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU.
I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning.
Differential Revision: https://reviews.llvm.org/D85165
|
|
Summary:
The type used to represent functional units in MC is
'unsigned', which is 32 bits wide. This is currently
not a problem in any upstream target as no one seems
to have hit the limit on this yet, but in our
downstream one, we need to define more than 32
functional units.
Increasing the size does not seem to cause a huge
size increase in the binary (an llc debug build went
from 1366497672 to 1366523984, a difference of 26k),
so perhaps it would be acceptable to have this patch
applied upstream as well.
Subscribers: hiraditya, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71210
|
|
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
|
|
SchedModel
Assume that, ModelA has scheduling resource for InstA and ModelB has scheduling resource for InstB. This is what the llvm::MCSchedClassDesc looks like:
llvm::MCSchedClassDesc ModelASchedClasses[] = {
...
InstA, 0, ...
InstB, -1,...
};
llvm::MCSchedClassDesc ModelBSchedClasses[] = {
...
InstA, -1,...
InstB, 0,...
};
The -1 means invalid num of macro ops, while it is valid if it is >=0. This is what we look like now:
llvm::MCSchedClassDesc ModelASchedClasses[] = {
...
InstA, 0, ...
InstB, 0,...
};
llvm::MCSchedClassDesc ModelBSchedClasses[] = {
...
InstA, 0,...
InstB, 0,...
};
And compiler hit the assertion here because the SCDesc is valid now for both InstA and InstB.
Differential Revision: https://reviews.llvm.org/D67950
llvm-svn: 374524
|
|
Much like ValueTypeByHwMode/RegInfoByHwMode, this patch allows targets
to modify an instruction's encoding based on HwMode. When the
EncodingInfos field is non-empty the Inst and Size fields of the Instruction
are ignored and taken from EncodingInfos instead.
As part of this promote getHwMode() from TargetSubtargetInfo to MCSubtargetInfo.
This is NFC for all existing targets - new code is generated only if targets
use EncodingByHwMode.
llvm-svn: 372320
|
|
Summary:
Update end-of-namespace comments generated by
tablegen emitters to fulfill the rules setup by
clang-tidy's llvm-namespace-comment checker.
Fixed a few end-of-namespace comments in the
tablegen source code as well.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: craig.topper, stoklund, dschuff, sbc100, jgravelle-google, aheejin, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66396
llvm-svn: 369865
|
|
Summary:
It does not currently make sense to use WebAssembly features in some functions
but not others, so this CL adds an IR pass that takes the union of all used
feature sets and applies it to each function in the module. This allows us to
prevent atomics from being lowered away if some function has opted in to using
them. When atomics is not enabled anywhere, we detect whether there exists any
atomic operations or thread local storage that would be stripped and disallow
linking with objects that contain atomics if and only if atomics or tls are
stripped. When atomics is enabled, mark it as used but do not require it of
other objects in the link. These changes allow libraries that do not use atomics
to be built once and linked into both single-threaded and multithreaded
binaries.
Reviewers: aheejin, sbc100, dschuff
Subscribers: jgravelle-google, hiraditya, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59625
llvm-svn: 357226
|
|
single array.
These arrays are both keyed by CPU name and go into the same tablegenerated file. Merge them so we only need to store keys once.
This also removes a weird space saving quirk where we used the ProcDesc.size() to create to build an ArrayRef for ProcSched.
Differential Revision: https://reviews.llvm.org/D58939
llvm-svn: 355431
|
|
remove fields from the stack tables that aren't needed for CPUs
The description for CPUs was just the CPU name wrapped with "Select the " and " processor". We can just do that directly in the help printer instead of making a separate version in the binary for each CPU.
Also remove the Value field that isn't needed and was always 0.
Differential Revision: https://reviews.llvm.org/D58938
llvm-svn: 355429
|
|
FeatureBitArray initialization to satisfy older versions of clang.
Apparently older versions of clang like 3.6 require an extra set of curly braces around std::array initializations. I'm told the C++ language was changed regarding this by CWG 1270.
llvm-svn: 355327
|
|
subtarget feature tables
Subtarget features are stored in a std::bitset that has been subclassed. There is a special constructor to allow the tablegen files to provide a list of bits to initialize the std::bitset to. This constructor isn't constexpr and std::bitset doesn't support many constexpr operations either. This results in a static global constructor being used to initialize the feature bitsets in these files at startup.
To fix this I've introduced a new FeatureBitArray class that holds three 64-bit values representing the initial bit values and taught tablegen to emit hex constants for them based on the feature enum values. This makes the tablegen files less readable than they were before. I can add the list of features back as a comment if we think that's important.
I've added a method to convert from this class into the std::bitset subclass we had before. I considered making the new FeatureBitArray class just implement the std::bitset interface we need instead, but thought I'd see how others felts about that first.
I've simplified the interfaces to SetImpliedBits and ClearImpliedBits a little minimize the number of times we need to convert to the bitset.
This removes about 27K from my local release+asserts build of llc.
Differential Revision: https://reviews.llvm.org/D58520
llvm-svn: 355167
|
|
'unsigned' to hold the value.
This class is used for two difference tablegen generated tables. For one of the tables the Value FeatureBitset only has one bit set. For the other usage the Implies field was unused.
This patch changes the Value field to just be an unsigned. For the usage that put a real vector in bitset, we now use the previously unused Implies field and leave the Value field unused instead.
This is good for a 16K reduction in the size of llc on my local build with all targets enabled.
llvm-svn: 354243
|
|
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
|
|
load/store queues (PR36666).
This patch adds the ability to specify via tablegen which processor resources
are load/store queue resources.
A new tablegen class named MemoryQueue can be optionally used to mark resources
that model load/store queues. Information about the load/store queue is
collected at 'CodeGenSchedule' stage, and analyzed by the 'SubtargetEmitter' to
initialize two new fields in struct MCExtraProcessorInfo named `LoadQueueID` and
`StoreQueueID`. Those two fields are identifiers for buffered resources used to
describe the load queue and the store queue.
Field `BufferSize` is interpreted as the number of entries in the queue, while
the number of units is a throughput indicator (i.e. number of available pickers
for loads/stores).
At construction time, LSUnit in llvm-mca checks for the presence of extra
processor information (i.e. MCExtraProcessorInfo) in the scheduling model. If
that information is available, and fields LoadQueueID and StoreQueueID are set
to a value different than zero (i.e. the invalid processor resource index), then
LSUnit initializes its LoadQueue/StoreQueue based on the BufferSize value
declared by the two processor resources.
With this patch, we more accurately track dynamic dispatch stalls caused by the
lack of LS tokens (i.e. load/store queue full). This is also shown by the
differences in two BdVer2 tests. Stalls that were previously classified as
generic SCHEDULER FULL stalls, are not correctly classified either as "load
queue full" or "store queue full".
About the differences in the -scheduler-stats view: those differences are
expected, because entries in the load/store queue are not released at
instruction issue stage. Instead, those are released at instruction executed
stage. This is the main reason why for the modified tests, the load/store
queues gets full before PdEx is full.
Differential Revision: https://reviews.llvm.org/D54957
llvm-svn: 347857
|
|
`llvm-mca` relies on the predicates to be based on `MCSchedPredicate` in order
to resolve the scheduling for variant instructions. Otherwise, it aborts
the building of the instruction model early.
However, the scheduling model emitter in `TableGen` gives up too soon, unless
all processors use only such predicates.
In order to allow more processors to be used with `llvm-mca`, this patch
emits scheduling transitions if any processor uses these predicates. The
transition emitted for the processors using legacy predicates is the one
specified with `NoSchedPred`, which is based on `MCSchedPredicate`.
Preferably, `llvm-mca` should instead assume a reasonable default when a
variant transition is not based on `MCSchedPredicate` for a given processor.
This issue should be revisited in the future.
Differential revision: https://reviews.llvm.org/D54648
llvm-svn: 347504
|
|
Summary:
The pfm counters are now in the ExegesisTarget rather than the
MCSchedModel (PR39165).
This also compresses the pfm counter tables (PR37068).
Reviewers: RKSimon, gchatelet
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D52932
llvm-svn: 345243
|
|
via tablegen.
This patch adds the ability to identify instructions that are "move elimination
candidates". It also allows scheduling models to describe processor register
files that allow move elimination.
A move elimination candidate is an instruction that can be eliminated at
register renaming stage.
Each subtarget can specify which instructions are move elimination candidates
with the help of tablegen class "IsOptimizableRegisterMove" (see
llvm/Target/TargetInstrPredicate.td).
For example, on X86, BtVer2 allows both GPR and MMX/SSE moves to be eliminated.
The definition of 'IsOptimizableRegisterMove' for BtVer2 looks like this:
```
def : IsOptimizableRegisterMove<[
InstructionEquivalenceClass<[
// GPR variants.
MOV32rr, MOV64rr,
// MMX variants.
MMX_MOVQ64rr,
// SSE variants.
MOVAPSrr, MOVUPSrr,
MOVAPDrr, MOVUPDrr,
MOVDQArr, MOVDQUrr,
// AVX variants.
VMOVAPSrr, VMOVUPSrr,
VMOVAPDrr, VMOVUPDrr,
VMOVDQArr, VMOVDQUrr
], CheckNot<CheckSameRegOperand<0, 1>> >
]>;
```
Definitions of IsOptimizableRegisterMove from processor models of a same
Target are processed by the SubtargetEmitter to auto-generate a target-specific
override for each of the following predicate methods:
```
bool TargetSubtargetInfo::isOptimizableRegisterMove(const MachineInstr *MI)
const;
bool MCInstrAnalysis::isOptimizableRegisterMove(const MCInst &MI, unsigned
CPUID) const;
```
By default, those methods return false (i.e. conservatively assume that there
are no move elimination candidates).
Tablegen class RegisterFile has been extended with the following information:
- The set of register classes that allow move elimination.
- Maxium number of moves that can be eliminated every cycle.
- Whether move elimination is restricted to moves from registers that are
known to be zero.
This patch is structured in three part:
A first part (which is mostly boilerplate) adds the new
'isOptimizableRegisterMove' target hooks, and extends existing register file
descriptors in MC by introducing new fields to describe properties related to
move elimination.
A second part, uses the new tablegen constructs to describe move elimination in
the BtVer2 scheduling model.
A third part, teaches llm-mca how to query the new 'isOptimizableRegisterMove'
hook to mark instructions that are candidates for move elimination. It also
teaches class RegisterFile how to describe constraints on move elimination at
PRF granularity.
llvm-mca tests for btver2 show differences before/after this patch.
Differential Revision: https://reviews.llvm.org/D53134
llvm-svn: 344334
|
|
Summary: The convenience wrapper in STLExtras is available since rL342102.
Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb
Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits
Differential Revision: https://reviews.llvm.org/D52573
llvm-svn: 343163
|
|
Summary:
Example output for vzeroall:
---
mode: uops
key:
instructions:
- 'VZEROALL'
config: ''
register_initial_values:
cpu_name: haswell
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
- { debug_string: HWPort0, value: 0.0006, per_snippet_value: 0.0006,
key: '3' }
- { debug_string: HWPort1, value: 0.0011, per_snippet_value: 0.0011,
key: '4' }
- { debug_string: HWPort2, value: 0.0004, per_snippet_value: 0.0004,
key: '5' }
- { debug_string: HWPort3, value: 0.0018, per_snippet_value: 0.0018,
key: '6' }
- { debug_string: HWPort4, value: 0.0002, per_snippet_value: 0.0002,
key: '7' }
- { debug_string: HWPort5, value: 1.0019, per_snippet_value: 1.0019,
key: '8' }
- { debug_string: HWPort6, value: 1.0033, per_snippet_value: 1.0033,
key: '9' }
- { debug_string: HWPort7, value: 0.0001, per_snippet_value: 0.0001,
key: '10' }
- { debug_string: NumMicroOps, value: 20.0069, per_snippet_value: 20.0069,
key: NumMicroOps }
error: ''
info: ''
assembled_snippet: C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C3
...
Reviewers: gchatelet
Subscribers: tschuett, RKSimon, andreadb, llvm-commits
Differential Revision: https://reviews.llvm.org/D52539
llvm-svn: 343094
|