Age | Commit message | Author | Files | Lines |
|
overridePostRASchedPolicy (#149297)
This patch updates `overrideSchedPolicy` and `overridePostRASchedPolicy` to take a
`SchedRegion` parameter instead of just `NumRegionInstrs`. This provides access to
both the instruction range and the parent `MachineBasicBlock`, which enables looking
up function-level attributes.
With this change, targets can select the post-RA scheduling direction per function
using a function attribute. For example:
```cpp
void overridePostRASchedPolicy(MachineSchedPolicy &Policy,
                               const SchedRegion &Region) const {
  const Function &F = Region.RegionBegin->getMF()->getFunction();
  Attribute Attr = F.getFnAttribute("amdgpu-post-ra-direction");
  ...
}
```
|
|
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
|
|
The right name was used in the riscv-toolchain-conventions docs.
|
|
Introduce the RISCVLoadStoreOptimizer MIR pass, which performs the optimization.
The load/store pairing pass identifies adjacent load/store instructions operating
on consecutive memory locations and merges them into a single paired instruction.
This is part of the MIPS extensions for the p8700 CPU.
Production of ldp/sdp instructions is off by default, since it is only beneficial
at -Os on the p8700 CPU.
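A minimal sketch of the kind of pairing check such a pass applies (the helper name,
operand layout, and legality details are assumptions for illustration, not the
patch's code): two memory operations are candidates when they share a base register
and their immediate offsets are exactly one access width apart.
```cpp
#include "llvm/CodeGen/MachineInstr.h"

// Illustrative only: decide whether two load/store MachineInstrs access
// consecutive memory locations off the same base register.  Assumes the
// usual RISC-V operand layout (value, base, immediate offset).
static bool isConsecutivePair(const llvm::MachineInstr &First,
                              const llvm::MachineInstr &Second,
                              unsigned AccessBytes) {
  if (First.getOperand(1).getReg() != Second.getOperand(1).getReg())
    return false;
  int64_t Off1 = First.getOperand(2).getImm();
  int64_t Off2 = Second.getOperand(2).getImm();
  return Off2 - Off1 == static_cast<int64_t>(AccessBytes);
}
```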
|
|
Adding two extensions for the MIPS p8700 CPU:
1. cmove (conditional move)
2. lsp (load/store pair)
The official product page is here:
https://mips.com/products/hardware/p8700
|
|
This patch adds basic support for `MachinePipeliner` and disables
it by default.
The functionality should be correct and all llvm-test-suite tests have
passed.
|
|
(#119968)
#119969 adds a couple of new methods to this class, which will need to
be overridden by these targets.
Part of #119709.
Pull Request: https://github.com/llvm/llvm-project/pull/119968
|
|
The results differ across platforms, so it is really hard to
determine a common default value.
Tune info for the post-RA scheduling direction is added, and CPUs can
set their own preferred post-RA scheduling direction.
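A minimal sketch of how such a per-CPU preference could be applied (the enum and
helper are assumptions for illustration; `MachineSchedPolicy`'s `OnlyTopDown` and
`OnlyBottomUp` flags are the existing scheduler knobs):
```cpp
#include "llvm/CodeGen/MachineScheduler.h"

// Hypothetical per-CPU tune value; in-tree this would come from tune info.
enum class PostRADirection { TopDown, BottomUp, Bidirectional };

// Illustrative only: translate the preference into the standard policy flags.
static void applyPostRADirection(llvm::MachineSchedPolicy &Policy,
                                 PostRADirection Pref) {
  Policy.OnlyTopDown = (Pref == PostRADirection::TopDown);
  Policy.OnlyBottomUp = (Pref == PostRADirection::BottomUp);
  // Bidirectional leaves both flags false.
}
```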
|
|
This tune feature disables the latency scheduling heuristic.
This can reduce the number of spills/reloads but will cause some
regressions on some cores.
CPUs may add this tune feature if they find it profitable.
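A short sketch of how such a tune bit could feed the policy (the boolean parameter
stands in for a hypothetical tune-feature accessor; `DisableLatencyHeuristic` is a
real `MachineSchedPolicy` field):
```cpp
#include "llvm/CodeGen/MachineScheduler.h"

// Illustrative only: apply a hypothetical "disable latency heuristic" tune bit
// to the policy the target hands back to the machine scheduler.
static void applyLatencyTune(llvm::MachineSchedPolicy &Policy,
                             bool HasDisableLatencyHeuristicTune) {
  if (HasDisableLatencyHeuristicTune)
    Policy.DisableLatencyHeuristic = true;
}
```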
Reviewers: lukel97, michaelmaitland, asb, preames, mshockwave, topperc
Reviewed By: michaelmaitland, mshockwave, topperc
Pull Request: https://github.com/llvm/llvm-project/pull/115858
|
|
We are using `PostMachineScheduler` instead of `PostRAScheduler`
since #68696.
The hook `getPostRAMutations` is only used in `PostRAScheduler`, so
it is effectively dead code for RISC-V now.
|
|
This is based on other targets like PPC/AArch64 and on some experiments.
This PR only enables bidirectional scheduling and register pressure
tracking.
Disclaimer: I haven't tested it on many cores; maybe we should turn some
of these options into features. I believe downstreams must have tried
this before, so feedback is welcome.
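A minimal sketch of what enabling those two knobs looks like (the helper is
illustrative; the `MachineSchedPolicy` fields are the standard ones):
```cpp
#include "llvm/CodeGen/MachineScheduler.h"

// Illustrative only: bidirectional pre-RA scheduling plus register pressure
// tracking, expressed through the standard MachineSchedPolicy knobs.
static void enableBidirectionalWithPressure(llvm::MachineSchedPolicy &Policy) {
  Policy.OnlyTopDown = false;  // neither direction is forced...
  Policy.OnlyBottomUp = false; // ...so the scheduler works both ways
  Policy.ShouldTrackPressure = true;
}
```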
|
|
Fixed length vectors use scalable vector containers. With Zve32* and not
Zvl64b, vscale is 0.5 due to RVVBitsPerBlock being 64.
To support this correctly we need to lower RVVBitsPerBlock to 32 and
change our type mapping. But we need RVVBitsPerBlock to always be
>= ELEN. This means we need two different mappings depending on ELEN.
That is a non-trivial amount of work, so disable fixed length vectors
without Zvl64b for now.
We had almost no tests for Zve32x without Zvl64b, which is probably why
we never realized that it was broken.
Fixes #102352.
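The arithmetic behind that 0.5, as a worked illustration (not code from the
patch):
```cpp
// With Zve32x and only Zvl32b guaranteed, the minimum VLEN is 32 bits while
// RVVBitsPerBlock is 64, so the minimum vscale would be 32.0 / 64 == 0.5,
// which is not an integral multiple and breaks the current type mapping.
constexpr unsigned RVVBitsPerBlock = 64;
constexpr unsigned MinVLen = 32; // the Zvl32b guarantee
constexpr double MinVScale = double(MinVLen) / RVVBitsPerBlock; // == 0.5
```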
|
|
NFC (#98375)
Instead of std::unique_ptr<RegisterBankInfo>. This allows us to return a
RISCVRegisterBankInfo* from getRegBankInfo so we can avoid a
static_cast.
This does require an additional header file to be included in
RISCVSubtarget.h, but I don't think it's a big deal.
|
|
Prior to this commit, we created the GlobalISel objects in the
RISCVSubtarget constructor, even if we are not running GlobalISel. This
patch moves creation of the GlobalISel objects into their getters, which
ensures that we only create these objects if they are actually needed.
This helps since some of the constructors of the GlobalISel objects have
a significant amount of code.
We make the `unique_ptr`s `mutable` since GlobalISel passes only have
access to `const TargetSubtargetInfo` through `MF.getSubtarget()`.
This patch is tested by the fact that all existing RISC-V GlobalISel
tests remain passing.
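A minimal sketch of the lazy-construction pattern described above (the class and
member names are illustrative, not the exact patch):
```cpp
#include <memory>

// Illustrative stand-in for an expensive-to-construct GlobalISel object.
struct CallLoweringSketch {};

class SubtargetSketch {
  // mutable: GlobalISel passes only see a const subtarget via MF.getSubtarget().
  mutable std::unique_ptr<CallLoweringSketch> CallLoweringInfo;

public:
  const CallLoweringSketch *getCallLowering() const {
    if (!CallLoweringInfo) // construct on first use only
      CallLoweringInfo = std::make_unique<CallLoweringSketch>();
    return CallLoweringInfo.get();
  }
};
```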
|
|
This removes the uses of target flags to disable subreg liveness,
relying on the `-enable-subreg-liveness` flag instead. The
`-enable-subreg-liveness` flag has been changed to take precedence over
the subtarget if set, and one use of `Subtarget->enableSubRegLiveness()`
has been changed to `MRI->subRegLivenessEnabled()` to make sure the
option properly applies.
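A sketch of the "explicit option wins over the subtarget" idiom described here
(the helper is illustrative; the flag name mirrors the message):
```cpp
#include "llvm/Support/CommandLine.h"

static llvm::cl::opt<bool>
    EnableSubRegLiveness("enable-subreg-liveness", llvm::cl::Hidden,
                         llvm::cl::init(true));

// Illustrative only: an explicit -enable-subreg-liveness=... on the command
// line takes precedence over the subtarget's default answer.
static bool subRegLivenessEnabled(bool SubtargetDefault) {
  if (EnableSubRegLiveness.getNumOccurrences() != 0)
    return EnableSubRegLiveness;
  return SubtargetDefault;
}
```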
|
|
We convert the existing macro fusions to TableGen.
Because `Fusion` depends on `Instruction` definitions, which are defined
below `RISCVFeatures.td`, we recommend adding fusion features when
defining a new processor.
|
|
This is like what AArch64 did in #71166, except that we don't
handle the `HasMinSize` case for now.
|
|
There is a lot of information that can be used for tuning, like
alignments, cache line size, etc. But we can't make all of it a
`SubtargetFeature` because some of it doesn't have an enumerable
value, for example, `PrefetchDistance` used by `LoopDataPrefetch`.
In this patch, a searchable table `RISCVTuneInfoTable` is added,
in which each entry contains the CPU name and all tune information
defined in `RISCVTuneInfo`. Each field of `RISCVTuneInfo` should
have a default value, and processor definitions can override the
default value via `let` statements.
We don't need to define a `RISCVTuneInfo` for each processor; the
default value (which is for `generic`) is used if no `RISCVTuneInfo`
is defined.
For processors in the same series, a subclass can inherit from
`RISCVTuneInfo` and override the fields. We can also override the
fields in processor definitions if there are some differences within
the same processor series.
When initializing `RISCVSubtarget`, we use `TuneCPU` as the key to
search the tune info table. So the behavior is: if we don't specify
the tune CPU, the specified `CPU` is used, which is what I would expect.
This patch almost undoes 61ab106, in which I added tune features
for preferred function/loop alignments. More tune information can
be added in the future.
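A minimal sketch of the lookup side, assuming the usual TableGen searchable-table
accessor (the accessor name and the fallback-to-generic step are assumptions):
```cpp
#include "llvm/ADT/StringRef.h"
// RISCVTuneInfo and the generated RISCVTuneInfoTable accessor would come from
// the target's headers and the TableGen-generated searchable-table include.

// Illustrative only: look up tune info by TuneCPU and fall back to the
// generic defaults when the CPU has no dedicated entry.
static const RISCVTuneInfo *lookupTuneInfo(llvm::StringRef TuneCPU) {
  if (const auto *Info = RISCVTuneInfoTable::getRISCVTuneInfo(TuneCPU))
    return Info;
  return RISCVTuneInfoTable::getRISCVTuneInfo("generic"); // assumed fallback
}
```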
|
|
The isRV64 field contains the same information, and we can derive XLen
from that.
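A one-line sketch of that derivation (the helper shape is assumed for
illustration):
```cpp
// Illustrative only: XLen follows directly from the RV64 flag.
static unsigned deriveXLen(bool IsRV64) { return IsRV64 ? 64 : 32; }
```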
Differential Revision: https://reviews.llvm.org/D159306
|
|
We're already tracking XLen, so we can compute XLenVT from that. Note that XLen itself should probably be driven by IsRV64 (the processor flag), but I'm leaving that to a separate change (with review).
|
|
In LLVM, alias analysis is off by default during code generation.
This patch enables alias analysis for the RISC-V target during code generation by default,
which creates more opportunities for improving performance.
Related test cases have been modified.
Differential Revision: https://reviews.llvm.org/D157250
|
|
This patch adds logic for determining RegisterBank size to RegisterBankInfo, which allows accounting for the HwMode of the target. Individual RegisterBanks cannot be constructed with HwMode information as construction is generated by TableGen, but a RegisterBankInfo subclass can provide the HwMode as a constructor argument. The HwMode is used to select the appropriate RegisterBank size from an array relating sizes to RegisterBanks.
Targets simply need to provide the HwMode argument to the <target>GenRegisterBankInfo constructor. The RISC-V RegisterBankInfo constructor has been updated accordingly (plus an unused argument removed).
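A self-contained sketch of the pattern described above (class names are
illustrative; it mirrors "provide the HwMode argument to the
<target>GenRegisterBankInfo constructor"):
```cpp
// Illustrative only: the generated base class keeps the HwMode index it will
// use to pick per-mode RegisterBank sizes; the target subclass forwards it.
struct GenRegisterBankInfoSketch {
  explicit GenRegisterBankInfoSketch(unsigned HwMode) : HwMode(HwMode) {}
  unsigned HwMode; // index into the size-per-HwMode arrays TableGen emits
};

struct TargetRegisterBankInfoSketch : GenRegisterBankInfoSketch {
  explicit TargetRegisterBankInfoSketch(unsigned HwMode)
      : GenRegisterBankInfoSketch(HwMode) {}
};
```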
Reviewed By: simoncook, craig.topper
Differential Revision: https://reviews.llvm.org/D76007
|
|
The ShadowCallStack implementation uses the s2 register on RISC-V, but that
choice is problematic for reasons described in:
https://lists.riscv.org/g/sig-toolchains/message/544,
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/370, and
https://github.com/google/android-riscv64/issues/72
The concern over the register choice was also brought up in
https://reviews.llvm.org/D84414.
https://reviews.llvm.org/D84414#2228666 said:
```
"If the register choice is the only concern about this work, then I think
we can probably land it as-is and fixup the register choice if we see
major drawbacks later. Yes, it's an ABI issue, but on the other hand the
shadow call stack is not a standard ABI anyway."
```
Since we have now found a sufficient reason to fix up the register
choice, we should go ahead and update the implementation. We propose
using x3 (gp), which is now the platform register in the RISC-V ABI.
Reviewed By: asb, hiraditya, mcgrathr, craig.topper
Differential Revision: https://reviews.llvm.org/D146463
|
|
To be consistent with RISC-V branding guidelines
https://riscv.org/about/risc-v-branding-guidelines/
I think we should be using RISC-V where possible.
More patches will follow.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D146449
|
|
This commit enables subregister liveness by default on RISC-V.
It was previously disabled in https://reviews.llvm.org/D129646 after an earlier attempt to enable it in https://reviews.llvm.org/D128016.
We believe that https://reviews.llvm.org/D129735 fixes the issue that caused it to be disabled.
Reviewed By: craig.topper, kito-cheng
Differential Revision: https://reviews.llvm.org/D145546
|
|
This patch replaces isPowerOf2_32 with llvm::has_single_bit<uint32_t>
where the argument is wider than uint32_t.
|
|
Fuchsia's ABI always reserves the x18 (s2) register for the
ShadowCallStack ABI, even when -fsanitize=shadow-call-stack is
not enabled.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D143355
|
|
|
|
|
|
Split from D139873.
Reviewed By: reames, kito-cheng
Differential Revision: https://reviews.llvm.org/D140283
|
|
|
|
This change enables the use of RISCV's variable length vector registers for fixed length vectors in the IR, and implicitly enables various IR transforms which generate fixed length vectors if legal (e.g. LoopVectorize). Specifically, this enables fixed length vectors which are known to be inbounds of the underlying variable hardware size.
For context, remember that the +V extension provides a minimum VLEN of 128. The embedded variants provide lower minimums. The analogy here is essentially vectorizing for SSE on a machine which may or may not include AVX2/AVX512. We won't get full utilization by default, but we will get some benefit. And of course, with an explicit mcpu we can vectorize to the exact target hardware.
The LV impact is mostly related to vectorizer robustness. In cases where we haven't yet fully implemented scalable vectorization support, we can fall back to fixed length vectorization.
SLP has been disabled for now, even when fixed vectors are enabled. See a310637 and associated review. There are a few additional code quality issues which need to be worked through before turning SLP on would be reasonable.
Differential Revision: https://reviews.llvm.org/D131508
|
|
Saves a heap allocation and avoids an explicit call to the BitVector constructor.
Reviewed By: reames, myhsu
Differential Revision: https://reviews.llvm.org/D132674
|
|
We previously enabled subregister liveness by default when compiling
with RVV. This has been shown to cause miscompilations where RVV
register operand constraints are not met. A test was added for this in
D129639 which explains the issue in more detail.
Until this issue is fixed in some way, we should not be enabling
subregister liveness unless the user asks for it.
Reviewed By: craig.topper, rogfer01, kito-cheng
Differential Revision: https://reviews.llvm.org/D129646
|
|
This adds the macrofusion plumbing and support for fusing LUI+ADDI(W).
This is similar to D73643 but handles a different case. Other cases
can be added in the future.
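A sketch of what such a fusion predicate typically checks (the helper and its
exact operand checks are illustrative assumptions, not the patch's code):
```cpp
#include "llvm/CodeGen/MachineInstr.h"
// RISCV::LUI / RISCV::ADDI / RISCV::ADDIW come from the target's generated
// opcode enums.

// Illustrative only: fuse LUI rd, imm with an immediately following ADDI/ADDIW
// that reads rd and writes the same register.
static bool isLUIThenADDI(const llvm::MachineInstr *FirstMI,
                          const llvm::MachineInstr &SecondMI) {
  using namespace llvm;
  if (SecondMI.getOpcode() != RISCV::ADDI &&
      SecondMI.getOpcode() != RISCV::ADDIW)
    return false;
  if (!FirstMI) // Only the second instruction of the pair is known yet.
    return true;
  return FirstMI->getOpcode() == RISCV::LUI &&
         FirstMI->getOperand(0).getReg() == SecondMI.getOperand(1).getReg() &&
         SecondMI.getOperand(0).getReg() == SecondMI.getOperand(1).getReg();
}
```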
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D128393
|
|
The failure that caused the previous revert has been fixed
by https://reviews.llvm.org/D126048
Original commit message:
RVV makes heavy use of subregisters due to LMUL>1 and segment
load/store tuples. Enabling subregister liveness tracking improves the quality
of the register allocation.
I've added a command line that can be used to turn it off if it causes compile
time or functional issues. I used the command line to keep the old behavior
for one interesting test case that was testing register allocation.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D128016
|
|
This reverts most of ed242b54c9c2aa84a47f66af5b8497d93646b68d.
I'm seeing failures in our intrinsic testing on qemu that seem
related to this. Reverting while I investigate.
I've left the command line option in place for directed testing.
It defaults to off.
|
|
RVV makes heavy use of subregisters due to LMUL>1 and segment
load/store tuples. Enabling subregister liveness tracking improves the quality
of the register allocation.
I've added a command line that can be used to turn it off if it causes compile
time or functional issues. I used the command line to keep the old behavior
for one interesting test case that was testing register allocation.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125108
|
|
use Zvl*b value.
riscv-v-vector-bits-min is primarily used to opt in to the
autovectorizer. The vector width can be determined from Zvl*b.
This patch adds support for treating -1 as meaning "use Zvl*b", so we
can still opt in to autovectorization without needing to repeat a
vector width already given by Zvl*b or -mcpu.
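A minimal sketch of the -1 handling (the helper and parameter names are
illustrative; the real logic lives in the subtarget's minimum-VLEN query):
```cpp
// Illustrative only: -1 means "derive the minimum from the Zvl*b guarantee".
static unsigned resolveMinRVVVectorBits(int OptionValue, unsigned ZvlLen) {
  if (OptionValue == -1)
    return ZvlLen; // use the Zvl*b-implied minimum
  return static_cast<unsigned>(OptionValue);
}
```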
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D124960
|
|
This was added before the Zve extensions were defined. I think users
should use Zve32x or Zve32f now. We will lose support for limiting
ELEN to 16 or 8, but I hope no one was using that.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D123418
|
|
Most other targets support 'generic', but RISC-V issues an error.
This can require a special case in tools that use LLVM but aren't
clang.
This patch treats "generic" the same as an empty string and remaps
it to generic-rv32/generic-rv64 based on the triple. Unfortunately,
it has to be added to RISCV.td because MCSubtargetInfo is constructed
and parses the CPU before RISCVSubtarget's constructor gets a chance
to remap it. The CPU will then be reparsed and the state in the
MCSubtargetInfo subclass will be updated again.
Fixes PR54146.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D121149
|
|
In the Zve* extensions, VLEN could be 64. This patch changes the lower bound of the VLEN constraint to 64.
Differential Revision: https://reviews.llvm.org/D118217
|
|
RISCVSubtarget::getMaxELENForFixedLengthVectors.
This is needed to properly limit fractional LMULs for Zve32.
Add new Zve32 RUN lines to the existing tests for the
-riscv-v-fixed-length-vector-elen-max command line option.
|
|
`zvl` is the new standard vector extension that specifies the minimum vector length of the vector extension.
The `zvl` extension is related to the `zve` extension and other updates that were added in v1.0.
According to https://github.com/riscv-non-isa/riscv-c-api-doc/pull/21,
Clang defines the macro `__riscv_v_min_vlen` for `zvl`, and it can be used by applications that use the vector extension.
LLVM checks whether the option `riscv-v-vector-bits-min` (if specified) matches the `zvl*` extension specified.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D108694
|
|
For large integers (for example, magic numbers generated by
TargetLowering::BuildSDIV when dividing by a constant), we may
need about 4~8 instructions to build them.
At the same time, it takes just two instructions to load a
constant (with extra cycles to access memory), so it may be
profitable to put these integers into the constant pool.
Reviewed By: asb, craig.topper
Differential Revision: https://reviews.llvm.org/D114950
|
|
Add a new hasVInstructions() which is currently equivalent.
Replace vector uses of hasStdExtZfh/F/D with new vector-specific
versions. The vector spec no longer requires that the vectors implement the
same types as scalar; it only requires that the scalar type is
the maximum size the vectors can support. This is currently
implemented using the scalar rule we were using before.
Add a new hasVInstructionsI64() and begin using it to qualify code that
requires i64 vector elements.
This is all NFC for now, but we can start using this to better
implement D112408, which introduces the Zve extensions.
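A sketch of the predicate shape (the bodies and member names are assumptions; the
message only says the new predicates are currently equivalent to the existing
scalar-based checks):
```cpp
// Illustrative only: vector-specific predicates layered over existing
// subtarget feature bits.
struct SubtargetPredicatesSketch {
  bool HasStdExtV = false;
  bool HasStdExtZfh = false;
  bool hasVInstructions() const { return HasStdExtV; }
  bool hasVInstructionsI64() const { return HasStdExtV; } // i64 elements
  bool hasVInstructionsF16() const { return HasStdExtV && HasStdExtZfh; }
};
```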
Reviewed By: frasercrmck, eopXD
Differential Revision: https://reviews.llvm.org/D112496
|
|
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
|
|
fixed length vectorization.
This adds an ELEN limit for fixed length vectors. This will scalarize
any elements larger than the limit. It will also disable some fractional
LMULs. For example, if ELEN=32 then mf8 becomes illegal, i32/f32
vectors can't use any fractional LMULs, i16/f16 can only use mf2,
and i8 can use mf2 and mf4.
We may also need something for the scalable vectors, but that has
interactions with the intrinsics and we can't scalarize a scalable
vector.
Longer term this should come from one of the Zve* features.
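A worked sketch of the fractional-LMUL rule implied above (illustrative helper,
not the patch's code): an element width SEW can use LMUL = 1/N only when
SEW <= ELEN / N.
```cpp
// Illustrative only: with ELEN=32 this reproduces the limits listed above --
// SEW=32 gets no fractional LMUL, SEW=16 gets mf2 only, SEW=8 gets mf2 and mf4.
static bool fractionalLMULIsLegal(unsigned SEW, unsigned ELEN, unsigned N) {
  return SEW <= ELEN / N;
}
```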
|
|
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D106134
|
|
Make it a static function in RISCVISelLowering, the only place it
is used.
I think I'm going to make this return fractional LMULs in some
cases, so I'm sorting out where it should live before I start
making changes.
|