Age | Commit message | Author | Files | Lines |
|
This is a follow-up to #145643. See
https://github.com/llvm/llvm-project/pull/145643#issuecomment-3009300419.
|
|
|
|
C1 (#160163)
We can rewrite this to (srai(w)/srli X, C1) == C2 so the AND immediate
is free. This transform is done by performSETCCCombine in
RISCVISelLowering.cpp.
This fixes the opaque constant case mentioned in #157416.
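
The equivalence behind this rewrite can be checked in plain C++. This is a minimal sketch, not the code in performSETCCCombine: the commit title is truncated in this log, so it assumes the mask has the form `-1 << C1` and models only the logical-shift (`srli`) variant. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Original form: clear the low C1 bits of X, then compare against C2 << C1.
// Assumption: the AND mask is ~0 << C1 (the exact pattern is not shown in the log).
bool maskedCompare(uint64_t X, unsigned C1, uint64_t C2) {
  assert(C1 < 64 && "shift amount must be in range");
  assert(((C2 << C1) >> C1) == C2 && "C2 << C1 must not lose bits");
  uint64_t Mask = ~uint64_t{0} << C1;
  return (X & Mask) == (C2 << C1);
}

// Rewritten form: shift the low C1 bits out (srli), then compare against C2.
// No AND mask has to be materialized.
bool shiftedCompare(uint64_t X, unsigned C1, uint64_t C2) {
  assert(C1 < 64 && "shift amount must be in range");
  return (X >> C1) == C2;
}
```

Both functions agree whenever `C2 << C1` does not overflow, because `X & (~0 << C1)` equals `(X >> C1) << C1`.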
|
|
This reverts commit aa08b1a9963f33ded658d3ee655429e1121b5212.
|
|
This patch is based on https://github.com/llvm/llvm-project/pull/159713
This patch extends AddressSanitizer to support indexed/segment
instructions in RVV. It enables proper instrumentation for these memory
operations.
A new member, `MaybeOffset`, is added to `InterestingMemoryOperand` to
describe the offset between the base pointer and the actual memory
reference address.
Co-authored-by: Yeting Kuo <yeting.kuo@sifive.com>
|
|
permutation instructions (#160763)
In newer SiFive7 cores like X390, permutation instructions like
vrgather.vv that operate on an LMUL smaller than a single DLEN can take a
constant number of cycles. For slightly larger data that fits in the constraint of
`log2(SEW/8) + log2(LMUL) <= log2(DLEN / 32)`, these instructions can
also take a number of cycles proportional to the square of LMUL, rather
than being proportional to VL.
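
The constraint above can be evaluated as follows. This is a minimal sketch of the inequality only, not the scheduling model itself; the helper names are hypothetical, and LMUL is passed as its log2 so that fractional LMULs (mf2 = -1, mf4 = -2) can be expressed.

```cpp
#include <cassert>

// Returns log2(V) for a power-of-two V.
int log2PowerOf2(unsigned V) {
  assert(V != 0 && (V & (V - 1)) == 0 && "expected a power of two");
  int Log = 0;
  while (V > 1) {
    V >>= 1;
    ++Log;
  }
  return Log;
}

// Checks log2(SEW/8) + log2(LMUL) <= log2(DLEN/32).
// SEW and DLEN are in bits; Log2LMUL is log2 of the LMUL setting.
bool fitsPermutationConstraint(unsigned SEW, int Log2LMUL, unsigned DLEN) {
  return log2PowerOf2(SEW / 8) + Log2LMUL <= log2PowerOf2(DLEN / 32);
}
```

For example, with a hypothetical DLEN of 256, SEW=32 at LMUL=2 satisfies the constraint (2 + 1 <= 3), while SEW=64 at LMUL=8 does not (3 + 3 > 3).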
Co-authored-by: Michael Maitland <michaeltmaitland@gmail.com>
|
|
Split out from #151300 to isolate TargetTransformInfo cost modelling for
fault-only-first loads from VPlan implementation details. This change
adds costing support for vp.load.ff independently of the VPlan work.
For now, model a vp.load.ff as cost-equivalent to a vp.load.
|
|
`-riscv-fp-imm-cost` controls the threshold at which the constant pool
is used for float constants rather than generating directly (typically
into a GPR followed by an `fmv`). The value used for this knob indicates
the number of instructions that can be used to produce the value
(otherwise we fall back to the constant pool). Upping it to 3 covers a
huge number of additional constants (see
<https://github.com/llvm/llvm-project/issues/153402>), e.g. most whole
numbers which can be generated through lui+shift+fmv. As in general we
struggle with efficient code generation for constant pool accesses,
reducing the number of constant pool accesses is beneficial. We are
typically replacing a two-instruction sequence (which includes a load)
with a three-instruction sequence (two simple arithmetic operations plus
an fmv).
The CHECK prefixes for various tests had to be updated to avoid
conflicts leading to check lines being dropped altogether (see
<https://github.com/llvm/llvm-project/pull/159321> for a change to
update_llc_test_checks to aid diagnosing this).
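
To illustrate why whole numbers fit the lui+shift+fmv shape: their IEEE 754 double bit patterns have only a few significant bits at the top. This is a minimal sketch that emulates the three instructions in C++; the helper names are hypothetical and it is not how the backend selects the sequence.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Emulates RV64 `lui`: places a 20-bit immediate in bits 31:12 and
// sign-extends the 32-bit result to 64 bits.
uint64_t emulateLui(uint32_t Imm20) {
  assert(Imm20 < (1u << 20) && "lui takes a 20-bit immediate");
  uint32_t Low32 = Imm20 << 12;
  uint64_t Value = Low32;
  if (Low32 & 0x80000000u)
    Value |= 0xFFFFFFFF00000000ull;
  return Value;
}

// Emulates `slli`: logical left shift by a constant amount.
uint64_t emulateSlli(uint64_t Value, unsigned Shamt) {
  assert(Shamt < 64 && "shift amount must be in range");
  return Value << Shamt;
}

// Emulates `fmv.d.x`: reinterprets the integer bits as a double.
double emulateFmvDX(uint64_t Bits) {
  static_assert(sizeof(double) == sizeof(uint64_t), "expected 64-bit double");
  double Result;
  std::memcpy(&Result, &Bits, sizeof(Result));
  return Result;
}

// The full three-instruction sequence: lui, slli, fmv.d.x.
double materialize(uint32_t Imm20, unsigned Shamt) {
  return emulateFmvDX(emulateSlli(emulateLui(Imm20), Shamt));
}
```

For instance, `materialize(0x3FF00, 32)` builds the bit pattern `0x3FF0000000000000`, which is 1.0.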
|
|
EVec and ContainerVT (#159373)
Fixes https://github.com/llvm/llvm-project/issues/159294
The element type of EVecContainerVT and ContainerVT can be different
after promoting integer types.
This patch disables the slideup optimization in that case.
|
|
(#160105)
When we have a sequence of select pseudo instructions with stack adjustment
instructions in between, we shouldn't apply the optimization proposed in
https://reviews.llvm.org/D59355. If the optimization is applied, the
function won't be marked `adjustsStack` during the Finalize ISel pass.
|
|
This removes a bunch of unreachable isel patterns for floating point
operations like fadd, fsub, fmul, etc.
Eventually we will need patterns for Zvfbfa but this will require new
pseudoinstructions with the altfmt bit set so these extra patterns
aren't helpful for that either.
Add a new AllFloatAndBFloatVectors for the instructions that do need
both, like vrgather, vcompress, and vmerge.
|
|
`qc.insb/qc.insbi` to RISCVISelLowering.cpp (#157618)
This is a follow-up to #154135 and does similar changes for
`qc.insb/qc.insbi`.
|
|
instructions (#160155)
Vector to scalar movement instructions, as well as mask instructions
like vcpop and vfirst, should have a higher latency & occupancy on
SiFive7.
---------
Co-authored-by: Michael Maitland <michaeltmaitland@gmail.com>
|
|
This more closely matches what we have done for uimm20, and should allow
us in future to differentiate between places that accept %*lo(expr) and
those where that is not allowed.
I have not introduced a `simm12` node for the moment, so that downstream
users notice the change.
|
|
.. to isReMaterializableImpl. The "Really" naming has always been
awkward, and we're working towards removing the "Trivial" part now,
so go ahead and remove both pieces in a single rename.
Note that this doesn't change any aspect of the current
implementation; we still "mostly" only return instructions which
are trivial (meaning no virtual register uses), but some targets
do lie about that today.
|
|
The recently added latency customization
([PR](https://github.com/llvm/llvm-project/pull/155420)) does not work
on RISCV since it has a target-specific InstrumentManager that overrides
the default functionality. Added calls to the base class to ensure that common
instruments (including the latency customizer) are available.
|
|
Add MC layer support for Andes XAndesVSIntH extension. The spec is
available at:
https://github.com/andestech/andes-v5-isa/releases/tag/ast-v5_4_0-release
|
|
embed in MemIntrinsicInfo #157863 (#159713)
[Previously reverted due to failures on asan-rvv-intrinsics.ll; the test
case is RISC-V only but was also run in configurations for other targets]
Reland [#157863](https://github.com/llvm/llvm-project/pull/157863), and
add `; REQUIRES: riscv-registered-target` to the test case to skip
configurations that don't register the RISC-V target.
Previously asan treated target intrinsics as black boxes, so it
could not instrument accurate checks for them. This patch makes
SmallVector<InterestingMemoryOperand> a member of MemIntrinsicInfo so
that TTI lets targets describe their intrinsics' memory operands to asan.
Note:
1. This patch moves InterestingMemoryOperand from Transforms to Analysis.
2. This patch extends MemIntrinsicInfo by adding a
SmallVector<InterestingMemoryOperand> member.
3. This patch does not support RVV indexed/segment load/store.
|
|
Previously, `bits<0>` only had effect if `ignore-non-decodable-operands`
wasn't specified. Handle it even if the option was specified. This
should allow for a smoother transition to the option being removed.
The change revealed a couple of inaccuracies in RISCV compressed
instruction definitions.
* `C_ADDI4SPN` has `bits<5> rs1` field, but `rs1` is not encoded. It
should be `bits<0>`.
* `C_ADDI16SP` has `bits<5> rd` in the base class, but it is unused
since `Inst{11-7}` is overwritten with constant bits.
We should instead set `rd = 2` and `Inst{11-7} = rd`. There are a couple
of alternative fixes, but this one is the shortest.
|
|
|
|
RISCVMatInt. NFC (#159864)
I think this better reflects the intent of the modification. In all these
places we know bit 31 is 1 so we are sign extending.
|
|
I find it very confusing that we have two different kinds of
"immediates":
- MCOperands in the backend that are `isImm()` which can only be numbers
- RISCVOperands in the parser that are `isImm()` which can contain
expressions
This change aims to make it clearer that in the AsmParser, we are
dealing with expressions, rather than just numbers.
Unfortunately, `isImm` comes from the `MCParsedAsmOperand`, which is
needed for Microsoft Inline Asm, so we cannot fully get rid of it.
|
|
|
|
after LUI now. NFC (#159829)
The simm32 base case only uses lui+addiw when necessary after
3d2650bdeb8409563d917d8eef70b906323524ef
The worst case 8 instruction sequence doesn't leave a full 32 bits for
the LUI+ADDI(W) after the 3 12-bit ADDI and SLLI pairs are created. So
we will never generate LUI+ADDIW in the worst case sequence.
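
The bit budget behind this argument can be written out. This is a minimal sketch of the arithmetic stated in the message, assuming a 64-bit constant; the function names are hypothetical and are not part of RISCVMatInt.

```cpp
#include <cassert>

// Bits of a 64-bit constant left for the leading LUI+ADDI(W) once NumPairs
// trailing SLLI+ADDI pairs have each supplied a 12-bit chunk.
int bitsLeftForBase(int NumPairs) {
  assert(NumPairs >= 0 && NumPairs * 12 <= 64 && "too many 12-bit chunks");
  return 64 - 12 * NumPairs;
}

// Instruction count: LUI and ADDI(W) for the base, then two per pair.
int sequenceLength(int NumPairs) {
  assert(NumPairs >= 0 && "pair count cannot be negative");
  return 2 + 2 * NumPairs;
}
```

With three pairs the sequence is 8 instructions long and only 28 bits remain for the base, which is less than the full 32 bits the message refers to.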
|
|
We're only going to modify existing items, not add or remove any
elements to the vector.
|
|
combineOp_VLToVWOp_VL. (#159205)
These instructions have one already narrow operand. Previously, we
pretended like this operand was a supported extension.
This could cause problems when we called getOrCreateExtendedOp on this
narrow operand when creating the VWADD_VL. If the narrow operand
happened to be an extend of the opposite type, we would peek through it
and then rebuild it with the wrong extension type. So (vwadd_w_vl (i32
(sext X)), (i16 (zext Y))) would become (vwadd_vl (i16 (sext X)), (i16
(sext Y))).
To prevent this, we ignore the operand instead and pass std::nullopt for
SupportsExt to getOrCreateExtendedOp so it won't peek through any
extends on the narrow source.
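
The effect of rebuilding the extend with the wrong type can be shown on a single element. This is a minimal scalar sketch, not the DAG combine: it assumes X is i16 and Y is i8 (the message does not state the source widths), and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extends the low `Bits` bits of V to 32 bits.
int32_t signExtend(uint32_t V, unsigned Bits) {
  assert(Bits >= 1 && Bits < 32 && "unsupported width");
  uint32_t Sign = 1u << (Bits - 1);
  V &= (Sign << 1) - 1;
  return static_cast<int32_t>(V ^ Sign) - static_cast<int32_t>(Sign);
}

// Intended result: the narrow operand is (i16 (zext Y)), which vwadd_w then
// sign-extends to i32 before adding it to the wide operand.
int32_t expectedSum(int16_t X, uint8_t Y) {
  uint16_t NarrowOp = Y; // zero extension from i8 to i16
  return int32_t{X} + signExtend(NarrowOp, 16);
}

// Miscombined result: the zext was peeked through and rebuilt as a sext.
int32_t miscombinedSum(int16_t X, uint8_t Y) {
  return int32_t{X} + signExtend(Y, 8);
}
```

The two results differ exactly when the top bit of Y is set: for X = 1 and Y = 0x80 the expected sum is 129, while the miscombined form gives -127.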
Fixes #159152.
|
|
|
|
|
|
The following changes are made:
a) Typo fix (for the previous PR https://github.com/llvm/llvm-project/pull/155747)
b) Builtins support for MIPS P8700 execution control instructions.
c) Test case
|
|
This patch adds MC support for Zvfofp8min
https://github.com/aswaterman/riscv-misc/blob/main/isa/zvfofp8min.adoc.
|
|
For the vx form, we legalize it by widening the scalar. For the vf form, we select the right register bank.
|
|
MachineFunction. NFC (#159664)
|
|
Don't put them onto the worklist, since they'll crash when we try to
check their opcode.
Fixes #159422
|
|
embed in MemIntrinsicInfo" (#159700)
Reverts llvm/llvm-project#157863
|
|
MemIntrinsicInfo (#157863)
Previously asan treated target intrinsics as black boxes, so it
could not instrument accurate checks for them. This patch makes
SmallVector<InterestingMemoryOperand> a member of MemIntrinsicInfo so
that TTI lets targets describe their intrinsics' memory operands to asan.
Note:
1. This patch moves InterestingMemoryOperand from Transforms to Analysis.
2. This patch extends MemIntrinsicInfo by adding a
SmallVector<InterestingMemoryOperand> member.
3. This patch does not support RVV indexed/segment load/store.
|
|
This patch implements pages 21-24 from jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf
Documentation:
jhauser.us/RISCV/ext-P/RVP-baseInstrs-014.pdf
jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf
Co-authored-by: Craig Topper <craig.topper@sifive.com>
|
|
(#159678)
If the original type was i32, type legalization will sign extend
the constant. This prevents it from having a single bit set or clear
so other patterns can't match. If the upper bits aren't used, we
can ignore the sign extension.
Similar for bclri and binvi.
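
The problem with the sign-extended constant can be reproduced directly. This is a minimal sketch, assuming the single-bit-set case (the commit title is truncated in this log); the helper names are hypothetical and do not correspond to the isel patterns.

```cpp
#include <cassert>
#include <cstdint>

// True if exactly one bit of V is set.
bool isPowerOf2(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Sign-extends a 32-bit constant to 64 bits, as happens when an i32
// constant is legalized to i64.
uint64_t signExtend32(uint32_t C) {
  uint64_t Value = C;
  if (C & 0x80000000u)
    Value |= 0xFFFFFFFF00000000ull;
  return Value;
}

// If the upper 32 bits of the result are unused, only the low 32 bits of
// the constant need to have a single bit set.
bool isSingleBitIgnoringUpper(uint64_t C) {
  return isPowerOf2(C & 0xFFFFFFFFull);
}

// Index of the set bit in the low 32 bits, i.e. the bit a single-bit
// instruction would operate on.
unsigned bitIndex(uint64_t C) {
  assert(isSingleBitIgnoringUpper(C) && "expected one set bit in the low half");
  uint32_t Low = static_cast<uint32_t>(C);
  unsigned Index = 0;
  while ((Low & 1u) == 0) {
    Low >>= 1;
    ++Index;
  }
  return Index;
}
```

The i32 constant `0x80000000` becomes `0xFFFFFFFF80000000` after sign extension, which fails the plain single-bit test but passes once the upper half is ignored, giving bit index 31.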
|
|
The original patterns for the Xqci select-like instructions used
`select`, and marked that ISD node as legal. This is not the usual way
that `select` is dealt with in the RISC-V backend.
Usually on RISC-V, we expand `select` to `riscv_select_cc` which holds
references to the operands of the comparison and the possible values
depending on the comparison. In retrospect, this is a much better fit
for our instructions, as most of them correspond to specific condition
codes, rather than more generic `select` with a truthy/falsey value.
This PR moves the Xqci select-like patterns to use `riscv_select_cc`
nodes. This applies to the Xqcicm, Xqcics and Xqcicli instruction
patterns.
In order to match the existing codegen, minor additions had to be made
to `translateSetCCForBranch` to ensure that comparisons against specific
immediate values are left in a form that can be matched more closely by
the instructions. This prevents having to insert additional `li`
instructions and use the register forms.
There are a few slight regressions:
- There are sometimes more `mv` instructions than entirely necessary. I
believe these would not be seen with larger examples where the register
allocator has more leeway.
- In some tests where just one of the three extensions is enabled,
codegen falls back to using a branch over a move. With all three
extensions enabled (the configuration we most care about), these are not
seen.
- The generated patterns are very similar to each other - they have
similar complexity (7 or 8) and there are still overlaps. Sometimes the
choice between two instructions can be affected by the order of the
patterns in the tablegen file.
One other change is that Xqcicm instructions are prioritised over Xqcics
instructions where they have identical patterns. This is done because
one of the Xqcicm instructions is compressible (`qc.mveqi`), while
none of the Xqcics instructions are.
|
|
|
|
(#159468)
Vector integer division in SiFive7 processes a single bit at a time up
to 4 elements. This patch updates to reflect this behavior.
Co-authored-by: Michael Maitland <michaeltmaitland@gmail.com>
|
|
The latency of floating point loads in SiFive7 should be the same as
their integer counterparts.
Co-authored-by: Michael Maitland <michaeltmaitland@gmail.com>
|
|
This is the minimal case generated by clang at `-O0`; I'm not sure if
writing the test this way is appropriate.
|
|
|
|
This patch tries to match fmaxnum and fminnum to vector reductions.
|
|
This adds the CodeGen support of Zibi v0.1 experimental extension, which
depends on #127463.
|
|
### Summary
Try to implement lowering of G_SSUBE in LegalizerHelper::lower
|
|
NFC.
|
|
There is no RISCV isel pattern for bitcast between f16 and bf16, which
triggers a "cannot select" fatal error.
Co-authored-by: Ying Wang <wy446777@alibaba-inc.com>
|
|
|
|
|