|
Add `bits<0>` fields to instructions using the ZTR/MPR/MPR8 register
classes. These register classes contain only one register, and it is
not encoded in the instruction. This way, the generated decoder can
completely decode instructions without having to perform a post-decoding
pass to insert missing operands.
Some immediate operands are likewise not encoded and have only one
possible value, zero. Apply the same trick to them.
Finally, remove the `-ignore-non-decodable-operands` option from the
`llvm-tblgen` invocation to ensure that non-decodable operands do not
reappear in the future.
|
|
This patch uses SignExtend64<N> to simplify sign extensions.
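For illustration, a minimal sketch of the pattern this enables (the field
position and width below are made up, not taken from the patch):

  // Decode a 9-bit signed immediate with SignExtend64<N> instead of a
  // hand-written shift-and-mask sign extension.
  #include "llvm/Support/MathExtras.h"
  #include <cstdint>

  static int64_t decodeSImm9(uint32_t Insn) {
    uint64_t Field = (Insn >> 12) & 0x1FF; // raw 9-bit encoded field
    return llvm::SignExtend64<9>(Field);   // sign-extend from bit 9
  }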
|
|
These instructions encode two operands in the same field. Instead of
fixing them after they have been incorrectly decoded, provide a custom
decoder.
This will allow removing the `-ignore-non-decodable-operands` option from
AArch64/CMakeLists.txt; see #156358 for context.
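As a rough illustration only (the function name, field layout, and operand
kinds below are hypothetical, not the ones in this patch), such a custom
decoder adds both operands itself:

  // The generated decoder hands over the single encoded field; the callback
  // emits the two MCInst operands that are packed into it.
  #include "llvm/MC/MCDisassembler/MCDisassembler.h"
  #include "llvm/MC/MCInst.h"
  using namespace llvm;
  using DecodeStatus = MCDisassembler::DecodeStatus;

  static DecodeStatus DecodeTwoOpsFromOneField(MCInst &Inst, uint32_t Field,
                                               uint64_t Addr,
                                               const MCDisassembler *Decoder) {
    Inst.addOperand(MCOperand::createImm(Field & 0x7));        // first operand
    Inst.addOperand(MCOperand::createImm((Field >> 3) & 0x3)); // second operand
    return MCDisassembler::Success;
  }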
|
|
Rearrange decode functions to be before including the generated
disassembler code and eliminate forward declarations for most of them.
This is possible because `fieldFromInstruction` is now in MCDecoder.h
and not in the generated disassembler code.
|
|
(#154802)
Extract fixed functions generated by decoder emitter into a new
MCDecoder.h header.
|
|
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/Target` library.
These annotations currently have no meaningful impact on the LLVM build;
however, they are a prerequisite to support an LLVM Windows DLL (shared
library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
A subset of these changes was generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed by formatting with `git clang-format`.
The bulk of this change is manual additions of `LLVM_ABI` to
`LLVMInitializeX` functions defined in .cpp files under llvm/lib/Target.
Adding `LLVM_ABI` to the function implementations is required here
because these files do not `#include "llvm/Support/TargetSelect.h"`,
which contains the declarations for these functions and was already
updated with `LLVM_ABI` in a previous patch. I considered patching these
files with `#include "llvm/Support/TargetSelect.h"` instead, but since
TargetSelect.h is a large file with a lot of preprocessor x-macro
machinery, I was concerned it would unnecessarily impact compile times.
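For example, the kind of edit described here looks roughly like the following
(the AArch64 initializer is used purely for illustration; `LLVM_ABI` comes
from llvm/Support/Compiler.h):

  // In a Target .cpp file: annotate the definition directly, since the
  // declaring header (TargetSelect.h) is intentionally not included here.
  #include "llvm/Support/Compiler.h"

  extern "C" LLVM_ABI void LLVMInitializeAArch64Target() {
    // ... register the target as before; only the annotation is new ...
  }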
In addition, a number of unit tests under llvm/unittests/Target required
additional dependencies to make them build correctly against the LLVM
DLL on Windows using MSVC.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
|
|
Replace unsigned with MCRegister at some of the call sites.
|
|
Add AArch64MCSymbolizer that symbolizes `MCInst` operands during
disassembly. The symbolization was previously done in
`BinaryFunction::disassemble()`, but it is also required by
`scanExternalRefs()` for "lite" mode functionality. Hence, similar to
x86, I've implemented the symbolizer interface that uses
`BinaryFunction` relocations to properly create instruction operands. I
expect the result of the disassembly to be identical after the change.
The AArch64 disassembler was not calling `tryAddingSymbolicOperand()`
for `MOV` instructions. Fix that. Additionally, the disassembler marks
`ldr` instructions as branches by setting the `IsBranch` parameter to
true. Ignore the parameter and rely on the `MCPlusBuilder` interface
instead.
I've modified the `--check-encoding` flag to check symbolization of
operands of instructions that have relocations against them.
|
|
Identified with misc-include-cleaner.
|
|
(#113461)
…uctions (#112726)
This patch adds the assembly/disassembly for the following instructions:
CBB<cc>, CBH<cc>,
CB<cc>(immediate), CB<cc>(register)
CBBLE, CBBLO, CBBLS, CBBLT
CBHLE, CBHLO, CBHLS, CBHLT
CBGE, CBHS, CBLE, CBLS (immediate)
CBLE, CBLO, CBLS, CBLT(register)
According to [1]
[1]https://developer.arm.com/documentation/ddi0602
Co-authored-by: Momchil Velikov momchil.velikov@arm.com
Co-authored-by: Spencer Abson spencer.abson@arm.com
This patch was reverted (git commit 83c6e2f8f4d3) and is being submitted
again with a fix for the buildbot failure in:
https://lab.llvm.org/buildbot/#/builders/25/builds/3493
The fix was to replace a left shift of a possibly negative value with a
multiplication in DecodePCRelLabel9. Because int64_t ImmVal is signed,
(ImmVal << 2)
was replaced with
(ImmVal * 4)
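In context, the fixed step looks roughly like this (the surrounding decoder
scaffolding and variable names are assumed, only the scaling expression is
the point):

  // ImmVal is a sign-extended, possibly negative label offset. Scaling it
  // with `ImmVal << 2` is flagged by UBSan (left shift of a negative value);
  // multiplying by 4 computes the same result safely.
  #include "llvm/Support/MathExtras.h"
  #include <cstdint>

  static uint64_t decodeLabel9Target(uint32_t Imm, uint64_t Addr) {
    int64_t ImmVal = llvm::SignExtend64<9>(Imm); // 9-bit PC-relative field
    return Addr + ImmVal * 4;                    // was: Addr + (ImmVal << 2)
  }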
|
|
This patch adds assembly/disassembly for the following SME2p2
instructions (part of the 2024 AArch64 ISA update):
- BFTMOPA (widening) - FEAT_SME2p2
- BFTMOPA (non-widening) - FEAT_SME2p2 & FEAT_SME_B16B16
- FTMOPA (4-way) - FEAT_SME2p2 & FEAT_SME_F8F32
- FTMOPA (2-way, 8-to-16) - FEAT_SME2p2 & FEAT_SME_F8F16
- FTMOPA (2-way, 16-to-32) - FEAT_SME2p2
- FTMOPA (non-widening, f16) - FEAT_SME2p2 & FEAT_SME_F16F16
- FTMOPA (non-widening, f32) - FEAT_SME2p2
- Add new ZPR_K register class and ZK register operand
- Introduce assembler extension tests for the new sme2p2 feature
In accordance with:
https://developer.arm.com/documentation/ddi0602/latest/
Co-authored-by: Marian Lukac marian.lukac@arm.com
|
|
extensions (#112341)
Add support for the following Armv9.6-A memory systems extensions:
FEAT_LSUI - Unprivileged Load Store
FEAT_OCCMO - Outer Cacheable Cache Maintenance Operation
FEAT_PCDPHINT - Producer-Consumer Data Placement Hints
FEAT_SRMASK - Bitwise System Register Write Masks
as documented here:
https://developer.arm.com/documentation/109697/2024_09/Feature-descriptions/The-Armv9-6-architecture-extension
Co-authored-by: Jonathan Thackray <jonathan.thackray@arm.com>
|
|
instructions (#112726)"
This reverts commit dc84337f7b5bb2447e30f3364ebc863e9e04b8be.
Reverting because the sanitizer failed with the following error:
llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp:502:56:
runtime error: left shift of negative value -256
|
|
(#112726)
This patch adds the assembly/disassembly for the following instructions:
CBB<cc>, CBH<cc>,
CB<cc>(immediate), CB<cc>(register)
CBBLE, CBBLO, CBBLS, CBBLT
CBHLE, CBHLO, CBHLS, CBHLT
CBGE, CBHS, CBLE, CBLS (immediate)
CBLE, CBLO, CBLS, CBLT(register)
According to [1]
[1]https://developer.arm.com/documentation/ddi0602
Co-authored-by: Momchil Velikov momchil.velikov@arm.com
Co-authored-by: Spencer Abson spencer.abson@arm.com
|
|
Remove the unused functions and register classes from the change below
https://github.com/llvm/llvm-project/commit/4679583181a9032b4f7c6476c7a1bfefe5724b47
|
|
Add new register classes/operands and their encoder/decoder behaviour
required for the new Armv9.6 instructions (see
https://developer.arm.com/documentation/109697/2024_09/Feature-descriptions/The-Armv9-6-architecture-extension).
This work is the basis of the 2024 Armv9.6 architecture update effort for
SME.
Co-authored-by: Caroline Concatto caroline.concatto@arm.com
Co-authored-by: Marian Lukac marian.lukac@arm.com
Co-authored-by: Momchil Velikov momchil.velikov@arm.com
|
|
In a previous PR #81716, a new decoder function was added to
llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp. During
code review it was suggested that, as most of the decoder functions were
very similar in structure, they be refactored into a single templated
function. I have added the refactored function, removed the
definitions of the replaced functions, and replaced the references to
the replaced functions in AArch64Disassembler.cpp and
llvm/lib/Target/AArch64/AArch64RegisterInfo.td. To reduce the number of
duplicate references in AArch64RegisterInfo.td, I have also made a small
change to llvm/utils/TableGen/DecoderEmitter.cpp.
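A sketch of the general shape of such a refactor (the name, parameters, and
register-table mechanism here are illustrative rather than the exact function
that was added):

  // One templated decoder replaces several near-identical per-class ones:
  // each use differs only in how many registers the class holds and which
  // table maps the encoded number to an MCRegister.
  #include "llvm/MC/MCDisassembler/MCDisassembler.h"
  #include "llvm/MC/MCInst.h"
  #include "llvm/MC/MCRegister.h"
  using namespace llvm;
  using DecodeStatus = MCDisassembler::DecodeStatus;

  template <unsigned NumRegsInClass>
  static DecodeStatus DecodeSimpleRegister(MCInst &Inst, unsigned RegNo,
                                           uint64_t Addr,
                                           const MCDisassembler *Decoder,
                                           const MCRegister *RegTable) {
    if (RegNo >= NumRegsInClass)
      return MCDisassembler::Fail;
    Inst.addOperand(MCOperand::createReg(RegTable[RegNo]));
    return MCDisassembler::Success;
  }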
|
|
7dc20ab introduced an extra COPY when spilling and filling a PNR
register, which can't be elided as the input (PNR predicate) and output
(PPR predicate) register classes differ. The patch adds a new register
class that covers both PPR and PNR so that STR_PXI and LDR_PXI can
take either of them, removing the need for the copy.
|
|
As ImmVal is unsigned, it will always be >= 0
|
|
This reverts commit 199a0f9f5aaf72ff856f68e3bb708e783252af17.
Fixed the left shift of a signed integer that was causing UB.
|
|
This reverts commit 934b1099cbf14fa3f86a269dff957da8e5fb619f.
Buildbot failures on sanitizer-x86_64-linux-fast.
|
|
Add assembly/disassembly support for the new PAuthLR instructions
introduced in Armv9.5-A:
- AUTIASPPC/AUTIBSPPC
- PACIASPPC/PACIBSPPC
- PACNBIASPPC/PACNBIBSPPC
- RETAASPPC/RETABSPPC
- PACM
Documentation for these instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/
|
|
This patch adds the feature flag FP8FMA and the assembly/disassembly
for the following instructions of NEON and SVE2:
* NEON:
- FMLALBlane
- FMLALTlane
- FMLALLBBlane
- FMLALLBTlane
- FMLALLTBlane
- FMLALLTTlane
- FMLALB
- FMLALT
- FMLALLB
- FMLALLBT
- FMLALLTB
- FMLALLTT
* SVE2:
- FMLALB_ZZZI
- FMLALT_ZZZI
- FMLALB_ZZZ
- FMLALT_ZZZ
- FMLALLBB_ZZZI
- FMLALLBT_ZZZI
- FMLALLTB_ZZZI
- FMLALLTT_ZZZI
- FMLALLBB_ZZZ
- FMLALLBT_ZZZ
- FMLALLTB_ZZZ
- FMLALLTT_ZZZ
That is according to this documentation:
https://developer.arm.com/documentation/ddi0602/2023-09
|
|
This patch separates PNR registers into their own register class instead
of sharing a register class with PPR registers. This primarily allows us
to return more accurate register classes when applying assembly
constraints, but also provides more protection against supplying an
incorrect predicate type to a register operand.
|
|
Instead of using SmallVectors of SmallVectors, use a plain array.
Reviewed By: c-rhodes
Differential Revision: https://reviews.llvm.org/D150077
|
|
Change MCInstrDesc::operands to return an ArrayRef so we can easily use
it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end.
A future patch will remove opInfo_begin and opInfo_end.
Also use it instead of raw access to the OpInfo pointer. A future patch
will remove this pointer.
Differential Revision: https://reviews.llvm.org/D142213
|
|
This adds support for the new PM pstate system register introduced by
the v9.4-A Exception-based Event Profiling extension (FEAT_EBEP).
The new PM pstate register takes a 1-bit immediate and requires
different values to be specified for the higher bits of the Crm field.
To enable that, this patch creates an explicit separation between the
pstate system registers that take 4-bit and 1-bit immediate operands,
allowing each entry to specify the value for the 3 high bits of Crm.
This also updates other pstate registers to correctly accept 4-bit
immediates, matching their decoding specification from the Arm ARM.
These include: `PAN`, `UAO`, `DIT` and `SSBS`.
More information about this extension and the new register can be found
at:
* https://developer.arm.com/documentation/ddi0601/2022-09/AArch64-Registers/PM--PMU-Exception-Mask
Contributors:
* Lucas Prates
* Sam Elliott
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D139925
|
|
Virtual Memory System Architecture (VMSA)
This is part of the 2022 A-Profile Architecture extensions and adds support for
the following:
- Translation Hardening Extension (FEAT_THE)
- 128-bit Page Table Descriptors (FEAT_D128)
- 56-bit Virtual Address (FEAT_LVA3)
- Support for 128-bit System Registers (FEAT_SYSREG128)
- System Instructions that can take 128-bit inputs (FEAT_SYSINSTR128)
- 128-bit Atomic Instructions (FEAT_LSE128)
- Permission Indirection Extension (FEAT_S1PIE, FEAT_S2PIE)
- Permission Overlay Extension (FEAT_S1POE, FEAT_S2POE)
- Memory Attribute Index Enhancement (FEAT_AIE)
New instructions added:
- FEAT_SYSREG128 adds MRRS and MSRR.
- FEAT_SYSINSTR128 adds the SYSP instruction and TLBIP aliases.
- FEAT_LSE128 adds LDCLRP, LDSETP, and SWPP instructions.
- FEAT_THE adds the set of RCW* instructions.
Specs for individual instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/
Contributors:
Keith Walker
Lucas Prates
Sam Elliott
Son Tuan Vu
Tomas Matheson
Differential Revision: https://reviews.llvm.org/D138920
|
|
This patch implements the 2022 Architecture General Data-Processing
Instructions. They include:
Common Short Sequence Compression (CSSC) instructions
- scalar comparison instructions
SMAX, SMIN, UMAX, UMIN (32/64 bits) with or without immediate
- ABS (absolute), CNT (count non-zero bits), CTZ (count trailing zeroes)
- command-line options for CSSC
Associated with these instructions in the documentation is the Range Prefetch
Memory (RPRFM) instruction, which signals to the memory system that data memory
accesses from a specified range of addresses are likely to occur in the near
future. The instruction lies in hint space, and is made unconditional.
Specs for the individual instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/
contributors to this patch:
- Cullen Rhodes
- Son Tuan Vu
- Mark Murray
- Tomas Matheson
- Sam Elliott
- Ties Stuij
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D138488
|
|
This patch adds the assembly/disassembly for the following instructions:
SEL: Multi-vector conditionally select elements from two vectors,
for 2 and 4 registers
Non-contiguous load with strided registers:
LD1B (scalar + immediate): Contiguous load of bytes to multiple strided vectors (immediate index).
(scalar + scalar): Contiguous load of bytes to multiple strided vectors (scalar index).
LD1D (scalar + immediate): Contiguous load of doublewords to multiple strided vectors (immediate index).
(scalar + scalar): Contiguous load of doublewords to multiple strided vectors (scalar index).
LD1H (scalar + immediate): Contiguous load of halfwords to multiple strided vectors (immediate index).
(scalar + scalar): Contiguous load of halfwords to multiple strided vectors (scalar index).
LD1W (scalar + immediate): Contiguous load of words to multiple strided vectors (immediate index).
(scalar + scalar): Contiguous load of words to multiple strided vectors (scalar index).
LDNT1B (scalar + immediate): Contiguous load non-temporal of bytes to multiple strided vectors (immediate index).
(scalar + scalar): Contiguous load non-temporal of bytes to multiple strided vectors (scalar index).
LDNT1D (scalar + immediate): Contiguous load non-temporal of doublewords to multiple strided vectors (immediate index).
(scalar + scalar): Contiguous load non-temporal of doublewords to multiple strided vectors (scalar index).
LDNT1H (scalar + immediate): Contiguous load non-temporal of halfwords to multiple strided vectors (immediate index).
(scalar + scalar): Contiguous load non-temporal of halfwords to multiple strided vectors (scalar index).
LDNT1W (scalar + immediate): Contiguous load non-temporal of words to multiple strided vectors (immediate index).
(scalar + scalar): Contiguous load non-temporal of words to multiple strided vectors (scalar index).
Non-contiguous store with strided registers:
ST1B (scalar + immediate): Contiguous store of bytes from multiple strided vectors (immediate index).
(scalar + scalar): Contiguous store of bytes from multiple strided vectors (scalar index).
ST1D (scalar + immediate): Contiguous store of doublewords from multiple strided vectors (immediate index).
(scalar + scalar): Contiguous store of doublewords from multiple strided vectors (scalar index).
ST1H (scalar + immediate): Contiguous store of halfwords from multiple strided vectors (immediate index).
(scalar + scalar): Contiguous store of halfwords from multiple strided vectors (scalar index).
ST1W (scalar + immediate): Contiguous store of words from multiple strided vectors (immediate index).
(scalar + scalar): Contiguous store of words from multiple strided vectors (scalar index).
STNT1B (scalar + immediate): Contiguous store non-temporal of bytes from multiple strided vectors (immediate index).
(scalar + scalar): Contiguous store non-temporal of bytes from multiple strided vectors (scalar index).
STNT1D (scalar + immediate): Contiguous store non-temporal of doublewords from multiple strided vectors (immediate index).
(scalar + scalar): Contiguous store non-temporal of doublewords from multiple strided vectors (scalar index).
STNT1H (scalar + immediate): Contiguous store non-temporal of halfwords from multiple strided vectors (immediate index).
(scalar + scalar): Contiguous store non-temporal of halfwords from multiple strided vectors (scalar index).
STNT1W (scalar + immediate): Contiguous store non-temporal of words from multiple strided vectors (immediate index).
(scalar + scalar): Contiguous store non-temporal of words from multiple strided vectors (scalar index).
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09
This patch also adds a new SVE vector list, ZPRVectorListStrided, to
represent the strided loads/stores, and the sets of 2 and 4 ZA registers:
ZZ_[b|h|w|d]_strided and ZZZZ_[b|h|w|d]_strided
Differential Revision: https://reviews.llvm.org/D136172
|
|
This patch adds the assembly/disassembly for the following instructions:
ZERO (ZT0): Zero ZT0.
LDR (ZT0): Load ZT0 register.
STR (ZT0): Store ZT0 register.
MOVT (scalar to ZT0): Move 8 bytes from general-purpose register to ZT0.
(ZT0 to scalar): Move 8 bytes from ZT0 to general-purpose register.
Consecutive:
LUTI2 (single): Lookup table read with 2-bit indexes.
(two registers): Lookup table read with 2-bit indexes.
(four registers): Lookup table read with 2-bit indexes.
LUTI4 (single): Lookup table read with 4-bit indexes.
(two registers): Lookup table read with 4-bit indexes.
(four registers): Lookup table read with 4-bit indexes.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09
This patch also adds a new register class and operand for zt0,
and another index operand, uimm3s8.
Differential Revision: https://reviews.llvm.org/D136088
|
|
This patch adds the assembly/disassembly for the following
predicate pair instructions:
pext: Set pair of predicates from predicate-as-counter
whilelt: While incrementing signed scalar less than scalar
whilele: While incrementing signed scalar less than or equal to scalar
whilegt: While incrementing signed scalar greater than scalar
whilege: While incrementing signed scalar greater than or equal to scalar
whilelo: While incrementing unsigned scalar lower than scalar
whilels: While incrementing unsigned scalar lower or same as scalar
whilehs: While decrementing unsigned scalar higher or same as scalar
whilehi: While decrementing unsigned scalar higher than scalar
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09
Differential Revision: https://reviews.llvm.org/D136759
|
|
Currently the DecoderEmitter constructor takes a bunch of string
parameters containing bits of code to interpolate.
However, there are only two ways it can be called. The mode used for
most targets doesn't handle the SoftFail DecoderStatus (not a problem,
because they don't use SoftFail). The other mode, which is used for
ARM/AArch64, does handle SoftFail, but requires an externally defined
helper function in those targets.
This is unnecessary complication; remove the parameters and unify onto
a single version that does support SoftFail, defining the helper
itself.
|
|
This patch adds the assembly/disassembly for the following instructions:
pext (predicate) : Set predicate from predicate-as-counter
ptrue (predicate-as-counter) : Initialise predicate-as-counter to all active
This patch also introduces the predicate-as-counter registers pn8, etc.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09
Differential Revision: https://reviews.llvm.org/D136678
|
|
This patch adds the assembly/disassembly for the following instructions:
For INT:
ADD(array results, multiple vectors): Add multi-vector to multi-vector with ZA array vector results.
SUB(array results, multiple vectors): Subtract multi-vector from multi-vector with ZA array vector results.
For FP:
FMLA (multiple vectors): Multi-vector floating-point fused multiply-add.
FMLS (multiple vectors): Multi-vector floating-point fused multiply-subtract.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09
This patch also adds a register operand to represent multiples of ZA multi-vectors.
They are:
ZZ_s_mul_r, ZZ_d_mul_r, ZZZZ_s_mul_r and ZZZZ_d_mul_r
and represent the Zn or Zm times 2 or 4 according to the vector group.
Depends on: D135455
Differential Revision: https://reviews.llvm.org/D135468
|
|
This patch adds the assembly/disassembly for the following instructions:
For INT:
ADD(array results, multiple and single vector): Add replicated single
vector to multi-vector with ZA array vector results.
SUB(array results, multiple and single vector): Subtract replicated single
vector from multi-vector with ZA array vector results.
For FP:
FMLA (multiple and single vector): Multi-vector floating-point fused
multiply-add by vector.
FMLS (multiple and single vector): Multi-vector floating-point
multiply-subtract long by vector.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09
The Matrix operand has two new sizes, 32 (.s) and 64 (.d) bits
(MatrixOp32 and MatrixOp64).
Depends on: D135448
Depends on: D135952
Differential Revision: https://reviews.llvm.org/D135455
|
|
Identified with readability-qualified-auto.
|
|
With C++17 there is no Clang pedantic warning or MSVC C5051.
|
|
Currently, when llvm-objdump is disassembling a code section and
encounters a point where no instruction can be decoded, it uses the
same policy on all targets: consume one byte of the section, emit it
as "<unknown>", and try disassembling from the next byte position.
On an architecture where instructions are always 4 bytes long and
4-byte aligned, this makes no sense at all. If a 4-byte word cannot be
decoded as an instruction, then the next place that a valid
instruction could //possibly// be found is 4 bytes further on.
Disassembling from a misaligned address can't possibly produce
anything that the code generator intended, or that the CPU would even
attempt to execute.
This patch introduces a new MCDisassembler virtual method called
`suggestBytesToSkip`, which allows each target to choose its own
resynchronization policy. For Arm (as opposed to Thumb) and AArch64,
I've filled in the new method to return a fixed width of 4.
Thumb is a more interesting case, because the criterion for
identifying 2-byte and 4-byte instruction encodings is very simple,
and doesn't require the particular instruction to be recognized. So
`suggestBytesToSkip` is also passed an ArrayRef of the bytes in
question, so that it can take that into account. The new test case
shows Thumb disassembly skipping over two unrecognized instructions,
and identifying one as 2-byte and one as 4-byte.
For targets other than Arm and AArch64, this is NFC: the base class
implementation of `suggestBytesToSkip` still returns 1, so that the
existing behavior is unchanged. Other targets can fill in their own
implementations as they see fit; I haven't attempted to choose a new
behavior for each one myself.
I've updated all the call sites of `MCDisassembler::getInstruction` in
llvm-objdump, and also one in sancov, which was the only other place I
spotted the same idiom of `if (Size == 0) Size = 1` after a call to
`getInstruction`.
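A minimal sketch of what the fixed-width case looks like (assuming the hook
signature matches the description above):

  // AArch64-style override: instructions are always 4 bytes, so after a
  // failed decode, resynchronise at the next word rather than the next byte.
  uint64_t AArch64Disassembler::suggestBytesToSkip(ArrayRef<uint8_t> Bytes,
                                                   uint64_t Address) const {
    return 4;
  }

The Thumb override would instead inspect the leading halfword in `Bytes` to
choose between 2 and 4.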
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D130357
|
|
MCSymbolizer::tryAddingSymbolicOperand() overloaded the Size parameter
to specify either the instruction size or the operand size depending on
the architecture. However, for proper symbolic disassembly on X86, we
need to know both sizes, as an instruction can have two operands, and
the instruction size cannot be reliably calculated based on the operand
offset and its size. Hence, split Size into OpSize and InstSize.
For X86, the new interface allows us to fix a couple of issues:
* Correctly adjust the value of PC-relative operands.
* Set operand size to zero when the operand is specified implicitly.
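To illustrate why the operand size alone isn't enough on X86 (the numbers
below are a made-up example, not from the patch): a rel32 displacement is
relative to the end of the instruction, so computing its target needs the
full instruction size.

  // Example: a 7-byte instruction whose last 4 bytes are a rel32 operand.
  uint64_t Address  = 0x401000; // start of the instruction
  uint64_t InstSize = 7;        // total encoded length
  uint64_t OpSize   = 4;        // size of the displacement operand alone
  int64_t  Disp     = -0x20;    // decoded rel32 value
  // The referenced target needs InstSize; OpSize by itself cannot recover it.
  uint64_t Target   = Address + InstSize + Disp;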
Differential Revision: https://reviews.llvm.org/D126101
|
|
disassembly.
This patch simplifies the switch statement in getInstruction that adds
implicit operands (the ZA register and an immediate equal to zero)
to SME instructions during disassembly.
The ZA register and the zero immediate can be added by checking the
operand information in MCInstrDesc.
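A sketch of the idea, assuming the partially built MCInst (MI) and the
MCInstrInfo (MCII) are in scope; the operand-type and register-class checks
below are illustrative and the exact conditions in the patch may differ:

  // Append operands that are implied rather than encoded, driven by the
  // operand descriptions in MCInstrDesc instead of a per-opcode switch.
  const MCInstrDesc &Desc = MCII.get(MI.getOpcode());
  for (const MCOperandInfo &Op : Desc.operands()) {
    if (Op.OperandType == MCOI::OPERAND_REGISTER &&
        Op.RegClass == AArch64::MPRRegClassID)   // single-register class: ZA
      MI.addOperand(MCOperand::createReg(AArch64::ZA));
    // ... the always-zero immediate operands are handled the same way ...
  }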
Differential Revision: https://reviews.llvm.org/D125534
|
|
The name `MCFixedLenDisassembler.h` is out of date after D120958.
Rename it as `MCDecoderOps.h` to reflect the change.
Reviewed By: myhsu
Differential Revision: https://reviews.llvm.org/D124987
|
|
All LLVM backends use MCDisassembler as a base class for their
instruction decoders. Use "const MCDisassembler *" for the decoder
instead of "const void *". Remove unnecessary static casts.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D122245
|
|
|
|
This family of instructions includes CPYF (copy forward), CPYB (copy
backward), SET (memset) and SETG (memset + initialise MTE tags), with
some sub-variants to indicate whether address translation is done in a
privileged or unprivileged way. For the copy instructions, you can
separately specify the read and write translations (so that kernels
can safely use these instructions in syscall handlers, to memcpy
between the calling process's user-space memory map and the kernel's
own privileged one).
The unusual thing about these instructions is that they write back to
multiple registers, because they perform an implementation-defined
amount of copying each time they run, and write back to _all_ the
address and size registers to indicate how much remains to be done
(and the code is expected to loop on them until the size register
becomes zero). But this is no problem in LLVM - you just define each
instruction to have multiple outputs, multiple inputs, and a set of
constraints tying their register numbers together appropriately.
This commit introduces a special subtarget feature called MOPS (after
the name the spec gives to the CPU id field), which is a dependency of
the top-level 8.8-A feature, and uses that to enable most of the new
instructions. The SETG instructions also depend on MTE (and the test
checks that).
Differential Revision: https://reviews.llvm.org/D116157
|
|
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
|
|
Changes in architecture revision 00eac1:
* Tile slice index offset no longer prefixed with '#'.
* The syntax for 128-bit (.Q) ZA tile slice accesses must now include
an explicit zero index.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-09
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D111212
|
|
A small subset of the NEON instruction set is legal in streaming mode.
This patch adds support for the following vector to integer move
instructions:
0x00 1110 0000 0001 0010 11xx xxxx xxxx # SMOV W|Xd,Vn.B[0]
0x00 1110 0000 0010 0010 11xx xxxx xxxx # SMOV W|Xd,Vn.H[0]
0100 1110 0000 0100 0010 11xx xxxx xxxx # SMOV Xd,Vn.S[0]
0000 1110 0000 0001 0011 11xx xxxx xxxx # UMOV Wd,Vn.B[0]
0000 1110 0000 0010 0011 11xx xxxx xxxx # UMOV Wd,Vn.H[0]
0000 1110 0000 0100 0011 11xx xxxx xxxx # UMOV Wd,Vn.S[0]
0100 1110 0000 1000 0011 11xx xxxx xxxx # UMOV Xd,Vn.D[0]
Only the zero-index variants are legal; all other indexes are illegal.
To support this, new instructions are defined specifically for the zero
index, which is hardcoded, along with an implicit 'VectorIndex0' operand.
Since the index operand is implicit and takes no bits in the encoding,
custom decoding is required to add the operand.
I'm not sure if this is the best approach but the predicate constraint
on a subset of an operand is unusual. Would be interested to hear some
alternatives.
The instructions are predicated on 'HasNEONorStreamingSVE', i.e. they're
enabled by either +neon or +streaming-sve. This follows on from the work
in D106272 to support the subset of SVE(2) instructions that are legal
in streaming mode.
Depends on D107902.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D107903
|
|
The register classes are generated by TableGen, use them instead of
handwritten tables.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D107763
|
|
The decoder function and table are the same as FPR128, use that instead.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D107644
|