Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch adds support for following sme2.1 movaz instructions and
the spec is available here [1].
1. MOVAZ (array to vector, two registers).
2. MOVAZ (array to vector, four registers).
3. MOVAZ (tile to vector, single).
[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions?lang=en
|
|
This patch adds support for following sme2.1 luti2 and luti4 instructions, spec is
available here [1]
1. LUTI2 (two registers) strided.
2. LUTI2 (four registers) strided.
3. LUTI4 (two registers) strided.
4. LUTI4 (four registers) strided.
[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions?lang=en
|
|
This patch fixes the syntax of sve2p1 "dupq" instruction by modifying the way
2nd operand does the encoding and decoding using the [<imm>] value.
dupq makes use of already existing aarch64_ins_sve_index and aarch64_ext_sve_index
inserter and extractor functions. The definitions of aarch64_ins_sve_index_imm (inserter)
and aarch64_ext_sve_index_imm (extractor) is removed in this patch.
This issues was reported here:
https://sourceware.org/pipermail/binutils/2024-February/132408.html
|
|
This includes all the instructions under the following features:
- FEAT_FP8FMA (+fp8fma)
- FEAT_FP8DOT4 (+fp8dot4)
- FEAT_FP8DOT2 (+fp8dot2)
- FEAT_SSVE_FP8FMA (+ssve-fp8fma)
- FEAT_SSVE_FP8DOT4 (+ssve-fp8dot4)
- FEAT_SSVE_FP8DOT2 (+ssve-fp8dot2)
|
|
Introduces instructions for the SME2 lutv2 extension for AArch64. They
are documented in the following document:
* ARM DDI0602
For both luti4 instructions, we introduced an operand called
SME_Znx2_BIT_INDEX. We use the existing function parse_vector_reg_list
for parsing but modified that function so that it can accept operands
without qualifiers and rejects instructions that have operands with
qualifiers but are not supposed to have operands with qualifiers.
For disassembly, we modified print_register_list so that it could
accept register lists without qualifiers.
For one luti4 instruction, we introduced a SME_Zdnx4_STRIDED. It is
similar to SME_Ztx4_STRIDED and we could use existing code for parsing,
encoding, and disassembly.
For movt instruction, we introduced an operand called SME_ZT0_INDEX2_12.
This is a ZT0 register with a bit index encoded in [13:12]. It is
similar to SME_ZT0_INDEX.
We also introduced an iclass named sme_size_12_b so that we can encode
size bits [13:12] correctly when only 'b' is allowed as qualifier.
|
|
Introduces instructions for the Advanced SIMD lut extension for AArch64. They are documented in the following links:
* luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en
* luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en
These instructions needed definition of some new operands. We will first
discuss operands for the third operand of the instructions and then
discuss a vector register list operand needed for the second operand.
The third operands are vectors with bit indices and without type
qualifiers. They are called Em_INDEX1_14, Em_INDEX2_13, and Em_INDEX3_12
and they have 1 bit, 2 bit, and 3 bit indices respectively. For these
new operands, we defined new parsing case branch. The lsb and width of
these operands are the same as many existing but the convention is to
give different names to fields that serve different purpose so we
introduced new fields in aarch64-opc.c and aarch64-opc.h for these new
operands.
For the second operand of these instructions, we introduced a new
operand called LVn_LUT. This represents a vector register list with
stride 1. We defined new inserter and extractor for this new operand and
it is encoded in FLD_Rn. We are enforcing the number of registers in the
reglist using opcode flag rather than operand flag as this is what other
SIMD vector register list operands are doing. The disassembly also uses
opcode flag to print the correct number of registers.
|
|
Fix integer value being returned from boolean function, as introduced
in `aarch64: Remove asserts from operand qualifier decoders [PR31595]'.
|
|
Given that the disassembler should never abort when decoding
(potentially random) data, assertion statements in the
`get_*reg_qualifier_from_value' function family prove problematic.
Consider the random 32-bit word W, encoded in a data segment and
encountered on execution of `objdump -D <obj_name>'.
If:
(W & ~opcode_mask) == valid instruction
Then before `print_insn_aarch64_word' has a chance to report the
instruction as potentially undefined, an attempt will be made to have
the qualifiers for the instruction's register operands (if any)
decoded. If the relevant bits do not map onto a valid qualifier for
the matched instruction-like word, an abort will be triggered and the
execution of objdump aborted.
As this scenario is perfectly feasible and, in light of the fact that
objdump must successfully decode all sections of a given object file,
it is not appropriate to assert in this family of functions.
Therefore, we add a new pseudo-qualifier `AARCH64_OPND_QLF_ERR' for
handling invalid qualifier-associated values and re-purpose the
assertion conditions in qualifier-retrieving functions to be the
predicate guarding the returning of the calculated qualifier type.
If the predicate fails, we return this new qualifier and allow the
caller to handle the error as appropriate.
As these functions are called either from within
`aarch64_extract_operand' or `do_special_decoding', both of which are
expected to return non-zero values, it suffices that callers return
zero upon encountering `AARCH64_OPND_QLF_ERR'.
Ar present the error presented in the hypothetical scenario has been
encountered in `get_sreg_qualifier_from_value', but the change is made
to the whole family to keep the interface consistent.
Bug: https://sourceware.org/PR31595
|
|
The following instructions are added in this patch:
- ADDPT and SUBPT - Add/Subtract checked pointer
- MADDPT and MSUBPT - Multiply Add/Subtract checked pointer
These instructions are part of Checked Pointer Arithmetic extension.
This patch adds assembler and disassembler support for these instructions
with relevant checks. Tests are included as well.
A new flag "+cpa" added to documentation. This flag enables CPA extension.
Regression tested on the aarch64-none-linux-gnu target and no regressions
have been found.
|
|
|
|
Beyond the need to encode any registers involved in data transfer and
the address base register for load/stores, it is necessary to specify
the data register addressing mode and whether the address register is
to be pre/post-indexed, whereby loads may be post-indexed and stores
pre-indexed with write-back.
The use of a single bit to specify both the indexing mode and indexing
value requires a novel function be written to accommodate this for
address operand insertion in assembly and another for extraction in
disassembly, along with the definition of two insn fields for use with
these instructions.
This therefore defines the following functions:
- aarch64_ins_rcpc3_addr_opt_offset
- aarch64_ins_rcpc3_addr_offset
- aarch64_ext_rcpc3_addr_opt_offset
- aarch64_ext_rcpc3_addr_offset
It extends the `do_special_{encoding|decoding}' functions and defines
two rcpc3 instruction fields:
- FLD_opc2
- FLD_rcpc3_size
|
|
There are some tlbi operations that don't have a corresponding tlbip operation,
but we were incorrectly using the same list for both. Add the missing tlbi
*nxs operations, and use the F_REG_128 flag to filter tlbi operations that
don't have a tlbip analogue. For increased clarity, I have also used a macro
to reduce duplication between the 'nxs' and non-'nxs' variants, and added a
test to verify that no invalid combinations are accepted.
Additionally, fix two missing checks for AARCH64_OPND_SYSREG_TLBIP that were
preventing disassembly of tlbip instructions.
|
|
Hi,
This patch add support for SVE2.1 instructions ld1q,
ld2q, ld3q and ld4q, st1q, st2q, st3q and st4q.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for SVE2.1 instruction dupq, eorqv and extq.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for FEAT_SVE2p1 (SVE2.1 Extension) feature
along with +sve2p1 optional flag to enabe this feature.
Also support for following SVE2p1 instructions is added
addqv, andqv, smaxqv, sminqv, umaxqv, uminqv and uminqv.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for FEAT_SME2p1 and "movaz" instructions
along with the optional flag +sme2p1.
Following "movaz" instructions are add:
Move and zero two ZA tile slices to vector registers.
Move and zero four ZA tile slices to vector registers.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
With the addition of 128-bit system registers to the Arm architecture
starting with Armv9.4-a, a mechanism for manipulating their contents
is introduced with the `msrr' and `mrrs' instruction pair.
These move values from one such 128-bit system register into a pair of
contiguous general-purpose registers and vice-versa, as for example:
msrr ttlb0_el1, x0, x1
mrrs x0, x1, ttlb0_el1
This patch adds the necessary support for these instructions, adding
checks for system-register width by defining a new operand type in the
form of `AARCH64_OPND_SYSREG128' and the `aarch64_sys_reg_128bit_p'
predicate, responsible for checking whether the requested system
register table entry is marked as implemented in the 128-bit mode via
the F_REG_128 flag.
|
|
Mirroring the use of the `sys' - System Instruction assembly
instruction, this implements its 128-bit counterpart, `sysp'.
This optionally takes two contiguous general-purpose registers
starting at an even number or, when these are omitted, by default
sets both of these to xzr.
Syntax:
sysp #<op1>, <Cn>, <Cm>, #<op2>{, <Xt1>, <Xt2>}
|
|
Analysis of the allowed operand values for `sysp' and `tlbip' reveals
a significant departure from the allowed behavior for operand register
pairs (hitherto labeled AARCH64_OPND_PAIRREG) observed for other
insns in this category.
For instructions `casp', `mrrs' and `msrr' the register pair must
always start at an even index and the second register in the pair is
the index + 1. This precludes the use of xzr as the first register,
given it corresponds to register number 31.
This is different in the case of `sysp' and `tlbip', however. These
allow the use of xzr and, where the first operand in the pair is
omitted, this is the default value assigned to it. When this
operand is assigned xzr, it is expected that the second operand will
likewise take on a value of xzr.
These two instructions therefore "break" two rules of register pairs:
* The first of the two registers is odd-numbered.
* The index of the second register is equal to that of the first,
and not n+1.
To allow for this departure from hitherto standard behavior, we
extend the functionality of the assembler by defining an extension of
the AARCH64_OPND_PAIRREG, called AARCH64_OPND_PAIRREG_OR_XZR.
It is used in defining `sysp' and `tlbip' and allows
`operand_general_constraint_met_p' to allow the pair to both take on
the value of xzr.
|
|
Adds two new external authors to etc/update-copyright.py to cover
bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then
updates copyright messages as follows:
1) Update cgen/utils.scm emitted copyrights.
2) Run "etc/update-copyright.py --this-year" with an extra external
author I haven't committed, 'Kalray SA.', to cover gas testsuite
files (which should have their copyright message removed).
3) Build with --enable-maintainer-mode --enable-cgen-maint=yes.
4) Check out */po/*.pot which we don't update frequently.
|
|
The AArch64 feature-flag code is currently limited to a maximum
of 64 features. This patch reworks it so that the limit can be
increased more easily. The basic idea is:
(1) Turn the ARM_FEATURE_FOO macros into an enum, with the enum
counting bit positions.
(2) Make the feature-list macros take an array index argument
(currently always 0). The macros then return the
aarch64_feature_set contents for that array index.
An N-element array would then be initialised as:
{ MACRO (0), ..., MACRO (N - 1) }
(3) Provide convenience macros for initialising an
aarch64_feature_set for:
- a single feature
- a list of individual features
- an architecture version
- an architecture version + a list of additional features
(2) and (3) use the preprocessor to generate static initialisers.
The main restriction was that uses of the same preprocessor macro
cannot be nested. So if a macro wants to do something for N individual
arguments, it needs to use a chain of N macros to do it. There then
needs to be a way of deriving N, as a preprocessor token suitable for
pasting.
The easiest way of doing that was to precede each list of features
by the number of features in the list. So an aarch64_feature_set
initialiser for three features A, B and C would be written:
AARCH64_FEATURES (3, A, B, C)
This scheme makes it difficult to keep AARCH64_FEATURE_CRYPTO as a
synonym for SHA2+AES, so the patch expands the former to the latter.
|
|
gprofng uses insn_type in print_address_func().
But insn_type is always zero on aarch64.
opcodes/ChangeLog:
2023-09-07 Vladimir Mezentsev <vladimir.mezentsev@oracle.com>
* opcodes/aarch64-dis.c (print_insn_aarch64_word): Set insn_type for
branch instructions.
|
|
Historically, flags and variables relating to architectural revisions
for the A-profile architecture omitted the trailing `A' such that, for
example, assembling for `-march=armv8.4-a' set the `AARCH64_ARCH_V8_4'
flag in the assembler.
This leads to some ambiguity, since Binutils also targets the
R-profile Arm architecture. Therefore, it seems prudent to have
everything associated with the A-profile cores end in `A' and likewise
`R' for the R-profile. Referring back to the example above, the flag
set for `-march=armv8.4-a' is better characterized if labeled
`AARCH64_ARCH_V8_4A'.
The only exception to the rule of appending `A' to variables is found
in the handling of the `AARCH64_FEATURE_V8' macro, as it is the
baseline from which ALL processors derive and should therefore be left
unchanged.
In reflecting the `ARM' architectural nomenclature choices, where we
have `ARM_ARCH_V8A' and `ARM_ARCH_V8R', the choice is made to not have
an underscore separating the numerical revision number and the
A/R-profile indicator suffix. This has meant that renaming of
R-profile related flags and variables was warranted, thus going from
`.*_[vV]8_[rR]' to `.*_[vV]8[rR]'.
Finally, this is more in line with conventions within GCC and adds consistency
across the toolchain.
gas/ChangeLog:
* gas/config/tc-aarch64.c:
(aarch64_cpus): Reference to arch feature macros updated.
(aarch64_archs): Likewise.
include/ChangeLog:
* include/opcode/aarch64.h:
(AARCH64_FEATURE_V8A): Updated name: V8_A -> V8A.
(AARCH64_FEATURE_V8_1A): A-suffix added.
(AARCH64_FEATURE_V8_2A): Likewise.
(AARCH64_FEATURE_V8_3A): Likewise.
(AARCH64_FEATURE_V8_4A): Likewise.
(AARCH64_FEATURE_V8_5A): Likewise.
(AARCH64_FEATURE_V8_6A): Likewise.
(AARCH64_FEATURE_V8_7A): Likewise.
(AARCH64_FEATURE_V8_8A):Likewise.
(AARCH64_FEATURE_V9A): Likewise.
(AARCH64_FEATURE_V8R): Updated name: V8_R -> V8R.
(AARCH64_ARCH_V8A_FEATURES): Updated name: V8_A -> V8A.
(AARCH64_ARCH_V8_1A_FEATURES): A-suffix added.
(AARCH64_ARCH_V8_2A_FEATURES): Likewise.
(AARCH64_ARCH_V8_3A_FEATURES): Likewise.
(AARCH64_ARCH_V8_4A_FEATURES): Likewise.
(AARCH64_ARCH_V8_5A_FEATURES): Likewise.
(AARCH64_ARCH_V8_6A_FEATURES): Likewise.
(AARCH64_ARCH_V8_7A_FEATURES): Likewise.
(AARCH64_ARCH_V8_8A_FEATURES): Likewise.
(AARCH64_ARCH_V9A_FEATURES): Likewise.
(AARCH64_ARCH_V9_1A_FEATURES): Likewise.
(AARCH64_ARCH_V9_2A_FEATURES): Likewise.
(AARCH64_ARCH_V9_3A_FEATURES): Likewise.
(AARCH64_ARCH_V8A): Updated name: V8_A -> V8A.
(AARCH64_ARCH_V8_1A): A-suffix added.
(AARCH64_ARCH_V8_2A): Likewise.
(AARCH64_ARCH_V8_3A): Likewise.
(AARCH64_ARCH_V8_4A): Likewise.
(AARCH64_ARCH_V8_5A): Likewise.
(AARCH64_ARCH_V8_6A): Likewise.
(AARCH64_ARCH_V8_7A): Likewise.
(AARCH64_ARCH_V8_8A): Likewise.
(AARCH64_ARCH_V9A): Likewise.
(AARCH64_ARCH_V9_1A): Likewise.
(AARCH64_ARCH_V9_2A): Likewise.
(AARCH64_ARCH_V9_3A): Likewise.
(AARCH64_ARCH_V8_R): Updated name: V8_R -> V8R.
opcodes/ChangeLog:
* opcodes/aarch64-opc.c (SR_V8A): Updated name: V8_A -> V8A.
(SR_V8_1A): A-suffix added.
(SR_V8_2A): Likewise.
(SR_V8_3A): Likewise.
(SR_V8_4A): Likewise.
(SR_V8_6A): Likewise.
(SR_V8_7A): Likewise.
(SR_V8_8A): Likewise.
(aarch64_sys_regs): Reference to arch feature macros updated.
(aarch64_pstatefields): Reference to arch feature macros updated.
(aarch64_sys_ins_reg_supported_p): Reference to arch feature macros
updated.
* opcodes/aarch64-tbl.h:
(aarch64_feature_v8_2a): a-suffix added.
(aarch64_feature_v8_3a): Likewise.
(aarch64_feature_fp_v8_3a): Likewise.
(aarch64_feature_v8_4a): Likewise.
(aarch64_feature_fp_16_v8_2a): Likewise.
(aarch64_feature_v8_5a): Likewise.
(aarch64_feature_v8_6a): Likewise.
(aarch64_feature_v8_7a): Likewise.
(aarch64_feature_v8r): Updated name: v8_r-> v8r.
(ARMV8R): Updated name: V8_R-> V8R.
(ARMV8_2A): A-suffix added.
(ARMV8_3A): Likewise.
(FP_V8_3A): Likewise.
(ARMV8_4A): Likewise.
(FP_F16_V8_2A): Likewise.
(ARMV8_5): Likewise.
(ARMV8_6A): Likewise.
(ARMV8_6A_SVE): Likewise.
(ARMV8_7A): Likewise.
(V8_2A_INSN): `A' added to macro symbol.
(V8_3A_INSN): Likewise.
(V8_4A_INSN): Likewise.
(FP16_V8_2A_INSN): Likewise.
(V8_5A_INSN): Likewise.
(V8_6A_INSN): Likewise.
(V8_7A_INSN): Likewise.
(V8R_INSN): Updated name: V8_R-> V8R.
|
|
There are two instruction formats here:
- SQRSHR, SQRSHRU and UQRSHR, which operate on lists of two
or four registers.
- SQRSHRN, SQRSHRUN and UQRSHRN, which operate on lists of
four registers.
These are the first SME2 instructions to have immediate operands.
The patch makes sure that, when parsing SME2 instructions with
immediate operands, the new predicate-as-counter registers are
parsed as registers rather than as #-less immediates.
|
|
There are two instruction formats here:
- SQCVT, SQCVTU and UQCVT, which operate on lists of two or
four registers.
- SQCVTN, SQCVTUN and UQCVTN, which operate on lists of
four registers.
|
|
The {BF,F,S,U}MLAL and {BF,F,S,U}MLSL instructions share the same
encoding. They are the first instance of a ZA (as opposed to ZA tile)
operand having a range of offsets. As with ZA tiles, the expected
range size is encoded in the operand-specific data field.
|
|
This patch adds the SME2 multi-register forms of F{MAX,MIN}{,NM}
and {S,U}{MAX,MIN}. SQDMULH, SRSHL and URSHL have the same form
as SMAX etc., so the patch adds them too.
|
|
Add support for the SME2 ADD. SUB, FADD and FSUB instructions.
SUB and FSUB have the same form as ADD and FADD, except that
ADD also has a 2-operand accumulating form.
The 64-bit ADD/SUB instructions require FEAT_SME_I16I64 and the
64-bit FADD/FSUB instructions require FEAT_SME_F64F64.
These are the first instructions to have tied register list
operands, as opposed to tied single registers.
The parse_operands change prevents unsuffixed Z registers (width==-1)
from being treated as though they had an Advanced SIMD-style suffix
(.4s etc.). It means that:
Error: expected element type rather than vector type at operand 2 -- `add za\.s\[w8,0\],{z0-z1}'
becomes:
Error: missing type suffix at operand 2 -- `add za\.s\[w8,0\],{z0-z1}'
|
|
SME2 adds lookup table instructions for quantisation. They use
a new lookup table register called ZT0.
LUTI2 takes an unsuffixed SVE vector index of the form Zn[<imm>],
which is the first time that this syntax has been used.
|
|
Implementation-wise, the main things to note here are:
- the WHILE* instructions have forms that return a pair of predicate
registers. This is the first time that we've had lists of predicate
registers, and they wrap around after register 15 rather than after
register 31.
- the predicate-as-counter WHILE* instructions have a fourth operand
that specifies the vector length. We can treat this as an enumeration,
except that immediate values aren't allowed.
- PEXT takes an unsuffixed predicate index of the form PN<n>[<imm>].
This is the first instance of a vector/predicate index having
no suffix.
|
|
SME2 adds LD1 and ST1 variants for lists of 2 and 4 registers.
The registers can be consecutive or strided. In the strided case,
2-register lists have a stride of 8, starting at register x0xxx.
4-register lists have a stride of 4, starting at register x00xx.
The instructions are predicated on a predicate-as-counter register in
the range pn8-pn15. Although we already had register fields with upper
bounds of 7 and 15, this is the first plain register operand to have a
nonzero lower bound. The patch uses the operand-specific data field
to record the minimum value, rather than having separate inserters
and extractors for each lower bound. This in turn required adding
an extra bit to the field.
|
|
SME2 defines new MOVA instructions for moving multiple registers
to and from ZA. As with SME, the instructions are also available
through MOV aliases.
One notable feature of these instructions (and many other SME2
instructions) is that some register lists must start at a multiple
of the list's size. The patch uses the general error "start register
out of range" when this constraint isn't met, rather than an error
specifically about multiples. This ensures that the error is
consistent between these simple consecutive lists and later
strided lists, for which the requirements aren't a simple multiple.
|
|
SME2 adds various new 3-bit immediate fields, so this patch adds
an lsb position suffix to the name of the field that we already have.
|
|
SME2 has instructions that accept strided register lists,
such as { z0.s, z4.s, z8.s, z12.s }. The purpose of this
patch is to extend binutils to support such lists.
The parsing code already had (unused) support for strides of 2.
The idea here is instead to accept all strides during parsing
and reject invalid strides during constraint checking.
The SME2 instructions that accept strided operands also have
non-strided forms. The errors about invalid strides therefore
take a bitmask of acceptable strides, which allows multiple
possibilities to be summed up in a single message.
I've tried to update all code that handles register lists.
|
|
Some FLD_imm* suffixes used a counting scheme such as FLD_immN,
FLD_immN_2, FLD_immN_3, etc., while others used the lsb as the
suffix. The latter seems more mnemonic, and was a big help
in doing the SME2 work.
Similarly, the _10 suffix on FLD_SME_size_10 was nonobvious.
Presumably it indicated a 2-bit field, but it actually starts
in bit 22.
|
|
If an instruction has invalid qualifiers, GAS would report the
error against the final opcode entry that got to the qualifier-
checking stage. It seems better to report the error against
the opcode entry that had the closest match, just like we
pick the closest match within an opcode entry for the
"did you mean this?" message.
This patch adds the number of invalid operands as an
argument to AARCH64_OPDE_INVALID_VARIANT and then picks the
AARCH64_OPDE_INVALID_VARIANT with the lowest argument.
|
|
za_tile_vector is also used for indexing ZA as a whole, rather than
just for indexing tiles. The former is more common than the latter
in SME2, so this patch generalises the name to "indexed_za".
The patch also names the associated structure, so that later patches
can reuse it during parsing.
|
|
This patch makes all SME instructions use F_STRICT, so that qualifiers
have to be provided explicitly rather than being inferred from other
operands. The main change is to move the qualifier setting from the
operand-level decoders to the opcode level.
This is one step towards consolidating the ZA parsing code and
extending it to handle SME2.
|
|
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
|
|
This commit enables disassembler styling for AArch64. After this
commit it is possible to have objdump style AArch64 disassembler
output (using --disassembler-color option). Once the required GDB
patches are merged, GDB will also style the disassembler output.
The changes to support styling are mostly split between two files
opcodes/aarch64-dis.c and opcodes/aarch64-opc.c.
The entry point for the AArch64 disassembler can be found in
aarch64-dis.c, this file handles printing the instruction mnemonics,
and assembler directives (e.g. '.byte', '.word', etc). Some operands,
mostly relating to assembler directives are also printed from this
file. This commit changes all of this to pass through suitable
styling information.
However, for most "normal" instructions, the instruction operands are
printed using a two step process. From aarch64-dis.c, in the
print_operands function, the function aarch64_print_operand is called,
this function is in aarch64-opc.c, and converts an instruction operand
into a string. Then, back in print_operands (aarch64-dis.c), the
operand string is printed.
Unfortunately, the string returned by aarch64_print_operand can be
quite complex, it will include syntax elements, like '[' and ']', in
addition to register names and immediate values. In some cases, a
single operand will expand into what will appear (to the user) as
multiple operands separated with a ','.
This makes the task of styling more complex, all these different
components need to by styled differently, so we need to get the
styling information out of aarch64_print_operand in some way.
The solution that I propose here is similar to the solution that I
used for the i386 disassembler.
Currently, aarch64_print_operand uses snprintf to write the operand
text into a buffer provided by the caller.
What I propose is that we pass an extra argument to the
aarch64_print_operand function, this argument will be a structure, the
structure contains a callback function and some state.
When aarch64_print_operand needs to format part of its output this can
be done by using the callback function within the new structure, this
callback returns a string with special embedded markers that indicate
which mode should be used for each piece of text. Back in
aarch64-dis.c we can spot these special style markers and use this to
split the disassembler output up and apply the correct style to each
piece.
To make aarch64-opc.c clearer a series of new static functions have
been added, e.g. 'style_reg', 'style_imm', etc. Each of these
functions formats a piece of text in a different style, 'register' and
'immediate' in this case.
Here's an example taken from aarch64-opc.c of the new functions in
use:
snprintf (buf, size, "[%s, %s]!",
style_reg (styler, base),
style_imm (styler, "#%d", opnd->addr.offset.imm));
The aarch64_print_operand function is also called from the assembler
to aid in printing diagnostic messages. Right now I have no plans to
add styling to the assembler output, and so, the callback function
used in the assembler ignores the styling information and just returns
an plain string.
I've used the source files in gas/testsuite/gas/aarch64/ for testing,
and have manually gone through and checked that the styling looks
reasonable, however, I'm not an AArch64 expert, so it is possible that
the odd piece is styled incorrectly. Please point out any mistakes
I've made.
With objdump disassembler color turned off, there should be no change
in the output after this commit.
|
|
The function aarch64_print_operand (aarch64-opc.c) is responsible for
converting an instruction operand into the textual representation of
that operand.
In some cases, a comment is included in the operand representation,
though this (currently) only happens for the last operand of the
instruction.
In a future commit I would like to enable the new libopcodes styling
for AArch64, this will allow objdump and GDB[1] to syntax highlight
the disassembler output, however, having operands and comments
combined in a single string like this makes such styling harder.
In this commit, I propose to extend aarch64_print_operand to take a
second buffer. Any comments for the instruction are written into this
extra buffer. The two callers of aarch64_print_operand are then
updated to pass an extra buffer, and print any resulting comment.
In this commit no styling is added, that will come later. However, I
have adjusted the output slightly. Before this commit some comments
would be separated from the instruction operands with a tab character,
while in other cases the comment was separated with two single spaces.
After this commit I use a single tab character in all cases. This
means a few test cases needed updated. If people would prefer me to
move everyone to use the two spaces, then just let me know. Or maybe
there was a good reason why we used a mix of styles, I could probably
figure out a way to maintain the old output exactly if that is
critical.
Other than that, there should be no user visible changes after this
commit.
[1] GDB patches have not been merged yet, but have been posted to the
GDB mailing list:
https://sourceware.org/pipermail/gdb-patches/2022-June/190142.html
|
|
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.
The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bfp opcodes files.
|
|
AARCH64_OPDE_EXPECTED_A_AFTER_B and AARCH64_OPDE_A_SHOULD_FOLLOW_B
are not paired with an error string, but we had an assert that the
error was nonnull. Previously this assert was testing uninitialised
memory and so could pass or fail arbitrarily.
opcodes/
* aarch64-opc.c (verify_mops_pme_sequence): Initialize the error
field to null for AARCH64_OPDE_EXPECTED_A_AFTER_B and
AARCH64_OPDE_A_SHOULD_FOLLOW_B.
* aarch64-dis.c (print_verifier_notes): Move assert.
|
|
The MOPS instructions should be used as a triple, such as:
cpyfp [x0]!, [x1]!, x2!
cpyfm [x0]!, [x1]!, x2!
cpyfe [x0]!, [x1]!, x2!
The registers should also be the same for each writeback operand.
This patch adds a warning for code that doesn't follow this rule,
along similar lines to the warning that we already emit for
invalid uses of MOVPRFX.
include/
* opcode/aarch64.h (C_SCAN_MOPS_P, C_SCAN_MOPS_M, C_SCAN_MOPS_E)
(C_SCAN_MOPS_PME): New macros.
(AARCH64_OPDE_A_SHOULD_FOLLOW_B): New aarch64_operand_error_kind.
(AARCH64_OPDE_EXPECTED_A_AFTER_B): Likewise.
(aarch64_operand_error): Make each data value a union between
an int and a string.
opcodes/
* aarch64-tbl.h (MOPS_CPY_OP1_OP2_INSN): Add scan flags.
(MOPS_SET_OP1_OP2_INSN): Likewise.
* aarch64-opc.c (set_out_of_range_error): Update after change to
aarch64_operand_error.
(set_unaligned_error, set_reg_list_error): Likewise.
(init_insn_sequence): Use a 3-instruction sequence for
MOPS P instructions.
(verify_mops_pme_sequence): New function.
(verify_constraints): Call it.
* aarch64-dis.c (print_verifier_notes): Handle
AARCH64_OPDE_A_SHOULD_FOLLOW_B and AARCH64_OPDE_EXPECTED_A_AFTER_B.
gas/
* config/tc-aarch64.c (operand_mismatch_kind_names): Add entries
for AARCH64_OPDE_A_SHOULD_FOLLOW_B and AARCH64_OPDE_EXPECTED_A_AFTER_B.
(operand_error_higher_severity_p): Check that
AARCH64_OPDE_A_SHOULD_FOLLOW_B and AARCH64_OPDE_EXPECTED_A_AFTER_B
come between AARCH64_OPDE_RECOVERABLE and AARCH64_OPDE_SYNTAX_ERROR;
their relative order is not significant.
(record_operand_error_with_data): Update after change to
aarch64_operand_error.
(output_operand_error_record): Likewise. Handle
AARCH64_OPDE_A_SHOULD_FOLLOW_B and AARCH64_OPDE_EXPECTED_A_AFTER_B.
* testsuite/gas/aarch64/mops_invalid_2.s,
testsuite/gas/aarch64/mops_invalid_2.d,
testsuite/gas/aarch64/mops_invalid_2.l: New test.
|
|
This patch adds support for FEAT_MOPS, an Armv8.8-A extension
that provides memcpy and memset acceleration instructions.
I took the perhaps controversial decision to generate the individual
instruction forms using macros rather than list them out individually.
This becomes useful with a follow-on patch to check that code follows
the correct P/M/E sequence.
[https://developer.arm.com/documentation/ddi0596/2021-09/Base-Instructions?lang=en]
include/
* opcode/aarch64.h (AARCH64_FEATURE_MOPS): New macro.
(AARCH64_ARCH_V8_8): Make armv8.8-a imply AARCH64_FEATURE_MOPS.
(AARCH64_OPND_MOPS_ADDR_Rd): New aarch64_opnd.
(AARCH64_OPND_MOPS_ADDR_Rs): Likewise.
(AARCH64_OPND_MOPS_WB_Rn): Likewise.
opcodes/
* aarch64-asm.h (ins_x0_to_x30): New inserter.
* aarch64-asm.c (aarch64_ins_x0_to_x30): New function.
* aarch64-dis.h (ext_x0_to_x30): New extractor.
* aarch64-dis.c (aarch64_ext_x0_to_x30): New function.
* aarch64-tbl.h (aarch64_feature_mops): New feature set.
(aarch64_feature_mops_memtag): Likewise.
(MOPS, MOPS_MEMTAG, MOPS_INSN, MOPS_MEMTAG_INSN)
(MOPS_CPY_OP1_OP2_PME_INSN, MOPS_CPY_OP1_OP2_INSN, MOPS_CPY_OP1_INSN)
(MOPS_CPY_INSN, MOPS_SET_OP1_OP2_PME_INSN, MOPS_SET_OP1_OP2_INSN)
(MOPS_SET_INSN): New macros.
(aarch64_opcode_table): Add MOPS instructions.
(aarch64_opcode_table): Add entries for AARCH64_OPND_MOPS_ADDR_Rd,
AARCH64_OPND_MOPS_ADDR_Rs and AARCH64_OPND_MOPS_WB_Rn.
* aarch64-opc.c (aarch64_print_operand): Handle
AARCH64_OPND_MOPS_ADDR_Rd, AARCH64_OPND_MOPS_ADDR_Rs and
AARCH64_OPND_MOPS_WB_Rn.
(verify_three_different_regs): New function.
* aarch64-asm-2.c: Regenerate.
* aarch64-dis-2.c: Likewise.
* aarch64-opc-2.c: Likewise.
gas/
* doc/c-aarch64.texi: Document +mops.
* config/tc-aarch64.c (parse_x0_to_x30): New function.
(parse_operands): Handle AARCH64_OPND_MOPS_ADDR_Rd,
AARCH64_OPND_MOPS_ADDR_Rs and AARCH64_OPND_MOPS_WB_Rn.
(aarch64_features): Add "mops".
* testsuite/gas/aarch64/mops.s, testsuite/gas/aarch64/mops.d: New test.
* testsuite/gas/aarch64/mops_invalid.s,
* testsuite/gas/aarch64/mops_invalid.d,
* testsuite/gas/aarch64/mops_invalid.l: Likewise.
|
|
disabled.
PR 28614
* aarch64-asm.c: Replace assert(0) with real code.
* aarch64-dis.c: Likewise.
* aarch64-opc.c: Likewise.
|
|
This patch is adding new SVE2 instructions added to support SME extension.
The following SVE2 instructions are added by the SME architecture:
* PSEL,
* REVD, SCLAMP and UCLAMP.
gas/ChangeLog:
* config/tc-aarch64.c (parse_sme_pred_reg_with_index):
New parser.
(parse_operands): New parser.
* testsuite/gas/aarch64/sme-9-illegal.d: New test.
* testsuite/gas/aarch64/sme-9-illegal.l: New test.
* testsuite/gas/aarch64/sme-9-illegal.s: New test.
* testsuite/gas/aarch64/sme-9.d: New test.
* testsuite/gas/aarch64/sme-9.s: New test.
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_opnd): New operand
AARCH64_OPND_SME_PnT_Wm_imm.
opcodes/ChangeLog:
* aarch64-asm.c (aarch64_ins_sme_pred_reg_with_index):
New inserter.
* aarch64-dis.c (aarch64_ext_sme_pred_reg_with_index):
New extractor.
* aarch64-opc.c (aarch64_print_operand): Printout of
OPND_SME_PnT_Wm_imm.
* aarch64-opc.h (enum aarch64_field_kind): New bitfields
FLD_SME_Rm, FLD_SME_i1, FLD_SME_tszh, FLD_SME_tszl.
* aarch64-tbl.h (OP_SVE_NN_BHSD): New qualifier.
(OP_SVE_QMQ): New qualifier.
(struct aarch64_opcode): New instructions PSEL, REVD,
SCLAMP and UCLAMP.
aarch64-asm-2.c: Regenerate.
aarch64-dis-2.c: Regenerate.
aarch64-opc-2.c: Regenerate.
|
|
This patch is adding new SME mode selection and state access instructions:
* Add SMSTART and SMSTOP instructions.
* Add SVCR system register.
gas/ChangeLog:
* config/tc-aarch64.c (parse_sme_sm_za): New parser.
(parse_operands): New parser.
* testsuite/gas/aarch64/sme-8-illegal.d: New test.
* testsuite/gas/aarch64/sme-8-illegal.l: New test.
* testsuite/gas/aarch64/sme-8-illegal.s: New test.
* testsuite/gas/aarch64/sme-8.d: New test.
* testsuite/gas/aarch64/sme-8.s: New test.
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_opnd): New operand
AARCH64_OPND_SME_SM_ZA.
(enum aarch64_insn_class): New instruction classes
sme_start and sme_stop.
opcodes/ChangeLog:
* aarch64-asm.c (aarch64_ins_pstatefield): New inserter.
(aarch64_ins_sme_sm_za): New inserter.
* aarch64-dis.c (aarch64_ext_imm): New extractor.
(aarch64_ext_pstatefield): New extractor.
(aarch64_ext_sme_sm_za): New extractor.
* aarch64-opc.c (operand_general_constraint_met_p):
New pstatefield value for SME instructions.
(aarch64_print_operand): Printout for OPND_SME_SM_ZA.
(SR_SME): New register SVCR.
* aarch64-opc.h (F_REG_IN_CRM): New register endcoding.
* aarch64-opc.h (F_IMM_IN_CRM): New immediate endcoding.
(PSTATE_ENCODE_CRM): Encode CRm field.
(PSTATE_DECODE_CRM): Decode CRm field.
(PSTATE_ENCODE_CRM_IMM): Encode CRm immediate field.
(PSTATE_DECODE_CRM_IMM): Decode CRm immediate field.
(PSTATE_ENCODE_CRM_AND_IMM): Encode CRm and immediate
field.
* aarch64-tbl.h (struct aarch64_opcode): New SMSTART
and SMSTOP instructions.
aarch64-asm-2.c: Regenerate.
aarch64-dis-2.c: Regenerate.
aarch64-opc-2.c: Regenerate.
|
|
This patch is adding new loads and stores defined by SME instructions.
gas/ChangeLog:
* config/tc-aarch64.c (parse_sme_address): New parser.
(parse_sme_za_hv_tiles_operand_with_braces): New parser.
(parse_sme_za_array): New parser.
(output_operand_error_record): Print error details if
present.
(parse_operands): Support new operands.
* testsuite/gas/aarch64/sme-5-illegal.d: New test.
* testsuite/gas/aarch64/sme-5-illegal.l: New test.
* testsuite/gas/aarch64/sme-5-illegal.s: New test.
* testsuite/gas/aarch64/sme-5.d: New test.
* testsuite/gas/aarch64/sme-5.s: New test.
* testsuite/gas/aarch64/sme-6-illegal.d: New test.
* testsuite/gas/aarch64/sme-6-illegal.l: New test.
* testsuite/gas/aarch64/sme-6-illegal.s: New test.
* testsuite/gas/aarch64/sme-6.d: New test.
* testsuite/gas/aarch64/sme-6.s: New test.
* testsuite/gas/aarch64/sme-7-illegal.d: New test.
* testsuite/gas/aarch64/sme-7-illegal.l: New test.
* testsuite/gas/aarch64/sme-7-illegal.s: New test.
* testsuite/gas/aarch64/sme-7.d: New test.
* testsuite/gas/aarch64/sme-7.s: New test.
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_opnd): New operands.
(enum aarch64_insn_class): Added sme_ldr and sme_str.
(AARCH64_OPDE_UNTIED_IMMS): New operand error kind.
opcodes/ChangeLog:
* aarch64-asm.c (aarch64_ins_sme_za_hv_tiles): New inserter.
(aarch64_ins_sme_za_list): New inserter.
(aarch64_ins_sme_za_array): New inserter.
(aarch64_ins_sme_addr_ri_u4xvl): New inserter.
* aarch64-asm.h (AARCH64_DECL_OPD_INSERTER): Added
ins_sme_za_list, ins_sme_za_array and ins_sme_addr_ri_u4xvl.
* aarch64-dis.c (aarch64_ext_sme_za_hv_tiles): New extractor.
(aarch64_ext_sme_za_list): New extractor.
(aarch64_ext_sme_za_array): New extractor.
(aarch64_ext_sme_addr_ri_u4xvl): New extractor.
* aarch64-dis.h (AARCH64_DECL_OPD_EXTRACTOR): Added
ext_sme_za_list, ext_sme_za_array and ext_sme_addr_ri_u4xvl.
* aarch64-opc.c (operand_general_constraint_met_p):
(aarch64_match_operands_constraint): Handle sme_ldr, sme_str
and sme_misc.
(aarch64_print_operand): New operands supported.
* aarch64-tbl.h (OP_SVE_QUU): New qualifier.
(OP_SVE_QZU): New qualifier.
aarch64-asm-2.c: Regenerate.
aarch64-dis-2.c: Regenerate.
aarch64-opc-2.c: Regenerate.
|
|
This patch is adding new MOV (alias) and MOVA SME instruction.
gas/ChangeLog:
* config/tc-aarch64.c (enum sme_hv_slice): new enum.
(struct reloc_entry): Added ZAH and ZAV registers.
(parse_sme_immediate): Immediate parser.
(parse_sme_za_hv_tiles_operand): ZA tile parser.
(parse_sme_za_hv_tiles_operand_index): Index parser.
(parse_operands): Added ZA tile parser calls.
(REGNUMS): New macro. Regs with suffix.
(REGSET16S): New macro. 16 regs with suffix.
* testsuite/gas/aarch64/sme-2-illegal.d: New test.
* testsuite/gas/aarch64/sme-2-illegal.l: New test.
* testsuite/gas/aarch64/sme-2-illegal.s: New test.
* testsuite/gas/aarch64/sme-2.d: New test.
* testsuite/gas/aarch64/sme-2.s: New test.
* testsuite/gas/aarch64/sme-2a.d: New test.
* testsuite/gas/aarch64/sme-2a.s: New test.
* testsuite/gas/aarch64/sme-3-illegal.d: New test.
* testsuite/gas/aarch64/sme-3-illegal.l: New test.
* testsuite/gas/aarch64/sme-3-illegal.s: New test.
* testsuite/gas/aarch64/sme-3.d: New test.
* testsuite/gas/aarch64/sme-3.s: New test.
* testsuite/gas/aarch64/sme-3a.d: New test.
* testsuite/gas/aarch64/sme-3a.s: New test.
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_opnd): New enums
AARCH64_OPND_SME_ZA_HV_idx_src and
AARCH64_OPND_SME_ZA_HV_idx_dest.
(struct aarch64_opnd_info): New ZA tile vector struct.
opcodes/ChangeLog:
* aarch64-asm.c (aarch64_ins_sme_za_hv_tiles):
New inserter.
* aarch64-asm.h (AARCH64_DECL_OPD_INSERTER):
New inserter ins_sme_za_hv_tiles.
* aarch64-dis.c (aarch64_ext_sme_za_hv_tiles):
New extractor.
* aarch64-dis.h (AARCH64_DECL_OPD_EXTRACTOR):
New extractor ext_sme_za_hv_tiles.
* aarch64-opc.c (aarch64_print_operand):
Handle SME_ZA_HV_idx_src and SME_ZA_HV_idx_dest.
* aarch64-opc.h (enum aarch64_field_kind): New enums
FLD_SME_size_10, FLD_SME_Q, FLD_SME_V and FLD_SME_Rv.
(struct aarch64_operand): Increase fields size to 5.
* aarch64-tbl.h (OP_SME_BHSDQ_PM_BHSDQ): New qualifiers
aarch64-asm-2.c: Regenerate.
aarch64-dis-2.c: Regenerate.
aarch64-opc-2.c: Regenerate.
|