aboutsummaryrefslogtreecommitdiff
path: root/include/opcode/aarch64.h
AgeCommit message (Collapse)AuthorFilesLines
3 daysaarch64: Add support for --march=armv9.6-aAlice Carlotti1-0/+10
3 daysaarch64: Refactor exclusion of reg names in immediatesAlice Carlotti1-1/+18
When parsing immediate values, register names should not be misinterpreted as symbols. However, for backwards compatibility we need to permit some newer register names within older instructions. The current mechanism for doing so depends on the list of explicit architecture requirements for the instructions, which is fragile and easy to forget, and grows increasingly messy as more architecture features are added. This patch add explicit flags to each opcode to indicate which set of register names is disallowed in each instance. These flags are mandatory for all opcodes with immediate operands, which ensures that the choice of disallowed names will always be deliberate and explicit. This patch should have no functional change.
4 daysaarch64: Support for FEAT_SVE_AES2Ezra Sitorus1-0/+9
FEAT_SVE_AES2 implements the SVE multi-vector Advanced Encryption Standard and 128-bit destination element polynomial multiply long instructions, when the PE is not in Streaming SVE mode.
4 daysaarch64: Support for FEAT_LSUIEzra Sitorus1-0/+2
FEAT_LSUI introduces unprivileged variants of load and store instructions so that clearing PSTATE.PAN is never required in privileged software.
4 daysaarch64: Support for FEAT_PCDPHINTEzra Sitorus1-0/+3
FEAT_PCDPHINT - Producer-consumer data placement hints - is an optional ISA extension that provides hint instructions to indicate: - a store in the current execution thread is generating data at a specific location, which a thread of execution on one or more other observers is waiting on. - the thread of execution on the current PE will read a location that may not yet have been written with the value to be consumed. This extension introduces: - STSHH, a hint instruction, with operands (policies) keep and strm - PRFM *IR*, a new prefetch memory operand.
6 daysaarch64: Add support for FEAT_SVE2p2 and FEAT_SME2p2Alice Carlotti1-1/+12
2025-06-25aarch64: Add supports for FEAT_PoPS feature and DC instructions.Srinath Parvathaneni1-0/+2
This patch add support for FEAT_PoPS feature which can be enabled through +pops command line flag. This patch also adds support for following DC instructions and the spec can be found here [1]. 1. "dc cigdvaps" enabled on passing +memtag+pops command line flags. 2. "dc civaps" enabled on passing +pops command line flag. [1]: https://developer.arm.com/documentation/ddi0601/2025-03/AArch64-Instructions?lang=en
2025-06-19aarch64: Support for FEAT_LSFEEzra Sitorus1-2/+7
FEAT_LSFE - Large System Float Extension - implements A64 base atomic floating-point in-memory instructions.
2025-06-19aarch64: Support for FEAT_SVE_F16F32MM, FEAT_F8F16M, FEAT_F8F32MMEzra Sitorus1-0/+6
FEAT_SVE_F16F32MM introduces the SVE half-precision floating-point matrix multiply-accumulate to single-precision instruction. FEAT_F8F32MM introduces the Advanced SIMD 8-bit floating-point matrix multiply-accumulate to single-precision instruction. FEAT_F8F16MM introduces the Advanced SIMD 8-bit floating-point matrix multiply-accumulate to half-precision instruction.
2025-06-19aarch64: Support for FEAT_CMPBREzra Sitorus1-0/+5
FEAT_CMPBR - Compare and branch instructions. This patch adds these instructions: - CB<CC> (register) - CB<CC> (immediate) - CBH<CC> - CBB<CC> where CC is one of the following: - EQ - NE - GT - GE - LT - LE - HI - HS - LO - LS
2025-06-19aarch64: Add occmo flag for FEAT_OCCMOEzra Sitorus1-0/+2
FEAT_OCCMO support was introduced, but the feature flags were missing. This patch adds these flags, as well as splitting up the tests to test occmo vs occmo+memtag operands.
2025-06-19aarch64: Support for FEAT_SVE_BFSCALEEzra Sitorus1-0/+5
FEAT_SVE_BFSCALE introduces the SVE BFSCALE instruction, when the PE is not in Streaming SVE mode. If FEAT_SME2 is implemented, FEAT_SVE_BFSCALE also introduces SME multi-vector Z-targeting BFloat16 scaling instructions, BFSCALE and BFMUL.
2025-06-12aarch64: Add support for FEAT_FPRCVTRichard Ball1-0/+4
FEAT_FPRCVT introduces new versions of previous instructions. The instructions are used to convert between floating points and Integers. These new versions take as operands SIMD&FP registers for both the source and destination register. FEAT_FPRCVT also enables the use of some existing AdvSIMD instructions in streaming mode. However, no changes are needed in gas to support this.
2025-06-11aarch64: Add definitions for missing architecture bitsYury Khrustalev1-4/+16
Complete macros for feature bits for v9.1-A, v9.2-A, v9.3-A, and v9.4-A.
2025-06-09aarch64: Increase the number of feature words to 3Richard Earnshaw1-1/+2
Now that most of the effort of updating the number of feature words is handled by macros, add an additional one, taking the number of supported features to 192.
2025-06-09aarch64: use macro trickery to automate feature array size replicationRichard Earnshaw1-36/+87
There are quite a few macros that need to be changed when we need to increase the number of words in the features data structure. With some macro trickery we can automate most of this so that a single macro needs to be updated. With C2X we could probably do even better by using recursion, but this is still a much better situation than we had previously. A static assertion is used to ensure that there is always enough space in the flags macro for the number of feature bits we need to support.
2025-06-09aarch64: Fix typos in opcode headersYury Khrustalev1-4/+4
2025-05-09aarch64: Eliminate AARCH64_OPND_SVE_ADDR_RAlice Carlotti1-6/+10
Adjust parsing for AARCH64_OPND_SVE_ADDR_RR{_LSL*} operands to accept implicit XZR offsets. Add new AARCH64_OPND_SVE_ADDR_RM{_LSL*} operands to support instructions where an XZR offset is allowed but must be specified explicitly. This allows the removal of the duplicate opcode table entries using AARCH64_OPND_SVE_ADDR_R.
2025-01-17aarch64: Fix sve2p1 gating and add missing instructionsAndrew Carlotti1-2/+8
Many FEAT_SVE2p1 instructions need to be enabled by either of two different features (one for streaming mode, and one for non-streaming mode). This patch adds correct gating conditions for these instructions. There were also a few sve2p1 instructions missing altogether, so add those as well. The testsuite is modified to check for all alternative enablement conditions. In many cases this is done by adding an alternative assembler commands to existing test files. For some SME/SME2 tests, only some of the instructions are enabled by +sve2p1, so these are copied into a separate test. For original SVE2p1 tests, the non-SME2p1 instructions have been moved to a separate test file. There are also new tests for the newly added instructions. These include a couple of fixme comments relating to bad error reporting, which should be investigated later.
2025-01-10aarch64: Add support for FEAT_SME_B16B16 feature.Srinath Parvathaneni1-0/+2
This patch adds support for SME ZA-targeting non-widening BFloat16 instructions, under tick FEAT_SME_B16B16 and command line flag "+sme-b16b16". FEAT_SME_B16B16 implements FEAT_SME2 and FEAT_SVE_B16B16, in accordance with that "+sme-b16b16" enables "+sme2" and "+sve-b16b16". Also the test files related to FEAT_SME_B16B16 are prefixed with sme-b16b16*. eg: sme-b16b16-1.s, sme-b16b16-1.d. The spec for this feature and instructions is availabe here [1]: [1]: https://developer.arm.com/documentation/ddi0602/2024-06/SME-Instructions?lang=en
2025-01-10aarch64: Add support for FEAT_SVE_B16B16 feature.Srinath Parvathaneni1-2/+2
In the current code, SVE2 Bfloat16 instructions are implemented with tick FEAT_B16B16 and command line flag "+b16b16" and this feature was suspended due to incomplete support. In the new spec available here[1], FEAT_B16B16 is replaced with FEAT_SVE_B16B16 and command line flag "+b16b16" is replace with "sve-b16b16". Also the test files related to FEAT_SVE_B16B16 are prefixed with sve-b16b16*. eg: sve-b16b16-sve2-1.s, sve-b16b16-sve2-1.d. This patch supports the SVE Z-targeting non-widening BFloat16 instructions with command line flag "+sve-b16b16+sve2". [1]: https://developer.arm.com/documentation/ddi0602/2024-06/SVE-Instructions?lang=en
2025-01-10aarch64: Rename AARCH64_OPND_SME_ZT0_INDEX2_12Andrew Carlotti1-1/+1
Rename to AARCH64_OPND_SME_ZT0_INDEX_MUL_VL.
2025-01-10aarch64: Remove redundant sme-lutv2 qualifiers and operandsAndrew Carlotti1-1/+0
2025-01-10aarch64: Add support for FEAT_SME_F16F16 feature.Srinath Parvathaneni1-0/+2
This patch adds support for FEAT_SME_F16F16 feature (Non-widening half-precision FP16 to FP16 arithmetic for SME2), which is enabled using command line flags +sme-f16f16 to -march (which enables both FEAT_SME2 and FEAT_SME_F16F16). There are couple of instructions (fadd and fsub variants) which should be allowed by the assembler on either passing +sme-f16f16 or +sme-f8f16. Those instructions are already supported in the current assembler, this patch adds tests for those instructions as well.
2025-01-01Update year range in copyright notice of binutils filesAlan Modra1-1/+1
2024-11-08aarch64: improve debuggability on array of enumMatthieu Longo1-3/+3
The current space optmization on enum aarch64_opn_qualifier forced its encoding using an unsigned char. This "hard-coded" optimization has the bad consequence of making the array of such enums being completely unreadable when debugging with GDB because the enum type is lost along the way. Keeping this space optimization, and the enum type as well, is possible when the declaration of the enum is tagged with attribute((packed)). attribute((packed)) is a GNU extension, and is wrapped in the macro ATTRIBUTE_PACKED (defined in ansidecl.h), and should be used instead.
2024-11-08aarch64: change returned type to bool to match semantic of functionsMatthieu Longo1-1/+1
2024-11-08aarch64: make comment clearer about the locationMatthieu Longo1-1/+2
The enum aarch64_opnd_qualifiers in include/opcode/aarch64.h needs to stay in sync with the array of struct operand_qualifier_data which defines various properties for the different type of operands. For instance, for: - registers: the size of the register, the number of elements. - immediates: lower and upper bits to determine the range of values.
2024-07-18opcodes: aarch64: enforce checks on subclass flags in aarch64-gen.cIndu Bhagat1-1/+2
Enforce some checks on the newly added subclass flags: - If a subclass is set of one insn of an iclass, every insn of that iclass must have non-zero subclass field. - For all other iclasses, the subclass bits are zero for all insns. include/ * opcode/aarch64.h (enum aarch64_insn_class): Identify the maximum iclass enum value. opcodes/ * aarch64-gen.c (iclass_has_subclasses_p): New array of bool. (read_table): Enforce checks on subclass flags.
2024-07-18include: opcodes: aarch64: define new subclassesIndu Bhagat1-1/+31
The existing iclass information tells us the general shape and purpose of the instructions. In some cases, however, we need to further disect the iclass on the basis of other finer-grain information. E.g., for the purpose of SCFI, we need to know whether a given insn with iclass of ldst_* is a load or a store. Similarly, whether a particular arithmetic insn is an add or sub or mov, etc. This patch defines new flags to demarcate the insns. Also provide an access function for subclass lookup. Later, we will enforce (in aarch64-gen.c) that if an iclass has at least one instruction with a non-zero subclass, all instructions of the iclass must have a non-zero subclass information. If none of the defined subclasses are applicable (or not required for SCFI purposes), F_SUBCLASS_OTHER can be used for such instructions. include/ * opcode/aarch64.h (F_SUBCLASS): New flag. (F_SUBCLASS_OTHER): Likewise. (F_LDST_LOAD): Likewise. (F_LDST_STORE): Likewise. (F_ARITH_ADD): Likewise. (F_ARITH_SUB): Likewise. (F_ARITH_MOV): Likewise. (F_BRANCH_CALL): Likewise. (F_BRANCH_RET): Likewise. (F_DP_TAG_ONLY): Likewise. (aarch64_opcode_subclass_p): New definition.
2024-07-12aarch64: Add support for sme2.1 zero instructions.Srinath Parvathaneni1-1/+10
This patch adds support for following sme2.1 zero instructions and the spec is available here [1]. 1. ZERO (single-vector). 2. ZERO (double-vector). 3. ZERO (quad-vector). The VECTOR GROUP symbols VGx2 and VGx4 are optional for the assembler for most of the sme and sve instructions. But for few of the sme2.1 zero instruction variants VECTOR GROUP symbols VGx2 and VGx4 are mandatory. To address this a bit "F_VG_REQ" is introduced in this patch, on setting F_VG_REQ bit in flags of aarch64_opcode forces the assembler to accept instruction operand only having VECTOR GROUP symbols. [1]: https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions?lang=en
2024-07-12aarch64: Add support for sme2.1 movaz instructions.Srinath Parvathaneni1-0/+1
This patch adds support for following sme2.1 movaz instructions and the spec is available here [1]. 1. MOVAZ (array to vector, two registers). 2. MOVAZ (array to vector, four registers). 3. MOVAZ (tile to vector, single). [1]: https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions?lang=en
2024-07-12aarch64: Add support for sme2.1 luti2 and luti4 instructions.Srinath Parvathaneni1-0/+1
This patch adds support for following sme2.1 luti2 and luti4 instructions, spec is available here [1] 1. LUTI2 (two registers) strided. 2. LUTI2 (four registers) strided. 3. LUTI4 (two registers) strided. 4. LUTI4 (four registers) strided. [1]: https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions?lang=en
2024-07-08aarch64: Add support for sve2p1 pmov instruction.srinath1-0/+8
This patch adds support for followign SVE2p1 instruction, spec is available here [1]. 1. PMOV (to vector) 2. PMOV (to predicate) Both pmov (to vector) and pmov (to predicate) have destination scalable vector register and source scalable vector register respectively as an operand with no suffix and optional index. To handle this case we have added 8 new operands in this patch. AARCH64_OPND_SVE_Zn0_INDEX, /* Zn[index], bits [9:5]. */ AARCH64_OPND_SVE_Zn1_17_INDEX, /* Zn[index], bits [9:5,17]. */ AARCH64_OPND_SVE_Zn2_18_INDEX, /* Zn[index], bits [9:5,18:17]. */ AARCH64_OPND_SVE_Zn3_22_INDEX, /* Zn[index], bits [9:5,18:17,22]. */ AARCH64_OPND_SVE_Zd0_INDEX, /* Zn[index], bits [4:0]. */ AARCH64_OPND_SVE_Zd1_17_INDEX, /* Zn[index], bits [4:0,17]. */ AARCH64_OPND_SVE_Zd2_18_INDEX, /* Zn[index], bits [4:0,18:17]. */ AARCH64_OPND_SVE_Zd3_22_INDEX, /* Zn[index], bits [4:0,18:17,22]. */ Since the index of the <Zd> operand is optional, the index part is dropped in disassembly in both the cases of "no index" or "zero index". As per spec: PMOV <Zd>{[<imm>]}, <Pn>.D PMOV <Pn>.D, <Zd>{[<imm>]} Example1: Assembly: pmov z5[0], p6.d Disassembly: pmov z5, p6.d Assembly: pmov z5, p6.d Disassembly: pmov z5, p6.d Example2: Assembly: pmov p4.b, z5[0] Disassembly: pmov p4.b, z5 Assembly: pmov p4.b, z5 Disassembly: pmov p4.b, z5 [1]: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions?lang=en
2024-07-05aarch64: add STEP2 feature and its associated registersMatthieu Longo1-0/+3
AArch64 defines new registers for the feature step2 (Enhanced Software Step Extension). step2 is an Armv9.5-A feature. This patch also adds relevant tests. Regression tested on aarch64-none-elf, and no regression found.
2024-07-05aarch64: add SPMU2 feature and its associated registersMatthieu Longo1-0/+3
AArch64 defines new registers for the feature spmu2 (System Performance Monitors Extension version 2). spmu2 is an Armv9.5-A feature. This patch also adds relevant tests. Regression tested on aarch64-none-elf, and no regression found.
2024-07-05aarch64: add E3DSE feature and its associated registersMatthieu Longo1-1/+5
AArch64 defines new registers for the feature e3dse (Delegated SError exceptions for EL3): vdisr_el3 and vdisr_el3. e3dse is an Armv9.5-A feature. This patch also adds relevant tests. Regression tested on aarch64-none-elf, and no regression found.
2024-06-28aarch64: Add support for Armv9.5-A architectureClaudio Bantaloukas1-0/+8
The new -march=armv9.5-a flag enables access to the mandatory cpa, lut and faminmax extensions. Existing test cases for features are extended to verify they work without additional flags.
2024-06-25aarch64: Fix sve2p1 ld[1-4]/st[1-4]q instruction operands.Srinath Parvathaneni1-3/+1
This patch fixes encoding and syntax for sve2p1 instructions ld[1-4]q/st[1-4]q as mentioned below, for the issues reported here. https://sourceware.org/pipermail/binutils/2024-February/132408.html 1) Previously all the ld[1-4]q/st[1-4]q instructions are wrongly added as predicated instructions and this issue is fixed in this patch by replacing "SVE2p1_INSNC" with "SVE2p1_INSN" macro. 2) Wrong first operand in all the ld[1-4]q/st[1-4]q instructions is fixed by replacing "SVE_Zt" with "SVE_ZtxN". 3) Wrong operand qualifiers in ld1q and st1q instructions are also fixed in this patch. 4) In ld1q/st1q the index in the second argument is optional and if index is xzr and is skipped in the assembly, the index field is ignored by the disassembler. Fixing above mentioned issues helps with following: 1) ld1q and st1q first register operand accepts enclosed figure braces. 2) ld2q, ld3q, ld4q, st2q, st3q, and st4q instructions accepts wrapping sequence of vector registers. For the instructions ld[2-4]q/st[2-4]q, tests for wrapping sequence of vector registers are added along with short-form of operands for non-wrapping sequence. I have added test using following logic: ld2q {Z0.Q, Z1.Q}, p0/Z, [x0, #0, MUL VL] //raw insn encoding (all zeroes) ld2q {Z31.Q, Z0.Q}, p0/Z, [x0, #0, MUL VL] // encoding of <Zt1> ld2q {Z0.Q, Z1.Q}, p7/Z, [x0, #0, MUL VL] // encoding of <Pg> ld2q {Z0.Q, Z1.Q}, p0/Z, [x30, #0, MUL VL] // encoding of <Xm> ld2q {Z0.Q, Z1.Q}, p0/Z, [x0, #-16, MUL VL] // encoding of <imm> (low value) ld2q {Z0.Q, Z1.Q}, p0/Z, [x0, #14, MUL VL] // encoding of <imm> (high value) ld2q {Z31.Q, Z0.Q}, p7/Z, [x30, #-16, MUL VL] // encoding of all fields (all ones) ld2q {Z30.Q, Z31.Q}, p1/Z, [x3, #-2, MUL VL] // random encoding. For all the above form of instructions the hyphenated form is preferred for disassembly if there are more than two registers in the list, and the register numbers are monotonically increasing in increments of one.
2024-06-25aarch64: Fix sve2p1 extq instruction operands.Srinath Parvathaneni1-1/+1
This patch fixes the syntax of sve2p1 "extq" instruction by modifying the operands count to 4. A new operand AARCH64_OPND_SVE_UIMM4 is defined to handle the 4th argument an 4-bit unsigned immediate of extq instruction. The instruction encoding is updated to use constraint C_SCAN_MOVPRFX, to enable "extq" instruction to immediately precede in program order by a MOVPRFX instruction. Also removed the unused operand AARCH64_OPND_SVE_Zm_imm4. This issues was reported here: https://sourceware.org/pipermail/binutils/2024-February/132408.html
2024-06-25aarch64: Fix sve2p1 dupq instruction operands.Srinath Parvathaneni1-1/+1
This patch fixes the syntax of sve2p1 "dupq" instruction by modifying the way 2nd operand does the encoding and decoding using the [<imm>] value. dupq makes use of already existing aarch64_ins_sve_index and aarch64_ext_sve_index inserter and extractor functions. The definitions of aarch64_ins_sve_index_imm (inserter) and aarch64_ext_sve_index_imm (extractor) is removed in this patch. This issues was reported here: https://sourceware.org/pipermail/binutils/2024-February/132408.html
2024-06-25aarch64: Enable mandatory feature bits for v9.4-A.Srinath Parvathaneni1-1/+2
This patch fixes the mandatory feature bits in v9.4-a architectures, by enabling FEAT_SVE2p1 for Armv9.4-A architecture by default.
2024-06-24aarch64: Add SME FP8 multiplication instructionsAndrew Carlotti1-0/+11
This includes: - FEAT_SME_F8F32 (+sme-f8f32) - FEAT_SME_F8F16 (+sme-f8f16) The FP16 addition/subtraction instructions originally added by FEAT_SME_F16F16 haven't been added to Binutils yet. They are also required to be enabled if FEAT_SME_F8F16 is present, so they are included in this patch.
2024-06-24aarch64: Add FP8 Neon and SVE multiplication instructionsAndrew Carlotti1-6/+31
This includes all the instructions under the following features: - FEAT_FP8FMA (+fp8fma) - FEAT_FP8DOT4 (+fp8dot4) - FEAT_FP8DOT2 (+fp8dot2) - FEAT_SSVE_FP8FMA (+ssve-fp8fma) - FEAT_SSVE_FP8DOT4 (+ssve-fp8dot4) - FEAT_SSVE_FP8DOT2 (+ssve-fp8dot2)
2024-06-24gas, aarch64: Add SME2 lutv2 extensionsaurabh.jha@arm.com1-0/+6
Introduces instructions for the SME2 lutv2 extension for AArch64. They are documented in the following document: * ARM DDI0602 For both luti4 instructions, we introduced an operand called SME_Znx2_BIT_INDEX. We use the existing function parse_vector_reg_list for parsing but modified that function so that it can accept operands without qualifiers and rejects instructions that have operands with qualifiers but are not supposed to have operands with qualifiers. For disassembly, we modified print_register_list so that it could accept register lists without qualifiers. For one luti4 instruction, we introduced a SME_Zdnx4_STRIDED. It is similar to SME_Ztx4_STRIDED and we could use existing code for parsing, encoding, and disassembly. For movt instruction, we introduced an operand called SME_ZT0_INDEX2_12. This is a ZT0 register with a bit index encoded in [13:12]. It is similar to SME_ZT0_INDEX. We also introduced an iclass named sme_size_12_b so that we can encode size bits [13:12] correctly when only 'b' is allowed as qualifier.
2024-06-23aarch64: Enable +cssc for armv8.9-aAndrew Carlotti1-0/+1
FEAT_CSSC is mandatory in the architecture from Armv8.9.
2024-06-12aarch64: add Branch Record Buffer extension instructionsClaudio Bantaloukas1-1/+6
The FEAT_BRBE extension provides two aliases of sys: - brb iall (Invalidates all Branch records in the Branch Record Buffer) - brb inj (Injects the Branch Record held in BRBINFINJ_EL1, BRBSRCINJ_EL1, and BRBTGTINJ_EL1 into the Branch Record Buffer) This patch adds: - the feature option "brbe" that must be added for the aliases to be available - a new operand flag AARCH64_OPND_Rt_IN_SYS_ALIASES that warns in a comment when Rt is set to the non default value 0b11111 (it is constrained unpredictable whether the instruction is undefined or behaves as if the Rt field is set to 0b11111). - a new operand flag AARCH64_OPND_BRBOP that encodes and decodes Op2 values from bit 5 - support for the two brb aliases above See: - https://developer.arm.com/documentation/ddi0602/2024-03/Base-Instructions/BRB--Branch-Record-Buffer--an-alias-of-SYS-?lang=en - https://developer.arm.com/documentation/ddi0601/2024-03/AArch64-Instructions/BRB-INJ--Branch-Record-Injection-into-the-Branch-Record-Buffer?lang=en - https://developer.arm.com/documentation/ddi0601/2024-03/AArch64-Instructions/BRB-IALL--Invalidate-the-Branch-Record-Buffer?lang=en
2024-05-28gas, aarch64: Add SVE2 lut extensionsaurabh.jha@arm.com1-0/+3
Introduces instructions for the SVE2 lut extension for AArch64. They are documented in the following links: * luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en * luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en These instructions use new SVE2 vector operands. They are called SVE_Zm1_23_INDEX, SVE_Zm2_22_INDEX, and Zm3_12_INDEX and they have 1 bit, 2 bit, and 3 bit indices respectively. The lsb and width of these new operands are the same as many existing operands but the convention is to give different names to fields that serve different purpose so we introduced new fields in aarch64-opc.c and aarch64-opc.h. We made a design choice for the second operand of the halfword variant of luti4 with two register tables. We could have either defined a new operand, like SVE_Znx2, or we could have use the existing operand SVE_ZnxN. With the new operand, we would need to implement constraints on register lists based on either operand or opcode flag. With existing operand, we could just existing constraint checks using opcode flag. We chose the second approach and went with SVE_ZnxN and added opcode flag to enforce lengths of vector register list operands. This way, we can reuse the existing constraint check logic.
2024-05-28gas, aarch64: Add AdvSIMD lut extensionsaurabh.jha@arm.com1-1/+8
Introduces instructions for the Advanced SIMD lut extension for AArch64. They are documented in the following links: * luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en * luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en These instructions needed definition of some new operands. We will first discuss operands for the third operand of the instructions and then discuss a vector register list operand needed for the second operand. The third operands are vectors with bit indices and without type qualifiers. They are called Em_INDEX1_14, Em_INDEX2_13, and Em_INDEX3_12 and they have 1 bit, 2 bit, and 3 bit indices respectively. For these new operands, we defined new parsing case branch. The lsb and width of these operands are the same as many existing but the convention is to give different names to fields that serve different purpose so we introduced new fields in aarch64-opc.c and aarch64-opc.h for these new operands. For the second operand of these instructions, we introduced a new operand called LVn_LUT. This represents a vector register list with stride 1. We defined new inserter and extractor for this new operand and it is encoded in FLD_Rn. We are enforcing the number of registers in the reglist using opcode flag rather than operand flag as this is what other SIMD vector register list operands are doing. The disassembly also uses opcode flag to print the correct number of registers.
2024-05-16aarch64: fp8 convert and scale - add feature flags and related structuresVictor Do Nascimento1-0/+2