path: root/gas/config
Age | Commit message | Author | Files | Lines
2023-04-26 | gas: support for the BPF pseudo-c assembly syntax | Guillermo E. Martinez | 2 | -4/+1519
This patch adds support to the GNU assembler for an alternative assembly syntax used in BPF. This syntax is C-like and very unconventional for an assembly language, but it is generated by clang/llvm and is also used in inline asm templates in kernel code, so we ought to support it. After this patch, the assembler is able to parse instructions in both supported syntaxes: the normal assembly-like syntax and the pseudo-C syntax. Instruction formats can be mixed in the source program: the assembler recognizes the right syntax to use. gas/ChangeLog: 2023-04-20 Guillermo E. Martinez <guillermo.e.martinez@oracle.com> PR gas/29728 * config/tc-bpf.h (TC_EQUAL_IN_INSN): Define. * config/tc-bpf.c (LEX_IS_SYMBOL_COMPONENT): Define. (LEX_IS_WHITESPACE): Likewise. (LEX_IS_NEWLINE): Likewise. (LEX_IS_ARITHM_OP): Likewise. (LEX_IS_STAR): Likewise. (LEX_IS_CLSE_BR): Likewise. (LEX_IS_OPEN_BR): Likewise. (LEX_IS_EQUAL): Likewise. (LEX_IS_EXCLA): Likewise. (ST_EOI): Likewise. (MAX_TOKEN_SZ): Likewise. (init_pseudoc_lex): New function. (md_begin): Call init_pseudoc_lex. (valid_expr): New function. (build_bpf_non_generic_load): Likewise. (build_bpf_atomic_insn): Likewise. (build_bpf_jmp_insn): Likewise. (build_bpf_arithm_insn): Likewise. (build_bpf_endianness): Likewise. (build_bpf_load_store_insn): Likewise. (look_for_reserved_word): Likewise. (is_register): Likewise. (is_cast): Likewise. (get_token): Likewise. (bpf_pseudoc_to_normal_syntax): Likewise. (md_assemble): Try pseudo-C syntax if an instruction cannot be parsed.
2023-04-25 | RISC-V: adjust logic to avoid register name symbols | Jan Beulich | 2 | -27/+98
Special casing GPR names in my_getSmallExpression() leads to a number of inconsistencies. Generalize this by utilizing the md_parse_name() hook, limited to when instruction operands are being parsed (really: probed). Then both the GPR lookup there and the yet more ad hoc workaround for PR/gas 29940 can be removed (including its extension needed for making the compressed form JAL work again).
2023-04-25 | RISC-V: don't recognize bogus relocations | Jan Beulich | 1 | -2/+1
With my_getSmallExpression() consistently and silently failing on relocation operators not fitting an insn, it is no longer necessary to hand it percent_op_itype[] "just in case" (i.e. to avoid errors when a subsequent parsing attempt for another operand combination might succeed). This also eliminates the latent problem of percent_op_itype[] and percent_op_stype[] growing a non-identical set of recognized relocation operators.
2023-04-25 | RISC-V: avoid redundant and misleading/wrong error messages | Jan Beulich | 1 | -0/+9
The use of a wrong (for the insn) relocation operator (or a future one which simply isn't recognized by older gas yet) doesn't render the (rest of the) expression "bad". Furthermore alongside the error from expression() in most cases the parser would emit another error then anyway. Suppress the call to my_getExpression() in such a case, arranging for a guaranteed subsequent error message by marking the expression "illegal".
2023-04-25 | RISC-V: drop "percent_op" parameter from my_getOpcodeExpression() | Jan Beulich | 1 | -4/+4
Both callers check for no relocations, so there's no point parsing for some. Have the function pass percent_op_null into my_getSmallExpression(). Note that there's no point passing percent_op_itype: Elsewhere, especially when processing compressed alias insns ahead of non-alias ones, this has the effect of avoiding "bad expression" errors when another parsing pass may follow (and succeed). Here, however, all alternative forms of an insn type will again start with the same O4 or O2, so avoiding errors earlier on doesn't really help. Plus constructs with a relocation specifier (as percent_op_itype would permit) can't be specified anyway, as the scrubber eats the whitespace between .insn's type and the O4 or O2 expression when that starts with % or ( - i.e. these will be seen as e.g. "i%lo(x)", and riscv_ip() looks only for whitespace when finding the end of a mnemonic.
2023-04-25 | RISC-V: minor effort reduction in relocation specifier parsing | Jan Beulich | 1 | -16/+16
The sole caller of parse_relocation() has already checked for the % prefix, so there's no need to check for it again in the strncasecmp() and there's also no reason to make the involved string literals longer than necessary.
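The idea can be sketched as follows. This is a minimal illustration, not the actual parse_relocation() code: the function name `match_reloc_op`, the table contents, and the requirement that the operator be followed by `(` are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strncasecmp (POSIX) */

/* Hypothetical operator table: names are stored WITHOUT the '%' prefix,
   because the caller has already consumed it. */
static const char *const reloc_ops[] = { "hi", "lo", "pcrel_hi", "pcrel_lo" };

/* 'str' points just past the '%'.  Returns the index of the matching
   operator in reloc_ops, or -1 if none matches. */
int match_reloc_op(const char *str)
{
    for (size_t i = 0; i < sizeof reloc_ops / sizeof reloc_ops[0]; i++) {
        size_t len = strlen(reloc_ops[i]);
        /* Assumption for this sketch: an operator is followed by '('. */
        if (strncasecmp(str, reloc_ops[i], len) == 0 && str[len] == '(')
            return (int)i;
    }
    return -1;
}
```

Dropping the prefix from the literals shortens every comparison by one character and removes a check the caller has already made.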
2023-04-23 | MIPS: fix loongson3 llsc workaround | YunQiang Su | 1 | -7/+3
-mfix-loongson3-llsc may add unneeded sync instructions to some asm code that carries a lot of debug info. PR: 30153 * gas/config/tc-mips.c (fix_loongson3_llsc): clear logistic.
2023-04-19 | x86: parse_register() must not alter the parsed string | Jan Beulich | 1 | -13/+9
This reverts the code change done by 100f993c53a5 ("x86: Check unbalanced braces in memory reference"), which wrongly identified e87fb6a6d0cd ("x86/gas: support quoted address scale factor in AT&T syntax") as the root cause of PR gas/30248. (The testcase is left in place, no matter that it's at best marginally useful in that shape.) The problem instead is that parse_register() alters the string handed to it, thus breaking valid assumptions in subsequent parsing code. Since the function's behavior is a result of get_symbol_name()'s, make a copy of the incoming string before invoking that function. Like for parse_real_register() follow the model of strtol() et al: input string is const-qualified to signal that the string isn't altered, but the returned "end" pointer is not const-qualified, requiring const to be cast away (which generally is a bad idea, but the alternative would again be more convoluted code).
2023-04-19 | x86: parse_real_register() does not alter the parsed string | Jan Beulich | 1 | -4/+4
Follow the model of strtol() et al - input string is const-qualified to signal that the string isn't altered, but the returned "end" pointer is not const-qualified, requiring const to be cast away (which generally is a bad idea, but the alternative would be more convoluted code).
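A minimal sketch of the strtol()-style interface described here. The function `parse_reg_name` and its `r<digits>` syntax are hypothetical and stand in for the real x86 register parser; only the const-in, non-const-end-pointer shape is the point.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical parser: accepts "r<digits>" and never writes through 'str'.
   On success stores the register number in *regno and sets *end just past
   the name.  On failure sets *end to the start of the input. */
int parse_reg_name(const char *str, char **end, int *regno)
{
    const char *p = str;

    if (*p != 'r' || !isdigit((unsigned char)p[1])) {
        *end = (char *)str;   /* cast away const, as strtol() does */
        return 0;
    }

    int n = 0;
    for (p++; isdigit((unsigned char)*p); p++)
        n = n * 10 + (*p - '0');

    *regno = n;
    *end = (char *)p;         /* points into the caller's unmodified string */
    return 1;
}
```

The const qualifier on the input documents that the string is left intact, which is the property later parsing code relies on.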
2023-04-18 | Symbols with GOT relocations are not fix-adjustable | mengqinggang | 1 | -0/+15
gas * config/tc-loongarch.c (loongarch_fix_adjustable): Symbols with GOT relocations are not fix-adjustable. * testsuite/gas/loongarch/macro_op_large_abs.d: Regenerated. * testsuite/gas/loongarch/macro_op_large_pc.d: Regenerated. ld * testsuite/ld-loongarch-elf/macro_op.d: Regenerated.
2023-04-07 | Support Intel AMX-COMPLEX | Haochen Jiang | 1 | -0/+1
gas/ChangeLog: * NEWS: Support Intel AMX-COMPLEX. * config/tc-i386.c: Add amx_complex. * doc/c-i386.texi: Document .amx_complex. * testsuite/gas/i386/i386.exp: Run AMX-COMPLEX tests. * testsuite/gas/i386/amx-complex-inval.l: New test. * testsuite/gas/i386/amx-complex-inval.s: Ditto. * testsuite/gas/i386/x86-64-amx-complex-bad.d: Ditto. * testsuite/gas/i386/x86-64-amx-complex-bad.s: Ditto. * testsuite/gas/i386/x86-64-amx-complex-intel.d: Ditto. * testsuite/gas/i386/x86-64-amx-complex.d: Ditto. * testsuite/gas/i386/x86-64-amx-complex.s: Ditto. opcodes/ChangeLog: * i386-dis.c (MOD_VEX_0F386C_X86_64_W_0): New. (PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0): Ditto. (X86_64_VEX_0F386C): Ditto. (VEX_LEN_0F386C_X86_64_W_0_M_1): Ditto. (VEX_W_0F386C_X86_64): Ditto. (mod_table): Add MOD_VEX_0F386C_X86_64_W_0. (prefix_table): Add PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0. (x86_64_table): Add X86_64_VEX_0F386C. (vex_len_table): Add VEX_LEN_0F386C_X86_64_W_0_M_1. (vex_w_table): Add VEX_W_0F386C_X86_64. * i386-gen.c (cpu_flag_init): Add CPU_AMX_COMPLEX_FLAGS and CPU_ANY_AMX_COMPLEX_FLAGS. * i386-init.h: Regenerated. * i386-mnem.h: Ditto. * i386-opc.h (CpuAMX_COMPLEX): New. (i386_cpu_flags): Add cpuamx_complex. * i386-opc.tbl: Add AMX-COMPLEX instructions. * i386-tbl.h: Regenerated.
2023-04-03 | ubsan: aarch64 parse_vector_reg_list | Alan Modra | 1 | -4/+4
tc-aarch64.c:1473:27: runtime error: left shift of 7 by 30 places cannot be represented in type 'int'. * config/tc-aarch64.c (parse_vector_reg_list): Avoid UB left shift.
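For illustration, the class of bug reported above and one common way to avoid it. This is not the code from tc-aarch64.c: `pack_field_at_30` is a hypothetical helper, and widening to an unsigned 64-bit type is just one possible fix.

```c
#include <stdint.h>

/* Hypothetical helper: place a 3-bit value at bit position 30.
   With a 32-bit int, the expression 7 << 30 overflows the signed type,
   which is undefined behaviour.  Converting the operand to an unsigned
   type that is wide enough makes the shift well defined. */
uint64_t pack_field_at_30(unsigned int value)
{
    return (uint64_t)(value & 7u) << 30;
}
```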
2023-03-31 | RISC-V: Allocate "various" operand type | Tsukasa OI | 1 | -17/+47
This commit intends to move operands that require very special handling or operand types that are so minor (e.g. only useful on a few instructions) under "W". I also intend this "W" to be "temporary" operand storage until we can find good two character (or less) operand type. In this commit, prefetch offset operand "f" for 'Zicbop' extension is moved to "Wif" because of its special handling (and allocating single character "f" for this operand type seemed too much). Current expected allocation guideline is as follows: 1. 'W' 2. The most closely related single-letter extension in lowercase (strongly recommended but not mandatory) 3. Identify operand type The author currently plans to allocate following three-character operand types (for operands including instructions from unratified extensions). 1. "Wif" ('Zicbop': fetch offset) 2. "Wfv" (unratified 'Zfa': value operand from FLI.[HSDQ] instructions) 3. "Wfm" / "WfM" 'Zfh', 'F', 'D', 'Q': rounding modes "m" with special handling solely for widening conversion instructions. gas/ChangeLog: * config/tc-riscv.c (validate_riscv_insn, riscv_ip): Move from "f" to "Wif". opcodes/ChangeLog: * riscv-dis.c (print_insn_args): Move from "f" to "Wif". * riscv-opc.c (riscv_opcodes): Reflect new operand type.
2023-03-31 | x86: handle immediate operands for .insn | Jan Beulich | 2 | -3/+108
Since we have no insn suffix and it's also not realistic to infer immediate size from the size of other (typically register) operands (like optimize_imm() does), and since we also don't have a template telling us permitted size(s), a new syntax construct is introduced to allow size (and signedness) specification. In the absence of such, the size is inferred from significant bits (which obviously may yield inconsistent results at least for effectively negative values, depending on whether BFD64 is enabled), and only if supplied expressions can be evaluated at parsing time. Being explicit is generally recommended to users. Size specification is permitted at bit granularity, but of course the eventually emitted immediate values will be padded up to 8-, 16-, 32-, or 64-bit fields.
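The padding rule in the last sentence can be sketched as below. The helper name `imm_field_bits` is hypothetical; the sketch covers only the rounding of a bit-granular size to the emitted field width, not the new syntax itself.

```c
/* Round a requested immediate width in bits up to the width of the field
   that is eventually emitted (8, 16, 32 or 64 bits).
   Returns 0 for widths outside 1..64. */
unsigned int imm_field_bits(unsigned int bits)
{
    if (bits == 0 || bits > 64)
        return 0;
    if (bits <= 8)
        return 8;
    if (bits <= 16)
        return 16;
    if (bits <= 32)
        return 32;
    return 64;
}
```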
2023-03-31 | x86: allow for multiple immediates in output_disp() | Jan Beulich | 1 | -5/+5
.insn isn't going to have a constraint of only a single immediate when, in particular, RIP-relative addressing is used.
2023-03-31 | x86: handle EVEX Disp8 for .insn | Jan Beulich | 1 | -1/+97
In particular the scaling factor cannot always be determined from pre- existing operand attributes. Introduce a new {:d<N>} vector operand syntax extension, restricted to .insn only, to allow specifying this in (at least) otherwise ambiguous cases.
2023-03-31 | x86: process instruction operands for .insn | Jan Beulich | 2 | -21/+302
Deal with register and memory operands; immediate operands will follow later, as will the handling of EVEX embedded broadcast and EVEX Disp8 scaling. Note that because we can't really know how to encode their use, %cr8 and up cannot be used with .insn outside of 64-bit mode. Users would need to specify an explicit LOCK prefix in combination with %cr0 etc.
2023-03-31 | x86: parse special opcode modifiers for .insn | Jan Beulich | 1 | -1/+38
So called "short form" encoding is specified by a trailing "+r", whereas a possible extension opcode is specified by the usual "/<digit>". Take these off the expression before handing it to get_absolute_expression(). Note that on targets where / starts a comment, --divide needs passing to gas in order to make use of the extension opcode functionality.
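A rough sketch of taking the modifiers off the end of the opcode expression before evaluating it. Everything here is an assumption for illustration: the struct, the function `split_opcode_mods`, working on a writable copy of the text, and limiting the digit to 0..7 (the range of a 3-bit ModR/M reg field). The real gas code may work quite differently.

```c
#include <string.h>

/* Hypothetical result of splitting an opcode expression string. */
struct opcode_mods {
    int short_form;   /* 1 if a trailing "+r" was present */
    int ext_opcode;   /* 0..7 from a trailing "/<digit>", or -1 if absent */
};

/* Strips a trailing "+r" or "/<digit>" from 'expr' in place (the caller
   owns the buffer) and reports which modifier, if any, was found.  What
   remains in 'expr' is the plain expression to evaluate. */
struct opcode_mods split_opcode_mods(char *expr)
{
    struct opcode_mods m = { 0, -1 };
    size_t len = strlen(expr);

    if (len >= 2 && strcmp(expr + len - 2, "+r") == 0) {
        m.short_form = 1;
        expr[len - 2] = '\0';
    } else if (len >= 2 && expr[len - 2] == '/'
               && expr[len - 1] >= '0' && expr[len - 1] <= '7') {
        m.ext_opcode = expr[len - 1] - '0';
        expr[len - 2] = '\0';
    }
    return m;
}
```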
2023-03-31 | x86: parse VEX and alike specifiers for .insn | Jan Beulich | 1 | -6/+238
All encoding spaces can be used this way; there's a certain risk that the bits presently reserved could be used for other purposes down the road, but people using .insn are expected to know what they're doing anyway. Plus this way there's at least _some_ way to have those bits set. For now this will only allow operand-less insns to be encoded this way.
2023-03-31 | x86: introduce .insn directive | Jan Beulich | 1 | -10/+155
For starters this deals with only very basic constructs.
2023-03-30 | aarch64: Add the RPRFM instruction | Richard Sandiford | 1 | -0/+5
This patch adds the RPRFM (range prefetch) instruction. It was introduced as part of SME2, but it belongs to the prefetch hint space and so doesn't require any specific ISA flags. The aarch64_rprfmop_array initialiser (deliberately) only fills in the leading non-null elements.
2023-03-30 | aarch64: Add new SVE dot-product instructions | Richard Sandiford | 1 | -0/+1
This patch adds the SVE FDOT, SDOT and UDOT instructions, which are available when FEAT_SME2 is implemented. The patch also reorders the existing SVE_Zm3_22_INDEX to keep the operands numerically sorted.
2023-03-30 | aarch64: Add the SME2 shift instructions | Richard Sandiford | 1 | -3/+14
There are two instruction formats here: - SQRSHR, SQRSHRU and UQRSHR, which operate on lists of two or four registers. - SQRSHRN, SQRSHRUN and UQRSHRN, which operate on lists of four registers. These are the first SME2 instructions to have immediate operands. The patch makes sure that, when parsing SME2 instructions with immediate operands, the new predicate-as-counter registers are parsed as registers rather than as #-less immediates.
2023-03-30 | aarch64: Add the SME2 MLALL and MLSLL instructions | Richard Sandiford | 1 | -0/+5
SMLALL, SMLSLL, UMLALL and UMLSLL have the same format. USMLALL and SUMLALL allow the same operand types as those instructions, except that SUMLALL does not have the multi-vector x multi-vector forms (which would be redundant with USMLALL).
2023-03-30 | aarch64: Add the SME2 MLAL and MLSL instructions | Richard Sandiford | 1 | -0/+4
The {BF,F,S,U}MLAL and {BF,F,S,U}MLSL instructions share the same encoding. They are the first instance of a ZA (as opposed to ZA tile) operand having a range of offsets. As with ZA tiles, the expected range size is encoded in the operand-specific data field.
2023-03-30 | aarch64: Add the SME2 FMLA and FMLS instructions | Richard Sandiford | 1 | -0/+2
2023-03-30 | aarch64: Add the SME2 ADD and SUB instructions | Richard Sandiford | 1 | -1/+2
Add support for the SME2 ADD, SUB, FADD and FSUB instructions. SUB and FSUB have the same form as ADD and FADD, except that ADD also has a 2-operand accumulating form. The 64-bit ADD/SUB instructions require FEAT_SME_I16I64 and the 64-bit FADD/FSUB instructions require FEAT_SME_F64F64. These are the first instructions to have tied register list operands, as opposed to tied single registers. The parse_operands change prevents unsuffixed Z registers (width==-1) from being treated as though they had an Advanced SIMD-style suffix (.4s etc.). It means that: Error: expected element type rather than vector type at operand 2 -- `add za\.s\[w8,0\],{z0-z1}' becomes: Error: missing type suffix at operand 2 -- `add za\.s\[w8,0\],{z0-z1}'
2023-03-30 | aarch64: Add the SME2 ZT0 instructions | Richard Sandiford | 1 | -9/+64
SME2 adds lookup table instructions for quantisation. They use a new lookup table register called ZT0. LUTI2 takes an unsuffixed SVE vector index of the form Zn[<imm>], which is the first time that this syntax has been used.
2023-03-30 | aarch64: Add the SME2 predicate-related instructions | Richard Sandiford | 1 | -14/+71
Implementation-wise, the main things to note here are: - the WHILE* instructions have forms that return a pair of predicate registers. This is the first time that we've had lists of predicate registers, and they wrap around after register 15 rather than after register 31. - the predicate-as-counter WHILE* instructions have a fourth operand that specifies the vector length. We can treat this as an enumeration, except that immediate values aren't allowed. - PEXT takes an unsuffixed predicate index of the form PN<n>[<imm>]. This is the first instance of a vector/predicate index having no suffix.
2023-03-30 | aarch64: Add the SME2 multivector LD1 and ST1 instructions | Richard Sandiford | 1 | -0/+11
SME2 adds LD1 and ST1 variants for lists of 2 and 4 registers. The registers can be consecutive or strided. In the strided case, 2-register lists have a stride of 8, starting at register x0xxx. 4-register lists have a stride of 4, starting at register x00xx. The instructions are predicated on a predicate-as-counter register in the range pn8-pn15. Although we already had register fields with upper bounds of 7 and 15, this is the first plain register operand to have a nonzero lower bound. The patch uses the operand-specific data field to record the minimum value, rather than having separate inserters and extractors for each lower bound. This in turn required adding an extra bit to the field.
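The start-register patterns quoted above (x0xxx for 2-register lists, x00xx for 4-register lists) can be read as bit tests on the 5-bit register number. The following is a minimal sketch of that reading; `valid_strided_start` is a hypothetical name and this is not the libopcodes constraint check.

```c
/* Returns 1 if register number 'first' (0..31) may start a strided list of
   'count' registers, following the bit patterns in the text:
     2 registers, stride 8: first must match x0xxx (bit 3 clear)
     4 registers, stride 4: first must match x00xx (bits 3 and 2 clear) */
int valid_strided_start(unsigned int first, unsigned int count)
{
    if (first > 31)
        return 0;
    if (count == 2)
        return (first & 0x08) == 0;
    if (count == 4)
        return (first & 0x0c) == 0;
    return 0;
}
```

For example, z16 can start either kind of strided list, while z8 can start neither.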
2023-03-30 | aarch64: Add the SME2 MOVA instructions | Richard Sandiford | 1 | -0/+8
SME2 defines new MOVA instructions for moving multiple registers to and from ZA. As with SME, the instructions are also available through MOV aliases. One notable feature of these instructions (and many other SME2 instructions) is that some register lists must start at a multiple of the list's size. The patch uses the general error "start register out of range" when this constraint isn't met, rather than an error specifically about multiples. This ensures that the error is consistent between these simple consecutive lists and later strided lists, for which the requirements aren't a simple multiple.
2023-03-30 | aarch64: Add support for predicate-as-counter registers | Richard Sandiford | 1 | -3/+32
SME2 adds a new format for the existing SVE predicate registers: predicates as counters rather than predicates as masks. In assembly code, operands that interpret predicates as counters are written pn<N> rather than p<N>. This patch adds support for these registers and extends some existing instructions to support them. Since the new forms are just a programmer convenience, there's no need to make them more restrictive than the earlier predicate-as-mask forms.
2023-03-30 | aarch64: Add support for vector offset ranges | Richard Sandiford | 1 | -0/+23
Some SME2 instructions operate on a range of consecutive ZA vectors. This is indicated by syntax such as: za[<Wv>, <imml>:<immh>] Like with the earlier vgx2 and vgx4 support, we get better error messages if the parser allows all ZA indices to have a range. We can then reject invalid cases during constraint checking.
2023-03-30 | aarch64: Add support for vgx2 and vgx4 | Richard Sandiford | 1 | -1/+32
Many SME2 instructions operate on groups of 2 or 4 ZA vectors. This is indicated by adding a "vgx2" or "vgx4" group size to the ZA index. The group size is optional in assembly but preferred for disassembly. There is not a binary distinction between mnemonics that have group sizes and mnemonics that don't, nor between mnemonics that take vgx2 and mnemonics that take vgx4. We therefore get better error messages if we allow any ZA index to have a group size during parsing, and wait until constraint checking to reject invalid sizes. A quirk of the way errors are reported means that if an instruction is wrong both in its qualifiers and its use of a group size, we'll print suggested alternative instructions that also have an incorrect group size. But that's a general property that also applies to things like out-of-range immediates. It's also not obviously the wrong thing to do. We need to be relatively confident that we're looking at the right opcode before reporting detailed operand-specific errors, so doing qualifier checking first seems reasonable.
2023-03-30 | aarch64: Add _off4 suffix to AARCH64_OPND_SME_ZA_array | Richard Sandiford | 1 | -1/+1
SME2 adds various new fields that are similar to AARCH64_OPND_SME_ZA_array, but are distinguished by the size of their offset fields. This patch adds _off4 to the name of the field that we already have.
2023-03-30 | aarch64: Add +sme2 | Richard Sandiford | 1 | -0/+2
This patch adds bare-bones support for +sme2. Later patches fill in the rest.
2023-03-30 | aarch64: Prefer register ranges & support wrapping | Richard Sandiford | 1 | -5/+7
Until now, binutils has supported register ranges such as { v0.4s - v3.4s } as an unofficial shorthand for { v0.4s, v1.4s, v2.4s, v3.4s }. The SME2 ISA embraces this form and makes it the preferred disassembly. It also embraces wrapped lists such as { z31.s - z2.s }, which is something that binutils didn't previously allow. The range form was already binutils's preferred disassembly for 3- and 4-register lists. This patch prefers it for 2-register lists too. The patch also adds support for wrap-around.
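As a small worked example of wrap-around, the length of a range such as { z31.s - z2.s } can be computed modulo the number of registers. The helper `reg_range_count` is hypothetical and only shows the arithmetic.

```c
/* Number of registers covered by the range first..last when register
   numbers wrap modulo 'nregs' (32 for the Z and V register files).
   For example z31 - z2 covers z31, z0, z1 and z2. */
unsigned int reg_range_count(unsigned int first, unsigned int last,
                             unsigned int nregs)
{
    return (last + nregs - first) % nregs + 1;
}
```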
2023-03-30 | aarch64: Add support for strided register lists | Richard Sandiford | 1 | -20/+38
SME2 has instructions that accept strided register lists, such as { z0.s, z4.s, z8.s, z12.s }. The purpose of this patch is to extend binutils to support such lists. The parsing code already had (unused) support for strides of 2. The idea here is instead to accept all strides during parsing and reject invalid strides during constraint checking. The SME2 instructions that accept strided operands also have non-strided forms. The errors about invalid strides therefore take a bitmask of acceptable strides, which allows multiple possibilities to be summed up in a single message. I've tried to update all code that handles register lists.
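The "accept any stride while parsing, check it later against a bitmask" approach might look roughly like this. Both function names and the mask encoding (bit S set means stride S is acceptable) are assumptions made for the sketch, not the representation binutils uses.

```c
/* Infer the stride of a register list, with register numbers wrapping
   modulo 32.  Returns 0 if the list has fewer than two entries or the gaps
   between neighbouring registers are not all equal. */
unsigned int list_stride(const unsigned int *regs, unsigned int n)
{
    if (n < 2)
        return 0;

    unsigned int stride = (regs[1] + 32 - regs[0]) % 32;
    for (unsigned int i = 2; i < n; i++)
        if ((regs[i] + 32 - regs[i - 1]) % 32 != stride)
            return 0;
    return stride;
}

/* Assumed encoding: bit S of 'allowed' is set when stride S is acceptable.
   A mask with bits 1 and 4 set therefore describes an operand that takes
   either a consecutive list or a stride-4 list. */
int stride_allowed(unsigned int stride, unsigned int allowed)
{
    return stride != 0 && stride < 32 && ((allowed >> stride) & 1u);
}
```

Because the mask can carry several strides at once, one error message can list every acceptable possibility.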
2023-03-30 | aarch64: Rename some of GAS's REG_TYPE_* macros | Richard Sandiford | 1 | -71/+71
In GAS, the vector and predicate registers are identified by REG_TYPE_VN, REG_TYPE_ZN and REG_TYPE_PN. This "N" is obviously a placeholder for the register number. However, we don't use that convention for integer and FP registers, and (more importantly) SME2 adds "predicate-as-counter" registers that are denoted PN. This patch therefore drops the "N" suffix from the existing registers. The main hitch is that Z was also used for the zero register in things like R_Z, but using ZR seems more consistent with the SP-based names.
2023-03-30 | aarch64: Add an aarch64_cpu_supports_inst_p helper | Richard Sandiford | 1 | -2/+1
Quite a lot of SME2 instructions have an opcode bit that selects between 32-bit and 64-bit forms of an instruction, with the 32-bit forms being part of base SME2 and with the 64-bit forms being part of an optional extension. It's nevertheless useful to have a single opcode entry for both forms since (a) that matches the ISA definition and (b) it tends to improve error reporting. This patch therefore adds a libopcodes function called aarch64_cpu_supports_inst_p that tests whether the target supports a particular instruction. In future it will depend on internal libopcodes routines.
2023-03-30 | aarch64: Tweak priorities of parsing-related errors | Richard Sandiford | 1 | -5/+45
There are three main kinds of error reported during parsing, in increasing order of priority: - AARCH64_OPDE_RECOVERABLE (register seen instead of immediate) - AARCH64_OPDE_SYNTAX_ERROR - AARCH64_OPDE_FATAL_SYNTAX_ERROR This priority makes sense when comparing errors reported against the same operand. But if we get to operand 3 (say) and see a register instead of an immediate, that's likely to be a better match than something that fails with a syntax error at operand 1. The idea of this patch is to prioritise parsing-related errors based on operand index first, then by error code. Post-parsing errors still win over parsing errors, and their relative priorities don't change.
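The ordering described above (operand index first, then error code) can be sketched as a comparison function. The enum and struct below are simplified stand-ins with hypothetical names, not the AARCH64_OPDE_* definitions, and post-parsing errors are left out.

```c
/* Simplified stand-ins for the three parsing error kinds, listed in
   increasing order of priority. */
enum parse_err { ERR_RECOVERABLE, ERR_SYNTAX, ERR_FATAL_SYNTAX };

struct parse_failure {
    int operand_index;     /* operand at which parsing failed */
    enum parse_err kind;
};

/* Returns nonzero if 'a' should be reported in preference to 'b':
   a failure at a later operand wins, and the error kind only breaks ties
   between failures at the same operand. */
int parse_failure_better(const struct parse_failure *a,
                         const struct parse_failure *b)
{
    if (a->operand_index != b->operand_index)
        return a->operand_index > b->operand_index;
    return a->kind > b->kind;
}
```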
2023-03-30 | aarch64: Try to report invalid variants against the closest match | Richard Sandiford | 1 | -0/+4
If an instruction has invalid qualifiers, GAS would report the error against the final opcode entry that got to the qualifier- checking stage. It seems better to report the error against the opcode entry that had the closest match, just like we pick the closest match within an opcode entry for the "did you mean this?" message. This patch adds the number of invalid operands as an argument to AARCH64_OPDE_INVALID_VARIANT and then picks the AARCH64_OPDE_INVALID_VARIANT with the lowest argument.
2023-03-30 | aarch64: Tweak register list errors | Richard Sandiford | 1 | -4/+2
The error for invalid register lists had the form: invalid number of registers in the list; N registers are expected at operand M -- `insn' This seems a bit verbose. Also, the "bracketing" is really: (invalid number of registers in the list; N registers are expected) at operand M but the semicolon works against that. This patch goes for slightly shorter messages, setting a template that later patches can use for more complex cases.
2023-03-30 | aarch64: Make AARCH64_OPDE_REG_LIST take a bitfield | Richard Sandiford | 1 | -20/+34
AARCH64_OPDE_REG_LIST took a single operand that specified the expected number of registers. However, there are quite a few SME2 instructions that have both 2-register forms and (separate) 4-register forms. If the user tries to use a 3-register list, it isn't obvious which opcode entry they meant. Saying that we expect 2 registers and saying that we expect 4 registers would both be wrong. This patch therefore switches the operand to a bitfield. If a AARCH64_OPDE_REG_LIST is reported against multiple opcode entries, the patch ORs up the expected lengths. This has no user-visible effect yet. A later patch adds more error strings, alongside tests that use them.
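One way to picture the bitfield is sketched below. The encoding (bit N set means an N-register list is acceptable) and both function names are assumptions for illustration; the commit message does not spell out the actual layout.

```c
/* Assumed encoding: bit N of the mask is set when a list of N registers is
   acceptable.  Failures reported against several opcode entries are
   combined by ORing their masks together. */
unsigned int merge_expected_lengths(unsigned int a, unsigned int b)
{
    return a | b;
}

/* Returns 1 if a list of 'n' registers is among the expected lengths. */
int length_expected(unsigned int mask, unsigned int n)
{
    return n < 32 && ((mask >> n) & 1u);
}
```

With a 2-register entry and a 4-register entry merged, a 3-register list matches neither, and the combined mask lets a single message say that 2 or 4 registers were expected.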
2023-03-30 | aarch64: Add an error code for out-of-range registers | Richard Sandiford | 1 | -0/+8
libopcodes currently reports out-of-range registers as a general AARCH64_OPDE_OTHER_ERROR. However, this means that each register range needs its own hard-coded string, which is a bit cumbersome if the range is determined programmatically. This patch therefore adds a dedicated error type for out-of-range errors.
2023-03-30 | aarch64: Deprioritise AARCH64_OPDE_REG_LIST | Richard Sandiford | 1 | -3/+3
SME2 has many instructions that take a list of SVE registers. There are often multiple forms, with different forms taking different numbers of registers. This means that if, after a successful parse and qualifier match, we find that the number of registers does not match the opcode entry, the associated error should have a lower priority/severity than other errors reported at the same stage. For example, if there are 2-register and 4-register forms of an instruction, and if the assembly code uses the 2-register form with an out-of-range value, the out-of-range value error against the 2-register instruction should have a higher priority than the "wrong number of registers" error against the 4-register instruction. This is tested by the main SME2 patches, but seemed worth splitting out.
2023-03-30 | aarch64: Update operand_mismatch_kind_names | Richard Sandiford | 1 | -0/+2
The contents of operand_mismatch_kind_names were out of sync with the enum.
2023-03-30 | aarch64: Rework reporting of failed register checks | Richard Sandiford | 1 | -118/+282
There are many opcode table entries that share the same mnemonic. Trying to parse an invalid assembly line will trigger an error for each of these entries, but the specific error might vary from one entry to another, depending on the exact nature of the problem. GAS has quite an elaborate system for picking the most appropriate error out of all the failed matches. And in many cases it works well. However, one of the limitations is that the error is always reported against a single opcode table entry. If that table entry isn't the one that the user intended to use, then the error can end up being overly specific. This is particularly true if an instruction has a typoed register name, or uses a type of register that is not accepted by any opcode table entry. For example, one of the expected error matches for an attempted SVE2 instruction is: Error: operand 1 must be a SIMD scalar register -- `addp z32\.s,p0/m,z32\.s,z0\.s' even though the hypothetical user was presumably attempting to use the SVE form of ADDP rather than the Advanced SIMD one. There are many other instances of this in the testsuite. The problem becomes especially acute with SME2, since many SME2 instructions reuse existing mnemonics. This could lead to us reporting an SME-related error against a non-SME instruction, or a non-SME-related error against an SME instruction. This patch tries to improve things by collecting together all the register types that an opcode table entry expected for a given operand. It also records what kind of register was actually seen, if any. It then tries to summarise all this in a more directed way, falling back to a generic error if the combination defies a neat summary. The patch includes tests for all new messages except REG_TYPE_ZA, which only triggers with SME2. To test this, I created an assembly file that contained the cross product of all known mnemonics and one example from each register class. 
I then looked for cases where the new routines fell back on the generic errors ("expected a register" or "unexpected register type"). I locally added dummy messages for each one until there were no more hits. The patch adds a specimen instruction to diagnostics.s for each of these combinations. In each case, the combination didn't seem like something that could be summarised in a natural way, so the generic messages seemed better. There's always going to be an element of personal taste around this kind of thing though. Adding more register types made 1<<REG_TYPE_MAX exceed the range of the type, but we don't actually need/want 1<<REG_TYPE_MAX.
2023-03-30 | aarch64: Try to avoid inappropriate default errors | Richard Sandiford | 1 | -4/+17
After parsing a '{' and the first register, parse_typed_reg would report errors in subsequent registers in the same way as for the first register. It used set_default_error, which reports errors of the form "operand N must be X". The problem is that if there are multiple opcode entries for the same mnemonic, there could be several matches that lead to a default error. There's no guarantee that the default error for the register list is the one that will be chosen. To take an example from the testsuite: ext z0.b,{z31.b,z32.b},#0 gave: operand 2 must be an SVE vector register with the error being reported against the single-vector version of ext, even though the operand is clearly a list. This patch uses set_fatal_syntax_error to bump the priority of the error once we're sure that the operand is a list of the right type.
2023-03-30 | aarch64: Improve errors for malformed register lists | Richard Sandiford | 1 | -13/+22
parse_typed_reg is used for parsing both bare registers and registers that occur in lists. If it doesn't see a register, or sees an unexpected kind of register, it queues a default error to report the problem. These default errors have the form "operand N must be an X", where X comes from the operand table. If there are multiple opcode entries that report default errors, GAS tries to pick the most appropriate one, using the opcode table order as a tiebreaker. But this can lead to cases where a syntax error in a register list is reported against an opcode that doesn't accept register lists. For example, the unlikely error: ext z0.b,{,},#0 is reported as: operand 2 must be an SVE vector register -- `ext z0.b,{,},#0' even though operand 2 can be a register list. If we've parsed the opening '{' of a register list, and then see something that isn't remotely register-like, it seems better to report that directly as a syntax error, rather than rely on the default error. The operand won't be a valid list of anything, so there's no need to pick a specific Y in "operand N must be a list of Y".