Age | Commit message (Collapse) | Author | Files | Lines |
|
PR31595
|
|
APX spec removed KEYLOCKER and SHA promotions from EVEX MAP4.
https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html
gas/ChangeLog:
* NEWS: Mention that remove KEYLOCKER and SHA promotions from EVEX
* MAP4.
* testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l: Removed KEYLOCKER
* and SHA instructions.
* testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted.d: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-prefix.h: Removed KEYLOCKER and SHA instructions.
* i386-dis-evex.h: Ditto.
* i386-opc.tbl: Ditto.
* i386-tbl.h: Regenerated.
|
|
The assembler wrongly expects plain register name instead of
memory-form 2nd operand for gcsstr and gcssttr instructions.
This patch fixes the issue.
|
|
binutils/
* doc/binutils.texi (PowerPC -M option): Mention power11 and pwr11.
gas/
* config/tc-ppc.c: (md_show_usage): Mention -mpower11 and -mpwr11.
* doc/c-ppc.texi: Likewise.
opcodes/
* ppc-dis.c (ppc_opts): Add "power11" and "pwr11" entries.
(powerpc_init_dialect): Default to "power11".
(cherry picked from commit 4199cf1e152daab0460f08cc7dbd1f727ac3e4cc)
|
|
In eea4357967b6 ("x86/APX: VROUND{P,S}{S,D} can generally be encoded") I
failed to add the AVX512* ISA dependency of the two new entries.
|
|
|
|
|
|
a0aa6f4ab (LoongArch: ld: Add support for TLS LE symbol with addend) to 2.42 branch.
|
|
|
|
|
|
VRNDSCALE{P,S}{S,D} is the AVX512 generalization of these AVX insns. As
long as the immediate has the top 4 bits clear, they are equivalent to
the earlier VEX-encoded insns, and hence can be used to permit use of
eGPR-s in the memory operand. Since this is the normal way of using
these insns, also alter the resulting diagnostic to complain about the
immediate, not the eGPR use.
|
|
When there's a suitably disambiguating register operand, suffixes are
generally omitted (unless in suffix-always mode). All NDD insns have a
suitable register operand, so they shouldn't have suffixes by default.
|
|
This was missed in 6177c84d5edc ("Support APX GPR32 with extend evex
prefix").
|
|
|
|
|
|
|
|
|
|
Along with the relevant unit-tests, this adds the following rcpc3
instructions:
STL1 { <Vt>.D }[<index>], [<Xn|SP>]
LDAP1 { <Vt>.D }[<index>], [<Xn|SP>]
LDAPUR <Bt>, [<Xn|SP>{, #<simm>}]
LDAPUR <Ht>, [<Xn|SP>{, #<simm>}]
LDAPUR <St>, [<Xn|SP>{, #<simm>}]
LDAPUR <Dt>, [<Xn|SP>{, #<simm>}]
LDAPUR <Qt>, [<Xn|SP>{, #<simm>}]
STLUR <Bt>, [<Xn|SP>{, #<simm>}]
STLUR <Ht>, [<Xn|SP>{, #<simm>}]
STLUR <St>, [<Xn|SP>{, #<simm>}]
STLUR <Dt>, [<Xn|SP>{, #<simm>}]
STLUR <Qt>, [<Xn|SP>{, #<simm>}]
with `#<simm>' taking on a signed 8-bit integer value in the range
[-256,255] and `index' the values 0 or 1.
Co-authored-by: Srinath Parvathaneni <srinath.parvathaneni@arm.com>
|
|
Along with the relevant unit tests and updates to the existing
regression tests, this adds support for the following novel rcpc3
insns:
LDIAPP <Wt1>, <Wt2>, [<Xn|SP>]
LDIAPP <Wt1>, <Wt2>, [<Xn|SP>], #8
LDIAPP <Xt1>, <Xt2>, [<Xn|SP>]
LDIAPP <Xt1>, <Xt2>, [<Xn|SP>], #16
STILP <Wt1>, <Wt2>, [<Xn|SP>]
STILP <Wt1>, <Wt2>, [<Xn|SP>, #-8]!
STILP <Xt1>, <Xt2>, [<Xn|SP>]
STILP <Xt1>, <Xt2>, [<Xn|SP>, #-16]!
LDAPR <Wt>, [<Xn|SP>], #4
LDAPR <Xt>, [<Xn|SP>], #8
STLR <Wt>, [<Xn|SP>, #-4]!
STLR <Xt>, [<Xn|SP>, #-8]!
|
|
This patch adds the necessary macro for encoding FEAT_RCPC3-dependent
instructions in Binutils.
|
|
Given the introduction of the new address operand types for rcpc3
instructions, this patch adds the necessary logic to teach
`general_constraint_met_p` how to proper handle these.
|
|
The particular choices of address indexing, along with their encoding
for RCPC3 instructions lead to the requirement of a new set of operand
descriptions, along with the relevant inserter/extractor set.
That is, for the integer load/stores, there is only a single valid
indexing offset quantity and offset mode is allowed - The value is
always equivalent to the amount of data read/stored by the
operation and the offset is post-indexed for Load-Acquire RCpc, and
pre-indexed with writeback for Store-Release insns.
This indexing quantity/mode pair is selected by the setting of a
single bit in the instruction. To represent these insns, we add the
following operand types:
- AARCH64_OPND_RCPC3_ADDR_OPT_POSTIND
- AARCH64_OPND_RCPC3_ADDR_OPT_PREIND_WB
In the case of loads and stores involving SIMD/FP registers, the
optional offset is encoded as an 8-bit signed immediate, but neither
post-indexing or pre-indexing with writeback is available. This
created the need for an operand type similar to
AARCH64_OPND_ADDR_OFFSET, with the difference that FLD_index should
not be checked.
We thus introduce the AARCH64_OPND_RCPC3_ADDR_OFFSET operand, a
variant of AARCH64_OPND_ADDR_OFFSET, w/o the FLD_index bitfield.
|
|
Beyond the need to encode any registers involved in data transfer and
the address base register for load/stores, it is necessary to specify
the data register addressing mode and whether the address register is
to be pre/post-indexed, whereby loads may be post-indexed and stores
pre-indexed with write-back.
The use of a single bit to specify both the indexing mode and indexing
value requires a novel function be written to accommodate this for
address operand insertion in assembly and another for extraction in
disassembly, along with the definition of two insn fields for use with
these instructions.
This therefore defines the following functions:
- aarch64_ins_rcpc3_addr_opt_offset
- aarch64_ins_rcpc3_addr_offset
- aarch64_ext_rcpc3_addr_opt_offset
- aarch64_ext_rcpc3_addr_offset
It extends the `do_special_{encoding|decoding}' functions and defines
two rcpc3 instruction fields:
- FLD_opc2
- FLD_rcpc3_size
|
|
The allowed immediate offsets in integer rcpc3 load store instructions
are not encoded explicitly in the instruction itself, being rather
implicitly equivalent to the amount of data loaded/stored by the
instruction.
This leads to the requirement that this quantity be calculated based on
the number of registers involved in the transfer, either as data
source or destination registers and their respective qualifiers.
This is done via `calc_ldst_datasize (const aarch64_opnd_info *opnds)'
implemented here, using a cumulative sum of qualifier sizes preceding
the address operand in the OPNDS operand list argument.
|
|
Indicating the presence of the Armv8.2-a feature adding further
support for the Release Consistency Model, the `+rcpc3' architectural
extension flag is added to the list of possible `-march' options in
Binutils, together with the necessary macro for encoding rcpc3
instructions.
|
|
There are some tlbi operations that don't have a corresponding tlbip operation,
but we were incorrectly using the same list for both. Add the missing tlbi
*nxs operations, and use the F_REG_128 flag to filter tlbi operations that
don't have a tlbip analogue. For increased clarity, I have also used a macro
to reduce duplication between the 'nxs' and non-'nxs' variants, and added a
test to verify that no invalid combinations are accepted.
Additionally, fix two missing checks for AARCH64_OPND_SYSREG_TLBIP that were
preventing disassembly of tlbip instructions.
|
|
Add an aarch64_feature_set field to aarch64_sys_ins_reg, and use this for
feature checks instead of testing against a list of operand codes.
|
|
|
|
Hi,
This patch add support for SVE2.1 instructions ld1q,
ld2q, ld3q and ld4q, st1q, st2q, st3q and st4q.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for SVE2.1 instruction faddqv,
fmaxnmqv, fmaxqv, fminnmqv and fminqv.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for SVE2.1 instruction dupq, eorqv and extq.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for FEAT_SVE2p1 (SVE2.1 Extension) feature
along with +sve2p1 optional flag to enabe this feature.
Also support for following SVE2p1 instructions is added
addqv, andqv, smaxqv, sminqv, umaxqv, uminqv and uminqv.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for FEAT_SME2p1 and "movaz" instructions
along with the optional flag +sme2p1.
Following "movaz" instructions are add:
Move and zero two ZA tile slices to vector registers.
Move and zero four ZA tile slices to vector registers.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for SVE2.1 and SME2.1 non-widening BFloat16
(FEAT_B16B16) instructions.
Following instructions predicated, unpredicated and indexed
variants are added in this patch.
bfadd, bfclamp, bfmax bfmaxnm, bfmin,bfminnm,
bfmla,bfmls,bfmul and bfsub.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
The ginsn representation keeps the DWARF register number of the
operands. The API ginsn_dw2_regnum relies on the the relative ordering
of these register entries in the table. Add a comment to make it clear.
opcodes/
* i386-reg.tbl: Add a comment.
|
|
Some x86 instructions affect the stack pointer implicitly. Add a new
operand constraint to reflect this. This will be useful for SCFI
implmentation to ensure its correctness.
Mark all push, pop, call, ret, enter, leave, INT, iret instructions.
opcodes/
* i386-gen.c: Update opcode_modifiers.
* i386-opc.h: Add a new constraint.
* i386-opc.tbl: Update the affected instructions.
* i386-tbl.h: Regenerated.
|
|
Rex2 is currently an operand constraint. For the upcoming SCFI
implementation in GAS, we need to identify operations which implicitly
update the stack pointer. An operand constraint enumerator for implicit
stack op seems more appropriate than an attribute. However, two opcodes
currently necessitate both Rex2 and an implicit stack op marker; this
prompts revisiting the current representations a bit.
Make Rex2 a standalone attribute, so that later a new operand constraint
may be added for IMPLICIT_STACK_OP.
ChangeLog:
* gas/config/tc-i386.c (is_apx_rex2_encoding): Update the check.
* opcodes/i386-gen.c: Add a new BITFIELD for Rex2.
* opcodes/i386-opc.h (REX2_REQUIRED): Remove.
* opcodes/i386-opc.tbl: Remove Rex2 operand constraint.
* opcodes/i386-tbl.h: Regenerated.
|
|
Most of this code became redundant in my previous commits, but ARMV8_6A_SVE was
already dead when it was first added.
|
|
There's no reason to disallow the aliases when the aliased instructions are
always available. The new behaviour matches existing LLVM behaviour.
|
|
Additionally, change FEAT_XS tlbi variants to be gated on "+xs" instead of
"+d128". This is an incremental improvement; there are still some FEAT_XS tlbi
variants that are gated incorrectly or missing entirely.
|
|
|
|
|
|
|
|
Due to the formatted output of objdump, some instructions
that do not require output operands (such as nop/ret) will
have extra spaces added after them.
Determine whether to output operands through the format
of opcodes. When opc->format is an empty string, no extra
spaces are output.
|
|
This patch adds support for the new AArch64 system registers that are part of the following extensions:
* FEAT_DEBUGv8p9
* FEAT_PMUv3p9
* FEAT_PMUv3_SS
* FEAT_PMUv3_ICNTR
* FEAT_SEBEP
|
|
As already indicated during review, we can't get away without certain
adjustments here: Without these, respective {evex}-prefixed insns are
assembled to APX encodings even when APX_F is turned off.
While there also extend the respective comment in the opcode table, to
explain why this construct is used.
|
|
This patch adds support for FEAT_THE doubleword and quadword instructions.
doubleword insturctions are enabled by "+the" flag whereas quadword
instructions are enabled on passing both "+the and +d128" flags.
Support for following sets of instructions is added in this patch.
Read check write compare and swap doubleword:
(rcwcas, rcwcasa, rcwcasal, rcwcasl)
Read check write compare and swap quadword:
(rcwcasp,rcwcaspa, rcwcaspal, rcwcaspl)
Read check write software compare and swap doubleword:
(rcwscas, rcwscasa, rcwscasal, rcwscasl)
Read check write software compare and swap quadword:
(rcwscasp, rcwscaspa, rcwscaspal, rcwscaspl)
Read check write atomic bit clear on doubleword:
(rcwclr, rcwclra, rcwclral, rcwclrl)
Read check write atomic bit clear on quadword:
(rcwclrp, rcwclrpa, rcwclrpal, rcwclrpl)
Read check write software atomic bit clear on doubleword:
(rcwsclr, rcwsclra, rcwsclral, rcwsclrl)
Read check write software atomic bit clear on quadword:
(rcwsclrp,rcwsclrpa, rcwsclrpal,rcwsclrpl)
Read check write atomic bit set on doubleword:
(rcwset,rcwseta, rcwsetal,rcwsetl)
Read check write atomic bit set on quadword:
(rcwsetp,rcwsetpa,rcwsetpal,rcwsetpl)
Read check write software atomic bit set on doubleword:
(rcwsset,rcwsseta,rcwssetal,rcwssetl)
Read check write software atomic bit set on quadword:
(rcwssetp,rcwssetpa,rcwssetpal,rcwssetpl)
Read check write swap doubleword:
(rcwswp,rcwswpa,rcwswpal,rcwswpl)
Read check write swap quadword:
(rcwswpp,rcwswppa, rcwswppal,rcwswppl)
Read check write software swap doubleword:
(rcwsswp,rcwsswpa,rcwsswpal,rcwsswpl)
Read check write software swap quadword:
(rcwsswpp,rcwsswppa,rcwsswppal,rcwsswppl)
|
|
|
|
With the addition of 128-bit system registers to the Arm architecture
starting with Armv9.4-a, a mechanism for manipulating their contents
is introduced with the `msrr' and `mrrs' instruction pair.
These move values from one such 128-bit system register into a pair of
contiguous general-purpose registers and vice-versa, as for example:
msrr ttlb0_el1, x0, x1
mrrs x0, x1, ttlb0_el1
This patch adds the necessary support for these instructions, adding
checks for system-register width by defining a new operand type in the
form of `AARCH64_OPND_SYSREG128' and the `aarch64_sys_reg_128bit_p'
predicate, responsible for checking whether the requested system
register table entry is marked as implemented in the 128-bit mode via
the F_REG_128 flag.
|
|
The 2020 Architecture Extensions to the Arm A-profile architecture
added FEAT_XS, the XS attribute feature, giving cores the ability to
identify devices which can be subject to long response delays. TLB
invalidate (TLBI) operations and barriers can also be annotated with
this attribute[1].
With the introduction of the 128-bit translation tables with the
Armv8.9-a/Armv9.4-a Translation Hardening Extension, a series of new
TLB invalidate operations are introduced which make use of this
extension. These are added to aarch64_sys_regs_tlbi[] for use
with the `tlbip' insn.
[1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2020
|