Age | Commit message (Collapse) | Author | Files | Lines |
|
Instruction templates with only sign-extended 8-bit immediate operand
also have a second template with full-operand-size immediate operand
under a different opcode. Add {noimm8s} pseudo prefix to exclude
templates with only sign-extended 8-bit immediate operand.
gas/
PR gas/32811
* config/tc-i386.c (pseudo_prefixes): Add no_imm8s.
(operand_size_match): Return false for templates with only sign-
extended 8-bit immediate operand if {noimm8s} is used.
(parse_insn): Handle Prefix_NoImm8s.
* doc/c-i386.texi: Document {noimm8s}.
* testsuite/gas/i386/pseudos.s: Add tests for {noimm8s}.
* testsuite/gas/i386/x86-64-pseudos.s: Likewise.
* testsuite/gas/i386/pseudos.d: Updated.
* testsuite/gas/i386/x86-64-pseudos.d: Likewise.
opcodes/
PR gas/32811
* opcodes/i386-opc.h (Prefix_NoImm8s): New.
* i386-opc.tbl: Add {noimm8s} pseudo prefix.
* i386-mnem.h: Regenerated.
* i386-tbl.h: Likewise.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
The privileged spec 1.9.1 support was removed since binutils 2.43. The
linker only recognizes it and then reports a warning that it may
conflict with other spec versions.
While the support is removed, binutils should still recognize it, but it
shouldn't be exposed to the user in `disassember-options` help.
Signed-off-by: Marek Pikuła <m.pikula@partner.samsung.com>
|
|
Since we will support 512 bit on both P-core and E-core for AVX10, 256 bit
rounding is not that useful because we currently have rounding feature
directly on E-core now and no need to use 256-bit rounding as somehow
a workaround. This patch will remove all the support and backport to
Binutils 2.44.
gas/ChangeLog:
* NEWS: Mention support removal.
* config/tc-i386.c (build_evex_prefix): Remove U bit encode.
(check_VecOperands): Remove ymm check for rounding.
(s_insn): Revise .insn comment.
* testsuite/gas/i386/avx10_2-256-cvt-intel.d: Remove ymm
rounding related test.
* testsuite/gas/i386/avx10_2-256-cvt.d: Ditto.
* testsuite/gas/i386/avx10_2-256-cvt.s: Ditto.
* testsuite/gas/i386/avx10_2-256-miscs-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-256-miscs.d: Ditto.
* testsuite/gas/i386/avx10_2-256-miscs.s: Ditto.
* testsuite/gas/i386/avx10_2-256-satcvt-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-256-satcvt.d: Ditto.
* testsuite/gas/i386/avx10_2-256-satcvt.s: Ditto.
* testsuite/gas/i386/evex.d: Ditto.
* testsuite/gas/i386/evex.s: Ditto.
* testsuite/gas/i386/i386.exp: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-miscs-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-miscs.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-miscs.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-satcvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-satcvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-satcvt.s: Ditto.
* testsuite/gas/i386/x86-64-evex.d: Ditto.
* testsuite/gas/i386/x86-64.exp: Ditto.
* testsuite/gas/i386/avx10_2-rounding-intel.d: Removed.
* testsuite/gas/i386/avx10_2-rounding-inval.l: Removed.
* testsuite/gas/i386/avx10_2-rounding-inval.s: Removed.
* testsuite/gas/i386/avx10_2-rounding.d: Removed.
* testsuite/gas/i386/avx10_2-rounding.s: Removed.
* testsuite/gas/i386/x86-64-avx10_2-rounding-intel.d: Removed.
* testsuite/gas/i386/x86-64-avx10_2-rounding.d: Removed.
* testsuite/gas/i386/x86-64-avx10_2-rounding.s: Removed.
opcodes/ChangeLog:
* i386-dis.c (struct instr_info): Remove U bit.
(get_valid_dis386): Roll back to APX condition.
* i386-opc.tbl: Remove ymm rounding support.
* i386-tbl.h: Regenerated.
|
|
Turns out the return value of parse_loongarch_dis_option acts as an
error code, and previously the function always signified failure with
a non-zero return value, making only the first disassembly option get
to take effect.
Fix by adding the missing `return 0`'s to the two success code paths.
Signed-off-by: WANG Xuerui <git@xen0n.name>
|
|
Add instruction `mnret' support
Ref:
https://github.com/riscv/riscv-isa-manual/blob/bb8b9127f81965eeff2d150c211d1c89376591c4/src/rnmi.adoc
https://github.com/riscv/riscv-opcodes/blob/946eb673874b3a0f2474d1424dc28bc7ee53c306/extensions/rv_smrnmi
bfd/ChangeLog:
* elfxx-riscv.c: Add new Smrnmi instruction class handling
gas/ChangeLog:
* testsuite/gas/riscv/smrnmi.s: New test for mnret
* testsuite/gas/riscv/rmrnmi.d: Likewise
include/ChangeLog:
* opcode/ricsv-opc.h: Add MATCH_MNRET, MASK_MNRET
* opcode/riscv.h: Add new instruction class
opcodes/ChangeLog:
* riscv-opc.c: Add `mnret' instruction
Signed-off-by: Jerry Zhang Jian <jerry.zhangjian@sifive.com>
|
|
FEAT_MEC support was introduced in [1]. However, the dc instruction was
missing these encodings:
- DC CIPAE
- DC CIGDPAE
Furthermore, the Arm ARM states that FEAT_MEC is an optional extension,
introduced for v9.2-a.
Therefore, these sysregs:
- MECIDR_EL2
- MECID_P0_EL2
- MECID_A0_EL2
- MECID_P1_EL2
- MECID_A1_EL2
- VMECID_P_EL2
- VMECID_A_EL2
- MECID_RL_A_EL3
which were introduced in that commit now require -march=armv9.2-a at the very
least to be enabled, as well as the dc encodings.
opcodes/ChangeLog:
* aarch64-opc.c (aarch64_sys_regs_dc): Add "cipae" and "cigdpae".
* aarch64-sys-regs.def: Add V8_7A as a requirement for the above system
registers.
gas/testsuite/gas/ChangeLog
* aarch64/mec-invalid.s: Add .arch directive.
* aarch64/mec.d: Add .arch directive and check for cipae, cigdpae.
* aarch64/mec.s: Add MEC data cache operations test.
* aarch64/mec-arch-bad.d: New test to check for bad arch version.
* aarch64/mec-arch-bad.l: Above.
[1]: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=31f2faf5cf112931cfb8c0564a2b78477c907fe3
Regression tested on aarch64-none-elf
|
|
T-Head has a range of vendor-specific instructions. Therefore
it makes sense to group them into smaller chunks in form of
vendor extensions.
This patch adds the additional extension "XTheadVdot" based on the
"V" extension, and it provides four 8-bit multiply and add with
32-bit instructions for the "v" extension. The 'th' prefix and the
"XTheadVector" extension are documented in a PR for the
RISC-V toolchain conventions ([2]).
Co-Authored-By: Lifang Xia <lifang_xia@linux.alibaba.com>
[1] https://github.com/XUANTIE-RV/thead-extension-spec/tree/master/xtheadvdot
[2] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
bfd/ChangeLog:
* elfxx-riscv.c (riscv_multi_subset_supports): Add support
for "XTheadVdot" extension.
(riscv_multi_subset_supports_ext): Likewise.
gas/ChangeLog:
* doc/c-riscv.texi: Likewise.
* testsuite/gas/riscv/march-help.l: Likewise.
* testsuite/gas/riscv/x-thead-vdot.d: New test.
* testsuite/gas/riscv/x-thead-vdot.s: New test.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_TH_VMAQA_VV): New.
* opcode/riscv.h (enum riscv_insn_class): Add insn class for
XTheadVdot.
opcodes/ChangeLog:
* riscv-opc.c: Likewise.
|
|
Since we now always generate $x+isa for now, these would increase the
dis-assemble time by parsing the same architecture string repeatedly. We
already have `arch_str' field into `subset_list' to record the current
architecture stirng, but it's only useful for assembler, since dis-assembler
and linker don't need it before. Now for dis-assembler, we just need to
update the `arch_str' after parsing the architecture stirng, and then avoid
parsing repeatedly if the strings are the same.
|
|
The mapping symbol "$x" without an ISA string "means using ISA
configuration from ELF attribute."[1]. Currently the code does not
reset the subset_list. This means that a previous mapping symbol that
overrides the ISA string will continue to be used, rather than the
default string set in the ELF file's .riscv.attributes section. This
can cause incorrect or failed instruction decodings.
In practice, this causes problems when disassembling code generated by
LLVM, which (unlike gas) does not emit explicit mapping symbols at the
start of each section.
This change stores the default architecture string seen at the beginning
of disassembly in the global parse data struct, and restores that to
subset_list whenever a bare "$x" symbol is seen.
[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-elf.adoc#mapping-symbol
Before this patch, the mapping-x.s was dumped as,
00000000 <.text>:
0: 00000013 nop
4: 0001 .insn 2, 0x0001
6: 0001 .insn 2, 0x0001
Which is caused by the definiation of $x was conflict with the psABI.
|
|
If data is encountered that is not a power of two, dump all of the data with
a .<N>byte directive. The current largest support risc-v instruction length
is 22, so the data over 22 bytes will be displayed by,
.insn, 22, ... + .<N-22>byte.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
|
|
Add gpr and fpr names for the o64 ABI to objdump.
With the recent addition of both EABIs, this completes support for the
standard ABI options (ABI-breaking options such as -modd-spreg or
-mabi=32 -mfp64 notwithstanding). The names have been verified against
GCC's usage of the registers. Notably, the only(?) documentation that
defines the o64 ABI at
https://gcc.gnu.org/projects/mipso64-abi.html
appears to contain a mistake w.r.t. floating-point arguments. In
particular:
> If the first and second arguments floating-point arguments to a
> function are 32-bit values, they are passed in $f12 and $f14.
As from 4.0.0 this does not happen in GCC's implementation of the ABI;
a pair of single-float arguments are still passed in $f12 and $f13, the
same as when one or both of the arguments are double-precision floats.
The registers $f12, $f13 and $f14 have been named $fa0, $fa1 and $ft10
to match the implementation.
Signed-off-by: Maximilian Ciric <max.ciric@gmail.com>
|
|
Commit 3f61a38b5e81 moved the disassembler subset_list from a static
variable to disassembler private_data. It is now malloc'd in
riscv_init_disasm_info so should be freed when disassemble_free_target
runs.
* riscv-dis.c (disassemble_free_riscv): Free subset_list.
|
|
Extend gpr and fpr register names with names suitable for both EABIs.
Heavily inspired by the EABI documenation written by Eric Christopher,
which can be read at
https://sourceware.org/legacy-ml/binutils/2003-06/msg00436.html
2025-02-15 Anghelo Carvajal <angheloalf95@gmail.com>
* mips-dis.c (mips_fpr_names_eabi32): New variable.
(mips_fpr_names_eabi64): New variable.
(mips_abi_choices): Add "eabi32" and "eabi64" options.
Signed-off-by: Anghelo Carvajal <angheloalf95@gmail.com>
|
|
Previously we limited SSAMOSWAP.W only available on RV32, but it should
be available on RV64 as well.
See
https://github.com/riscv/riscv-cfi/blob/main/src/cfi_backward.adoc
https://github.com/riscv/riscv-isa-manual/blob/702a3e6e843235a2a13b918ae6938b04f8974ffc/src/unpriv-cfi.adoc#L789
|
|
I got a request said that the JDK multi-thread compiler may be broken
if two or more threads are trying to print/disassemble stuff, and filling
the disassemble_info, setting callbacks, and grabbing the function pointer
to disasm at the same time. Since such as the target global static stuff,
including subset of extensions and mapping symbol stuff, seems to only be
one globally. Ideally, for dis-assembler, all global static target stuff
should/can be better to be defined into the target private data, since they
are target-dependency.
opcodes/
* riscv-dis.c: Moved all global static target-dependency stuff into
riscv_private_data, including architecture and mapping symbol stuff.
(set_default_riscv_dis_options): Updated since global static target-
dependency stuff are moved into riscv_private_data.
(parse_riscv_dis_option_without_args): Likewise.
(parse_riscv_dis_option): Likewise.
(parse_riscv_dis_options): Likewise.
(maybe_print_address): Likewise.
(print_reg_list): Likewise.
(riscv_get_spimm): Likewise.
(print_insn_args): Likewise.
(riscv_disassemble_insn): Likewise.
(riscv_update_map_state): Likewise.
(riscv_search_mapping_symbol): Likewise.
(riscv_data_length): Likewise.
(print_insn_riscv): Likewise. Call the riscv_init_disasm_info before
parsing any disassembler options, since the related stuff are moved
into riscv_private_data.
(riscv_init_disasm_info): Likewise. Parse and set the architecture
string and privileged spec version since riscv_get_disassembler is
no longer needed.
(riscv_get_disassembler): Removed.
(disassemble_free_riscv): Only free the subset_list if
riscv_private_data exsits.
* disassemble.c (disassembler): Since riscv_get_disassembler is
removed, call to print_insn_riscv.
* disassemble.h: Removed extern riscv_get_disassembler.
|
|
The CPUID EDX bit[28] indicates its enablement, and it includes REP
XMODEXP and REP MONTMUL2. XMODX stands for modular exponentiation, it indicates
the support of modular exponentiation feature, both REP XMODEXP and
REP MONTMUL2 use it.
gas/ChangeLog:
* NEWS: Support Zhaoxin PadLock XMODX instructions.
* config/tc-i386.c (add_branch_prefix_frag_p): Don't add prefix to
PadLockXMODX instructions.
(output_insn): Handle PadLockXMODX instructions.
* doc/c-i386.texi: Document PadLockXMODX.
* testsuite/gas/i386/i386.exp: Add PadLockXMODX test.
* testsuite/gas/i386/padlockxmodx.d: Ditto.
* testsuite/gas/i386/padlockxmodx.s: Ditto.
opcodes/ChangeLog:
* i386-dis.c: Add PadLockXMODX.
* i386-gen.c: Ditto
* i386-opc.h (CpuPadLockXMODX): New.
* i386-opc.tbl: Add Zhaoxin PadLock XMODX instructions.
* i386-tbl.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-init.h: Ditto.
|
|
Opcodes and bitmasks are 32-bit numbers and omitting leading
zeros might be interpreted as if they are 28-bit.
|
|
Clean up whitespace for conforming to GNU coding standards
|
|
We agreed with LLVM that we shouldn't enforce the architectural
dependencies between fp8 muliplication features, so remove them.
Additionally, fix a typo in the gating for FEAT_SME_F8F16 instructions,
which were mistakenly gated by +sme-f8f32 instead. Until now this
mistake had been masked by the dependency between the features.
|
|
It lives in a "forbidden" row, yet its disassembler table entry was
lacking a respective marker.
|
|
Like for RMPUPDATE documentation is about to change as far as operands
are concerned. They're merely the other way around here.
While adjustind gas documentation, also add the missing RMPQUERY
counterparts there.
|
|
AMD are about to update their doc, to help clarify that what we
currently do isn't quite right: In particular it is not %rax but %rcx
which is affected by address size. In fact, that's a normal memory
operand, just not expressed via ModR/M byte, but fixed to (%rcx) (or
(%ecx) with 32-bit addressing).
To support this in the assembler, generalize memory operand handling so
far specific to XLAT (which isn't really a string insn, but requires its
memory operand to be (%bx) / (%ebx) / (%rbx)).
In the disassembler mimic handling after XLAT's, too.
|
|
Printing implicit %ds: and %es: prefixes is pretty meaningless in 64-bit
mode. The SDM explicitly omits them for the 64-bit forms, and it
obviously has them for the other ones only to cover non-64-bit modes
(oddly enough the AMD PM has them present).
|
|
Instead of using 0xFFFF0000, or with (~0xFFFF) to sign extend
negative 16-bit value and with (~0xFFFF) to extract higher order
address bits
opcodes/
* microblaze-dis.c: (print_insn_microblaze): Widen mask
(microblaze_get_target_address): Likewis
Signed-off-by: Gopi Kumar Bulusu <gopi@sankhya.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
|
|
Vector index registers are currently only used in the VRV instruction
format. Unlike general purpose index registers an operand value of
zero (e.g. %v0, 0, or omitted) does not imply a zero value:
"For VRV format instructions, a vector element is used in the formation
of the intermediate value. This vector element is an unsigned binary
integer value that is added to the base address and 12-bit displacement
to form a 64-bit intermediate sum. The vector element is designated by
a vector register and an element index. A zero V field accesses the
element in vector register zero and does not imply a zero value." [1]
Therefore do not omit vector index register 0 in disassembly, that is
disassemble D(VX,B) with VX=0 as D(VX,B) instead of D(B). Also do not
disassemble index register 0 as "0", that is disassemble D(VX,B) with
VX=0 as D(%v0,B) instead of D(0,B). Note that a base register 0 still
still gets disassembled as "0", that is D(VX,B) with B=0 disassembles
into D(VX,0).
[1]: IBM z/Architecture Principles of Operation, SA22-7832-13, IBM z16,
https://publibfp.dhe.ibm.com/epubs/pdf/a227832d.pdf
opcodes/
* s390-dis.c (s390_print_insn_with_opcode): Do not omit vector
index register 0 in disassembly. Disassemble it as %v0.
gas/testsuite/
* gas/s390/zarch-base-index-0.d (vgef): Expect vector index
register 0 in disassembly.
* gas/s390/zarch-omitted-base-index.d (vgef): Likewise.
Suggested-by: Florian Krohm <flo2030@eich-krohm.de>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
|
|
opcodes/
* ppc-opc.c (insert_m2, extract_m2): New functions.
(AESM, PGF1, XX2M, XX3M, XX3GF, XX2AES_MASK, XX2AESM_MASK,
XX3AES_MASK, XX3AESM_MASK, XX3GF_MASK): New macros.
(UIM): Update for new macros.
(powerpc_opcodes): Add xxaes128encp, xxaes192encp, xxaes256encp,
xxaesencp, xxaes128decp, xxaes192decp, xxaes256decp, xxaesdecp,
xxaes128genlkp, xxaes192genlkp, xxaes256genlkp, xxaesgenlkp,
xxgfmul128gcm, xxgfmul128xts, xxgfmul128.
gas/
* testsuite/gas/ppc/future.s: New test.
* testsuite/gas/ppc/future.d: Likewise.
|
|
|
|
|
|
|
|
Many FEAT_SVE2p1 instructions need to be enabled by either of two
different features (one for streaming mode, and one for non-streaming
mode). This patch adds correct gating conditions for these
instructions.
There were also a few sve2p1 instructions missing altogether, so add
those as well.
The testsuite is modified to check for all alternative enablement
conditions. In many cases this is done by adding an alternative
assembler commands to existing test files. For some SME/SME2 tests,
only some of the instructions are enabled by +sve2p1, so these are
copied into a separate test. For original SVE2p1 tests, the non-SME2p1
instructions have been moved to a separate test file.
There are also new tests for the newly added instructions. These
include a couple of fixme comments relating to bad error reporting,
which should be investigated later.
|
|
There are separate CPUID feature bits for SM2 and CCS instructions.
CCS is the acronym of Chinese Cipher System, it includes SM3 and SM4
instructions. This patch adds CpuGMISM2 and CpuGMICCS to replace CpuGMI on
corresponding instructions.
gas/ChangeLog:
* config/tc-i386.c: Add gmism2 and gmiccs to replace gmi.
* doc/c-i386.texi: Ditto.
opcodes/ChangeLog:
* i386-gen.c: Add GMISM2 and GMICCS to replace GMI.
* i386-opc.h (enum i386_cpu): Add CpuGMISM2 and CpuGMICCS to
replace CpuGMI.
* i386-opc.tbl: Replace GMI with GMISM2 on sm2 instruction. Replace GMI
with GMICCS on sm3 and sm4 instructions.
* i386-tbl.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-init.h: Ditto.
|
|
cpu_flags_match() is a hot path. Move the special casing that
b7267244a355 ("Support Intel AMX-MOVRS") added there to i386-gen, thus
affecting only build time performance.
|
|
This change is to make tail conform with software guarded jump of Zicfilp. The
reason to not choose t1 as the label register is that t1 is also as .got.plt
offset of _dl_runtime_resolve in PLT.
See more: https://github.com/riscv-non-isa/riscv-asm-manual/pull/93
|
|
https://github.com/riscv/riscv-cfi/releases/tag/v1.0
This patch only support the CFI instructions and CSR in assembler.
|
|
https://github.com/riscv/riscv-control-transfer-records/releases/tag/v1.0
The privileged spec v1.10 already removed the sfence.vm instruction, and the
encoding of sfence.vm instruction is overlapped with the sctrclr instruction
of ssctr/smctr. But since the privileged spec v1.10 already removed the
sfence.vm, and we no longer support the privileged spec v1.9.1 for now, we
had to remove the sfence.vm.
bfd/
* elfxx-riscv.c (riscv_implicit_subsets): Imply zicsr for ssctr/smctr.
(riscv_supported_std_s_ext): Added ssctr/smctr with version 1.0.
(riscv_multi_subset_supports): Handle INSN_CLASS for ssctr/smctr.
(riscv_multi_subset_supports_ext): Likewise.
gas/
* config/tc-riscv.c (enum riscv_csr_class, riscv_csr_address):
Added and handle CSR_CLASS_SSCTR and CSR_CLASS_SMCTR.
(riscv_is_priv_insn): Removed SFENCE_VM check.
* testsuite/gas/riscv/attribute-14e.d: Removed since sfence.vm is no
longer supported since privileged spec v1.10.
* testsuite/gas/riscv/attribute-14.s: Likewise.
* testsuite/gas/riscv/csr-version-1p10.d: Updated for ssctr/smctr CSRs.
* testsuite/gas/riscv/csr-version-1p10.l: Likewise.
* testsuite/gas/riscv/csr-version-1p11.d: Likewise.
* testsuite/gas/riscv/csr-version-1p11.l: Likewise.
* testsuite/gas/riscv/csr-version-1p12.d: Likewise.
* testsuite/gas/riscv/csr-version-1p12.l: Likewise.
* testsuite/gas/riscv/csr.s: Likewise.
* testsuite/gas/riscv/csr-dw-regnums.d: Likewise.
* testsuite/gas/riscv/csr-dw-regnums.s: Likewise.
* testsuite/gas/riscv/march-help.l: Updated for ssctr/smctr.
* testsuite/gas/riscv/smctr-ssctr.d: New testcase for sctr instruction.
* testsuite/gas/riscv/smctr-ssctr.s: Likewise.
include/
* opcode/riscv-opc.h: Added encoding macro for sctrclr, but removed
encoding macro for sfence.vm since encoding conflict. Added CSR
numbers for ssctr/smctr CSRs.
* opcode/riscv.h (enum riscv_insn_class): Added
INSN_CLASS_SMCTR_OR_SSCTR for sctrclr.
opcodes/
* riscv-opc.c (riscv_opcodes): Added sctrclr, but removed sfence.vm
since encoding conflict.
|
|
of reporting bad for disassembler
According to SDM, vcvt[,u]si2sd under r32 and vcvt[,u]dq2pd treat
Rounding as Ignored when trying to using them. Thus, disassembler
should accept bytecode with rounding instead of reporting bad.
For assembler, it needs some more time to decide how to deal
with that.
gas/ChangeLog:
* testsuite/gas/i386/evex.d: Add new testcase for vcvt[,u]dq2pd.
Change the output for vcvt[,u]si2sd.
* testsuite/gas/i386/evex.s: Ditto.
* testsuite/gas/i386/x86-64-evex.d: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-w.h: Add EXxEVexR64 for vcvt[,u]dq2pd.
* i386-dis.c (OP_Rounding): Mark EVEX_b as used to change the handle
for ignored rounding.
|
|
The CPUID EDX bit[26] indicates its enablement, and it includes REP
XSHA384 and REP XSHA512.
gas/ChangeLog:
* NEWS: Support Zhaoxin PadLock PHE2 instructions.
* config/tc-i386.c (add_branch_prefix_frag_p): Don't add prefix to
PadLockPHE2 instructions.
(output_insn): Handle PadLockPHE2 instructions.
* doc/c-i386.texi: Document PadLockPHE2.
* testsuite/gas/i386/i386.exp: Add PadLockPHE2 test.
* testsuite/gas/i386/padlock_phe2.d: Ditto.
* testsuite/gas/i386/padlock_phe2.s: Ditto.
opcodes/ChangeLog:
* i386-dis.c: Add PadLockPHE2.
* i386-gen.c: Ditto
* i386-opc.h (CpuPadLockPHE2): New.
* i386-opc.tbl: Add Zhaoxin PadLock PHE2 instructions.
* i386-tbl.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-init.h: Ditto.
|
|
This fixes leaks in a ppc disassembler buffer. I'm not sure now why I
used a private buffer for section contents, but I'm not going to
change that just now.
* disassemble.h (disassemble_free_powerpc): Declare.
* disassemble.c (disassemble_free_target): Call it.
* ppc-dis.c (disassemble_free_powerpc): New function.
|
|
NE is quite ambiguous and misleading in mnemonics since it should be
Rounding to Nearest Even, but could be mis-interpretated to No
Exception.
Under its correct meaning, which means rounding, it should only be used
in down-convert, since up-convert is always exact for normal values
It could be difficult to judge which kind of convert it is if we have
the convert between same bit float types.
For all AI data types including BF16 and FP8, the default rounding is
Rounding to Nearest Even. So removing them in mnemonics would reduce
burden for programmers to consider whether it should be added or not
in mnemonics and stop the ambiguous meaning on "NE" itself.
If the convert itself is using a rounding mode other than RNE, it would
be explicitly added in mnemonics (e.g., Long used "T" and "BIAS"
introduced in AVX10.2).
gas/ChangeLog:
* testsuite/gas/i386/avx10_2-256-cvt-intel.d: Refine testcases
according to mnemonics change.
* testsuite/gas/i386/avx10_2-256-cvt.d: Ditto.
* testsuite/gas/i386/avx10_2-256-cvt.s: Ditto.
* testsuite/gas/i386/avx10_2-256-satcvt-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-256-satcvt.d: Ditto.
* testsuite/gas/i386/avx10_2-256-satcvt.s: Ditto.
* testsuite/gas/i386/avx10_2-512-cvt-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-512-cvt.d: Ditto.
* testsuite/gas/i386/avx10_2-512-cvt.s: Ditto.
* testsuite/gas/i386/avx10_2-512-satcvt-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-512-satcvt.d: Ditto.
* testsuite/gas/i386/avx10_2-512-satcvt.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-satcvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-satcvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-satcvt.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-cvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-cvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-cvt.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-satcvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-satcvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-satcvt.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-prefix.h: Remove ne in mnemonics for
convert insns.
* i386-opc.tbl: Ditto.
* i386-mnem.h: Regenerated.
* i386-tbl.h: Ditto.
|
|
The functionality for VCOMSBF16 is exactly the same as the VCOMISD/S/H.
The only difference is the bf16 type. Thus, it should be VCOMISBF16.
This patch would fix that.
gas/ChangeLog:
* testsuite/gas/i386/avx10_2-256-bf16-intel.d: Refine testcase
according to mnemonics change.
* testsuite/gas/i386/avx10_2-256-bf16.d: Ditto.
* testsuite/gas/i386/avx10_2-256-bf16.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-prefix.h: Rename VCOMSBF16 to VCOMISBF16.
* i386-opc.tbl: Ditto.
* i386-mnem.h: Regenerated.
* i386-tbl.h: Ditto.
|
|
Since the bf16 is an AI data types, it will be implicitly packed. Thus,
"P" (for packed) is omitted in mnemonics from its introduction. AVX10.2
BF16 arithmetic insns are introduced with "P" in mnemonics with packed.
This patch will remove them for consistency.
NE is quite ambiguous and misleading in mnemonics since it should be
Rounding to Nearest Even, but could be mis-interpretated to No
Exception. While AI data types like BF16 and FP8 are using Rounding to
Nearest Even as default rounding modes. There is no need to use the
ambiguous mnemonics in AVX10.2 insns. This patch will also remove them.
For convert insns, it will be handled in the upcoming patch.
gas/ChangeLog:
* testsuite/gas/i386/avx10_2-256-bf16-intel.d: Refine testcase
according to new mnemonics.
* testsuite/gas/i386/avx10_2-256-bf16.d: Ditto.
* testsuite/gas/i386/avx10_2-256-bf16.s: Ditto.
* testsuite/gas/i386/avx10_2-256-miscs-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-256-miscs.d: Ditto.
* testsuite/gas/i386/avx10_2-256-miscs.s: Ditto.
* testsuite/gas/i386/avx10_2-512-bf16-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-512-bf16.d: Ditto.
* testsuite/gas/i386/avx10_2-512-bf16.s: Ditto.
* testsuite/gas/i386/avx10_2-512-miscs-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-512-miscs.d: Ditto.
* testsuite/gas/i386/avx10_2-512-miscs.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-miscs-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-miscs.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-miscs.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-bf16-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-bf16.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-bf16.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-miscs-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-miscs.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-miscs.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-prefix.h: Remove p and ne in bf16 mnemonics.
* i386-opc.tbl: Ditto.
* i386-mnem.h: Regenerated.
* i386-tbl.h: Ditto.
|
|
This patch will support AMX-AVX512. In disassmbler, we pull out all
GPR mode out of the vex length switch to make it more general.
gas/ChangeLog:
* NEWS: Mention the full support on DMR AMX ISAs.
* config/tc-i386.c: Add amx_avx512.
* doc/c-i386.texi: Document .amx_avx512.
* testsuite/gas/i386/x86-64.exp: Run AMX-AVX512 tests.
* testsuite/gas/i386/x86-64-amx-avx512-intel.d: New test.
* testsuite/gas/i386/x86-64-amx-avx512.d: Ditto.
* testsuite/gas/i386/x86-64-amx-avx512.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-len.h: Add EVEX_LEN_0F384A_X86_64_W_0,
EVEX_LEN_0F386D_X86_64_W_0, EVEX_LEN_0F3A07_X86_64_W_0,
EVEX_LEN_0F3A77_X86_64_W_0.
* i386-dis-evex-prefix.h: Add PREFIX_EVEX_0F384A_W_0_L_2,
PREFIX_EVEX_0F386D_W_0_L_2, PREFIX_EVEX_0F3A07_W_0_L_2,
PREFIX_EVEX_0F3A77_W_0_L_2.
* i386-dis-evex-w.h: Add EVEX_W_0F384A_X86_64, EVEX_W_0F386D_X86_64,
EVEX_W_0F3A07_X86_64, EVEX_W_0F3A77_X86_64.
* i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F384A, X86_64_EVEX_0F386D,
X86_64_EVEX_0F3A07, X86_64_EVEX_0F3A77.
* i386-dis-evex.h: Ditto.
* i386-dis.c (EVEX_LEN_0F384A_X86_64_W_0): New.
(EVEX_LEN_0F386D_X86_64_W_0): Ditto.
(EVEX_LEN_0F3A07_X86_64_W_0): Ditto.
(EVEX_LEN_0F3A77_X86_64_W_0): Ditto.
(MOD_EVEX_0F384A_X86_64_W_0): Ditto.
(MOD_EVEX_0F386D_X86_64_W_0): Ditto.
(MOD_EVEX_0F3A07_X86_64_W_0): Ditto.
(MOD_EVEX_0F3A77_X86_64_W_0): Ditto.
(PREFIX_EVEX_0F384A_W_0_L_2): Ditto.
(PREFIX_EVEX_0F386D_W_0_L_2): Ditto.
(PREFIX_EVEX_0F3A07_W_0_L_2): Ditto.
(PREFIX_EVEX_0F3A77_W_0_L_2): Ditto.
(EVEX_W_0F384A_X86_64): Ditto.
(EVEX_W_0F386D_X86_64): Ditto.
(EVEX_W_0F3A07_X86_64): Ditto.
(EVEX_W_0F3A77_X86_64): Ditto.
(X86_64_EVEX_0F384A): Ditto.
(X86_64_EVEX_0F386D): Ditto.
(X86_64_EVEX_0F3A07): Ditto.
(X86_64_EVEX_0F3A77): Ditto.
(OP_VEX): Pull out all GPR mode out of the vector length switch.
* i386-gen.c (isa_dependencies): Add AMX-AVX512.
(cpu_flags): Ditto.
* i386-init.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-opc.h (CpuAMX_AVX512): New.
(i386_cpu_flags): Add cpuamx_avx512.
* i386-opc.tbl: Add AMX-AVX512 instructions.
* i386-tbl.h: Regenerated.
|
|
This patch will support AMX-MOVRS feature. Unlike all the other
AMX insns in vector space where we pass vex_len_table before
vex_w_table, we first pass vex_w_table for tileloaddrs[,t1] to
align with the order in EVEX space. The reason why we first pass
vex_w_table in EVEX space is due to AMX-AVX512, where tcvtrowd2ps
and tilemovrow with r32 shares the same opcode with tileloaddrs[,t1].
All of them have evex.w = 0 but with different evex.length. Re-doing
that shortly is not ideal.
APX_F extension is also implemented in this patch. The encoding will
be:
- EVEX.128.NP/66.MAP5.W0 F8/F9 !(11):rrr:100 for
T2RPNTLVW[Z0,Z1]RS[,T1] with NF=0.
- EVEX.128.F2/66.0F38.W0 4A !(11):rrr:100 FOR TILELOADDRS[,T1] with
NF=0.
For APX_F extension, we could not use APX_F(AMX_TRANSPOSE&AMX_MOVRS)
since the transformation could not be done. Instead, we will use
AMX_TRANSPOSE & APX_F(AMX_MOVRS). Thus, we should set AMX_TRANSPOSE
for "any" for cpu_flags in assembler. Since it will only affect the
cpu_flags_match, handle that there.
gas/ChangeLog:
* config/tc-i386.c (cpu_arch): Add amx_movrs.
(cpu_flags_match): Set any bitfield for multiple cpuid
enabled insns.
* doc/c-i386.texi: Document .amx_movrs.
* testsuite/gas/i386/x86-64.exp: Run AMX-MOVRS tests.
* testsuite/gas/i386/x86-64-amx-movrs-intel.d: New test.
* testsuite/gas/i386/x86-64-amx-movrs-inval.l: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs-inval.s: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs.d: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-len.h (EVEX_LEN_0F384A_X86_64_W_0): New.
* i386-dis-evex-w.h (EVEX_W_0F384A_X86_64): Ditto.
* i386-dis-evex-x86-64.h (X86_64_EVEX_0F384A): Ditto.
* i386-dis-evex.h: New entry for AMX-MOVRS.
* i386-dis.c:
(PREFIX_VEX_0F384A_X86_64_L_0_W_0): New.
(PREFIX_VEX_MAP5_F8_X86_64_L_0_W_0): Ditto.
(PREFIX_VEX_MAP5_F9_X86_64_L_0_W_0): Ditto.
(X86_64_VEX_0F384A): Ditto.
(X86_64_VEX_MAP5_F8): Ditto.
(X86_64_VEX_MAP5_F9): Ditto.
(X86_64_EVEX_0F384A): Ditto.
(VEX_LEN_0F384A_X86_64_W_0): Ditto.
(VEX_LEN_MAP5_F8_X86_64): Ditto.
(VEX_LEN_MAP5_F9_X86_64): Ditto.
(EVEX_LEN_0F384A_X86_64_W_0): Ditto.
(VEX_W_0F384A_X86_64): Ditto.
(VEX_W_MAP5_F8_X86_64): Ditto.
(VEX_W_MAP5_F9_X86_64): Ditto.
(EVEX_W_0F384A_X86_64): Ditto.
(prefix_table): New entry for AMX-MOVRS.
(x86_64_table): Ditto.
(vex_len_table): Ditto.
(vex_w_table): Ditto.
(map5_f8_opcode): New.
(map5_f9_opcode): Ditto.
(get_valid_dis386): Handle VEX_MAP5 opcode for AMX-MOVRS.
* i386-gen.c (isa_dependencies): Add AMX_MOVRS.
(cpu_flags): Ditto.
* i386-init.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-opc.h (CpuAMX_MOVRS): New.
(i386_cpu_flags): Add cpuamx_movrs.
* i386-opc.tbl: Add AMX-MOVRS instructions.
* i386-tbl.h: Regenerated.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
|
|
This patch focus on supporting MOVRS ISA. We could take this full ISA
as four part: PREFETCHRST2, MOVRS, MOVRS APX_F extension and MOVRS AVX10.2
extension.
The APX_F extension for MOVRS will be:
- EVEX.LLZ.NP.MAP4.WIG 8A !(11):rrr:bbb for r8/m8 with NF=0 and
ND=0
- EVEX.LLZ.NP/66.MAP4.SCALABLE 8B !(11):rrr:bbb for rv/mv with NF=0
and ND=0
We did not merge the table together for APX_F since there is an explicit
x64 for movrs insn. The current APX_F() did not support the combination
between CPUIDs. Also, the space is different for legacy and apx_f forms.
gas/ChangeLog:
* NEWS: Support Intel MOVRS.
* config/tc-i386.c: Add MOVRS.
* doc/c-i386.texi: Document .movrs.
* testsuite/gas/i386/i386.exp: Run MOVRS tests.
* testsuite/gas/i386/x86-64.exp: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: Add MOVRS
tests.
* testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted.d: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted.s: Ditto.
* testsuite/gas/i386/lfence-load.d: Add prefetchrst2.
* testsuite/gas/i386/lfence-load.s: Ditto.
* testsuite/gas/i386/nops-8.d: Ditto.
* testsuite/gas/i386/prefetch-intel.d: Ditto.
* testsuite/gas/i386/prefetch.d: Ditto.
* testsuite/gas/i386/x86-64-lfence-load.d: Ditto.
* testsuite/gas/i386/x86-64-lfence-load.s: Ditto.
* testsuite/gas/i386/x86-64-prefetch-intel.d: Ditto.
* testsuite/gas/i386/x86-64-prefetch.d: Ditto.
* testsuite/gas/i386/movrs-intel.d: New test.
* testsuite/gas/i386/movrs-inval.l: Ditto.
* testsuite/gas/i386/movrs-inval.s: Ditto.
* testsuite/gas/i386/movrs.d: Ditto.
* testsuite/gas/i386/movrs.s: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-256-intel.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-256.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-256.s: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-512-intel.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-512.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-512.s: Ditto.
* testsuite/gas/i386/x86-64-movrs-intel.d: Ditto.
* testsuite/gas/i386/x86-64-movrs.d: Ditto.
* testsuite/gas/i386/x86-64-movrs.s: Ditto.
* testsuite/gas/i386/x86-64-movrs-intel-suffix.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-suffix.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-suffix.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-prefix.h: Add PREFIX_EVEX_MAP5_6F_X86_64.
* i386-dis-evex-x86.h: Add X86_64_EVEX_MAP5_6F.
* i386-dis-evex.h (evex_table): New entry for movrs.
* i386-dis.c (MOD_0F18_REG_4): New.
(PREFIX_EVEX_MAP5_6F_X86_64): Ditto.
(X86_64_0F388A): Ditto.
(X86_64_0F388B): Ditto.
(X86_64_EVEX_MAP5_6F): Ditto.
(three_byte_table): New entry for MOVRS.
(reg_table): Ditto.
(mod_table): Ditto.
(x86_64_table): Ditto. Also include i386-dis-evex-x86.h.
* i386-gen.c (cpu_flags): Add MOVRS.
* i386-init.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-opc.h (i386_cpu_flags): Add cpumovrs.
* i386-opc.tbl: Add MOVRS instrctions.
* i386-tbl.h: Regenerated.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
Co-authored-by: Lili Cui <lili.cui@intel.com>
|
|
When using MVexSIBMEM, OP_M will help check modrm. Thus, no need
to pass mod_table.
Since we have OP_M do the work, from now on, mod_table[] should
not gain any new entries, unless both slots of them are populated,
e.g., different modrm leading to different insns could not be
combined (Bad_Opcode is not the case since OP_M could handle that).
opcodes/ChangeLog:
* i386-dis.c: Remove mod_table pass for MVexSIBMEM.
|
|
This patch adds support for SME ZA-targeting non-widening BFloat16 instructions,
under tick FEAT_SME_B16B16 and command line flag "+sme-b16b16".
FEAT_SME_B16B16 implements FEAT_SME2 and FEAT_SVE_B16B16, in accordance with that
"+sme-b16b16" enables "+sme2" and "+sve-b16b16".
Also the test files related to FEAT_SME_B16B16 are prefixed with sme-b16b16*.
eg: sme-b16b16-1.s, sme-b16b16-1.d.
The spec for this feature and instructions is availabe here [1]:
[1]: https://developer.arm.com/documentation/ddi0602/2024-06/SME-Instructions?lang=en
|
|
This patch adds support for SME Z-targeting multi-vector non-widening
BFloat16 instructions, under tick FEAT_SVE_B16B16 and command line flag
"+sve-b16b16+sme2".
Also the test files related to FEAT_SVE_B16B16 (+sme2) are prefixed with
sve-b16b16-sme2*.
eg: sve-b16b16-sme2-1.s, sve-b16b16-sme2-1.d.
The spec for this feature and instructions is availabe here [1]:
[1]: https://developer.arm.com/documentation/ddi0602/2024-06/SME-Instructions?lang=en
|
|
In the current code, SVE2 Bfloat16 instructions are implemented with tick
FEAT_B16B16 and command line flag "+b16b16" and this feature was suspended
due to incomplete support.
In the new spec available here[1], FEAT_B16B16 is replaced with
FEAT_SVE_B16B16 and command line flag "+b16b16" is replace with "sve-b16b16".
Also the test files related to FEAT_SVE_B16B16 are prefixed with sve-b16b16*.
eg: sve-b16b16-sve2-1.s, sve-b16b16-sve2-1.d.
This patch supports the SVE Z-targeting non-widening BFloat16 instructions
with command line flag "+sve-b16b16+sve2".
[1]: https://developer.arm.com/documentation/ddi0602/2024-06/SVE-Instructions?lang=en
|
|
Add tests for this, and update the existing fvdotb and fvdott tests to
include the VGx4 symbol so that they continue to test for the intended
errors.
|
|
Rename to AARCH64_OPND_SME_ZT0_INDEX_MUL_VL.
|