|
Use NoSuf to replace No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf
and add the explicit NoSuf to AddrPrefixOpReg in templates.
* i386-opc.tbl (NoSuf): New macro.
(AddrPrefixOpReg): Remove No_?Suf.
Replace No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf with
NoSuf in templates.
Add NoSuf to AddrPrefixOpReg in templates.
|
|
2022-09-28 Tejas Joshi <TejasSanjay.Joshi@amd.com>
gas/
* config/tc-i386.c (cpu_arch): Add znver4 ARCH and rmpquery SUBARCH.
(md_assemble): Expand comment before swap_operands() with rmpquery.
* doc/c-i386.texi: Add znver4.
* testsuite/gas/i386/arch-14-1.d: New.
* testsuite/gas/i386/arch-14-1.s: New.
* testsuite/gas/i386/arch-14-znver4.d: New.
* testsuite/gas/i386/i386.exp: Add new znver4 test cases.
* testsuite/gas/i386/rmpquery.d: New.
* testsuite/gas/i386/rmpquery.s: New.
* testsuite/gas/i386/x86-64-arch-4-1.d: New.
* testsuite/gas/i386/x86-64-arch-4-1.s: New.
* testsuite/gas/i386/x86-64-arch-4-znver4.d: New.
opcodes/
* i386-dis.c (x86_64_table): Add rmpquery.
* i386-gen.c (cpu_flag_init): Add CPU_ZNVER4_FLAGS and
CPU_RMPQUERY_FLAGS.
(cpu_flags): Add CpuRMPQUERY.
* i386-opc.h (enum): Add CpuRMPQUERY.
(i386_cpu_flags): Add cpurmpquery.
* i386-opc.tbl: Add rmpquery insn.
* i386-init.h: Re-generated.
* i386-tbl.h: Re-generated.
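A minimal usage sketch (illustrative only, assuming x86-64 output; the new testcases
exercise the same mnemonic):
        .arch znver4
        rmpquery                        # no explicit operands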
|
|
Attributes which aren't used together in any single insn template can be
converted from individual booleans to a single enum, as was done for a few
other attributes before. This is more space efficient. Collect together
all attributes which express special operand constraints (and which fit
the criteria for folding).
|
|
The need for IsString on the PadLock insns went away with the
introduction of RepPrefixOk. Drop these leftovers.
|
|
gas/ChangeLog:
* NEWS: Support Intel RAO-INT.
* config/tc-i386.c: Add raoint.
* doc/c-i386.texi: Document .raoint.
* testsuite/gas/i386/i386.exp: Run RAO_INT tests.
* testsuite/gas/i386/raoint-intel.d: New test.
* testsuite/gas/i386/raoint.d: Ditto.
* testsuite/gas/i386/raoint.s: Ditto.
* testsuite/gas/i386/x86-64-raoint-intel.d: Ditto.
* testsuite/gas/i386/x86-64-raoint.d: Ditto.
* testsuite/gas/i386/x86-64-raoint.s: Ditto.
opcodes/ChangeLog:
* i386-dis.c (PREFIX_0F38FC): New.
(prefix_table): Add PREFIX_0F38FC.
* i386-gen.c (cpu_flag_init): Add CPU_RAO_INT_FLAGS and
CPU_ANY_RAO_INT_FLAGS.
* i386-init.h: Regenerated.
* i386-opc.h (CpuRAO_INT): New.
(i386_cpu_flags): Add cpuraoint.
* i386-opc.tbl: Add RAO_INT instructions.
* i386-tbl.h: Regenerated.
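An illustrative sketch of the new mnemonics in 64-bit code (not taken verbatim from the
testcases; directive per the new c-i386.texi entry):
        .arch .raoint
        aadd    %ecx, (%rdi)            # atomically add %ecx into the dword at (%rdi)
        axor    %rdx, (%rsi)            # atomically xor %rdx into the qword at (%rsi)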
|
|
gas/ChangeLog:
* NEWS: Support Intel AVX-NE-CONVERT.
* config/tc-i386.c: Add avx_ne_convert.
* doc/c-i386.texi: Document .avx_ne_convert.
* testsuite/gas/i386/i386.exp: Run AVX NE CONVERT tests.
* testsuite/gas/i386/avx-ne-convert-intel.d: New test.
* testsuite/gas/i386/avx-ne-convert.d: Ditto.
* testsuite/gas/i386/avx-ne-convert.s: Ditto.
* testsuite/gas/i386/x86-64-avx-ne-convert-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx-ne-convert.d: Ditto.
* testsuite/gas/i386/x86-64-avx-ne-convert.s: Ditto.
opcodes/ChangeLog:
* i386-dis.c (Mw): New.
(PREFIX_VEX_0F3872): Ditto.
(PREFIX_VEX_0F38B0_W_0): Ditto.
(PREFIX_VEX_0F38B1_W_0): Ditto.
(VEX_W_0F3872_P_1): Ditto.
(VEX_W_0F38B0): Ditto.
(VEX_W_0F38B1): Ditto.
(prefix_table): Add PREFIX_VEX_0F3872, PREFIX_VEX_0F38B0_W_0,
PREFIX_VEX_0F38B1_W_0.
(vex_w_table): Add VEX_W_0F3872_P_1, VEX_W_0F38B0, VEX_W_0F38B1.
* i386-gen.c (cpu_flag_init): Add CPU_AVX_NE_CONVERT_FLAGS and
CPU_ANY_AVX_NE_CONVERT_FLAGS.
(cpu_flags): Add CpuAVX_NE_CONVERT.
* i386-init.h: Regenerated.
* i386-opc.h (CpuAVX_NE_CONVERT): New.
(i386_cpu_flags): Add cpuavx_ne_convert.
* i386-opc.tbl: Add Intel AVX-NE-CONVERT instructions.
* i386-tbl.h: Regenerated.
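A rough usage sketch (illustrative only; operand forms assumed per the ISA documentation):
        .arch .avx_ne_convert
        vbcstnesh2ps    (%rax), %ymm2   # broadcast an FP16 element, converted to single
        vcvtneeph2ps    (%rcx), %xmm1   # convert even FP16 elements to single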
|
|
opcodes/
* i386-opc.tbl: Rename <xy> template for VEX insn with x/y suffix to <Vxy>.
Rename <xy> for EVEX insn with x/y suffix to <Exy>.
|
|
Prior to commit 1cb0ab18ad24 ("x86/Intel: restrict suffix derivation")
the Tbyte modifier on the FLDT and FSTPT templates was pointless, as
No_ldSuf would have prevented it being accepted. Due to the special
nature of LONG_DOUBLE_MNEM_SUFFIX, however, said commit has led to these
insns being accepted in Intel syntax mode even when "tbyte ptr" was
present. Restore original behavior by dropping Tbyte there. (Note that
these insns in principle should be marked AT&T syntax only, but since
they haven't been so far we probably shouldn't change that.)
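A sketch of the Intel-syntax behavior being restored (assumed, not verbatim from a testcase):
        .intel_syntax noprefix
        fld     tbyte ptr [eax]         # the usual Intel-syntax spelling
        fldt    tbyte ptr [eax]         # this combination is rejected again after the change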
|
|
gas/ChangeLog:
* NEWS: Support Intel MSRLIST.
* config/tc-i386.c: Add msrlist.
* doc/c-i386.texi: Document .msrlist.
* testsuite/gas/i386/i386.exp: Add MSRLIST tests.
* testsuite/gas/i386/msrlist-inval.l: New test.
* testsuite/gas/i386/msrlist-inval.s: Ditto.
* testsuite/gas/i386/x86-64-msrlist-intel.d: Ditto.
* testsuite/gas/i386/x86-64-msrlist.d: Ditto.
* testsuite/gas/i386/x86-64-msrlist.s: Ditto.
opcodes/ChangeLog:
* i386-dis.c (X86_64_0F01_REG_0_MOD_3_RM_6_P_1): New.
(X86_64_0F01_REG_0_MOD_3_RM_6_P_3): Ditto.
(prefix_table): New entry for msrlist.
(x86_64_table): Add X86_64_0F01_REG_0_MOD_3_RM_6_P_1
and X86_64_0F01_REG_0_MOD_3_RM_6_P_3.
* i386-gen.c (cpu_flag_init): Add CPU_MSRLIST_FLAGS
and CPU_ANY_MSRLIST_FLAGS.
* i386-init.h: Regenerated.
* i386-opc.h (CpuMSRLIST): New.
(i386_cpu_flags): Add cpumsrlist.
* i386-opc.tbl: Add MSRLIST instructions.
* i386-tbl.h: Regenerated.
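An illustrative sketch (64-bit only; both insns take their inputs in implicit registers):
        .arch .msrlist
        rdmsrlist
        wrmsrlist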
|
|
gas/ChangeLog:
* NEWS: Support Intel WRMSRNS.
* config/tc-i386.c: Add wrmsrns.
* doc/c-i386.texi: Document .wrmsrns.
* testsuite/gas/i386/i386.exp: Add WRMSRNS tests.
* testsuite/gas/i386/wrmsrns-intel.d: New test.
* testsuite/gas/i386/wrmsrns.d: Ditto.
* testsuite/gas/i386/wrmsrns.s: Ditto.
* testsuite/gas/i386/x86-64-wrmsrns-intel.d: Ditto.
* testsuite/gas/i386/x86-64-wrmsrns.d: Ditto.
opcodes/ChangeLog:
* i386-dis.c (PREFIX_0F01_REG_0_MOD_3_RM_6): New.
(prefix_table): Add PREFIX_0F01_REG_0_MOD_3_RM_6.
(rm_table): New entry for wrmsrns.
* i386-gen.c (cpu_flag_init): Add CPU_WRMSRNS_FLAGS
and CPU_ANY_WRMSRNS_FLAGS.
(cpu_flags): Add CpuWRMSRNS.
* i386-init.h: Regenerated.
* i386-opc.h (CpuWRMSRNS): New.
(i386_cpu_flags): Add cpuwrmsrns.
* i386-opc.tbl: Add WRMSRNS instructions.
* i386-tbl.h: Regenerated.
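An illustrative sketch (the insn takes no explicit operands, like WRMSR):
        .arch .wrmsrns
        wrmsrns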
|
|
gas/ChangeLog:
* NEWS: Support Intel CMPccXADD.
* config/tc-i386.c: Add cmpccxadd.
(build_modrm_byte): Handle the VEX.VVVV register being operand 0
when a memory operand is present.
* doc/c-i386.texi: Document .cmpccxadd.
* testsuite/gas/i386/i386.exp: Run CMPccXADD tests.
* testsuite/gas/i386/cmpccxadd-inval.s: New test.
* testsuite/gas/i386/cmpccxadd-inval.l: Ditto.
* testsuite/gas/i386/x86-64-cmpccxadd-intel.d: Ditto.
* testsuite/gas/i386/x86-64-cmpccxadd.s: Ditto.
* testsuite/gas/i386/x86-64-cmpccxadd.d: Ditto.
opcodes/ChangeLog:
* i386-dis.c (Mdq): New.
(X86_64_VEX_0F38E0): Ditto.
(X86_64_VEX_0F38E1): Ditto.
(X86_64_VEX_0F38E2): Ditto.
(X86_64_VEX_0F38E3): Ditto.
(X86_64_VEX_0F38E4): Ditto.
(X86_64_VEX_0F38E5): Ditto.
(X86_64_VEX_0F38E6): Ditto.
(X86_64_VEX_0F38E7): Ditto.
(X86_64_VEX_0F38E8): Ditto.
(X86_64_VEX_0F38E9): Ditto.
(X86_64_VEX_0F38EA): Ditto.
(X86_64_VEX_0F38EB): Ditto.
(X86_64_VEX_0F38EC): Ditto.
(X86_64_VEX_0F38ED): Ditto.
(X86_64_VEX_0F38EE): Ditto.
(X86_64_VEX_0F38EF): Ditto.
(x86_64_table): Add X86_64_VEX_0F38E0, X86_64_VEX_0F38E1,
X86_64_VEX_0F38E2, X86_64_VEX_0F38E3, X86_64_VEX_0F38E4,
X86_64_VEX_0F38E5, X86_64_VEX_0F38E6, X86_64_VEX_0F38E7,
X86_64_VEX_0F38E8, X86_64_VEX_0F38E9, X86_64_VEX_0F38EA,
X86_64_VEX_0F38EB, X86_64_VEX_0F38EC, X86_64_VEX_0F38ED,
X86_64_VEX_0F38EE, X86_64_VEX_0F38EF.
* i386-gen.c (cpu_flag_init): Add CPU_CMPCCXADD_FLAGS and
CPU_ANY_CMPCCXADD_FLAGS.
(cpu_flags): Add CpuCMPCCXADD.
* i386-init.h: Regenerated.
* i386-opc.h (CpuCMPCCXADD): New.
(i386_cpu_flags): Add cpucmpccxadd. Comment out the unused bitfield, as its width is now actually 0.
* i386-opc.tbl: Add Intel CMPccXADD instructions.
* i386-tbl.h: Regenerated.
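An illustrative sketch in AT&T syntax (64-bit only; directive name and operand order are
assumptions, following the usual VEX.vvvv conventions):
        .arch .cmpccxadd
        cmpbexadd       %ebx, %ecx, (%rdi)
        cmpoxadd        %rbx, %rcx, (%rdi)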
|
|
gas/
* NEWS: Support Intel AVX-VNNI-INT8.
* config/tc-i386.c: Add avx_vnni_int8.
* doc/c-i386.texi: Document avx_vnni_int8.
* testsuite/gas/i386/avx-vnni-int8-intel.d: New file.
* testsuite/gas/i386/avx-vnni-int8.d: Likewise.
* testsuite/gas/i386/avx-vnni-int8.s: Likewise.
* testsuite/gas/i386/x86-64-avx-vnni-int8-intel.d: Likewise.
* testsuite/gas/i386/x86-64-avx-vnni-int8.d: Likewise.
* testsuite/gas/i386/x86-64-avx-vnni-int8.s: Likewise.
* testsuite/gas/i386/i386.exp: Run AVX VNNI INT8 tests.
opcodes/
* i386-dis.c (PREFIX_VEX_0F3850): New.
(PREFIX_VEX_0F3851): Likewise.
(VEX_W_0F3850_P_0): Likewise.
(VEX_W_0F3850_P_1): Likewise.
(VEX_W_0F3850_P_2): Likewise.
(VEX_W_0F3850_P_3): Likewise.
(VEX_W_0F3851_P_0): Likewise.
(VEX_W_0F3851_P_1): Likewise.
(VEX_W_0F3851_P_2): Likewise.
(VEX_W_0F3851_P_3): Likewise.
(VEX_W_0F3850): Delete.
(VEX_W_0F3851): Likewise.
(prefix_table): Add PREFIX_VEX_0F3850 and PREFIX_VEX_0F3851.
(vex_table): Add PREFIX_VEX_0F3850 and PREFIX_VEX_0F3851,
delete VEX_W_0F3850 and VEX_W_0F3851.
(vex_w_table): Add VEX_W_0F3850_P_0, VEX_W_0F3850_P_1, VEX_W_0F3850_P_2
VEX_W_0F3850_P_3, VEX_W_0F3851_P_0, VEX_W_0F3851_P_1, VEX_W_0F3851_P_2
and VEX_W_0F3851_P_3, delete VEX_W_0F3850 and VEX_W_0F3851.
* i386-gen.c (cpu_flag_init): Add CPU_AVX_VNNI_INT8_FLAGS
and CPU_ANY_AVX_VNNI_INT8_FLAGS.
(cpu_flags): Add CpuAVX_VNNI_INT8.
* i386-opc.h (CpuAVX_VNNI_INT8): New.
* i386-opc.tbl: Add Intel AVX_VNNI_INT8 instructions.
* i386-init.h: Regenerated.
* i386-tbl.h: Likewise.
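An illustrative sketch of the new VEX-encoded dot-product insns (register and memory forms):
        .arch .avx_vnni_int8
        vpdpbssd        %ymm2, %ymm3, %ymm4
        vpdpbsud        (%rax), %xmm1, %xmm2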
|
|
x86: Support Intel AVX-IFMA
Intel AVX-IFMA instructions are marked with CpuVEX_PREFIX, which is
cleared by default. Without the {vex} pseudo prefix, these instructions
are encoded with the EVEX prefix; the {vex} pseudo prefix turns on VEX
encoding for them.
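For illustration (a sketch of the encoding selection described above):
        vpmadd52luq       %ymm2, %ymm3, %ymm4   # EVEX (AVX512_IFMA) encoding by default
        {vex} vpmadd52luq %ymm2, %ymm3, %ymm4   # VEX (AVX-IFMA) encoding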
gas/
* NEWS: Support Intel AVX-IFMA.
* config/tc-i386.c (cpu_arch): Add avx_ifma.
* doc/c-i386.texi: Document .avx_ifma.
* testsuite/gas/i386/avx-ifma.d: New file.
* testsuite/gas/i386/avx-ifma-intel.d: Likewise.
* testsuite/gas/i386/avx-ifma.s: Likewise.
* testsuite/gas/i386/x86-64-avx-ifma.d: Likewise.
* testsuite/gas/i386/x86-64-avx-ifma-intel.d: Likewise.
* testsuite/gas/i386/x86-64-avx-ifma.s: Likewise.
* testsuite/gas/i386/i386.exp: Run AVX IFMA tests.
opcodes/
* i386-dis.c (PREFIX_VEX_0F38B4): New.
(PREFIX_VEX_0F38B5): Likewise.
(VEX_W_0F38B4_P_2): Likewise.
(VEX_W_0F38B5_P_2): Likewise.
(prefix_table): Add PREFIX_VEX_0F38B4 and PREFIX_VEX_0F38B5.
(vex_table): Add VEX_W_0F38B4_P_2 and VEX_W_0F38B5_P_2.
* i386-dis-evex.h: Fold AVX512IFMA entries to AVX-IFMA.
* i386-gen.c (cpu_flag_init): Clear the CpuAVX_IFMA bit in
CPU_UNKNOWN_FLAGS. Add CPU_AVX_IFMA_FLAGS and
CPU_ANY_AVX_IFMA_FLAGS. Add CpuAVX_IFMA to CPU_AVX2_FLAGS.
(cpu_flags): Add CpuAVX_IFMA.
* i386-opc.h (CpuAVX_IFMA): New.
(i386_cpu_flags): Add cpuavx_ifma.
* i386-opc.tbl: Add Intel AVX IFMA instructions.
* i386-init.h: Regenerated.
* i386-tbl.h: Likewise.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
|
|
gas/ChangeLog:
* NEWS: Add support for Intel PREFETCHI instruction.
* config/tc-i386.c (load_insn_p): Use prefetch* to fold all prefetches.
(md_assemble): Add warning for illegal input of PREFETCHI.
* doc/c-i386.texi: Document .prefetchi.
* testsuite/gas/i386/i386.exp: Run PREFETCHI tests.
* testsuite/gas/i386/x86-64-lfence-load.d: Add PREFETCHI.
* testsuite/gas/i386/x86-64-lfence-load.s: Likewise.
* testsuite/gas/i386/x86-64-prefetch.d: New test.
* testsuite/gas/i386/x86-64-prefetchi-intel.d: Likewise.
* testsuite/gas/i386/x86-64-prefetchi-inval-register.d: Likewise.
* testsuite/gas/i386/x86-64-prefetchi-inval-register.s: Likewise.
* testsuite/gas/i386/x86-64-prefetchi-warn.l: Likewise.
* testsuite/gas/i386/x86-64-prefetchi-warn.s: Likewise.
* testsuite/gas/i386/x86-64-prefetchi.d: Likewise.
* testsuite/gas/i386/x86-64-prefetchi.s: Likewise.
opcodes/ChangeLog:
* i386-dis.c (reg_table): Add MOD_0F18_REG_6 and MOD_0F18_REG_7.
(x86_64_table): Add X86_64_0F18_REG_6_MOD_0 and X86_64_0F18_REG_7_MOD_0.
(mod_table): Add MOD_0F18_REG_6 and MOD_0F18_REG_7.
(prefix_table): Add PREFIX_0F18_REG_6_MOD_0_X86_64 and
PREFIX_0F18_REG_7_MOD_0_X86_64.
(PREFETCHI_Fixup): New.
* i386-gen.c (cpu_flag_init): Add CPU_PREFETCHI_FLAGS.
(cpu_flags): Add CpuPREFETCHI.
* i386-opc.h (CpuPREFETCHI): New.
(i386_cpu_flags): Add cpuprefetchi.
* i386-opc.tbl: Add Intel PREFETCHI instructions.
* i386-init.h: Regenerated.
* i386-tbl.h: Likewise.
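An illustrative sketch (64-bit only; the operand is expected to be RIP-relative, otherwise
gas warns per the new md_assemble() check; the symbol name is arbitrary):
        prefetchit0     target(%rip)
        prefetchit1     target(%rip)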
|
|
gas/
* NEWS: Add support for Intel AMX-FP16 instruction.
* config/tc-i386.c: Add amx_fp16.
* doc/c-i386.texi: Document .amx_fp16.
* testsuite/gas/i386/i386.exp: Add AMX-FP16 tests.
* testsuite/gas/i386/x86-64-amx-fp16-intel.d: New test.
* testsuite/gas/i386/x86-64-amx-fp16.d: Likewise.
* testsuite/gas/i386/x86-64-amx-fp16.s: Likewise.
* testsuite/gas/i386/x86-64-amx-fp16-bad.d: Likewise.
* testsuite/gas/i386/x86-64-amx-fp16-bad.s: Likewise.
opcodes/
* i386-dis.c (MOD_VEX_0F385C_X86_64_P_3_W_0): New.
(VEX_LEN_0F385C_X86_64_P_3_W_0_M_0): Likewise.
(VEX_W_0F385C_X86_64_P_3): Likewise.
(prefix_table): Add VEX_W_0F385C_X86_64_P_3.
(vex_len_table): Add VEX_LEN_0F385C_X86_64_P_3_W_0_M_0.
(vex_w_table): Add VEX_W_0F385C_X86_64_P_3.
(mod_table): Add MOD_VEX_0F385C_X86_64_P_3_W_0.
* i386-gen.c (cpu_flag_init): Add CPU_AMX_FP16_FLAGS.
(CPU_ANY_AMX_TILE_FLAGS): Add CpuAMX_FP16.
(cpu_flags): Add CpuAMX_FP16.
* i386-opc.h (enum): Add CpuAMX_FP16.
(i386_cpu_flags): Add cpuamx_fp16.
* i386-opc.tbl: Add Intel AMX-FP16 instruction.
* i386-init.h: Regenerate.
* i386-tbl.h: Likewise.
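An illustrative sketch (tile register operands only, 64-bit mode):
        .arch .amx_fp16
        tdpfp16ps       %tmm1, %tmm2, %tmm3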
|
|
By putting the templates after their AVX512 counterparts, the AVX512
flavors will be picked by default. That way the need to always use {vex}
ceases to exist once respective CPU features (AVX512-VNNI or AVX512VL as
a whole) have been disabled. This way the need for the PseudoVexPrefix
attribute also disappears.
|
|
While in some cases deriving an AT&T-style suffix from an Intel syntax
memory operand size specifier is necessary, in many cases this is not
only pointless, but has led to the introduction of various workarounds:
Excessive use of IgnoreSize and NoRex64 as well as the ToDword and
ToQword attributes. Suppress suffix derivation when we can clearly tell
that the memory operand's size isn't going to be needed to infer the
possible need for the low byte/word opcode bit or an operand size prefix
(0x66 or REX.W).
As a result ToDword and ToQword can be dropped entirely, plus a fair
number of IgnoreSize and NoRex64 can also be got rid of. Note that
IgnoreSize needs to remain on legacy encoded SIMD insns with GPR
operand, to avoid emitting an operand size prefix in 16-bit mode. (Since
16-bit code using SIMD insns isn't well tested, clone an existing
testcase just enough to cover a few insns which are potentially
problematic but are being touched here.)
Note that while folding the VCVT{,T}S{S,D}2SI templates, VCVT{,T}SH2SI
isn't included there. This is to fulfill the request of not allowing L
and Q suffixes there, despite the inconsistency with VCVT{,T}S{S,D}2SI.
|
|
Now that we can purge templates, let's use this to improve readability a
little by shortening a few of their names, making functionally similar
ones also have identical names in their multiple incarnations.
|
|
Many of the vector conversion insns come with X/Y/Z suffixed forms, for
disambiguation purposes in AT&T syntax. All of these groups follow
certain patterns. Introduce "xy" and "xyz" templates to reduce
redundancy.
To facilitate using a uniform name for both AVX and AVX512, further
introduce a means to purge a previously defined template: A standalone
<name> will be recognized to have this effect.
Note that in the course of the conversion VFPCLASSPH is properly split
to separate AT&T and Intel syntax forms, matching VFPCLASSP{S,D} and
yielding the intended "ambiguous operand size" diagnostic in Intel mode.
|
|
Many of the vector integer insns come in byte/word element pairs. Most
of these pairs follow certain encoding patterns. Introduce a "bw"
template to reduce redundancy.
Note that in the course of the conversion
- the AVX VPEXTRW template which is not being touched needs to remain
ahead of the new "combined" ones, as (a) this should be tried first
when matching insns against templates and (b) its Load attribute
requires it to be first,
- this adds a benign/meaningless IgnoreSize attribute to the memory form
of PEXTRB; it didn't seem worth avoiding this.
|
|
The AVX2 gather ones are nicely grouped - do the same for the various
AVX512 scatter/gather ones. On the moved lines also convert EVex=<n> to
EVex<N>.
|
|
Many of the vector integer insns come in dword/qword element pairs. Most
of these pairs follow certain encoding patterns. Introduce a "dq"
template to reduce redundancy.
Note that in the course of the conversion
- a few otherwise untouched templates are moved, so they end up next to
their siblings,
- drop an unhelpful Cpu64 from the GPR form of VPBROADCASTQ, matching
what we already have for KMOVQ - the diagnostic is better this way for
insns with multiple forms (i.e. the same Cpu64 attributes on {,V}MOVQ,
{,V}PEXTRQ, and {,V}PINSRQ are useful to keep),
- this adds benign/meaningless IgnoreSize attributes to the GPR forms of
KMOVD and VPBROADCASTD; it didn't seem worth avoiding this.
|
|
The vast majority of vector FP insns comes in single/double pairs. Many
pairs follow certain encoding patterns. Introduce an "sd" template to
reduce redundancy. Similarly, to further cover similarities between
AVX512F and AVX512-FP16, introduce an "sdh" template.
For element-size Disp8 shift generalize i386-gen's broadcast size
determination, allowing Disp8MemShift to be specified without an operand
in the affected templated templates. While doing the adjustment also
eliminate an unhelpful (lost information) diagnostic combined with a use
after free in what is now get_element_size().
Note that in the course of the conversion
- the AVX512F form of VMOVUPD has a stray (leftover) Load attribute
dropped,
- VMOVSH has a benign IgnoreSize added (the attribute is still strictly
necessary for VMOVSD, and necessary for VMOVSS as long as we permit
strange combinations like "-march=i286+avx"),
- VFPCLASSPH is properly split to separate AT&T and Intel syntax forms,
matching VFPCLASSP{S,D}.
|
|
This reverts commit 384f368958f2a5bb083660e58e5f8a010e6ad429, which
broke i386-gen's emitting of diagnostics. As a replacement to address
the original issue of newer gcc no longer splicing lines when dropping
the line continuation backslashes, switch to using + as the line
continuation character, doing the line splicing in i386-gen.
|
|
It is unclear to me why the corresponding MOV (no Q suffix) can be
issued without REX.W, but MOVQ has to have that prefix (bit). Add
NoRex64 and in exchange drop Size64.
|
|
The non-SSE2AVX form of the SIMD variant of the instruction needlessly
has the (still multi-purpose) IgnoreSize attribute. All other similar
SSE2 insns use NoRex64 instead. Make this consistent, noting that the
SSE2AVX form can't have the same change made - there the memory operand
doesn't at the same time permit RegXMM (which the logic uses when deciding
whether a Q suffix is okay outside of 64-bit mode).
|
|
While the other three variants each differ in attributes and hence can't
be folded, these two pairs actually can be (and were previously
overlooked). This effectively matches their AVX512VL counterparts, which
are also expressed as a single template.
|
|
While the x/y/z suffix isn't necessary to use in this case, it is still
odd that these forms don't support broadcast (unlike their AVX512F /
AVX512DQ counterparts). The lack thereof can e.g. make macro-ized
programming more difficult.
|
|
One more place where pre-existing templates should have been taken as a
basis: In Intel syntax we want to consistently issue an "ambiguous
operand size" error when a size-less memory operand is specified for an
insn where register use alone isn't sufficient for disambiguation.
|
|
Just like all Size64 insns are marked Cpu64, all Size32 insns ought to
be marked Cpu386.
|
|
First of all rename the meanwhile misleading Opcode_SIMD_FloatD, as it
has also been used for KMOV* and BNDMOV. Then simplify the condition
selecting which form if "reversing" to use - except for the MOV to/from
control/debug/test registers all extended opcode space insns use bit 0
(rather than bit 1) to indicate the direction (from/to memory) of an
operation. With that, D can simply be set on the first of the two
templates, while the other can be dropped.
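For illustration, MOVAPS from the extended (0f) opcode space follows exactly this pattern:
        movaps  (%rax), %xmm0   # 0f 28 /r - memory-to-register, opcode bit 0 clear
        movaps  %xmm0, (%rax)   # 0f 29 /r - register-to-memory, opcode bit 0 set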
|
|
By mistake it was permitted to be used from the very introduction of XOP
support.
|
|
Without it in 16-bit mode a pointless operand size prefix would be
emitted.
|
|
It's entirely unclear why some of the KeyLocker insns had NoRex64 on
them - there's nothing here which could cause emission of REX.W (except
of course a user-specified "rex.w", which we ought to honor anyway).
|
|
A standalone (without SAE) StaticRounding attribute is meaningless, and
indeed all other similar insns have ATTSyntax there instead. I can only
assume this was some strange copy-and-paste mistake.
|
|
I clearly screwed up in 6ff00b5e12e7 ("x86/Intel: correct permitted
operand sizes for AVX512 scatter/gather") giving all AVX512F scatter
insns Dword element size. Update testcases (also their gather parts),
utilizing that there previously were two identical lines each (for no
apparent reason).
|
|
Both forms were missing VexW0 (thus allowing Evex.W=1 to be encoded by
suitable means, which would cause #UD). The memory operand form further
was using the wrong Masking value, thus allowing zeroing-masking to be
encoded for the store form (which would again cause #UD).
|
|
This once again allows reducing redundancy in (and the size of) the opcode
table.
Don't go as far as also making D work on the two 5-operand XOP insns:
This would significantly complicate the code, as there the first
(immediate) operand would need special treatment in several places.
Note that the .s suffix isn't being enabled to have any effect, as it
is deprecated. Neither the {load} nor the {store} pseudo prefix makes
sense here either, as the respective operands are inputs (loads) only
anyway, regardless of order. Hence there is (as before) no way for the
programmer to request the alternative encoding to be used for register-
only insns.
Note further that it is always the first original template which is
retained (and altered), to make sure the same encoding as before is
used for register-only insns. This has the slightly odd (but pre-
existing) effect of XOP register-only insns having XOP.W clear, but FMA4
ones having VEX.W set.
|
|
The only case where 64-bit code uses non-sign-extended (can also be
considered zero-extended) displacements is when an address size override
is in place for a memory operand (i.e. particularly excluding
displacements of direct branches, which - if at all - are controlled by
operand size, and then are still sign-extended, just from 16 bits).
Hence the distinction in templates is unnecessary, allowing code to be
simplified in a number of places. The only place where logic becomes
more complicated is when signed-ness of relocations is determined in
output_disp().
The other caveat is that Disp64 cannot be specified anymore in an insn
template at the same time as Disp32. Unlike for non-64-bit mode,
templates don't specify displacements for both possible addressing
modes; the necessary adjustment to the expected ones has already been
done in match_template() anyway (but of course the logic there needs
tweaking now). Hence the single template so far doing so is split.
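For illustration (64-bit code; encodings as assumed above):
        movl    0x12345678(%rax), %ecx  # plain disp32, sign-extended to 64 bits
        movl    0x12345678(%eax), %ecx  # 0x67 address size override; 32-bit address arithmetic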
|
|
Presumably this being there was a result of taking CALL as a reference
when adding the RTM insns. But with No_qSuf the attribute has no effect.
|
|
As a preparatory step to allowing proper non-operand forms of specifying
embedded rounding / SAE, convert the internal representation to non-
operand form. While retaining properties (and in a few cases perhaps
providing more meaningful diagnostics), this means doing away with a few
hundred standalone templates, thus - as a nice side effect - reducing
memory consumption / cache occupancy.
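For reference, the source-level (operand-like) syntax this machinery backs looks like the
following sketch:
        vaddps  {rn-sae}, %zmm1, %zmm2, %zmm3   # embedded rounding (round to nearest)
        vmaxps  {sae}, %zmm1, %zmm2, %zmm3      # SAE only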
|
|
This also was mistakenly flagged as Evex.128.
|
|
These were mistakenly flagged as Evex.128. Getting the LLIG status right
for insns allowing for SAE is a prereq for planned further work.
|
|
Like VFPCLASSP{S,D} it has only a single operand allowing multiple
sizes, hence there are no pairs of operands to check for consistent
size.
|
|
By slightly relaxing the checking in operand_type_register_match() we
can fold the vector shift insns with an XMM source as well. While
strictly speaking an overlap in just one size (see the code comment) is
not enough (both operands could have multiple sizes with just a single
common one), this is good enough for all templates we have, or which
could sensibly / usefully appear (within the scope of the present
operand matching model).
Tightening this a little would be possible, but would require broadcast
related information to be passed into the function.
|
|
They have only a single operand allowing multiple sizes, hence there are
no pairs of operands to check for consistent size.
|
|
Like for AVX512VL we can make the handling of operand sizes a little
more flexible to allow reducing the number of templates we have.
|
|
To avoid issues like that addressed by 6e3e5c9e4181 ("x86: extend SSE
check to PCLMULQDQ, AES, and GFNI insns"), base the check on opcode
attributes and operand types.
|
|
With the introduction of CpuPOPCNT the NoAVX attribute has become
meaningless for POPCNT.
|
|
As already indicated in a remark when introducing these templates, the
"commutative" attribute is ignored for legacy encoding templates. Hence
it is possible to shorten a number of templates by specifying C directly
rather than through a template parameter. I think this helps readability
a bit.
|