aboutsummaryrefslogtreecommitdiff
path: root/gas/doc
diff options
context:
space:
mode:
authorHu, Lin1 <lin1.hu@intel.com>2025-01-14 10:30:38 +0800
committerHaochen Jiang <haochen.jiang@intel.com>2025-01-14 10:30:38 +0800
commitb7267244a3556fd3cf3219ebc9f890b065150bef (patch)
tree0aac70c352ed9e7d2597a4b31fca835ef462a448 /gas/doc
parentb41ab42df1ed8cfda4f3dd83f001dd693ede7ec1 (diff)
downloadbinutils-b7267244a3556fd3cf3219ebc9f890b065150bef.zip
binutils-b7267244a3556fd3cf3219ebc9f890b065150bef.tar.gz
binutils-b7267244a3556fd3cf3219ebc9f890b065150bef.tar.bz2
Support Intel AMX-MOVRS
This patch will support AMX-MOVRS feature. Unlike all the other AMX insns in vector space where we pass vex_len_table before vex_w_table, we first pass vex_w_table for tileloaddrs[,t1] to align with the order in EVEX space. The reason why we first pass vex_w_table in EVEX space is due to AMX-AVX512, where tcvtrowd2ps and tilemovrow with r32 shares the same opcode with tileloaddrs[,t1]. All of them have evex.w = 0 but with different evex.length. Re-doing that shortly is not ideal. APX_F extension is also implemented in this patch. The encoding will be: - EVEX.128.NP/66.MAP5.W0 F8/F9 !(11):rrr:100 for T2RPNTLVW[Z0,Z1]RS[,T1] with NF=0. - EVEX.128.F2/66.0F38.W0 4A !(11):rrr:100 FOR TILELOADDRS[,T1] with NF=0. For APX_F extension, we could not use APX_F(AMX_TRANSPOSE&AMX_MOVRS) since the transformation could not be done. Instead, we will use AMX_TRANSPOSE & APX_F(AMX_MOVRS). Thus, we should set AMX_TRANSPOSE for "any" for cpu_flags in assembler. Since it will only affect the cpu_flags_match, handle that there. gas/ChangeLog: * config/tc-i386.c (cpu_arch): Add amx_movrs. (cpu_flags_match): Set any bitfield for multiple cpuid enabled insns. * doc/c-i386.texi: Document .amx_movrs. * testsuite/gas/i386/x86-64.exp: Run AMX-MOVRS tests. * testsuite/gas/i386/x86-64-amx-movrs-intel.d: New test. * testsuite/gas/i386/x86-64-amx-movrs-inval.l: Ditto. * testsuite/gas/i386/x86-64-amx-movrs-inval.s: Ditto. * testsuite/gas/i386/x86-64-amx-movrs.d: Ditto. * testsuite/gas/i386/x86-64-amx-movrs.s: Ditto. opcodes/ChangeLog: * i386-dis-evex-len.h (EVEX_LEN_0F384A_X86_64_W_0): New. * i386-dis-evex-w.h (EVEX_W_0F384A_X86_64): Ditto. * i386-dis-evex-x86-64.h (X86_64_EVEX_0F384A): Ditto. * i386-dis-evex.h: New entry for AMX-MOVRS. * i386-dis.c: (PREFIX_VEX_0F384A_X86_64_L_0_W_0): New. (PREFIX_VEX_MAP5_F8_X86_64_L_0_W_0): Ditto. (PREFIX_VEX_MAP5_F9_X86_64_L_0_W_0): Ditto. (X86_64_VEX_0F384A): Ditto. (X86_64_VEX_MAP5_F8): Ditto. (X86_64_VEX_MAP5_F9): Ditto. (X86_64_EVEX_0F384A): Ditto. (VEX_LEN_0F384A_X86_64_W_0): Ditto. (VEX_LEN_MAP5_F8_X86_64): Ditto. (VEX_LEN_MAP5_F9_X86_64): Ditto. (EVEX_LEN_0F384A_X86_64_W_0): Ditto. (VEX_W_0F384A_X86_64): Ditto. (VEX_W_MAP5_F8_X86_64): Ditto. (VEX_W_MAP5_F9_X86_64): Ditto. (EVEX_W_0F384A_X86_64): Ditto. (prefix_table): New entry for AMX-MOVRS. (x86_64_table): Ditto. (vex_len_table): Ditto. (vex_w_table): Ditto. (map5_f8_opcode): New. (map5_f9_opcode): Ditto. (get_valid_dis386): Handle VEX_MAP5 opcode for AMX-MOVRS. * i386-gen.c (isa_dependencies): Add AMX_MOVRS. (cpu_flags): Ditto. * i386-init.h: Regenerated. * i386-mnem.h: Ditto. * i386-opc.h (CpuAMX_MOVRS): New. (i386_cpu_flags): Add cpuamx_movrs. * i386-opc.tbl: Add AMX-MOVRS instructions. * i386-tbl.h: Regenerated. Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
Diffstat (limited to 'gas/doc')
-rw-r--r--gas/doc/c-i386.texi3
1 files changed, 2 insertions, 1 deletions
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index e104a82..7b8b514 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -232,6 +232,7 @@ accept various extension mnemonics. For example,
@code{amx_transpose},
@code{amx_tf32},
@code{amx_fp8}
+@code{amx_movrs},
@code{amx_tile},
@code{vmx},
@code{vmfunc},
@@ -1707,7 +1708,7 @@ supported on the CPU specified. The choices for @var{cpu_type} are:
@item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk}
@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16}
@item @samp{.amx_complex} @tab @samp{.amx_transpose} @tab @samp{.amx_tf32}
-@item @samp{.amx_fp8} @tab @samp{.amx_tile}
+@item @samp{.amx_fp8} @tab @samp{.amx_movrs} @tab @samp{.amx_tile}
@item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset}
@item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5}
@item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme}