diff options
author | Hu, Lin1 <lin1.hu@intel.com> | 2025-01-14 10:30:38 +0800 |
---|---|---|
committer | Haochen Jiang <haochen.jiang@intel.com> | 2025-01-14 10:30:38 +0800 |
commit | b7267244a3556fd3cf3219ebc9f890b065150bef (patch) | |
tree | 0aac70c352ed9e7d2597a4b31fca835ef462a448 /gas/doc | |
parent | b41ab42df1ed8cfda4f3dd83f001dd693ede7ec1 (diff) | |
download | binutils-b7267244a3556fd3cf3219ebc9f890b065150bef.zip binutils-b7267244a3556fd3cf3219ebc9f890b065150bef.tar.gz binutils-b7267244a3556fd3cf3219ebc9f890b065150bef.tar.bz2 |
Support Intel AMX-MOVRS
This patch will support AMX-MOVRS feature. Unlike all the other
AMX insns in vector space where we pass vex_len_table before
vex_w_table, we first pass vex_w_table for tileloaddrs[,t1] to
align with the order in EVEX space. The reason why we first pass
vex_w_table in EVEX space is due to AMX-AVX512, where tcvtrowd2ps
and tilemovrow with r32 shares the same opcode with tileloaddrs[,t1].
All of them have evex.w = 0 but with different evex.length. Re-doing
that shortly is not ideal.
APX_F extension is also implemented in this patch. The encoding will
be:
- EVEX.128.NP/66.MAP5.W0 F8/F9 !(11):rrr:100 for
T2RPNTLVW[Z0,Z1]RS[,T1] with NF=0.
- EVEX.128.F2/66.0F38.W0 4A !(11):rrr:100 FOR TILELOADDRS[,T1] with
NF=0.
For APX_F extension, we could not use APX_F(AMX_TRANSPOSE&AMX_MOVRS)
since the transformation could not be done. Instead, we will use
AMX_TRANSPOSE & APX_F(AMX_MOVRS). Thus, we should set AMX_TRANSPOSE
for "any" for cpu_flags in assembler. Since it will only affect the
cpu_flags_match, handle that there.
gas/ChangeLog:
* config/tc-i386.c (cpu_arch): Add amx_movrs.
(cpu_flags_match): Set any bitfield for multiple cpuid
enabled insns.
* doc/c-i386.texi: Document .amx_movrs.
* testsuite/gas/i386/x86-64.exp: Run AMX-MOVRS tests.
* testsuite/gas/i386/x86-64-amx-movrs-intel.d: New test.
* testsuite/gas/i386/x86-64-amx-movrs-inval.l: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs-inval.s: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs.d: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-len.h (EVEX_LEN_0F384A_X86_64_W_0): New.
* i386-dis-evex-w.h (EVEX_W_0F384A_X86_64): Ditto.
* i386-dis-evex-x86-64.h (X86_64_EVEX_0F384A): Ditto.
* i386-dis-evex.h: New entry for AMX-MOVRS.
* i386-dis.c:
(PREFIX_VEX_0F384A_X86_64_L_0_W_0): New.
(PREFIX_VEX_MAP5_F8_X86_64_L_0_W_0): Ditto.
(PREFIX_VEX_MAP5_F9_X86_64_L_0_W_0): Ditto.
(X86_64_VEX_0F384A): Ditto.
(X86_64_VEX_MAP5_F8): Ditto.
(X86_64_VEX_MAP5_F9): Ditto.
(X86_64_EVEX_0F384A): Ditto.
(VEX_LEN_0F384A_X86_64_W_0): Ditto.
(VEX_LEN_MAP5_F8_X86_64): Ditto.
(VEX_LEN_MAP5_F9_X86_64): Ditto.
(EVEX_LEN_0F384A_X86_64_W_0): Ditto.
(VEX_W_0F384A_X86_64): Ditto.
(VEX_W_MAP5_F8_X86_64): Ditto.
(VEX_W_MAP5_F9_X86_64): Ditto.
(EVEX_W_0F384A_X86_64): Ditto.
(prefix_table): New entry for AMX-MOVRS.
(x86_64_table): Ditto.
(vex_len_table): Ditto.
(vex_w_table): Ditto.
(map5_f8_opcode): New.
(map5_f9_opcode): Ditto.
(get_valid_dis386): Handle VEX_MAP5 opcode for AMX-MOVRS.
* i386-gen.c (isa_dependencies): Add AMX_MOVRS.
(cpu_flags): Ditto.
* i386-init.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-opc.h (CpuAMX_MOVRS): New.
(i386_cpu_flags): Add cpuamx_movrs.
* i386-opc.tbl: Add AMX-MOVRS instructions.
* i386-tbl.h: Regenerated.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
Diffstat (limited to 'gas/doc')
-rw-r--r-- | gas/doc/c-i386.texi | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi index e104a82..7b8b514 100644 --- a/gas/doc/c-i386.texi +++ b/gas/doc/c-i386.texi @@ -232,6 +232,7 @@ accept various extension mnemonics. For example, @code{amx_transpose}, @code{amx_tf32}, @code{amx_fp8} +@code{amx_movrs}, @code{amx_tile}, @code{vmx}, @code{vmfunc}, @@ -1707,7 +1708,7 @@ supported on the CPU specified. The choices for @var{cpu_type} are: @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk} @item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} @item @samp{.amx_complex} @tab @samp{.amx_transpose} @tab @samp{.amx_tf32} -@item @samp{.amx_fp8} @tab @samp{.amx_tile} +@item @samp{.amx_fp8} @tab @samp{.amx_movrs} @tab @samp{.amx_tile} @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset} @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5} @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} |