Age | Commit message (Collapse) | Author | Files | Lines |
|
Vector index registers are currently only used in the VRV instruction
format. Unlike general purpose index registers an operand value of
zero (e.g. %v0, 0, or omitted) does not imply a zero value:
"For VRV format instructions, a vector element is used in the formation
of the intermediate value. This vector element is an unsigned binary
integer value that is added to the base address and 12-bit displacement
to form a 64-bit intermediate sum. The vector element is designated by
a vector register and an element index. A zero V field accesses the
element in vector register zero and does not imply a zero value." [1]
Therefore when using s390-specific assembler option "-mwarn-areg-zero"
do not warn if vector index register 0 is specified.
[1]: IBM z/Architecture Principles of Operation, SA22-7832-13, IBM z16,
https://publibfp.dhe.ibm.com/epubs/pdf/a227832d.pdf
gas/
* config/tc-s390.c (md_gather_operands): Do not warn about
vector index register 0.
gas/testsuite/
* gas/s390/zarch-warn-areg-zero.l (vgef): Do not expect warning
about vector index register 0.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
|
|
Vector index registers are currently only used in the VRV instruction
format. Unlike general purpose index registers an operand value of
zero (e.g. %v0, 0, or omitted) does not imply a zero value:
"For VRV format instructions, a vector element is used in the formation
of the intermediate value. This vector element is an unsigned binary
integer value that is added to the base address and 12-bit displacement
to form a 64-bit intermediate sum. The vector element is designated by
a vector register and an element index. A zero V field accesses the
element in vector register zero and does not imply a zero value." [1]
Therefore do not omit vector index register 0 in disassembly, that is
disassemble D(VX,B) with VX=0 as D(VX,B) instead of D(B). Also do not
disassemble index register 0 as "0", that is disassemble D(VX,B) with
VX=0 as D(%v0,B) instead of D(0,B). Note that a base register 0 still
still gets disassembled as "0", that is D(VX,B) with B=0 disassembles
into D(VX,0).
[1]: IBM z/Architecture Principles of Operation, SA22-7832-13, IBM z16,
https://publibfp.dhe.ibm.com/epubs/pdf/a227832d.pdf
opcodes/
* s390-dis.c (s390_print_insn_with_opcode): Do not omit vector
index register 0 in disassembly. Disassemble it as %v0.
gas/testsuite/
* gas/s390/zarch-base-index-0.d (vgef): Expect vector index
register 0 in disassembly.
* gas/s390/zarch-omitted-base-index.d (vgef): Likewise.
Suggested-by: Florian Krohm <flo2030@eich-krohm.de>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
|
|
This complements commit aacf780bca29 ("s390: Allow to explicitly omit
base register operand in assembly").
gas/testsuite/
* gas/s390/zarch-warn-areg-zero.l: Add tests for omitted base
register operands.
* gas/s390/zarch-warn-areg-zero.s: Likewise.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
|
|
Simplify the .machine directive parsing logic, so that cpu_string is
always xstrdup'd and can therefore always be xfree'd before returning
to the caller.
This resolves the following memory leak reported by ASAN:
Direct leak of 13 byte(s) in 3 object(s) allocated from:
#0 0x3ff8aafbb1d in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
#1 0x2aa338861cf in xmalloc ../../libiberty/xmalloc.c:149
#2 0x2aa338868ff in xstrdup ../../libiberty/xstrdup.c:34
#3 0x2aa320253cb in s390_machine ../../gas/config/tc-s390.c:2172
#4 0x2aa31fddc7b in read_a_source_file ../../gas/read.c:1293
#5 0x2aa31f4f7bf in perform_an_assembly_pass ../../gas/as.c:1223
#6 0x2aa31f4f7bf in main ../../gas/as.c:1436
#7 0x3ff8a02be35 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#8 0x3ff8a02bf33 in __libc_start_main_impl ../csu/libc-start.c:360
#9 0x2aa31f5758f (/home/jremus/git/binutils/build-asan/gas/as-new+0x2d5758f) (BuildId: ...)
While at it add tests with double quoted .machine
"<cpu>[+<extension>...]" values.
gas/
* config/tc-s390.c (s390_machine): Simplify parsing and free
cpu_string before returning.
gas/testsuite/
* gas/s390/machine-parsing-1.l: Add tests with double quoted
values.
* gas/s390/machine-parsing-1.s: Likewise.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
|
|
This resolves the following memory leak reported by ASAN:
Direct leak of 17 byte(s) in 1 object(s) allocated from:
#0 0x3ffb32fbb1d in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
#1 0x2aa149861cf in xmalloc ../../libiberty/xmalloc.c:149
#2 0x2aa149868ff in xstrdup ../../libiberty/xstrdup.c:34
#3 0x2aa1312391f in s390_machinemode ../../gas/config/tc-s390.c:2241
#4 0x2aa130ddc7b in read_a_source_file ../../gas/read.c:1293
#5 0x2aa1304f7bf in perform_an_assembly_pass ../../gas/as.c:1223
#6 0x2aa1304f7bf in main ../../gas/as.c:1436
#7 0x3ffb282be35 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#8 0x3ffb282bf33 in __libc_start_main_impl ../csu/libc-start.c:360
#9 0x2aa1305758f (/home/jremus/git/binutils/build-asan/gas/as-new+0x2d5758f) (BuildId: ...)
gas/
* config/tc-s390.c (s390_machinemode): Free mode_string before
returning.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
|
|
Reviewed-By Richard Earnshaw <richard.earnshaw@arm.com>
|
|
If legacy-encoded arithmetic insns are eligible for @gotpcrel
relaxation, EVEX-encoded ones ought to be, too.
Further anything that MOV-from-memory can be used for (and transformed
from) should then also extend to MOVRS.
While extending the apx-load* testcases add -mrelax-relocations=yes to
the two ones which were missing this: Without this option the intended
testing would not occur on configurations defaulting the option to off.
|
|
opcodes/
* ppc-opc.c (insert_m2, extract_m2): New functions.
(AESM, PGF1, XX2M, XX3M, XX3GF, XX2AES_MASK, XX2AESM_MASK,
XX3AES_MASK, XX3AESM_MASK, XX3GF_MASK): New macros.
(UIM): Update for new macros.
(powerpc_opcodes): Add xxaes128encp, xxaes192encp, xxaes256encp,
xxaesencp, xxaes128decp, xxaes192decp, xxaes256decp, xxaesdecp,
xxaes128genlkp, xxaes192genlkp, xxaes256genlkp, xxaesgenlkp,
xxgfmul128gcm, xxgfmul128xts, xxgfmul128.
gas/
* testsuite/gas/ppc/future.s: New test.
* testsuite/gas/ppc/future.d: Likewise.
|
|
PR gas/32579
The deprecated .s (swapped operand encoding) functionality got in the
way of properly recognizing this specific form. Move the Solaris-
specific code ahead of that.
|
|
I have missed @tab for .gmiccs and .padlockphe2, so fix this doc error.
gas/ChangeLog:
* doc/c-i386.texi: Add the missing @tab for .gmiccs and
.padlockphe2
|
|
On casual reading of older gcc configure scripts it might be supposed
that the test for gas string merge support tries with %progbits after
a fail on ARM with @progbits. It doesn't succeed due to a bug. So to
support building of older gcc's for ARM without users having to edit
gcc sources, add a hack to gas. The hack can disappear in a few years
when building older gcc's likely requires other work too.
I've changed the docs to reflect what we actually allow for .section
syntax prior to this patch. (No way should this hack be documented as
allowed!)
PR 32491
* config/obj-elf.c (obj_elf_section): Allow missing entsize
for ARM gcc configure bug.
* doc/as.texi: Correct syntax of ELF .section directive.
* testsuite/gas/elf/string.s,
* testsuite/gas/elf/string.d: Test it.
|
|
|
|
|
|
Commit af3394d97a8c5187085c0eec5fb03e8da88db5fb allowed sections
declared with "S" (SHF_STRING) to specify the entity size, but then
would warn if the entity size was omitted, as with the old syntax.
Unfortunately, since specifying the entity size is incompatible with
binutils 2.43 or earlier, this makes it impossible to specify a
strings section in source code without generating an assembly warning
(the new syntax isn't supported in older assemblers and the old syntax
generates warnings).
Nevertheless, the old code was wrong in that it did not set the entity
size at all, in contravention of the ELF specification (though to date
there are no known cases where this mattered outside of mergeable
sections).
Fix this by permitting the original syntax without a warning again,
but by defaulting the entity size to 1. This is compatible with the
most common case of strings being byte-based.
Added some tests for the various flavours of declaration that we
support.
|
|
This modifies _bfd_elf_free_cached_info to unmap/free section
contents. To do that we need to *not* free sections where contents
are bfd_alloc'd or point to constant strings or somesuch. I've chosen
to implement this be adding another flag to struct bfd_section,
"alloced" to say the section contents can't be freed. Most of the
patch is about setting that flag in many places.
|
|
|
|
|
|
Many FEAT_SVE2p1 instructions need to be enabled by either of two
different features (one for streaming mode, and one for non-streaming
mode). This patch adds correct gating conditions for these
instructions.
There were also a few sve2p1 instructions missing altogether, so add
those as well.
The testsuite is modified to check for all alternative enablement
conditions. In many cases this is done by adding an alternative
assembler commands to existing test files. For some SME/SME2 tests,
only some of the instructions are enabled by +sve2p1, so these are
copied into a separate test. For original SVE2p1 tests, the non-SME2p1
instructions have been moved to a separate test file.
There are also new tests for the newly added instructions. These
include a couple of fixme comments relating to bad error reporting,
which should be investigated later.
|
|
There are separate CPUID feature bits for SM2 and CCS instructions.
CCS is the acronym of Chinese Cipher System, it includes SM3 and SM4
instructions. This patch adds CpuGMISM2 and CpuGMICCS to replace CpuGMI on
corresponding instructions.
gas/ChangeLog:
* config/tc-i386.c: Add gmism2 and gmiccs to replace gmi.
* doc/c-i386.texi: Ditto.
opcodes/ChangeLog:
* i386-gen.c: Add GMISM2 and GMICCS to replace GMI.
* i386-opc.h (enum i386_cpu): Add CpuGMISM2 and CpuGMICCS to
replace CpuGMI.
* i386-opc.tbl: Replace GMI with GMISM2 on sm2 instruction. Replace GMI
with GMICCS on sm3 and sm4 instructions.
* i386-tbl.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-init.h: Ditto.
|
|
cpu_flags_match() is a hot path. Move the special casing that
b7267244a355 ("Support Intel AMX-MOVRS") added there to i386-gen, thus
affecting only build time performance.
|
|
Deriving operand size may no longer assume 512-bit vector size when
embedded rounding is in use. In fact it was apparently wrong to do so
in the first place, as that's not correct for scalar insns. Drop the
rounding type check altogether; we fall back to EVEX.LIG when no
suitable operand was specified anyway, later in the function (and, btw,
similarly for VEX encodings).
|
|
The ".quad with division (fwdref)" gas test fails with asan warning
negation of -9223372036854775808 cannot be represented in type 'long int'
Fix this and another similar case.
* symbols.c (resolve_symbol_value): Cast "left" to valueT
before negating.
|
|
|
|
|
|
This change is to make tail conform with software guarded jump of Zicfilp. The
reason to not choose t1 as the label register is that t1 is also as .got.plt
offset of _dl_runtime_resolve in PLT.
See more: https://github.com/riscv-non-isa/riscv-asm-manual/pull/93
|
|
https://github.com/riscv/riscv-cfi/releases/tag/v1.0
This patch only support the CFI instructions and CSR in assembler.
|
|
https://github.com/riscv/riscv-control-transfer-records/releases/tag/v1.0
The privileged spec v1.10 already removed the sfence.vm instruction, and the
encoding of sfence.vm instruction is overlapped with the sctrclr instruction
of ssctr/smctr. But since the privileged spec v1.10 already removed the
sfence.vm, and we no longer support the privileged spec v1.9.1 for now, we
had to remove the sfence.vm.
bfd/
* elfxx-riscv.c (riscv_implicit_subsets): Imply zicsr for ssctr/smctr.
(riscv_supported_std_s_ext): Added ssctr/smctr with version 1.0.
(riscv_multi_subset_supports): Handle INSN_CLASS for ssctr/smctr.
(riscv_multi_subset_supports_ext): Likewise.
gas/
* config/tc-riscv.c (enum riscv_csr_class, riscv_csr_address):
Added and handle CSR_CLASS_SSCTR and CSR_CLASS_SMCTR.
(riscv_is_priv_insn): Removed SFENCE_VM check.
* testsuite/gas/riscv/attribute-14e.d: Removed since sfence.vm is no
longer supported since privileged spec v1.10.
* testsuite/gas/riscv/attribute-14.s: Likewise.
* testsuite/gas/riscv/csr-version-1p10.d: Updated for ssctr/smctr CSRs.
* testsuite/gas/riscv/csr-version-1p10.l: Likewise.
* testsuite/gas/riscv/csr-version-1p11.d: Likewise.
* testsuite/gas/riscv/csr-version-1p11.l: Likewise.
* testsuite/gas/riscv/csr-version-1p12.d: Likewise.
* testsuite/gas/riscv/csr-version-1p12.l: Likewise.
* testsuite/gas/riscv/csr.s: Likewise.
* testsuite/gas/riscv/csr-dw-regnums.d: Likewise.
* testsuite/gas/riscv/csr-dw-regnums.s: Likewise.
* testsuite/gas/riscv/march-help.l: Updated for ssctr/smctr.
* testsuite/gas/riscv/smctr-ssctr.d: New testcase for sctr instruction.
* testsuite/gas/riscv/smctr-ssctr.s: Likewise.
include/
* opcode/riscv-opc.h: Added encoding macro for sctrclr, but removed
encoding macro for sfence.vm since encoding conflict. Added CSR
numbers for ssctr/smctr CSRs.
* opcode/riscv.h (enum riscv_insn_class): Added
INSN_CLASS_SMCTR_OR_SSCTR for sctrclr.
opcodes/
* riscv-opc.c (riscv_opcodes): Added sctrclr, but removed sfence.vm
since encoding conflict.
|
|
of reporting bad for disassembler
According to SDM, vcvt[,u]si2sd under r32 and vcvt[,u]dq2pd treat
Rounding as Ignored when trying to using them. Thus, disassembler
should accept bytecode with rounding instead of reporting bad.
For assembler, it needs some more time to decide how to deal
with that.
gas/ChangeLog:
* testsuite/gas/i386/evex.d: Add new testcase for vcvt[,u]dq2pd.
Change the output for vcvt[,u]si2sd.
* testsuite/gas/i386/evex.s: Ditto.
* testsuite/gas/i386/x86-64-evex.d: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-w.h: Add EXxEVexR64 for vcvt[,u]dq2pd.
* i386-dis.c (OP_Rounding): Mark EVEX_b as used to change the handle
for ignored rounding.
|
|
The CPUID EDX bit[26] indicates its enablement, and it includes REP
XSHA384 and REP XSHA512.
gas/ChangeLog:
* NEWS: Support Zhaoxin PadLock PHE2 instructions.
* config/tc-i386.c (add_branch_prefix_frag_p): Don't add prefix to
PadLockPHE2 instructions.
(output_insn): Handle PadLockPHE2 instructions.
* doc/c-i386.texi: Document PadLockPHE2.
* testsuite/gas/i386/i386.exp: Add PadLockPHE2 test.
* testsuite/gas/i386/padlock_phe2.d: Ditto.
* testsuite/gas/i386/padlock_phe2.s: Ditto.
opcodes/ChangeLog:
* i386-dis.c: Add PadLockPHE2.
* i386-gen.c: Ditto
* i386-opc.h (CpuPadLockPHE2): New.
* i386-opc.tbl: Add Zhaoxin PadLock PHE2 instructions.
* i386-tbl.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-init.h: Ditto.
|
|
* config/tc-ppc.c (ppc_machine): Free cpu_string.
|
|
This adds the section to HANDLE_ALIGN args, so that the frag created
by the ppc backend can be properly allocated on the frag obstack.
I've added an extra param to frag_alloc too, for cases where we know
the frag requires at least some bytes in fr_literal. This simplifies
some existing code, for example in compress_debug and relax_segment.
In the case of the relax_segment code, I think we may have had a bug
there in using obstack_blank_fast, which doesn't check that the frag
has room.
* config/tc-ppc.c (ppc_handle_align): Add section param,
use frag obstack to allocate frag.
* config/tc-ppc.h (HANDLE_ALIGN, ppc_handle_align): Add extra
param.
* config/tc-aarch64.h (HANDLE_ALIGN): Add extra param.
* config/tc-alpha.h: Likewise.
* config/tc-arc.h: Likewise.
* config/tc-arm.h: Likewise.
* config/tc-avr.h: Likewise.
* config/tc-epiphany.h: Likewise.
* config/tc-frv.h: Likewise.
* config/tc-i386.h: Likewise.
* config/tc-ia64.h: Likewise.
* config/tc-kvx.h: Likewise.
* config/tc-loongarch.h: Likewise.
* config/tc-m32c.h: Likewise.
* config/tc-m32r.h: Likewise.
* config/tc-metag.h: Likewise.
* config/tc-mips.h: Likewise.
* config/tc-mn10300.h: Likewise.
* config/tc-nds32.h: Likewise.
* config/tc-riscv.h: Likewise.
* config/tc-rl78.h: Likewise.
* config/tc-rx.h: Likewise.
* config/tc-sh.h: Likewise.
* config/tc-sparc.h: Likewise.
* config/tc-spu.h: Likewise.
* config/tc-tilegx.h: Likewise.
* config/tc-tilepro.h: Likewise.
* config/tc-v850.h: Likewise.
* config/tc-visium.h: Likewise.
* config/tc-wasm32.h: Likewise.
* config/tc-xtensa.h: Likewise.
* frags.h (frag_alloc): Update prototype.
* frags.c (frag_alloc): Add extra size param, allocate extra.
(frag_new): Update.
* subsegs.c (subseg_set_rest): Update frag_alloc call.
* write.c: Formatting.
(cvt_frag_to_fill): Pass sec to HANDLE_ALIGN.
(compress_frag): Update frag_alloc call.
(compress_debug): Use new frag_alloc to simplify frag sizing.
(relax_segment): Likewise.
|
|
Today, SFrame v2 specification does not describe how to encode the
information corresponding to the PAuth_LR PAC signing method (it only
supports PAuth PAC signing method).
SFrame v3 specification should hopefully specify it.
In the meantime, if the GNU assembler finds .cfi_negate_ra_state_with_pc
and --gsframe is specified, it will output a warning to the user and
will fail to generate the FDE entry.
A new SFrame test for .cfi_negate_ra_state_with_pc is also added to
reflect this issue.
Approved-by: Indu Bhagat <indu.bhagat@oracle.com>
|
|
This patch adds a new CFI directive (cfi_negate_ra_state_with_pc) which
set an additional bit in the RA state to inform that RA was signed with
SP but also PC as an additional diversifier.
RA state | Description
0b00 | Return address not signed (default if no cfi_negate_ra_state*)
0b01 | Return address signed with SP (cfi_negate_ra_state)
0b10 | Invalid state
0b11 | Return address signed with SP+PC (cfi_negate_ra_state_with_pc)
Approved-by: Indu Bhagat <indu.bhagat@oracle.com>
Approved-by: Jan Beulich <jbeulich@suse.com>
|
|
ARMv8.3 addded support for a new security feature named Pointer
Authentication. Support for this feature in SFrame already exists.
In GCC 14 and older, the Sparc DWARF extension .cfi_gnu_window_save
is emitted instead of .cfi_negate_ra_state.
GCC 15 fixed this issue, but this behavior is preserved for backward
compatibility.
The existing sframe test for AArch64 PAC was using .cfi_gnu_window_save.
This patch replaces this CFI in the existing test by the preferred one,
and adds a new test to check for backward compatibility when using
.cfi_gnu_window_save.
Approved-by: Indu Bhagat <indu.bhagat@oracle.com>
|
|
- add a detailed comment when parsing DW_CFA_GNU_window_save in SFrame to
explain why we are checking whether the targeted architecture is AArch64,
whereas this CFI is a Sparc extension.
- replace .cfi_gnu_window_save by .cfi_negate_ra_state in existing AArch64
DWARF tests as this is the preferred directive since GCC 15.
- add a new AArch64 test to check backward compatibility with old GCC
versions that emits .cfi_gnu_window_save.
Approved-by: Indu Bhagat <indu.bhagat@oracle.com>
Approved-by: Richard Earnshaw <richard.earnshaw@arm.com>
|
|
NE is quite ambiguous and misleading in mnemonics since it should be
Rounding to Nearest Even, but could be mis-interpretated to No
Exception.
Under its correct meaning, which means rounding, it should only be used
in down-convert, since up-convert is always exact for normal values
It could be difficult to judge which kind of convert it is if we have
the convert between same bit float types.
For all AI data types including BF16 and FP8, the default rounding is
Rounding to Nearest Even. So removing them in mnemonics would reduce
burden for programmers to consider whether it should be added or not
in mnemonics and stop the ambiguous meaning on "NE" itself.
If the convert itself is using a rounding mode other than RNE, it would
be explicitly added in mnemonics (e.g., Long used "T" and "BIAS"
introduced in AVX10.2).
gas/ChangeLog:
* testsuite/gas/i386/avx10_2-256-cvt-intel.d: Refine testcases
according to mnemonics change.
* testsuite/gas/i386/avx10_2-256-cvt.d: Ditto.
* testsuite/gas/i386/avx10_2-256-cvt.s: Ditto.
* testsuite/gas/i386/avx10_2-256-satcvt-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-256-satcvt.d: Ditto.
* testsuite/gas/i386/avx10_2-256-satcvt.s: Ditto.
* testsuite/gas/i386/avx10_2-512-cvt-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-512-cvt.d: Ditto.
* testsuite/gas/i386/avx10_2-512-cvt.s: Ditto.
* testsuite/gas/i386/avx10_2-512-satcvt-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-512-satcvt.d: Ditto.
* testsuite/gas/i386/avx10_2-512-satcvt.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-satcvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-satcvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-satcvt.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-cvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-cvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-cvt.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-satcvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-satcvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-satcvt.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-prefix.h: Remove ne in mnemonics for
convert insns.
* i386-opc.tbl: Ditto.
* i386-mnem.h: Regenerated.
* i386-tbl.h: Ditto.
|
|
The functionality for VCOMSBF16 is exactly the same as the VCOMISD/S/H.
The only difference is the bf16 type. Thus, it should be VCOMISBF16.
This patch would fix that.
gas/ChangeLog:
* testsuite/gas/i386/avx10_2-256-bf16-intel.d: Refine testcase
according to mnemonics change.
* testsuite/gas/i386/avx10_2-256-bf16.d: Ditto.
* testsuite/gas/i386/avx10_2-256-bf16.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-prefix.h: Rename VCOMSBF16 to VCOMISBF16.
* i386-opc.tbl: Ditto.
* i386-mnem.h: Regenerated.
* i386-tbl.h: Ditto.
|
|
Since the bf16 is an AI data types, it will be implicitly packed. Thus,
"P" (for packed) is omitted in mnemonics from its introduction. AVX10.2
BF16 arithmetic insns are introduced with "P" in mnemonics with packed.
This patch will remove them for consistency.
NE is quite ambiguous and misleading in mnemonics since it should be
Rounding to Nearest Even, but could be mis-interpretated to No
Exception. While AI data types like BF16 and FP8 are using Rounding to
Nearest Even as default rounding modes. There is no need to use the
ambiguous mnemonics in AVX10.2 insns. This patch will also remove them.
For convert insns, it will be handled in the upcoming patch.
gas/ChangeLog:
* testsuite/gas/i386/avx10_2-256-bf16-intel.d: Refine testcase
according to new mnemonics.
* testsuite/gas/i386/avx10_2-256-bf16.d: Ditto.
* testsuite/gas/i386/avx10_2-256-bf16.s: Ditto.
* testsuite/gas/i386/avx10_2-256-miscs-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-256-miscs.d: Ditto.
* testsuite/gas/i386/avx10_2-256-miscs.s: Ditto.
* testsuite/gas/i386/avx10_2-512-bf16-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-512-bf16.d: Ditto.
* testsuite/gas/i386/avx10_2-512-bf16.s: Ditto.
* testsuite/gas/i386/avx10_2-512-miscs-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-512-miscs.d: Ditto.
* testsuite/gas/i386/avx10_2-512-miscs.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-bf16.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-miscs-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-miscs.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-miscs.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-bf16-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-bf16.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-bf16.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-miscs-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-miscs.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-miscs.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-prefix.h: Remove p and ne in bf16 mnemonics.
* i386-opc.tbl: Ditto.
* i386-mnem.h: Regenerated.
* i386-tbl.h: Ditto.
|
|
This patch will support AMX-AVX512. In disassmbler, we pull out all
GPR mode out of the vex length switch to make it more general.
gas/ChangeLog:
* NEWS: Mention the full support on DMR AMX ISAs.
* config/tc-i386.c: Add amx_avx512.
* doc/c-i386.texi: Document .amx_avx512.
* testsuite/gas/i386/x86-64.exp: Run AMX-AVX512 tests.
* testsuite/gas/i386/x86-64-amx-avx512-intel.d: New test.
* testsuite/gas/i386/x86-64-amx-avx512.d: Ditto.
* testsuite/gas/i386/x86-64-amx-avx512.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-len.h: Add EVEX_LEN_0F384A_X86_64_W_0,
EVEX_LEN_0F386D_X86_64_W_0, EVEX_LEN_0F3A07_X86_64_W_0,
EVEX_LEN_0F3A77_X86_64_W_0.
* i386-dis-evex-prefix.h: Add PREFIX_EVEX_0F384A_W_0_L_2,
PREFIX_EVEX_0F386D_W_0_L_2, PREFIX_EVEX_0F3A07_W_0_L_2,
PREFIX_EVEX_0F3A77_W_0_L_2.
* i386-dis-evex-w.h: Add EVEX_W_0F384A_X86_64, EVEX_W_0F386D_X86_64,
EVEX_W_0F3A07_X86_64, EVEX_W_0F3A77_X86_64.
* i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F384A, X86_64_EVEX_0F386D,
X86_64_EVEX_0F3A07, X86_64_EVEX_0F3A77.
* i386-dis-evex.h: Ditto.
* i386-dis.c (EVEX_LEN_0F384A_X86_64_W_0): New.
(EVEX_LEN_0F386D_X86_64_W_0): Ditto.
(EVEX_LEN_0F3A07_X86_64_W_0): Ditto.
(EVEX_LEN_0F3A77_X86_64_W_0): Ditto.
(MOD_EVEX_0F384A_X86_64_W_0): Ditto.
(MOD_EVEX_0F386D_X86_64_W_0): Ditto.
(MOD_EVEX_0F3A07_X86_64_W_0): Ditto.
(MOD_EVEX_0F3A77_X86_64_W_0): Ditto.
(PREFIX_EVEX_0F384A_W_0_L_2): Ditto.
(PREFIX_EVEX_0F386D_W_0_L_2): Ditto.
(PREFIX_EVEX_0F3A07_W_0_L_2): Ditto.
(PREFIX_EVEX_0F3A77_W_0_L_2): Ditto.
(EVEX_W_0F384A_X86_64): Ditto.
(EVEX_W_0F386D_X86_64): Ditto.
(EVEX_W_0F3A07_X86_64): Ditto.
(EVEX_W_0F3A77_X86_64): Ditto.
(X86_64_EVEX_0F384A): Ditto.
(X86_64_EVEX_0F386D): Ditto.
(X86_64_EVEX_0F3A07): Ditto.
(X86_64_EVEX_0F3A77): Ditto.
(OP_VEX): Pull out all GPR mode out of the vector length switch.
* i386-gen.c (isa_dependencies): Add AMX-AVX512.
(cpu_flags): Ditto.
* i386-init.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-opc.h (CpuAMX_AVX512): New.
(i386_cpu_flags): Add cpuamx_avx512.
* i386-opc.tbl: Add AMX-AVX512 instructions.
* i386-tbl.h: Regenerated.
|
|
This patch will support AMX-MOVRS feature. Unlike all the other
AMX insns in vector space where we pass vex_len_table before
vex_w_table, we first pass vex_w_table for tileloaddrs[,t1] to
align with the order in EVEX space. The reason why we first pass
vex_w_table in EVEX space is due to AMX-AVX512, where tcvtrowd2ps
and tilemovrow with r32 shares the same opcode with tileloaddrs[,t1].
All of them have evex.w = 0 but with different evex.length. Re-doing
that shortly is not ideal.
APX_F extension is also implemented in this patch. The encoding will
be:
- EVEX.128.NP/66.MAP5.W0 F8/F9 !(11):rrr:100 for
T2RPNTLVW[Z0,Z1]RS[,T1] with NF=0.
- EVEX.128.F2/66.0F38.W0 4A !(11):rrr:100 FOR TILELOADDRS[,T1] with
NF=0.
For APX_F extension, we could not use APX_F(AMX_TRANSPOSE&AMX_MOVRS)
since the transformation could not be done. Instead, we will use
AMX_TRANSPOSE & APX_F(AMX_MOVRS). Thus, we should set AMX_TRANSPOSE
for "any" for cpu_flags in assembler. Since it will only affect the
cpu_flags_match, handle that there.
gas/ChangeLog:
* config/tc-i386.c (cpu_arch): Add amx_movrs.
(cpu_flags_match): Set any bitfield for multiple cpuid
enabled insns.
* doc/c-i386.texi: Document .amx_movrs.
* testsuite/gas/i386/x86-64.exp: Run AMX-MOVRS tests.
* testsuite/gas/i386/x86-64-amx-movrs-intel.d: New test.
* testsuite/gas/i386/x86-64-amx-movrs-inval.l: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs-inval.s: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs.d: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-len.h (EVEX_LEN_0F384A_X86_64_W_0): New.
* i386-dis-evex-w.h (EVEX_W_0F384A_X86_64): Ditto.
* i386-dis-evex-x86-64.h (X86_64_EVEX_0F384A): Ditto.
* i386-dis-evex.h: New entry for AMX-MOVRS.
* i386-dis.c:
(PREFIX_VEX_0F384A_X86_64_L_0_W_0): New.
(PREFIX_VEX_MAP5_F8_X86_64_L_0_W_0): Ditto.
(PREFIX_VEX_MAP5_F9_X86_64_L_0_W_0): Ditto.
(X86_64_VEX_0F384A): Ditto.
(X86_64_VEX_MAP5_F8): Ditto.
(X86_64_VEX_MAP5_F9): Ditto.
(X86_64_EVEX_0F384A): Ditto.
(VEX_LEN_0F384A_X86_64_W_0): Ditto.
(VEX_LEN_MAP5_F8_X86_64): Ditto.
(VEX_LEN_MAP5_F9_X86_64): Ditto.
(EVEX_LEN_0F384A_X86_64_W_0): Ditto.
(VEX_W_0F384A_X86_64): Ditto.
(VEX_W_MAP5_F8_X86_64): Ditto.
(VEX_W_MAP5_F9_X86_64): Ditto.
(EVEX_W_0F384A_X86_64): Ditto.
(prefix_table): New entry for AMX-MOVRS.
(x86_64_table): Ditto.
(vex_len_table): Ditto.
(vex_w_table): Ditto.
(map5_f8_opcode): New.
(map5_f9_opcode): Ditto.
(get_valid_dis386): Handle VEX_MAP5 opcode for AMX-MOVRS.
* i386-gen.c (isa_dependencies): Add AMX_MOVRS.
(cpu_flags): Ditto.
* i386-init.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-opc.h (CpuAMX_MOVRS): New.
(i386_cpu_flags): Add cpuamx_movrs.
* i386-opc.tbl: Add AMX-MOVRS instructions.
* i386-tbl.h: Regenerated.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
|
|
This patch focus on supporting MOVRS ISA. We could take this full ISA
as four part: PREFETCHRST2, MOVRS, MOVRS APX_F extension and MOVRS AVX10.2
extension.
The APX_F extension for MOVRS will be:
- EVEX.LLZ.NP.MAP4.WIG 8A !(11):rrr:bbb for r8/m8 with NF=0 and
ND=0
- EVEX.LLZ.NP/66.MAP4.SCALABLE 8B !(11):rrr:bbb for rv/mv with NF=0
and ND=0
We did not merge the table together for APX_F since there is an explicit
x64 for movrs insn. The current APX_F() did not support the combination
between CPUIDs. Also, the space is different for legacy and apx_f forms.
gas/ChangeLog:
* NEWS: Support Intel MOVRS.
* config/tc-i386.c: Add MOVRS.
* doc/c-i386.texi: Document .movrs.
* testsuite/gas/i386/i386.exp: Run MOVRS tests.
* testsuite/gas/i386/x86-64.exp: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: Add MOVRS
tests.
* testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted.d: Ditto.
* testsuite/gas/i386/x86-64-apx-evex-promoted.s: Ditto.
* testsuite/gas/i386/lfence-load.d: Add prefetchrst2.
* testsuite/gas/i386/lfence-load.s: Ditto.
* testsuite/gas/i386/nops-8.d: Ditto.
* testsuite/gas/i386/prefetch-intel.d: Ditto.
* testsuite/gas/i386/prefetch.d: Ditto.
* testsuite/gas/i386/x86-64-lfence-load.d: Ditto.
* testsuite/gas/i386/x86-64-lfence-load.s: Ditto.
* testsuite/gas/i386/x86-64-prefetch-intel.d: Ditto.
* testsuite/gas/i386/x86-64-prefetch.d: Ditto.
* testsuite/gas/i386/movrs-intel.d: New test.
* testsuite/gas/i386/movrs-inval.l: Ditto.
* testsuite/gas/i386/movrs-inval.s: Ditto.
* testsuite/gas/i386/movrs.d: Ditto.
* testsuite/gas/i386/movrs.s: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-256-intel.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-256.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-256.s: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-512-intel.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-512.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-avx10_2-512.s: Ditto.
* testsuite/gas/i386/x86-64-movrs-intel.d: Ditto.
* testsuite/gas/i386/x86-64-movrs.d: Ditto.
* testsuite/gas/i386/x86-64-movrs.s: Ditto.
* testsuite/gas/i386/x86-64-movrs-intel-suffix.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-suffix.d: Ditto.
* testsuite/gas/i386/x86-64-movrs-suffix.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-prefix.h: Add PREFIX_EVEX_MAP5_6F_X86_64.
* i386-dis-evex-x86.h: Add X86_64_EVEX_MAP5_6F.
* i386-dis-evex.h (evex_table): New entry for movrs.
* i386-dis.c (MOD_0F18_REG_4): New.
(PREFIX_EVEX_MAP5_6F_X86_64): Ditto.
(X86_64_0F388A): Ditto.
(X86_64_0F388B): Ditto.
(X86_64_EVEX_MAP5_6F): Ditto.
(three_byte_table): New entry for MOVRS.
(reg_table): Ditto.
(mod_table): Ditto.
(x86_64_table): Ditto. Also include i386-dis-evex-x86.h.
* i386-gen.c (cpu_flags): Add MOVRS.
* i386-init.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-opc.h (i386_cpu_flags): Add cpumovrs.
* i386-opc.tbl: Add MOVRS instrctions.
* i386-tbl.h: Regenerated.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
Co-authored-by: Lili Cui <lili.cui@intel.com>
|
|
PR ld/12291
PR ld/12430
PR ld/13298
* config/tc-h8300.c (h8300_elf_section): Handle .gnu_object_only
section.
|
|
Link with mixed IR/non-IR objects
* 2 kinds of object files
o non-IR object file has
* non-IR sections
o IR object file has
* IR sections
* non-IR sections
* The output of "ld -r" with mixed IR/non-IR objects should work with:
o Compilers/linkers with IR support.
o Compilers/linkers without IR support.
* Add the mixed object file which has
o IR sections
o non-IR sections:
* Object codes from IR sections.
* Object codes from non-IR object files.
o Object-only section:
* With section name ".gnu_object_only" and SHT_GNU_OBJECT_ONLY type
on ELF:
https://gitlab.com/x86-psABIs/Linux-ABI
#define SHT_GNU_OBJECT_ONLY 0x6ffffff8 /* Object only */
* Contain non-IR object file.
* Input is discarded after link.
* Linker action:
o Classify each input object file:
* If there is a ".gnu_object_only" section, it is a mixed object file.
* If there is a IR section, it is an IR object file.
* Otherwise, it is a non-IR object file.
o Relocatable non-IR link:
* Prepare for an object-only output.
* Prepare for a regular output.
* For each mixed object file:
* Add IR and non-IR sections to the regular output.
* For object-only section:
* Extract object only file.
* Add it to the object-only output.
* Discard object-only section.
* For each IR object file:
* Add IR and non-IR sections to the regular output.
* For each non-IR object file:
* Add non-IR sections to the regular output.
* Add non-IR sections to the object-only output.
* Final output:
* If there are IR objects, non-IR objects and the object-only
output isn't empty:
* Put the object-only output into the object-only section.
* Add the object-only section to the regular output.
* Remove the object-only output.
o Normal link and relocatable IR link:
* Prepare for output.
* IR link:
* For each mixed object file:
* Compile and add IR sections to the output.
* Discard non-IR sections.
* Object-only section:
* Extract object only file.
* Add it to the output.
* Discard object-only section.
* For each IR object file:
* Compile and add IR sections to the output.
* Discard non-IR sections.
* For each non-IR object file:
* Add non-IR sections to the output.
* Non-IR link:
* For each mixed object file:
* Add non-IR sections to the output.
* Discard IR sections and object-only section.
* For each IR object file:
* Add non-IR sections to the output.
* Discard IR sections.
* For each non-IR object file:
* Add non-IR sections to the output.
This is useful for Linux kernel build with LTO.
bfd/
PR ld/12291
PR ld/12430
PR ld/13298
* bfd.c (bfd_lto_object_type): Add lto_mixed_object.
(bfd): Add object_only_section.
(bfd_group_signature): New.
* elf.c (special_sections_g): Add .gnu_object_only.
* format.c: Include "plugin-api.h" and "plugin.h" if
BFD_SUPPORTS_PLUGINS is defined.
(bfd_set_lto_type): Set type to lto_mixed_object for
GNU_OBJECT_ONLY_SECTION_NAME section.
(bfd_check_format_matches): Don't check the plugin target twice
if the plugin target is explicitly specified.
* opncls.c (bfd_extract_object_only_section): New.
* plugin.c (bfd_plugin_fake_text_section): New.
(bfd_plugin_fake_data_section): Likewise.
(bfd_plugin_fake_bss_section): Likewise.
(bfd_plugin_fake_common_section): Likewise.
(bfd_plugin_get_symbols_in_object_only): Likewise.
* plugin.c (add_symbols): Call
bfd_plugin_get_symbols_in_object_only and count
plugin_data->object_only_nsyms.
(bfd_plugin_get_symtab_upper_bound): Count
plugin_data->object_only_nsyms.
bfd_plugin_get_symbols_in_object_only and add symbols from
object only section.
(bfd_plugin_canonicalize_symtab): Remove fake_section,
fake_data_section, fake_bss_section and fake_common_section.
Set udata.p to NULL. Use bfd_plugin_fake_text_section,
bfd_plugin_fake_data_section, bfd_plugin_fake_bss_section and
bfd_plugin_fake_common_section.
Set udata.p to NULL.
* plugin.h (plugin_data_struct): Add object_only_nsyms and
object_only_syms.
* section.c (GNU_OBJECT_ONLY_SECTION_NAME): New.
* bfd-in2.h: Regenerated.
binutils/
PR ld/12291
PR ld/12430
PR ld/13298
* objcopy.c (group_signature): Removed.
(is_strip_section): Replace group_signature with
bfd_group_signature.
(setup_section): Likewise.
* readelf.c (get_os_specific_section_type_name): Handle
SHT_GNU_OBJECT_ONLY.
gas/
PR ld/12291
PR ld/12430
PR ld/13298
* testsuite/gas/elf/section9.s: Add the .gnu_object_only test.
* testsuite/gas/elf/section9.d: Updated.
include/
PR ld/12291
PR ld/12430
PR ld/13298
* elf/common.h (SHT_GNU_OBJECT_ONLY): New.
ld/
PR ld/12291
PR ld/12430
PR ld/13298
* ld.h (ld_config_type): Add emit_gnu_object_only and
emitting_gnu_object_only.
* ldelf.c (orphan_init_done): Make it file scope.
(ldelf_place_orphan): Rename hold to orig_hold. Initialize hold
from orig_hold at run-time.
(ldelf_finish): New.
* ldelf.h (ldelf_finish): New.
* ldexp.c (ldexp_init): Take a bfd_boolean argument to supprt
object-only output.
(ldexp_finish): Likewise.
* ldexp.h (ldexp_init): Take a bfd_boolean argument.
(ldexp_finish): Likewise.
* ldfile.c (ldfile_try_open_bfd): Call
cmdline_check_object_only_section.
* ldlang.c: Include "ldwrite.h" and elf-bfd.h.
* ldlang.c (cmdline_object_only_file_list): New.
(cmdline_object_only_archive_list): Likewise.
(cmdline_temp_object_only_list): Likewise.
(cmdline_lists_init): Likewise.
(cmdline_list_new): Likewise.
(cmdline_list_append): Likewise.
(print_cmdline_list): Likewise.
(cmdline_on_object_only_archive_list_p): Likewise.
(cmdline_object_only_list_append): Likewise.
(cmdline_get_object_only_input_files): Likewise.
(cmdline_arg): Likewise.
(setup_section): Likewise.
(copy_section): Likewise.
(cmdline_fopen_temp): Likewise.
(cmdline_add_object_only_section): Likewise.
(cmdline_emit_object_only_section): Likewise.
(cmdline_extract_object_only_section): Likewise.
(cmdline_check_object_only_section): Likewise.
(cmdline_remove_object_only_files): Likewise.
(lang_init): Take a bfd_boolean argument to supprt object-only
output. Call cmdline_lists_init.
(load_symbols): Call cmdline_on_object_only_archive_list_p
to check if an archive member should be loaded.
(lang_process): Handle object-only link.
* ldlang.h (lang_init): Take a bfd_boolean argument.
(cmdline_enum_type): New.
(cmdline_header_type): Likewise.
(cmdline_file_type): Likewise.
(cmdline_bfd_type): Likewise.
(cmdline_union_type): Likewise.
(cmdline_list_type): Likewise.
(cmdline_emit_object_only_section): Likewise.
(cmdline_check_object_only_section): Likewise.
(cmdline_remove_object_only_files): Likewise.
* ldmain.c (main): Call xatexit with
cmdline_remove_object_only_files. Pass FALSE to lang_init,
ldexp_init and ldexp_finish. Use ld_parse_linker_script.
Set link_info.output_bfd to NULL after close. Call
cmdline_emit_object_only_section if needed.
(add_archive_element): Call cmdline_check_object_only_section.
(ld_parse_linker_script): New.
* ldmain.h (ld_parse_linker_script): New.
* plugin.c (plugin_maybe_claim): Call
cmdline_check_object_only_section on claimed IR files.
* scripttempl/elf.sc: Also discard .gnu_object_only sections.
* scripttempl/elf64hppa.sc: Likewise.
* scripttempl/elfxtensa.sc: Likewise.
* scripttempl/mep.sc: Likewise.
* scripttempl/pe.sc: Likewise.
* scripttempl/pep.sc: Likewise.
* emultempl/aarch64elf.em (gld${EMULATION_NAME}_finish): Replace
finish_default with ldelf_finish.
* emultempl/alphaelf.em (alpha_finish): Likewise.
* emultempl/avrelf.em (avr_finish): Likewise.
* emultempl/elf.em (ld_${EMULATION_NAME}_emulation): Likewise.
* emultempl/ppc32elf.em (ppc_finish): Likewise.
* emultempl/ppc64elf.em (gld${EMULATION_NAME}_finish): Likewise.
* emultempl/spuelf.em (gld${EMULATION_NAME}_finish): Likewise.
* testsuite/ld-plugin/lto-10.out: New file.
* testsuite/ld-plugin/lto-10a.c: Likewise.
* testsuite/ld-plugin/lto-10b.c: Likewise.
* testsuite/ld-plugin/lto-10r.d: Likewise.
* testsuite/ld-plugin/lto-4.out: Likewise.
* testsuite/ld-plugin/lto-4a.c: Likewise.
* testsuite/ld-plugin/lto-4b.c: Likewise.
* testsuite/ld-plugin/lto-4c.c: Likewise.
* testsuite/ld-plugin/lto-4r-a.d: Likewise.
* testsuite/ld-plugin/lto-4r-b.d: Likewise.
* testsuite/ld-plugin/lto-4r-c.d: Likewise.
* testsuite/ld-plugin/lto-4r-d.d: Likewise.
* testsuite/ld-plugin/lto.exp (lto_link_tests): Prepare for
"LTO 4[acd]", "lto-4r-[abcd]" and "LTO 10" tests.
(lto_run_tests): Add "LTO 4[acd]" and "LTO 10" tests.
Build liblto-4.a. Run "lto-4r-[abcd]" tests.
Run lto-10r and create tmpdir/lto-10.o.
Add test for nm on mixed LTO/non-LTO object.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
This patch adds support for SME ZA-targeting non-widening BFloat16 instructions,
under tick FEAT_SME_B16B16 and command line flag "+sme-b16b16".
FEAT_SME_B16B16 implements FEAT_SME2 and FEAT_SVE_B16B16, in accordance with that
"+sme-b16b16" enables "+sme2" and "+sve-b16b16".
Also the test files related to FEAT_SME_B16B16 are prefixed with sme-b16b16*.
eg: sme-b16b16-1.s, sme-b16b16-1.d.
The spec for this feature and instructions is availabe here [1]:
[1]: https://developer.arm.com/documentation/ddi0602/2024-06/SME-Instructions?lang=en
|
|
This patch adds support for SME Z-targeting multi-vector non-widening
BFloat16 instructions, under tick FEAT_SVE_B16B16 and command line flag
"+sve-b16b16+sme2".
Also the test files related to FEAT_SVE_B16B16 (+sme2) are prefixed with
sve-b16b16-sme2*.
eg: sve-b16b16-sme2-1.s, sve-b16b16-sme2-1.d.
The spec for this feature and instructions is availabe here [1]:
[1]: https://developer.arm.com/documentation/ddi0602/2024-06/SME-Instructions?lang=en
|
|
In the current code, SVE2 Bfloat16 instructions are implemented with tick
FEAT_B16B16 and command line flag "+b16b16" and this feature was suspended
due to incomplete support.
In the new spec available here[1], FEAT_B16B16 is replaced with
FEAT_SVE_B16B16 and command line flag "+b16b16" is replace with "sve-b16b16".
Also the test files related to FEAT_SVE_B16B16 are prefixed with sve-b16b16*.
eg: sve-b16b16-sve2-1.s, sve-b16b16-sve2-1.d.
This patch supports the SVE Z-targeting non-widening BFloat16 instructions
with command line flag "+sve-b16b16+sve2".
[1]: https://developer.arm.com/documentation/ddi0602/2024-06/SVE-Instructions?lang=en
|
|
Add tests for this, and update the existing fvdotb and fvdott tests to
include the VGx4 symbol so that they continue to test for the intended
errors.
|
|
Rename to AARCH64_OPND_SME_ZT0_INDEX_MUL_VL.
|
|
The error message really isn't appropriate (both here and elsewhere in
the test file), but I don't currently have time to investigate further.
|