Age | Commit message (Collapse) | Author | Files | Lines |
|
In LoongArch ABI, r22 register can be used as frame pointer or
static register(s9).
Link: https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#general-purpose-registers
|
|
For R_LARCH_TLS_{LE_HI20_R,LE_ADD_R,LD_PC_HI20,GD_PC_HI20, DESC_PC_HI20}
relocations, start a new frag to get correct eh_frame Call Frame Information
FDE DW_CFA_advance_loc info.
|
|
Gcc may generate "\t.align\t%d,54525952,4\n" before commit
b20c7ee066cb7d952fa193972e8bc6362c6e4063. To write 54525952 (NOP) to object
file, we call s_align_ptwo (-4). It result in alignment padding must be a
multiple of 4 if .align has second parameter.
Use default s_align_ptwo for .align.
|
|
For two register macros (e.g. la.local $t0, $t1, symbol) used in extreme code
model, do not emit R_LARCH_RELAX relocations.
|
|
VRNDSCALE{P,S}{S,D} is the AVX512 generalization of these AVX insns. As
long as the immediate has the top 4 bits clear, they are equivalent to
the earlier VEX-encoded insns, and hence can be used to permit use of
eGPR-s in the memory operand. Since this is the normal way of using
these insns, also alter the resulting diagnostic to complain about the
immediate, not the eGPR use.
|
|
This was missed in 6177c84d5edc ("Support APX GPR32 with extend evex
prefix").
|
|
The particular choices of address indexing, along with their encoding
for RCPC3 instructions lead to the requirement of a new set of operand
descriptions, along with the relevant inserter/extractor set.
That is, for the integer load/stores, there is only a single valid
indexing offset quantity and offset mode is allowed - The value is
always equivalent to the amount of data read/stored by the
operation and the offset is post-indexed for Load-Acquire RCpc, and
pre-indexed with writeback for Store-Release insns.
This indexing quantity/mode pair is selected by the setting of a
single bit in the instruction. To represent these insns, we add the
following operand types:
- AARCH64_OPND_RCPC3_ADDR_OPT_POSTIND
- AARCH64_OPND_RCPC3_ADDR_OPT_PREIND_WB
In the case of loads and stores involving SIMD/FP registers, the
optional offset is encoded as an 8-bit signed immediate, but neither
post-indexing or pre-indexing with writeback is available. This
created the need for an operand type similar to
AARCH64_OPND_ADDR_OFFSET, with the difference that FLD_index should
not be checked.
We thus introduce the AARCH64_OPND_RCPC3_ADDR_OFFSET operand, a
variant of AARCH64_OPND_ADDR_OFFSET, w/o the FLD_index bitfield.
|
|
Indicating the presence of the Armv8.2-a feature adding further
support for the Release Consistency Model, the `+rcpc3' architectural
extension flag is added to the list of possible `-march' options in
Binutils, together with the necessary macro for encoding rcpc3
instructions.
|
|
There are some tlbi operations that don't have a corresponding tlbip operation,
but we were incorrectly using the same list for both. Add the missing tlbi
*nxs operations, and use the F_REG_128 flag to filter tlbi operations that
don't have a tlbip analogue. For increased clarity, I have also used a macro
to reduce duplication between the 'nxs' and non-'nxs' variants, and added a
test to verify that no invalid combinations are accepted.
Additionally, fix two missing checks for AARCH64_OPND_SYSREG_TLBIP that were
preventing disassembly of tlbip instructions.
|
|
Add an aarch64_feature_set field to aarch64_sys_ins_reg, and use this for
feature checks instead of testing against a list of operand codes.
|
|
Hi,
This patch add support for SVE2.1 instructions ld1q,
ld2q, ld3q and ld4q, st1q, st2q, st3q and st4q.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for SVE2.1 instruction dupq, eorqv and extq.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for FEAT_SVE2p1 (SVE2.1 Extension) feature
along with +sve2p1 optional flag to enabe this feature.
Also support for following SVE2p1 instructions is added
addqv, andqv, smaxqv, sminqv, umaxqv, uminqv and uminqv.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for FEAT_SME2p1 and "movaz" instructions
along with the optional flag +sme2p1.
Following "movaz" instructions are add:
Move and zero two ZA tile slices to vector registers.
Move and zero four ZA tile slices to vector registers.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
Hi,
This patch add support for SVE2.1 and SME2.1 non-widening BFloat16
(FEAT_B16B16) instructions.
Following instructions predicated, unpredicated and indexed
variants are added in this patch.
bfadd, bfclamp, bfmax bfmaxnm, bfmin,bfminnm,
bfmla,bfmls,bfmul and bfsub.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
|
|
This patch adds support in GAS to create generic GAS instructions
(a.k.a., the ginsn) for the x86 backend (AMD64 ABI only at this time).
Using this ginsn infrastructure, GAS can then synthesize CFI for
hand-written asm for x86_64.
A ginsn is a target-independent representation of the machine
instructions. One machine instruction may need one or more ginsn.
This patch also adds skeleton support for printing ginsn in the listing
output for debugging purposes.
Since the current use-case of ginsn is to synthesize CFI, the x86 target
needs to generate ginsns necessary for the following machine
instructions only:
- All change of flow instructions, including all conditional and
unconditional branches, call and return from functions.
- All register saves and unsaves to the stack.
- All instructions affecting the two registers that could potentially
be used as the base register for CFA tracking. For SCFI, the base
register for CFA tracking is limited to REG_SP and REG_FP only for
now.
The representation of ginsn is kept simple:
- GAS instruction has GINSN_NUM_SRC_OPNDS (defined to be 2 at this time)
number of source operands and one destination operand at this time.
- GAS instruction uses DWARF register numbers in its representation and
does not track register size.
- GAS instructions carry location information (file name and line
number).
- GAS instructions are ID's with a natural number in order of their
addtion to the list. This can be used as a proxy for the static
program order of the corresponding machine instructions.
Note that, GAS instruction (ginsn) format does not support
GINSN_TYPE_PUSH and GINSN_TYPE_POP. Some architectures, like aarch64,
do not have push and pop instructions, but rather STP/LDP/STR/LDR etc.
instructions. Further these instructions have a variety of addressing
modes, like offset, pre-indexing and post-indexing etc. Among other
things, one of differences in these addressing modes is _when_ the addr
register is updated with the result of the address calculation: before
or after the memory operation. To best support such needs, the generic
instructions like GINSN_TYPE_LOAD, GINSN_TYPE_STORE together with
GINSN_TYPE_ADD, and GINSN_TYPE_SUB may be used.
The functionality provided in ginsn.c and scfi.c is compiled in when a
target defines TARGET_USE_SCFI and TARGET_USE_GINSN. This can be
revisited later when there are other use-cases of creating ginsn's in
GAS, apart from the current use-case of synthesizing CFI for
hand-written asm.
Support is added only for System V AMD64 ABI for ELF at this time. If
the user enables SCFI with --32, GAS issues an error:
"Fatal error: SCFI is not supported for this ABI"
For synthesizing (DWARF) CFI, the SCFI machinery requires the programmer
to adhere to some pre-requisites for their asm:
- Hand-written asm block must begin with a .type foo, @function
It is highly recommended to, additionally, also ensure that:
- Hand-written asm block ends with a .size foo, .-foo
The SCFI machinery encodes some rules which align with the standard
calling convention specified by the ABI. Apart from the rules, the SCFI
machinery employs some heuristics. For example:
- The base register for CFA tracking may be either REG_SP or REG_FP.
- If the base register for CFA tracking is REG_SP, the precise amount of
stack usage (and hence, the value of REG_SP) must be known at all times.
- If using dynamic stack allocation, the function must switch to
FP-based CFA. This means using instructions like the following (in
AMD64) in prologue:
pushq %rbp
movq %rsp, %rbp
and analogous instructions in epilogue.
- Save and Restore of callee-saved registers must be symmetrical.
However, the SCFI machinery at this time only warns if any such
asymmetry is seen.
These heuristics/rules are architecture-independent and are meant to
employed for all architectures/ABIs using SCFI in the future.
gas/
* Makefile.am: Add new files.
* Makefile.in: Regenerated.
* as.c (defined): Handle documentation and listing option for
ginsns and SCFI.
* config/obj-elf.c (obj_elf_size): Invoke ginsn_data_end.
(obj_elf_type): Invoke ginsn_data_begin.
* config/tc-i386.c (x86_scfi_callee_saved_p): New function.
(ginsn_prefix_66H_p): Likewise.
(ginsn_dw2_regnum): Likewise.
(x86_ginsn_addsub_reg_mem): Likewise.
(x86_ginsn_addsub_mem_reg): Likewise.
(x86_ginsn_alu_imm): Likewise.
(x86_ginsn_move): Likewise.
(x86_ginsn_lea): Likewise.
(x86_ginsn_jump): Likewise.
(x86_ginsn_jump_cond): Likewise.
(x86_ginsn_enter): Likewise.
(x86_ginsn_safe_to_skip): Likewise.
(x86_ginsn_unhandled): Likewise.
(x86_ginsn_new): New functionality to generate ginsns.
(md_assemble): Invoke x86_ginsn_new.
(s_insn): Likewise.
(i386_target_format): Add hard error for usage of SCFI with non AMD64 ABIs.
* config/tc-i386.h (TARGET_USE_GINSN): New definition.
(TARGET_USE_SCFI): Likewise.
(SCFI_MAX_REG_ID): Likewise.
(REG_FP): Likewise.
(REG_SP): Likewise.
(SCFI_INIT_CFA_OFFSET): Likewise.
(SCFI_CALLEE_SAVED_REG_P): Likewise.
(x86_scfi_callee_saved_p): Likewise.
* gas/listing.h (LISTING_GINSN_SCFI): New define for ginsn and
SCFI.
* gas/read.c (read_a_source_file): Close SCFI processing at end
of file read.
* gas/scfidw2gen.c (scfi_process_cfi_label): Add implementation.
(scfi_process_cfi_signal_frame): Likewise.
* subsegs.h (struct frch_ginsn_data): New forward declaration.
(struct frchain): New member for ginsn data.
* gas/subsegs.c (subseg_set_rest): Initialize the new member.
* symbols.c (colon): Invoke ginsn_frob_label to convey
user-defined labels to ginsn infrastructure.
* ginsn.c: New file.
* ginsn.h: New file.
* scfi.c: New file.
* scfi.h: New file.
|
|
Rex2 is currently an operand constraint. For the upcoming SCFI
implementation in GAS, we need to identify operations which implicitly
update the stack pointer. An operand constraint enumerator for implicit
stack op seems more appropriate than an attribute. However, two opcodes
currently necessitate both Rex2 and an implicit stack op marker; this
prompts revisiting the current representations a bit.
Make Rex2 a standalone attribute, so that later a new operand constraint
may be added for IMPLICIT_STACK_OP.
ChangeLog:
* gas/config/tc-i386.c (is_apx_rex2_encoding): Update the check.
* opcodes/i386-gen.c: Add a new BITFIELD for Rex2.
* opcodes/i386-opc.h (REX2_REQUIRED): Remove.
* opcodes/i386-opc.tbl: Remove Rex2 operand constraint.
* opcodes/i386-tbl.h: Regenerated.
|
|
Additionally, change FEAT_XS tlbi variants to be gated on "+xs" instead of
"+d128". This is an incremental improvement; there are still some FEAT_XS tlbi
variants that are gated incorrectly or missing entirely.
|
|
|
|
|
|
|
|
|
|
|
|
Add "+rdm" as an explicit alias for "+rdma", to maintain existing compatibility
with Clang.
|
|
|
|
|
|
gas/ChangeLog:
* config/tc-i386.c (establish_rex): Fix indentation.
(check_EgprOperands): Use true/false instead of 1/0.
|
|
As already indicated during review, we can't get away without certain
adjustments here: Without these, respective {evex}-prefixed insns are
assembled to APX encodings even when APX_F is turned off.
While there also extend the respective comment in the opcode table, to
explain why this construct is used.
|
|
PR gas/31178
In da0784f961d8 ("x86: fold FMA VEX and EVEX templates") I overlooked
that C aliases StaticRounding, and hence build_vex_prefix() now needs to
be aware of that aliasing. Disambiguation is easy, as StaticRounding is
only ever used together with SAE (hence why the overlaying works in the
first place).
|
|
With the addition of 128-bit system registers to the Arm architecture
starting with Armv9.4-a, a mechanism for manipulating their contents
is introduced with the `msrr' and `mrrs' instruction pair.
These move values from one such 128-bit system register into a pair of
contiguous general-purpose registers and vice-versa, as for example:
msrr ttlb0_el1, x0, x1
mrrs x0, x1, ttlb0_el1
This patch adds the necessary support for these instructions, adding
checks for system-register width by defining a new operand type in the
form of `AARCH64_OPND_SYSREG128' and the `aarch64_sys_reg_128bit_p'
predicate, responsible for checking whether the requested system
register table entry is marked as implemented in the 128-bit mode via
the F_REG_128 flag.
|
|
The addition of 128-bit page table descriptors and, with it, the
addition of 128-bit system registers for these means that special
"invalidate translation table entry" instructions are needed to cope
with the new 128-bit model. This is introduced with the `tlbpi'
instruction, implemented here.
|
|
While CRn and CRm fields in the SYSP instruction are 4-bit wide and
are thus able to accommodate values in the range 0-15, the
specifications for the SYSP instructions limit their ranges to 8-9 for
CRm and 0-7 in the case of CRn.
This led to the need to signal in some way to the operand parser that
a given operand is under special restrictions regarding its use. This
is done via the new `F_OPD_NARROW' flag, indicating a narrowing in the
range of operand values for fields in the instruction tagged with the
flag.
The flag is then used in `parse_operands' when the instruction is
assembled, but needs not be taken into consideration during
disassembly.
|
|
Two of the instructions added by the `+d128' architectural extension
add the flexibility to have two optional operands. Prior to the
addition of the `tlbip' and `sysp' instructions, no mnemonic allowed
more than one such optional operand.
With `tlbip' as an example, some TLBIP instruction names do not allow
for any optional operands, while others allow for both to be optional.
In the latter case, it is possible that either the second operand
alone is omitted or both operands are omitted.
Therefore, a considerable degree of flexibility needed to be added to
the way operands were parsed. It was, however, possible to achieve
this with relatively few changes to existing code.
it is noteworthy that opcode flags specifying the optional operand
number are non-orthogonal. For example, we have:
#define F_OPD1_OPT (2 << 12) : 0b10 << 12
#define F_OPD2_OPT (3 << 12) : 0b11 << 12
such that by virtue of the observation that
(F_OPD1_OPT | F_OPD2_OPT) == F_OPD2_OPT
it is impossible to mark both operands 1 and 2 as optional for an
instruction and it is assumed that a maximum of 1 operand can ever be
optional. This is not overly-problematic given that, for optional
pairs, the second optional operand is always found immediately after
the first. Thus, it suffices for us to flag that there is a second
optional operand. With this fact, we can infer its position in the
mnemonic from the position of the first (e.g. if the second operand in
the mnemonic is optional, we know the third is too). We therefore
define the `F_OPD_PAIR_OPT' flag and calculate its position in the
mnemonic from the value encoded by the `F_OPD<n>_OPT' flag.
Another observation is that there is a tight coupling between default
values assigned to the two registers when one (or both) are omitted
from the mnemonic. Namely, if Xt1 has a value of 0x1f (the zero
register is specified), Xt2 defaults to the same value, otherwise Xt2
will be assigned Xt + 1. This meant that where you have default value
validation, in checking the second optional operand's value, it is
also necessary to look at the value assigned to the
previously-processed operand value before deciding its validity. Thus
`process_omitted_operand' needs not only access to its `operand'
argument, but also to the global `inst' struct.
|
|
Analysis of the allowed operand values for `sysp' and `tlbip' reveals
a significant departure from the allowed behavior for operand register
pairs (hitherto labeled AARCH64_OPND_PAIRREG) observed for other
insns in this category.
For instructions `casp', `mrrs' and `msrr' the register pair must
always start at an even index and the second register in the pair is
the index + 1. This precludes the use of xzr as the first register,
given it corresponds to register number 31.
This is different in the case of `sysp' and `tlbip', however. These
allow the use of xzr and, where the first operand in the pair is
omitted, this is the default value assigned to it. When this
operand is assigned xzr, it is expected that the second operand will
likewise take on a value of xzr.
These two instructions therefore "break" two rules of register pairs:
* The first of the two registers is odd-numbered.
* The index of the second register is equal to that of the first,
and not n+1.
To allow for this departure from hitherto standard behavior, we
extend the functionality of the assembler by defining an extension of
the AARCH64_OPND_PAIRREG, called AARCH64_OPND_PAIRREG_OR_XZR.
It is used in defining `sysp' and `tlbip' and allows
`operand_general_constraint_met_p' to allow the pair to both take on
the value of xzr.
|
|
Indicating the presence of the Armv9.4-a features concerning 128-bit
Page Table Descriptors, 128-bit System Registers and Instructions,
the "+d128" architectural extension flag is added to the list of
possible -march options in Binutils, together with the necessary macro
for encoding d128 instructions.
|
|
This patch adds AArch32 support for -march=armv8.9-a and
-march=armv9.4-a. The behaviour of the new options can be
expressed using a combination of existing feature flags
and tables.
The cpu_arch_ver entries for ARM_ARCH_V9_4A and ARM_ARCH_V8_9A
are technically redundant but it including them for macro code
consistency across architectures.
|
|
gas/
* config/tc-i386.c (cpu_arch): Add znver5 ARCH.
* doc/c-i386.texi: Add znver5.
* testsuite/gas/i386/arch-15.d: New.
* testsuite/gas/i386/arch-15.s: Likewise.
* testsuite/gas/i386/arch-15-znver5.d: Likewise.
* testsuite/gas/i386/i386.exp: Add new znver5 test cases.
* testsuite/gas/i386/x86-64.exp: Likewise.
* testsuite/gas/i386/x86-64-arch-5.d: Likewise.
* testsuite/gas/i386/x86-64-arch-5.s: Likewise.
* testsuite/gas/i386/x86-64-arch-5-znver5.d: Likewise.
opcodes/
* i386-gen.c (isa_dependencies): Add ZNVER5 dependencies.
* i386-init.h: Re-generated.
|
|
There are a number of issues with 734dfd1cc966 ("x86: pack CPU flags in
opcode table"):
- the condition when two array slots need writing wasn't correct (with
enough new Cpu* added an out of bounds array access would validly have
been complained about by the compiler),
- table generation didn't take into account CpuAttrUnused and CpuUnused
being independent, and hence there not always (not) being an "unused"
bitfield member in both structures,
- cpu_flags_from_attr() wasn't ready for use on big-endian hosts,
- there were two style violations.
|
|
It doesn't look to be a good idea to override the custom handlers that
ELF and COFF have; afaict doing so broke .previous on ELF, and a sub-
section specifier wasn't accepted either.
|
|
The comment in s_bss() looks bogus (perhaps simply stale, or wrongly
copied from another target). It also doesn't look to be a good idea to
override the custom handler that ELF has (afaict doing so broke
.previous as well as sub-section specification).
The override for .skip is simply pointless, for read.c having exactly
the same.
While there also drop two adjacent redundant (with read.h) declarations
(which would be outright dangerous if read.h wasn't included anyway).
|
|
While there doesn't look to be anything wrong with this override,
there's also no apparent reason why this override would be needed. Drop
it, reducing overall size a tiny bit.
|
|
The comment looks bogus (perhaps simply stale, or wrongly copied from
another target). It also doesn't look to be a good idea to override the
custom handler that ELF has (afaict doing so broke .previous as well as
sub-section specification).
While there also fold the identical handlers for .text (there likely is
more room for such folding).
|
|
The comment looks bogus (perhaps simply stale), and there are also no
other precautions against subsections being used on ELF with .bss. It
also doesn't look to be a good idea to override the custom handler that
ELF has (afaict doing so further broke .previous).
|
|
It doesn't look to be a good idea to override the custom handler that
ELF has; afaict doing so broke .previous.
|
|
It doesn't look to be a good idea to override the custom handler that
ELF has; afaict doing so broke .previous.
|
|
Just like obj_elf_section() is called for .section, obj_elf_{text,data}()
need calling for .text/.data.
|
|
While only ELF is supported right now, (stub) code generally is in place
for the non-ELF case as well. Don't override .bss for ELF - that's
unlikely to be a good idea anyway and prevented the sub-section
specifier from being usable. Don't override .text and .data at all - for
.data and ELF for the same reason, while for .text and ELF obj-elf.c's is
all we need, and for (hypothetical) non-ELF read.c's identical handling
would have been invoked anyway.
|
|
The comment looks bogus (perhaps simply stale), and there are also no
other precautions against subsections being used on ELF with .bss. It
also doesn't look to be a good idea to override the custom handler that
ELF has (afaict doing so further broke .previous).
|
|
It doesn't look to be a good idea to override the custom handler that
ELF has; afaict doing so broke .previous.
|
|
It doesn't look to be a good idea to override the custom handlers that
ELF and COFF have. While in this case interaction with ELF's .previous
wasn't screwed, the sub-section specifier wasn't permitted.
|