aboutsummaryrefslogtreecommitdiff
path: root/gas
AgeCommit message (Collapse)AuthorFilesLines
2024-05-20RISC-V: PR31733, Change initial CFI operation from DW_CFA_def_cfa_register ↵Sung-hun Kim1-1/+1
to DW_CFA_def_cfa The DWARF specification (especially, DWARF4 and 5 [1,2]) states that DW_CFA_def_cfa_register cannot be used as the first CFI operation. It said DW_CFA_def_cfa_register as follows: ... This operation is valid only if the current CFA rule is defined to use a register and offset. So, DW_CFA_def_cfa_register can be used after that other definition operation such as DW_CFA_def_cfa is called. However, the current gas code emits DW_CFA_def_cfa_register as an initial CFI operation for RISCV. In the libgcc, the unwinding function does not care about it, so it can unwind the call stack. However, on the third party library such as libunwindstack in Android, it causes a fatal error. This patch changes the initial CFI operation to DW_CFA_def_cfa with offset 0. It works as same as the previous one, but it does not have any limitation so it satisfies the DWARF spec. This change resolves the compatibility issue while preserving the original behaviour. [1] DWARF4 specification, https://dwarfstd.org/doc/DWARF4.pdf [2] DWARF5 specification, https://dwarfstd.org/doc/DWARF5.pdf Signed-off-by: Sung-hun Kim <sfoon.kim@samsung.com> Reviewed-By: Andrew Burgess <aburgess@redhat.com> Approved-By: Nelson Chu <nelson@rivosinc.com> gas/ PR 31733 config/tc-riscv.c (riscv_cfi_frame_initial_instructions): Use DW_CFA_def_cfa rather than DW_CFA_def_cfa_register.
2024-05-17aarch64: correct SVE2.1 ld2q (scalar plus scalar)Jan Beulich1-1/+1
It's opcode was wrong, as was e.g. easily visible from the inappropriate testcase expectation.
2024-05-17aarch64: correct SVE2.1 ld{3,4}q / st{3,4}q (scalar plus immediate)Jan Beulich3-13/+13
Like their byte, half, word, and doubleword counterparts their immediates are multiples of 3 / 4 respectively.
2024-05-17LoongArch: gas: Adjust DWARF CIE alignment factorsmengqinggang1-5/+9
Set DWARF2_CIE_DATA_ALIGNMENT (data alignment factors) to -8. It helps to save space. Data Alignment Factor A signed LEB128 encoded value that is factored out of all offset instructions that are associated with this CIE or its FDEs. This value shall be multiplied by the register offset argument of an offset instruction to obtain the new offset value.
2024-05-16gas: sframe: fix typo to use FP instead of BPIndu Bhagat1-4/+4
gas/ * gen-sframe.c (output_sframe_internal): Use BP instead of FP.
2024-05-16aarch64: fp8 convert and scale - add sme2 insn variantsVictor Do Nascimento6-2/+623
Add the SME2 variant of the FP8 convert and scale instructions, enabled at assembly-time using the `+sme2+fp8' architectural extension flag. More specifically, support is added for the following instructions: Multi-vector floating-point convert from FP8 to BFloat16 (in-order): ----------------------------------------------- - bf1cvt { <Zd1>.H-<Zd2>.H }, <Zn>.B - bf2cvt { <Zd1>.H-<Zd2>.H }, <Zn>.B Multi-vector floating-point convert from FP8 to deinterleaved BFloat16: ----------------------------------------------- - bf1cvtl { <Zd1>.H-<Zd2>.H }, <Zn>.B - bf2cvtl { <Zd1>.H-<Zd2>.H }, <Zn>.B Multi-vector floating-point convert from BFloat16 to packed FP8 format: ------------------------------------------------- - bfcvt <Zd>.B, { <Zn1>.H-<Zn2>.H } Multi-vector floating-point convert from FP8 to half-precision (in-order): ----------------------------------------------- - f1cvt { <Zd1>.H-<Zd2>.H }, <Zn>.B - f2cvt { <Zd1>.H-<Zd2>.H }, <Zn>.B Multi-vector floating-point convert from FP8 to deinterleaved half-precision: ----------------------------------------------- - f1cvtl { <Zd1>.H-<Zd2>.H }, <Zn>.B - f2cvtl { <Zd1>.H-<Zd2>.H }, <Zn>.B Multi-vector floating-point convert from half-precision to packed FP8 format: ------------------------------------------------------- fcvt_2h Multi-vector floating-point convert from single-precision to packed FP8 format: --------------------------------------------------------- fcvt_4s Multi-vector floating-point convert from single-precision to interleaved FP8 format: --------------------------------------------------------- - fcvtn <Zd>.B, { <Zn1>.S-<Zn4>.S } Multi-vector floating-point adjust exponent by vector: ------------------------------------------------------ - fscale { <Zdn1>.H-<Zdn2>.H }, { <Zdn1>.H-<Zdn2>.H }, <Zm>.H - fscale { <Zdn1>.S-<Zdn2>.S }, { <Zdn1>.S-<Zdn2>.S }, <Zm>.S - fscale { <Zdn1>.D-<Zdn2>.D }, { <Zdn1>.D-<Zdn2>.D }, <Zm>.D Multi-vector floating-point adjust exponent: -------------------------------------------- - fscale { <Zdn1>.H-<Zdn2>.H }, { <Zdn1>.H-<Zdn2>.H }, { <Zm1>.H - <Zm2>.H } - fscale { <Zdn1>.S-<Zdn2>.S }, { <Zdn1>.S-<Zdn2>.S }, { <Zm1>.S - <Zm2>.S } - fscale { <Zdn1>.D-<Zdn2>.D }, { <Zdn1>.D-<Zdn2>.D }, { <Zm1>.D - <Zm2>.D }
2024-05-16aarch64: fp8 convert and scale - add sve2 insn variantsVictor Do Nascimento7-0/+313
Add the SVE2 variant of the FP8 convert and scale instructions, enabled at assembly-time using the `+sve2+fp8' architectural extension flag. More specifically, support is added for the following instructions: FP8 convert to BFloat16 (bottom/top): ------------------------------------- - bf1cvt Z<d>.H, Z<n>.B - bf2cvt Z<d>.H, Z<n>.B - bf1cvtlt Z<d>.H, Z<n>.B - bf2cvtlt Z<d>.H, Z<n>.B FP8 convert to half-precision (bottom/top): ------------------------------------------- - f1cvt Z<d>.H, Z<n>.B - f2cvt Z<d>.H, Z<n>.B - f1cvtlt Z<d>.H, Z<n>.B - f2cvtlt Z<d>.H, Z<n>.B BFloat16/half-precision convert, narrow and interleave to FP8: ------------------------------------------- - bfcvtn Z<d>.B, { Z<n>1.H - Z<n>2.H } - fcvtn Z<d>.B, { Z<n>1.H - Z<n>2.H } Single-precision convert, narrow and interleave to FP8 (bottom/top): ----------------------------------------------- - fcvtnb Z<d>.B, { Z<n>1.S - Z<n>2.S } - fcvtnt Z<d>.B, { Z<n>1.S - Z<n>2.S }
2024-05-16aarch64: fp8 convert and scale - Add advsimd insn variantsVictor Do Nascimento5-0/+581
Add the advanced SIMD variant of the FP8 convert and scale instructions, enabled at assembly-time using the `+fp8' architectural extension flag. More specifically, support is added for the following instructions: FP8 convert to BFloat16 (vector): --------------------------------- - bf1cvtl V<d>.8H, V<n>.8B - bf2cvtl V<d>.8H, V<n>.8B - bf1cvtl2 V<d>.8H, V<n>.16B - bf2cvtl2 V<d>.8H, V<n>.16B FP8 convert to half-precision (vector): --------------------------------------- - f1cvtl V<d>.8H, V<n>.8B - f2cvtl V<d>.8H, V<n>.8B - f1cvtl2 V<d>.8H, V<n>.16B - f2cvtl2 V<d>.8H, V<n>.16B Single-precision to FP8 convert and narrow (vector): ---------------------------------------------------- - fcvtn V<d>.8B, V<n>.4S, V<m>.4S - fcvtn2 V<d>.16B, V<n>.4S, V<m>.4S Half-precision to FP8 convert and narrow (vector): -------------------------------------------------- - fcvtn V<d>.8B, V<n>.4H, V<m>.4H - fcvtn V<d>.16B, V<n>.8H, V<m>.8H Floating-point adjust exponent by vector: ----------------------------------------- - fscale V<d>.4H, V<n>.4H, V<m>.4H - fscale V<d>.8H, V<n>.8H, V<m>.8H - fscale V<d>.2S, V<n>.2S, V<m>.2S - fscale V<d>.4S, V<n>.4S, V<m>.4S - fscale V<d>.2d, V<n>.2d, V<m>.2d
2024-05-16aarch64: fp8 convert and scale - add feature flags and related structuresVictor Do Nascimento2-0/+3
2024-05-16aarch64: add SPMU feature and its associated registersMatthieu Longo3-0/+27
2024-05-16Move assembler "IRP \+" test into a separate file. Add XFAILs for targets ↵Nick Clifton6-9/+19
that do not support it.
2024-05-16arm: remove incorrect handling of FP bignums in move_or_literal_poolRichard Earnshaw1-6/+24
This hunk of code in move_or_literal_pool just looks wrong, but I can't find a testcase that will tickle it to prove it. It looks a bit like it was intended to catch cases where a bignum contained a floating-point value, but there were a number of problems with it. - It tested X_add_number == -1, but an FP bignum is indicated by any value <= 0. - It converted the floating-point value to extended precision, but that's not used on Arm beyond the legacy FPA code. No attempt was made to match the FP value to the intended memory/mov operation. Since I can't construct a viable testcase, I've just removed the existing code and made the function error out in this case: this seems more sensible than generating wrong code or trying to write something more complex that can't be tested anyway.
2024-05-16Fix FAIL: macros altmacroAlan Modra1-5/+5
spu-elf and z80-coff fail this test due to "def" being a pseudo-op. tic30-unknown-coff fails it due to '#' not starting comments. * testsuite/gas/macros/altmacro.s: Use /* */ comments. Rename DEF to EDF.
2024-05-15gas: Fix \+ expansion for .irp and .irpcFangrui Song3-1/+10
.irp and .irpc receive a null macro_entry. \+ causes a crash after the recent \+ support. Restore the previous behavior. Signed-off-by: Fangrui Song <maskray@gcc.gnu.org>
2024-05-15aarch64: Add sysreg features to +d128 dependenciesAndrew Carlotti1-2/+5
We should revisit sysreg feature enablement and dependencies in future, but this change should help until then. OK for master?
2024-05-15aarch64: Add simd dependency to +sha2Andrew Carlotti1-1/+1
This matches the existing behaviour in GCC and LLVM, and also the current documentation. OK for master?
2024-05-15aarch64: testsuite: share test utils macros and use themMatthieu Longo23-519/+576
This patch rewrites assembly tests to use utils macros declared in sysreg-test-utils.inc. Some tests were adapted to use the new macro rw_sys_reg.
2024-05-15aarch64: testsuite: reorder write and read to match macro orderMatthieu Longo11-292/+286
This patch aims at grouping write and read for a same system register one after another so that the diff for the macro replacement does not generate too much noise.
2024-05-15aarch64: testsuite: use same regs for read and write testsMatthieu Longo8-377/+377
This patch aims at making easier to replacement of read and write instructions to system registers by a macro that will use the same registers for read and write.
2024-05-15aarch64: testsuite: replace instruction addresses by regexMatthieu Longo1-28/+28
This patch removes the instruction addresses from the objdump's expected output (.d files). The intended benefit from this clean-up is to allow to swap lines around more easilly, and removes the noise of patches that add, remove or reorder instructions.
2024-05-14Fix gas's 'macro count' test for various targetsNick Clifton2-10/+15
2024-05-14arm: update documentation for removal of the Maverick extensionRichard Earnshaw1-7/+4
Finally, update the documentation and add a NEWS item.
2024-05-14arm: opcodes: remove Maverick disassembly.Richard Earnshaw2-8/+8
Remove the patterns to match Maverick co-processor instructions from the disassembly tables. This required fixing a couple of tests in the assembler testsuite where we, probably incorrectly, disassembled generic co-processor instructions as a Maverick instruction (it particularly made no sense to do this for Armv6t2 in Thumb state).
2024-05-14arm: remove Maverick support from the assembler.Richard Earnshaw1-179/+4
Delete all the Maverick instructions and register handling from the assembler. We continue to recognize -mcpu=ep9312, but treat it as an alias for arm920t. We no-longer recognize -mfpu=maverick.
2024-05-14arm: remove tests for Maverick FPU extensionsRichard Earnshaw12-2010/+0
Before removing the code itself, remove the tests that will no-longer apply.
2024-05-13Add new assembler macro pseudo-variable \+ which counts the number of times ↵Nick Clifton10-14/+77
a macro has been invoked.
2024-05-08RISC-V: Support B, Zaamo and Zalrsc extensions.Nelson Chu13-10/+25
* https://github.com/riscv/riscv-b/tags Added standard B extension back, which implies Zba, Zbb and Zbs extensions. * https://github.com/riscv/riscv-zaamo-zalrsc/tags Splited standard A extension into two new extensions, Zaamo and Zalrsc. The A extension implies Zaamo and Zalrsc extensions. Not sure if we need to do the similar check as i and zicsr/zifencei. Passed riscv[32|64]-[elf/linux] binutils testcases. bfd/ * elfxx-riscv.c (riscv_implicit_subsets): Added imply rules for A and B extensions. The A implies Zaamo and Zalrsc, the B implies Zba, Zbb and Zbs. (riscv_supported_std_ext): Supported B extension with v1.0. (riscv_supported_std_z_ext): Supported Zaamo and Zalrsc with v1.0. (riscv_multi_subset_supports, riscv_multi_subset_supports_ext): Updated. include/ * opcode/riscv.h (riscv_insn_class): Removed INSN_CLASS_A, Added INSN_CLASS_ZAAMO and INSN_CLASS_ZALRSC. opcodes/ * riscv-opc.c (riscv_opcodes): Splited standard A extension into two new extensions, Zaamo and Zalrsc. gas/ * testsuite/gas/riscv/march-imply-a.d: New testcase. * testsuite/gas/riscv/march-imply-b.d: New testcase. * testsuite/gas/riscv/attribute-01.d: Updated. * testsuite/gas/riscv/attribute-02.d: Updated. * testsuite/gas/riscv/attribute-03.d: Updated. * testsuite/gas/riscv/attribute-04.d: Updated. * testsuite/gas/riscv/attribute-05.d: Updated. * testsuite/gas/riscv/attribute-10.d: Updated. * testsuite/gas/riscv/mapping-symbols.d: Updated. * testsuite/gas/riscv/march-imply-g.d: Updated. * testsuite/gas/riscv/march-imply-unsupported.d: Updated. * testsuite/gas/riscv/march-ok-reorder.d: Updated. ld/ * testsuite/ld-riscv-elf/attr-merge-arch-01.d: Updated. * testsuite/ld-riscv-elf/attr-merge-arch-02.d: Updated. * testsuite/ld-riscv-elf/attr-merge-arch-03.d: Updated. * testsuite/ld-riscv-elf/attr-merge-user-ext-01.d: Updated.
2024-05-06x86: Drop using extension_opcode to encode vvvv registerCui, Lili3-6/+17
gas/ChangeLog: * config/tc-i386.c (build_modrm_byte): Dropped the use of extension_opcode to encode the vvvv register. * testsuite/gas/i386/x86-64-sse2avx.d: Added new testcases. * testsuite/gas/i386/x86-64-sse2avx.s: Diito. opcodes/ChangeLog: * i386-opc.tbl: Added DstVVVV to some extension_opcode instructions. * i386-tbl.h: Regenerated.
2024-05-06x86: Drop SwapSourcesCui, Lili1-8/+11
gas/ChangeLog: * config/tc-i386.c (build_modrm_byte): Dropped the use of SWAP_SOURCES to encode the vvvv register. opcodes/ChangeLog: * i386-opc.h (SWAP_SOURCES): Dropped. (NO_DEFAULT_MASK): Adjusted the value. (ADDR_PREFIX_OP_REG): Ditto. (DISTINCT_DEST): Ditto. (IMPLICIT_STACK_OP): Ditto. (VexVVVV_SRC2): New. * i386-opc.tbl: Dropped SwapSources and replaced its VexVVVV with Src1VVVV. * i386-tbl.h: Regenerated.
2024-05-06x86: Use vexvvvv as the switch state to encode the vvvv registerCui, Lili1-15/+17
Use vexvvvv as the switch state, and replace VexVVVV with Src1VVVV. Src1VVVV means using VEX.vvvv encodes the first source register operand. The old logic did not check vexvvvv first, which made the logic here very complicated. gas/ChangeLog: * config/tc-i386.c (optimize_encoding): Replaced 1 with Src1VVVV. (build_modrm_byte): Used vexvvvv to encode the vvvv register. (s_insn): Replaced 1 with Src1VVVV. opcodes/ChangeLog: * i386-opc.h (VexVVVV_DST): Adjusted the value. (Src1VVVV): New. * i386-opc.tbl: Replaced part VexVVVV with Src1VVVV. * i386-tbl.h: Regenerated.
2024-05-03x86/APX: further extend SSE2AVX coverageJan Beulich3-0/+974
Since {vex}/{vex3} are respected on legacy mnemonics when -msse2avx is in use, {evex} should be respected, too. So far this is the case only for insns where eGPR-s can come into play. Extend coverage to insns with only %xmm register and possibly immediate operands.
2024-05-03x86/APX: extend SSE2AVX coverageJan Beulich7-3/+613
Legacy encoded SIMD insns are converted to AVX ones in that mode. When eGPR-s are in use, i.e. with APX, convert to AVX10 insns (where available; there are quite a few which can't be converted). Note that LDDQU is represented as VMOVDQU32 (and the prior use of the sse3 template there needs dropping, to get the order right). Note further that in a few cases, due to the use of templates, AVX512VL is used when AVX512F would suffice. Since AVX10 is the main reference, this shouldn't be too much of a problem.
2024-04-25bpf: fix calculation when deciding to relax branchDavid Faust7-43/+95
In certain cases we were calculating the jump displacement incorrectly when deciding whether to relax a branch. This meant for some branches, such as a very long backwards conditional branch, relaxation was not done when it should have been. The result was to error later, because the actual jump displacement was too large to fit in the original instruction. This patch fixes up the displacement calculation so that those branches are correctly relaxed and no longer result in an error. In addition, it changes md_convert_frag to install fixups for the JAL instructions in the resulting relaxations rather than encoding the displacement value directly. gas/ * config/tc-bpf.c (relaxed_branch_length): Correct displacement calculation when relaxing. (md_convert_frag): Likewise. Install fixups for JAL instructions resulting from relaxation. * testsuite/gas/bpf/jump-relax-ja-be.d: Correct and expand test. * testsuite/gas/bpf/jump-relax-ja.d: Likewise. * testsuite/gas/bpf/jump-relax-ja.s: Likewise. * testsuite/gas/bpf/jump-relax-jump-be.d: Likewise. * testsuite/gas/bpf/jump-relax-jump.d: Likewise. * testsuite/gas/bpf/jump-relax-jump.s: Likewise.
2024-04-25LoongArch: gas: Simplify relocations in sections without code flagJinyang He3-3/+19
Gas should not emit ADD/SUB relocation pairs for label differences if they are in the same section without code flag even relax enabled. Because the real value is not be affected by relaxation and it can be compute out in assembly stage. Thus, correct the `TC_FORCE_RELOCATION _SUB_SAME` and the label differences in same section without code flag can be resolved in fixup_segment().
2024-04-23arm: Fix MVE vmla encodingClaudio Bantaloukas3-1355/+2041
2024-04-22x86/APX: Add invalid check for APX EVEX.X4.Cui, Lili3-1/+9
gas/ChangeLog: * config/tc-i386.c (build_apx_evex_prefix): Added invalid check for APX X4. * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Added invalid testcase. * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto. opcodes/ChangeLog: * i386-dis.c (get_valid_dis386): Added invalid check for APX X4.
2024-04-20LoongArch: Add -mignore-start-align optionmengqinggang3-21/+51
Ignore .align at the start of a section may result in misalignment when partial linking. Manually add -mignore-start-align option without partial linking. Gcc -falign-functions add .align 5 to the start of a section, it causes some error message mismatch. Set these testcases to xfail on LoongArch target.
2024-04-16x86: Fix a memory leak in md_assembleH.J. Lu1-5/+8
Fix a memory leak in md_assemble where copy may be cleared and may be the same as copy: if (copy && !mnem_suffix) { line = copy; copy = NULL; no_match: * config/tc-i386.c (md_assemble): Properly free the xstrdup memory.
2024-04-16gas: Free unused memory in scfi_ops_cleanupH.J. Lu1-0/+3
* scfi.c (scfi_ops_cleanup): Free op->op_data and head.
2024-04-16Gas Doc: Update example of how .altmacro affects the interpretation of macro ↵Nick Clifton1-8/+3
arguments. PR 31255
2024-04-12Update description of macro keyword argument assignment in assembler ↵Nick Clifton1-1/+34
documentation. PR 31255
2024-04-12gas: Fix memory leaks in gen-sframe.cH.J. Lu1-0/+4
* gen-sframe.c (sframe_xlate_ctx_cleanup): Call XDELETE on xlate_ctx->cur_fre. (create_sframe_all): Call XDELETE on xlate_ctx after use.
2024-04-11gas: Fix a CFI label name memory leak in scfi.cH.J. Lu2-2/+3
CFI label name can be freed only after use. * scfi.c (handle_scfi_dot_cfi): Free CFI label name after use. * scfidw2gen.c (scfi_process_cfi_label): Add a comment. Remove TODO on freeing CFI label name.
2024-04-11gas: Fix memory leaks in ginsn.cH.J. Lu1-3/+11
Free buffer memory after use in ginsn.c. * ginsn.c (ginsn_dst_print): Free buffer after use. (ginsn_print): Likewise.
2024-04-10x86-64: Use long NOPs for Intel Core processorsH.J. Lu5-9/+397
Use long NOPs for Intel Core processors since they are faster than multiple NOPs. Don't use them for 64-bit processors by default since Intel Atom processors can only decode 4 prefixes in 1 cycle. * config/tc-i386.c (alt64_9): New. (alt64_10): Likewise. (alt64_11): Likewise. (alt64_12): Likewise. (alt64_13): Likewise. (alt64_14): Likewise. (alt64_15): Likewise. (alt64_patt): Likewise. (i386_generate_nops): Use alt64_patt for Intel Core processors in 64-bit mode. * testsuite/gas/i386/x86-64-nops-1-core2.d: Expect long NOPs. * testsuite/gas/i386/x86-64-nops-4-core2.d: Likewise. * testsuite/gas/i386/ilp32/x86-64-nops-1-core2.d: Replace ../x86-64-nops-1.d with ../x86-64-nops-1-core2.d. * testsuite/gas/i386/ilp32/x86-64-nops-4-core2.d: Replace ../x86-64-nops-4.d with ../x86-64-nops-4-core2.d.
2024-04-10gas: scfi: bugfixes for SCFI state propagationIndu Bhagat5-8/+87
There are two state propagation functions in SCFI machinery - forward and backward flow. The patch addresses two issues: - In forward_flow_scfi_state (), the state being compared in forward flow must be that at the exit of a prev bb and that at the entry of the next bb. The variable holding the state to be compared was previously (erroneously) stale. - In cmp_scfi_state (), the assumption that two different control flows, leading to the same basic block, cannot have a mismatched notion of CFA base register, is not true. Remove the assertion and instead return err if mismatch. Fixing these issues helps correctly synthesize CFI, when previously SCFI was erroring out for an otherwise valid input asm. gas/ * scfi.c (cmp_scfi_state): Remove assertion and return mismatch in return value as applicable. (forward_flow_scfi_state): Update state object to be the same as the exit state of the prev bb before comparing. gas/testsuite/ * gas/scfi/x86_64/scfi-x86-64.exp: Add new test. * gas/scfi/x86_64/scfi-cfg-5.d: New test. * gas/scfi/x86_64/scfi-cfg-5.l: New test. * gas/scfi/x86_64/scfi-cfg-5.s: New test.
2024-04-10gas: gcfg: add_bb_at_ginsn must return root_bbIndu Bhagat5-26/+164
A GCFG (ginsn control flow graph) is created for SCFI purposes in GAS. The existing GCFG creation process was ignoring some paths. add_bb_at_ginsn () is a recursive function which should return the root of the added basic blocks. This property was being violated in some traversals, e.g., where a taken path involving a sequence of a few basic blocks eventually culminated in a GINSN_TYPE_RETURN instruction. This patch fixes the issue by keeping an explicit variable root_bb to memorize the bb to be returned. Next, find_or_make_bb () must either create or find the bb with the first ginsn as the provided ginsn. Add a few assertions to ensure health of the cfg creation process. Note that the testcase, in its current shape, is not fit for catching regressions for the issue at hand. Although the testcase does exercise the updated code path, the testcase passes even without the current fix, because the added edge in this specific testcase does not alter the synthesized CFI. (The missing edge is the fallthrough edge of the conditional branch "jne .L13" in the testcase.) Using a manual gcfg_print (), one can see the missing edge without the fix. Lets keep the testcase for now, until there is a better way to test the GCFG for this issue (e.g., either by dumping the GCFG in textual format, or a case when the missing edge does cause wrong synthesized CFI). gas/ * ginsn.c (bb_add_edge): Fix a code comment. (find_bb): Likewise. (find_or_make_bb): Add new assertions to ensure health of cfg creation process. (add_bb_at_ginsn): Keep reference to the root_bb and return it. gas/testsuite/ * gas/scfi/x86_64/scfi-x86-64.exp: Add new test. * gas/scfi/x86_64/scfi-cfg-4.d: New test. * gas/scfi/x86_64/scfi-cfg-4.l: New test. * gas/scfi/x86_64/scfi-cfg-4.s: New test.
2024-04-09arm: Fix disassembly of MVE vq[r]shr[u]nAlex Coplan3-1808/+1875
This patch fixes the disassembly of vq[r]shr[u]n insns so that the shift immediate is properly decoded. See the description of the previous patch for an example of the incorrect disassembly. As part of this patch we also fix the mve-vqrshrn.d test which was testing for the incorrect disassembly of the immediates. The disassembly now matches the assembled instructions in that test. Finally we add an mve-vqshrn test which tests the non-rounding variants of those insns, whose encoding we fixed with the previous patch in this series.
2024-04-09arm: Fix encoding of MVE vqshr[u]nAlex Coplan1-4/+4
As it stands, these insns are incorrectly encoded as vqrshr[u]n. Concretely, the problem can be seen as follows: $ cat t.s vqrshrnb.s16 q0,q0,#8 vqshrnb.s16 q0,q0,#8 $ gas/as-new t.s -march=armv8.1-m.main+mve -o t.o $ binutils/objdump -d t.o -m armv8.1-m.main t.o: file format elf32-littlearm Disassembly of section .text: 00000000 <.text>: 0: ee88 0f41 vqrshrnb.s16 q0, q0, #0 4: ee88 0f41 vqrshrnb.s16 q0, q0, #0 Here we assemble these two instructions to the same opcode. The encoding of the first is the correct, while the encoding of the second is incorrect, and the bottom bit should be clear, see the Armv8-M ARM: https://developer.arm.com/documentation/ddi0553/latest/ There is an additional problem here in that the disassembly of the immediate is incorrect. llvm-objdump shows the correct disassembly here: t.o: file format elf32-littlearm Disassembly of section .text: 00000000 <$t>: 0: ee88 0f41 vqrshrnb.s16 q0, q0, #8 4: ee88 0f41 vqrshrnb.s16 q0, q0, #8 Note that we defer adding a test for the correct encoding of these insns until the next patch which fixes the disassembly issue.
2024-04-09RISC-V: Support Zcmp push/pop instructions.Jiawei8-0/+525
Support zcmp extension push/pop/popret and popret zero instructions. The `reg_list' is a list containing 1 to 13 registers, we can use: "{ra}, {ra, s0}, {ra, s0-s1}, {ra, s0-s2} ... {ra, s0-sN}" to present this feature. Passed gcc/binutils regressions of riscv-gnu-toolchain. Most of work was finished by Sinan Lin. Co-Authored by: Charlie Keaney <charlie.keaney@embecosm.com> Co-Authored by: Mary Bennett <mary.bennett@embecosm.com> Co-Authored by: Nandni Jamnadas <nandni.jamnadas@embecosm.com> Co-Authored by: Sinan Lin <sinan.lin@linux.alibaba.com> Co-Authored by: Simon Cook <simon.cook@embecosm.com> Co-Authored by: Shihua Liao <shihua@iscas.ac.cn> Co-Authored by: Yulong Shi <yulong@iscas.ac.cn> bfd/ChangeLog: * elfxx-riscv.c (riscv_implicit_subset): Imply zca for zcmp. (riscv_supported_std_z_ext): Added zcmp with version 1.0. (riscv_parse_check_conflicts): Zcmp conflicts with d/zcd. (riscv_multi_subset_supports): Handle zcmp. (riscv_multi_subset_supports_ext): Ditto. gas/ChangeLog: * NEWS: Updated. * config/tc-riscv.c (regno_to_reg_list): New function, used to map register to reg_list number. (reglist_lookup): Called reglist_lookup_internal. Return false if reg_list number is zero, which is an invalid value. (reglist_lookup_internal): Parse register list, and return the last register by regno_to_reg_list. (validate_riscv_insn): New operators. (riscv_ip): Ditto. * testsuite/gas/riscv/march-help.l: Updated. * testsuite/gas/riscv/zcmp-push-pop-fail.d: New test. * testsuite/gas/riscv/zcmp-push-pop-fail.l: New test. * testsuite/gas/riscv/zcmp-push-pop-fail.s: New test. * testsuite/gas/riscv/zcmp-push-pop.d: New test. * testsuite/gas/riscv/zcmp-push-pop.s: New test. include/ChangeLog: * opcode/riscv-opc.h (MATCH/MASK_CM_PUSH): New macros for zcmp. (MATCH/MASK_CM_POP): Ditto. (MATCH/MASK_CM_POPRET): Ditto. (MATCH/MASK_CM_POPRETZ): Ditto. (DECLARE_INSN): New declarations for zcmp. * opcode/riscv.h (EXTRACT/ENCODE/VALID_ZCMP_SPIMM): Handle spimm operand for zcmp. (OP_MASK_REG_LIST): Handle operand for zcmp register list. (OP_SH_REG_LIST): Ditto. (ZCMP_SP_ALIGNMENT): New argument, used in riscv_get_sp_base. (X_S0, X_S1, X_S2, X_S10, X_S11): New register numbers. (enum riscv_insn_class): Added INSN_CLASS_ZCMP. (extern riscv_get_sp_base): Added. opcodes/ChangeLog: * riscv-dis.c (print_reg_list): New function, used to get zcmp reg_list field. (riscv_get_spimm): New function, used to get zcmp sp adjustment immediate. (print_insn_args): Handle new operands for zcmp. * riscv-opc.c (riscv_get_sp_base): New function, used by gas and objdump. Get sp base adjustment. (riscv_opcodes): Added zcmp instructions.