author | Indu Bhagat <indu.bhagat@oracle.com> | 2024-01-15 01:00:31 -0800 |
---|---|---|
committer | Indu Bhagat <indu.bhagat@oracle.com> | 2024-01-15 03:31:35 -0800 |
commit | c7defc5386cc53a4abbb7c53a924cdac3f16aa33 (patch) | |
tree | 7593f82a48e1f517da03e98b0ffa70fb2d67601b | |
parent | 448cf9e67d3fe9edbef70d4cfcc32d1816603370 (diff) | |
download | gdb-c7defc5386cc53a4abbb7c53a924cdac3f16aa33.zip gdb-c7defc5386cc53a4abbb7c53a924cdac3f16aa33.tar.gz gdb-c7defc5386cc53a4abbb7c53a924cdac3f16aa33.tar.bz2 |
gas: x86: synthesize CFI for hand-written asm
This patch adds support in GAS to create generic GAS instructions
(a.k.a., the ginsn) for the x86 backend (AMD64 ABI only at this time).
Using this ginsn infrastructure, GAS can then synthesize CFI for
hand-written asm for x86_64.
A ginsn is a target-independent representation of a machine
instruction. One machine instruction may need one or more ginsns.
This patch also adds skeleton support for printing ginsn in the listing
output for debugging purposes.
Since the current use-case of ginsn is to synthesize CFI, the x86 target
needs to generate ginsns necessary for the following machine
instructions only:
- All change of flow instructions, including all conditional and
unconditional branches, call and return from functions.
- All register saves to and restores from the stack.
- All instructions affecting the two registers that could potentially
be used as the base register for CFA tracking. For SCFI, the base
register for CFA tracking is limited to REG_SP and REG_FP only for
now.
The representation of ginsn is kept simple:
- A GAS instruction has GINSN_NUM_SRC_OPNDS (currently defined as 2)
source operands and one destination operand.
- GAS instruction uses DWARF register numbers in its representation and
does not track register size.
- GAS instructions carry location information (file name and line
number).
- GAS instructions are assigned IDs: natural numbers in the order of their
addition to the list. This can be used as a proxy for the static
program order of the corresponding machine instructions.
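The representation described above can be pictured with a minimal C sketch. This is not the layout used in ginsn.h; every name prefixed with sketch_ or SK_ is hypothetical, and the choice to start IDs at 1 is an assumption made for illustration only.

```c
#include <stddef.h>

/* Hypothetical stand-in for GINSN_NUM_SRC_OPNDS (2 at this time). */
#define SKETCH_NUM_SRC_OPNDS 2

enum sketch_opnd_type { SK_OPND_NONE, SK_OPND_REG, SK_OPND_IMM, SK_OPND_INDIRECT };

struct sketch_opnd
{
  enum sketch_opnd_type type;
  unsigned int reg;   /* DWARF register number; register size is not tracked.  */
  long imm_or_disp;   /* Immediate value, or displacement for indirect access.  */
};

struct sketch_ginsn
{
  int type;                                      /* Kind of operation.  */
  struct sketch_opnd src[SKETCH_NUM_SRC_OPNDS];  /* Source operands.  */
  struct sketch_opnd dst;                        /* Single destination operand.  */
  const char *file;                              /* Location: file name.  */
  unsigned int line;                             /* Location: line number.  */
  unsigned long id;                              /* Position in order of addition.  */
  struct sketch_ginsn *next;
};

struct sketch_list
{
  struct sketch_ginsn *head;
  struct sketch_ginsn *tail;
  unsigned long next_id;
};

/* Append G to L, assigning the next ID.  IDs therefore follow the static
   program order of the machine instructions that produced the ginsns.  */
static void
sketch_append (struct sketch_list *l, struct sketch_ginsn *g)
{
  g->id = ++l->next_id;
  g->next = NULL;
  if (l->tail)
    l->tail->next = g;
  else
    l->head = g;
  l->tail = g;
}
```

Because the ID is assigned at append time, comparing two IDs is enough to tell which ginsn was added to the list first.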
Note that the GAS instruction (ginsn) format does not support
GINSN_TYPE_PUSH and GINSN_TYPE_POP. Some architectures, like aarch64,
do not have push and pop instructions, but rather STP/LDP/STR/LDR etc.
instructions. Further, these instructions have a variety of addressing
modes, like offset, pre-indexing and post-indexing. Among other
things, one of the differences between these addressing modes is _when_
the addr register is updated with the result of the address calculation:
before or after the memory operation. To best support such needs, generic
instructions like GINSN_TYPE_LOAD and GINSN_TYPE_STORE, together with
GINSN_TYPE_ADD and GINSN_TYPE_SUB, may be used.
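As a rough illustration of that decomposition, the sketch below expands a push into a stack-pointer SUB followed by a STORE, and a pop into a LOAD followed by an ADD. The demo_ names and the flat struct are hypothetical and far simpler than the real ginsn API; only the ordering of the two operations is the point.

```c
enum demo_type { DEMO_ADD, DEMO_SUB, DEMO_LOAD, DEMO_STORE };

struct demo_ginsn
{
  enum demo_type type;
  unsigned int reg;   /* Register saved/restored, or register being adjusted.  */
  unsigned int base;  /* Base register of the memory access (LOAD/STORE only).  */
  long imm;           /* Adjustment (ADD/SUB) or displacement (LOAD/STORE).  */
};

/* DWARF register number of rsp in the System V AMD64 ABI.  */
#define DEMO_REG_SP 7

/* push REG: first move the stack pointer down by SIZE bytes, then store
   REG at the new top of stack.  Returns the number of ginsns written.  */
static int
demo_push (struct demo_ginsn out[2], unsigned int reg, long size)
{
  out[0] = (struct demo_ginsn) { DEMO_SUB, DEMO_REG_SP, 0, size };
  out[1] = (struct demo_ginsn) { DEMO_STORE, reg, DEMO_REG_SP, 0 };
  return 2;
}

/* pop REG: first load REG from the top of stack, then move the stack
   pointer up by SIZE bytes.  Returns the number of ginsns written.  */
static int
demo_pop (struct demo_ginsn out[2], unsigned int reg, long size)
{
  out[0] = (struct demo_ginsn) { DEMO_LOAD, reg, DEMO_REG_SP, 0 };
  out[1] = (struct demo_ginsn) { DEMO_ADD, DEMO_REG_SP, 0, size };
  return 2;
}
```

A pre-indexed or post-indexed store on another architecture could be expressed with the same two building blocks by choosing whether the ADD/SUB comes before or after the memory operation.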
The functionality provided in ginsn.c and scfi.c is compiled in when a
target defines TARGET_USE_SCFI and TARGET_USE_GINSN. This can be
revisited later when there are other use-cases of creating ginsn's in
GAS, apart from the current use-case of synthesizing CFI for
hand-written asm.
Support is added only for System V AMD64 ABI for ELF at this time. If
the user enables SCFI with --32, GAS issues an error:
"Fatal error: SCFI is not supported for this ABI"
For synthesizing (DWARF) CFI, the SCFI machinery requires the programmer
to adhere to some pre-requisites for their asm:
- Hand-written asm block must begin with a .type foo, @function
It is also highly recommended to ensure that:
- Hand-written asm block ends with a .size foo, .-foo
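The way these two directives bracket a block of ginsns can be sketched as a small state machine, loosely modelled on the obj_elf_type and obj_elf_size hunks of this patch. All fde_demo_ names are hypothetical, and function symbols are compared by name here purely for simplicity.

```c
#include <stddef.h>
#include <string.h>

struct fde_demo
{
  const char *cur_func;  /* Function whose block is open, or NULL.  */
  int begun;             /* Number of blocks started so far.  */
  int ended;             /* Number of blocks closed so far.  */
};

static void
fde_demo_end (struct fde_demo *d)
{
  d->cur_func = NULL;
  d->ended++;
}

/* .type NAME, @function: close any block that is still open (its .size
   directive was missing), then start a new block for NAME.  */
static void
fde_demo_on_type_function (struct fde_demo *d, const char *name)
{
  if (d->cur_func != NULL)
    fde_demo_end (d);
  d->cur_func = name;
  d->begun++;
}

/* .size NAME, .-NAME: close the open block only if it belongs to NAME.  */
static void
fde_demo_on_size (struct fde_demo *d, const char *name)
{
  if (d->cur_func != NULL && strcmp (d->cur_func, name) == 0)
    fde_demo_end (d);
}
```

In this model a missing .size is tolerated because the next .type closes the previous block, which is consistent with .size being recommended rather than required.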
The SCFI machinery encodes some rules which align with the standard
calling convention specified by the ABI. Apart from the rules, the SCFI
machinery employs some heuristics. For example:
- The base register for CFA tracking may be either REG_SP or REG_FP.
- If the base register for CFA tracking is REG_SP, the precise amount of
stack usage (and hence, the value of REG_SP) must be known at all times.
- If using dynamic stack allocation, the function must switch to
FP-based CFA. This means using instructions like the following (in
AMD64) in prologue:
pushq %rbp
movq %rsp, %rbp
and analogous instructions in epilogue.
- Save and Restore of callee-saved registers must be symmetrical.
However, the SCFI machinery at this time only warns if any such
asymmetry is seen.
These heuristics/rules are architecture-independent and are meant to be
employed for all architectures/ABIs using SCFI in the future.
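The CFA-tracking rules listed above can be illustrated with a toy model. This is a sketch under stated assumptions, not the algorithm in scfi.c: the cfa_demo_ names are hypothetical, and the initial offset of 8 assumes the AMD64 convention that the CFA is rsp + 8 at function entry.

```c
#include <stdbool.h>

enum cfa_demo_base { CFA_DEMO_SP, CFA_DEMO_FP };

struct cfa_demo_state
{
  enum cfa_demo_base base;  /* Register the CFA is computed from.  */
  long offset;              /* CFA = base + offset.  */
  bool traceable;           /* False once the CFA can no longer be tracked.  */
};

/* State at function entry (assumed: CFA = rsp + 8 on AMD64).  */
static struct cfa_demo_state
cfa_demo_init (void)
{
  struct cfa_demo_state s = { CFA_DEMO_SP, 8, true };
  return s;
}

/* Stack grows by a known amount, e.g. push or sub $imm, %rsp.  Only
   matters while the CFA is SP-based.  */
static void
cfa_demo_sp_sub (struct cfa_demo_state *s, long bytes)
{
  if (s->base == CFA_DEMO_SP)
    s->offset += bytes;
}

/* Stack grows by an amount unknown at assembly time, e.g. sub %rax, %rsp.
   With an SP-based CFA this makes the CFA untraceable.  */
static void
cfa_demo_sp_sub_unknown (struct cfa_demo_state *s)
{
  if (s->base == CFA_DEMO_SP)
    s->traceable = false;
}

/* movq %rsp, %rbp in the prologue: switch to FP-based CFA tracking.  */
static void
cfa_demo_switch_to_fp (struct cfa_demo_state *s)
{
  if (s->base == CFA_DEMO_SP)
    s->base = CFA_DEMO_FP;
}
```

In this model, a function that executes pushq %rbp and movq %rsp, %rbp before a dynamic allocation stays traceable, while one that adjusts %rsp by an unknown amount with an SP-based CFA does not.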
gas/
* Makefile.am: Add new files.
* Makefile.in: Regenerated.
* as.c (defined): Handle documentation and listing option for
ginsns and SCFI.
* config/obj-elf.c (obj_elf_size): Invoke ginsn_data_end.
(obj_elf_type): Invoke ginsn_data_begin.
* config/tc-i386.c (x86_scfi_callee_saved_p): New function.
(ginsn_prefix_66H_p): Likewise.
(ginsn_dw2_regnum): Likewise.
(x86_ginsn_addsub_reg_mem): Likewise.
(x86_ginsn_addsub_mem_reg): Likewise.
(x86_ginsn_alu_imm): Likewise.
(x86_ginsn_move): Likewise.
(x86_ginsn_lea): Likewise.
(x86_ginsn_jump): Likewise.
(x86_ginsn_jump_cond): Likewise.
(x86_ginsn_enter): Likewise.
(x86_ginsn_safe_to_skip): Likewise.
(x86_ginsn_unhandled): Likewise.
(x86_ginsn_new): New functionality to generate ginsns.
(md_assemble): Invoke x86_ginsn_new.
(s_insn): Likewise.
(i386_target_format): Add hard error for usage of SCFI with non AMD64 ABIs.
* config/tc-i386.h (TARGET_USE_GINSN): New definition.
(TARGET_USE_SCFI): Likewise.
(SCFI_MAX_REG_ID): Likewise.
(REG_FP): Likewise.
(REG_SP): Likewise.
(SCFI_INIT_CFA_OFFSET): Likewise.
(SCFI_CALLEE_SAVED_REG_P): Likewise.
(x86_scfi_callee_saved_p): Likewise.
* gas/listing.h (LISTING_GINSN_SCFI): New define for ginsn and
SCFI.
* gas/read.c (read_a_source_file): Close SCFI processing at end
of file read.
* gas/scfidw2gen.c (scfi_process_cfi_label): Add implementation.
(scfi_process_cfi_signal_frame): Likewise.
* subsegs.h (struct frch_ginsn_data): New forward declaration.
(struct frchain): New member for ginsn data.
* gas/subsegs.c (subseg_set_rest): Initialize the new member.
* symbols.c (colon): Invoke ginsn_frob_label to convey
user-defined labels to ginsn infrastructure.
* ginsn.c: New file.
* ginsn.h: New file.
* scfi.c: New file.
* scfi.h: New file.
-rw-r--r-- | gas/Makefile.am | 4 | ||||
-rw-r--r-- | gas/Makefile.in | 19 | ||||
-rw-r--r-- | gas/as.c | 5 | ||||
-rw-r--r-- | gas/config/obj-elf.c | 18 | ||||
-rw-r--r-- | gas/config/tc-i386.c | 1113 | ||||
-rw-r--r-- | gas/config/tc-i386.h | 21 | ||||
-rw-r--r-- | gas/ginsn.c | 1259 | ||||
-rw-r--r-- | gas/ginsn.h | 384 | ||||
-rw-r--r-- | gas/listing.h | 1 | ||||
-rw-r--r-- | gas/read.c | 10 | ||||
-rw-r--r-- | gas/scfi.c | 1232 | ||||
-rw-r--r-- | gas/scfi.h | 38 | ||||
-rw-r--r-- | gas/scfidw2gen.c | 28 | ||||
-rw-r--r-- | gas/subsegs.c | 1 | ||||
-rw-r--r-- | gas/subsegs.h | 2 | ||||
-rw-r--r-- | gas/symbols.c | 3 |
16 files changed, 4128 insertions, 10 deletions
diff --git a/gas/Makefile.am b/gas/Makefile.am index 2848fac..37ca095 100644 --- a/gas/Makefile.am +++ b/gas/Makefile.am @@ -82,6 +82,7 @@ GAS_CFILES = \ flonum-mult.c \ frags.c \ gen-sframe.c \ + ginsn.c \ hash.c \ input-file.c \ input-scrub.c \ @@ -94,6 +95,7 @@ GAS_CFILES = \ remap.c \ sb.c \ scfidw2gen.c \ + scfi.c \ sframe-opt.c \ stabs.c \ subsegs.c \ @@ -119,6 +121,7 @@ HFILES = \ flonum.h \ frags.h \ gen-sframe.h \ + ginsn.h \ hash.h \ input-file.h \ itbl-lex.h \ @@ -130,6 +133,7 @@ HFILES = \ read.h \ sb.h \ scfidw2gen.h \ + scfi.h \ subsegs.h \ symbols.h \ tc.h \ diff --git a/gas/Makefile.in b/gas/Makefile.in index 7d4fbfc..bc25765 100644 --- a/gas/Makefile.in +++ b/gas/Makefile.in @@ -173,12 +173,13 @@ am__objects_1 = app.$(OBJEXT) as.$(OBJEXT) atof-generic.$(OBJEXT) \ ecoff.$(OBJEXT) ehopt.$(OBJEXT) expr.$(OBJEXT) \ flonum-copy.$(OBJEXT) flonum-konst.$(OBJEXT) \ flonum-mult.$(OBJEXT) frags.$(OBJEXT) gen-sframe.$(OBJEXT) \ - hash.$(OBJEXT) input-file.$(OBJEXT) input-scrub.$(OBJEXT) \ - listing.$(OBJEXT) literal.$(OBJEXT) macro.$(OBJEXT) \ - messages.$(OBJEXT) output-file.$(OBJEXT) read.$(OBJEXT) \ - remap.$(OBJEXT) sb.$(OBJEXT) scfidw2gen.$(OBJEXT) \ - sframe-opt.$(OBJEXT) stabs.$(OBJEXT) subsegs.$(OBJEXT) \ - symbols.$(OBJEXT) write.$(OBJEXT) + ginsn.$(OBJEXT) hash.$(OBJEXT) input-file.$(OBJEXT) \ + input-scrub.$(OBJEXT) listing.$(OBJEXT) literal.$(OBJEXT) \ + macro.$(OBJEXT) messages.$(OBJEXT) output-file.$(OBJEXT) \ + read.$(OBJEXT) remap.$(OBJEXT) sb.$(OBJEXT) \ + scfidw2gen.$(OBJEXT) scfi.$(OBJEXT) sframe-opt.$(OBJEXT) \ + stabs.$(OBJEXT) subsegs.$(OBJEXT) symbols.$(OBJEXT) \ + write.$(OBJEXT) am_as_new_OBJECTS = $(am__objects_1) am__dirstamp = $(am__leading_dot)dirstamp as_new_OBJECTS = $(am_as_new_OBJECTS) @@ -581,6 +582,7 @@ GAS_CFILES = \ flonum-mult.c \ frags.c \ gen-sframe.c \ + ginsn.c \ hash.c \ input-file.c \ input-scrub.c \ @@ -593,6 +595,7 @@ GAS_CFILES = \ remap.c \ sb.c \ scfidw2gen.c \ + scfi.c \ sframe-opt.c \ stabs.c \ subsegs.c \ @@ 
-617,6 +620,7 @@ HFILES = \ flonum.h \ frags.h \ gen-sframe.h \ + ginsn.h \ hash.h \ input-file.h \ itbl-lex.h \ @@ -628,6 +632,7 @@ HFILES = \ read.h \ sb.h \ scfidw2gen.h \ + scfi.h \ subsegs.h \ symbols.h \ tc.h \ @@ -1336,6 +1341,7 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/flonum-mult.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/frags.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/gen-sframe.Po@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginsn.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/hash.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/input-file.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/input-scrub.Po@am__quote@ @@ -1350,6 +1356,7 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/read.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/remap.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sb.Po@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/scfi.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/scfidw2gen.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sframe-opt.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stabs.Po@am__quote@ @@ -45,6 +45,7 @@ #include "codeview.h" #include "bfdver.h" #include "write.h" +#include "ginsn.h" #ifdef HAVE_ITBL_CPU #include "itbl-ops.h" @@ -245,6 +246,7 @@ Options:\n\ d omit debugging directives\n\ g include general info\n\ h include high-level source\n\ + i include ginsn and synthesized CFI info\n\ l include assembly\n\ m include macro expansions\n\ n omit forms processing\n\ @@ -1088,6 +1090,9 @@ This program has absolutely no warranty.\n")); case 'h': listing |= LISTING_HLL; break; + case 'i': + listing |= LISTING_GINSN_SCFI; + break; case 'l': listing |= LISTING_LISTING; break; diff --git a/gas/config/obj-elf.c b/gas/config/obj-elf.c index 1f34f5b..00c6e38 100644 --- 
a/gas/config/obj-elf.c +++ b/gas/config/obj-elf.c @@ -24,6 +24,7 @@ #include "subsegs.h" #include "obstack.h" #include "dwarf2dbg.h" +#include "ginsn.h" #ifndef ECOFF_DEBUGGING #define ECOFF_DEBUGGING 0 @@ -2311,6 +2312,13 @@ obj_elf_size (int ignore ATTRIBUTE_UNUSED) symbol_get_obj (sym)->size = XNEW (expressionS); *symbol_get_obj (sym)->size = exp; } + + /* If the symbol in the directive matches the current function being + processed, indicate end of the current stream of ginsns. */ + if (flag_synth_cfi + && S_IS_FUNCTION (sym) && sym == ginsn_data_func_symbol ()) + ginsn_data_end (symbol_temp_new_now ()); + demand_empty_rest_of_line (); } @@ -2499,6 +2507,16 @@ obj_elf_type (int ignore ATTRIBUTE_UNUSED) elfsym->symbol.flags &= ~mask; } + if (S_IS_FUNCTION (sym) && flag_synth_cfi) + { + /* When using SCFI, .type directive indicates start of a new FDE for SCFI + processing. So, we must first demarcate the previous block of ginsns, + if any, to mark the end of a previous FDE. */ + if (frchain_now->frch_ginsn_data) + ginsn_data_end (symbol_temp_new_now ()); + ginsn_data_begin (sym); + } + demand_empty_rest_of_line (); } diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index b25fa37..455e5a3 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -30,6 +30,7 @@ #include "subsegs.h" #include "dwarf2dbg.h" #include "dw2gencfi.h" +#include "scfi.h" #include "gen-sframe.h" #include "sframe.h" #include "elf/x86-64.h" @@ -5327,6 +5328,1095 @@ static INLINE bool may_need_pass2 (const insn_template *t) && t->base_opcode == 0x63); } +#if defined (OBJ_MAYBE_ELF) || defined (OBJ_ELF) + +/* DWARF register number for EFLAGS. Used for pushf/popf insns. */ +#define GINSN_DW2_REGNUM_EFLAGS 49 +/* DWARF register number for RSI. Used as dummy value when RegIP/RegIZ. */ +#define GINSN_DW2_REGNUM_RSI_DUMMY 4 + +/* Identify the callee-saved registers in System V AMD64 ABI. */ + +bool +x86_scfi_callee_saved_p (unsigned int dw2reg_num) +{ + if (dw2reg_num == 3 /* rbx. 
*/ + || dw2reg_num == REG_FP /* rbp. */ + || dw2reg_num == REG_SP /* rsp. */ + || (dw2reg_num >= 12 && dw2reg_num <= 15) /* r12 - r15. */) + return true; + + return false; +} + +/* Check whether an instruction prefix which affects operation size + accompanies. For insns in the legacy space, setting REX.W takes precedence + over the operand-size prefix (66H) when both are used. + + The current users of this API are in the handlers for PUSH, POP or other + instructions which affect the stack pointer implicitly: the operation size + (16, 32, or 64 bits) determines the amount by which the stack pointer is + incremented / decremented (2, 4 or 8). */ + +static bool +ginsn_opsize_prefix_p (void) +{ + return (!(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX]); +} + +/* Get the DWARF register number for the given register entry. + For specific byte/word/dword register accesses like al, cl, ah, ch, r8d, + r20w etc., we need to identify the DWARF register number for the + corresponding 8-byte GPR. + + This function is a hack - it relies on relative ordering of reg entries in + the i386_regtab. FIXME - it will be good to allow a more direct way to get + this information. */ + +static unsigned int +ginsn_dw2_regnum (const reg_entry *ireg) +{ + /* PS: Note the data type here as int32_t, because of Dw2Inval (-1). */ + int32_t dwarf_reg = Dw2Inval; + const reg_entry *temp = ireg; + unsigned int idx = 0; + + /* ginsn creation is available for AMD64 abi only ATM. Other flag_code + are not expected. */ + gas_assert (ireg && flag_code == CODE_64BIT); + + /* Watch out for RegIP, RegIZ. These are expected to appear only with + base/index addressing modes. Although creating inaccurate data + dependencies, using a dummy value (lets say volatile register rsi) will + not hurt SCFI. TBD_GINSN_GEN_NOT_SCFI. 
*/ + if (ireg->reg_num == RegIP || ireg->reg_num == RegIZ) + return GINSN_DW2_REGNUM_RSI_DUMMY; + + dwarf_reg = ireg->dw2_regnum[flag_code >> 1]; + + if (dwarf_reg == Dw2Inval) + { + if (ireg <= &i386_regtab[3]) + /* For al, cl, dl, bl, bump over to axl, cxl, dxl, bxl respectively by + adding 8. */ + temp = ireg + 8; + else if (ireg <= &i386_regtab[7]) + /* For ah, ch, dh, bh, bump over to axl, cxl, dxl, bxl respectively by + adding 4. */ + temp = ireg + 4; + else + { + /* The code relies on the relative ordering of the reg entries in + i386_regtab. There are 32 register entries between axl-r31b, + ax-r31w etc. The assertions here ensures the code does not + recurse indefinitely. */ + gas_assert ((temp - &i386_regtab[0]) >= 0); + idx = temp - &i386_regtab[0]; + gas_assert (idx + 32 < i386_regtab_size - 1); + + temp = temp + 32; + } + + dwarf_reg = ginsn_dw2_regnum (temp); + } + + /* Sanity check - failure may indicate state corruption, bad ginsn or + perhaps the i386-reg table and the current function got out of sync. */ + gas_assert (dwarf_reg >= 0); + + return (unsigned int) dwarf_reg; +} + +static ginsnS * +x86_ginsn_addsub_reg_mem (const symbolS *insn_end_sym) +{ + unsigned int dw2_regnum; + unsigned int src1_dw2_regnum; + ginsnS *ginsn = NULL; + ginsnS * (*ginsn_func) (const symbolS *, bool, + enum ginsn_src_type, unsigned int, offsetT, + enum ginsn_src_type, unsigned int, offsetT, + enum ginsn_dst_type, unsigned int, offsetT); + uint16_t opcode = i.tm.base_opcode; + + gas_assert (i.tm.opcode_space == SPACE_BASE + && (opcode == 0x1 || opcode == 0x29)); + ginsn_func = (opcode == 0x1) ? ginsn_new_add : ginsn_new_sub; + + /* op %reg, symbol or even other cases where destination involves indirect + access are unnecessary for SCFI correctness. TBD_GINSN_GEN_NOT_SCFI. */ + if (i.mem_operands) + return ginsn; + + /* op reg, reg/mem. */ + src1_dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + /* Of interest only when second opnd is not memory. 
*/ + if (i.reg_operands == 2) + { + dw2_regnum = ginsn_dw2_regnum (i.op[1].regs); + ginsn = ginsn_func (insn_end_sym, true, + GINSN_SRC_REG, src1_dw2_regnum, 0, + GINSN_SRC_REG, dw2_regnum, 0, + GINSN_DST_REG, dw2_regnum, 0); + ginsn_set_where (ginsn); + } + + return ginsn; +} + +static ginsnS * +x86_ginsn_addsub_mem_reg (const symbolS *insn_end_sym) +{ + unsigned int dw2_regnum; + unsigned int src1_dw2_regnum; + const reg_entry *mem_reg; + int32_t gdisp = 0; + ginsnS *ginsn = NULL; + ginsnS * (*ginsn_func) (const symbolS *, bool, + enum ginsn_src_type, unsigned int, offsetT, + enum ginsn_src_type, unsigned int, offsetT, + enum ginsn_dst_type, unsigned int, offsetT); + uint16_t opcode = i.tm.base_opcode; + + gas_assert (i.tm.opcode_space == SPACE_BASE + && (opcode == 0x3 || opcode == 0x2b)); + ginsn_func = (opcode == 0x3) ? ginsn_new_add : ginsn_new_sub; + + /* op symbol, %reg. */ + if (i.mem_operands && !i.base_reg && !i.index_reg) + return ginsn; + + /* op reg/mem, %reg. */ + dw2_regnum = ginsn_dw2_regnum (i.op[1].regs); + + if (i.reg_operands == 2) + { + src1_dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + ginsn = ginsn_func (insn_end_sym, true, + GINSN_SRC_REG, src1_dw2_regnum, 0, + GINSN_SRC_REG, dw2_regnum, 0, + GINSN_DST_REG, dw2_regnum, 0); + ginsn_set_where (ginsn); + } + else if (i.mem_operands) + { + mem_reg = (i.base_reg) ? 
i.base_reg : i.index_reg; + src1_dw2_regnum = ginsn_dw2_regnum (mem_reg); + if (i.disp_operands == 1) + gdisp = i.op[0].disps->X_add_number; + ginsn = ginsn_func (insn_end_sym, true, + GINSN_SRC_INDIRECT, src1_dw2_regnum, gdisp, + GINSN_SRC_REG, dw2_regnum, 0, + GINSN_DST_REG, dw2_regnum, 0); + ginsn_set_where (ginsn); + } + + return ginsn; +} + +static ginsnS * +x86_ginsn_alu_imm (const symbolS *insn_end_sym) +{ + offsetT src_imm; + unsigned int dw2_regnum; + ginsnS *ginsn = NULL; + enum ginsn_src_type src_type = GINSN_SRC_REG; + enum ginsn_dst_type dst_type = GINSN_DST_REG; + + ginsnS * (*ginsn_func) (const symbolS *, bool, + enum ginsn_src_type, unsigned int, offsetT, + enum ginsn_src_type, unsigned int, offsetT, + enum ginsn_dst_type, unsigned int, offsetT); + + /* FIXME - create ginsn where dest is REG_SP / REG_FP only ? */ + /* Map for insn.tm.extension_opcode + 000 ADD 100 AND + 001 OR 101 SUB + 010 ADC 110 XOR + 011 SBB 111 CMP */ + + /* add/sub/and imm, %reg only at this time for SCFI. + Although all three ('and', 'or' , 'xor') make the destination reg + untraceable, 'and' op is handled but not 'or' / 'xor' because we will look + into supporting the DRAP pattern at some point. Other opcodes ('adc', + 'sbb' and 'cmp') are not generated here either. The ginsn representation + does not have support for the latter three opcodes; GINSN_TYPE_OTHER may + be added for these after x86_ginsn_unhandled () invocation if the + destination register is REG_SP or REG_FP. */ + if (i.tm.extension_opcode == 5) + ginsn_func = ginsn_new_sub; + else if (i.tm.extension_opcode == 4) + ginsn_func = ginsn_new_and; + else if (i.tm.extension_opcode == 0) + ginsn_func = ginsn_new_add; + else + return ginsn; + + /* TBD_GINSN_REPRESENTATION_LIMIT: There is no representation for when a + symbol is used as an operand, like so: + addq $simd_cmp_op+8, %rdx + Skip generating any ginsn for this. 
*/ + if (i.imm_operands == 1 + && i.op[0].imms->X_op != O_constant) + return ginsn; + + /* addq $1, symbol + addq $1, -16(%rbp) + These are not of interest for SCFI. Also, TBD_GINSN_GEN_NOT_SCFI. */ + if (i.mem_operands == 1) + return ginsn; + + gas_assert (i.imm_operands == 1); + src_imm = i.op[0].imms->X_add_number; + /* The second operand may be a register or indirect access. For SCFI, only + the case when the second opnd is a register is interesting. Revisit this + if generating ginsns for a different gen mode TBD_GINSN_GEN_NOT_SCFI. */ + if (i.reg_operands == 1) + { + dw2_regnum = ginsn_dw2_regnum (i.op[1].regs); + /* For ginsn, keep the imm as second src operand. */ + ginsn = ginsn_func (insn_end_sym, true, + src_type, dw2_regnum, 0, + GINSN_SRC_IMM, 0, src_imm, + dst_type, dw2_regnum, 0); + + ginsn_set_where (ginsn); + } + + return ginsn; +} + +/* Create ginsn(s) for MOV operations. + + The generated ginsns corresponding to mov with indirect access to memory + (src or dest) suffer with loss of information: when both index and base + registers are at play, only base register gets conveyed in ginsn. Note + this TBD_GINSN_GEN_NOT_SCFI. */ + +static ginsnS * +x86_ginsn_move (const symbolS *insn_end_sym) +{ + ginsnS *ginsn = NULL; + unsigned int dst_reg; + unsigned int src_reg; + offsetT src_disp = 0; + offsetT dst_disp = 0; + const reg_entry *dst = NULL; + const reg_entry *src = NULL; + uint16_t opcode = i.tm.base_opcode; + enum ginsn_src_type src_type = GINSN_SRC_REG; + enum ginsn_dst_type dst_type = GINSN_DST_REG; + + /* mov %reg, symbol or mov symbol, %reg. + Not of interest for SCFI. Also, TBD_GINSN_GEN_NOT_SCFI. */ + if (i.mem_operands == 1 && !i.base_reg && !i.index_reg) + return ginsn; + + gas_assert (i.tm.opcode_space == SPACE_BASE); + if (opcode == 0x8b || opcode == 0x8a) + { + /* mov disp(%reg), %reg. */ + if (i.mem_operands) + { + src = (i.base_reg) ? 
i.base_reg : i.index_reg; + if (i.disp_operands == 1) + src_disp = i.op[0].disps->X_add_number; + src_type = GINSN_SRC_INDIRECT; + } + else + src = i.op[0].regs; + + dst = i.op[1].regs; + } + else if (opcode == 0x89 || opcode == 0x88) + { + /* mov %reg, disp(%reg). */ + src = i.op[0].regs; + if (i.mem_operands) + { + dst = (i.base_reg) ? i.base_reg : i.index_reg; + if (i.disp_operands == 1) + dst_disp = i.op[1].disps->X_add_number; + dst_type = GINSN_DST_INDIRECT; + } + else + dst = i.op[1].regs; + } + + src_reg = ginsn_dw2_regnum (src); + dst_reg = ginsn_dw2_regnum (dst); + + ginsn = ginsn_new_mov (insn_end_sym, true, + src_type, src_reg, src_disp, + dst_type, dst_reg, dst_disp); + ginsn_set_where (ginsn); + + return ginsn; +} + +/* Generate appropriate ginsn for lea. + Sub-cases marked with TBD_GINSN_INFO_LOSS indicate some loss of information + in the ginsn. But these are fine for now for GINSN_GEN_SCFI generation + mode. */ + +static ginsnS * +x86_ginsn_lea (const symbolS *insn_end_sym) +{ + offsetT src_disp = 0; + ginsnS *ginsn = NULL; + unsigned int base_reg; + unsigned int index_reg; + offsetT index_scale; + unsigned int dst_reg; + + if (!i.index_reg && !i.base_reg) + { + /* lea symbol, %rN. */ + dst_reg = ginsn_dw2_regnum (i.op[1].regs); + /* TBD_GINSN_INFO_LOSS - Skip encoding information about the symbol. */ + ginsn = ginsn_new_mov (insn_end_sym, false, + GINSN_SRC_IMM, 0xf /* arbitrary const. */, 0, + GINSN_DST_REG, dst_reg, 0); + } + else if (i.base_reg && !i.index_reg) + { + /* lea -0x2(%base),%dst. */ + base_reg = ginsn_dw2_regnum (i.base_reg); + dst_reg = ginsn_dw2_regnum (i.op[1].regs); + + if (i.disp_operands) + src_disp = i.op[0].disps->X_add_number; + + if (src_disp) + /* Generate an ADD ginsn. */ + ginsn = ginsn_new_add (insn_end_sym, true, + GINSN_SRC_REG, base_reg, 0, + GINSN_SRC_IMM, 0, src_disp, + GINSN_DST_REG, dst_reg, 0); + else + /* Generate a MOV ginsn. 
*/ + ginsn = ginsn_new_mov (insn_end_sym, true, + GINSN_SRC_REG, base_reg, 0, + GINSN_DST_REG, dst_reg, 0); + } + else if (!i.base_reg && i.index_reg) + { + /* lea (,%index,imm), %dst. */ + /* TBD_GINSN_INFO_LOSS - There is no explicit ginsn multiply operation, + instead use GINSN_TYPE_OTHER. Also, note that info about displacement + is not carried forward either. But this is fine because + GINSN_TYPE_OTHER will cause SCFI pass to bail out any which way if + dest reg is interesting. */ + index_scale = i.log2_scale_factor; + index_reg = ginsn_dw2_regnum (i.index_reg); + dst_reg = ginsn_dw2_regnum (i.op[1].regs); + ginsn = ginsn_new_other (insn_end_sym, true, + GINSN_SRC_REG, index_reg, + GINSN_SRC_IMM, index_scale, + GINSN_DST_REG, dst_reg); + /* FIXME - It seems to make sense to represent a scale factor of 1 + correctly here (i.e. not as "other", but rather similar to the + base-without- index case above)? */ + } + else + { + /* lea disp(%base,%index,imm) %dst. */ + /* TBD_GINSN_INFO_LOSS - Skip adding information about the disp and imm + for index reg. */ + base_reg = ginsn_dw2_regnum (i.base_reg); + index_reg = ginsn_dw2_regnum (i.index_reg); + dst_reg = ginsn_dw2_regnum (i.op[1].regs); + /* Generate an GINSN_TYPE_OTHER ginsn. */ + ginsn = ginsn_new_other (insn_end_sym, true, + GINSN_SRC_REG, base_reg, + GINSN_SRC_REG, index_reg, + GINSN_DST_REG, dst_reg); + } + + ginsn_set_where (ginsn); + + return ginsn; +} + +static ginsnS * +x86_ginsn_jump (const symbolS *insn_end_sym, bool cond_p) +{ + ginsnS *ginsn = NULL; + const symbolS *src_symbol; + ginsnS * (*ginsn_func) (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, + const symbolS *src_ginsn_sym); + + gas_assert (i.disp_operands == 1); + + ginsn_func = cond_p ? 
ginsn_new_jump_cond : ginsn_new_jump; + if (i.op[0].disps->X_op == O_symbol && !i.op[0].disps->X_add_number) + { + src_symbol = i.op[0].disps->X_add_symbol; + ginsn = ginsn_func (insn_end_sym, true, + GINSN_SRC_SYMBOL, 0, src_symbol); + + ginsn_set_where (ginsn); + } + else + { + /* A non-zero addend in jump/JCC target makes control-flow tracking + difficult. Skip SCFI for now. */ + as_bad (_("SCFI: `%s' insn with non-zero addend to sym not supported"), + cond_p ? "JCC" : "jmp"); + return ginsn; + } + + return ginsn; +} + +static ginsnS * +x86_ginsn_enter (const symbolS *insn_end_sym) +{ + ginsnS *ginsn = NULL; + ginsnS *ginsn_next = NULL; + ginsnS *ginsn_last = NULL; + /* In 64-bit mode, the default stack update size is 8 bytes. */ + int stack_opnd_size = 8; + + gas_assert (i.imm_operands == 2); + + /* For non-zero size operands, bail out as untraceable for SCFI. */ + if (i.op[0].imms->X_op != O_constant || i.op[0].imms->X_add_symbol != 0 + || i.op[1].imms->X_op != O_constant || i.op[1].imms->X_add_symbol != 0) + { + as_bad ("SCFI: enter insn with non-zero operand not supported"); + return ginsn; + } + + /* Check if this is a 16-bit op. */ + if (ginsn_opsize_prefix_p ()) + stack_opnd_size = 2; + + /* If the nesting level is 0, the processor pushes the frame pointer from + the BP/EBP/RBP register onto the stack, copies the current stack + pointer from the SP/ESP/RSP register into the BP/EBP/RBP register, and + loads the SP/ESP/RSP register with the current stack-pointer value + minus the value in the size operand. 
*/ + ginsn = ginsn_new_sub (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_SRC_IMM, 0, stack_opnd_size, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn); + ginsn_next = ginsn_new_store (insn_end_sym, false, + GINSN_SRC_REG, REG_FP, + GINSN_DST_INDIRECT, REG_SP, 0); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + ginsn_last = ginsn_new_mov (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_DST_REG, REG_FP, 0); + ginsn_set_where (ginsn_last); + gas_assert (!ginsn_link_next (ginsn_next, ginsn_last)); + + return ginsn; +} + +static ginsnS * +x86_ginsn_leave (const symbolS *insn_end_sym) +{ + ginsnS *ginsn = NULL; + ginsnS *ginsn_next = NULL; + ginsnS *ginsn_last = NULL; + /* In 64-bit mode, the default stack update size is 8 bytes. */ + int stack_opnd_size = 8; + + /* Check if this is a 16-bit op. */ + if (ginsn_opsize_prefix_p ()) + stack_opnd_size = 2; + + /* The 'leave' instruction copies the contents of the RBP register + into the RSP register to release all stack space allocated to the + procedure. */ + ginsn = ginsn_new_mov (insn_end_sym, false, + GINSN_SRC_REG, REG_FP, 0, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn); + /* Then it restores the old value of the RBP register from the stack. */ + ginsn_next = ginsn_new_load (insn_end_sym, false, + GINSN_SRC_INDIRECT, REG_SP, 0, + GINSN_DST_REG, REG_FP); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + ginsn_last = ginsn_new_add (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_SRC_IMM, 0, stack_opnd_size, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn_next, ginsn_last)); + + return ginsn; +} + +/* Check if an instruction is whitelisted. + + Some instructions may appear with REG_SP or REG_FP as destination, because + which they are deemed 'interesting' for SCFI. Whitelist them here if they + do not affect SCFI correctness. 
*/ + +static bool +x86_ginsn_safe_to_skip_p (void) +{ + bool skip_p = false; + uint16_t opcode = i.tm.base_opcode; + + switch (opcode) + { + case 0x80: + case 0x81: + case 0x83: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* cmp imm, reg/rem. */ + if (i.tm.extension_opcode == 7) + skip_p = true; + break; + + case 0x38: + case 0x39: + case 0x3a: + case 0x3b: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* cmp imm/reg/mem, reg/rem. */ + skip_p = true; + break; + + case 0xf6: + case 0xf7: + case 0x84: + case 0x85: + /* test imm/reg/mem, reg/mem. */ + if (i.tm.opcode_space != SPACE_BASE) + break; + skip_p = true; + break; + + default: + break; + } + + return skip_p; +} + +#define X86_GINSN_UNHANDLED_NONE 0 +#define X86_GINSN_UNHANDLED_DEST_REG 1 +#define X86_GINSN_UNHANDLED_CFG 2 +#define X86_GINSN_UNHANDLED_STACKOP 3 +#define X86_GINSN_UNHANDLED_UNEXPECTED 4 + +/* Check the input insn for its impact on the correctness of the synthesized + CFI. Returns an error code to the caller. */ + +static int +x86_ginsn_unhandled (void) +{ + int err = X86_GINSN_UNHANDLED_NONE; + const reg_entry *reg_op; + unsigned int dw2_regnum; + + /* Keep an eye out for instructions affecting control flow. */ + if (i.tm.opcode_modifier.jump) + err = X86_GINSN_UNHANDLED_CFG; + /* Also, for any instructions involving an implicit update to the stack + pointer. */ + else if (i.tm.opcode_modifier.operandconstraint == IMPLICIT_STACK_OP) + err = X86_GINSN_UNHANDLED_STACKOP; + /* Finally, also check if the missed instructions are affecting REG_SP or + REG_FP. The destination operand is the last at all stages of assembly + (due to following AT&T syntax layout in the internal representation). In + case of Intel syntax input, this still remains true as swap_operands () + is done by now. + PS: These checks do not involve index / base reg, as indirect memory + accesses via REG_SP or REG_FP do not affect SCFI correctness. 
+ (Also note these instructions are candidates for other ginsn generation + modes in future. TBD_GINSN_GEN_NOT_SCFI.) */ + else if (i.operands && i.reg_operands + && !(i.flags[i.operands - 1] & Operand_Mem)) + { + reg_op = i.op[i.operands - 1].regs; + if (reg_op) + { + dw2_regnum = ginsn_dw2_regnum (reg_op); + if (dw2_regnum == REG_SP || dw2_regnum == REG_FP) + err = X86_GINSN_UNHANDLED_DEST_REG; + } + else + /* Something unexpected. Indicate to caller. */ + err = X86_GINSN_UNHANDLED_UNEXPECTED; + } + + return err; +} + +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current + machine instruction. + + Returns the head of linked list of ginsn(s) added, if success; Returns NULL + if failure. + + The input ginsn_gen_mode GMODE determines the set of minimal necessary + ginsns necessary for correctness of any passes applicable for that mode. + For supporting the GINSN_GEN_SCFI generation mode, following is the list of + machine instructions that must be translated into the corresponding ginsns + to ensure correctness of SCFI: + - All instructions affecting the two registers that could potentially + be used as the base register for CFA tracking. For SCFI, the base + register for CFA tracking is limited to REG_SP and REG_FP only for + now. + - All change of flow instructions: conditional and unconditional branches, + call and return from functions. + - All instructions that can potentially be a register save / restore + operation. + - All instructions that perform stack manipulation implicitly: the CALL, + RET, PUSH, POP, ENTER, and LEAVE instructions. + + The function currently supports GINSN_GEN_SCFI ginsn generation mode only. + To support other generation modes will require work on this target-specific + process of creation of ginsns: + - Some of such places are tagged with TBD_GINSN_GEN_NOT_SCFI to serve as + possible starting points. + - Also note that ginsn representation may need enhancements. 
Specifically, + note some TBD_GINSN_INFO_LOSS and TBD_GINSN_REPRESENTATION_LIMIT markers. + */ + +static ginsnS * +x86_ginsn_new (const symbolS *insn_end_sym, enum ginsn_gen_mode gmode) +{ + int err = 0; + uint16_t opcode; + unsigned int dw2_regnum; + const reg_entry *mem_reg; + ginsnS *ginsn = NULL; + ginsnS *ginsn_next = NULL; + /* In 64-bit mode, the default stack update size is 8 bytes. */ + int stack_opnd_size = 8; + + /* Currently supports generation of selected ginsns, sufficient for + the use-case of SCFI only. */ + if (gmode != GINSN_GEN_SCFI) + return ginsn; + + opcode = i.tm.base_opcode; + + /* Until it is clear how to handle APX NDD and other new opcodes, disallow + them from SCFI. */ + if (is_apx_rex2_encoding () + || (i.tm.opcode_modifier.evex && is_apx_evex_encoding ())) + { + as_bad (_("SCFI: unsupported APX op %#x may cause incorrect CFI"), + opcode); + return ginsn; + } + + switch (opcode) + { + case 0x1: /* add reg, reg/mem. */ + case 0x29: /* sub reg, reg/mem. */ + if (i.tm.opcode_space != SPACE_BASE) + break; + ginsn = x86_ginsn_addsub_reg_mem (insn_end_sym); + break; + + case 0x3: /* add reg/mem, reg. */ + case 0x2b: /* sub reg/mem, reg. */ + if (i.tm.opcode_space != SPACE_BASE) + break; + ginsn = x86_ginsn_addsub_mem_reg (insn_end_sym); + break; + + case 0xa0: /* push fs. */ + case 0xa8: /* push gs. */ + /* push fs / push gs have opcode_space == SPACE_0F. */ + if (i.tm.opcode_space != SPACE_0F) + break; + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + /* Check if operation size is 16-bit. */ + if (ginsn_opsize_prefix_p ()) + stack_opnd_size = 2; + ginsn = ginsn_new_sub (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_SRC_IMM, 0, stack_opnd_size, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn); + ginsn_next = ginsn_new_store (insn_end_sym, false, + GINSN_SRC_REG, dw2_regnum, + GINSN_DST_INDIRECT, REG_SP, 0); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + + case 0xa1: /* pop fs. 
*/ + case 0xa9: /* pop gs. */ + /* pop fs / pop gs have opcode_space == SPACE_0F. */ + if (i.tm.opcode_space != SPACE_0F) + break; + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + /* Check if operation size is 16-bit. */ + if (ginsn_opsize_prefix_p ()) + stack_opnd_size = 2; + ginsn = ginsn_new_load (insn_end_sym, false, + GINSN_SRC_INDIRECT, REG_SP, 0, + GINSN_DST_REG, dw2_regnum); + ginsn_set_where (ginsn); + ginsn_next = ginsn_new_add (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_SRC_IMM, 0, stack_opnd_size, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + + case 0x50 ... 0x57: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* push reg. */ + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + /* Check if operation size is 16-bit. */ + if (ginsn_opsize_prefix_p ()) + stack_opnd_size = 2; + ginsn = ginsn_new_sub (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_SRC_IMM, 0, stack_opnd_size, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn); + ginsn_next = ginsn_new_store (insn_end_sym, false, + GINSN_SRC_REG, dw2_regnum, + GINSN_DST_INDIRECT, REG_SP, 0); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + + case 0x58 ... 0x5f: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* pop reg. */ + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + ginsn = ginsn_new_load (insn_end_sym, false, + GINSN_SRC_INDIRECT, REG_SP, 0, + GINSN_DST_REG, dw2_regnum); + ginsn_set_where (ginsn); + /* Check if operation size is 16-bit. */ + if (ginsn_opsize_prefix_p ()) + stack_opnd_size = 2; + ginsn_next = ginsn_new_add (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_SRC_IMM, 0, stack_opnd_size, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + + case 0x6a: /* push imm8. */ + case 0x68: /* push imm16/imm32. 
*/ + if (i.tm.opcode_space != SPACE_BASE) + break; + /* Check if operation size is 16-bit. */ + if (ginsn_opsize_prefix_p ()) + stack_opnd_size = 2; + /* Skip getting the value of imm from machine instruction + because this is not important for SCFI. */ + ginsn = ginsn_new_sub (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_SRC_IMM, 0, stack_opnd_size, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn); + ginsn_next = ginsn_new_store (insn_end_sym, false, + GINSN_SRC_IMM, 0, + GINSN_DST_INDIRECT, REG_SP, 0); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + + /* PS: Opcodes 0x80 ... 0x8f with opcode_space SPACE_0F are present + only after relaxation. They do not need to be handled for ginsn + creation. */ + case 0x70 ... 0x7f: + if (i.tm.opcode_space != SPACE_BASE) + break; + ginsn = x86_ginsn_jump (insn_end_sym, true); + break; + + case 0x80: + case 0x81: + case 0x83: + if (i.tm.opcode_space != SPACE_BASE) + break; + ginsn = x86_ginsn_alu_imm (insn_end_sym); + break; + + case 0x8a: /* mov r/m8, r8. */ + case 0x8b: /* mov r/m(16/32/64), r(16/32/64). */ + case 0x88: /* mov r8, r/m8. */ + case 0x89: /* mov r(16/32/64), r/m(16/32/64). */ + if (i.tm.opcode_space != SPACE_BASE) + break; + ginsn = x86_ginsn_move (insn_end_sym); + break; + + case 0x8d: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* lea disp(%base,%index,imm) %dst. */ + ginsn = x86_ginsn_lea (insn_end_sym); + break; + + case 0x8f: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* pop to reg/mem. */ + if (i.mem_operands) + { + mem_reg = (i.base_reg) ? i.base_reg : i.index_reg; + /* Use dummy register if no base or index. Unlike other opcodes, + ginsns must be generated as this affect stack pointer. */ + dw2_regnum = (mem_reg + ? 
ginsn_dw2_regnum (mem_reg) + : GINSN_DW2_REGNUM_RSI_DUMMY); + } + else + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + ginsn = ginsn_new_load (insn_end_sym, false, + GINSN_SRC_INDIRECT, REG_SP, 0, + GINSN_DST_INDIRECT, dw2_regnum); + ginsn_set_where (ginsn); + /* Check if operation size is 16-bit. */ + if (ginsn_opsize_prefix_p ()) + stack_opnd_size = 2; + ginsn_next = ginsn_new_add (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_SRC_IMM, 0, stack_opnd_size, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + + case 0x9c: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* pushf / pushfq. */ + /* Check if operation size is 16-bit. */ + if (ginsn_opsize_prefix_p ()) + stack_opnd_size = 2; + ginsn = ginsn_new_sub (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_SRC_IMM, 0, stack_opnd_size, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn); + /* FIXME - hardcode the actual DWARF reg number value. For SCFI + correctness this is simply a placeholder value, but it is + clearer if the value is correct. */ + dw2_regnum = GINSN_DW2_REGNUM_EFLAGS; + ginsn_next = ginsn_new_store (insn_end_sym, false, + GINSN_SRC_REG, dw2_regnum, + GINSN_DST_INDIRECT, REG_SP, 0); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + + case 0x9d: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* popf / popfq. */ + /* Check if operation size is 16-bit. */ + if (ginsn_opsize_prefix_p ()) + stack_opnd_size = 2; + /* FIXME - hardcode the actual DWARF reg number value. For SCFI + correctness this is simply a placeholder value, but it is + clearer if the value is correct. 
*/ + dw2_regnum = GINSN_DW2_REGNUM_EFLAGS; + ginsn = ginsn_new_load (insn_end_sym, false, + GINSN_SRC_INDIRECT, REG_SP, 0, + GINSN_DST_REG, dw2_regnum); + ginsn_set_where (ginsn); + ginsn_next = ginsn_new_add (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_SRC_IMM, 0, stack_opnd_size, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + + case 0xff: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* push from reg/mem. */ + if (i.tm.extension_opcode == 6) + { + /* Check if operation size is 16-bit. */ + if (ginsn_opsize_prefix_p ()) + stack_opnd_size = 2; + ginsn = ginsn_new_sub (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, 0, + GINSN_SRC_IMM, 0, stack_opnd_size, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn); + if (i.mem_operands) + { + mem_reg = (i.base_reg) ? i.base_reg : i.index_reg; + /* Use dummy register if no base or index. Unlike other opcodes, + ginsns must be generated as this affect stack pointer. */ + dw2_regnum = (mem_reg + ? ginsn_dw2_regnum (mem_reg) + : GINSN_DW2_REGNUM_RSI_DUMMY); + } + else + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + ginsn_next = ginsn_new_store (insn_end_sym, false, + GINSN_SRC_INDIRECT, dw2_regnum, + GINSN_DST_INDIRECT, REG_SP, 0); + ginsn_set_where (ginsn_next); + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + } + else if (i.tm.extension_opcode == 4) + { + /* jmp r/m. E.g., notrack jmp *%rax. */ + if (i.reg_operands) + { + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + ginsn = ginsn_new_jump (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, NULL); + ginsn_set_where (ginsn); + } + else if (i.mem_operands && i.index_reg) + { + /* jmp *0x0(,%rax,8). 
*/ + dw2_regnum = ginsn_dw2_regnum (i.index_reg); + ginsn = ginsn_new_jump (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, NULL); + ginsn_set_where (ginsn); + } + else if (i.mem_operands && i.base_reg) + { + dw2_regnum = ginsn_dw2_regnum (i.base_reg); + ginsn = ginsn_new_jump (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, NULL); + ginsn_set_where (ginsn); + } + } + else if (i.tm.extension_opcode == 2) + { + /* 0xFF /2 (call). */ + if (i.reg_operands) + { + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + ginsn = ginsn_new_call (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, NULL); + ginsn_set_where (ginsn); + } + else if (i.mem_operands && i.base_reg) + { + dw2_regnum = ginsn_dw2_regnum (i.base_reg); + ginsn = ginsn_new_call (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, NULL); + ginsn_set_where (ginsn); + } + } + break; + + case 0xc2: /* ret imm16. */ + case 0xc3: /* ret. */ + if (i.tm.opcode_space != SPACE_BASE) + break; + /* Near ret. */ + ginsn = ginsn_new_return (insn_end_sym, true); + ginsn_set_where (ginsn); + break; + + case 0xc8: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* enter. */ + ginsn = x86_ginsn_enter (insn_end_sym); + break; + + case 0xc9: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* leave. */ + ginsn = x86_ginsn_leave (insn_end_sym); + break; + + case 0xe0 ... 0xe2: /* loop / loope / loopne. */ + case 0xe3: /* jecxz / jrcxz. */ + if (i.tm.opcode_space != SPACE_BASE) + break; + ginsn = x86_ginsn_jump (insn_end_sym, true); + ginsn_set_where (ginsn); + break; + + case 0xe8: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* PS: SCFI machinery does not care about which func is being + called. OK to skip that info. */ + ginsn = ginsn_new_call (insn_end_sym, true, + GINSN_SRC_SYMBOL, 0, NULL); + ginsn_set_where (ginsn); + break; + + /* PS: opcode 0xe9 appears only after relaxation. Skip here. */ + case 0xeb: + /* If opcode_space != SPACE_BASE, this is not a jmp insn. Skip it + for GINSN_GEN_SCFI. 
*/ + if (i.tm.opcode_space != SPACE_BASE) + break; + /* Unconditional jmp. */ + ginsn = x86_ginsn_jump (insn_end_sym, false); + ginsn_set_where (ginsn); + break; + + default: + /* TBD_GINSN_GEN_NOT_SCFI: Skip all other opcodes uninteresting for + GINSN_GEN_SCFI mode. */ + break; + } + + if (!ginsn && !x86_ginsn_safe_to_skip_p ()) + { + /* For all unhandled insns that are not whitelisted, check that they do + not impact SCFI correctness. */ + err = x86_ginsn_unhandled (); + switch (err) + { + case X86_GINSN_UNHANDLED_NONE: + break; + case X86_GINSN_UNHANDLED_DEST_REG: + /* Not all writes to REG_FP are harmful in context of SCFI. Simply + generate a GINSN_TYPE_OTHER with destination set to the + appropriate register. The SCFI machinery will bail out if this + ginsn affects SCFI correctness. */ + dw2_regnum = ginsn_dw2_regnum (i.op[i.operands - 1].regs); + ginsn = ginsn_new_other (insn_end_sym, true, + GINSN_SRC_IMM, 0, + GINSN_SRC_IMM, 0, + GINSN_DST_REG, dw2_regnum); + ginsn_set_where (ginsn); + break; + case X86_GINSN_UNHANDLED_CFG: + case X86_GINSN_UNHANDLED_STACKOP: + as_bad (_("SCFI: unhandled op %#x may cause incorrect CFI"), opcode); + break; + case X86_GINSN_UNHANDLED_UNEXPECTED: + as_bad (_("SCFI: unexpected op %#x may cause incorrect CFI"), + opcode); + break; + default: + abort (); + break; + } + } + + return ginsn; +} + +#endif + /* This is the guts of the machine-dependent assembler. LINE points to a machine dependent instruction. This function is supposed to emit the frags/bytes it assembles to. */ @@ -5870,6 +6960,17 @@ md_assemble (char *line) /* We are ready to output the insn. */ output_insn (last_insn); +#if defined (OBJ_MAYBE_ELF) || defined (OBJ_ELF) + /* PS: SCFI is enabled only for System V AMD64 ABI. The ABI check has been + performed in i386_target_format. 
*/ + if (IS_ELF && flag_synth_cfi) + { + ginsnS *ginsn; + ginsn = x86_ginsn_new (symbol_temp_new_now (), frch_ginsn_gen_mode ()); + frch_ginsn_data_append (ginsn); + } +#endif + insert_lfence_after (); if (i.tm.opcode_modifier.isprefix) @@ -12144,6 +13245,13 @@ s_insn (int dummy ATTRIBUTE_UNUSED) last_insn->name = ".insn directive"; last_insn->file = as_where (&last_insn->line); +#if defined (OBJ_MAYBE_ELF) || defined (OBJ_ELF) + /* PS: SCFI is enabled only for System V AMD64 ABI. The ABI check has been + performed in i386_target_format. */ + if (IS_ELF && flag_synth_cfi) + as_bad (_("SCFI: hand-crafting instructions not supported")); +#endif + done: *saved_ilp = saved_char; input_line_pointer = line; @@ -15788,6 +16896,11 @@ i386_target_format (void) else as_fatal (_("unknown architecture")); +#if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF) + if (IS_ELF && flag_synth_cfi && x86_elf_abi != X86_64_ABI) + as_fatal (_("SCFI is not supported for this ABI")); +#endif + if (cpu_flags_all_zero (&cpu_arch_isa_flags)) cpu_arch_isa_flags = cpu_arch[flag_code == CODE_64BIT].enable; diff --git a/gas/config/tc-i386.h b/gas/config/tc-i386.h index 2499aa6..b93799a 100644 --- a/gas/config/tc-i386.h +++ b/gas/config/tc-i386.h @@ -415,6 +415,27 @@ extern bfd_vma x86_64_section_letter (int, const char **); extern void x86_cleanup (void); #define md_cleanup() x86_cleanup () +#define TARGET_USE_GINSN 1 +/* Allow GAS to synthesize DWARF CFI for hand-written asm. + PS: TARGET_USE_CFIPOP is a pre-condition. */ +#define TARGET_USE_SCFI 1 +/* Identify the maximum DWARF register number of all the registers being + tracked for SCFI. This is the last DWARF register number of the set + of SP, BP, and all callee-saved registers. For AMD64, this means + R15 (15). Use SCFI_CALLEE_SAVED_REG_P to identify which registers + are callee-saved from this set. */ +#define SCFI_MAX_REG_ID 15 +/* Identify the DWARF register number of the frame-pointer register. 
*/ +#define REG_FP 6 +/* Identify the DWARF register number of the stack-pointer register. */ +#define REG_SP 7 +/* Some ABIs, like AMD64, use stack for call instruction. For such an ABI, + identify the initial (CFA) offset from RSP at the entry of function. */ +#define SCFI_INIT_CFA_OFFSET 8 + +#define SCFI_CALLEE_SAVED_REG_P(dw2reg) x86_scfi_callee_saved_p (dw2reg) +extern bool x86_scfi_callee_saved_p (uint32_t dw2reg_num); + /* Whether SFrame stack trace info is supported. */ extern bool x86_support_sframe_p (void); #define support_sframe_p x86_support_sframe_p diff --git a/gas/ginsn.c b/gas/ginsn.c new file mode 100644 index 0000000..5f6a67c --- /dev/null +++ b/gas/ginsn.c @@ -0,0 +1,1259 @@ +/* ginsn.h - GAS instruction representation. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GAS, the GNU Assembler. + + GAS is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GAS is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GAS; see the file COPYING. If not, write to the Free + Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA + 02110-1301, USA. 
*/ + +#include "as.h" +#include "subsegs.h" +#include "ginsn.h" +#include "scfi.h" + +#ifdef TARGET_USE_GINSN + +static const char *const ginsn_type_names[] = +{ +#define _GINSN_TYPE_ITEM(NAME, STR) STR, + _GINSN_TYPES +#undef _GINSN_TYPE_ITEM +}; + +static ginsnS * +ginsn_alloc (void) +{ + ginsnS *ginsn = XCNEW (ginsnS); + return ginsn; +} + +static ginsnS * +ginsn_init (enum ginsn_type type, const symbolS *sym, bool real_p) +{ + ginsnS *ginsn = ginsn_alloc (); + ginsn->type = type; + ginsn->sym = sym; + if (real_p) + ginsn->flags |= GINSN_F_INSN_REAL; + return ginsn; +} + +static void +ginsn_cleanup (ginsnS **ginsnp) +{ + ginsnS *ginsn; + + if (!ginsnp || !*ginsnp) + return; + + ginsn = *ginsnp; + if (ginsn->scfi_ops) + { + scfi_ops_cleanup (ginsn->scfi_ops); + ginsn->scfi_ops = NULL; + } + + free (ginsn); + *ginsnp = NULL; +} + +static void +ginsn_set_src (struct ginsn_src *src, enum ginsn_src_type type, unsigned int reg, + offsetT immdisp) +{ + if (!src) + return; + + src->type = type; + /* Even when the use-case is SCFI, the value of reg may be > SCFI_MAX_REG_ID. + E.g., in AMD64, push fs etc. 
*/ + src->reg = reg; + src->immdisp = immdisp; +} + +static void +ginsn_set_dst (struct ginsn_dst *dst, enum ginsn_dst_type type, unsigned int reg, + offsetT disp) +{ + if (!dst) + return; + + dst->type = type; + dst->reg = reg; + + if (type == GINSN_DST_INDIRECT) + dst->disp = disp; +} + +static void +ginsn_set_file_line (ginsnS *ginsn, const char *file, unsigned int line) +{ + if (!ginsn) + return; + + ginsn->file = file; + ginsn->line = line; +} + +struct ginsn_src * +ginsn_get_src1 (ginsnS *ginsn) +{ + return &ginsn->src[0]; +} + +struct ginsn_src * +ginsn_get_src2 (ginsnS *ginsn) +{ + return &ginsn->src[1]; +} + +struct ginsn_dst * +ginsn_get_dst (ginsnS *ginsn) +{ + return &ginsn->dst; +} + +unsigned int +ginsn_get_src_reg (struct ginsn_src *src) +{ + return src->reg; +} + +enum ginsn_src_type +ginsn_get_src_type (struct ginsn_src *src) +{ + return src->type; +} + +offsetT +ginsn_get_src_disp (struct ginsn_src *src) +{ + return src->immdisp; +} + +offsetT +ginsn_get_src_imm (struct ginsn_src *src) +{ + return src->immdisp; +} + +unsigned int +ginsn_get_dst_reg (struct ginsn_dst *dst) +{ + return dst->reg; +} + +enum ginsn_dst_type +ginsn_get_dst_type (struct ginsn_dst *dst) +{ + return dst->type; +} + +offsetT +ginsn_get_dst_disp (struct ginsn_dst *dst) +{ + return dst->disp; +} + +void +label_ginsn_map_insert (const symbolS *label, ginsnS *ginsn) +{ + const char *name = S_GET_NAME (label); + str_hash_insert (frchain_now->frch_ginsn_data->label_ginsn_map, + name, ginsn, 0 /* noreplace. */); +} + +ginsnS * +label_ginsn_map_find (const symbolS *label) +{ + const char *name = S_GET_NAME (label); + ginsnS *ginsn + = (ginsnS *) str_hash_find (frchain_now->frch_ginsn_data->label_ginsn_map, + name); + return ginsn; +} + +ginsnS * +ginsn_new_phantom (const symbolS *sym) +{ + ginsnS *ginsn = ginsn_alloc (); + ginsn->type = GINSN_TYPE_PHANTOM; + ginsn->sym = sym; + /* By default, GINSN_F_INSN_REAL is not set in ginsn->flags. 
*/ + return ginsn; +} + +ginsnS * +ginsn_new_symbol (const symbolS *sym, bool func_begin_p) +{ + ginsnS *ginsn = ginsn_alloc (); + ginsn->type = GINSN_TYPE_SYMBOL; + ginsn->sym = sym; + if (func_begin_p) + ginsn->flags |= GINSN_F_FUNC_MARKER; + return ginsn; +} + +ginsnS * +ginsn_new_symbol_func_begin (const symbolS *sym) +{ + return ginsn_new_symbol (sym, true); +} + +ginsnS * +ginsn_new_symbol_func_end (const symbolS *sym) +{ + return ginsn_new_symbol (sym, false); +} + +ginsnS * +ginsn_new_symbol_user_label (const symbolS *sym) +{ + ginsnS *ginsn = ginsn_new_symbol (sym, false); + ginsn->flags |= GINSN_F_USER_LABEL; + return ginsn; +} + +ginsnS * +ginsn_new_add (const symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp, + enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp, + enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_ADD, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src1_type, src1_reg, src1_disp); + ginsn_set_src (&ginsn->src[1], src2_type, src2_reg, src2_disp); + /* dst info. */ + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp); + + return ginsn; +} + +ginsnS * +ginsn_new_and (const symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp, + enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp, + enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_AND, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src1_type, src1_reg, src1_disp); + ginsn_set_src (&ginsn->src[1], src2_type, src2_reg, src2_disp); + /* dst info. 
*/ + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp); + + return ginsn; +} + +ginsnS * +ginsn_new_call (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, + const symbolS *src_text_sym) + +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_CALL, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src_type, src_reg, 0); + + if (src_type == GINSN_SRC_SYMBOL) + ginsn->src[0].sym = src_text_sym; + + return ginsn; +} + +ginsnS * +ginsn_new_jump (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, + const symbolS *src_ginsn_sym) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_JUMP, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src_type, src_reg, 0); + + if (src_type == GINSN_SRC_SYMBOL) + ginsn->src[0].sym = src_ginsn_sym; + + return ginsn; +} + +ginsnS * +ginsn_new_jump_cond (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, + const symbolS *src_ginsn_sym) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_JUMP_COND, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src_type, src_reg, 0); + + if (src_type == GINSN_SRC_SYMBOL) + ginsn->src[0].sym = src_ginsn_sym; + + return ginsn; +} + +ginsnS * +ginsn_new_mov (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, offsetT src_disp, + enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_MOV, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src_type, src_reg, src_disp); + /* dst info. */ + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp); + + return ginsn; +} + +ginsnS * +ginsn_new_store (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, + enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_STORE, sym, real_p); + /* src info. 
*/ + ginsn_set_src (&ginsn->src[0], src_type, src_reg, 0); + /* dst info. */ + gas_assert (dst_type == GINSN_DST_INDIRECT); + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp); + + return ginsn; +} + +ginsnS * +ginsn_new_load (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, offsetT src_disp, + enum ginsn_dst_type dst_type, unsigned int dst_reg) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_LOAD, sym, real_p); + /* src info. */ + gas_assert (src_type == GINSN_SRC_INDIRECT); + ginsn_set_src (&ginsn->src[0], src_type, src_reg, src_disp); + /* dst info. */ + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, 0); + + return ginsn; +} + +ginsnS * +ginsn_new_sub (const symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp, + enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp, + enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_SUB, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src1_type, src1_reg, src1_disp); + ginsn_set_src (&ginsn->src[1], src2_type, src2_reg, src2_disp); + /* dst info. */ + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp); + + return ginsn; +} + +/* PS: Note this API does not identify the displacement values of + src1/src2/dst. At this time, it is unnecessary for correctness to support + the additional argument. */ + +ginsnS * +ginsn_new_other (const symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, unsigned int src1_val, + enum ginsn_src_type src2_type, unsigned int src2_val, + enum ginsn_dst_type dst_type, unsigned int dst_reg) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_OTHER, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src1_type, src1_val, src1_val); + /* GINSN_SRC_INDIRECT src2_type is not expected. 
*/ + gas_assert (src2_type != GINSN_SRC_INDIRECT); + ginsn_set_src (&ginsn->src[1], src2_type, src2_val, src2_val); + /* dst info. */ + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, 0); + + return ginsn; +} + +ginsnS * +ginsn_new_return (const symbolS *sym, bool real_p) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_RETURN, sym, real_p); + return ginsn; +} + +void +ginsn_set_where (ginsnS *ginsn) +{ + const char *file; + unsigned int line; + file = as_where (&line); + ginsn_set_file_line (ginsn, file, line); +} + +int +ginsn_link_next (ginsnS *ginsn, ginsnS *next) +{ + int ret = 0; + + /* Avoid data corruption by limiting the scope of the API. */ + if (!ginsn || ginsn->next) + return 1; + + ginsn->next = next; + + return ret; +} + +bool +ginsn_track_reg_p (unsigned int dw2reg, enum ginsn_gen_mode gmode) +{ + bool track_p = false; + + if (gmode == GINSN_GEN_SCFI && dw2reg <= SCFI_MAX_REG_ID) + { + /* FIXME - rename this to tc_ ? */ + track_p |= SCFI_CALLEE_SAVED_REG_P (dw2reg); + track_p |= (dw2reg == REG_FP); + track_p |= (dw2reg == REG_SP); + } + + return track_p; +} + +static bool +ginsn_indirect_jump_p (ginsnS *ginsn) +{ + bool ret_p = false; + if (!ginsn) + return ret_p; + + ret_p = (ginsn->type == GINSN_TYPE_JUMP + && ginsn->src[0].type == GINSN_SRC_REG); + return ret_p; +} + +static bool +ginsn_direct_local_jump_p (ginsnS *ginsn) +{ + bool ret_p = false; + if (!ginsn) + return ret_p; + + ret_p |= (ginsn->type == GINSN_TYPE_JUMP + && ginsn->src[0].type == GINSN_SRC_SYMBOL); + return ret_p; +} + +static char * +ginsn_src_print (struct ginsn_src *src) +{ + size_t len = 40; + char *src_str = XNEWVEC (char, len); + + memset (src_str, 0, len); + + switch (src->type) + { + case GINSN_SRC_REG: + snprintf (src_str, len, "%%r%d, ", ginsn_get_src_reg (src)); + break; + case GINSN_SRC_IMM: + snprintf (src_str, len, "%lld, ", + (long long int) ginsn_get_src_imm (src)); + break; + case GINSN_SRC_INDIRECT: + snprintf (src_str, len, "[%%r%d+%lld], ", ginsn_get_src_reg (src), + 
(long long int) ginsn_get_src_disp (src)); + break; + default: + break; + } + + return src_str; +} + +static char* +ginsn_dst_print (struct ginsn_dst *dst) +{ + size_t len = GINSN_LISTING_OPND_LEN; + char *dst_str = XNEWVEC (char, len); + + memset (dst_str, 0, len); + + if (dst->type == GINSN_DST_REG) + { + char *buf = XNEWVEC (char, 32); + sprintf (buf, "%%r%d", ginsn_get_dst_reg (dst)); + strcat (dst_str, buf); + } + else if (dst->type == GINSN_DST_INDIRECT) + { + char *buf = XNEWVEC (char, 32); + sprintf (buf, "[%%r%d+%lld]", ginsn_get_dst_reg (dst), + (long long int) ginsn_get_dst_disp (dst)); + strcat (dst_str, buf); + } + + gas_assert (strlen (dst_str) < GINSN_LISTING_OPND_LEN); + + return dst_str; +} + +static const char* +ginsn_type_func_marker_print (ginsnS *ginsn) +{ + int id = 0; + static const char * const ginsn_sym_strs[] = + { "", "FUNC_BEGIN", "FUNC_END" }; + + if (GINSN_F_FUNC_BEGIN_P (ginsn)) + id = 1; + else if (GINSN_F_FUNC_END_P (ginsn)) + id = 2; + + return ginsn_sym_strs[id]; +} + +static char* +ginsn_print (ginsnS *ginsn) +{ + struct ginsn_src *src; + struct ginsn_dst *dst; + int str_size = 0; + size_t len = GINSN_LISTING_LEN; + char *ginsn_str = XNEWVEC (char, len); + + memset (ginsn_str, 0, len); + + str_size = snprintf (ginsn_str, GINSN_LISTING_LEN, "ginsn: %s", + ginsn_type_names[ginsn->type]); + gas_assert (str_size >= 0 && str_size < GINSN_LISTING_LEN); + + /* For some ginsn types, no further information is printed for now. */ + if (ginsn->type == GINSN_TYPE_CALL + || ginsn->type == GINSN_TYPE_RETURN) + goto end; + else if (ginsn->type == GINSN_TYPE_SYMBOL) + { + if (GINSN_F_USER_LABEL_P (ginsn)) + str_size += snprintf (ginsn_str + str_size, + GINSN_LISTING_LEN - str_size, + " %s", S_GET_NAME (ginsn->sym)); + else + str_size += snprintf (ginsn_str + str_size, + GINSN_LISTING_LEN - str_size, + " %s", ginsn_type_func_marker_print (ginsn)); + goto end; + } + + /* src 1. 
*/ + src = ginsn_get_src1 (ginsn); + str_size += snprintf (ginsn_str + str_size, GINSN_LISTING_LEN - str_size, + " %s", ginsn_src_print (src)); + gas_assert (str_size >= 0 && str_size < GINSN_LISTING_LEN); + + /* src 2. */ + src = ginsn_get_src2 (ginsn); + str_size += snprintf (ginsn_str + str_size, GINSN_LISTING_LEN - str_size, + "%s", ginsn_src_print (src)); + gas_assert (str_size >= 0 && str_size < GINSN_LISTING_LEN); + + /* dst. */ + dst = ginsn_get_dst (ginsn); + str_size += snprintf (ginsn_str + str_size, GINSN_LISTING_LEN - str_size, + "%s", ginsn_dst_print (dst)); + +end: + gas_assert (str_size >= 0 && str_size < GINSN_LISTING_LEN); + return ginsn_str; +} + +static void +gbb_cleanup (gbbS **bbp) +{ + gbbS *bb = NULL; + + if (!bbp || !*bbp) + return; + + bb = *bbp; + + if (bb->entry_state) + { + free (bb->entry_state); + bb->entry_state = NULL; + } + if (bb->exit_state) + { + free (bb->exit_state); + bb->exit_state = NULL; + } + free (bb); + *bbp = NULL; +} + +static void +bb_add_edge (gbbS* from_bb, gbbS *to_bb) +{ + gedgeS *tmpedge = NULL; + gedgeS *gedge; + bool exists = false; + + if (!from_bb || !to_bb) + return; + + /* Create a new edge object. */ + gedge = XCNEW (gedgeS); + gedge->dst_bb = to_bb; + gedge->next = NULL; + gedge->visited = false; + + /* Add it in. */ + if (from_bb->out_gedges == NULL) + { + from_bb->out_gedges = gedge; + from_bb->num_out_gedges++; + } + else + { + /* Get the tail of the list. */ + tmpedge = from_bb->out_gedges; + while (tmpedge) + { + /* Do not add duplicate edges. Duplicated edges will cause unwanted + failures in the forward and backward passes for SCFI. 
*/ + if (tmpedge->dst_bb == to_bb) + { + exists = true; + break; + } + if (tmpedge->next) + tmpedge = tmpedge->next; + else + break; + } + + if (!exists) + { + tmpedge->next = gedge; + from_bb->num_out_gedges++; + } + else + free (gedge); + } +} + +static void +cfg_add_bb (gcfgS *gcfg, gbbS *gbb) +{ + gbbS *last_bb = NULL; + + if (!gcfg->root_bb) + gcfg->root_bb = gbb; + else + { + last_bb = gcfg->root_bb; + while (last_bb->next) + last_bb = last_bb->next; + + last_bb->next = gbb; + } + gcfg->num_gbbs++; + + gbb->id = gcfg->num_gbbs; +} + +static gbbS * +add_bb_at_ginsn (const symbolS *func, gcfgS *gcfg, ginsnS *ginsn, gbbS *prev_bb, + int *errp); + +static gbbS * +find_bb (gcfgS *gcfg, ginsnS *ginsn) +{ + gbbS *found_bb = NULL; + gbbS *gbb = NULL; + + if (!ginsn) + return found_bb; + + if (ginsn->visited) + { + cfg_for_each_bb (gcfg, gbb) + { + if (gbb->first_ginsn == ginsn) + { + found_bb = gbb; + break; + } + } + /* Must be found if ginsn is visited. */ + gas_assert (found_bb); + } + + return found_bb; +} + +static gbbS * +find_or_make_bb (const symbolS *func, gcfgS *gcfg, ginsnS *ginsn, gbbS *prev_bb, + int *errp) +{ + gbbS *found_bb = NULL; + + found_bb = find_bb (gcfg, ginsn); + if (found_bb) + return found_bb; + + return add_bb_at_ginsn (func, gcfg, ginsn, prev_bb, errp); +} + +/* Add the basic block starting at GINSN to the given GCFG. + Also adds an edge from the PREV_BB to the newly added basic block. + + This is a recursive function which returns the root of the added + basic blocks. */ + +static gbbS * +add_bb_at_ginsn (const symbolS *func, gcfgS *gcfg, ginsnS *ginsn, gbbS *prev_bb, + int *errp) +{ + gbbS *current_bb = NULL; + ginsnS *target_ginsn = NULL; + const symbolS *taken_label; + + while (ginsn) + { + /* Skip these as they may be right after a GINSN_TYPE_RETURN. + For GINSN_TYPE_RETURN, we have already considered that as + end of bb, and a logical exit from function. 
*/ + if (GINSN_F_FUNC_END_P (ginsn)) + { + ginsn = ginsn->next; + continue; + } + + if (ginsn->visited) + { + /* If the ginsn has been visited earlier, the bb must exist by now + in the cfg. */ + prev_bb = current_bb; + current_bb = find_bb (gcfg, ginsn); + gas_assert (current_bb); + /* Add edge from the prev_bb. */ + if (prev_bb) + bb_add_edge (prev_bb, current_bb); + break; + } + else if (current_bb && GINSN_F_USER_LABEL_P (ginsn)) + { + /* Create new bb starting at this label ginsn. */ + prev_bb = current_bb; + find_or_make_bb (func, gcfg, ginsn, prev_bb, errp); + break; + } + + if (current_bb == NULL) + { + /* Create a new bb. */ + current_bb = XCNEW (gbbS); + cfg_add_bb (gcfg, current_bb); + /* Add edge for the Not Taken, or Fall-through path. */ + if (prev_bb) + bb_add_edge (prev_bb, current_bb); + } + + if (current_bb->first_ginsn == NULL) + current_bb->first_ginsn = ginsn; + + ginsn->visited = true; + current_bb->num_ginsns++; + current_bb->last_ginsn = ginsn; + + /* Note that BB is _not_ split on ginsn of type GINSN_TYPE_CALL. */ + if (ginsn->type == GINSN_TYPE_JUMP + || ginsn->type == GINSN_TYPE_JUMP_COND + || ginsn->type == GINSN_TYPE_RETURN) + { + /* Indirect Jumps or direct jumps to symbols non-local to the + function must not be seen here. The caller must have already + checked for that. */ + gas_assert (!ginsn_indirect_jump_p (ginsn)); + if (ginsn->type == GINSN_TYPE_JUMP) + gas_assert (ginsn_direct_local_jump_p (ginsn)); + + /* Direct Jumps. May include conditional or unconditional change of + flow. What is important for CFG creation is that the target be + local to function. */ + if (ginsn->type == GINSN_TYPE_JUMP_COND + || ginsn_direct_local_jump_p (ginsn)) + { + gas_assert (ginsn->src[0].type == GINSN_SRC_SYMBOL); + taken_label = ginsn->src[0].sym; + gas_assert (taken_label); + + /* Preserve the prev_bb to be the dominator bb as we are + going to follow the taken path of the conditional branch + soon. 
*/ + prev_bb = current_bb; + + /* Follow the target on the taken path. */ + target_ginsn = label_ginsn_map_find (taken_label); + /* Add the bb for the target of the taken branch. */ + if (target_ginsn) + find_or_make_bb (func, gcfg, target_ginsn, prev_bb, errp); + else + { + *errp = GCFG_JLABEL_NOT_PRESENT; + as_warn_where (ginsn->file, ginsn->line, + _("missing label '%s' in func '%s' may result in imprecise cfg"), + S_GET_NAME (taken_label), S_GET_NAME (func)); + } + /* Add the bb for the fall through path. */ + find_or_make_bb (func, gcfg, ginsn->next, prev_bb, errp); + } + else if (ginsn->type == GINSN_TYPE_RETURN) + { + /* We'll come back to the ginsns following GINSN_TYPE_RETURN + from another path if they are indeed reachable code. */ + break; + } + + /* Current BB has been processed. */ + current_bb = NULL; + } + ginsn = ginsn->next; + } + + return current_bb; +} + +static int +gbbs_compare (const void *v1, const void *v2) +{ + const gbbS *bb1 = *(const gbbS **) v1; + const gbbS *bb2 = *(const gbbS **) v2; + + if (bb1->first_ginsn->id < bb2->first_ginsn->id) + return -1; + else if (bb1->first_ginsn->id > bb2->first_ginsn->id) + return 1; + else if (bb1->first_ginsn->id == bb2->first_ginsn->id) + return 0; + + return 0; +} + +/* Synthesize DWARF CFI and emit it. */ + +static int +ginsn_pass_execute_scfi (const symbolS *func, gcfgS *gcfg, gbbS *root_bb) +{ + int err = scfi_synthesize_dw2cfi (func, gcfg, root_bb); + if (!err) + scfi_emit_dw2cfi (func); + + return err; +} + +/* Traverse the list of ginsns for the function and warn if some + ginsns are not visited. + + FIXME - this code assumes the caller has already performed a pass over + ginsns such that the reachable ginsns are already marked. Revisit this - we + should ideally make this pass self-sufficient. 
*/ + +static int +ginsn_pass_warn_unreachable_code (const symbolS *func, + gcfgS *gcfg ATTRIBUTE_UNUSED, + ginsnS *root_ginsn) +{ + ginsnS *ginsn; + bool unreach_p = false; + + if (!gcfg || !func || !root_ginsn) + return 0; + + ginsn = root_ginsn; + + while (ginsn) + { + /* Some ginsns of type GINSN_TYPE_SYMBOL remain unvisited. Some + may even be excluded from the CFG as they are not reachable, given + their function, e.g., user labels after return machine insn. */ + if (!ginsn->visited + && !GINSN_F_FUNC_END_P (ginsn) + && !GINSN_F_USER_LABEL_P (ginsn)) + { + unreach_p = true; + break; + } + ginsn = ginsn->next; + } + + if (unreach_p) + as_warn_where (ginsn->file, ginsn->line, + _("GINSN: found unreachable code in func '%s'"), + S_GET_NAME (func)); + + return unreach_p; +} + +void +gcfg_get_bbs_in_prog_order (gcfgS *gcfg, gbbS **prog_order_bbs) +{ + uint64_t i = 0; + gbbS *gbb; + + if (!prog_order_bbs) + return; + + cfg_for_each_bb (gcfg, gbb) + { + gas_assert (i < gcfg->num_gbbs); + prog_order_bbs[i++] = gbb; + } + + qsort (prog_order_bbs, gcfg->num_gbbs, sizeof (gbbS *), gbbs_compare); +} + +/* Build the control flow graph for the ginsns of the function. + + It is important that the target adds an appropriate ginsn: + - GINSN_TYPE_JUMP, + - GINSN_TYPE_JUMP_COND, + - GINSN_TYPE_CALL, + - GINSN_TYPE_RETURN + at the associated points in the function. The correctness of the CFG + depends on the accuracy of these 'change of flow instructions'. */ + +gcfgS * +gcfg_build (const symbolS *func, int *errp) +{ + gcfgS *gcfg; + ginsnS *first_ginsn; + + gcfg = XCNEW (gcfgS); + first_ginsn = frchain_now->frch_ginsn_data->gins_rootP; + add_bb_at_ginsn (func, gcfg, first_ginsn, NULL /* prev_bb. 
*/, errp); + + return gcfg; +} + +void +gcfg_cleanup (gcfgS **gcfgp) +{ + gcfgS *cfg; + gbbS *bb, *next_bb; + gedgeS *edge, *next_edge; + + if (!gcfgp || !*gcfgp) + return; + + cfg = *gcfgp; + bb = gcfg_get_rootbb (cfg); + + while (bb) + { + next_bb = bb->next; + + /* Cleanup all the edges. */ + edge = bb->out_gedges; + while (edge) + { + next_edge = edge->next; + free (edge); + edge = next_edge; + } + + gbb_cleanup (&bb); + bb = next_bb; + } + + free (cfg); + *gcfgp = NULL; +} + +gbbS * +gcfg_get_rootbb (gcfgS *gcfg) +{ + gbbS *rootbb = NULL; + + if (!gcfg || !gcfg->num_gbbs) + return NULL; + + rootbb = gcfg->root_bb; + + return rootbb; +} + +void +gcfg_print (const gcfgS *gcfg, FILE *outfile) +{ + gbbS *gbb = NULL; + gedgeS *gedge = NULL; + uint64_t total_ginsns = 0; + + cfg_for_each_bb(gcfg, gbb) + { + fprintf (outfile, "BB [%" PRIu64 "] with num insns: %" PRIu64, + gbb->id, gbb->num_ginsns); + fprintf (outfile, " [insns: %u to %u]\n", + gbb->first_ginsn->line, gbb->last_ginsn->line); + total_ginsns += gbb->num_ginsns; + bb_for_each_edge(gbb, gedge) + fprintf (outfile, " outgoing edge to %" PRIu64 "\n", + gedge->dst_bb->id); + } + fprintf (outfile, "\nTotal ginsns in all GBBs = %" PRIu64 "\n", + total_ginsns); +} + +void +frch_ginsn_data_init (const symbolS *func, symbolS *start_addr, + enum ginsn_gen_mode gmode) +{ + /* FIXME - error out if prev object is not free'd ? */ + frchain_now->frch_ginsn_data = XCNEW (struct frch_ginsn_data); + + frchain_now->frch_ginsn_data->mode = gmode; + /* Annotate with the current function symbol. */ + frchain_now->frch_ginsn_data->func = func; + /* Create a new start address symbol now. */ + frchain_now->frch_ginsn_data->start_addr = start_addr; + /* Assume the set of ginsn are apt for CFG creation, by default. 
*/ + frchain_now->frch_ginsn_data->gcfg_apt_p = true; + + frchain_now->frch_ginsn_data->label_ginsn_map = str_htab_create (); +} + +void +frch_ginsn_data_cleanup (void) +{ + ginsnS *ginsn = NULL; + ginsnS *next_ginsn = NULL; + + ginsn = frchain_now->frch_ginsn_data->gins_rootP; + while (ginsn) + { + next_ginsn = ginsn->next; + ginsn_cleanup (&ginsn); + ginsn = next_ginsn; + } + + if (frchain_now->frch_ginsn_data->label_ginsn_map) + htab_delete (frchain_now->frch_ginsn_data->label_ginsn_map); + + free (frchain_now->frch_ginsn_data); + frchain_now->frch_ginsn_data = NULL; +} + +/* Append GINSN to the list of ginsns for the current function being + assembled. */ + +int +frch_ginsn_data_append (ginsnS *ginsn) +{ + ginsnS *last = NULL; + ginsnS *temp = NULL; + uint64_t id = 0; + + if (!ginsn) + return 1; + + if (frchain_now->frch_ginsn_data->gins_lastP) + id = frchain_now->frch_ginsn_data->gins_lastP->id; + + /* Do the necessary preprocessing on the set of input GINSNs: + - Update each ginsn with its ID. + While you iterate, also keep gcfg_apt_p updated by checking whether any + ginsn is inappropriate for GCFG creation. */ + temp = ginsn; + while (temp) + { + temp->id = ++id; + + if (ginsn_indirect_jump_p (temp) + || (temp->type == GINSN_TYPE_JUMP + && !ginsn_direct_local_jump_p (temp))) + frchain_now->frch_ginsn_data->gcfg_apt_p = false; + + if (listing & LISTING_GINSN_SCFI) + listing_newline (ginsn_print (temp)); + + /* The input GINSN may be a linked list of multiple ginsns chained + together. Find the last ginsn in the input chain of ginsns. */ + last = temp; + + temp = temp->next; + } + + /* Link in the ginsn to the tail. 
*/ + if (!frchain_now->frch_ginsn_data->gins_rootP) + frchain_now->frch_ginsn_data->gins_rootP = ginsn; + else + ginsn_link_next (frchain_now->frch_ginsn_data->gins_lastP, ginsn); + + frchain_now->frch_ginsn_data->gins_lastP = last; + + return 0; +} + +enum ginsn_gen_mode +frch_ginsn_gen_mode (void) +{ + enum ginsn_gen_mode gmode = GINSN_GEN_NONE; + + if (frchain_now->frch_ginsn_data) + gmode = frchain_now->frch_ginsn_data->mode; + + return gmode; +} + +int +ginsn_data_begin (const symbolS *func) +{ + ginsnS *ginsn; + + /* The previous block of asm must have been processed by now. */ + if (frchain_now->frch_ginsn_data) + as_bad (_("GINSN process for prev func not done")); + + /* FIXME - hard code the mode to GINSN_GEN_SCFI. + This can be changed later when other passes on ginsns are formalised. */ + frch_ginsn_data_init (func, symbol_temp_new_now (), GINSN_GEN_SCFI); + + /* Create and insert ginsn with function begin marker. */ + ginsn = ginsn_new_symbol_func_begin (func); + frch_ginsn_data_append (ginsn); + + return 0; +} + +int +ginsn_data_end (const symbolS *label) +{ + ginsnS *ginsn; + gbbS *root_bb; + gcfgS *gcfg = NULL; + const symbolS *func; + int err = 0; + + if (!frchain_now->frch_ginsn_data) + return err; + + /* Insert Function end marker. */ + ginsn = ginsn_new_symbol_func_end (label); + frch_ginsn_data_append (ginsn); + + func = frchain_now->frch_ginsn_data->func; + + /* Build the cfg of ginsn(s) of the function. */ + if (!frchain_now->frch_ginsn_data->gcfg_apt_p) + { + as_warn (_("Untraceable control flow for func '%s'; Skipping SCFI"), + S_GET_NAME (func)); + goto end; + } + + gcfg = gcfg_build (func, &err); + + root_bb = gcfg_get_rootbb (gcfg); + if (!root_bb) + { + as_bad (_("Bad cfg of ginsn of func '%s'"), S_GET_NAME (func)); + goto end; + } + + /* Execute the desired passes on ginsns. */ + err = ginsn_pass_execute_scfi (func, gcfg, root_bb); + if (err) + goto end; + + /* Other passes, e.g., warn for unreachable code can be enabled too. 
*/ + ginsn = frchain_now->frch_ginsn_data->gins_rootP; + err = ginsn_pass_warn_unreachable_code (func, gcfg, ginsn); + +end: + if (gcfg) + gcfg_cleanup (&gcfg); + frch_ginsn_data_cleanup (); + + return err; +} + +/* Add GINSN_TYPE_SYMBOL type ginsn for user-defined labels. These may be + branch targets, and hence are necessary for control flow graph. */ + +void +ginsn_frob_label (const symbolS *label) +{ + ginsnS *label_ginsn; + const char *file; + unsigned int line; + + if (frchain_now->frch_ginsn_data) + { + /* PS: Note how we keep the actual LABEL symbol as ginsn->sym. + Take care to avoid inadvertent updates or cleanups of symbols. */ + label_ginsn = ginsn_new_symbol_user_label (label); + /* Keep the location updated. */ + file = as_where (&line); + ginsn_set_file_line (label_ginsn, file, line); + + frch_ginsn_data_append (label_ginsn); + + label_ginsn_map_insert (label, label_ginsn); + } +} + +const symbolS * +ginsn_data_func_symbol (void) +{ + const symbolS *func = NULL; + + if (frchain_now->frch_ginsn_data) + func = frchain_now->frch_ginsn_data->func; + + return func; +} + +#else + +int +ginsn_data_begin (const symbolS *func ATTRIBUTE_UNUSED) +{ + as_bad (_("ginsn unsupported for target")); + return 1; +} + +int +ginsn_data_end (const symbolS *label ATTRIBUTE_UNUSED) +{ + as_bad (_("ginsn unsupported for target")); + return 1; +} + +void +ginsn_frob_label (const symbolS *sym ATTRIBUTE_UNUSED) +{ + return; +} + +const symbolS * +ginsn_data_func_symbol (void) +{ + return NULL; +} + +#endif /* TARGET_USE_GINSN. */ diff --git a/gas/ginsn.h b/gas/ginsn.h new file mode 100644 index 0000000..2513e3c --- /dev/null +++ b/gas/ginsn.h @@ -0,0 +1,384 @@ +/* ginsn.h - GAS instruction representation. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GAS, the GNU Assembler. 
+ + GAS is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GAS is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GAS; see the file COPYING. If not, write to the Free + Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA + 02110-1301, USA. */ + +#ifndef GINSN_H +#define GINSN_H + +#include "as.h" + +/* Maximum number of source operands of a ginsn. */ +#define GINSN_NUM_SRC_OPNDS 2 + +/* A ginsn is printed in the following format: + "ginsn: OPCD SRC1, SRC2, DST" + "<-5-> <--------125------->" + where each of SRC1, SRC2, and DST are in the form: + "%rNN," (up to 5 chars) + "imm," (up to int32_t+1 chars) + "[%rNN+-imm]," (up to int32_t+9 chars) + Hence a max of 19 chars. */ + +#define GINSN_LISTING_OPND_LEN 40 +#define GINSN_LISTING_LEN 156 + +enum ginsn_gen_mode +{ + GINSN_GEN_NONE, + /* Generate ginsns for program validation passes. */ + GINSN_GEN_FVAL, + /* Generate ginsns for synthesizing DWARF CFI. */ + GINSN_GEN_SCFI, +}; + +/* ginsn types. + + GINSN_TYPE_PHANTOM are phantom ginsns. They are used where there is no real + machine instruction counterpart, but a ginsn is needed only to carry + information to GAS. For example, to carry an SCFI Op. + + Note that ginsns do not have push / pop instructions. + Instead, the following are used: + type=GINSN_TYPE_LOAD, src=GINSN_SRC_INDIRECT, REG_SP: Load from stack. + type=GINSN_TYPE_STORE, dst=GINSN_DST_INDIRECT, REG_SP: Store to stack. 
+*/ + +#define _GINSN_TYPES \ + _GINSN_TYPE_ITEM (GINSN_TYPE_SYMBOL, "SYM") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_PHANTOM, "PHANTOM") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_ADD, "ADD") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_AND, "AND") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_CALL, "CALL") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_JUMP, "JMP") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_JUMP_COND, "JCC") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_MOV, "MOV") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_LOAD, "LOAD") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_STORE, "STORE") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_RETURN, "RET") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_SUB, "SUB") \ + _GINSN_TYPE_ITEM (GINSN_TYPE_OTHER, "OTH") + +enum ginsn_type +{ +#define _GINSN_TYPE_ITEM(NAME, STR) NAME, + _GINSN_TYPES +#undef _GINSN_TYPE_ITEM +}; + +enum ginsn_src_type +{ + GINSN_SRC_UNKNOWN, + GINSN_SRC_REG, + GINSN_SRC_IMM, + GINSN_SRC_INDIRECT, + GINSN_SRC_SYMBOL, +}; + +/* GAS instruction source operand representation. */ + +struct ginsn_src +{ + enum ginsn_src_type type; + /* DWARF register number. */ + unsigned int reg; + /* Immediate or disp for indirect memory access. */ + offsetT immdisp; + /* Src symbol. May be needed for some control flow instructions. */ + const symbolS *sym; +}; + +enum ginsn_dst_type +{ + GINSN_DST_UNKNOWN, + GINSN_DST_REG, + GINSN_DST_INDIRECT, +}; + +/* GAS instruction destination operand representation. */ + +struct ginsn_dst +{ + enum ginsn_dst_type type; + /* DWARF register number. */ + unsigned int reg; + /* Disp for indirect memory access. */ + offsetT disp; +}; + +/* Various flags for additional information per GAS instruction. */ + +/* Function begin or end symbol. */ +#define GINSN_F_FUNC_MARKER 0x1 +/* Identify real or implicit GAS insn. + Some targets employ CISC-like instructions. Multiple ginsn's may be used + for a single machine instruction in some ISAs. For some optimizations, + there is need to identify whether a ginsn, e.g., GINSN_TYPE_ADD or + GINSN_TYPE_SUB is a result of a user-specified instruction or not. 
*/ +#define GINSN_F_INSN_REAL 0x2 +/* Identify if the GAS insn of type GINSN_TYPE_SYMBOL is due to a user-defined + label. Each user-defined label in a function will cause addition of a new + ginsn. This simplifies control flow graph creation. + See htab_t label_ginsn_map usage. */ +#define GINSN_F_USER_LABEL 0x4 +/* Max bit position for flags (uint32_t). */ +#define GINSN_F_MAX 0x20 + +#define GINSN_F_FUNC_BEGIN_P(ginsn) \ + ((ginsn != NULL) \ + && (ginsn->type == GINSN_TYPE_SYMBOL) \ + && (ginsn->flags & GINSN_F_FUNC_MARKER)) + +/* PS: For ginsn associated with a user-defined symbol location, + GINSN_F_FUNC_MARKER is unset, but GINSN_F_USER_LABEL is set. */ +#define GINSN_F_FUNC_END_P(ginsn) \ + ((ginsn != NULL) \ + && (ginsn->type == GINSN_TYPE_SYMBOL) \ + && !(ginsn->flags & GINSN_F_FUNC_MARKER) \ + && !(ginsn->flags & GINSN_F_USER_LABEL)) + +#define GINSN_F_INSN_REAL_P(ginsn) \ + ((ginsn != NULL) \ + && (ginsn->flags & GINSN_F_INSN_REAL)) + +#define GINSN_F_USER_LABEL_P(ginsn) \ + ((ginsn != NULL) \ + && (ginsn->type == GINSN_TYPE_SYMBOL) \ + && !(ginsn->flags & GINSN_F_FUNC_MARKER) \ + && (ginsn->flags & GINSN_F_USER_LABEL)) + +typedef struct ginsn ginsnS; +typedef struct scfi_op scfi_opS; +typedef struct scfi_state scfi_stateS; + +/* GAS generic instruction. + + Generic instructions are used by GAS to abstract out the binary machine + instructions. In other words, ginsn is a target/ABI independent internal + representation for GAS. Note that, depending on the target, there may be + more than one ginsn per binary machine instruction. + + ginsns can be used by GAS to perform validations, or even generate + additional information like, synthesizing DWARF CFI for hand-written asm. */ + +struct ginsn +{ + enum ginsn_type type; + /* GAS instructions are simple instructions with GINSN_NUM_SRC_OPNDS number + of source operands and one destination operand at this time. 
*/ + struct ginsn_src src[GINSN_NUM_SRC_OPNDS]; + struct ginsn_dst dst; + /* Additional information per instruction. */ + uint32_t flags; + /* Symbol. For ginsn of type other than GINSN_TYPE_SYMBOL, this identifies + the end of the corresponding machine instruction in the .text segment. + These symbols are created anew by the targets and are not used elsewhere + in GAS. The only exception is some ginsns of type GINSN_TYPE_SYMBOL, when + generated for the user-defined labels. See ginsn_frob_label. */ + const symbolS *sym; + /* Identifier (linearly increasing natural number) for each ginsn. Used as + a proxy for program order of ginsns. */ + uint64_t id; + /* Location information for user-interfacing messaging. Only ginsns with + GINSN_F_FUNC_BEGIN_P and GINSN_F_FUNC_END_P may present themselves with no + file or line information. */ + const char *file; + unsigned int line; + + /* Information needed for synthesizing CFI. */ + scfi_opS **scfi_ops; + uint32_t num_scfi_ops; + + /* Flag to keep track of visited instructions for CFG creation. */ + bool visited; + + ginsnS *next; /* A linked list. */ +}; + +struct ginsn_src *ginsn_get_src1 (ginsnS *ginsn); +struct ginsn_src *ginsn_get_src2 (ginsnS *ginsn); +struct ginsn_dst *ginsn_get_dst (ginsnS *ginsn); + +unsigned int ginsn_get_src_reg (struct ginsn_src *src); +enum ginsn_src_type ginsn_get_src_type (struct ginsn_src *src); +offsetT ginsn_get_src_disp (struct ginsn_src *src); +offsetT ginsn_get_src_imm (struct ginsn_src *src); + +unsigned int ginsn_get_dst_reg (struct ginsn_dst *dst); +enum ginsn_dst_type ginsn_get_dst_type (struct ginsn_dst *dst); +offsetT ginsn_get_dst_disp (struct ginsn_dst *dst); + +/* Data object for book-keeping information related to GAS generic + instructions. */ +struct frch_ginsn_data +{ + /* Mode for GINSN creation. */ + enum ginsn_gen_mode mode; + /* Head of the list of ginsns. */ + ginsnS *gins_rootP; + /* Tail of the list of ginsns. */ + ginsnS *gins_lastP; + /* Function symbol. 
*/ + const symbolS *func; + /* Start address of the function. */ + symbolS *start_addr; + /* User-defined label to ginsn mapping. */ + htab_t label_ginsn_map; + /* Is the list of ginsn apt for creating CFG. */ + bool gcfg_apt_p; +}; + +int ginsn_data_begin (const symbolS *func); +int ginsn_data_end (const symbolS *label); +const symbolS *ginsn_data_func_symbol (void); +void ginsn_frob_label (const symbolS *sym); + +void frch_ginsn_data_init (const symbolS *func, symbolS *start_addr, + enum ginsn_gen_mode gmode); +void frch_ginsn_data_cleanup (void); +int frch_ginsn_data_append (ginsnS *ginsn); +enum ginsn_gen_mode frch_ginsn_gen_mode (void); + +void label_ginsn_map_insert (const symbolS *label, ginsnS *ginsn); +ginsnS *label_ginsn_map_find (const symbolS *label); + +ginsnS *ginsn_new_symbol_func_begin (const symbolS *sym); +ginsnS *ginsn_new_symbol_func_end (const symbolS *sym); +ginsnS *ginsn_new_symbol_user_label (const symbolS *sym); + +ginsnS *ginsn_new_phantom (const symbolS *sym); +ginsnS *ginsn_new_symbol (const symbolS *sym, bool real_p); +ginsnS *ginsn_new_add (const symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp, + enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp, + enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp); +ginsnS *ginsn_new_and (const symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp, + enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp, + enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp); +ginsnS *ginsn_new_call (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, + const symbolS *src_text_sym); +ginsnS *ginsn_new_jump (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, + const symbolS *src_ginsn_sym); +ginsnS *ginsn_new_jump_cond (const symbolS *sym, bool real_p, + enum ginsn_src_type 
src_type, unsigned int src_reg, + const symbolS *src_ginsn_sym); +ginsnS *ginsn_new_mov (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, offsetT src_disp, + enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp); +ginsnS *ginsn_new_store (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, + enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp); +ginsnS *ginsn_new_load (const symbolS *sym, bool real_p, + enum ginsn_src_type src_type, unsigned int src_reg, offsetT src_disp, + enum ginsn_dst_type dst_type, unsigned int dst_reg); +ginsnS *ginsn_new_sub (const symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp, + enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp, + enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp); +ginsnS *ginsn_new_other (const symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, unsigned int src1_val, + enum ginsn_src_type src2_type, unsigned int src2_val, + enum ginsn_dst_type dst_type, unsigned int dst_reg); +ginsnS *ginsn_new_return (const symbolS *sym, bool real_p); + +void ginsn_set_where (ginsnS *ginsn); + +bool ginsn_track_reg_p (unsigned int dw2reg, enum ginsn_gen_mode); + +int ginsn_link_next (ginsnS *ginsn, ginsnS *next); + +enum gcfg_err_code +{ + GCFG_OK = 0, + GCFG_JLABEL_NOT_PRESENT = 1, /* Warning-level code. */ +}; + +typedef struct gbb gbbS; +typedef struct gedge gedgeS; + +/* GBB - Basic block of generic GAS instructions. */ + +struct gbb +{ + ginsnS *first_ginsn; + ginsnS *last_ginsn; + uint64_t num_ginsns; + + /* Identifier (linearly increasing natural number) for each gbb. Added for + debugging purpose only. */ + uint64_t id; + + bool visited; + + uint32_t num_out_gedges; + gedgeS *out_gedges; + + /* Members for SCFI purposes. */ + /* SCFI state at the entry of basic block. 
*/ + scfi_stateS *entry_state; + /* SCFI state at the exit of basic block. */ + scfi_stateS *exit_state; + + /* A linked list. In order of addition. */ + gbbS *next; +}; + +struct gedge +{ + gbbS *dst_bb; + /* A linked list. In order of addition. */ + gedgeS *next; + bool visited; +}; + +/* Control flow graph of generic GAS instructions. */ + +struct gcfg +{ + uint64_t num_gbbs; + gbbS *root_bb; +}; + +typedef struct gcfg gcfgS; + +#define bb_for_each_insn(bb, ginsn) \ + for (ginsn = bb->first_ginsn; ginsn; \ + ginsn = (ginsn != bb->last_ginsn) ? ginsn->next : NULL) + +#define bb_for_each_edge(bb, edge) \ + for (edge = (edge == NULL) ? bb->out_gedges : edge; edge; edge = edge->next) + +#define cfg_for_each_bb(cfg, bb) \ + for (bb = cfg->root_bb; bb; bb = bb->next) + +#define bb_get_first_ginsn(bb) \ + (bb->first_ginsn) + +#define bb_get_last_ginsn(bb) \ + (bb->last_ginsn) + +gcfgS *gcfg_build (const symbolS *func, int *errp); +void gcfg_cleanup (gcfgS **gcfg); +void gcfg_print (const gcfgS *gcfg, FILE *outfile); +gbbS *gcfg_get_rootbb (gcfgS *gcfg); +void gcfg_get_bbs_in_prog_order (gcfgS *gcfg, gbbS **prog_order_bbs); + +#endif /* GINSN_H. */ diff --git a/gas/listing.h b/gas/listing.h index b0391bd..8a46351 100644 --- a/gas/listing.h +++ b/gas/listing.h @@ -29,6 +29,7 @@ #define LISTING_NOCOND 32 #define LISTING_MACEXP 64 #define LISTING_GENERAL 128 +#define LISTING_GINSN_SCFI 256 #define LISTING_DEFAULT (LISTING_LISTING | LISTING_HLL | LISTING_SYMBOLS) @@ -42,6 +42,7 @@ #include "codeview.h" #include "wchar.h" #include "filenames.h" +#include "ginsn.h" #include <limits.h> @@ -1384,6 +1385,9 @@ read_a_source_file (const char *name) } #endif + if (flag_synth_cfi) + ginsn_data_end (symbol_temp_new_now ()); + #ifdef md_cleanup md_cleanup (); #endif @@ -4198,6 +4202,12 @@ cons_worker (int nbytes, /* 1=.byte, 2=.word, 4=.long. */ if (flag_mri) mri_comment_end (stop, stopc); + + /* Disallow hand-crafting instructions using .byte. FIXME - what about + .word, .long etc ? 
*/ + if (flag_synth_cfi && frchain_now && frchain_now->frch_ginsn_data + && nbytes == 1) + as_bad (_("SCFI: hand-crafting instructions not supported")); } void diff --git a/gas/scfi.c b/gas/scfi.c new file mode 100644 index 0000000..37cc85c --- /dev/null +++ b/gas/scfi.c @@ -0,0 +1,1232 @@ +/* scfi.c - Support for synthesizing DWARF CFI for hand-written asm. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GAS, the GNU Assembler. + + GAS is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GAS is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GAS; see the file COPYING. If not, write to the Free + Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA + 02110-1301, USA. */ + +#include "as.h" +#include "scfi.h" +#include "subsegs.h" +#include "scfidw2gen.h" +#include "dw2gencfi.h" + +#if defined (TARGET_USE_SCFI) && defined (TARGET_USE_GINSN) + +/* Beyond the target defined number of registers to be tracked + (SCFI_MAX_REG_ID), keep the next register ID, in sequence, for REG_CFA. */ +#define REG_CFA (SCFI_MAX_REG_ID+1) +/* Define the total number of registers being tracked. + Used as index into an array of cfi_reglocS. Note that a ginsn may carry a + register number greater than MAX_NUM_SCFI_REGS, e.g., for the ginsns + corresponding to push fs/gs in AMD64. */ +#define MAX_NUM_SCFI_REGS (REG_CFA+1) + +#define REG_INVALID ((unsigned int)-1) + +enum cfi_reglocstate +{ + CFI_UNDEFINED, + CFI_IN_REG, + CFI_ON_STACK +}; + +/* Location at which CFI register is saved. 
+ + A CFI register (callee-saved registers, RA/LR) are always an offset from + the CFA. REG_CFA itself, however, may have REG_SP or REG_FP as base + register. Hence, keep the base reg ID and offset per tracked register. */ + +struct cfi_regloc +{ + /* Base reg ID (DWARF register number). */ + unsigned int base; + /* Location as offset from the CFA. */ + offsetT offset; + /* Current state of the CFI register. */ + enum cfi_reglocstate state; +}; + +typedef struct cfi_regloc cfi_reglocS; + +struct scfi_op_data +{ + const char *name; +}; + +typedef struct scfi_op_data scfi_op_dataS; + +/* SCFI operation. + + An SCFI operation represents a single atomic change to the SCFI state. + This can also be understood as an abstraction for what eventually gets + emitted as a DWARF CFI operation. */ + +struct scfi_op +{ + /* An SCFI op updates the state of either the CFA or other tracked + (callee-saved, REG_SP etc) registers. 'reg' is in the DWARF register + number space and must be strictly less than MAX_NUM_SCFI_REGS. */ + unsigned int reg; + /* Location of the reg. */ + cfi_reglocS loc; + /* DWARF CFI opcode. */ + uint32_t dw2cfi_op; + /* Some SCFI ops, e.g., for CFI_label, may need to carry additional data. */ + scfi_op_dataS *op_data; + /* A linked list. */ + struct scfi_op *next; +}; + +/* SCFI State - accumulated unwind information at a PC. + + SCFI state is the accumulated unwind information encompassing: + - REG_SP, REG_FP, + - RA, and + - all callee-saved registers. + + Note that SCFI_MAX_REG_ID is target/ABI dependent and is provided by the + backends. The backend must also identify the DWARF register numbers for + the REG_SP, and REG_FP registers. */ + +struct scfi_state +{ + cfi_reglocS regs[MAX_NUM_SCFI_REGS]; + cfi_reglocS scratch[MAX_NUM_SCFI_REGS]; + /* Current stack size. */ + offsetT stack_size; + /* Whether the stack size is known. 
+ Stack size may become untraceable depending on the specific stack + manipulation machine instruction, e.g., rsp = rsp op reg instruction + makes the stack size untraceable. */ + bool traceable_p; +}; + +/* Initialize a new SCFI op. */ + +static scfi_opS * +init_scfi_op (void) +{ + scfi_opS *op = XCNEW (scfi_opS); + + return op; +} + +/* Free the SCFI ops, given the HEAD of the list. */ + +void +scfi_ops_cleanup (scfi_opS **head) +{ + scfi_opS *op; + scfi_opS *next; + + if (!head || !*head) + return; + + op = *head; + next = op->next; + + while (op) + { + free (op); + op = next; + next = op ? op->next : NULL; + } +} + +/* Compare two SCFI states. */ + +static int +cmp_scfi_state (scfi_stateS *state1, scfi_stateS *state2) +{ + int ret; + + if (!state1 || !state2) + return 1; + + /* Skip comparing the scratch[] value of registers. The user visible + unwind information is derived from the regs[] from the SCFI state. */ + ret = memcmp (state1->regs, state2->regs, + sizeof (cfi_reglocS) * MAX_NUM_SCFI_REGS); + + /* For user functions which perform dynamic stack allocation, after switching + to REG_FP based CFA tracking, it is perfectly possible to have stack usage + in some control flows. However, double-checking that all control flows + have the same idea of CFA tracking before this won't hurt. */ + gas_assert (state1->regs[REG_CFA].base == state2->regs[REG_CFA].base); + if (state1->regs[REG_CFA].base == REG_SP) + ret |= state1->stack_size != state2->stack_size; + + ret |= state1->traceable_p != state2->traceable_p; + + return ret; +} + +#if 0 +static void +scfi_state_update_reg (scfi_stateS *state, uint32_t dst, uint32_t base, + int32_t offset) +{ + if (dst >= MAX_NUM_SCFI_REGS) + return; + + state->regs[dst].base = base; + state->regs[dst].offset = offset; +} +#endif + +/* Update the SCFI state of REG as available on execution stack at OFFSET + from REG_CFA (BASE). 
+ + Note that BASE must be REG_CFA, because any other base (REG_SP, REG_FP) + is by definition transitory in the function. */ + +static void +scfi_state_save_reg (scfi_stateS *state, unsigned int reg, unsigned int base, + offsetT offset) +{ + if (reg >= MAX_NUM_SCFI_REGS) + return; + + gas_assert (base == REG_CFA); + + state->regs[reg].base = base; + state->regs[reg].offset = offset; + state->regs[reg].state = CFI_ON_STACK; +} + +static void +scfi_state_restore_reg (scfi_stateS *state, unsigned int reg) +{ + if (reg >= MAX_NUM_SCFI_REGS) + return; + + /* Sanity check. See Rule 4. */ + gas_assert (state->regs[reg].state == CFI_ON_STACK); + gas_assert (state->regs[reg].base == REG_CFA); + + state->regs[reg].base = reg; + state->regs[reg].offset = 0; + /* PS: the register may still be on stack much after the restore, but the + SCFI state keeps the state as 'in register'. */ + state->regs[reg].state = CFI_IN_REG; +} + +/* Identify if the given GAS instruction GINSN saves a register + (of interest) on stack. */ + +static bool +ginsn_scfi_save_reg_p (ginsnS *ginsn, scfi_stateS *state) +{ + bool save_reg_p = false; + struct ginsn_src *src; + struct ginsn_dst *dst; + + src = ginsn_get_src1 (ginsn); + dst = ginsn_get_dst (ginsn); + + /* The first save to stack of callee-saved register is deemed as + register save. */ + if (!ginsn_track_reg_p (ginsn_get_src_reg (src), GINSN_GEN_SCFI) + || state->regs[ginsn_get_src_reg (src)].state == CFI_ON_STACK) + return save_reg_p; + + /* A register save insn may be an indirect mov. */ + if (ginsn->type == GINSN_TYPE_MOV + && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT + && (ginsn_get_dst_reg (dst) == REG_SP + || (ginsn_get_dst_reg (dst) == REG_FP + && state->regs[REG_CFA].base == REG_FP))) + save_reg_p = true; + /* or an explicit store to stack. 
*/ + else if (ginsn->type == GINSN_TYPE_STORE + && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT + && ginsn_get_dst_reg (dst) == REG_SP) + save_reg_p = true; + + return save_reg_p; +} + +/* Identify if the given GAS instruction GINSN restores a register + (of interest) on stack. */ + +static bool +ginsn_scfi_restore_reg_p (ginsnS *ginsn, scfi_stateS *state) +{ + bool restore_reg_p = false; + struct ginsn_dst *dst; + struct ginsn_src *src1; + + dst = ginsn_get_dst (ginsn); + src1 = ginsn_get_src1 (ginsn); + + if (!ginsn_track_reg_p (ginsn_get_dst_reg (dst), GINSN_GEN_SCFI)) + return restore_reg_p; + + /* A register restore insn may be an indirect mov... */ + if (ginsn->type == GINSN_TYPE_MOV + && ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT + && (ginsn_get_src_reg (src1) == REG_SP + || (ginsn_get_src_reg (src1) == REG_FP + && state->regs[REG_CFA].base == REG_FP))) + restore_reg_p = true; + /* ...or an explicit load from stack. */ + else if (ginsn->type == GINSN_TYPE_LOAD + && ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT + && ginsn_get_src_reg (src1) == REG_SP) + restore_reg_p = true; + + return restore_reg_p; +} + +/* Append the SCFI operation OP to the list of SCFI operations in the + given GINSN. */ + +static int +ginsn_append_scfi_op (ginsnS *ginsn, scfi_opS *op) +{ + scfi_opS *sop; + + if (!ginsn || !op) + return 1; + + if (!ginsn->scfi_ops) + { + ginsn->scfi_ops = XCNEW (scfi_opS *); + *ginsn->scfi_ops = op; + } + else + { + /* Add to tail. Most ginsns have a single SCFI operation, + so this traversal for every insertion is acceptable for now. 
*/ + sop = *ginsn->scfi_ops; + while (sop->next) + sop = sop->next; + + sop->next = op; + } + ginsn->num_scfi_ops++; + + return 0; +} + +static void +scfi_op_add_def_cfa_reg (scfi_stateS *state, ginsnS *ginsn, unsigned int reg) +{ + scfi_opS *op = NULL; + + state->regs[REG_CFA].base = reg; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_def_cfa_register; + op->reg = REG_CFA; + op->loc = state->regs[REG_CFA]; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_cfa_offset_inc (scfi_stateS *state, ginsnS *ginsn, offsetT num) +{ + scfi_opS *op = NULL; + + state->regs[REG_CFA].offset -= num; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_def_cfa_offset; + op->reg = REG_CFA; + op->loc = state->regs[REG_CFA]; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_cfa_offset_dec (scfi_stateS *state, ginsnS *ginsn, offsetT num) +{ + scfi_opS *op = NULL; + + state->regs[REG_CFA].offset += num; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_def_cfa_offset; + op->reg = REG_CFA; + op->loc = state->regs[REG_CFA]; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_def_cfa (scfi_stateS *state, ginsnS *ginsn, unsigned int reg, + offsetT num) +{ + scfi_opS *op = NULL; + + state->regs[REG_CFA].base = reg; + state->regs[REG_CFA].offset = num; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_def_cfa; + op->reg = REG_CFA; + op->loc = state->regs[REG_CFA]; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_cfi_offset (scfi_stateS *state, ginsnS *ginsn, unsigned int reg) +{ + scfi_opS *op = NULL; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_offset; + op->reg = reg; + op->loc = state->regs[reg]; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_cfa_restore (ginsnS *ginsn, unsigned int reg) +{ + scfi_opS *op = NULL; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_restore; + op->reg = reg; + op->loc.base = REG_INVALID; + op->loc.offset = 0; + + ginsn_append_scfi_op (ginsn, 
op); +} + +static void +scfi_op_add_cfi_remember_state (ginsnS *ginsn) +{ + scfi_opS *op = NULL; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_remember_state; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_cfi_restore_state (ginsnS *ginsn) +{ + scfi_opS *op = NULL; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_restore_state; + + /* FIXME - add to the beginning of the scfi_ops. */ + ginsn_append_scfi_op (ginsn, op); +} + +void +scfi_op_add_cfi_label (ginsnS *ginsn, const char *name) +{ + scfi_opS *op = NULL; + + op = init_scfi_op (); + op->dw2cfi_op = CFI_label; + op->op_data = XCNEW (scfi_op_dataS); + op->op_data->name = name; + + ginsn_append_scfi_op (ginsn, op); +} + +void +scfi_op_add_signal_frame (ginsnS *ginsn) +{ + scfi_opS *op = NULL; + + op = init_scfi_op (); + op->dw2cfi_op = CFI_signal_frame; + + ginsn_append_scfi_op (ginsn, op); +} + +static int +verify_heuristic_traceable_reg_fp (ginsnS *ginsn, scfi_stateS *state) +{ + /* The function uses this variable to issue error to user right away. */ + int fp_traceable_p = 0; + struct ginsn_dst *dst; + struct ginsn_src *src1; + struct ginsn_src *src2; + + src1 = ginsn_get_src1 (ginsn); + src2 = ginsn_get_src2 (ginsn); + dst = ginsn_get_dst (ginsn); + + /* Stack manipulation can be done in a variety of ways. A program may + allocate stack statically or may perform dynamic stack allocation in + the prologue. + + The SCFI machinery in GAS is based on some heuristics: + + - Rule 3 If the base register for CFA tracking is REG_FP, the program + must not clobber REG_FP, unless it is for switch to REG_SP based CFA + tracking (via say, a pop %rbp in X86). */ + + /* Check all applicable instructions with dest REG_FP, when the CFA base + register is REG_FP. */ + if (state->regs[REG_CFA].base == REG_FP && ginsn_get_dst_reg (dst) == REG_FP) + { + /* Excuse the add/sub with imm usage: They are OK. 
*/ + if ((ginsn->type == GINSN_TYPE_ADD || ginsn->type == GINSN_TYPE_SUB) + && ginsn_get_src_reg (src1) == REG_FP + && ginsn_get_src_type (src2) == GINSN_SRC_IMM) + fp_traceable_p = 0; + /* REG_FP restore is OK too. */ + else if (ginsn->type == GINSN_TYPE_LOAD) + fp_traceable_p = 0; + /* mov's to memory with REG_FP base do not make REG_FP untraceable. */ + else if (ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT + && (ginsn->type == GINSN_TYPE_MOV + || ginsn->type == GINSN_TYPE_STORE)) + fp_traceable_p = 0; + /* Manipulations of the values possibly on stack are OK too. */ + else if ((ginsn->type == GINSN_TYPE_ADD || ginsn->type == GINSN_TYPE_SUB + || ginsn->type == GINSN_TYPE_AND) + && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT) + fp_traceable_p = 0; + /* All other ginsns with REG_FP as destination make REG_FP not + traceable. */ + else + fp_traceable_p = 1; + } + + if (fp_traceable_p) + as_bad_where (ginsn->file, ginsn->line, + _("SCFI: usage of REG_FP as scratch not supported")); + + return fp_traceable_p; +} + +static int +verify_heuristic_traceable_stack_manipulation (ginsnS *ginsn, + scfi_stateS *state) +{ + /* The function uses this variable to issue error to user right away. */ + int sp_untraceable_p = 0; + bool possibly_untraceable = false; + struct ginsn_dst *dst; + struct ginsn_src *src1; + struct ginsn_src *src2; + + src1 = ginsn_get_src1 (ginsn); + src2 = ginsn_get_src2 (ginsn); + dst = ginsn_get_dst (ginsn); + + /* Stack manipulation can be done in a variety of ways. A program may + allocate stack statically in prologue or may need to do dynamic stack + allocation. + + The SCFI machinery in GAS is based on some heuristics: + + - Rule 1 The base register for CFA tracking may be either REG_SP or + REG_FP. + + - Rule 2 If the base register for CFA tracking is REG_SP, the precise + amount of stack usage (and hence, the value of rsp) must be known at + all times. 
*/ + + if (ginsn->type == GINSN_TYPE_MOV + && ginsn_get_dst_type (dst) == GINSN_DST_REG + && ginsn_get_dst_reg (dst) == REG_SP + && ginsn_get_src_type (src1) == GINSN_SRC_REG + /* Exclude mov %rbp, %rsp from this check. */ + && ginsn_get_src_reg (src1) != REG_FP) + { + /* mov %reg, %rsp. */ + /* A previous mov %rsp, %reg must have been seen earlier for this to be + an OK stack manipulation. */ + if (state->scratch[ginsn_get_src_reg (src1)].base != REG_CFA + || state->scratch[ginsn_get_src_reg (src1)].state != CFI_IN_REG) + { + possibly_untraceable = true; + } + } + /* Check add/sub/and insn usage when CFA base register is REG_SP. + Any stack size manipulation, including stack realignment, is not allowed + if CFA base register is REG_SP. */ + else if (ginsn_get_dst_type (dst) == GINSN_DST_REG + && ginsn_get_dst_reg (dst) == REG_SP + && (((ginsn->type == GINSN_TYPE_ADD || ginsn->type == GINSN_TYPE_SUB) + && ginsn_get_src_type (src2) != GINSN_SRC_IMM) + || ginsn->type == GINSN_TYPE_AND + || ginsn->type == GINSN_TYPE_OTHER)) + possibly_untraceable = true; + /* If a register save operation is seen when REG_SP is untraceable, + CFI cannot be synthesized for register saves, hence bail out. */ + else if (ginsn_scfi_save_reg_p (ginsn, state) && !state->traceable_p) + { + sp_untraceable_p = 1; + /* If, however, the register save is a REG_FP-based, indirect mov + like: mov reg, disp(%rbp) and CFA base register is REG_FP, + untraceable REG_SP is not a problem. 
*/ + if (ginsn->type == GINSN_TYPE_MOV + && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT + && (ginsn_get_dst_reg (dst) == REG_FP + && state->regs[REG_CFA].base == REG_FP)) + sp_untraceable_p = 0; + } + else if (ginsn_scfi_restore_reg_p (ginsn, state) && !state->traceable_p) + { + if (ginsn->type == GINSN_TYPE_MOV + && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT + && (ginsn_get_src_reg (src1) == REG_SP + || (ginsn_get_src_reg (src1) == REG_FP + && state->regs[REG_CFA].base != REG_FP))) + sp_untraceable_p = 1; + } + + if (possibly_untraceable) + { + /* See Rule 2. For SP-based CFA, this makes CFA tracking not possible. + Propagate now to caller. */ + if (state->regs[REG_CFA].base == REG_SP) + sp_untraceable_p = 1; + else if (state->traceable_p) + { + /* An extension of Rule 2. + For FP-based CFA, this may be a problem *if* certain specific + changes to the SCFI state are seen beyond this point, e.g., + register save / restore from stack. */ + gas_assert (state->regs[REG_CFA].base == REG_FP); + /* Simply make a note in the SCFI state object for now and + continue. Indicate an error when register save / restore + for callee-saved registers is seen. */ + sp_untraceable_p = 0; + state->traceable_p = false; + } + } + + if (sp_untraceable_p) + as_bad_where (ginsn->file, ginsn->line, + _("SCFI: unsupported stack manipulation pattern")); + + return sp_untraceable_p; +} + +static int +verify_heuristic_symmetrical_restore_reg (scfi_stateS *state, ginsnS* ginsn) +{ + int sym_restore = true; + offsetT expected_offset = 0; + struct ginsn_src *src1; + struct ginsn_dst *dst; + unsigned int reg; + + /* Rule 4: Save and Restore of callee-saved registers must be symmetrical. + It is expected that the value of the saved register is restored correctly. + E.g., + push reg1 + push reg2 + ... + body of func which uses reg1, reg2 as scratch, + and may even spill them to stack. + ... + pop reg2 + pop reg1 + It is difficult to verify Rule 4 in all cases. 
For the SCFI machinery, + it is difficult to separate prologue-epilogue from the body of the function. + + Hence, the SCFI machinery, at this time, should only warn on an asymmetrical + restore. */ + src1 = ginsn_get_src1 (ginsn); + dst = ginsn_get_dst (ginsn); + reg = ginsn_get_dst_reg (dst); + + /* For non callee-saved registers, calling the API is meaningless. */ + if (!ginsn_track_reg_p (ginsn_get_dst_reg (dst), GINSN_GEN_SCFI)) + return sym_restore; + + /* The register must have been saved on stack, for sure. */ + gas_assert (state->regs[reg].state == CFI_ON_STACK); + gas_assert (state->regs[reg].base == REG_CFA); + + if ((ginsn->type == GINSN_TYPE_MOV + || ginsn->type == GINSN_TYPE_LOAD) + && ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT + && (ginsn_get_src_reg (src1) == REG_SP + || (ginsn_get_src_reg (src1) == REG_FP + && state->regs[REG_CFA].base == REG_FP))) + { + /* mov disp(%rsp), reg. */ + /* mov disp(%rbp), reg. */ + expected_offset = (((ginsn_get_src_reg (src1) == REG_SP) + ? -state->stack_size + : state->regs[REG_FP].offset) + + ginsn_get_src_disp (src1)); + } + + sym_restore = (expected_offset == state->regs[reg].offset); + + return sym_restore; +} + +/* Perform symbolic execution of the GINSN and update its list of scfi_ops. + scfi_ops are later used to directly generate the DWARF CFI directives. + Also update the SCFI state object STATE for the caller. */ + +static int +gen_scfi_ops (ginsnS *ginsn, scfi_stateS *state) +{ + int ret = 0; + offsetT offset; + struct ginsn_src *src1; + struct ginsn_src *src2; + struct ginsn_dst *dst; + + if (!ginsn || !state) + return 1; + + /* For the first ginsn (of type GINSN_TYPE_SYMBOL) in the gbb, generate + the SCFI op with DW_CFA_def_cfa. Note that the register and offset are + target-specific. 
*/ + if (GINSN_F_FUNC_BEGIN_P (ginsn)) + { + scfi_op_add_def_cfa (state, ginsn, REG_SP, SCFI_INIT_CFA_OFFSET); + state->stack_size += SCFI_INIT_CFA_OFFSET; + return ret; + } + + src1 = ginsn_get_src1 (ginsn); + src2 = ginsn_get_src2 (ginsn); + dst = ginsn_get_dst (ginsn); + + ret = verify_heuristic_traceable_stack_manipulation (ginsn, state); + if (ret) + return ret; + + ret = verify_heuristic_traceable_reg_fp (ginsn, state); + if (ret) + return ret; + + switch (ginsn->dst.type) + { + case GINSN_DST_REG: + switch (ginsn->type) + { + case GINSN_TYPE_MOV: + if (ginsn_get_src_type (src1) == GINSN_SRC_REG + && ginsn_get_src_reg (src1) == REG_SP + && ginsn_get_dst_reg (dst) == REG_FP + && state->regs[REG_CFA].base == REG_SP) + { + /* mov %rsp, %rbp. */ + scfi_op_add_def_cfa_reg (state, ginsn, ginsn_get_dst_reg (dst)); + } + else if (ginsn_get_src_type (src1) == GINSN_SRC_REG + && ginsn_get_src_reg (src1) == REG_FP + && ginsn_get_dst_reg (dst) == REG_SP + && state->regs[REG_CFA].base == REG_FP) + { + /* mov %rbp, %rsp. */ + state->stack_size = -state->regs[REG_FP].offset; + scfi_op_add_def_cfa_reg (state, ginsn, ginsn_get_dst_reg (dst)); + state->traceable_p = true; + } + else if (ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT + && (ginsn_get_src_reg (src1) == REG_SP + || ginsn_get_src_reg (src1) == REG_FP) + && ginsn_track_reg_p (ginsn_get_dst_reg (dst), GINSN_GEN_SCFI)) + { + /* mov disp(%rsp), reg. */ + /* mov disp(%rbp), reg. */ + if (verify_heuristic_symmetrical_restore_reg (state, ginsn)) + { + scfi_state_restore_reg (state, ginsn_get_dst_reg (dst)); + scfi_op_add_cfa_restore (ginsn, ginsn_get_dst_reg (dst)); + } + else + as_warn_where (ginsn->file, ginsn->line, + _("SCFI: asymetrical register restore")); + } + else if (ginsn_get_src_type (src1) == GINSN_SRC_REG + && ginsn_get_dst_type (dst) == GINSN_DST_REG + && ginsn_get_src_reg (src1) == REG_SP) + { + /* mov %rsp, %reg. */ + /* The value of rsp is taken directly from state->stack_size. 
+ IMP: The workflow in gen_scfi_ops must keep it updated. + PS: Not taking the value from state->scratch[REG_SP] is + intentional. */ + state->scratch[ginsn_get_dst_reg (dst)].base = REG_CFA; + state->scratch[ginsn_get_dst_reg (dst)].offset = -state->stack_size; + state->scratch[ginsn_get_dst_reg (dst)].state = CFI_IN_REG; + } + else if (ginsn_get_src_type (src1) == GINSN_SRC_REG + && ginsn_get_dst_type (dst) == GINSN_DST_REG + && ginsn_get_dst_reg (dst) == REG_SP) + { + /* mov %reg, %rsp. */ + /* Keep the value of REG_SP updated. */ + if (state->scratch[ginsn_get_src_reg (src1)].state == CFI_IN_REG) + { + state->stack_size = -state->scratch[ginsn_get_src_reg (src1)].offset; + state->traceable_p = true; + } +# if 0 + scfi_state_update_reg (state, ginsn_get_dst_reg (dst), + state->scratch[ginsn_get_src_reg (src1)].base, + state->scratch[ginsn_get_src_reg (src1)].offset); +#endif + + } + break; + case GINSN_TYPE_SUB: + if (ginsn_get_src_reg (src1) == REG_SP + && ginsn_get_dst_reg (dst) == REG_SP) + { + /* Stack inc/dec offset, when generated due to stack push and pop is + target-specific. Use the value encoded in the ginsn. */ + state->stack_size += ginsn_get_src_imm (src2); + if (state->regs[REG_CFA].base == REG_SP) + { + /* push reg. */ + scfi_op_add_cfa_offset_dec (state, ginsn, ginsn_get_src_imm (src2)); + } + } + break; + case GINSN_TYPE_ADD: + if (ginsn_get_src_reg (src1) == REG_SP + && ginsn_get_dst_reg (dst) == REG_SP) + { + /* Stack inc/dec offset is target-specific. Use the value + encoded in the ginsn. */ + state->stack_size -= ginsn_get_src_imm (src2); + /* pop %reg affects CFA offset only if CFA is currently + stack-pointer based. */ + if (state->regs[REG_CFA].base == REG_SP) + { + scfi_op_add_cfa_offset_inc (state, ginsn, ginsn_get_src_imm (src2)); + } + } + else if (ginsn_get_src_reg (src1) == REG_FP + && ginsn_get_dst_reg (dst) == REG_SP + && state->regs[REG_CFA].base == REG_FP) + { + /* FIXME - what is this for ? 
*/ + state->stack_size = 0 - (state->regs[REG_FP].offset + ginsn_get_src_imm (src2)); + } + break; + case GINSN_TYPE_LOAD: + /* If this is a load from stack. */ + if (ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT + && (ginsn_get_src_reg (src1) == REG_SP + || (ginsn_get_src_reg (src1) == REG_FP + && state->regs[REG_CFA].base == REG_FP))) + { + /* pop %rbp when CFA tracking is REG_FP based. */ + if (ginsn_get_dst_reg (dst) == REG_FP + && state->regs[REG_CFA].base == REG_FP) + { + scfi_op_add_def_cfa_reg (state, ginsn, REG_SP); + if (state->regs[REG_CFA].offset != state->stack_size) + scfi_op_add_cfa_offset_inc (state, ginsn, + (state->regs[REG_CFA].offset - state->stack_size)); + } + if (ginsn_track_reg_p (ginsn_get_dst_reg (dst), GINSN_GEN_SCFI)) + { + if (verify_heuristic_symmetrical_restore_reg (state, ginsn)) + { + scfi_state_restore_reg (state, ginsn_get_dst_reg (dst)); + scfi_op_add_cfa_restore (ginsn, ginsn_get_dst_reg (dst)); + } + else + as_warn_where (ginsn->file, ginsn->line, + _("SCFI: asymetrical register restore")); + } + } + break; + default: + break; + } + break; + + case GINSN_DST_INDIRECT: + /* Some operations with an indirect access to memory (or even to stack) + may still be uninteresting for SCFI purpose (e.g, addl %edx, -32(%rsp) + in x86). In case of x86_64, these can neither be a register + save / unsave, nor can alter the stack size. + PS: This condition may need to be revisited for other arches. 
*/ + if (ginsn->type == GINSN_TYPE_ADD || ginsn->type == GINSN_TYPE_SUB + || ginsn->type == GINSN_TYPE_AND) + break; + gas_assert (ginsn->type == GINSN_TYPE_MOV + || ginsn->type == GINSN_TYPE_STORE + || ginsn->type == GINSN_TYPE_LOAD); + /* mov reg, disp(%rbp) */ + /* mov reg, disp(%rsp) */ + if (ginsn_scfi_save_reg_p (ginsn, state)) + { + if (ginsn_get_dst_reg (dst) == REG_SP) + { + /* mov reg, disp(%rsp) */ + offset = 0 - state->stack_size + ginsn_get_dst_disp (dst); + scfi_state_save_reg (state, ginsn_get_src_reg (src1), REG_CFA, offset); + scfi_op_add_cfi_offset (state, ginsn, ginsn_get_src_reg (src1)); + } + else if (ginsn_get_dst_reg (dst) == REG_FP) + { + gas_assert (state->regs[REG_CFA].base == REG_FP); + /* mov reg, disp(%rbp) */ + offset = 0 - state->regs[REG_CFA].offset + ginsn_get_dst_disp (dst); + scfi_state_save_reg (state, ginsn_get_src_reg (src1), REG_CFA, offset); + scfi_op_add_cfi_offset (state, ginsn, ginsn_get_src_reg (src1)); + } + } + break; + + default: + /* Skip GINSN_DST_UNKNOWN and GINSN_DST_MEM as they are uninteresting + currently for SCFI. */ + break; + } + + return ret; +} + +/* Recursively perform forward flow of the (unwind information) SCFI state + starting at basic block GBB. + + The forward flow process propagates the SCFI state at exit of a basic block + to the successor basic block. + + Returns error code, if any. */ + +static int +forward_flow_scfi_state (gcfgS *gcfg, gbbS *gbb, scfi_stateS *state) +{ + ginsnS *ginsn; + gbbS *prev_bb; + gedgeS *gedge = NULL; + int ret = 0; + + if (gbb->visited) + { + /* Check that the SCFI state is the same as previous. 
*/ + ret = cmp_scfi_state (state, gbb->entry_state); + if (ret) + as_bad (_("SCFI: Bad CFI propagation perhaps")); + return ret; + } + + gbb->visited = true; + + gbb->entry_state = XCNEW (scfi_stateS); + memcpy (gbb->entry_state, state, sizeof (scfi_stateS)); + + /* Perform symbolic execution of each ginsn in the gbb and update the + scfi_ops list of each ginsn (and also update the STATE object). */ + bb_for_each_insn(gbb, ginsn) + { + ret = gen_scfi_ops (ginsn, state); + if (ret) + goto fail; + } + + gbb->exit_state = XCNEW (scfi_stateS); + memcpy (gbb->exit_state, state, sizeof (scfi_stateS)); + + /* Forward flow the SCFI state. Currently, we process the next basic block + in DFS order. But any forward traversal order should be fine. */ + prev_bb = gbb; + if (gbb->num_out_gedges) + { + bb_for_each_edge(gbb, gedge) + { + gbb = gedge->dst_bb; + if (gbb->visited) + { + ret = cmp_scfi_state (gbb->entry_state, state); + if (ret) + goto fail; + } + + if (!gedge->visited) + { + gedge->visited = true; + + /* Entry SCFI state for the destination bb of the edge is the + same as the exit SCFI state of the source bb of the edge. */ + memcpy (state, prev_bb->exit_state, sizeof (scfi_stateS)); + ret = forward_flow_scfi_state (gcfg, gbb, state); + if (ret) + goto fail; + } + } + } + + return 0; + +fail: + + if (gedge) + gedge->visited = true; + return 1; +} + +static int +backward_flow_scfi_state (const symbolS *func ATTRIBUTE_UNUSED, gcfgS *gcfg) +{ + gbbS **prog_order_bbs; + gbbS **restore_bbs; + gbbS *current_bb; + gbbS *prev_bb; + gbbS *dst_bb; + ginsnS *ginsn; + gedgeS *gedge = NULL; + + int ret = 0; + uint64_t i, j; + + /* Basic blocks in reverse program order. */ + prog_order_bbs = XCNEWVEC (gbbS *, gcfg->num_gbbs); + /* Basic blocks for which CFI remember op needs to be generated. */ + restore_bbs = XCNEWVEC (gbbS *, gcfg->num_gbbs); + + gcfg_get_bbs_in_prog_order (gcfg, prog_order_bbs); + + i = gcfg->num_gbbs - 1; + /* Traverse in reverse program order. 
*/ + while (i > 0) + { + current_bb = prog_order_bbs[i]; + prev_bb = prog_order_bbs[i-1]; + if (cmp_scfi_state (prev_bb->exit_state, current_bb->entry_state)) + { + /* Candidate for .cfi_restore_state found. */ + ginsn = bb_get_first_ginsn (current_bb); + scfi_op_add_cfi_restore_state (ginsn); + /* Memorize current_bb now to find location for its remember state + later. */ + restore_bbs[i] = current_bb; + } + else + { + bb_for_each_edge (current_bb, gedge) + { + dst_bb = gedge->dst_bb; + for (j = 0; j < gcfg->num_gbbs; j++) + if (restore_bbs[j] == dst_bb) + { + ginsn = bb_get_last_ginsn (current_bb); + scfi_op_add_cfi_remember_state (ginsn); + /* Remove the memorised restore_bb from the list. */ + restore_bbs[j] = NULL; + break; + } + } + } + i--; + } + + /* All .cfi_restore_state pseudo-ops must have a corresponding + .cfi_remember_state by now. */ + for (j = 0; j < gcfg->num_gbbs; j++) + if (restore_bbs[j] != NULL) + { + ret = 1; + break; + } + + free (restore_bbs); + free (prog_order_bbs); + + return ret; +} + +/* Synthesize DWARF CFI for a function. */ + +int +scfi_synthesize_dw2cfi (const symbolS *func, gcfgS *gcfg, gbbS *root_bb) +{ + int ret; + scfi_stateS *init_state; + + init_state = XCNEW (scfi_stateS); + init_state->traceable_p = true; + + /* Traverse the input GCFG and perform forward flow of information. + Update the scfi_op(s) per ginsn. */ + ret = forward_flow_scfi_state (gcfg, root_bb, init_state); + if (ret) + { + as_bad (_("SCFI: forward pass failed for func '%s'"), S_GET_NAME (func)); + goto end; + } + + ret = backward_flow_scfi_state (func, gcfg); + if (ret) + { + as_bad (_("SCFI: backward pass failed for func '%s'"), S_GET_NAME (func)); + goto end; + } + +end: + free (init_state); + return ret; +} + +static int +handle_scfi_dot_cfi (ginsnS *ginsn) +{ + scfi_opS *op; + + /* Nothing to do. 
*/ + if (!ginsn->scfi_ops) + return 0; + + op = *ginsn->scfi_ops; + if (!op) + goto bad; + + while (op) + { + switch (op->dw2cfi_op) + { + case DW_CFA_def_cfa_register: + scfi_dot_cfi (DW_CFA_def_cfa_register, op->loc.base, 0, 0, NULL, + ginsn->sym); + break; + case DW_CFA_def_cfa_offset: + scfi_dot_cfi (DW_CFA_def_cfa_offset, op->loc.base, 0, + op->loc.offset, NULL, ginsn->sym); + break; + case DW_CFA_def_cfa: + scfi_dot_cfi (DW_CFA_def_cfa, op->loc.base, 0, op->loc.offset, + NULL, ginsn->sym); + break; + case DW_CFA_offset: + scfi_dot_cfi (DW_CFA_offset, op->reg, 0, op->loc.offset, NULL, + ginsn->sym); + break; + case DW_CFA_restore: + scfi_dot_cfi (DW_CFA_restore, op->reg, 0, 0, NULL, ginsn->sym); + break; + case DW_CFA_remember_state: + scfi_dot_cfi (DW_CFA_remember_state, 0, 0, 0, NULL, ginsn->sym); + break; + case DW_CFA_restore_state: + scfi_dot_cfi (DW_CFA_restore_state, 0, 0, 0, NULL, ginsn->sym); + break; + case CFI_label: + scfi_dot_cfi (CFI_label, 0, 0, 0, op->op_data->name, ginsn->sym); + break; + case CFI_signal_frame: + scfi_dot_cfi (CFI_signal_frame, 0, 0, 0, NULL, ginsn->sym); + break; + default: + goto bad; + break; + } + op = op->next; + } + + return 0; +bad: + as_bad (_("SCFI: Invalid DWARF CFI opcode data")); + return 1; +} + +/* Emit Synthesized DWARF CFI. */ + +int +scfi_emit_dw2cfi (const symbolS *func) +{ + struct frch_ginsn_data *frch_gdata; + ginsnS* ginsn = NULL; + + frch_gdata = frchain_now->frch_ginsn_data; + ginsn = frch_gdata->gins_rootP; + + while (ginsn) + { + switch (ginsn->type) + { + case GINSN_TYPE_SYMBOL: + /* .cfi_startproc and .cfi_endproc pseudo-ops. */ + if (GINSN_F_FUNC_BEGIN_P (ginsn)) + { + scfi_dot_cfi_startproc (frch_gdata->start_addr); + break; + } + else if (GINSN_F_FUNC_END_P (ginsn)) + { + scfi_dot_cfi_endproc (ginsn->sym); + break; + } + /* Fall through. 
*/ + case GINSN_TYPE_ADD: + case GINSN_TYPE_AND: + case GINSN_TYPE_CALL: + case GINSN_TYPE_JUMP: + case GINSN_TYPE_JUMP_COND: + case GINSN_TYPE_MOV: + case GINSN_TYPE_LOAD: + case GINSN_TYPE_PHANTOM: + case GINSN_TYPE_STORE: + case GINSN_TYPE_SUB: + case GINSN_TYPE_OTHER: + case GINSN_TYPE_RETURN: + + /* For all other SCFI ops, invoke the handler. */ + if (ginsn->scfi_ops) + handle_scfi_dot_cfi (ginsn); + break; + + default: + /* No other GINSN_TYPE_* expected. */ + as_bad (_("SCFI: bad ginsn for func '%s'"), + S_GET_NAME (func)); + break; + } + ginsn = ginsn->next; + } + return 0; +} + +#else + +int +scfi_emit_dw2cfi (const symbolS *func ATTRIBUTE_UNUSED) +{ + as_bad (_("SCFI: unsupported for target")); + return 1; +} + +int +scfi_synthesize_dw2cfi (const symbolS *func ATTRIBUTE_UNUSED, + gcfgS *gcfg ATTRIBUTE_UNUSED, + gbbS *root_bb ATTRIBUTE_UNUSED) +{ + as_bad (_("SCFI: unsupported for target")); + return 1; +} + +#endif /* defined (TARGET_USE_SCFI) && defined (TARGET_USE_GINSN). */ diff --git a/gas/scfi.h b/gas/scfi.h new file mode 100644 index 0000000..07abe99 --- /dev/null +++ b/gas/scfi.h @@ -0,0 +1,38 @@ +/* scfi.h - Support for synthesizing CFI for asm. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GAS, the GNU Assembler. + + GAS is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GAS is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GAS; see the file COPYING. If not, write to the Free + Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA + 02110-1301, USA. 
*/ + +#ifndef SCFI_H +#define SCFI_H + +#include "as.h" +#include "ginsn.h" + +void scfi_ops_cleanup (scfi_opS **head); + +/* Some SCFI ops are not synthesized and are only added externally when parsing + the assembler input. Two examples are CFI_label, and CFI_signal_frame. */ +void scfi_op_add_cfi_label (ginsnS *ginsn, const char *name); +void scfi_op_add_signal_frame (ginsnS *ginsn); + +int scfi_emit_dw2cfi (const symbolS *func); + +int scfi_synthesize_dw2cfi (const symbolS *func, gcfgS *gcfg, gbbS *root_bb); + +#endif /* SCFI_H. */ diff --git a/gas/scfidw2gen.c b/gas/scfidw2gen.c index 1b3fb15..ebf2d24 100644 --- a/gas/scfidw2gen.c +++ b/gas/scfidw2gen.c @@ -19,6 +19,8 @@ 02110-1301, USA. */ #include "as.h" +#include "ginsn.h" +#include "scfi.h" #include "dw2gencfi.h" #include "subsegs.h" #include "scfidw2gen.h" @@ -43,15 +45,33 @@ dot_scfi_ignore (int ignored ATTRIBUTE_UNUSED) static void scfi_process_cfi_label (void) { - /* To be implemented. */ - return; + char *name; + ginsnS *ginsn; + + name = read_symbol_name (); + if (name == NULL) + return; + + /* Add a new ginsn. */ + ginsn = ginsn_new_phantom (symbol_temp_new_now ()); + frch_ginsn_data_append (ginsn); + + scfi_op_add_cfi_label (ginsn, name); + /* TODO. */ + // free (name); + + demand_empty_rest_of_line (); } static void scfi_process_cfi_signal_frame (void) { - /* To be implemented. 
*/ - return; + ginsnS *ginsn; + + ginsn = ginsn_new_phantom (symbol_temp_new_now ()); + frch_ginsn_data_append (ginsn); + + scfi_op_add_signal_frame (ginsn); } static void diff --git a/gas/subsegs.c b/gas/subsegs.c index 42f42c5..6ecfc37 100644 --- a/gas/subsegs.c +++ b/gas/subsegs.c @@ -130,6 +130,7 @@ subseg_set_rest (segT seg, subsegT subseg) newP->frch_frag_now = frag_alloc (&newP->frch_obstack); newP->frch_frag_now->fr_type = rs_fill; newP->frch_cfi_data = NULL; + newP->frch_ginsn_data = NULL; newP->frch_root = newP->frch_last = newP->frch_frag_now; diff --git a/gas/subsegs.h b/gas/subsegs.h index 8de3950..d1e73fe 100644 --- a/gas/subsegs.h +++ b/gas/subsegs.h @@ -40,6 +40,7 @@ #include "obstack.h" struct frch_cfi_data; +struct frch_ginsn_data; struct frchain /* control building of a frag chain */ { /* FRCH = FRagment CHain control */ @@ -52,6 +53,7 @@ struct frchain /* control building of a frag chain */ struct obstack frch_obstack; /* for objects in this frag chain */ fragS *frch_frag_now; /* frag_now for this subsegment */ struct frch_cfi_data *frch_cfi_data; + struct frch_ginsn_data *frch_ginsn_data; }; typedef struct frchain frchainS; diff --git a/gas/symbols.c b/gas/symbols.c index 10a3f50..41f273c 100644 --- a/gas/symbols.c +++ b/gas/symbols.c @@ -25,6 +25,7 @@ #include "obstack.h" /* For "symbols.h" */ #include "subsegs.h" #include "write.h" +#include "scfi.h" #include <limits.h> #ifndef CHAR_BIT @@ -709,6 +710,8 @@ colon (/* Just seen "x:" - rattle symbols & frags. */ #ifdef obj_frob_label obj_frob_label (symbolP); #endif + if (flag_synth_cfi) + ginsn_frob_label (symbolP); return symbolP; } |
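For readers following the GINSN_TYPE_SUB and GINSN_TYPE_ADD cases in gen_scfi_ops, the bookkeeping of stack size versus CFA offset can be sketched in isolation. The code below is only an illustrative model, not the code in gas/scfi.c: every toy_* name is hypothetical, and the initial CFA offset of 8 is an assumption standing in for the target-provided SCFI_INIT_CFA_OFFSET. It shows that the stack size is always updated, while the CFA offset changes (and a def_cfa_offset op would be needed) only while the CFA is still based on the stack pointer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the SCFI state in gas/scfi.c.  */
enum toy_reg { TOY_REG_SP, TOY_REG_FP };

struct toy_state
{
  enum toy_reg cfa_base;   /* Register the CFA is currently computed from.  */
  long cfa_offset;         /* CFA = cfa_base + cfa_offset.  */
  long stack_size;         /* Bytes of stack currently in use.  */
};

/* Function entry.  Assumption: the initial CFA offset is 8 (the return
   address pushed by the call), mirroring SCFI_INIT_CFA_OFFSET.  */
static void
toy_func_begin (struct toy_state *s)
{
  s->cfa_base = TOY_REG_SP;
  s->cfa_offset = 8;
  s->stack_size = 8;
}

/* sub $imm, %rsp (also the stack adjustment half of a push).
   Returns true if a def_cfa_offset op would have to be emitted.  */
static bool
toy_sub_sp (struct toy_state *s, long imm)
{
  s->stack_size += imm;
  if (s->cfa_base != TOY_REG_SP)
    return false;
  s->cfa_offset += imm;
  return true;
}

/* add $imm, %rsp (also the stack adjustment half of a pop).
   Returns true if a def_cfa_offset op would have to be emitted.  */
static bool
toy_add_sp (struct toy_state *s, long imm)
{
  /* Sanity check: cannot release more stack than is in use.  */
  assert (imm <= s->stack_size);
  s->stack_size -= imm;
  if (s->cfa_base != TOY_REG_SP)
    return false;
  s->cfa_offset -= imm;
  return true;
}

/* mov %rsp, %rbp: the CFA base switches to the frame pointer; the
   offset is unchanged because both registers hold the same value.  */
static void
toy_mov_sp_to_fp (struct toy_state *s)
{
  s->cfa_base = TOY_REG_FP;
}
```

Under these assumptions, a typical prologue (push %rbp, then mov %rsp, %rbp, then sub $32, %rsp) would change the CFA offset only for the push; the later sub only grows the tracked stack size.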