aboutsummaryrefslogtreecommitdiff
path: root/gas/doc
diff options
context:
space:
mode:
authorKen Raeburn <raeburn@cygnus>1995-04-10 20:06:48 +0000
committerKen Raeburn <raeburn@cygnus>1995-04-10 20:06:48 +0000
commitae6cd60f9e1dedfae16b6b52cac635ab816fcc6b (patch)
tree0332f3f0e37c4a40f0020b880fb7e585e2b8ca0f /gas/doc
parent7f390875ca707668c9825010cb3252da1fede97b (diff)
downloadgdb-ae6cd60f9e1dedfae16b6b52cac635ab816fcc6b.zip
gdb-ae6cd60f9e1dedfae16b6b52cac635ab816fcc6b.tar.gz
gdb-ae6cd60f9e1dedfae16b6b52cac635ab816fcc6b.tar.bz2
* doc/internals.texi: Add loud disclaimer. Refill to 79 columns, specify
fill-column in local-variables section. Change subheadings to subsections so they can be cross-referenced. Describe broken words, frags, frag chains, generic relaxation, relax table, m68k relaxation, m68k addressing modes, test suite code. Add a few words about various file formats.
Diffstat (limited to 'gas/doc')
-rw-r--r--gas/doc/internals.texi688
1 files changed, 592 insertions, 96 deletions
diff --git a/gas/doc/internals.texi b/gas/doc/internals.texi
index cc5b566..1ba9cfe 100644
--- a/gas/doc/internals.texi
+++ b/gas/doc/internals.texi
@@ -1,7 +1,19 @@
+\input texinfo
+@setfilename internals.info
@node Assembler Internals
@chapter Assembler Internals
@cindex internals
+This documentation is not ready for prime time yet. Not even close. It's not
+so much documentation as random blathering of mine intended to be notes to
+myself that may eventually be turned into real documentation.
+
+I take no responsibility for any negative effect it may have on your
+professional, personal, or spiritual life. Read it at your own risk. Caveat
+emptor. Delete before reading. Abandon all hope, ye who enter here.
+
+However, enhancements will be gratefully accepted.
+
@menu
* Data types:: Data types
@end menu
@@ -17,162 +29,602 @@ BFD, MANY_SECTIONS, BFD_HEADERS
@section Data types
@cindex internals, data types
-@subheading Symbols
+@subsection Symbols
@cindex internals, symbols
@cindex symbols, internal
... `local' symbols ... flags ...
-The definition for @code{struct symbol}, also known as @code{symbolS},
-is located in @file{struc-symbol.h}. Symbol structures can contain the
-following fields:
+The definition for @code{struct symbol}, also known as @code{symbolS}, is
+located in @file{struc-symbol.h}. Symbol structures can contain the following
+fields:
@table @code
@item sy_value
-This is an @code{expressionS} that describes the value of the symbol.
-It might refer to another symbol; if so, its true value may not be known
-until @code{foo} is run.
+This is an @code{expressionS} that describes the value of the symbol. It might
+refer to another symbol; if so, its true value may not be known until
+@code{foo} is called.
-More generally, however, ... undefined? ... or an offset from the start
-of a frag pointed to by the @code{sy_frag} field.
+More generally, however, ... undefined? ... or an offset from the start of a
+frag pointed to by the @code{sy_frag} field.
@item sy_resolved
-This field is non-zero if the symbol's value has been completely
-resolved. It is used during the final pass over the symbol table.
+This field is non-zero if the symbol's value has been completely resolved. It
+is used during the final pass over the symbol table.
@item sy_resolving
This field is used to detect loops while resolving the symbol's value.
@item sy_used_in_reloc
-This field is non-zero if the symbol is used by a relocation entry. If
-a local symbol is used in a relocation entry, it must be possible to
-redirect those relocations to other symbols, or this symbol cannot be
-removed from the final symbol list.
+This field is non-zero if the symbol is used by a relocation entry. If a local
+symbol is used in a relocation entry, it must be possible to redirect those
+relocations to other symbols, or this symbol cannot be removed from the final
+symbol list.
@item sy_next
@itemx sy_previous
-These pointers to other @code{symbolS} structures describe a singly or
-doubly linked list. (If @code{SYMBOLS_NEED_BACKPOINTERS} is not
-defined, the @code{sy_previous} field will be omitted.) These fields
-should be accessed with @code{symbol_next} and @code{symbol_previous}.
+These pointers to other @code{symbolS} structures describe a singly or doubly
+linked list. (If @code{SYMBOLS_NEED_BACKPOINTERS} is not defined, the
+@code{sy_previous} field will be omitted.) These fields should be accessed
+with @code{symbol_next} and @code{symbol_previous}.
@item sy_frag
This points to the @code{fragS} that this symbol is attached to.
@item sy_used
-Whether the symbol is used as an operand or in an expression. Note: Not
-all the backends keep this information accurate; backends which use this
-bit are responsible for setting it when a symbol is used in backend
-routines.
+Whether the symbol is used as an operand or in an expression. Note: Not all of
+the backends keep this information accurate; backends which use this bit are
+responsible for setting it when a symbol is used in backend routines.
@item bsym
-If @code{BFD_ASSEMBLER} is defined, this points to the @code{asymbol}
-that will be used in writing the object file.
+If @code{BFD_ASSEMBLER} is defined, this points to the @code{asymbol} that will
+be used in writing the object file.
@item sy_name_offset
-(Only used if @code{BFD_ASSEMBLER} is not defined.)
-This is the position of the symbol's name in the symbol table of the
-object file. On some formats, this will start at position 4, with
-position 0 reserved for unnamed symbols. This field is not used until
-@code{write_object_file} is called.
+(Only used if @code{BFD_ASSEMBLER} is not defined.) This is the position of
+the symbol's name in the symbol table of the object file. On some formats,
+this will start at position 4, with position 0 reserved for unnamed symbols.
+This field is not used until @code{write_object_file} is called.
@item sy_symbol
-(Only used if @code{BFD_ASSEMBLER} is not defined.)
-This is the format-specific symbol structure, as it would be written into
-the object file.
+(Only used if @code{BFD_ASSEMBLER} is not defined.) This is the
+format-specific symbol structure, as it would be written into the object file.
@item sy_number
-(Only used if @code{BFD_ASSEMBLER} is not defined.)
-This is a 24-bit symbol number, for use in constructing relocation table
-entries.
+(Only used if @code{BFD_ASSEMBLER} is not defined.) This is a 24-bit symbol
+number, for use in constructing relocation table entries.
@item sy_obj
-This format-specific data is of type @code{OBJ_SYMFIELD_TYPE}. If no
-macro by that name is defined in @file{obj-format.h}, this field is not
-defined.
+This format-specific data is of type @code{OBJ_SYMFIELD_TYPE}. If no macro by
+that name is defined in @file{obj-format.h}, this field is not defined.
@item sy_tc
-This processor-specific data is of type @code{TC_SYMFIELD_TYPE}. If no
-macro by that name is defined in @file{targ-cpu.h}, this field is not
-defined.
+This processor-specific data is of type @code{TC_SYMFIELD_TYPE}. If no macro
+by that name is defined in @file{targ-cpu.h}, this field is not defined.
@item TARGET_SYMBOL_FIELDS
-If this macro is defined, it defines additional fields in the symbol
-structure. This macro is obsolete, and should be replaced when possible
-by uses of @code{OBJ_SYMFIELD_TYPE} and @code{TC_SYMFIELD_TYPE}.
+If this macro is defined, it defines additional fields in the symbol structure.
+This macro is obsolete, and should be replaced when possible by uses of
+@code{OBJ_SYMFIELD_TYPE} and @code{TC_SYMFIELD_TYPE}.
@end table
-Access with S_SET_SEGMENT, S_SET_VALUE, S_GET_VALUE, S_GET_SEGMENT,
-etc., etc.
+Access with S_SET_SEGMENT, S_SET_VALUE, S_GET_VALUE, S_GET_SEGMENT, etc., etc.
-@foo Expressions
+@subsection Expressions
@cindex internals, expressions
@cindex expressions, internal
Expressions are stored as a combination of operator, symbols, blah.
-@subheading Fixups
+@subsection Fixups
@cindex internals, fixups
@cindex fixups
-@subheading Frags
+@subsection Frags
@cindex internals, frags
@cindex frags
-@subheading Broken Words
+The frag is the basic unit for storing section contents.
+
+@table @code
+
+@item fr_address
+The address of the frag. This is not set until the assembler rescans the list
+of all frags after the entire input file is parsed. The function
+@code{relax_segment} fills in this field.
+
+@item fr_next
+Pointer to the next frag in this (sub)section.
+
+@item fr_fix
+Fixed number of characters we know we're going to emit to the output file. May
+be zero.
+
+@item fr_var
+Variable number of characters we may output, after the initial @code{fr_fix}
+characters. May be zero.
+
+@item fr_symbol
+@itemx fr_offset
+Foo.
+
+@item fr_opcode
+Points to the lowest-addressed byte of the opcode, for use in relaxation.
+
+@item line
+Holds line-number info.
+
+@item fr_type
+Relaxation state. This field indicates the interpretation of @code{fr_offset},
+@code{fr_symbol} and the variable-length tail of the frag, as well as the
+treatment it gets in various phases of processing. It does not affect the
+initial @code{fr_fix} characters; they are always supposed to be output
+verbatim (fixups aside). See below for specific values this field can have.
+
+@item fr_subtype
+Relaxation substate. If the macro @code{md_relax_frag} isn't defined, this is
+assumed to be an index into @code{md_relax_table} for the generic relaxation
+code to process. (@xref{Relaxation}.) If @code{md_relax_frag} is defined,
+this field is available for any use by the CPU-specific code.
+
+@item align_mask
+@itemx align_offset
+These fields are not used yet. They are intended to keep track of the
+alignment of the current frag within its section, even if the exact offset
+isn't known. In many cases, we should be able to avoid creating extra frags
+when @code{.align} directives are given; instead, the number of bytes needed
+may be computable when the @code{.align} directive is processed. Hmm. Is this
+the right place for these, or should they be in the @code{frchainS} structure?
+
+@item fr_pcrel_adjust
+@itemx fr_bsr
+These fields are only used in the NS32k configuration. But since @code{struct
+frag} is defined before the CPU-specific header files are included, they must
+unconditionally be defined.
+
+@item fr_literal
+Declared as a one-character array, this last field grows arbitrarily large to
+hold the actual contents of the frag.
+
+@end table
+
+These are the possible relaxation states, provided in the enumeration type
+@code{relax_stateT}, and the interpretations they represent for the other
+fields:
+
+@table @code
+
+@item rs_align
+The start of the following frag should be aligned on some boundary. In this
+frag, @code{fr_offset} is the logarithm (base 2) of the alignment in bytes.
+(For example, if alignment on an 8-byte boundary were desired, @code{fr_offset}
+would have a value of 3.) The variable characters indicate the fill pattern to
+be used. (More than one?)
+
+@item rs_broken_word
+This indicates that ``broken word'' processing should be done. @xref{Broken
+Words,,Broken Words}. If broken word processing is not necessary on the target
+machine, this enumerator value will not be defined.
+
+@item rs_fill
+The variable characters are to be repeated @code{fr_offset} times. If
+@code{fr_offset} is 0, this frag has a length of @code{fr_fix}.
+
+@item rs_machine_dependent
+Displacement relaxation is to be done on this frag. The target is indicated by
+@code{fr_symbol} and @code{fr_offset}, and @code{fr_subtype} indicates the
+particular machine-specific addressing mode desired. @xref{Relaxation}.
+
+@item rs_org
+The start of the following frag should be pushed back to some specific offset
+within the section. (Some assemblers use the value as an absolute address; the
+@sc{gnu} assembler does not handle final absolute addresses, it requires that
+the linker set them.) The offset is given by @code{fr_symbol} and
+@code{fr_offset}; one character from the variable-length tail is used as the
+fill character.
+
+@end table
+
+A chain of frags is built up for each subsection. The data structure
+describing a chain is called a @code{frchainS}, and contains the following
+fields:
+
+@table @code
+@item frch_root
+Points to the first frag in the chain. May be null if there are no frags in
+this chain.
+@item frch_last
+Points to the last frag in the chain, or null if there are none.
+@item frch_next
+Next in the list of @code{frchainS} structures.
+@item frch_seg
+Indicates the section this frag chain belongs to.
+@item frch_subseg
+Subsection (subsegment) number of this frag chain.
+@item fix_root, fix_tail
+(Defined only if @code{BFD_ASSEMBLER} is defined.) Point to first and last
+@code{fixS} structures associated with this subsection.
+@item frch_obstack
+Not currently used. Intended to be used for frag allocation for this
+subsection. This should reduce frag generation caused by switching sections.
+@end table
+
+A @code{frchainS} corresponds to a subsection; each section has a list of
+@code{frchainS} records associated with it. In most cases, only one subsection
+of each section is used, so the list will only be one element long, but any
+processing of frag chains should be prepared to deal with multiple chains per
+section.
+
+After the input files have been completely processed, and no more frags are to
+be generated, the frag chains are joined into one per section for further
+processing. After this point, it is safe to operate on one chain per section.
+
+@node Broken Words
+@subsection Broken Words
@cindex internals, broken words
@cindex broken words
@cindex promises, promises
+The ``broken word'' idea derives from the fact that some compilers, including
+@code{gcc}, will sometimes emit switch tables specifying 16-bit @code{.word}
+displacements to branch targets, and branch instructions that load entries from
+that table to compute the target address. If this is done on a 32-bit machine,
+there is a chance (at least with really large functions) that the displacement
+will not fit in 16 bits. Thus the ``broken word'' idea is well named, since
+there is an implied promise that the 16-bit field will in fact hold the
+specified displacement.
+
+If the ``broken word'' processing is enabled, and a situation like this is
+encountered, the assembler will insert a jump instruction into the instruction
+stream, close enough to be reached with the 16-bit displacement. This jump
+instruction will transfer to the real desired target address. Thus, as long as
+the @code{.word} value really is used as a displacement to compute an address
+to jump to, the net effect will be correct (minus a very small efficiency
+cost). If @code{.word} directives with label differences for values are used
+for other purposes, however, things may not work properly. I think there is a
+command-line option to turn on warnings when a broken word is discovered.
+
+This code is turned off by the @code{WORKING_DOT_WORD} macro. It isn't needed
+if @code{.word} emits a value large enough to contain an address (or, more
+correctly, any possible difference between two addresses).
+
@node What Happens?
@section What Happens?
-Blah blah blah, initialization, argument parsing, file reading,
-whitespace munging, opcode parsing and lookup, operand parsing. Now
-it's time to write the output file.
+Blah blah blah, initialization, argument parsing, file reading, whitespace
+munging, opcode parsing and lookup, operand parsing. Now it's time to write
+the output file.
In @code{BFD_ASSEMBLER} mode, processing of relocations and symbols and
-creation of the output file is initiated by calling
-@code{write_object_file}.
+creation of the output file is initiated by calling @code{write_object_file}.
@node Target Dependent Definitions
@section Target Dependent Definitions
-@subheader Format-specific definitions
+@subsection Format-specific definitions
+
+@defmac obj_sec_sym_ok_for_reloc (section)
+(@code{BFD_ASSEMBLER} only.)
+Is it okay to use this section's section-symbol in a relocation entry? If not,
+a new internal-linkage symbol is generated and emitted if such a relocation
+entry is needed. (Default: Always use a new symbol.)
+
+@end defmac
-@defmac obj_sec_sym_ok_for_reloc section
+@defmac obj_adjust_symtab
(@code{BFD_ASSEMBLER} only.)
-Is it okay to use this section's section-symbol in a relocation entry?
-If not, a new internal-linkage symbol is generated and emitted if such a
-relocation entry is needed. (Default: Always use a new symbol.)
+If this macro is defined, it is invoked just before setting the symbol table of
+the output BFD. Any finalizing changes needed in the symbol table should be
+done here. For example, in the COFF support, if there is no @code{.file}
+symbol defined already, one is generated at this point. If no such adjustments
+are needed, this macro need not be defined.
+
+@end defmac
@defmac EMIT_SECTION_SYMBOLS
(@code{BFD_ASSEMBLER} only.)
Should section symbols be included in the symbol list if they're used in
-relocations? Some formats can generate section-relative relocations,
-and thus don't need
-(Default: 1.)
+relocations? Some formats can generate section-relative relocations, and thus
+don't need symbols emitted for them. (Default: 1.)
+@end defmac
+
+@defmac obj_frob_file
+Any final cleanup needed before writing out the BFD may be done here. For
+example, ECOFF formats (and MIPS ELF format) may do some work on the MIPS-style
+symbol table with its integrated debug information. The symbol table should
+not be modified at this time.
+@end defmac
+
+@subsection CPU-specific definitions
+
+@node Relaxation
+@subsubsection Relaxation
+@cindex Relaxation
+
+If @code{md_relax_frag} isn't defined, the assembler will perform some
+relaxation on @code{rs_machine_dependent} frags based on the frag subtype and
+the displacement to some specified target address. The basic idea is that many
+machines have different addressing modes for instructions that can specify
+different ranges of values, with successive modes able to access wider ranges,
+including the entirety of the previous range. Smaller ranges are assumed to be
+more desirable (perhaps the instruction requires one word instead of two or
+three); if this is not the case, don't describe the smaller-range, inferior
+mode.
+
+The @code{fr_subtype} and the field of a frag is an index into a CPU-specific
+relaxation table. That table entry indicates the range of values that can be
+stored, the number of bytes that will have to be added to the frag to
+accomodate the addressing mode, and the index of the next entry to examine if
+the value to be stored is outside the range accessible by the current
+addressing mode. The @code{fr_symbol} field of the frag indicates what symbol
+is to be accessed; the @code{fr_offset} field is added in.
+
+If the @code{fr_pcrel_adjust} field is set, which currently should only happen
+for the NS32k family, the @code{TC_PCREL_ADJUST} macro is called on the frag to
+compute an adjustment to be made to the displacement.
+
+The value fitted by the relaxation code is always assumed to be a displacement
+from the current frag. (More specifically, from @code{fr_fix} bytes into the
+frag.) This seems kinda silly. What about fitting small absolute values? I
+suppose @code{md_assemble} is supposed to take care of that, but if the operand
+is a difference between symbols, it might not be able to, if the difference was
+not computable yet.
+
+The end of the relaxation sequence is indicated by a ``next'' value of 0. This
+is kinda silly too, since it means that the first entry in the table can't be
+used. I think -1 would make a more logical sentinel value.
+
+The table @code{md_relax_table} from @file{targ-cpu.c} describes the relaxation
+modes available. Currently this must always be provided, even on machines for
+which this type of relaxation isn't possible or practical. Probably fewer than
+half the machines gas supports used it; it ought to be made conditional on some
+CPU-specific macro. Currently, also that table must be declared ``const;'' on
+some machines, though, it might make sense to keep it writeable, so it can be
+modified depending on which CPU of a family is specified. For example, in the
+m68k family, the 68020 has some addressing modes that are not available on the
+68000.
+
+The relaxation table type contains these fields:
+
+@table @code
+@item long rlx_forward
+Forward reach, must be non-negative.
+@item long rlx_backward
+Backward reach, must be zero or negative.
+@item rlx_length
+Length in bytes of this addressing mode.
+@item rlx_more
+Index of the next-longer relax state, or zero if there is no ``next''
+relax state.
+@end table
+
+The relaxation is done in @code{relax_segment} in @file{write.c}. The
+difference in the length fields between the original mode and the one finally
+chosen by the relaxing code is taken as the size by which the current frag will
+be increased in size. For example, if the initial relaxing mode has a length
+of 2 bytes, and because of the size of the displacement, it gets upgraded to a
+mode with a size of 6 bytes, it is assumed that the frag will grow by 4 bytes.
+(The initial two bytes should have been part of the fixed portion of the frag,
+since it is already known that they will be output.) This growth must be
+effected by @code{md_convert_frag}; it should increase the @code{fr_fix} field
+by the appropriate size, and fill in the appropriate bytes of the frag.
+(Enough space for the maximum growth should have been allocated in the call to
+frag_var as the second argument.)
+
+If relocation records are needed, they should be emitted by
+@code{md_estimate_size_before_relax}.
+
+These are the machine-specific definitions associated with the relaxation
+mechanism:
+
+@deftypefun int md_estimate_size_before_relax (fragS *@var{frag}, segT @var{sec})
+This function should examine the target symbol of the supplied frag and correct
+the @code{fr_subtype} of the frag if needed. When this function is called, if
+the symbol has not yet been defined, it will not become defined later; however,
+its value may still change if the section it is in gets relaxed.
+
+Usually, if the symbol is in the same section as the frag (given by the
+@var{sec} argument), the narrowest likely relaxation mode is stored in
+@code{fr_subtype}, and that's that.
+
+If the symbol is undefined, or in a different section (and therefore moveable
+to an arbitrarily large distance), the largest available relaxation mode is
+specified, @code{fix_new} is called to produce the relocation record,
+@code{fr_fix} is increased to include the relocated field (remember, this
+storage was allocated when @code{frag_var} was called), and @code{frag_wane} is
+called to convert the frag to an @code{rs_fill} frag with no variant part.
+Sometimes changing addressing modes may also require rewriting the instruction.
+It can be accessed via @code{fr_opcode} or @code{fr_fix}.
+
+Sometimes @code{fr_var} is increased instead, and @code{frag_wane} is not
+called. I'm not sure, but I think this is to keep @code{fr_fix} referring to
+an earlier byte, and @code{fr_subtype} set to @code{rs_machine_dependent} so
+that @code{md_convert_frag} will get called.
+@end deftypefun
+
+@deftypevar relax_typeS md_relax_table []
+This is the table.
+@end deftypevar
+
+@defmac md_relax_frag (@var{frag})
+
+This macro, if defined, overrides all of the processing described above. It's
+only defined for the MIPS target CPU, and there it doesn't do anything; it's
+used solely to disable the relaxing code and free up the @code{fr_subtype}
+field for use by the CPU-specific code.
+
+@end defmac
+
+@defmac tc_frob_file
+Like @code{obj_frob_file}, this macro handles miscellaneous last-minute
+cleanup. Currently only used on PowerPC/POWER support, for setting up a
+@code{.debug} section. This macro should not cause the symbol table to be
+modified.
+
+@end defmac
@node Source File Summary
@section Source File Summary
-The code in the @file{obj-coff} back end assumes @code{BFD_ASSEMBLER} is
-defined; the code in @file{obj-coffbfd} uses @code{BFD},
-@code{BFD_HEADERS}, and @code{MANY_SEGMENTS}, but does a lot of the file
-positioning itself. This confusing situation arose from the history of
-the code.
+@subsection File Format Descriptions
+
+@subheading a.out
+
+The @code{a.out} format is described by @file{obj-aout.*}.
+
+@subheading b.out
+
+The @code{b.out} format, described by @file{obj-bout.*}, is similar to
+@code{a.out} format, except for a few additional fields in the file header
+describing section alignment and address.
+
+@subheading COFF
Originally, @file{obj-coff} was a purely non-BFD version, and
-@file{obj-coffbfd} was created to use BFD for low-level byte-swapping.
-When the @code{BFD_ASSEMBLER} conversion started, the first COFF target
-to be converted was using @file{obj-coff}, and the two files had
-diverged somewhat, and I didn't feel like first converting the support
-of that target over to use the low-level BFD interface.
+@file{obj-coffbfd} was created to use BFD for low-level byte-swapping. When
+the @code{BFD_ASSEMBLER} conversion started, the first COFF target to be
+converted was using @file{obj-coff}, and the two files had diverged somewhat,
+and I didn't feel like first converting the support of that target over to use
+the low-level BFD interface.
+
+So @file{obj-coff} got converted, and to simplify certain things,
+@file{obj-coffbfd} got ``merged'' in with a brute-force approach.
+Specifically, preprocessor conditionals testing for @code{BFD_ASSEMBLER}
+effectively split the @file{obj-coff} files into the two separate versions. It
+isn't pretty. They will be merged more thoroughly, and eventually only the
+higher-level interface will be used.
+
+@subheading ECOFF
+
+All ECOFF configurations use BFD for writing object files.
+
+@subheading ELF
+
+ELF is a fairly reasonable format, without many of the deficiencies the other
+object file formats have. (It's got some of its own, but not as bad as the
+others.) All ELF configurations use BFD for writing object files.
+
+@subheading EVAX
+
+This is the format used on VMS. Yes, someone has actually written BFD support
+for it. The code hasn't been integrated yet though.
+
+@subheading HP300?
+
+@subheading IEEE?
+
+@subheading SOM
+
+@subheading XCOFF
+
+The XCOFF configuration is based on the COFF cofiguration (using the
+higher-level BFD interface). In fact, it uses the same files in the assembler.
+
+@subheading VMS
+
+This is the old Vax VMS support. It doesn't use BFD.
+
+@subsection Processor Descriptions
-Currently, all COFF targets use one of the two BFD interfaces, so the
-non-BFD code can be removed. Eventually, all should be converted to
-using one COFF back end, which uses the high-level BFD interface.
+Foo: a29k, alpha, h8300, h8500, hppa, i386, i860, i960, m68k, m88k, mips,
+ns32k, ppc, sh, sparc, tahoe, vax, z8k.
+
+@node M68k
+@subsubsection M68k
+
+The operand syntax handling is atrocious. There is no clear specification of
+the operand syntax. I'm looking into using a Bison grammar to replace much of
+it.
+
+Operands on the 68k series processors can have two displacement values
+specified, plus a base register and a (possibly scaled) index register of which
+only some bits might be used. Thus a single 68k operand requires up to two
+expressions, two register numbers, and size and scale factors. The
+@code{struct m68k_op} type also includes a field indicating the mode of the
+operand, and an @code{error} field indicating a problem encountered while
+parsing the operand.
+
+An instruction on the 68k may have up to 6 operands, although most of them have
+to be simple register operands. Up to 11 (16-bit) words may be required to
+express the instruction.
+
+A @code{struct m68k_exp} expression contains an @code{expressionS}, pointers to
+the first and last characters of the input that produced the expression, an
+indication of the section to which the expression belongs, and a size field.
+I'm not sure what the size field describes.
+
+@subsubheading M68k addressing modes
+
+Many instructions used the low six bits of the first instruction word to
+describe the location of the operand, or how to compute the location. The six
+bits are typically split into three for a ``mode'' and three for a ``register''
+value. The interpretation of these values is as follows:
+
+@example
+Mode Register Operand addressing mode
+0 Dn data register
+1 An address register
+2 An indirect
+3 An indirect, post-increment
+4 An indirect, pre-decrement
+5 An indirect with displacement
+6 An indirect with optional displacement and index;
+ may involve multiple indirections and two
+ displacements
+7 0 16-bit address follows
+7 1 32-bit address follows
+7 2 PC indirect with displacement
+7 3 PC indirect with optional displacements and index
+7 4 immediate 16- or 32-bit
+7 5,6,7 Reserved
+@end example
+
+On the 68000 and 68010, support for modes 6 and 7.3 are incomplete; the
+displacement must fit in 8 bits, and no scaling or index suppression is
+permitted.
+
+@subsubheading M68k relaxation modes
+
+The relaxation modes used on the 68k are:
+
+@table @code
+@item ABRANCH
+Case @samp{g} except when @code{BCC68000} is applicable.
+@item FBRANCH
+Coprocessor branches.
+@item PCREL
+Mode 7.2 -- program counter indirect with 16-bit displacement. This is
+available on all processors. Widens to 32-bit absolute. Used only if the
+original code used @code{ABSL} mode, and the CPU is not a 68000 or 68010.
+(Why? Those processors support mode 7.2.)
+@item BCC68000
+A conditional branch instruction, on the 68000 or 68010. These instructions
+support only 16-bit displacements on these processors. If a larger
+displacement is needed, the condition is negated and turned into a short branch
+around a jump instruction to the specified target. This jump will have an
+long absolute addressing mode.
+@item DBCC
+Like @code{BCC68000}, but for @code{dbCC} (decrement and branch on condition)
+instructions.
+@item PCLEA
+Not currently used?? Short form is mode 7.2 (program counter indirect, 16-bit
+displacement); long form is 7.3/0x0170 (program counter indirect, suppressed
+index register, 32-bit displacement). Used in progressive-930331 for mode
+@code{AOFF} with a PC-relative addressing mode and a displacement that won't
+fit in 16 bits, or which is variable and is not specified to have a size other
+than long.
+@item PCINDEX
+Newly added. PC indirect with index. An 8-bit displacement is supported on
+the 68000 and 68010, wider displacements on later processors.
+
+Well, actually, I haven't added it yet. I need to soon, though. It fixes a
+bug reported by a customer.
+@end table
+
+@subsection ``Emulation'' Descriptions
+
+These are the @file{te-*.h} files.
@node Foo
@section Foo
@@ -182,12 +634,12 @@ using one COFF back end, which uses the high-level BFD interface.
@deftypefun int had_warnings (void)
@deftypefunx int had_errors (void)
-Returns non-zero if any warnings or errors, respectively, have been
-printed during this invocation.
+Returns non-zero if any warnings or errors, respectively, have been printed
+during this invocation.
@end deftypefun
-@deftypefun void as_perror (const char *@var{gripe}, const char *@var{filename})
+@deftypefun void as_perror (const char *@var{gripe}, const char *@var{filename})
Displays a BFD or system error, then clears the error status.
@@ -198,33 +650,77 @@ Displays a BFD or system error, then clears the error status.
@deftypefunx void as_bad (const char *@var{format}, ...)
@deftypefunx void as_fatal (const char *@var{format}, ...)
-These functions display messages about something amiss with the input
-file, or internal problems in the assembler itself. The current file
-name and line number are printed, followed by the supplied message,
-formatted using @code{vfprintf}, and a final newline.
+These functions display messages about something amiss with the input file, or
+internal problems in the assembler itself. The current file name and line
+number are printed, followed by the supplied message, formatted using
+@code{vfprintf}, and a final newline.
+
+An error indicated by @code{as_bad} will result in a non-zero exit status when
+the assembler has finished. Calling @code{as_fatal} will result in immediate
+termination of the assembler process.
@end deftypefun
@deftypefun void as_warn_where (char *@var{file}, unsigned int @var{line}, const char *@var{format}, ...)
@deftypefunx void as_bad_where (char *@var{file}, unsigned int @var{line}, const char *@var{format}, ...)
-These variants permit specification of the file name and line number,
-and are used when problems are detected when reprocessing information
-saved away when processing some earlier part of the file. For example,
-fixups are processed after all input has been read, but messages about
-fixups should refer to the original filename and line number that they
-are applicable to.
+These variants permit specification of the file name and line number, and are
+used when problems are detected when reprocessing information saved away when
+processing some earlier part of the file. For example, fixups are processed
+after all input has been read, but messages about fixups should refer to the
+original filename and line number that they are applicable to.
@end deftypefun
@deftypefun void fprint_value (FILE *@var{file}, valueT @var{val})
@deftypefunx void sprint_value (char *@var{buf}, valueT @var{val})
-These functions are helpful for converting a @code{valueT} value into
-printable format, in case it's wider than modes that @code{*printf} can
-handle. If the type is narrow enough, a decimal number will be
-produced; otherwise, it will be in hexadecimal (FIXME: currently without
-`0x' prefix). The value itself is not examined to make this
-determination.
+These functions are helpful for converting a @code{valueT} value into printable
+format, in case it's wider than modes that @code{*printf} can handle. If the
+type is narrow enough, a decimal number will be produced; otherwise, it will be
+in hexadecimal (FIXME: currently without `0x' prefix). The value itself is not
+examined to make this determination.
@end deftypefun
+
+@node Writing a new target
+@section Writing a new target
+
+@node Test suite
+@section Test suite
+@cindex test suite
+
+The test suite is kind of lame for most processors. Often it only checks to
+see if a couple of files can be assembled without the assembler reporting any
+errors. For more complete testing, write a test which either examines the
+assembler listing, or runs @code{objdump} and examines its output. For the
+latter, the TCL procedure @code{run_dump_test} may come in handy. It takes the
+base name of a file, and looks for @file{@var{file}.d}. This file should
+contain as its initial lines a set of variable settings in @samp{#} comments,
+in the form:
+
+@example
+ #@var{varname}: @var{value}
+@end example
+
+The @var{varname} may be @code{objdump}, @code{nm}, or @code{as}, in which case
+it specifies the options to be passed to the specified programs. Exactly one
+of @code{objdump} or @code{nm} must be specified, as that also specifies which
+program to run after the assembler has finished. If @var{varname} is
+@code{source}, it specifies the name of the source file; otherwise,
+@file{@var{file}.s} is used. If @var{varname} is @code{name}, it specifies the
+name of the test to be used in the @code{pass} or @code{fail} messages.
+
+The non-commented parts of the file are interpreted as regular expressions, one
+per line. Blank lines in the @code{objdump} or @code{nm} output are skipped,
+as are blank lines in the @code{.d} file; the other lines are tested to see if
+the regular expression matches the program output. If it does not, the test
+fails.
+
+Note that this means the tests must be modified if the @code{objdump} output
+style is changed.
+
+@bye
+@c Local Variables:
+@c fill-column: 79
+@c End: