\input texinfo
@parindent=0pt
@setfilename gld
@c @@setchapternewpage odd
@settitle GLD, The GNU linker
@titlepage
@title{gld}
@subtitle{The gnu loader}
@sp 1
@subtitle Second Edition---gld version 2.0
@subtitle January 1991
@vskip 0pt plus 1filll
Copyright @copyright{} 1991 Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions.

@author {Steve Chamberlain}
@author {Cygnus Support}
@author {steve@@cygnus.com}
@end titlepage

@node Top,,,
@comment node-name, next, previous, up
@ifinfo
This file documents the GNU linker gld.
@end ifinfo

@c chapter What does a linker do ?
@c chapter Command Language
@noindent
@chapter Overview

The @code{gld} command combines a number of object and archive files,
relocates their data and ties up symbol references. Often the last
step in building a new compiled program to run is a call to @code{gld}.

The @code{gld} command accepts Linker Command Language files in
a superset of AT&T's Link Editor Command Language syntax,
to provide explicit and total control over the linking process.

This version of @code{gld} uses the general purpose @code{bfd} libraries
to operate on object files. This allows @code{gld} to read and
write any of the formats supported by @code{bfd}; different
formats may be linked together to produce any available kind of object file.

Supported formats: @itemize @bullet @item Sun3 68k a.out @item IEEE-695 68k Object Module Format @item Oasys 68k Binary Relocatable Object File Format @item Sun4 sparc a.out @item 88k bcs coff @item i960 coff little endian @item i960 coff big endian @item i960 b.out little endian @item i960 b.out big endian @item s-records @end itemize When linking similar formats, @code{gld} maintains all debugging information. @chapter Command line options @example gld [ -Bstatic ] [ -D @var{datasize} ] [ -c @var{filename} ] [ -d ] | [ -dc ] | [ -dp ] [ -i ] [ -e @var{entry} ] [ -l @var{arch} ] [ -L @var{searchdir} ] [ -M ] [ -N | -n | -z ] [ -noinhibit-exec ] [ -r ] [ -S ] [ -s ] [ -f @var{fill} ] [ -T @var{textorg} ] [ -Tdata @var{dataorg} ] [ -t ] [ -u @var{sym}] [ -X ] [ -x ] [-o @var{output} ] @var{objfiles}@dots{} @end example Command-line options to GNU @code{gld} may be specified in any order, and may be repeated at will. For the most part, repeating an option with a different argument will either have no further effect, or override prior occurrences (those further to the left on the command line) of an option. The exceptions which may meaningfully be present several times are @code{-L}, @code{-l}, and @code{-u}. @var{objfiles} may follow, precede, or be mixed in with command-line options; save that an @var{objfiles} argument may not be placed between an option flag and its argument. Option arguments must follow the option letter without intervening whitespace, or be given as separate arguments immediately following the option that requires them. @table @code @item @var{objfiles}@dots{} The object files @var{objfiles} to be linked; at least one must be specified. @item -Bstatic This flag is accepted for command-line compatibility with the SunOS linker, but has no effect on @code{gld}. @item -c @var{commandfile} Directs @code{gld} to read linkage commands from the file @var{commandfile}. 
@item -D @var{datasize}
Use this option to specify a target size for the @code{data} segment of
your linked program. The option is only obeyed if @var{datasize} is
larger than the natural size of the program's @code{data} segment.

@var{datasize} must be an integer specified in hexadecimal.

@code{ld} will simply increase the size of the @code{data} segment,
padding the created gap with zeros, and reduce the size of the
@code{bss} segment to match.

@item -d
Force @code{ld} to assign space to common symbols
even if a relocatable output file is specified (@code{-r}).

@item -dc | -dp
These flags are accepted for command-line compatibility with the SunOS linker,
but have no effect on @code{gld}.

@item -e @var{entry}
Use @var{entry} as the explicit symbol for beginning execution of your
program, rather than the default entry point. If this symbol is
not specified, the symbol @code{start} is used as the entry address.
If there is no symbol called @code{start}, then the entry address
is set to the first address in the first output section
(usually the @samp{text} section).

@item -f @var{fill}
Sets the default fill pattern for ``holes'' in the output file to
the lowest two bytes of the expression specified.

@item -i
Produce an incremental link (same as option @code{-r}).

@item -l @var{arch}
Add an archive file @var{arch} to the list of files to link. This
option may be used any number of times. @code{ld} will search its
path-list for occurrences of @code{lib@var{arch}.a} for every @var{arch}
specified.
@c This also has a side effect of using the "c++ demangler" if we happen
@c to specify -llibg++.  Document?  pesch@@cygnus.com, 24jan91

@item -L @var{searchdir}
This command adds path @var{searchdir} to the list of paths that
@code{gld} will search for archive libraries. You may use this option
any number of times.
@c Should we make any attempt to list the standard paths searched
@c without listing?  When hacking on a new system I often want to know
@c this, but this may not be the place... it's not constant across
@c systems, of course, which is what makes it interesting.
@c pesch@@cygnus.com, 24jan91.

@item -M
@itemx -m
Print (to the standard output file) a link map---diagnostic information
about where symbols are mapped by @code{ld}, and information on global
common storage allocation.

@item -N
Specifies readable and writable @code{text} and @code{data} sections. If
the output format supports Unix style magic numbers, then @code{OMAGIC} is set.

@item -n
Sets the text segment to be read only, and @code{NMAGIC} is written
if possible.

@item -o @var{output}
@var{output} is a name for the program produced by @code{ld}; if this
option is not specified, the name @samp{a.out} is used by default.

@item -r
Generates relocatable output---i.e., generate an output file that can in
turn serve as input to @code{gld}. As a side effect, this option also
sets the output file's magic number to @code{OMAGIC}; see @samp{-N}. If this
option is not specified, an absolute file is produced.

@item -S
Omits debugger symbol information (but not all symbols) from the output file.

@item -s
Omits all symbol information from the output file.

@item -T @var{textorg}
@itemx -Ttext @var{textorg}
Use @var{textorg} as the starting address for the @code{text} segment of the
output file. Both forms of this option are equivalent. The option
argument must be a hexadecimal integer.

@item -Tdata @var{dataorg}
Use @var{dataorg} as the starting address for the @code{data} segment of
the output file. The option argument must be a hexadecimal integer.

@item -t
Prints names of input files as @code{ld} processes them.

@item -u @var{sym}
Forces @var{sym} to be entered in the output file as an undefined symbol.
This may, for example, trigger linking of additional modules from
standard libraries. @code{-u} may be repeated with different option
arguments to enter additional undefined symbols.
This option is equivalent to the @code{EXTERN} linker command.

@item -X
If @code{-s} or @code{-S} is also specified, delete only local symbols
beginning with @samp{L}.

@item -z
@code{-z} sets @code{ZMAGIC}, the default: the @code{text} segment is
read-only, demand pageable, and shared.

Specifying a relocatable output file (@code{-r}) will also set the magic
number to @code{OMAGIC}.

See description of @samp{-N}.

@end table

@chapter Command Language

The command language allows explicit control over the linkage process,
allowing specification of:
@table @bullet
@item input files
@item file formats
@item output file format
@item addresses of sections
@item placement of common blocks
@item and more
@end table

A command file may be supplied to the linker, either explicitly through the
@code{-c} option, or implicitly as an ordinary file. If the linker opens
a file which does not have a reasonable object or archive format, it tries
to read the file as if it were a command file.

@section Structure
To be added

@section Expressions
The syntax for expressions in the command language is identical to that of
C expressions, with the following features:
@table @bullet
@item All expressions are evaluated as integers and
are of ``long'' or ``unsigned long'' type.
@item All constants are integers.
@item All of the C arithmetic operators are provided.
@item Global variables may be referenced, defined and created.
@item Built in functions may be called.
@end table

The linker has a practice of ``lazy evaluation'' for expressions; it only
calculates an expression when absolutely necessary. For instance,
when the linker reads in the command file it has to know the values
of the start address and the length of the memory regions for linkage to continue, so these
values are worked out, but other values (such as symbol values) are not
known or needed until after storage allocation.
They are evaluated later, when the other information, such as the sizes of
output sections, is available for use in the symbol assignment
expression.

When a linker expression is evaluated and assigned to a variable it is given
either an absolute or a relocatable type. An absolute expression type
is one in which the symbol contains the value that it will have in the
output file; a relocatable expression type is one in which the value
is expressed as a fixed offset from the base of a section.

The type of the expression is controlled by its position in the script
file. A symbol assigned within a @code{SECTIONS} specification is
created relative to the base of the section; a symbol assigned in any
other place is created as an absolute symbol. Since a symbol created
within a @code{SECTIONS} specification is relative to the base of the
section it will remain relocatable if relocatable output is
requested. A symbol may be created with an absolute value even when
assigned to within a @code{SECTIONS} specification by using the
absolute assignment function @code{ABSOLUTE}. For example, to create an
absolute symbol whose address is the last byte of the output section
@code{.data}:
@example
.data :
        @{
                *(.data)
                _edata = ABSOLUTE(.) ;
        @}
@end example

Unless quoted, symbol names start with a letter, underscore, point or
minus sign and may include any letters, underscores, digits, points,
and minus signs. Unquoted symbol names must not conflict with any
keywords. To specify a symbol which contains odd characters or has
the same name as a keyword, surround it in double quotes:
@example
        "SECTIONS" = 9;
        "with a space" = "also with a space" + 10;
@end example

@subsection Integers
An octal integer is @samp{0} followed by zero or more of the octal
digits (@samp{01234567}).

A decimal integer starts with a non-zero digit followed by zero or
more digits (@samp{0123456789}).

A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or
more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}.

Integers have the usual values. To denote a negative integer, use
the unary operator @samp{-} discussed under expressions.

Additionally the suffixes @code{K} and @code{M} may be used to multiply
the previous constant by 1024 or 1024*1024 respectively.

@example
        _as_decimal = 57005;
        _as_hex = 0xdead;
        _as_octal = 0157255;

        _4k_1 = 4K;
        _4k_2 = 4096;
        _4k_3 = 0x1000;
@end example

@subsection Operators
The linker provides the standard C set of arithmetic operators, with
the standard bindings and precedence levels:

@tex
\vbox{\offinterlineskip
\hrule
\halign
{\vrule#&\hfil#\hfil&\vrule#&\hfil#\hfil&\vrule#&\hfil#\hfil&\vrule#\cr
height2pt&&&&&\cr
&Level&& associativity &&Operators&\cr
height2pt&&&&&\cr
\noalign{\hrule}
height2pt&&&&&\cr
&highest&&&&&\cr
&1&&left&&$ ! - ~$&\cr
height2pt&&&&&\cr
&2&&left&&* / \%&\cr
height2pt&&&&&\cr
&3&&left&&+ -&\cr
height2pt&&&&&\cr
&4&&left&&$>> <<$&\cr
height2pt&&&&&\cr
&5&&left&&$== != > < <= >=$&\cr
height2pt&&&&&\cr
&6&&left&&\&&\cr
height2pt&&&&&\cr
&7&&left&&|&\cr
height2pt&&&&&\cr
&8&&left&&{\&\&}&\cr
height2pt&&&&&\cr
&9&&left&&||&\cr
height2pt&&&&&\cr
&10&&right&&? :&\cr
height2pt&&&&&\cr
&11&&right&&{\&= += -= *= /=}&\cr
&lowest&&&&&\cr
height2pt&&&&&\cr}
\hrule}
@end tex

@section Built in Functions
The command language provides built in functions for use in
expressions in linkage scripts.
@table @bullet
@item @code{ALIGN(@var{exp})}
Returns the current location counter (@code{dot}) aligned upward to
the next @var{exp} boundary, where @var{exp} is a power of two. This
is equivalent to @code{(. + @var{exp} -1) & ~(@var{exp}-1)}.

As an example, to align the output @code{.data} section to the
next 0x2000 byte boundary after the preceding section and to set a
variable within the section to the next 0x8000 boundary after the
input sections:
@example
        .data ALIGN(0x2000) :@{
                *(.data)
                variable = ALIGN(0x8000);
        @}
@end example

@item @code{ADDR(@var{section name})}
Returns the absolute address of the named section if the section has
already been bound. In the following example, @code{symbol_1} and
@code{symbol_2} are assigned identical values:
@example
        .output1:
                @{
                start_of_output_1 = .;
                ...
                @}
        .output:
                @{
                symbol_1 = ADDR(.output1);
                symbol_2 = start_of_output_1;
                @}
@end example

@item @code{SIZEOF(@var{section name})}
Returns the size in bytes of the named section, if the section has
been allocated. In the following example, @code{symbol_1} and
@code{symbol_2} are assigned identical values:
@example
        .output : @{
                .start = . ;
                ...
                .end = .;
                @}
        symbol_1 = .end - .start;
        symbol_2 = SIZEOF(.output);
@end example

@item @code{DEFINED(@var{symbol name})}
Returns 1 if the symbol is in the linker global symbol table and is
defined, otherwise it returns 0. This example shows the setting of a
global symbol @code{begin} to the first location in the @code{.text}
section, only if there is no other symbol
called @code{begin} already:
@example
        .text: @{
                begin = DEFINED(begin) ? begin : . ;
                ...
        @}
@end example
@end table

@page
@section MEMORY Directive
The linker's default configuration is for all memory to be
allocatable. This state may be overridden by using the @code{MEMORY}
directive. The @code{MEMORY} directive describes the location and
size of blocks of memory in the target. Careful use can describe
memory regions which may or may not be used by the linker. The linker
does not shuffle sections to fit into the available regions, but does
move the requested sections into the correct regions and issues errors
when the regions become too full.

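The region behaviour just described (sections go only into the region
they were directed to, and an error is reported once that region is full)
can be pictured with a short C sketch. This is an illustration under
stated assumptions, not @code{gld}'s implementation: the type
@code{struct region} and the function @code{region_alloc} are
hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one memory region; not gld's own data structure. */
struct region {
    const char *name;      /* region name, e.g. "rom" */
    unsigned long origin;  /* start address of the region */
    unsigned long length;  /* size of the region in bytes */
    unsigned long used;    /* bytes already allocated in the region */
};

/*
 * Place `size` bytes at the next free address of region `r`.
 * On success the address is stored in *addr and 0 is returned.
 * If the region is too full, -1 is returned and nothing changes;
 * no other region is tried, mirroring the "no shuffling" rule.
 */
int region_alloc(struct region *r, unsigned long size, unsigned long *addr)
{
    assert(r != NULL && addr != NULL);
    if (size > r->length - r->used)
        return -1;              /* the caller would report an error here */
    *addr = r->origin + r->used;
    r->used += size;
    return 0;
}
```

A caller would invoke @code{region_alloc} once per output section directed
to the region and report an error whenever it returns -1.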
The syntax is:

@example
MEMORY @{
@tex
$\bigl\lbrace {\it name_1} ({\it attr_1}):$ ORIGIN =
${\it origin_1},$ LENGTH $= {\it len_1} \bigr\rbrace $
@end tex
@}
@end example
@table @code
@item @var{name}
is a name used internally by the linker to refer to the region. Any
symbol name may be used. The region names are stored in a separate
name space, and will not conflict with symbols, filenames or section
names.
@item @var{attr}
is an optional list of attributes, parsed for compatibility with the
AT&T linker but ignored by both the AT&T and the GNU linkers.
@item @var{origin}
is the start address of the region in physical memory expressed as
a standard linker expression which must evaluate to a constant before
memory allocation is performed. The keyword @code{ORIGIN} may be
abbreviated to @code{org} or @code{o}.
@item @var{len}
is the size in bytes of the region as a standard linker expression.
The keyword @code{LENGTH} may be abbreviated to @code{len} or @code{l}.
@end table

For example, to specify that memory has two regions available for
allocation: one starting at 0 for 256k, and the other starting at
0x40000000 for four megabytes:

@example
MEMORY @{
        rom : ORIGIN= 0, LENGTH = 256K
        ram : ORIGIN= 0x40000000, LENGTH = 4M
@}
@end example

If the combined output sections directed to a region are too big for
the region the linker will emit an error message.

@page
@section SECTIONS Directive
The @code{SECTIONS} directive
controls exactly where input sections are placed into output sections, their
order and to which output sections they are allocated.

When no @code{SECTIONS} directives are specified, the default action
of the linker is to place each input section into an identically named
output section in the order that the sections appear in the first
file, and then the order of the files.

The syntax of the @code{SECTIONS} directive is:

@example
SECTIONS @{
@tex
$\bigl\lbrace {\it name_n}\bigl[options\bigr]\colon$ $\bigl\lbrace {\it statements_n} \bigr\rbrace \bigl[ = {\it fill expression } \bigr] \bigl[ > mem spec \bigr] \bigr\rbrace $
@end tex
@}
@end example

@table @code
@item @var{name}
controls the name of the output section. In formats which only support
a limited number of sections, such as @code{a.out}, the name must be
one of the names supported by the format (in the case of a.out,
@code{.text}, @code{.data} or @code{.bss}). If the output format
supports any number of sections, but with numbers and not names (in
the case of IEEE), the name should be supplied as a quoted numeric
string. A section name may consist of any sequence of characters, but
any name which does not conform to the standard @code{gld} symbol name
syntax must be quoted. To copy sections 1 through 4 from an Oasys file
into the @code{.text} section of an @code{a.out} file, and sections 13
and 14 into the @code{.data} section:
@example
SECTIONS @{
        .text :@{
                *("1" "2" "3" "4")
        @}

        .data :@{
                *("13" "14")
        @}
@}
@end example

@item @var{fill expression}
If present this expression sets the fill value. Any unallocated holes
in the current output section when written to the output file will be
filled with the two least significant bytes of the value, repeated as
necessary.
@page
@item @var{options}
The @var{options} parameter is a list of optional arguments specifying
attributes of the output section. They may be taken from the following
list:
@table @bullet{}
@item @var{addr expression}
forces the output section to be loaded at a specified address. The
address is specified as a standard linker expression. The following
example generates section @var{output} at location
@code{0x40000000}:
@example
SECTIONS @{
        output 0x40000000: @{
                ...
        @}
@}
@end example
Since the built in function @code{ALIGN} references the location
counter implicitly, a section may be located on a certain boundary by
using the @code{ALIGN} function in the expression. For example, to
locate the @code{.data} section on the next 8k boundary after the end
of the @code{.text} section:
@example
SECTIONS @{
        .text : @{
                ...
        @}
        .data ALIGN(8K) : @{
                ...
        @}
@}
@end example
@end table

@item @var{statements}
is a list of file names, input sections and assignments. These
statements control what is placed into the output section.
The syntax of a single @var{statement} is one of:
@table @bullet

@item @var{symbol} [ = | += | -= | *= | /= ] @var{expression} @code{;}

Global symbols may be created and have their values (addresses)
altered using the assignment statement. The linker tries to put off
the evaluation of an assignment until all the terms in the source
expression are known; for instance the sizes of sections cannot be
known until after allocation, so assignments dependent upon these are
not performed until after allocation. Some expressions, such as those
depending upon the location counter @code{dot} (@samp{.}), must be
evaluated during allocation. If the result of an expression is
required, but the value is not available, then an error results. For
example:
@example
SECTIONS @{
        text 9+this_isnt_constant: @{
        @}
@}
testscript:21: Non constant expression for initial address
@end example

@item @code{CREATE_OBJECT_SYMBOLS}
causes the linker to create a symbol for each input file and place it
into the specified section, with its value set to the address of the
first byte of data written from the input file. For instance, with
@code{a.out} files it is conventional to have a symbol for each input
file.
@example
SECTIONS @{
        .text 0x2020 :
                 @{
                CREATE_OBJECT_SYMBOLS
                *(.text)
                _etext = ALIGN(0x2000);
                @}
@}
@end example
Supplied with four object files, @code{a.o}, @code{b.o}, @code{c.o},
and @code{d.o} a run of
@code{gld} could create a map:
@example
From functions like :
a.c:
        afunction() @{ @}
        int adata=1;
        int abss;

00000000 A __DYNAMIC
00004020 B _abss
00004000 D _adata
00002020 T _afunction
00004024 B _bbss
00004008 D _bdata
00002038 T _bfunction
00004028 B _cbss
00004010 D _cdata
00002050 T _cfunction
0000402c B _dbss
00004018 D _ddata
00002068 T _dfunction
00004020 D _edata
00004030 B _end
00004000 T _etext
00002020 t a.o
00002038 t b.o
00002050 t c.o
00002068 t d.o
@end example

@item @var{filename} @code{(} @var{section name list} @code{)}
This command allocates all the named sections from the input object
file supplied into the output section at the current point. Sections
are written in the order they appear in the list so:
@example
SECTIONS @{
        .text 0x2020 :
                @{
                a.o(.data)
                b.o(.data)
                *(.text)
                @}
        .data :
                @{
                *(.data)
                @}
        .bss :
                @{
                *(.bss)
                COMMON
                @}
@}
@end example
will produce a map:
@example

        insert here
@end example
@item @code{* (} @var{section name list} @code{)}
This command causes all sections from all input files which have not
yet been assigned output sections to be assigned the current output
section.

@item @var{filename} @code{[COMMON]}
This allocates all the common symbols from the specified file and places
them into the current output section.

@item @code{* [COMMON]}
This allocates all the common symbols from the files which have not
yet had their common symbols allocated and places them into the current
output section.

@item @var{filename}
A filename alone within a @code{SECTIONS} statement will cause all the
input sections from the file to be placed into the current output
section at the current location. If the file name has been mentioned
before with a section name list then only those
sections which have not yet been allocated are noted.

The following example reads all of the sections from file @code{all.o}
and places them at the start of output section @code{outputa} which
starts at location @code{0x10000}. All of the data from section
@code{.input1} from file @code{foo.o} is placed next into the same
output section. All of section @code{.input2} is read from
@code{foo.o} and placed into output section @code{outputb}. Next all
of section @code{.input1} is read from @code{foo1.o}. All of the
remaining @code{.input1} and @code{.input2} sections from any files
are written to output section @code{outputc}.

@example
SECTIONS
@{
        outputa 0x10000 :
                @{
                all.o
                foo.o (.input1)
                @}
        outputb :
                @{
                foo.o (.input2)
                foo1.o (.input1)
                @}
        outputc :
                @{
                *(.input1)
                *(.input2)
                @}
@}
@end example
@end table
@end table

@section Using the Location Counter
The special linker variable @code{dot} (@samp{.}) always contains the
current output location counter. Since the @code{dot} always refers to
a location in an output section, it must always appear in an
expression within a @code{SECTIONS} directive. The @code{dot} symbol
may appear anywhere that an ordinary symbol may appear in an
expression, but its assignments have a side effect. Assigning a value
to the @code{dot} symbol will cause the location counter to be moved.
This may be used to create holes in the output section. The location
counter may never be moved backwards.
@example
SECTIONS
@{
        output :
        @{
        file1(.text)
        . = . + 1000;
        file2(.text)
        . += 1000;
        file3(.text)
        . -= 32;
        file4(.text)
        @} = 0x1234;
@}
@end example
In the previous example, @code{file1} is located at the beginning of
the output section, then there is a 1000 byte gap, filled with 0x1234.
Then @code{file2} appears, also with a 1000 byte gap following before
@code{file3} is loaded. Then the first 32 bytes of @code{file4} are
placed over the last 32 bytes of @code{file3}.

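The way a forward move of the location counter leaves a hole that is
filled with the two low bytes of the fill expression can be sketched in
C. This is only a model under stated assumptions, not @code{gld}'s code:
the names @code{struct out_section}, @code{emit} and @code{advance_dot}
are hypothetical, the byte order of the pattern (high byte first) and
its restart at the beginning of each hole are assumptions, and only
forward moves are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an output section being written. */
struct out_section {
    unsigned char *buf;  /* bytes of the output section */
    size_t cap;          /* capacity of buf in bytes */
    size_t dot;          /* location counter, as an offset from the section base */
    unsigned int fill;   /* fill expression; only the low 16 bits are used */
};

/* Copy `len` bytes of input-section contents at dot and advance dot.
 * Returns 0 on success, -1 if the data does not fit. */
int emit(struct out_section *s, const unsigned char *data, size_t len)
{
    assert(s != NULL && data != NULL);
    if (len > s->cap - s->dot)
        return -1;
    memcpy(s->buf + s->dot, data, len);
    s->dot += len;
    return 0;
}

/* Model ". = . + gap": move dot forward and fill the hole it leaves.
 * Returns 0 on success, -1 if the hole does not fit. */
int advance_dot(struct out_section *s, size_t gap)
{
    size_t i;

    assert(s != NULL);
    if (gap > s->cap - s->dot)
        return -1;
    for (i = 0; i < gap; i++) {
        /* ASSUMPTION: high byte first, pattern restarts at each hole. */
        unsigned int b = (i % 2 == 0) ? (s->fill >> 8) : s->fill;
        s->buf[s->dot + i] = (unsigned char)(b & 0xffu);
    }
    s->dot += gap;
    return 0;
}
```

With a fill value of 0x1234, emitting two bytes, advancing by four and
emitting one more byte leaves the bytes 0x12 0x34 0x12 0x34 between the
two pieces of input data.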
@section Command Language Syntax

@section The Entry Point
The linker chooses the first executable instruction in an output file from a list
of possibilities, in order:
@itemize @bullet
@item
The value of the symbol provided to the command line with the @code{-e} option, when
present.
@item
The value of the symbol provided in the @code{ENTRY} directive,
if present.
@item
The value of the symbol @code{start}, if present.
@item
The value of the symbol @code{_main}, if present.
@item
The address of the first byte of the @code{.text} section, if present.
@item
The value 0.
@end itemize
If the symbol @code{start} is not defined within the set of input
files to a link, it may be generated by a simple assignment
expression. For example:
@example
        start = 0x2020;
@end example

@section Section Attributes
@section Allocation of Sections into Memory
@section Defining Symbols

@chapter Examples of operation
The simplest case is linking standard Unix object files on a standard
Unix system supported by the linker. To link a file @code{hello.o}:
@example
$ gld -o output /lib/crt0.o hello.o -lc
@end example
This tells gld to produce a file called @code{output} after linking
the file @code{/lib/crt0.o} with @code{hello.o} and the library
@code{libc.a} which will come from the standard search directories.

@chapter Partial Linking
Specifying @code{-r} on the command line causes @code{gld} to perform
a partial link.

@chapter BFD

The linker accesses object and archive files using the @code{bfd}
libraries. These libraries allow the linker to use the same routines
to operate on object files whatever the object file format.

A different object file format can be supported simply by creating a
new @code{bfd} back end and adding it to the library.

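The idea that one set of routines serves every object format, and that a
new format is supported by adding a back end, can be illustrated with a
table of function pointers. The C sketch below is an assumption-laden
illustration, not the real @code{bfd} interface: @code{struct backend},
@code{find_backend} and the two toy formats (@code{toy-a} and
@code{toy-b}) are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical back-end descriptor; NOT the real bfd API. */
struct backend {
    const char *name;
    /* Nonzero if the buffer looks like this back end's format. */
    int (*probe)(const unsigned char *buf, size_t len);
    /* Number of symbols recorded in a file of this format. */
    unsigned long (*symbol_count)(const unsigned char *buf, size_t len);
};

/* Toy format A: the byte 'A' followed by a one-byte symbol count. */
static int probe_a(const unsigned char *b, size_t n)
{
    return n >= 2 && b[0] == 'A';
}

static unsigned long count_a(const unsigned char *b, size_t n)
{
    (void)n;
    return b[1];
}

/* Toy format B: the bytes "BB" followed by a two-byte big-endian count. */
static int probe_b(const unsigned char *b, size_t n)
{
    return n >= 4 && b[0] == 'B' && b[1] == 'B';
}

static unsigned long count_b(const unsigned char *b, size_t n)
{
    (void)n;
    return ((unsigned long)b[2] << 8) | (unsigned long)b[3];
}

/* Supporting another format means adding one more entry here. */
static const struct backend backends[] = {
    { "toy-a", probe_a, count_a },
    { "toy-b", probe_b, count_b },
};

/* Return the first back end that recognises the file, or NULL. */
const struct backend *find_backend(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < sizeof backends / sizeof backends[0]; i++)
        if (backends[i].probe(buf, len))
            return &backends[i];
    return NULL;
}
```

The caller never inspects the file format itself; it calls
@code{find_backend} once and then works only through the returned
function pointers.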
Formats currently supported:
@itemize @bullet
@item
Sun3 68k a.out
@item
IEEE-695 68k Object Module Format
@item
Oasys 68k Binary Relocatable Object File Format
@item
Sun4 sparc a.out
@item
88k bcs coff
@item
i960 coff little endian
@item
i960 coff big endian
@item
i960 b.out little endian
@item
i960 b.out big endian
@end itemize

As with most implementations, @code{bfd} is a compromise between
several conflicting requirements. The major factor influencing
@code{bfd} design was efficiency; any time used converting between
formats is time which would not have been spent had @code{bfd} not
been involved. This is partly offset by abstraction payback; since
@code{bfd} simplifies applications and back ends, more time and care
may be spent optimizing algorithms for a greater speed.

One minor artifact of the @code{bfd} solution which the user should be
aware of is information loss. There are two places where useful
information can be lost using the @code{bfd} mechanism: during
conversion and during output.

@section How it works
When an object file is opened, @code{bfd}
tries to automatically determine the format of the input object file.
A descriptor is then built in memory with pointers to routines to
access elements of the object file's data structures.

As different information from the object files is required,
@code{bfd} reads from different sections of the file and processes
them. For example a very common operation for the linker is processing
symbol tables. Each @code{bfd} back end provides a routine for
converting between the object file's representation of symbols and an
internal canonical format. When the linker asks for the symbol table
of an object file, it calls through the memory pointer to the relevant
@code{bfd} back end routine which reads and converts the table into
the canonical form. The linker then operates upon the common form.
When the link is finished and the linker writes the symbol table of
the output file, another @code{bfd} back end routine is called which
takes the newly created symbol table and converts it into the output
format.

@section Information Leaks
@table @bullet{}
@item Information lost during output.
The output formats supported by @code{bfd} do not provide identical
facilities, and information which may be described in one form
has nowhere to go in another format. One example of this would be
alignment information in @code{b.out}. There is nowhere in an
@code{a.out} format file to store alignment information on the
contained data, so when a file is linked from @code{b.out} and an
@code{a.out} image is produced, alignment information is lost. (Note
that in this case the linker has the alignment information internally,
so the link is performed correctly).

Another example is COFF section names. COFF files may contain an
unlimited number of sections, each one with a textual section name. If
the target of the link is a format which does not have many sections
(for example @code{a.out}) or has sections without names (for example
the Oasys format) the link cannot be done simply. It is possible to
circumvent this problem by describing the desired input section to
output section mapping with the command language.

@item Information lost during canonicalization.
The @code{bfd}
internal canonical form of the external formats is not exhaustive;
there are structures in input formats for which there is no direct
representation internally. This means that the @code{bfd} back ends
cannot maintain all the data richness through the transformation
from external to internal and back to external formats.

This limitation is only a problem when using the linker to read one
format and write another.
Each @code{bfd} back end is responsible for
maintaining as much data as possible, and the internal @code{bfd}
canonical form has structures which are opaque to the @code{bfd} core,
and exported only to the back ends. When a file is read in one format,
the canonical form is generated for @code{bfd} and the linker. At the
same time, the back end saves away any information which may otherwise
be lost. If the data is then written back to the same back end, the
back end routine will be able to use the canonical form provided by
the @code{bfd} core as well as the information it prepared earlier.
Since there is a great deal of commonality between back ends, this
mechanism is very useful. There is no information lost when linking
big endian COFF to little endian COFF, or from a.out to b.out. When a
mixture of formats is linked, the information is only lost from the
files with a different format from the destination.
@end table

@section Mechanism
The least information is preserved when there is little overlap
between the information provided by the source format, that
stored by the canonical format and the information needed by the
destination format. A brief description of the canonical form will
help the user appreciate what is possible to be maintained between
conversions.

@table @bullet
@item file level
Information on target machine architecture, particular implementation
and format type are stored on a per file basis. Other information
includes a demand pageable bit and a write protected bit. Note that
information like Unix magic numbers is not stored here, only the magic
number's meaning, so a ZMAGIC file would have both the demand pageable
bit and the write protected text bit set.

The byte order of the target is stored on a per file basis, so that
both big and little endian object files may be linked together at the
same time.
@item section level
Each section in the input file contains the name of the section, the
original address in the object file, various flags, size and alignment
information and pointers into other @code{bfd} data structures.
@item symbol level
Each symbol contains a pointer to the object file which originally
defined it, its name, value and various flag bits. When a symbol table
is read in, all symbols are relocated to make them relative to the base
of the section they were defined in, so each symbol points to the
containing section. Each symbol also has a varying amount of hidden
data to contain private data for the back end. Since the symbol points
to the original file, the symbol private data format is accessible.
Operations may be done to a list of symbols of wildly different
formats without problems.

Normal global and simple local symbols are maintained on output, so an
output file, no matter the format, will retain symbols pointing to
functions, globals, statics and commons. Some symbol information is
not worth retaining; in @code{a.out} type information is stored in the
symbol table as long symbol names. This information would be useless
to most coff debuggers and may be thrown away with appropriate command
line switches. (Note that gdb does support stabs in coff).

There is one word of type information within the symbol, so if the
format supports symbol type information within symbols (for example
COFF, IEEE, Oasys) and the type is simple enough to fit within one
word (nearly everything but aggregates) the information will be
preserved.

@item relocation level
Each canonical relocation record contains a pointer to the symbol to
relocate to, the offset of the data to relocate, the section the data
is in and a pointer to a relocation type descriptor. Relocation is
performed effectively by message passing through the relocation type
descriptor and symbol pointer.
It allows relocations to be performed
on output data using a relocation method only available in one of the
input formats. For instance, Oasys provides a byte relocation format.
A relocation record requesting this relocation type would point
indirectly to a routine to perform this, so the relocation may be
performed on a byte being written to a COFF file, even though 68k COFF
has no such relocation type.

@item line numbers
Line numbers have to be relocated along with the symbol information.
Each symbol with an associated list of line number records points to
the first record of the list. The head of a line number list consists
of a pointer to the symbol, which allows the address of the function
whose line numbers are being described to be determined. The rest of
the list is made up of tuples of offsets into the section and line
indexes. Any format which can simply derive this information can pass
it without loss between formats (COFF, IEEE and Oasys).
@end table
@bye