diff options
Diffstat (limited to 'ld/ldint.texinfo')
-rw-r--r-- | ld/ldint.texinfo | 700 |
1 files changed, 0 insertions, 700 deletions
diff --git a/ld/ldint.texinfo b/ld/ldint.texinfo deleted file mode 100644 index 6df7c88..0000000 --- a/ld/ldint.texinfo +++ /dev/null @@ -1,700 +0,0 @@ -\input texinfo -@setfilename ldint.info -@c Copyright (C) 1992-2018 Free Software Foundation, Inc. - -@ifnottex -@dircategory Software development -@direntry -* Ld-Internals: (ldint). The GNU linker internals. -@end direntry -@end ifnottex - -@copying -This file documents the internals of the GNU linker ld. - -Copyright @copyright{} 1992-2018 Free Software Foundation, Inc. -Contributed by Cygnus Support. - -Permission is granted to copy, distribute and/or modify this document -under the terms of the GNU Free Documentation License, Version 1.3 or -any later version published by the Free Software Foundation; with the -Invariant Sections being ``GNU General Public License'' and ``Funding -Free Software'', the Front-Cover texts being (a) (see below), and with -the Back-Cover Texts being (b) (see below). A copy of the license is -included in the section entitled ``GNU Free Documentation License''. - -(a) The FSF's Front-Cover Text is: - - A GNU Manual - -(b) The FSF's Back-Cover Text is: - - You have freedom to copy and modify this GNU Manual, like GNU - software. Copies published by the Free Software Foundation raise - funds for GNU development. -@end copying - -@iftex -@finalout -@setchapternewpage off -@settitle GNU Linker Internals -@titlepage -@title{A guide to the internals of the GNU linker} -@author Per Bothner, Steve Chamberlain, Ian Lance Taylor, DJ Delorie -@author Cygnus Support -@page - -@tex -\def\$#1${{#1}} % Kluge: collect RCS revision info without $...$ -\xdef\manvers{2.10.91} % For use in headers, footers too -{\parskip=0pt -\hfill Cygnus Support\par -\hfill \manvers\par -\hfill \TeX{}info \texinfoversion\par -} -@end tex - -@vskip 0pt plus 1filll -Copyright @copyright{} 1992-2018 Free Software Foundation, Inc. - - Permission is granted to copy, distribute and/or modify this document - under the terms of the GNU Free Documentation License, Version 1.3 - or any later version published by the Free Software Foundation; - with no Invariant Sections, with no Front-Cover Texts, and with no - Back-Cover Texts. A copy of the license is included in the - section entitled "GNU Free Documentation License". - -@end titlepage -@end iftex - -@node Top -@top - -This file documents the internals of the GNU linker @code{ld}. It is a -collection of miscellaneous information with little form at this point. -Mostly, it is a repository into which you can put information about -GNU @code{ld} as you discover it (or as you design changes to @code{ld}). - -This document is distributed under the terms of the GNU Free -Documentation License. A copy of the license is included in the -section entitled "GNU Free Documentation License". - -@menu -* README:: The README File -* Emulations:: How linker emulations are generated -* Emulation Walkthrough:: A Walkthrough of a Typical Emulation -* Architecture Specific:: Some Architecture Specific Notes -* GNU Free Documentation License:: GNU Free Documentation License -@end menu - -@node README -@chapter The @file{README} File - -Check the @file{README} file; it often has useful information that does not -appear anywhere else in the directory. - -@node Emulations -@chapter How linker emulations are generated - -Each linker target has an @dfn{emulation}. The emulation includes the -default linker script, and certain emulations also modify certain types -of linker behaviour. - -Emulations are created during the build process by the shell script -@file{genscripts.sh}. - -The @file{genscripts.sh} script starts by reading a file in the -@file{emulparams} directory. This is a shell script which sets various -shell variables used by @file{genscripts.sh} and the other shell scripts -it invokes. - -The @file{genscripts.sh} script will invoke a shell script in the -@file{scripttempl} directory in order to create default linker scripts -written in the linker command language. The @file{scripttempl} script -will be invoked 5 (or, in some cases, 6) times, with different -assignments to shell variables, to create different default scripts. -The choice of script is made based on the command line options. - -After creating the scripts, @file{genscripts.sh} will invoke yet another -shell script, this time in the @file{emultempl} directory. That shell -script will create the emulation source file, which contains C code. -This C code permits the linker emulation to override various linker -behaviours. Most targets use the generic emulation code, which is in -@file{emultempl/generic.em}. - -To summarize, @file{genscripts.sh} reads three shell scripts: an -emulation parameters script in the @file{emulparams} directory, a linker -script generation script in the @file{scripttempl} directory, and an -emulation source file generation script in the @file{emultempl} -directory. - -For example, the Sun 4 linker sets up variables in -@file{emulparams/sun4.sh}, creates linker scripts using -@file{scripttempl/aout.sc}, and creates the emulation code using -@file{emultempl/sunos.em}. - -Note that the linker can support several emulations simultaneously, -depending upon how it is configured. An emulation can be selected with -the @code{-m} option. The @code{-V} option will list all supported -emulations. - -@menu -* emulation parameters:: @file{emulparams} scripts -* linker scripts:: @file{scripttempl} scripts -* linker emulations:: @file{emultempl} scripts -@end menu - -@node emulation parameters -@section @file{emulparams} scripts - -Each target selects a particular file in the @file{emulparams} directory -by setting the shell variable @code{targ_emul} in @file{configure.tgt}. -This shell variable is used by the @file{configure} script to control -building an emulation source file. - -Certain conventions are enforced. Suppose the @code{targ_emul} variable -is set to @var{emul} in @file{configure.tgt}. The name of the emulation -shell script will be @file{emulparams/@var{emul}.sh}. The -@file{Makefile} must have a target named @file{e@var{emul}.c}; this -target must depend upon @file{emulparams/@var{emul}.sh}, as well as the -appropriate scripts in the @file{scripttempl} and @file{emultempl} -directories. The @file{Makefile} target must invoke @code{GENSCRIPTS} -with two arguments: @var{emul}, and the value of the make variable -@code{tdir_@var{emul}}. The value of the latter variable will be set by -the @file{configure} script, and is used to set the default target -directory to search. - -By convention, the @file{emulparams/@var{emul}.sh} shell script should -only set shell variables. It may set shell variables which are to be -interpreted by the @file{scripttempl} and the @file{emultempl} scripts. -Certain shell variables are interpreted directly by the -@file{genscripts.sh} script. - -Here is a list of shell variables interpreted by @file{genscripts.sh}, -as well as some conventional shell variables interpreted by the -@file{scripttempl} and @file{emultempl} scripts. - -@table @code -@item SCRIPT_NAME -This is the name of the @file{scripttempl} script to use. If -@code{SCRIPT_NAME} is set to @var{script}, @file{genscripts.sh} will use -the script @file{scripttempl/@var{script}.sc}. - -@item TEMPLATE_NAME -This is the name of the @file{emultempl} script to use. If -@code{TEMPLATE_NAME} is set to @var{template}, @file{genscripts.sh} will -use the script @file{emultempl/@var{template}.em}. If this variable is -not set, the default value is @samp{generic}. - -@item GENERATE_SHLIB_SCRIPT -If this is set to a nonempty string, @file{genscripts.sh} will invoke -the @file{scripttempl} script an extra time to create a shared library -script. @ref{linker scripts}. - -@item OUTPUT_FORMAT -This is normally set to indicate the BFD output format use (e.g., -@samp{"a.out-sunos-big"}. The @file{scripttempl} script will normally -use it in an @code{OUTPUT_FORMAT} expression in the linker script. - -@item ARCH -This is normally set to indicate the architecture to use (e.g., -@samp{sparc}). The @file{scripttempl} script will normally use it in an -@code{OUTPUT_ARCH} expression in the linker script. - -@item ENTRY -Some @file{scripttempl} scripts use this to set the entry address, in an -@code{ENTRY} expression in the linker script. - -@item TEXT_START_ADDR -Some @file{scripttempl} scripts use this to set the start address of the -@samp{.text} section. - -@item SEGMENT_SIZE -The @file{genscripts.sh} script uses this to set the default value of -@code{DATA_ALIGNMENT} when running the @file{scripttempl} script. - -@item TARGET_PAGE_SIZE -If @code{SEGMENT_SIZE} is not defined, the @file{genscripts.sh} script -uses this to define it. - -@item ALIGNMENT -Some @file{scripttempl} scripts set this to a number to pass to -@code{ALIGN} to set the required alignment for the @code{end} symbol. -@end table - -@node linker scripts -@section @file{scripttempl} scripts - -Each linker target uses a @file{scripttempl} script to generate the -default linker scripts. The name of the @file{scripttempl} script is -set by the @code{SCRIPT_NAME} variable in the @file{emulparams} script. -If @code{SCRIPT_NAME} is set to @var{script}, @code{genscripts.sh} will -invoke @file{scripttempl/@var{script}.sc}. - -The @file{genscripts.sh} script will invoke the @file{scripttempl} -script 5 to 9 times. Each time it will set the shell variable -@code{LD_FLAG} to a different value. When the linker is run, the -options used will direct it to select a particular script. (Script -selection is controlled by the @code{get_script} emulation entry point; -this describes the conventional behaviour). - -The @file{scripttempl} script should just write a linker script, written -in the linker command language, to standard output. If the emulation -name--the name of the @file{emulparams} file without the @file{.sc} -extension--is @var{emul}, then the output will be directed to -@file{ldscripts/@var{emul}.@var{extension}} in the build directory, -where @var{extension} changes each time the @file{scripttempl} script is -invoked. - -Here is the list of values assigned to @code{LD_FLAG}. - -@table @code -@item (empty) -The script generated is used by default (when none of the following -cases apply). The output has an extension of @file{.x}. -@item n -The script generated is used when the linker is invoked with the -@code{-n} option. The output has an extension of @file{.xn}. -@item N -The script generated is used when the linker is invoked with the -@code{-N} option. The output has an extension of @file{.xbn}. -@item r -The script generated is used when the linker is invoked with the -@code{-r} option. The output has an extension of @file{.xr}. -@item u -The script generated is used when the linker is invoked with the -@code{-Ur} option. The output has an extension of @file{.xu}. -@item shared -The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to -this value if @code{GENERATE_SHLIB_SCRIPT} is defined in the -@file{emulparams} file. The @file{emultempl} script must arrange to use -this script at the appropriate time, normally when the linker is invoked -with the @code{-shared} option. The output has an extension of -@file{.xs}. -@item c -The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to -this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the -@file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf}. The -@file{emultempl} script must arrange to use this script at the appropriate -time, normally when the linker is invoked with the @code{-z combreloc} -option. The output has an extension of -@file{.xc}. -@item cshared -The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to -this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the -@file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf} and -@code{GENERATE_SHLIB_SCRIPT} is defined in the @file{emulparams} file. -The @file{emultempl} script must arrange to use this script at the -appropriate time, normally when the linker is invoked with the @code{-shared --z combreloc} option. The output has an extension of @file{.xsc}. -@item auto_import -The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to -this value if @code{GENERATE_AUTO_IMPORT_SCRIPT} is defined in the -@file{emulparams} file. The @file{emultempl} script must arrange to -use this script at the appropriate time, normally when the linker is -invoked with the @code{--enable-auto-import} option. The output has -an extension of @file{.xa}. -@end table - -Besides the shell variables set by the @file{emulparams} script, and the -@code{LD_FLAG} variable, the @file{genscripts.sh} script will set -certain variables for each run of the @file{scripttempl} script. - -@table @code -@item RELOCATING -This will be set to a non-empty string when the linker is doing a final -relocation (e.g., all scripts other than @code{-r} and @code{-Ur}). - -@item CONSTRUCTING -This will be set to a non-empty string when the linker is building -global constructor and destructor tables (e.g., all scripts other than -@code{-r}). - -@item DATA_ALIGNMENT -This will be set to an @code{ALIGN} expression when the output should be -page aligned, or to @samp{.} when generating the @code{-N} script. - -@item CREATE_SHLIB -This will be set to a non-empty string when generating a @code{-shared} -script. - -@item COMBRELOC -This will be set to a non-empty string when generating @code{-z combreloc} -scripts to a temporary file name which can be used during script generation. -@end table - -The conventional way to write a @file{scripttempl} script is to first -set a few shell variables, and then write out a linker script using -@code{cat} with a here document. The linker script will use variable -substitutions, based on the above variables and those set in the -@file{emulparams} script, to control its behaviour. - -When there are parts of the @file{scripttempl} script which should only -be run when doing a final relocation, they should be enclosed within a -variable substitution based on @code{RELOCATING}. For example, on many -targets special symbols such as @code{_end} should be defined when doing -a final link. Naturally, those symbols should not be defined when doing -a relocatable link using @code{-r}. The @file{scripttempl} script -could use a construct like this to define those symbols: -@smallexample - $@{RELOCATING+ _end = .;@} -@end smallexample -This will do the symbol assignment only if the @code{RELOCATING} -variable is defined. - -The basic job of the linker script is to put the sections in the correct -order, and at the correct memory addresses. For some targets, the -linker script may have to do some other operations. - -For example, on most MIPS platforms, the linker is responsible for -defining the special symbol @code{_gp}, used to initialize the -@code{$gp} register. It must be set to the start of the small data -section plus @code{0x8000}. Naturally, it should only be defined when -doing a final relocation. This will typically be done like this: -@smallexample - $@{RELOCATING+ _gp = ALIGN(16) + 0x8000;@} -@end smallexample -This line would appear just before the sections which compose the small -data section (@samp{.sdata}, @samp{.sbss}). All those sections would be -contiguous in memory. - -Many COFF systems build constructor tables in the linker script. The -compiler will arrange to output the address of each global constructor -in a @samp{.ctor} section, and the address of each global destructor in -a @samp{.dtor} section (this is done by defining -@code{ASM_OUTPUT_CONSTRUCTOR} and @code{ASM_OUTPUT_DESTRUCTOR} in the -@code{gcc} configuration files). The @code{gcc} runtime support -routines expect the constructor table to be named @code{__CTOR_LIST__}. -They expect it to be a list of words, with the first word being the -count of the number of entries. There should be a trailing zero word. -(Actually, the count may be -1 if the trailing word is present, and the -trailing word may be omitted if the count is correct, but, as the -@code{gcc} behaviour has changed slightly over the years, it is safest -to provide both). Here is a typical way that might be handled in a -@file{scripttempl} file. -@smallexample - $@{CONSTRUCTING+ __CTOR_LIST__ = .;@} - $@{CONSTRUCTING+ LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)@} - $@{CONSTRUCTING+ *(.ctors)@} - $@{CONSTRUCTING+ LONG(0)@} - $@{CONSTRUCTING+ __CTOR_END__ = .;@} - $@{CONSTRUCTING+ __DTOR_LIST__ = .;@} - $@{CONSTRUCTING+ LONG((__DTOR_END__ - __DTOR_LIST__) / 4 - 2)@} - $@{CONSTRUCTING+ *(.dtors)@} - $@{CONSTRUCTING+ LONG(0)@} - $@{CONSTRUCTING+ __DTOR_END__ = .;@} -@end smallexample -The use of @code{CONSTRUCTING} ensures that these linker script commands -will only appear when the linker is supposed to be building the -constructor and destructor tables. This example is written for a target -which uses 4 byte pointers. - -Embedded systems often need to set a stack address. This is normally -best done by using the @code{PROVIDE} construct with a default stack -address. This permits the user to easily override the stack address -using the @code{--defsym} option. Here is an example: -@smallexample - $@{RELOCATING+ PROVIDE (__stack = 0x80000000);@} -@end smallexample -The value of the symbol @code{__stack} would then be used in the startup -code to initialize the stack pointer. - -@node linker emulations -@section @file{emultempl} scripts - -Each linker target uses an @file{emultempl} script to generate the -emulation code. The name of the @file{emultempl} script is set by the -@code{TEMPLATE_NAME} variable in the @file{emulparams} script. If the -@code{TEMPLATE_NAME} variable is not set, the default is -@samp{generic}. If the value of @code{TEMPLATE_NAME} is @var{template}, -@file{genscripts.sh} will use @file{emultempl/@var{template}.em}. - -Most targets use the generic @file{emultempl} script, -@file{emultempl/generic.em}. A different @file{emultempl} script is -only needed if the linker must support unusual actions, such as linking -against shared libraries. - -The @file{emultempl} script is normally written as a simple invocation -of @code{cat} with a here document. The document will use a few -variable substitutions. Typically each function names uses a -substitution involving @code{EMULATION_NAME}, for ease of debugging when -the linker supports multiple emulations. - -Every function and variable in the emitted file should be static. The -only globally visible object must be named -@code{ld_@var{EMULATION_NAME}_emulation}, where @var{EMULATION_NAME} is -the name of the emulation set in @file{configure.tgt} (this is also the -name of the @file{emulparams} file without the @file{.sh} extension). -The @file{genscripts.sh} script will set the shell variable -@code{EMULATION_NAME} before invoking the @file{emultempl} script. - -The @code{ld_@var{EMULATION_NAME}_emulation} variable must be a -@code{struct ld_emulation_xfer_struct}, as defined in @file{ldemul.h}. -It defines a set of function pointers which are invoked by the linker, -as well as strings for the emulation name (normally set from the shell -variable @code{EMULATION_NAME} and the default BFD target name (normally -set from the shell variable @code{OUTPUT_FORMAT} which is normally set -by the @file{emulparams} file). - -The @file{genscripts.sh} script will set the shell variable -@code{COMPILE_IN} when it invokes the @file{emultempl} script for the -default emulation. In this case, the @file{emultempl} script should -include the linker scripts directly, and return them from the -@code{get_scripts} entry point. When the emulation is not the default, -the @code{get_scripts} entry point should just return a file name. See -@file{emultempl/generic.em} for an example of how this is done. - -At some point, the linker emulation entry points should be documented. - -@node Emulation Walkthrough -@chapter A Walkthrough of a Typical Emulation - -This chapter is to help people who are new to the way emulations -interact with the linker, or who are suddenly thrust into the position -of having to work with existing emulations. It will discuss the files -you need to be aware of. It will tell you when the given "hooks" in -the emulation will be called. It will, hopefully, give you enough -information about when and how things happen that you'll be able to -get by. As always, the source is the definitive reference to this. - -The starting point for the linker is in @file{ldmain.c} where -@code{main} is defined. The bulk of the code that's emulation -specific will initially be in @code{emultempl/@var{emulation}.em} but -will end up in @code{e@var{emulation}.c} when the build is done. -Most of the work to select and interface with emulations is in -@code{ldemul.h} and @code{ldemul.c}. Specifically, @code{ldemul.h} -defines the @code{ld_emulation_xfer_struct} structure your emulation -exports. - -Your emulation file exports a symbol -@code{ld_@var{EMULATION_NAME}_emulation}. If your emulation is -selected (it usually is, since usually there's only one), -@code{ldemul.c} sets the variable @var{ld_emulation} to point to it. -@code{ldemul.c} also defines a number of API functions that interface -to your emulation, like @code{ldemul_after_parse} which simply calls -your @code{ld_@var{EMULATION}_emulation.after_parse} function. For -the rest of this section, the functions will be mentioned, but you -should assume the indirect reference to your emulation also. - -We will also skip or gloss over parts of the link process that don't -relate to emulations, like setting up internationalization. - -After initialization, @code{main} selects an emulation by pre-scanning -the command line arguments. It calls @code{ldemul_choose_target} to -choose a target. If you set @code{choose_target} to -@code{ldemul_default_target}, it picks your @code{target_name} by -default. - -@code{main} calls @code{ldemul_before_parse}, then @code{parse_args}. -@code{parse_args} calls @code{ldemul_parse_args} for each arg, which -must update the @code{getopt} globals if it recognizes the argument. -If the emulation doesn't recognize it, then parse_args checks to see -if it recognizes it. - -Now that the emulation has had access to all its command-line options, -@code{main} calls @code{ldemul_set_symbols}. This can be used for any -initialization that may be affected by options. It is also supposed -to set up any variables needed by the emulation script. - -@code{main} now calls @code{ldemul_get_script} to get the emulation -script to use (based on arguments, no doubt, @pxref{Emulations}) and -runs it. While parsing, @code{ldgram.y} may call @code{ldemul_hll} or -@code{ldemul_syslib} to handle the @code{HLL} or @code{SYSLIB} -commands. It may call @code{ldemul_unrecognized_file} if you asked -the linker to link a file it doesn't recognize. It will call -@code{ldemul_recognized_file} for each file it does recognize, in case -the emulation wants to handle some files specially. All the while, -it's loading the files (possibly calling -@code{ldemul_open_dynamic_archive}) and symbols and stuff. After it's -done reading the script, @code{main} calls @code{ldemul_after_parse}. -Use the after-parse hook to set up anything that depends on stuff the -script might have set up, like the entry point. - -@code{main} next calls @code{lang_process} in @code{ldlang.c}. This -appears to be the main core of the linking itself, as far as emulation -hooks are concerned(*). It first opens the output file's BFD, calling -@code{ldemul_set_output_arch}, and calls -@code{ldemul_create_output_section_statements} in case you need to use -other means to find or create object files (i.e. shared libraries -found on a path, or fake stub objects). Despite the name, nobody -creates output sections here. - -(*) In most cases, the BFD library does the bulk of the actual -linking, handling symbol tables, symbol resolution, relocations, and -building the final output file. See the BFD reference for all the -details. Your emulation is usually concerned more with managing -things at the file and section level, like "put this here, add this -section", etc. - -Next, the objects to be linked are opened and BFDs created for them, -and @code{ldemul_after_open} is called. At this point, you have all -the objects and symbols loaded, but none of the data has been placed -yet. - -Next comes the Big Linking Thingy (except for the parts BFD does). -All input sections are mapped to output sections according to the -script. If a section doesn't get mapped by default, -@code{ldemul_place_orphan} will get called to figure out where it goes. -Next it figures out the offsets for each section, calling -@code{ldemul_before_allocation} before and -@code{ldemul_after_allocation} after deciding where each input section -ends up in the output sections. - -The last part of @code{lang_process} is to figure out all the symbols' -values. After assigning final values to the symbols, -@code{ldemul_finish} is called, and after that, any undefined symbols -are turned into fatal errors. - -OK, back to @code{main}, which calls @code{ldwrite} in -@file{ldwrite.c}. @code{ldwrite} calls BFD's final_link, which does -all the relocation fixups and writes the output bfd to disk, and we're -done. - -In summary, - -@itemize @bullet - -@item @code{main()} in @file{ldmain.c} -@item @file{emultempl/@var{EMULATION}.em} has your code -@item @code{ldemul_choose_target} (defaults to your @code{target_name}) -@item @code{ldemul_before_parse} -@item Parse argv, calls @code{ldemul_parse_args} for each -@item @code{ldemul_set_symbols} -@item @code{ldemul_get_script} -@item parse script - -@itemize @bullet -@item may call @code{ldemul_hll} or @code{ldemul_syslib} -@item may call @code{ldemul_open_dynamic_archive} -@end itemize - -@item @code{ldemul_after_parse} -@item @code{lang_process()} in @file{ldlang.c} - -@itemize @bullet -@item create @code{output_bfd} -@item @code{ldemul_set_output_arch} -@item @code{ldemul_create_output_section_statements} -@item read objects, create input bfds - all symbols exist, but have no values -@item may call @code{ldemul_unrecognized_file} -@item will call @code{ldemul_recognized_file} -@item @code{ldemul_after_open} -@item map input sections to output sections -@item may call @code{ldemul_place_orphan} for remaining sections -@item @code{ldemul_before_allocation} -@item gives input sections offsets into output sections, places output sections -@item @code{ldemul_after_allocation} - section addresses valid -@item assigns values to symbols -@item @code{ldemul_finish} - symbol values valid -@end itemize - -@item output bfd is written to disk - -@end itemize - -@node Architecture Specific -@chapter Some Architecture Specific Notes - -This is the place for notes on the behavior of @code{ld} on -specific platforms. Currently, only Intel x86 is documented (and -of that, only the auto-import behavior for DLLs). - -@menu -* ix86:: Intel x86 -@end menu - -@node ix86 -@section Intel x86 - -@table @emph -@code{ld} can create DLLs that operate with various runtimes available -on a common x86 operating system. These runtimes include native (using -the mingw "platform"), cygwin, and pw. - -@item auto-import from DLLs -@enumerate -@item -With this feature on, DLL clients can import variables from DLL -without any concern from their side (for example, without any source -code modifications). Auto-import can be enabled using the -@code{--enable-auto-import} flag, or disabled via the -@code{--disable-auto-import} flag. Auto-import is disabled by default. - -@item -This is done completely in bounds of the PE specification (to be fair, -there's a minor violation of the spec at one point, but in practice -auto-import works on all known variants of that common x86 operating -system) So, the resulting DLL can be used with any other PE -compiler/linker. - -@item -Auto-import is fully compatible with standard import method, in which -variables are decorated using attribute modifiers. Libraries of either -type may be mixed together. - -@item -Overhead (space): 8 bytes per imported symbol, plus 20 for each -reference to it; Overhead (load time): negligible; Overhead -(virtual/physical memory): should be less than effect of DLL -relocation. -@end enumerate - -Motivation - -The obvious and only way to get rid of dllimport insanity is -to make client access variable directly in the DLL, bypassing -the extra dereference imposed by ordinary DLL runtime linking. -I.e., whenever client contains something like - -@code{mov dll_var,%eax,} - -address of dll_var in the command should be relocated to point -into loaded DLL. The aim is to make OS loader do so, and than -make ld help with that. Import section of PE made following -way: there's a vector of structures each describing imports -from particular DLL. Each such structure points to two other -parallel vectors: one holding imported names, and one which -will hold address of corresponding imported name. So, the -solution is de-vectorize these structures, making import -locations be sparse and pointing directly into code. - -Implementation - -For each reference of data symbol to be imported from DLL (to -set of which belong symbols with name <sym>, if __imp_<sym> is -found in implib), the import fixup entry is generated. That -entry is of type IMAGE_IMPORT_DESCRIPTOR and stored in .idata$3 -subsection. Each fixup entry contains pointer to symbol's address -within .text section (marked with __fuN_<sym> symbol, where N is -integer), pointer to DLL name (so, DLL name is referenced by -multiple entries), and pointer to symbol name thunk. Symbol name -thunk is singleton vector (__nm_th_<symbol>) pointing to -IMAGE_IMPORT_BY_NAME structure (__nm_<symbol>) directly containing -imported name. Here comes that "om the edge" problem mentioned above: -PE specification rambles that name vector (OriginalFirstThunk) should -run in parallel with addresses vector (FirstThunk), i.e. that they -should have same number of elements and terminated with zero. We violate -this, since FirstThunk points directly into machine code. But in -practice, OS loader implemented the sane way: it goes thru -OriginalFirstThunk and puts addresses to FirstThunk, not something -else. It once again should be noted that dll and symbol name -structures are reused across fixup entries and should be there -anyway to support standard import stuff, so sustained overhead is -20 bytes per reference. Other question is whether having several -IMAGE_IMPORT_DESCRIPTORS for the same DLL is possible. Answer is yes, -it is done even by native compiler/linker (libth32's functions are in -fact resident in windows9x kernel32.dll, so if you use it, you have -two IMAGE_IMPORT_DESCRIPTORS for kernel32.dll). Yet other question is -whether referencing the same PE structures several times is valid. -The answer is why not, prohibiting that (detecting violation) would -require more work on behalf of loader than not doing it. - -@end table - -@node GNU Free Documentation License -@chapter GNU Free Documentation License - -@include fdl.texi - -@contents -@bye |