diff options
Diffstat (limited to 'winsup/doc/overview2.sgml')
-rw-r--r-- | winsup/doc/overview2.sgml | 323 |
1 files changed, 0 insertions, 323 deletions
diff --git a/winsup/doc/overview2.sgml b/winsup/doc/overview2.sgml deleted file mode 100644 index 4c8595d..0000000 --- a/winsup/doc/overview2.sgml +++ /dev/null @@ -1,323 +0,0 @@ -<sect1 id="ov-ex-unix"><title>Expectations for UNIX Programmers</title> - -<para>Developers coming from a UNIX background will find a set of utilities -they are already comfortable using, including a working UNIX shell. The -compiler tools are the standard GNU compilers most people will have previously -used under UNIX, only ported to the Windows host. Programmers wishing to port -UNIX software to Windows NT or 9x will find that the Cygwin library provides -an easy way to port many UNIX packages, with only minimal source code -changes.</para> -</sect1> - -<sect1 id="ov-ex-win"><title>Expectations for Windows Programmers</title> -<para>Developers coming from a Windows background will find a set of tools capable -of writing console or GUI executables that rely on the Microsoft Win32 API. -The linker and dlltool utility may be used to write Windows Dynamically Linked -Libraries (DLLs). The resource compiler "windres" is also provided with the -native Windows GNUPro tools. All tools may be used from the Microsoft command -line prompt, with full support for normal Windows pathnames.</para> -</sect1> - -<sect2 id="ov-hi-intro"><title>Introduction</title> <para>When a binary linked -against the library is executed, the Cygwin DLL is loaded into the -application's text segment. Because we are trying to emulate a UNIX kernel -which needs access to all processes running under it, the first Cygwin DLL to -run creates shared memory areas that other processes using separate instances -of the DLL can access. This is used to keep track of open file descriptors and -assist fork and exec, among other purposes. In addition to the shared memory -regions, every process also has a per_process structure that contains -information such as process id, user id, signal masks, and other similar -process-specific information.</para> - -<para>The DLL is implemented using the Win32 API, which allows it to run on all -Win32 hosts. Because processes run under the standard Win32 subsystem, they -can access both the UNIX compatibility calls provided by Cygwin as well as -any of the Win32 API calls. This gives the programmer complete flexibility in -designing the structure of their program in terms of the APIs used. For -example, they could write a Win32-specific GUI using Win32 API calls on top of -a UNIX back-end that uses Cygwin.</para> - -<para>Early on in the development process, we made the important design -decision that it would not be necessary to strictly adhere to existing UNIX -standards like POSIX.1 if it was not possible or if it would significantly -diminish the usability of the tools on the Win32 platform. In many cases, an -environment variable can be set to override the default behavior and force -standards compliance.</para> -</sect2> - -<sect2 id="ov-hi-win9xnt"><title>Supporting both Windows NT and 9x</title> -<para>While Windows 95 and Windows 98 are similar enough to each other that we -can safely ignore the distinction when implementing Cygwin, Windows NT is an -extremely different operating system. For this reason, whenever the DLL is -loaded, the library checks which operating system is active so that it can act -accordingly.</para> - -<para>In some cases, the Win32 API is only different for -historical reasons. In this situation, the same basic functionality is -available under Windows 9x and NT but the method used to gain this -functionality differs. A trivial example: in our implementation of -uname, the library examines the sysinfo.dwProcessorType structure -member to figure out the processor type under Windows 9x. This field -is not supported in NT, which has its own operating system-specific -structure member called sysinfo.wProcessorLevel.</para> - -<para>Other differences between NT and 9x are much more fundamental in -nature. The best example is that only NT provides a security model.</para> -</sect2> - -<sect2 id="ov-hi-perm"><title>Permissions and Security</title> -<para>Windows NT includes a sophisticated security model based on Access -Control Lists (ACLs). Cygwin maps Win32 file ownership and permissions to the -more standard, older UNIX model by default. Cygwin version 1.1 introduces -support for ACLs according to the system calls used on newer versions of -Solaris. This ability is used when the `ntsec' feature is switched on which -is described in another chapter. -The chmod call maps UNIX-style permissions -back to the Win32 equivalents. Because many programs expect to be able to find -the /etc/passwd and /etc/group files, we provide utilities that can be used to -construct them from the user and group information provided by the operating -system.</para> - -<para>Under Windows NT, the administrator is permitted to chown files. There -is no mechanism to support the setuid concept or API call since Cygwin version -1.1.2. With version 1.1.3 Cygwin introduces a mechanism for setting real -and effective UIDs under Windows NT/W2K. This is described in the ntsec -section.</para> - -<para>Under Windows 9x, the situation is considerably different. Since a -security model is not provided, Cygwin fakes file ownership by making all -files look like they are owned by a default user and group id. As under NT, -file permissions can still be determined by examining their read/write/execute -status. Rather than return an unimplemented error, under Windows 9x, the -chown call succeeds immediately without actually performing any action -whatsoever. This is appropriate since essentially all users jointly own the -files when no concept of file ownership exists.</para> - -<para>It is important that we discuss the implications of our "kernel" using -shared memory areas to store information about Cygwin processes. Because -these areas are not yet protected in any way, in principle a malicious user -could modify them to cause unexpected behavior in Cygwin processes. While -this is not a new problem under Windows 9x (because of the lack of operating -system security), it does constitute a security hole under Windows NT. -This is because one user could affect the Cygwin programs run by -another user by changing the shared memory information in ways that -they could not in a more typical WinNT program. For this reason, it -is not appropriate to use Cygwin in high-security applications. In -practice, this will not be a major problem for most uses of the -library.</para> -</sect2> - -<sect2 id="ov-hi-files"><title>File Access</title> <para>Cygwin supports -both Win32- and POSIX-style paths, using either forward or back slashes as the -directory delimiter. Paths coming into the DLL are translated from Win32 to -POSIX as needed. As a result, the library believes that the file system is a -POSIX-compliant one, translating paths back to Win32 paths whenever it calls a -Win32 API function. UNC pathnames (starting with two slashes) are -supported.</para> - -<para>The layout of this POSIX view of the Windows file system space is stored -in the Windows registry. While the slash ('/') directory points to the system -partition by default, this is easy to change with the Cygwin mount utility. -In addition to selecting the slash partition, it allows mounting arbitrary -Win32 paths into the POSIX file system space. Many people use the utility to -mount each drive letter under the slash partition (e.g. C:\ to /c, D:\ to /d, -etc...).</para> - -<para>The library exports several Cygwin-specific functions that can be used -by external programs to convert a path or path list from Win32 to POSIX or vice -versa. Shell scripts and Makefiles cannot call these functions directly. -Instead, they can do the same path translations by executing the cygpath -utility program that we provide with Cygwin.</para> - -<para>Win32 file systems are case preserving but case insensitive. Cygwin -does not currently support case distinction because, in practice, few UNIX -programs actually rely on it. While we could mangle file names to support case -distinction, this would add unnecessary overhead to the library and make it -more difficult for non-Cygwin applications to access those files.</para> - -<para>Symbolic links are emulated by files containing a magic cookie followed -by the path to which the link points. They are marked with the System -attribute so that only files with that attribute have to be read to determine -whether or not the file is a symbolic link. Hard links are fully supported -under Windows NT on NTFS file systems. On a FAT file system, the call falls -back to simply copying the file, a strategy that works in many cases.</para> - -<para>The inode number for a file is calculated by hashing its full Win32 path. -The inode number generated by the stat call always matches the one returned in -d_ino of the dirent structure. It is worth noting that the number produced by -this method is not guaranteed to be unique. However, we have not found this to -be a significant problem because of the low probability of generating a -duplicate inode number.</para> - -<para>Chroot is supported since release 1.1.3. Note that chroot isn't -supported native by Windows. This implies some restrictions. First of all, -the chroot call isn't a privileged call. Each user may call it. Second, the -chroot environment isn't safe against native windows processes. If you -want to support a chroot environment as, for example, by allowing an -anonymous ftp with restricted access, you'll have to care that only -native Cygwin applications are accessible inside of the chroot environment. -Since that applications are only using the Cygwin POSIX API to access the -file system their access can be restricted as it is intended. This includes -not only POSIX paths but Win32 paths (containing drive letter and/or -backslashes) and CIFS paths (//server/share or \\server\share) as well.</para> -</sect2> - -<sect2 id="ov-hi-textvsbinary"><title>Text Mode vs. Binary Mode</title> -<para>Interoperability with other Win32 programs such as text editors was -critical to the success of the port of the development tools. Most Cygnus -customers upgrading from the older DOS-hosted toolchains expected the new -Win32-hosted ones to continue to work with their old development -sources.</para> - -<para>Unfortunately, UNIX and Win32 use different end-of-line terminators in -text files. Consequently, carriage-return newlines have to be translated on -the fly by Cygwin into a single newline when reading in text mode. The -control-z character is interpreted as a valid end-of-file character for a -similar reason.</para> - -<para>This solution addresses the compatibility requirement at the expense of -violating the POSIX standard that states that text and binary mode will be -identical. Consequently, processes that attempt to lseek through text files can -no longer rely on the number of bytes read as an accurate indicator of position -in the file. For this reason, the CYGWIN environment variable can be -set to override this behavior.</para> -</sect2> - -<sect2 id="ov-hi-ansiclib"><title>ANSI C Library</title> -<para>We chose to include -Cygnus' own existing ANSI C library -"newlib" as part of the library, rather than write all of the lib C -and math calls from scratch. Newlib is a BSD-derived ANSI C library, -previously only used by cross-compilers for embedded systems -development.</para> - -<para>The reuse of existing free implementations of such things -as the glob, regexp, and getopt libraries saved us considerable -effort. In addition, Cygwin uses Doug Lea's free malloc -implementation that successfully balances speed and compactness. The -library accesses the malloc calls via an exported function pointer. -This makes it possible for a Cygwin process to provide its own -malloc if it so desires.</para> -</sect2> - -<sect2 id="ov-hi-process"><title>Process Creation</title> -<para>The fork call in Cygwin is particularly interesting because it -does not map well on top of the Win32 API. This makes it very -difficult to implement correctly. Currently, the Cygwin fork is a -non-copy-on-write implementation similar to what was present in early -flavors of UNIX.</para> - -<para>The first thing that happens when a parent process -forks a child process is that the parent initializes a space in the -Cygwin process table for the child. It then creates a suspended -child process using the Win32 CreateProcess call. Next, the parent -process calls setjmp to save its own context and sets a pointer to -this in a Cygwin shared memory area (shared among all Cygwin -tasks). It then fills in the child's .data and .bss sections by -copying from its own address space into the suspended child's address -space. After the child's address space is initialized, the child is -run while the parent waits on a mutex. The child discovers it has -been forked and longjumps using the saved jump buffer. The child then -sets the mutex the parent is waiting on and blocks on another mutex. -This is the signal for the parent to copy its stack and heap into the -child, after which it releases the mutex the child is waiting on and -returns from the fork call. Finally, the child wakes from blocking on -the last mutex, recreates any memory-mapped areas passed to it via the -shared area, and returns from fork itself.</para> - -<para>While we have some -ideas as to how to speed up our fork implementation by reducing the -number of context switches between the parent and child process, fork -will almost certainly always be inefficient under Win32. Fortunately, -in most circumstances the spawn family of calls provided by Cygwin -can be substituted for a fork/exec pair with only a little effort. -These calls map cleanly on top of the Win32 API. As a result, they -are much more efficient. Changing the compiler's driver program to -call spawn instead of fork was a trivial change and increased -compilation speeds by twenty to thirty percent in our -tests.</para> - -<para>However, spawn and exec present their own set of -difficulties. Because there is no way to do an actual exec under -Win32, Cygwin has to invent its own Process IDs (PIDs). As a -result, when a process performs multiple exec calls, there will be -multiple Windows PIDs associated with a single Cygwin PID. In some -cases, stubs of each of these Win32 processes may linger, waiting for -their exec'd Cygwin process to exit.</para> -</sect2> - -<sect2 id="ov-hi-signals"><title>Signals</title> -<para>When -a Cygwin process starts, the library starts a secondary thread for -use in signal handling. This thread waits for Windows events used to -pass signals to the process. When a process notices it has a signal, -it scans its signal bitmask and handles the signal in the appropriate -fashion.</para> - -<para>Several complications in the implementation arise from the -fact that the signal handler operates in the same address space as the -executing program. The immediate consequence is that Cygwin system -functions are interruptible unless special care is taken to avoid -this. We go to some lengths to prevent the sig_send function that -sends signals from being interrupted. In the case of a process -sending a signal to another process, we place a mutex around sig_send -such that sig_send will not be interrupted until it has completely -finished sending the signal.</para> - -<para>In the case of a process sending -itself a signal, we use a separate semaphore/event pair instead of the -mutex. sig_send starts by resetting the event and incrementing the -semaphore that flags the signal handler to process the signal. After -the signal is processed, the signal handler signals the event that it -is done. This process keeps intraprocess signals synchronous, as -required by POSIX.</para> - -<para>Most standard UNIX signals are provided. Job -control works as expected in shells that support -it.</para> -</sect2> - -<sect2 id="ov-hi-sockets"><title>Sockets</title> -<para>Socket-related calls in Cygwin simply -call the functions by the same name in Winsock, Microsoft's -implementation of Berkeley sockets. Only a few changes were needed to -match the expected UNIX semantics - one of the most troublesome -differences was that Winsock must be initialized before the first -socket function is called. As a result, Cygwin has to perform this -initialization when appropriate. In order to support sockets across -fork calls, child processes initialize Winsock if any inherited file -descriptor is a socket.</para> - -<para>Unfortunately, implicitly loading DLLs -at process startup is usually a slow affair. Because many processes -do not use sockets, Cygwin explicitly loads the Winsock DLL the -first time it calls the Winsock initialization routine. This single -change sped up GNU configure times by thirty -percent.</para> -</sect2> - -<sect2 id="ov-hi-select"><title>Select</title> -<para>The UNIX select function is another -call that does not map cleanly on top of the Win32 API. Much to our -dismay, we discovered that the Win32 select in Winsock only worked on -socket handles. Our implementation allows select to function normally -when given different types of file descriptors (sockets, pipes, -handles, and a custom /dev/windows Windows messages -pseudo-device).</para> - -<para>Upon entry into the select function, the first -operation is to sort the file descriptors into the different types. -There are then two cases to consider. The simple case is when at -least one file descriptor is a type that is always known to be ready -(such as a disk file). In that case, select returns immediately as -soon as it has polled each of the other types to see if they are -ready. The more complex case involves waiting for socket or pipe file -descriptors to be ready. This is accomplished by the main thread -suspending itself, after starting one thread for each type of file -descriptor present. Each thread polls the file descriptors of its -respective type with the appropriate Win32 API call. As soon as a -thread identifies a ready descriptor, that thread signals the main -thread to wake up. This case is now the same as the first one since -we know at least one descriptor is ready. So select returns, after -polling all of the file descriptors one last time.</para> -</sect2> |