diff options
Diffstat (limited to 'winsup/doc/textbinary.sgml')
-rw-r--r-- | winsup/doc/textbinary.sgml | 181 |
1 files changed, 0 insertions, 181 deletions
diff --git a/winsup/doc/textbinary.sgml b/winsup/doc/textbinary.sgml deleted file mode 100644 index cf6fc1b..0000000 --- a/winsup/doc/textbinary.sgml +++ /dev/null @@ -1,181 +0,0 @@ -<sect1 id="using-textbinary"><title>Text and Binary modes</title> - -<sect2> <title>The Issue</title> - -<para>On a UNIX system, when an application reads from a file it gets -exactly what's in the file on disk and the converse is true for writing. -The situation is different in the DOS/Windows world where a file can -be opened in one of two modes, binary or text. In the binary mode the -system behaves exactly as in UNIX. However in text mode there are -major differences:</para> -<OrderedList Numeration="Loweralpha" Spacing="Compact"> -<listitem> -<para> -On writing in text mode, a NL (\n, ^J) is transformed into the -sequence CR (\r, ^M) NL.</para> -</listitem> -<listitem> -<para> -On reading in text mode, a CR followed by an NL is deleted and a ^Z -character signals the end of file.</para> -</listitem> -</OrderedList> - -<para>This can wreak havoc with the seek/fseek calls since the number -of bytes actually in the file may differ from that seen by the -application.</para> - -<para>The mode can be specified explicitly as explained in the Programming -section below. In an ideal DOS/Windows world, all programs using lines as -records (such as <command>bash</command>, <command>make</command>, -<command>sed</command> ...) would open files (and change the mode of their -standard input and output) as text. All other programs (such as -<command>cat</command>, <command>cmp</command>, <command>tr</command> ...) -would use binary mode. In practice with Cygwin, programs that deal -explicitly with object files specify binary mode (this is the case of -<command>od</command>, which is helpful to diagnose CR problems). Most -other programs (such as <command>cat</command>, <command>cmp</command>, -<command>tr</command>) use the default mode.</para> - -</sect2> - -<sect2><title>The default Cygwin behavior</title> - -<para>The Cygwin system gives us some flexibility in deciding how files -are to be opened when the mode is not specified explicitly. -The rules are evolving, this section gives the design goals.</para> -<OrderedList Numeration="Loweralpha"> -<listitem> -<para>If the file appears to reside on a file system that is mounted -(i.e. if its pathname starts with a directory displayed by -<command>mount</command>), then the default is specified by the mount -flag. If the file is a symbolic link, the mode of the target file system -applies.</para> -</listitem> -<listitem> -<para>If the file appears to reside on a file system that is not mounted -(as can happen when the path contains a drive letter), the default is text. -</para> -</listitem> -<listitem> -<para>Pipes and non-file devices are opened in binary mode, -except if the <EnVar>CYGWIN</EnVar> environment variable contains -<literal>nobinmode</literal>.</para> -<warning><Title>Warning!</Title><para>In b20.1 of 12/98, a file will be opened -in binary mode if any of the following conditions hold:</para> -<OrderedList Numeration="arabic" Spacing="Compact"> -<listitem><para>binary mode is specified in the open call</para> -</listitem> -<listitem><para><envar>CYGWIN</envar> contains <literal>binmode</literal></para> -</listitem> -<listitem><para>the file resides in a binary mounted partition</para> -</listitem> -<listitem><para>the file is not a disk file</para> -</listitem> -</OrderedList> -</warning> -</listitem> - -<listitem> -<para>When a Cygwin program is launched by a shell, its standard input, -output and error are in binary mode if the <envar>CYGWIN</envar> variable -contains <literal>tty</literal>, else in text mode, except if they are piped -or redirected.</para> -<para> When redirecting, the Cygwin shells uses rules (a-c). For -these shells the relevant value of <envar>CYGWIN</envar> is that at the time -the shell was launched and not that at the time the program is executed. -Non-Cygwin shells always pipe and redirect with binary mode. With -non-Cygwin shells the commands <command> cat filename | program </command> -and <command> program < filename </command> are not equivalent when -<filename>filename</filename> is on a text-mounted partition. </para> -</listitem> -</OrderedList> -</sect2> - -<sect2><title>Example</title> -<para>To illustrate the various rules, we provide scripts to delete CRs -from files by using the <command>tr</command> program, which can only write -to standard output. -The script</para> -<screen> -#!/bin/sh -# Remove \r from the file given as argument -tr -d '\r' < "$1" > "$1".nocr -</screen> -<para> will not work on a text mounted systems because the \r will be -reintroduced on writing. However scripts such as </para> -<screen> -#!/bin/sh -# Remove \r from the file given as argument -tr -d '\r' | gzip | gunzip > "$1".nocr -</screen> -<para>and the .bat file</para> -<screen> -REM Remove \r from the file given as argument -@echo off -tr -d \r < %1 > %1.nocr -</screen> -<para> work fine. In the first case (assuming the pipes are binary) -we rely on <command>gunzip</command> to set its output to binary mode, -possibly overriding the mode used by the shell. -In the second case we rely on the DOS shell to redirect in binary mode. -</para> -</sect2> - -<sect2><title>Binary or text?</title> - -<para>UNIX programs that have been written for maximum portability -will know the difference between text and binary files and act -appropriately under Cygwin. For those programs, the text mode default -is a good choice. Programs included in official Cygnus distributions -should work well in the default mode. </para> - -<para>Text mode makes it much easier to mix files between Cygwin and -Windows programs, since Windows programs will usually use the CRLF -format. Unfortunately you may still have some problems with text -mode. First, some of the utilities included with Cygwin do not yet -specify binary mode when they should, e.g. <command>cat</command> will -not work with binary files (input will stop at ^Z, CRs will be -introduced in the output). Second, you will introduce CRs in text -files you write, which can cause problems when moving them back to a -UNIX system. </para> - -<para>If you are mounting a remote file system from a UNIX machine, -or moving files back and forth to a UNIX machine, you may want to -access the files in binary mode. The text files found there will normally -be in UNIX NL format, and you would want any files put there by Cygwin -programs to be stored in a format understood by UNIX. -Be sure to remove CRs from all Makefiles and -shell scripts and make sure that you only edit the files with -DOS/Windows editors that can cope with and preserve NL terminated lines. -</para> - -<para>Note that you can decide this on a disk by disk basis (for -example, mounting local disks in text mode and network disks in binary -mode). You can also partition a disk, for example by mounting -<filename>c:</filename> in text mode, and <filename>c:\home</filename> -in binary mode.</para> - -</sect2> - -<sect2><title>Programming</title> - -<para>In the <function>open()</function> function call, binary mode can be -specified with the flag <literal>O_BINARY</literal> and text mode with -<literal>O_TEXT</literal>. These symbols are defined in -<filename>fcntl.h</filename>.</para> - -<para>In the <function>fopen()</function> function call, binary mode can be -specified by adding a <literal>b</literal> to the mode string. There is no -direct way to specify text mode.</para> - -<para>The mode of a file can be changed by the call -<function>setmode(fd,mode)</function> where <literal>fd</literal> is a file -descriptor (an integer) and <literal>mode</literal> is -<literal>O_BINARY</literal> or <literal>O_TEXT</literal>. The function -returns <literal>O_BINARY</literal> or <literal>O_TEXT</literal> depending -on the mode before the call, and <literal>EOF</literal> on error.</para> - -</sect2> - -</sect1> |