Diffstat (limited to 'manual/llio.texi')
-rw-r--r-- | manual/llio.texi | 1979 |
1 files changed, 1979 insertions, 0 deletions
diff --git a/manual/llio.texi b/manual/llio.texi new file mode 100644 index 0000000..6a5a5d2 --- /dev/null +++ b/manual/llio.texi @@ -0,0 +1,1979 @@ +@node Low-Level I/O, File System Interface, I/O on Streams, Top +@chapter Low-Level Input/Output + +This chapter describes functions for performing low-level input/output +operations on file descriptors. These functions include the primitives +for the higher-level I/O functions described in @ref{I/O on Streams}, as +well as functions for performing low-level control operations for which +there are no equivalents on streams. + +Stream-level I/O is more flexible and usually more convenient; +therefore, programmers generally use the descriptor-level functions only +when necessary. These are some of the usual reasons: + +@itemize @bullet +@item +For reading binary files in large chunks. + +@item +For reading an entire file into core before parsing it. + +@item +To perform operations other than data transfer, which can only be done +with a descriptor. (You can use @code{fileno} to get the descriptor +corresponding to a stream.) + +@item +To pass descriptors to a child process. (The child can create its own +stream to use a descriptor that it inherits, but cannot inherit a stream +directly.) +@end itemize + +@menu +* Opening and Closing Files:: How to open and close file + descriptors. +* I/O Primitives:: Reading and writing data. +* File Position Primitive:: Setting a descriptor's file + position. +* Descriptors and Streams:: Converting descriptor to stream + or vice-versa. +* Stream/Descriptor Precautions:: Precautions needed if you use both + descriptors and streams. +* Waiting for I/O:: How to check for input or output + on multiple file descriptors. +* Control Operations:: Various other operations on file + descriptors. +* Duplicating Descriptors:: Fcntl commands for duplicating + file descriptors. +* Descriptor Flags:: Fcntl commands for manipulating + flags associated with file + descriptors. 
+* File Status Flags:: Fcntl commands for manipulating + flags associated with open files. +* File Locks:: Fcntl commands for implementing + file locking. +* Interrupt Input:: Getting an asynchronous signal when + input arrives. +@end menu + + +@node Opening and Closing Files +@section Opening and Closing Files + +@cindex opening a file descriptor +@cindex closing a file descriptor +This section describes the primitives for opening and closing files +using file descriptors. The @code{open} and @code{creat} functions are +declared in the header file @file{fcntl.h}, while @code{close} is +declared in @file{unistd.h}. +@pindex unistd.h +@pindex fcntl.h + +@comment fcntl.h +@comment POSIX.1 +@deftypefun int open (const char *@var{filename}, int @var{flags}[, mode_t @var{mode}]) +The @code{open} function creates and returns a new file descriptor +for the file named by @var{filename}. Initially, the file position +indicator for the file is at the beginning of the file. The argument +@var{mode} is used only when a file is created, but it doesn't hurt +to supply the argument in any case. + +The @var{flags} argument controls how the file is to be opened. This is +a bit mask; you create the value by the bitwise OR of the appropriate +parameters (using the @samp{|} operator in C). +@xref{File Status Flags}, for the parameters available. + +The normal return value from @code{open} is a non-negative integer file +descriptor. In the case of an error, a value of @code{-1} is returned +instead. In addition to the usual file name errors (@pxref{File +Name Errors}), the following @code{errno} error conditions are defined +for this function: + +@table @code +@item EACCES +The file exists but is not readable/writable as requested by the @var{flags} +argument, or the file does not exist and the directory is unwritable so +it cannot be created. + +@item EEXIST +Both @code{O_CREAT} and @code{O_EXCL} are set, and the named file already +exists. 
+ +@item EINTR +The @code{open} operation was interrupted by a signal. +@xref{Interrupted Primitives}. + +@item EISDIR +The @var{flags} argument specified write access, and the file is a directory. + +@item EMFILE +The process has too many files open. +The maximum number of file descriptors is controlled by the +@code{RLIMIT_NOFILE} resource limit; @pxref{Limits on Resources}. + +@item ENFILE +The entire system, or perhaps the file system which contains the +directory, cannot support any additional open files at the moment. +(This problem cannot happen on the GNU system.) + +@item ENOENT +The named file does not exist, and @code{O_CREAT} is not specified. + +@item ENOSPC +The directory or file system that would contain the new file cannot be +extended, because there is no disk space left. + +@item ENXIO +@code{O_NONBLOCK} and @code{O_WRONLY} are both set in the @var{flags} +argument, the file named by @var{filename} is a FIFO (@pxref{Pipes and +FIFOs}), and no process has the file open for reading. + +@item EROFS +The file resides on a read-only file system and any of @w{@code{O_WRONLY}}, +@code{O_RDWR}, and @code{O_TRUNC} are set in the @var{flags} argument, +or @code{O_CREAT} is set and the file does not already exist. +@end table + +@c !!! umask + +The @code{open} function is the underlying primitive for the @code{fopen} +and @code{freopen} functions, that create streams. +@end deftypefun + +@comment fcntl.h +@comment POSIX.1 +@deftypefn {Obsolete function} int creat (const char *@var{filename}, mode_t @var{mode}) +This function is obsolete. The call: + +@smallexample +creat (@var{filename}, @var{mode}) +@end smallexample + +@noindent +is equivalent to: + +@smallexample +open (@var{filename}, O_WRONLY | O_CREAT | O_TRUNC, @var{mode}) +@end smallexample +@end deftypefn + +@comment unistd.h +@comment POSIX.1 +@deftypefun int close (int @var{filedes}) +The function @code{close} closes the file descriptor @var{filedes}. 
+Closing a file has the following consequences: + +@itemize @bullet +@item +The file descriptor is deallocated. + +@item +Any record locks owned by the process on the file are unlocked. + +@item +When all file descriptors associated with a pipe or FIFO have been closed, +any unread data is discarded. +@end itemize + +The normal return value from @code{close} is @code{0}; a value of @code{-1} +is returned in case of failure. The following @code{errno} error +conditions are defined for this function: + +@table @code +@item EBADF +The @var{filedes} argument is not a valid file descriptor. + +@item EINTR +The @code{close} call was interrupted by a signal. +@xref{Interrupted Primitives}. +Here is an example of how to handle @code{EINTR} properly: + +@smallexample +TEMP_FAILURE_RETRY (close (desc)); +@end smallexample + +@item ENOSPC +@itemx EIO +@itemx EDQUOT +When the file is accessed by NFS, these errors from @code{write} can sometimes +not be detected until @code{close}. @xref{I/O Primitives}, for details +on their meaning. +@end table +@end deftypefun + +To close a stream, call @code{fclose} (@pxref{Closing Streams}) instead +of trying to close its underlying file descriptor with @code{close}. +This flushes any buffered output and updates the stream object to +indicate that it is closed. + +@node I/O Primitives +@section Input and Output Primitives + +This section describes the functions for performing primitive input and +output operations on file descriptors: @code{read}, @code{write}, and +@code{lseek}. These functions are declared in the header file +@file{unistd.h}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftp {Data Type} ssize_t +This data type is used to represent the sizes of blocks that can be +read or written in a single operation. It is similar to @code{size_t}, +but must be a signed type. 
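Because @code{ssize_t} is signed, a single variable can hold either a byte count or the @code{-1} error return of @code{read} and @code{write}, which are described later in this section. The following is only a sketch of that idiom: @code{count_bytes} is a hypothetical helper invented for this illustration, not a library function, and the 4096-byte buffer size is an arbitrary choice.

```c
#include <errno.h>
#include <unistd.h>

/* Read FD until end-of-file and return the total number of bytes
   seen, or -1 on error.  `count_bytes' is a hypothetical helper
   made up for this example.  */
long
count_bytes (int fd)
{
  char buf[4096];
  long total = 0;

  for (;;)
    {
      /* ssize_t is signed, so it can hold the -1 error return
         as well as a byte count.  */
      ssize_t n = read (fd, buf, sizeof buf);

      if (n > 0)
        total += n;
      else if (n == 0)
        return total;           /* End of file.  */
      else if (errno != EINTR)
        return -1;              /* Real error; errno says which.  */
      /* Otherwise a signal interrupted the call; try again.  */
    }
}
```

Calling @code{count_bytes (STDIN_FILENO)}, for instance, would consume standard input up to end-of-file and report how many bytes were read.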
+@end deftp + +@cindex reading from a file descriptor +@comment unistd.h +@comment POSIX.1 +@deftypefun ssize_t read (int @var{filedes}, void *@var{buffer}, size_t @var{size}) +The @code{read} function reads up to @var{size} bytes from the file +with descriptor @var{filedes}, storing the results in the @var{buffer}. +(This is not necessarily a character string and there is no terminating +null character added.) + +@cindex end-of-file, on a file descriptor +The return value is the number of bytes actually read. This might be +less than @var{size}; for example, if there aren't that many bytes left +in the file or if there aren't that many bytes immediately available. +The exact behavior depends on what kind of file it is. Note that +reading less than @var{size} bytes is not an error. + +A value of zero indicates end-of-file (except if the value of the +@var{size} argument is also zero). This is not considered an error. +If you keep calling @code{read} while at end-of-file, it will keep +returning zero and doing nothing else. + +If @code{read} returns at least one character, there is no way you can +tell whether end-of-file was reached. But if you did reach the end, the +next read will return zero. + +In case of an error, @code{read} returns @code{-1}. The following +@code{errno} error conditions are defined for this function: + +@table @code +@item EAGAIN +Normally, when no input is immediately available, @code{read} waits for +some input. But if the @code{O_NONBLOCK} flag is set for the file +(@pxref{File Status Flags}), @code{read} returns immediately without +reading any data, and reports this error. + +@strong{Compatibility Note:} Most versions of BSD Unix use a different +error code for this: @code{EWOULDBLOCK}. In the GNU library, +@code{EWOULDBLOCK} is an alias for @code{EAGAIN}, so it doesn't matter +which name you use. 
+ +On some systems, reading a large amount of data from a character special +file can also fail with @code{EAGAIN} if the kernel cannot find enough +physical memory to lock down the user's pages. This is limited to +devices that transfer with direct memory access into the user's memory, +which means it does not include terminals, since they always use +separate buffers inside the kernel. This problem never happens in the +GNU system. + +Any condition that could result in @code{EAGAIN} can instead result in a +successful @code{read} which returns fewer bytes than requested. +Calling @code{read} again immediately would result in @code{EAGAIN}. + +@item EBADF +The @var{filedes} argument is not a valid file descriptor, +or is not open for reading. + +@item EINTR +@code{read} was interrupted by a signal while it was waiting for input. +@xref{Interrupted Primitives}. A signal will not necessarily cause +@code{read} to return @code{EINTR}; it may instead result in a +successful @code{read} which returns fewer bytes than requested. + +@item EIO +For many devices, and for disk files, this error code indicates +a hardware error. + +@code{EIO} also occurs when a background process tries to read from the +controlling terminal, and the normal action of stopping the process by +sending it a @code{SIGTTIN} signal isn't working. This might happen if +the signal is being blocked or ignored, or because the process group is +orphaned. @xref{Job Control}, for more information about job control, +and @ref{Signal Handling}, for information about signals. +@end table + +The @code{read} function is the underlying primitive for all of the +functions that read from streams, such as @code{fgetc}. +@end deftypefun + +@cindex writing to a file descriptor +@comment unistd.h +@comment POSIX.1 +@deftypefun ssize_t write (int @var{filedes}, const void *@var{buffer}, size_t @var{size}) +The @code{write} function writes up to @var{size} bytes from +@var{buffer} to the file with descriptor @var{filedes}. 
The data in +@var{buffer} is not necessarily a character string and a null character is +output like any other character. + +The return value is the number of bytes actually written. This may be +@var{size}, but can always be smaller. Your program should always call +@code{write} in a loop, iterating until all the data is written. + +Once @code{write} returns, the data is enqueued to be written and can be +read back right away, but it is not necessarily written out to permanent +storage immediately. You can use @code{fsync} when you need to be sure +your data has been permanently stored before continuing. (It is more +efficient for the system to batch up consecutive writes and do them all +at once when convenient. Normally they will always be written to disk +within a minute or less.) +@c !!! xref fsync +You can use the @code{O_FSYNC} open mode to make @code{write} always +store the data to disk before returning; @pxref{Operating Modes}. + +In the case of an error, @code{write} returns @code{-1}. The following +@code{errno} error conditions are defined for this function: + +@table @code +@item EAGAIN +Normally, @code{write} blocks until the write operation is complete. +But if the @code{O_NONBLOCK} flag is set for the file (@pxref{Control +Operations}), it returns immediately without writing any data, and +reports this error. An example of a situation that might cause the +process to block on output is writing to a terminal device that supports +flow control, where output has been suspended by receipt of a STOP +character. + +@strong{Compatibility Note:} Most versions of BSD Unix use a different +error code for this: @code{EWOULDBLOCK}. In the GNU library, +@code{EWOULDBLOCK} is an alias for @code{EAGAIN}, so it doesn't matter +which name you use. + +On some systems, writing a large amount of data to a character special +file can also fail with @code{EAGAIN} if the kernel cannot find enough +physical memory to lock down the user's pages. 
This is limited to +devices that transfer with direct memory access into the user's memory, +which means it does not include terminals, since they always use +separate buffers inside the kernel. This problem does not arise in the +GNU system. + +@item EBADF +The @var{filedes} argument is not a valid file descriptor, +or is not open for writing. + +@item EFBIG +The size of the file would become larger than the implementation can support. + +@item EINTR +The @code{write} operation was interrupted by a signal while it was +blocked waiting for completion. A signal will not necessarily cause +@code{write} to return @code{EINTR}; it may instead result in a +successful @code{write} which writes fewer bytes than requested. +@xref{Interrupted Primitives}. + +@item EIO +For many devices, and for disk files, this error code indicates +a hardware error. + +@item ENOSPC +The device containing the file is full. + +@item EPIPE +This error is returned when you try to write to a pipe or FIFO that +isn't open for reading by any process. When this happens, a @code{SIGPIPE} +signal is also sent to the process; see @ref{Signal Handling}. +@end table + +Unless you have arranged to prevent @code{EINTR} failures, you should +check @code{errno} after each failing call to @code{write}, and if the +error was @code{EINTR}, you should simply repeat the call. +@xref{Interrupted Primitives}. The easy way to do this is with the +macro @code{TEMP_FAILURE_RETRY}, as follows: + +@smallexample +nbytes = TEMP_FAILURE_RETRY (write (desc, buffer, count)); +@end smallexample + +The @code{write} function is the underlying primitive for all of the +functions that write to streams, such as @code{fputc}. +@end deftypefun + +@node File Position Primitive +@section Setting the File Position of a Descriptor + +Just as you can set the file position of a stream with @code{fseek}, you +can set the file position of a descriptor with @code{lseek}. 
This +specifies the position in the file for the next @code{read} or +@code{write} operation. @xref{File Positioning}, for more information +on the file position and what it means. + +To read the current file position value from a descriptor, use +@code{lseek (@var{desc}, 0, SEEK_CUR)}. + +@cindex file positioning on a file descriptor +@cindex positioning a file descriptor +@cindex seeking on a file descriptor +@comment unistd.h +@comment POSIX.1 +@deftypefun off_t lseek (int @var{filedes}, off_t @var{offset}, int @var{whence}) +The @code{lseek} function is used to change the file position of the +file with descriptor @var{filedes}. + +The @var{whence} argument specifies how the @var{offset} should be +interpreted in the same way as for the @code{fseek} function, and must be +one of the symbolic constants @code{SEEK_SET}, @code{SEEK_CUR}, or +@code{SEEK_END}. + +@table @code +@item SEEK_SET +Specifies that @var{offset} is a count of characters from the beginning +of the file. + +@item SEEK_CUR +Specifies that @var{offset} is a count of characters from the current +file position. This count may be positive or negative. + +@item SEEK_END +Specifies that @var{offset} is a count of characters from the end of +the file. A negative count specifies a position within the current +extent of the file; a positive count specifies a position past the +current end. If you set the position past the current end, and +actually write data, you will extend the file with zeros up to that +position. +@end table + +The return value from @code{lseek} is normally the resulting file +position, measured in bytes from the beginning of the file. +You can use this feature together with @code{SEEK_CUR} to read the +current file position. + +If you want to append to the file, setting the file position to the +current end of file with @code{SEEK_END} is not sufficient. 
Another +process may write more data after you seek but before you write, +extending the file so the position you write onto clobbers their data. +Instead, use the @code{O_APPEND} operating mode; @pxref{Operating Modes}. + +You can set the file position past the current end of the file. This +does not by itself make the file longer; @code{lseek} never changes the +file. But subsequent output at that position will extend the file. +Characters between the previous end of file and the new position are +filled with zeros. Extending the file in this way can create a +``hole'': the blocks of zeros are not actually allocated on disk, so the +file takes up less space than it appears to; it is then called a +``sparse file''. +@cindex sparse files +@cindex holes in files + +If the file position cannot be changed, or the operation is in some way +invalid, @code{lseek} returns a value of @code{-1}. The following +@code{errno} error conditions are defined for this function: + +@table @code +@item EBADF +The @var{filedes} is not a valid file descriptor. + +@item EINVAL +The @var{whence} argument value is not valid, or the resulting +file offset is not valid. A file offset is invalid. + +@item ESPIPE +The @var{filedes} corresponds to an object that cannot be positioned, +such as a pipe, FIFO or terminal device. (POSIX.1 specifies this error +only for pipes and FIFOs, but in the GNU system, you always get +@code{ESPIPE} if the object is not seekable.) +@end table + +The @code{lseek} function is the underlying primitive for the +@code{fseek}, @code{ftell} and @code{rewind} functions, which operate on +streams instead of file descriptors. +@end deftypefun + +You can have multiple descriptors for the same file if you open the file +more than once, or if you duplicate a descriptor with @code{dup}. +Descriptors that come from separate calls to @code{open} have independent +file positions; using @code{lseek} on one descriptor has no effect on the +other. 
For example, + +@smallexample +@group +@{ + int d1, d2; + char buf[4]; + d1 = open ("foo", O_RDONLY); + d2 = open ("foo", O_RDONLY); + lseek (d1, 1024, SEEK_SET); + read (d2, buf, 4); +@} +@end group +@end smallexample + +@noindent +will read the first four characters of the file @file{foo}. (The +error-checking code necessary for a real program has been omitted here +for brevity.) + +By contrast, descriptors made by duplication share a common file +position with the original descriptor that was duplicated. Anything +which alters the file position of one of the duplicates, including +reading or writing data, affects all of them alike. Thus, for example, + +@smallexample +@{ + int d1, d2, d3; + char buf1[4], buf2[4]; + d1 = open ("foo", O_RDONLY); + d2 = dup (d1); + d3 = dup (d2); + lseek (d3, 1024, SEEK_SET); + read (d1, buf1, 4); + read (d2, buf2, 4); +@} +@end smallexample + +@noindent +will read four characters starting with the 1024'th character of +@file{foo}, and then four more characters starting with the 1028'th +character. + +@comment sys/types.h +@comment POSIX.1 +@deftp {Data Type} off_t +This is an arithmetic data type used to represent file sizes. +In the GNU system, this is equivalent to @code{fpos_t} or @code{long int}. +@end deftp + +These aliases for the @samp{SEEK_@dots{}} constants exist for the sake +of compatibility with older BSD systems. They are defined in two +different header files: @file{fcntl.h} and @file{sys/file.h}. + +@table @code +@item L_SET +An alias for @code{SEEK_SET}. + +@item L_INCR +An alias for @code{SEEK_CUR}. + +@item L_XTND +An alias for @code{SEEK_END}. +@end table + +@node Descriptors and Streams +@section Descriptors and Streams +@cindex streams, and file descriptors +@cindex converting file descriptor to stream +@cindex extracting file descriptor from stream + +Given an open file descriptor, you can create a stream for it with the +@code{fdopen} function. 
You can get the underlying file descriptor for +an existing stream with the @code{fileno} function. These functions are +declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment POSIX.1 +@deftypefun {FILE *} fdopen (int @var{filedes}, const char *@var{opentype}) +The @code{fdopen} function returns a new stream for the file descriptor +@var{filedes}. + +The @var{opentype} argument is interpreted in the same way as for the +@code{fopen} function (@pxref{Opening Streams}), except that +the @samp{b} option is not permitted; this is because GNU makes no +distinction between text and binary files. Also, @code{"w"} and +@code{"w+"} do not cause truncation of the file; these have an effect only +when opening a file, and in this case the file has already been opened. +You must make sure that the @var{opentype} argument matches the actual +mode of the open file descriptor. + +The return value is the new stream. If the stream cannot be created +(for example, if the modes for the file indicated by the file descriptor +do not permit the access specified by the @var{opentype} argument), a +null pointer is returned instead. + +In some other systems, @code{fdopen} may fail to detect that the modes +for the file descriptor do not permit the access specified by +@var{opentype}. The GNU C library always checks for this. +@end deftypefun + +For an example showing the use of the @code{fdopen} function, +see @ref{Creating a Pipe}. + +@comment stdio.h +@comment POSIX.1 +@deftypefun int fileno (FILE *@var{stream}) +This function returns the file descriptor associated with the stream +@var{stream}. If an error is detected (for example, if the @var{stream} +is not valid) or if @var{stream} does not do I/O to a file, +@code{fileno} returns @code{-1}. 
+@end deftypefun + +@cindex standard file descriptors +@cindex file descriptors, standard +There are also symbolic constants defined in @file{unistd.h} for the +file descriptors belonging to the standard streams @code{stdin}, +@code{stdout}, and @code{stderr}; see @ref{Standard Streams}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@table @code +@item STDIN_FILENO +@vindex STDIN_FILENO +This macro has value @code{0}, which is the file descriptor for +standard input. +@cindex standard input file descriptor + +@comment unistd.h +@comment POSIX.1 +@item STDOUT_FILENO +@vindex STDOUT_FILENO +This macro has value @code{1}, which is the file descriptor for +standard output. +@cindex standard output file descriptor + +@comment unistd.h +@comment POSIX.1 +@item STDERR_FILENO +@vindex STDERR_FILENO +This macro has value @code{2}, which is the file descriptor for +standard error output. +@end table +@cindex standard error file descriptor + +@node Stream/Descriptor Precautions +@section Dangers of Mixing Streams and Descriptors +@cindex channels +@cindex streams and descriptors +@cindex descriptors and streams +@cindex mixing descriptors and streams + +You can have multiple file descriptors and streams (let's call both +streams and descriptors ``channels'' for short) connected to the same +file, but you must take care to avoid confusion between channels. There +are two cases to consider: @dfn{linked} channels that share a single +file position value, and @dfn{independent} channels that have their own +file positions. + +It's best to use just one channel in your program for actual data +transfer to any given file, except when all the access is for input. +For example, if you open a pipe (something you can only do at the file +descriptor level), either do all I/O with the descriptor, or construct a +stream from the descriptor with @code{fdopen} and then do all I/O with +the stream. + +@menu +* Linked Channels:: Dealing with channels sharing a file position. 
+* Independent Channels:: Dealing with separately opened, unlinked channels. +* Cleaning Streams:: Cleaning a stream makes it safe to use + another channel. +@end menu + +@node Linked Channels +@subsection Linked Channels +@cindex linked channels + +Channels that come from a single opening share the same file position; +we call them @dfn{linked} channels. Linked channels result when you +make a stream from a descriptor using @code{fdopen}, when you get a +descriptor from a stream with @code{fileno}, when you copy a descriptor +with @code{dup} or @code{dup2}, and when descriptors are inherited +during @code{fork}. For files that don't support random access, such as +terminals and pipes, @emph{all} channels are effectively linked. On +random-access files, all append-type output streams are effectively +linked to each other. + +@cindex cleaning up a stream +If you have been using a stream for I/O, and you want to do I/O using +another channel (either a stream or a descriptor) that is linked to it, +you must first @dfn{clean up} the stream that you have been using. +@xref{Cleaning Streams}. + +Terminating a process, or executing a new program in the process, +destroys all the streams in the process. If descriptors linked to these +streams persist in other processes, their file positions become +undefined as a result. To prevent this, you must clean up the streams +before destroying them. + +@node Independent Channels +@subsection Independent Channels +@cindex independent channels + +When you open channels (streams or descriptors) separately on a seekable +file, each channel has its own file position. These are called +@dfn{independent channels}. + +The system handles each channel independently. Most of the time, this +is quite predictable and natural (especially for input): each channel +can read or write sequentially at its own place in the file. 
However, +if some of the channels are streams, you must take these precautions: + +@itemize @bullet +@item +You should clean an output stream after use, before doing anything else +that might read or write from the same part of the file. + +@item +You should clean an input stream before reading data that may have been +modified using an independent channel. Otherwise, you might read +obsolete data that had been in the stream's buffer. +@end itemize + +If you do output to one channel at the end of the file, this will +certainly leave the other independent channels positioned somewhere +before the new end. You cannot reliably set their file positions to the +new end of file before writing, because the file can always be extended +by another process between when you set the file position and when you +write the data. Instead, use an append-type descriptor or stream; they +always output at the current end of the file. In order to make the +end-of-file position accurate, you must clean the output channel you +were using, if it is a stream. + +It's impossible for two channels to have separate file pointers for a +file that doesn't support random access. Thus, channels for reading or +writing such files are always linked, never independent. Append-type +channels are also always linked. For these channels, follow the rules +for linked channels; see @ref{Linked Channels}. + +@node Cleaning Streams +@subsection Cleaning Streams + +On the GNU system, you can clean up any stream with @code{fclean}: + +@comment stdio.h +@comment GNU +@deftypefun int fclean (FILE *@var{stream}) +Clean up the stream @var{stream} so that its buffer is empty. If +@var{stream} is doing output, force it out. If @var{stream} is doing +input, give the data in the buffer back to the system, arranging to +reread it. +@end deftypefun + +On other systems, you can use @code{fflush} to clean a stream in most +cases. + +You can skip the @code{fclean} or @code{fflush} if you know the stream +is already clean. 
A stream is clean whenever its buffer is empty. For +example, an unbuffered stream is always clean. An input stream that is +at end-of-file is clean. A line-buffered stream is clean when the last +character output was a newline. + +There is one case in which cleaning a stream is impossible on most +systems. This is when the stream is doing input from a file that is not +random-access. Such streams typically read ahead, and when the file is +not random access, there is no way to give back the excess data already +read. When an input stream reads from a random-access file, +@code{fflush} does clean the stream, but leaves the file pointer at an +unpredictable place; you must set the file pointer before doing any +further I/O. On the GNU system, using @code{fclean} avoids both of +these problems. + +Closing an output-only stream also does @code{fflush}, so this is a +valid way of cleaning an output stream. On the GNU system, closing an +input stream does @code{fclean}. + +You need not clean a stream before using its descriptor for control +operations such as setting terminal modes; these operations don't affect +the file position and are not affected by it. You can use any +descriptor for these operations, and all channels are affected +simultaneously. However, text already ``output'' to a stream but still +buffered by the stream will be subject to the new terminal modes when +subsequently flushed. To make sure ``past'' output is covered by the +terminal settings that were in effect at the time, flush the output +streams for that terminal before setting the modes. @xref{Terminal +Modes}. + +@node Waiting for I/O +@section Waiting for Input or Output +@cindex waiting for input or output +@cindex multiplexing input +@cindex input from multiple files + +Sometimes a program needs to accept input on multiple input channels +whenever input arrives. 
For example, some workstations may have devices +such as a digitizing tablet, function button box, or dial box that are +connected via normal asynchronous serial interfaces; good user interface +style requires responding immediately to input on any device. Another +example is a program that acts as a server to several other processes +via pipes or sockets. + +You cannot normally use @code{read} for this purpose, because this +blocks the program until input is available on one particular file +descriptor; input on other channels won't wake it up. You could set +nonblocking mode and poll each file descriptor in turn, but this is very +inefficient. + +A better solution is to use the @code{select} function. This blocks the +program until input or output is ready on a specified set of file +descriptors, or until a timer expires, whichever comes first. This +facility is declared in the header file @file{sys/types.h}. +@pindex sys/types.h + +In the case of a server socket (@pxref{Listening}), we say that +``input'' is available when there are pending connections that could be +accepted (@pxref{Accepting Connections}). @code{accept} for server +sockets blocks and interacts with @code{select} just as @code{read} does +for normal input. + +@cindex file descriptor sets, for @code{select} +The file descriptor sets for the @code{select} function are specified +as @code{fd_set} objects. Here is the description of the data type +and some macros for manipulating these objects. + +@comment sys/types.h +@comment BSD +@deftp {Data Type} fd_set +The @code{fd_set} data type represents file descriptor sets for the +@code{select} function. It is actually a bit array. +@end deftp + +@comment sys/types.h +@comment BSD +@deftypevr Macro int FD_SETSIZE +The value of this macro is the maximum number of file descriptors that a +@code{fd_set} object can hold information about. On systems with a +fixed maximum number, @code{FD_SETSIZE} is at least that number. 
On +some systems, including GNU, there is no absolute limit on the number of +descriptors open, but this macro still has a constant value which +controls the number of bits in an @code{fd_set}; if you get a file +descriptor with a value as high as @code{FD_SETSIZE}, you cannot put +that descriptor into an @code{fd_set}. +@end deftypevr + +@comment sys/types.h +@comment BSD +@deftypefn Macro void FD_ZERO (fd_set *@var{set}) +This macro initializes the file descriptor set @var{set} to be the +empty set. +@end deftypefn + +@comment sys/types.h +@comment BSD +@deftypefn Macro void FD_SET (int @var{filedes}, fd_set *@var{set}) +This macro adds @var{filedes} to the file descriptor set @var{set}. +@end deftypefn + +@comment sys/types.h +@comment BSD +@deftypefn Macro void FD_CLR (int @var{filedes}, fd_set *@var{set}) +This macro removes @var{filedes} from the file descriptor set @var{set}. +@end deftypefn + +@comment sys/types.h +@comment BSD +@deftypefn Macro int FD_ISSET (int @var{filedes}, fd_set *@var{set}) +This macro returns a nonzero value (true) if @var{filedes} is a member +of the file descriptor set @var{set}, and zero (false) otherwise. +@end deftypefn + +Next, here is the description of the @code{select} function itself. + +@comment sys/types.h +@comment BSD +@deftypefun int select (int @var{nfds}, fd_set *@var{read-fds}, fd_set *@var{write-fds}, fd_set *@var{except-fds}, struct timeval *@var{timeout}) +The @code{select} function blocks the calling process until there is +activity on any of the specified sets of file descriptors, or until the +timeout period has expired. + +The file descriptors specified by the @var{read-fds} argument are +checked to see if they are ready for reading; the @var{write-fds} file +descriptors are checked to see if they are ready for writing; and the +@var{except-fds} file descriptors are checked for exceptional +conditions.
You can pass a null pointer for any of these arguments if +you are not interested in checking for that kind of condition. + +A file descriptor is considered ready for reading if a @code{read} call +on it would not block; this includes the case where it is at end of +file. A server socket is considered ready for reading if there is a +pending connection which can be accepted with @code{accept}; +@pxref{Accepting Connections}. A client socket is ready for writing when +its connection is fully established; @pxref{Connecting}. + +``Exceptional conditions'' does not mean errors---errors are reported +immediately when an erroneous system call is executed, and do not +constitute a state of the descriptor. Rather, they include conditions +such as the presence of an urgent message on a socket. (@xref{Sockets}, +for information on urgent messages.) + +The @code{select} function checks only the first @var{nfds} file +descriptors. The usual thing is to pass @code{FD_SETSIZE} as the value +of this argument. + +The @var{timeout} specifies the maximum time to wait. If you pass a +null pointer for this argument, it means to block indefinitely until one +of the file descriptors is ready. Otherwise, you should provide the +time in @code{struct timeval} format; see @ref{High-Resolution +Calendar}. Specify zero as the time (a @code{struct timeval} containing +all zeros) if you want to find out which descriptors are ready without +waiting if none are ready. + +The normal return value from @code{select} is the total number of ready file +descriptors in all of the sets. Each of the argument sets is overwritten +with information about the descriptors that are ready for the corresponding +operation. Thus, to see if a particular descriptor @var{desc} has input, +use @code{FD_ISSET (@var{desc}, @var{read-fds})} after @code{select} returns. + +If @code{select} returns because the timeout period expires, it returns +a value of zero. + +Any signal will cause @code{select} to return immediately.
So if your +program uses signals, you can't rely on @code{select} to keep waiting +for the full time specified. If you want to be sure of waiting for a +particular amount of time, you must check for @code{EINTR} and repeat +the @code{select} with a newly calculated timeout based on the current +time. See the example below. See also @ref{Interrupted Primitives}. + +If an error occurs, @code{select} returns @code{-1} and does not modify +the argument file descriptor sets. The following @code{errno} error +conditions are defined for this function: + +@table @code +@item EBADF +One of the file descriptor sets specified an invalid file descriptor. + +@item EINTR +The operation was interrupted by a signal. @xref{Interrupted Primitives}. + +@item EINVAL +The @var{timeout} argument is invalid; one of the components is negative +or too large. +@end table +@end deftypefun + +@strong{Portability Note:} The @code{select} function is a BSD Unix +feature. + +Here is an example showing how you can use @code{select} to establish a +timeout period for reading from a file descriptor. The @code{input_timeout} +function blocks the calling process until input is available on the +file descriptor, or until the timeout period expires. + +@smallexample +@include select.c.texi +@end smallexample + +There is another example showing the use of @code{select} to multiplex +input from multiple sockets in @ref{Server Example}. + + +@node Control Operations +@section Control Operations on Files + +@cindex control operations on files +@cindex @code{fcntl} function +This section describes how you can perform various other operations on +file descriptors, such as inquiring about or setting flags describing +the status of the file descriptor, manipulating record locks, and the +like. All of these operations are performed by the function @code{fcntl}. + +The second argument to the @code{fcntl} function is a command that +specifies which operation to perform. 
The function and macros that name +various flags that are used with it are declared in the header file +@file{fcntl.h}. Many of these flags are also used by the @code{open} +function; see @ref{Opening and Closing Files}. +@pindex fcntl.h + +@comment fcntl.h +@comment POSIX.1 +@deftypefun int fcntl (int @var{filedes}, int @var{command}, @dots{}) +The @code{fcntl} function performs the operation specified by +@var{command} on the file descriptor @var{filedes}. Some commands +require additional arguments to be supplied. These additional arguments +and the return value and error conditions are given in the detailed +descriptions of the individual commands. + +Briefly, here is a list of what the various commands are. + +@table @code +@item F_DUPFD +Duplicate the file descriptor (return another file descriptor pointing +to the same open file). @xref{Duplicating Descriptors}. + +@item F_GETFD +Get flags associated with the file descriptor. @xref{Descriptor Flags}. + +@item F_SETFD +Set flags associated with the file descriptor. @xref{Descriptor Flags}. + +@item F_GETFL +Get flags associated with the open file. @xref{File Status Flags}. + +@item F_SETFL +Set flags associated with the open file. @xref{File Status Flags}. + +@item F_GETLK +Get a file lock. @xref{File Locks}. + +@item F_SETLK +Set or clear a file lock. @xref{File Locks}. + +@item F_SETLKW +Like @code{F_SETLK}, but wait for completion. @xref{File Locks}. + +@item F_GETOWN +Get process or process group ID to receive @code{SIGIO} signals. +@xref{Interrupt Input}. + +@item F_SETOWN +Set process or process group ID to receive @code{SIGIO} signals. +@xref{Interrupt Input}. +@end table +@end deftypefun + + +@node Duplicating Descriptors +@section Duplicating Descriptors + +@cindex duplicating file descriptors +@cindex redirecting input and output + +You can @dfn{duplicate} a file descriptor, or allocate another file +descriptor that refers to the same open file as the original. 
Duplicate +descriptors share one file position and one set of file status flags +(@pxref{File Status Flags}), but each has its own set of file descriptor +flags (@pxref{Descriptor Flags}). + +The major use of duplicating a file descriptor is to implement +@dfn{redirection} of input or output: that is, to change the +file or pipe that a particular file descriptor corresponds to. + +You can perform this operation using the @code{fcntl} function with the +@code{F_DUPFD} command, but there are also convenient functions +@code{dup} and @code{dup2} for duplicating descriptors. + +@pindex unistd.h +@pindex fcntl.h +The @code{fcntl} function and flags are declared in @file{fcntl.h}, +while prototypes for @code{dup} and @code{dup2} are in the header file +@file{unistd.h}. + +@comment unistd.h +@comment POSIX.1 +@deftypefun int dup (int @var{old}) +This function copies descriptor @var{old} to the first available +descriptor number (the first number not currently open). It is +equivalent to @code{fcntl (@var{old}, F_DUPFD, 0)}. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int dup2 (int @var{old}, int @var{new}) +This function copies the descriptor @var{old} to descriptor number +@var{new}. + +If @var{old} is an invalid descriptor, then @code{dup2} does nothing; it +does not close @var{new}. Otherwise, the new duplicate of @var{old} +replaces any previous meaning of descriptor @var{new}, as if @var{new} +were closed first. + +If @var{old} and @var{new} are different numbers, and @var{old} is a +valid descriptor number, then @code{dup2} is equivalent to: + +@smallexample +close (@var{new}); +fcntl (@var{old}, F_DUPFD, @var{new}) +@end smallexample + +However, @code{dup2} does this atomically; there is no instant in the +middle of calling @code{dup2} at which @var{new} is closed and not yet a +duplicate of @var{old}. 
+@end deftypefun + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_DUPFD +This macro is used as the @var{command} argument to @code{fcntl}, to +copy the file descriptor given as the first argument. + +The form of the call in this case is: + +@smallexample +fcntl (@var{old}, F_DUPFD, @var{next-filedes}) +@end smallexample + +The @var{next-filedes} argument is of type @code{int} and specifies that +the file descriptor returned should be the next available one greater +than or equal to this value. + +The return value from @code{fcntl} with this command is normally the value +of the new file descriptor. A return value of @code{-1} indicates an +error. The following @code{errno} error conditions are defined for +this command: + +@table @code +@item EBADF +The @var{old} argument is invalid. + +@item EINVAL +The @var{next-filedes} argument is invalid. + +@item EMFILE +There are no more file descriptors available---your program is already +using the maximum. In BSD and GNU, the maximum is controlled by a +resource limit that can be changed; @pxref{Limits on Resources}, for +more information about the @code{RLIMIT_NOFILE} limit. +@end table + +@code{ENFILE} is not a possible error code for @code{dup2} because +@code{dup2} does not create a new opening of a file; duplicate +descriptors do not count toward the limit which @code{ENFILE} +indicates. @code{EMFILE} is possible because it refers to the limit on +distinct descriptor numbers in use in one process. +@end deftypevr + +Here is an example showing how to use @code{dup2} to do redirection. +Typically, redirection of the standard streams (like @code{stdin}) is +done by a shell or shell-like program before calling one of the +@code{exec} functions (@pxref{Executing a File}) to execute a new +program in a child process. When the new program is executed, it +creates and initializes the standard streams to point to the +corresponding file descriptors, before its @code{main} function is +invoked. 
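As a single-process illustration of the same mechanism, the following sketch points an open descriptor at a file with @code{dup2} and later puts the original meaning back. The names @code{redirect_to_file} and @code{restore_descriptor} are hypothetical helpers invented for this example, not library functions, and the file mode @code{0600} is an arbitrary choice; treat it as a sketch under those assumptions rather than a recommended interface.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make the open descriptor TARGET refer to the
   file named PATH, opened for writing.  Return a duplicate of the old
   TARGET, to be passed to restore_descriptor later, or -1 on error.  */
int
redirect_to_file (int target, const char *path)
{
  int saved, file;

  /* Duplicate first: this fails with EBADF if TARGET is not open,
     and it guarantees that the open below cannot return TARGET.  */
  saved = dup (target);
  if (saved < 0)
    return -1;

  file = open (path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
  if (file < 0)
    {
      close (saved);
      return -1;
    }

  /* Atomically replace the meaning of TARGET.  */
  if (dup2 (file, target) < 0)
    {
      close (file);
      close (saved);
      return -1;
    }

  /* TARGET now refers to the file; the extra descriptor is not needed.  */
  close (file);
  return saved;
}

/* Hypothetical helper: undo redirect_to_file.  Return 0 on success,
   or -1 on error.  SAVED is closed in either case.  */
int
restore_descriptor (int target, int saved)
{
  int result = dup2 (saved, target);
  close (saved);
  return result < 0 ? -1 : 0;
}
```

A caller would keep the value returned by @code{redirect_to_file}, write to @var{target} as usual, and then hand the saved descriptor to @code{restore_descriptor}. If @var{target} underlies a stream, the stream should be flushed before each switch, for the reasons given in the precautions on mixing streams and descriptors.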
+ +So, to redirect standard input to a file, the shell could do something +like: + +@smallexample +pid = fork (); +if (pid == 0) + @{ + char *filename; + char *program; + char **argv; + int file; + @dots{} + file = TEMP_FAILURE_RETRY (open (filename, O_RDONLY)); + /* @r{Give up in the child if the file cannot be opened.} */ + if (file < 0) + _exit (EXIT_FAILURE); + dup2 (file, STDIN_FILENO); + TEMP_FAILURE_RETRY (close (file)); + execv (program, argv); + /* @r{@code{execv} returns only if it failed.} */ + _exit (EXIT_FAILURE); + @} +@end smallexample + +There is also a more detailed example showing how to implement redirection +in the context of a pipeline of processes in @ref{Launching Jobs}. + + +@node Descriptor Flags +@section File Descriptor Flags +@cindex file descriptor flags + +@dfn{File descriptor flags} are miscellaneous attributes of a file +descriptor. These flags are associated with particular file +descriptors, so that if you have created duplicate file descriptors +from a single opening of a file, each descriptor has its own set of flags. + +Currently there is just one file descriptor flag: @code{FD_CLOEXEC}, +which causes the descriptor to be closed if you use any of the +@code{exec@dots{}} functions (@pxref{Executing a File}). + +The symbols in this section are defined in the header file +@file{fcntl.h}. +@pindex fcntl.h + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_GETFD +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should return the file descriptor flags associated +with the @var{filedes} argument. + +The normal return value from @code{fcntl} with this command is a +nonnegative number which can be interpreted as the bitwise OR of the +individual flags (except that currently there is only one flag to use). + +In case of an error, @code{fcntl} returns @code{-1}. The following +@code{errno} error conditions are defined for this command: + +@table @code +@item EBADF +The @var{filedes} argument is invalid.
+@end table +@end deftypevr + + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_SETFD +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should set the file descriptor flags associated with the +@var{filedes} argument. This requires a third @code{int} argument to +specify the new flags, so the form of the call is: + +@smallexample +fcntl (@var{filedes}, F_SETFD, @var{new-flags}) +@end smallexample + +The normal return value from @code{fcntl} with this command is an +unspecified value other than @code{-1}, which indicates an error. +The flags and error conditions are the same as for the @code{F_GETFD} +command. +@end deftypevr + +The following macro is defined for use as a file descriptor flag with +the @code{fcntl} function. The value is an integer constant usable +as a bit mask value. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int FD_CLOEXEC +@cindex close-on-exec (file descriptor flag) +This flag specifies that the file descriptor should be closed when +an @code{exec} function is invoked; see @ref{Executing a File}. When +a file descriptor is allocated (as with @code{open} or @code{dup}), +this bit is initially cleared on the new file descriptor, meaning that +descriptor will survive into the new program after @code{exec}. +@end deftypevr + +If you want to modify the file descriptor flags, you should get the +current flags with @code{F_GETFD} and modify the value. Don't assume +that the flags listed here are the only ones that are implemented; your +program may be run years from now and more flags may exist then. 
For +example, here is a function to set or clear the flag @code{FD_CLOEXEC} +without altering any other flags: + +@smallexample +/* @r{Set the @code{FD_CLOEXEC} flag of @var{desc} if @var{value} is nonzero,} + @r{or clear the flag if @var{value} is 0.} + @r{Return 0 on success, or -1 on error with @code{errno} set.} */ + +int +set_cloexec_flag (int desc, int value) +@{ + int oldflags = fcntl (desc, F_GETFD, 0); + /* @r{If reading the flags failed, return error indication now.} */ + if (oldflags < 0) + return oldflags; + /* @r{Set just the flag we want to set.} */ + if (value != 0) + oldflags |= FD_CLOEXEC; + else + oldflags &= ~FD_CLOEXEC; + /* @r{Store modified flag word in the descriptor.} */ + return fcntl (desc, F_SETFD, oldflags); +@} +@end smallexample + +@node File Status Flags +@section File Status Flags +@cindex file status flags + +@dfn{File status flags} are used to specify attributes of the opening of a +file. Unlike the file descriptor flags discussed in @ref{Descriptor +Flags}, the file status flags are shared by duplicated file descriptors +resulting from a single opening of the file. The file status flags are +specified with the @var{flags} argument to @code{open}; +@pxref{Opening and Closing Files}. + +File status flags fall into three categories, which are described in the +following sections. + +@itemize @bullet +@item +@ref{Access Modes}, specify what type of access is allowed to the +file: reading, writing, or both. They are set by @code{open} and are +returned by @code{fcntl}, but cannot be changed. + +@item +@ref{Open-time Flags}, control details of what @code{open} will do. +These flags are not preserved after the @code{open} call. + +@item +@ref{Operating Modes}, affect how operations such as @code{read} and +@code{write} are done. They are set by @code{open}, and can be fetched or +changed with @code{fcntl}. +@end itemize + +The symbols in this section are defined in the header file +@file{fcntl.h}.
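To make the categories above a little more concrete, here is a small sketch that picks apart a flag word of the kind returned by @code{fcntl} with the @code{F_GETFL} command. The function names @code{access_mode_name} and @code{is_append_mode} are hypothetical, chosen only for this illustration. The sketch relies on the @code{O_ACCMODE} mask and the @code{O_APPEND} operating mode, both described later in this section.

```c
#include <fcntl.h>

/* Hypothetical helper: return a string naming the access mode
   contained in FLAGS, a value as returned by
   fcntl (filedes, F_GETFL).  */
const char *
access_mode_name (int flags)
{
  /* The access mode is not a set of independent bits on most
     systems, so mask it out and compare the result as a whole.  */
  switch (flags & O_ACCMODE)
    {
    case O_RDONLY:
      return "read-only";
    case O_WRONLY:
      return "write-only";
    case O_RDWR:
      return "read-write";
    default:
      return "other";
    }
}

/* Hypothetical helper: return nonzero if the operating mode bit
   O_APPEND is set in FLAGS.  Operating modes are single bits, so
   a plain bitwise AND is enough.  */
int
is_append_mode (int flags)
{
  return (flags & O_APPEND) != 0;
}
```

The difference in how the two helpers test the flag word mirrors the difference between the categories: the access mode is one multi-valued field that is fixed when the file is opened, while each operating mode is a bit that can be fetched or changed independently.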
+@pindex fcntl.h + +@menu +* Access Modes:: Whether the descriptor can read or write. +* Open-time Flags:: Details of @code{open}. +* Operating Modes:: Special modes to control I/O operations. +* Getting File Status Flags:: Fetching and changing these flags. +@end menu + +@node Access Modes +@subsection File Access Modes + +The file access modes allow a file descriptor to be used for reading, +writing, or both. (In the GNU system, they can also allow none of these, +and allow execution of the file as a program.) The access modes are chosen +when the file is opened, and never change. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_RDONLY +Open the file for read access. +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_WRONLY +Open the file for write access. +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_RDWR +Open the file for both reading and writing. +@end deftypevr + +In the GNU system (and not in other systems), @code{O_RDONLY} and +@code{O_WRONLY} are independent bits that can be bitwise-ORed together, +and it is valid for either bit to be set or clear. This means that +@code{O_RDWR} is the same as @code{O_RDONLY|O_WRONLY}. A file access +mode of zero is permissible; it allows no operations that do input or +output to the file, but does allow other operations such as +@code{fchmod}. On the GNU system, since ``read-only'' or ``write-only'' +is a misnomer, @file{fcntl.h} defines additional names for the file +access modes. These names are preferred when writing GNU-specific code. +But most programs will want to be portable to other POSIX.1 systems and +should use the POSIX.1 names above instead. + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_READ +Open the file for reading. Same as @code{O_RDONLY}; only defined on GNU. +@end deftypevr + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_WRITE +Open the file for writing. Same as @code{O_WRONLY}; only defined on GNU.
+@end deftypevr + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_EXEC +Open the file for executing. Only defined on GNU. +@end deftypevr + +To determine the file access mode with @code{fcntl}, you must extract +the access mode bits from the retrieved file status flags. In the GNU +system, you can just test the @code{O_READ} and @code{O_WRITE} bits in +the flags word. But in other POSIX.1 systems, reading and writing +access modes are not stored as distinct bit flags. The portable way to +extract the file access mode bits is with @code{O_ACCMODE}. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_ACCMODE +This macro stands for a mask that can be bitwise-ANDed with the file +status flag value to produce a value representing the file access mode. +The mode will be @code{O_RDONLY}, @code{O_WRONLY}, or @code{O_RDWR}. +(In the GNU system it could also be zero, and it never includes the +@code{O_EXEC} bit.) +@end deftypevr + +@node Open-time Flags +@subsection Open-time Flags + +The open-time flags specify options affecting how @code{open} will behave. +These options are not preserved once the file is open. The exception to +this is @code{O_NONBLOCK}, which is also an I/O operating mode and so it +@emph{is} saved. @xref{Opening and Closing Files}, for how to call +@code{open}. + +There are two sorts of options specified by open-time flags. + +@itemize @bullet +@item +@dfn{File name translation flags} affect how @code{open} looks up the +file name to locate the file, and whether the file can be created. +@cindex file name translation flags +@cindex flags, file name translation + +@item +@dfn{Open-time action flags} specify extra operations that @code{open} will +perform on the file once it is open. +@cindex open-time action flags +@cindex flags, open-time action +@end itemize + +Here are the file name translation flags. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_CREAT +If set, the file will be created if it doesn't already exist. 
+@c !!! mode arg, umask +@cindex create on open (file status flag) +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_EXCL +If both @code{O_CREAT} and @code{O_EXCL} are set, then @code{open} fails +if the specified file already exists. This is guaranteed to never +clobber an existing file. +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_NONBLOCK +@cindex non-blocking open +This prevents @code{open} from blocking for a ``long time'' to open the +file. This is only meaningful for some kinds of files, usually devices +such as serial ports; when it is not meaningful, it is harmless and +ignored. Often opening a port to a modem blocks until the modem reports +carrier detection; if @code{O_NONBLOCK} is specified, @code{open} will +return immediately without a carrier. + +Note that the @code{O_NONBLOCK} flag is overloaded as both an I/O operating +mode and a file name translation flag. This means that specifying +@code{O_NONBLOCK} in @code{open} also sets nonblocking I/O mode; +@pxref{Operating Modes}. To open the file without blocking but do normal +I/O that blocks, you must call @code{open} with @code{O_NONBLOCK} set and +then call @code{fcntl} to turn the bit off. +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_NOCTTY +If the named file is a terminal device, don't make it the controlling +terminal for the process. @xref{Job Control}, for information about +what it means to be the controlling terminal. + +In the GNU system and 4.4 BSD, opening a file never makes it the +controlling terminal and @code{O_NOCTTY} is zero. However, other +systems may use a nonzero value for @code{O_NOCTTY} and set the +controlling terminal when you open a file that is a terminal device; so +to be portable, use @code{O_NOCTTY} when it is important to avoid this. +@cindex controlling terminal, setting +@end deftypevr + +The following three file name translation flags exist only in the GNU system. 
+ +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_IGNORE_CTTY +Do not recognize the named file as the controlling terminal, even if it +refers to the process's existing controlling terminal device. Operations +on the new file descriptor will never induce job control signals. +@xref{Job Control}. +@end deftypevr + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_NOLINK +If the named file is a symbolic link, open the link itself instead of +the file it refers to. (@code{fstat} on the new file descriptor will +return the information returned by @code{lstat} on the link's name.) +@cindex symbolic link, opening +@end deftypevr + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_NOTRANS +If the named file is specially translated, do not invoke the translator. +Open the bare file the translator itself sees. +@end deftypevr + + +The open-time action flags tell @code{open} to do additional operations +which are not really related to opening the file. The reason to do them +as part of @code{open} instead of in separate calls is that @code{open} +can do them @i{atomically}. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_TRUNC +Truncate the file to zero length. This option is only useful for +regular files, not special files such as directories or FIFOs. POSIX.1 +requires that you open the file for writing to use @code{O_TRUNC}. In +BSD and GNU you must have permission to write the file to truncate it, +but you need not open for write access. + +This is the only open-time action flag specified by POSIX.1. There is +no good reason for truncation to be done by @code{open}, instead of by +calling @code{ftruncate} afterwards. The @code{O_TRUNC} flag existed in +Unix before @code{ftruncate} was invented, and is retained for backward +compatibility. +@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_SHLOCK +Acquire a shared lock on the file, as with @code{flock}. +@xref{File Locks}. 
+ +If @code{O_CREAT} is specified, the locking is done atomically when +creating the file. You are guaranteed that no other process will get +the lock on the new file first. +@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_EXLOCK +Acquire an exclusive lock on the file, as with @code{flock}. +@xref{File Locks}. This is atomic like @code{O_SHLOCK}. +@end deftypevr + +@node Operating Modes +@subsection I/O Operating Modes + +The operating modes affect how input and output operations using a file +descriptor work. These flags are set by @code{open} and can be fetched +and changed with @code{fcntl}. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_APPEND +The bit that enables append mode for the file. If set, then all +@code{write} operations write the data at the end of the file, extending +it, regardless of the current file position. This is the only reliable +way to append to a file. In append mode, you are guaranteed that the +data you write will always go to the current end of the file, regardless +of other processes writing to the file. Conversely, if you simply set +the file position to the end of file and write, then another process can +extend the file after you set the file position but before you write, +resulting in your data appearing someplace before the real end of file. +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_NONBLOCK +The bit that enables nonblocking mode for the file. If this bit is set, +@code{read} requests on the file can return immediately with a failure +status if there is no input immediately available, instead of blocking. +Likewise, @code{write} requests can also return immediately with a +failure status if the output can't be written immediately. + +Note that the @code{O_NONBLOCK} flag is overloaded as both an I/O +operating mode and a file name translation flag; @pxref{Open-time Flags}.
+@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_NDELAY +This is an obsolete name for @code{O_NONBLOCK}, provided for +compatibility with BSD. It is not defined by the POSIX.1 standard. +@end deftypevr + +The remaining operating modes are BSD and GNU extensions. They exist only +on some systems. On other systems, these macros are not defined. + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_ASYNC +The bit that enables asynchronous input mode. If set, then @code{SIGIO} +signals will be generated when input is available. @xref{Interrupt Input}. + +Asynchronous input mode is a BSD feature. +@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_FSYNC +The bit that enables synchronous writing for the file. If set, each +@code{write} call will make sure the data is reliably stored on disk before +returning. @c !!! xref fsync + +Synchronous writing is a BSD feature. +@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_SYNC +This is another name for @code{O_FSYNC}. They have the same value. +@end deftypevr + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_NOATIME +If this bit is set, @code{read} will not update the access time of the +file. @xref{File Times}. This is used by programs that do backups, so +that backing a file up does not count as reading it. +Only the owner of the file or the superuser may use this bit. + +This is a GNU extension. +@end deftypevr + +@node Getting File Status Flags +@subsection Getting and Setting File Status Flags + +The @code{fcntl} function can fetch or change file status flags. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_GETFL +This macro is used as the @var{command} argument to @code{fcntl}, to +read the file status flags for the open file with descriptor +@var{filedes}. + +The normal return value from @code{fcntl} with this command is a +nonnegative number which can be interpreted as the bitwise OR of the +individual flags. 
Since the file access modes are not single-bit values, +you can mask off other bits in the returned flags with @code{O_ACCMODE} +to compare them. + +In case of an error, @code{fcntl} returns @code{-1}. The following +@code{errno} error conditions are defined for this command: + +@table @code +@item EBADF +The @var{filedes} argument is invalid. +@end table +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_SETFL +This macro is used as the @var{command} argument to @code{fcntl}, to set +the file status flags for the open file corresponding to the +@var{filedes} argument. This command requires a third @code{int} +argument to specify the new flags, so the call looks like this: + +@smallexample +fcntl (@var{filedes}, F_SETFL, @var{new-flags}) +@end smallexample + +You can't change the access mode for the file in this way; that is, +whether the file descriptor was opened for reading or writing. + +The normal return value from @code{fcntl} with this command is an +unspecified value other than @code{-1}, which indicates an error. The +error conditions are the same as for the @code{F_GETFL} command. +@end deftypevr + +If you want to modify the file status flags, you should get the current +flags with @code{F_GETFL} and modify the value. Don't assume that the +flags listed here are the only ones that are implemented; your program +may be run years from now and more flags may exist then. 
For example, +here is a function to set or clear the flag @code{O_NONBLOCK} without +altering any other flags: + +@smallexample +@group +/* @r{Set the @code{O_NONBLOCK} flag of @var{desc} if @var{value} is nonzero,} + @r{or clear the flag if @var{value} is 0.} + @r{Return 0 on success, or -1 on error with @code{errno} set.} */ + +int +set_nonblock_flag (int desc, int value) +@{ + int oldflags = fcntl (desc, F_GETFL, 0); + /* @r{If reading the flags failed, return error indication now.} */ + if (oldflags == -1) + return -1; + /* @r{Set just the flag we want to set.} */ + if (value != 0) + oldflags |= O_NONBLOCK; + else + oldflags &= ~O_NONBLOCK; + /* @r{Store modified flag word in the descriptor.} */ + return fcntl (desc, F_SETFL, oldflags); +@} +@end group +@end smallexample + +@node File Locks +@section File Locks + +@cindex file locks +@cindex record locking +The remaining @code{fcntl} commands are used to support @dfn{record +locking}, which permits multiple cooperating programs to prevent each +other from simultaneously accessing parts of a file in error-prone +ways. + +@cindex exclusive lock +@cindex write lock +An @dfn{exclusive} or @dfn{write} lock gives a process exclusive access +for writing to the specified part of the file. While a write lock is in +place, no other process can lock that part of the file. + +@cindex shared lock +@cindex read lock +A @dfn{shared} or @dfn{read} lock prohibits any other process from +requesting a write lock on the specified part of the file. However, +other processes can request read locks. + +The @code{read} and @code{write} functions do not actually check to see +whether there are any locks in place. If you want to implement a +locking protocol for a file shared by multiple processes, your application +must do explicit @code{fcntl} calls to request and clear locks at the +appropriate points. + +Locks are associated with processes. A process can only have one kind +of lock set for each byte of a given file. 
When any file descriptor for +that file is closed by the process, all of the locks that process holds +on that file are released, even if the locks were made using other +descriptors that remain open. Likewise, locks are released when a +process exits, and are not inherited by child processes created using +@code{fork} (@pxref{Creating a Process}). + +When making a lock, use a @code{struct flock} to specify what kind of +lock and where. This data type and the associated macros for the +@code{fcntl} function are declared in the header file @file{fcntl.h}. +@pindex fcntl.h + +@comment fcntl.h +@comment POSIX.1 +@deftp {Data Type} {struct flock} +This structure is used with the @code{fcntl} function to describe a file +lock. It has these members: + +@table @code +@item short int l_type +Specifies the type of the lock; one of @code{F_RDLCK}, @code{F_WRLCK}, or +@code{F_UNLCK}. + +@item short int l_whence +This corresponds to the @var{whence} argument to @code{fseek} or +@code{lseek}, and specifies what the offset is relative to. Its value +can be one of @code{SEEK_SET}, @code{SEEK_CUR}, or @code{SEEK_END}. + +@item off_t l_start +This specifies the offset of the start of the region to which the lock +applies, and is given in bytes relative to the point specified by the +@code{l_whence} member. + +@item off_t l_len +This specifies the length of the region to be locked. A value of +@code{0} is treated specially; it means the region extends to the end of +the file. + +@item pid_t l_pid +This field is the process ID (@pxref{Process Creation Concepts}) of the +process holding the lock. It is filled in by calling @code{fcntl} with +the @code{F_GETLK} command, but is ignored when making a lock. +@end table +@end deftp + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_GETLK +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should get information about a lock. 
This command +requires a third argument of type @w{@code{struct flock *}} to be passed +to @code{fcntl}, so that the form of the call is: + +@smallexample +fcntl (@var{filedes}, F_GETLK, @var{lockp}) +@end smallexample + +If there is a lock already in place that would block the lock described +by the @var{lockp} argument, information about that lock overwrites +@code{*@var{lockp}}. Existing locks are not reported if they are +compatible with making a new lock as specified. Thus, you should +specify a lock type of @code{F_WRLCK} if you want to find out about both +read and write locks, or @code{F_RDLCK} if you want to find out about +write locks only. + +There might be more than one lock affecting the region specified by the +@var{lockp} argument, but @code{fcntl} only returns information about +one of them. The @code{l_whence} member of the @var{lockp} structure is +set to @code{SEEK_SET} and the @code{l_start} and @code{l_len} fields +set to identify the locked region. + +If no lock applies, the only change to the @var{lockp} structure is to +update the @code{l_type} to a value of @code{F_UNLCK}. + +The normal return value from @code{fcntl} with this command is an +unspecified value other than @code{-1}, which is reserved to indicate an +error. The following @code{errno} error conditions are defined for +this command: + +@table @code +@item EBADF +The @var{filedes} argument is invalid. + +@item EINVAL +Either the @var{lockp} argument doesn't specify valid lock information, +or the file associated with @var{filedes} doesn't support locks. +@end table +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_SETLK +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should set or clear a lock. 
This command requires a +third argument of type @w{@code{struct flock *}} to be passed to +@code{fcntl}, so that the form of the call is: + +@smallexample +fcntl (@var{filedes}, F_SETLK, @var{lockp}) +@end smallexample + +If the process already has a lock on any part of the region, the old lock +on that part is replaced with the new lock. You can remove a lock +by specifying a lock type of @code{F_UNLCK}. + +If the lock cannot be set, @code{fcntl} returns immediately with a value +of @code{-1}. This function does not block waiting for other processes +to release locks. If @code{fcntl} succeeds, it returns a value other +than @code{-1}. + +The following @code{errno} error conditions are defined for this +command: + +@table @code +@item EAGAIN +@itemx EACCES +The lock cannot be set because it is blocked by an existing lock on the +file. Some systems use @code{EAGAIN} in this case, and other systems +use @code{EACCES}; your program should treat them alike, after +@code{F_SETLK}. (The GNU system always uses @code{EAGAIN}.) + +@item EBADF +Either: the @var{filedes} argument is invalid; you requested a read lock +but the @var{filedes} is not open for read access; or, you requested a +write lock but the @var{filedes} is not open for write access. + +@item EINVAL +Either the @var{lockp} argument doesn't specify valid lock information, +or the file associated with @var{filedes} doesn't support locks. + +@item ENOLCK +The system has run out of file lock resources; there are already too +many file locks in place. + +Well-designed file systems never report this error, because they have no +limitation on the number of locks. However, you must still take account +of the possibility of this error, as it could result from network access +to a file system on another machine. 
+@end table +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_SETLKW +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should set or clear a lock. It is just like the +@code{F_SETLK} command, but causes the process to block (or wait) +until the request can be satisfied. + +This command requires a third argument of type @code{struct flock *}, as +for the @code{F_SETLK} command. + +The @code{fcntl} return values and errors are the same as for the +@code{F_SETLK} command, but these additional @code{errno} error conditions +are defined for this command: + +@table @code +@item EINTR +The function was interrupted by a signal while it was waiting. +@xref{Interrupted Primitives}. + +@item EDEADLK +The specified region is being locked by another process. But that +process is waiting to lock a region which the current process has +locked, so waiting for the lock would result in deadlock. The system +does not guarantee that it will detect all such conditions, but it lets +you know if it notices one. +@end table +@end deftypevr + + +The following macros are defined for use as values for the @code{l_type} +member of the @code{flock} structure. The values are integer constants. + +@table @code +@comment fcntl.h +@comment POSIX.1 +@vindex F_RDLCK +@item F_RDLCK +This macro is used to specify a read (or shared) lock. + +@comment fcntl.h +@comment POSIX.1 +@vindex F_WRLCK +@item F_WRLCK +This macro is used to specify a write (or exclusive) lock. + +@comment fcntl.h +@comment POSIX.1 +@vindex F_UNLCK +@item F_UNLCK +This macro is used to specify that the region is unlocked. +@end table + +As an example of a situation where file locking is useful, consider a +program that can be run simultaneously by several different users, that +logs status information to a common file. One example of such a program +might be a game that uses a file to keep track of high scores.
Another +example might be a program that records usage or accounting information +for billing purposes. + +Having multiple copies of the program simultaneously writing to the +file could cause the contents of the file to become mixed up. But +you can prevent this kind of problem by setting a write lock on the +file before actually writing to the file. + +If the program also needs to read the file and wants to make sure that +the contents of the file are in a consistent state, then it can also use +a read lock. While the read lock is set, no other process can lock +that part of the file for writing. + +@c ??? This section could use an example program. + +Remember that file locks are only a @emph{voluntary} protocol for +controlling access to a file. There is still potential for access to +the file by programs that don't use the lock protocol. + +@node Interrupt Input +@section Interrupt-Driven Input + +@cindex interrupt-driven input +If you set the @code{O_ASYNC} status flag on a file descriptor +(@pxref{File Status Flags}), a @code{SIGIO} signal is sent whenever +input or output becomes possible on that file descriptor. The process +or process group to receive the signal can be selected by using the +@code{F_SETOWN} command to the @code{fcntl} function. If the file +descriptor is a socket, this also selects the recipient of @code{SIGURG} +signals that are delivered when out-of-band data arrives on that socket; +see @ref{Out-of-Band Data}. (@code{SIGURG} is sent in any situation +where @code{select} would report the socket as having an ``exceptional +condition''. @xref{Waiting for I/O}.) + +If the file descriptor corresponds to a terminal device, then @code{SIGIO} +signals are sent to the foreground process group of the terminal. +@xref{Job Control}. + +@pindex fcntl.h +The symbols in this section are defined in the header file +@file{fcntl.h}. 
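The setup described above can be sketched roughly as follows. This is only an illustration, assuming a system that provides both @code{O_ASYNC} and @code{F_SETOWN}; the function name @code{enable_sigio} is hypothetical and not part of the library. It first selects the calling process as the recipient of @code{SIGIO}, then turns on @code{O_ASYNC} while preserving the other file status flags. Installing a handler for @code{SIGIO} is left to the caller.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* Assumption: some systems need this to expose O_ASYNC.  */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not a library function.
   Ask for SIGIO to be sent to the calling process whenever input or
   output becomes possible on DESC.
   Return 0 on success, or -1 on error with errno set by fcntl.  */
int
enable_sigio (int desc)
{
  /* Select the calling process as the recipient of SIGIO.  */
  if (fcntl (desc, F_SETOWN, getpid ()) == -1)
    return -1;

  /* Read the current file status flags so the others are preserved.  */
  int flags = fcntl (desc, F_GETFL, 0);
  if (flags == -1)
    return -1;

  /* Add O_ASYNC to the existing flags and store the result.  */
  if (fcntl (desc, F_SETFL, flags | O_ASYNC) == -1)
    return -1;

  return 0;
}
```

A program would normally install its @code{SIGIO} handler before calling such a function, since the default action for the signal may terminate the process on some systems.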
+ +@comment fcntl.h +@comment BSD +@deftypevr Macro int F_GETOWN +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should get information about the process or process +group to which @code{SIGIO} signals are sent. (For a terminal, this is +actually the foreground process group ID, which you can get using +@code{tcgetpgrp}; see @ref{Terminal Access Functions}.) + +The return value is interpreted as a process ID; if negative, its +absolute value is the process group ID. + +The following @code{errno} error condition is defined for this command: + +@table @code +@item EBADF +The @var{filedes} argument is invalid. +@end table +@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int F_SETOWN +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should set the process or process group to which +@code{SIGIO} signals are sent. This command requires a third argument +of type @code{pid_t} to be passed to @code{fcntl}, so that the form of +the call is: + +@smallexample +fcntl (@var{filedes}, F_SETOWN, @var{pid}) +@end smallexample + +The @var{pid} argument should be a process ID. You can also pass a +negative number whose absolute value is a process group ID. + +The return value from @code{fcntl} with this command is @code{-1} +in case of error and some other value if successful. The following +@code{errno} error conditions are defined for this command: + +@table @code +@item EBADF +The @var{filedes} argument is invalid. + +@item ESRCH +There is no process or process group corresponding to @var{pid}. +@end table +@end deftypevr + +@c ??? This section could use an example program. |