From 28f540f45bbacd939bfd07f213bcad2bf730b1bf Mon Sep 17 00:00:00 2001 From: Roland McGrath <roland@gnu.org> Date: Sat, 18 Feb 1995 01:27:10 +0000 Subject: initial import --- manual/stdio.texi | 3635 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 3635 insertions(+) create mode 100644 manual/stdio.texi (limited to 'manual/stdio.texi') diff --git a/manual/stdio.texi b/manual/stdio.texi new file mode 100644 index 0000000..411d94a --- /dev/null +++ b/manual/stdio.texi @@ -0,0 +1,3635 @@ +@node I/O on Streams, Low-Level I/O, I/O Overview, Top +@chapter Input/Output on Streams + +This chapter describes the functions for creating streams and performing +input and output operations on them. As discussed in @ref{I/O +Overview}, a stream is a fairly abstract, high-level concept +representing a communications channel to a file, device, or process. + +@menu +* Streams:: About the data type representing a stream. +* Standard Streams:: Streams to the standard input and output + devices are created for you. +* Opening Streams:: How to create a stream to talk to a file. +* Closing Streams:: Close a stream when you are finished with it. +* Simple Output:: Unformatted output by characters and lines. +* Character Input:: Unformatted input by characters and words. +* Line Input:: Reading a line or a record from a stream. +* Unreading:: Peeking ahead/pushing back input just read. +* Block Input/Output:: Input and output operations on blocks of data. +* Formatted Output:: @code{printf} and related functions. +* Customizing Printf:: You can define new conversion specifiers for + @code{printf} and friends. +* Formatted Input:: @code{scanf} and related functions. +* EOF and Errors:: How you can tell if an I/O error happens. +* Binary Streams:: Some systems distinguish between text files + and binary files. +* File Positioning:: About random-access streams. +* Portable Positioning:: Random access on peculiar ANSI C systems. +* Stream Buffering:: How to control buffering of streams. +* Other Kinds of Streams:: Streams that do not necessarily correspond + to an open file. +@end menu + +@node Streams +@section Streams + +For historical reasons, the type of the C data structure that represents +a stream is called @code{FILE} rather than ``stream''. Since most of +the library functions deal with objects of type @code{FILE *}, sometimes +the term @dfn{file pointer} is also used to mean ``stream''. This leads +to unfortunate confusion over terminology in many books on C. This +manual, however, is careful to use the terms ``file'' and ``stream'' +only in the technical sense. +@cindex file pointer + +@pindex stdio.h +The @code{FILE} type is declared in the header file @file{stdio.h}. + +@comment stdio.h +@comment ANSI +@deftp {Data Type} FILE +This is the data type used to represent stream objects. A @code{FILE} +object holds all of the internal state information about the connection +to the associated file, including such things as the file position +indicator and buffering information. Each stream also has error and +end-of-file status indicators that can be tested with the @code{ferror} +and @code{feof} functions; see @ref{EOF and Errors}. +@end deftp + +@code{FILE} objects are allocated and managed internally by the +input/output library functions. Don't try to create your own objects of +type @code{FILE}; let the library do it. Your programs should +deal only with pointers to these objects (that is, @code{FILE *} values) +rather than the objects themselves. +@c !!! should say that FILE's have "No user-servicable parts inside." + +@node Standard Streams +@section Standard Streams +@cindex standard streams +@cindex streams, standard + +When the @code{main} function of your program is invoked, it already has +three predefined streams open and available for use. These represent +the ``standard'' input and output channels that have been established +for the process. + +These streams are declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypevar {FILE *} stdin +The @dfn{standard input} stream, which is the normal source of input for the +program. +@end deftypevar +@cindex standard input stream + +@comment stdio.h +@comment ANSI +@deftypevar {FILE *} stdout +The @dfn{standard output} stream, which is used for normal output from +the program. +@end deftypevar +@cindex standard output stream + +@comment stdio.h +@comment ANSI +@deftypevar {FILE *} stderr +The @dfn{standard error} stream, which is used for error messages and +diagnostics issued by the program. +@end deftypevar +@cindex standard error stream + +In the GNU system, you can specify what files or processes correspond to +these streams using the pipe and redirection facilities provided by the +shell. (The primitives shells use to implement these facilities are +described in @ref{File System Interface}.) Most other operating systems +provide similar mechanisms, but the details of how to use them can vary. + +In the GNU C library, @code{stdin}, @code{stdout}, and @code{stderr} are +normal variables which you can set just like any others. For example, to redirect +the standard output to a file, you could do: + +@smallexample +fclose (stdout); +stdout = fopen ("standard-output-file", "w"); +@end smallexample + +Note however, that in other systems @code{stdin}, @code{stdout}, and +@code{stderr} are macros that you cannot assign to in the normal way. +But you can use @code{freopen} to get the effect of closing one and +reopening it. @xref{Opening Streams}. + +@node Opening Streams +@section Opening Streams + +@cindex opening a stream +Opening a file with the @code{fopen} function creates a new stream and +establishes a connection between the stream and a file. This may +involve creating a new file. + +@pindex stdio.h +Everything described in this section is declared in the header file +@file{stdio.h}. + +@comment stdio.h +@comment ANSI +@deftypefun {FILE *} fopen (const char *@var{filename}, const char *@var{opentype}) +The @code{fopen} function opens a stream for I/O to the file +@var{filename}, and returns a pointer to the stream. + +The @var{opentype} argument is a string that controls how the file is +opened and specifies attributes of the resulting stream. It must begin +with one of the following sequences of characters: + +@table @samp +@item r +Open an existing file for reading only. + +@item w +Open the file for writing only. If the file already exists, it is +truncated to zero length. Otherwise a new file is created. + +@item a +Open a file for append access; that is, writing at the end of file only. +If the file already exists, its initial contents are unchanged and +output to the stream is appended to the end of the file. +Otherwise, a new, empty file is created. + +@item r+ +Open an existing file for both reading and writing. The initial contents +of the file are unchanged and the initial file position is at the +beginning of the file. + +@item w+ +Open a file for both reading and writing. If the file already exists, it +is truncated to zero length. Otherwise, a new file is created. + +@item a+ +Open or create file for both reading and appending. If the file exists, +its initial contents are unchanged. Otherwise, a new file is created. +The initial file position for reading is at the beginning of the file, +but output is always appended to the end of the file. +@end table + +As you can see, @samp{+} requests a stream that can do both input and +output. The ANSI standard says that when using such a stream, you must +call @code{fflush} (@pxref{Stream Buffering}) or a file positioning +function such as @code{fseek} (@pxref{File Positioning}) when switching +from reading to writing or vice versa. Otherwise, internal buffers +might not be emptied properly. The GNU C library does not have this +limitation; you can do arbitrary reading and writing operations on a +stream in whatever order. + +Additional characters may appear after these to specify flags for the +call. Always put the mode (@samp{r}, @samp{w+}, etc.) first; that is +the only part you are guaranteed will be understood by all systems. + +The GNU C library defines one additional character for use in +@var{opentype}: the character @samp{x} insists on creating a new +file---if a file @var{filename} already exists, @code{fopen} fails +rather than opening it. If you use @samp{x} you can are guaranteed that +you will not clobber an existing file. This is equivalent to the +@code{O_EXCL} option to the @code{open} function (@pxref{Opening and +Closing Files}). + +The character @samp{b} in @var{opentype} has a standard meaning; it +requests a binary stream rather than a text stream. But this makes no +difference in POSIX systems (including the GNU system). If both +@samp{+} and @samp{b} are specified, they can appear in either order. +@xref{Binary Streams}. + +Any other characters in @var{opentype} are simply ignored. They may be +meaningful in other systems. + +If the open fails, @code{fopen} returns a null pointer. +@end deftypefun + +You can have multiple streams (or file descriptors) pointing to the same +file open at the same time. If you do only input, this works +straightforwardly, but you must be careful if any output streams are +included. @xref{Stream/Descriptor Precautions}. This is equally true +whether the streams are in one program (not usual) or in several +programs (which can easily happen). It may be advantageous to use the +file locking facilities to avoid simultaneous access. @xref{File +Locks}. + +@comment stdio.h +@comment ANSI +@deftypevr Macro int FOPEN_MAX +The value of this macro is an integer constant expression that +represents the minimum number of streams that the implementation +guarantees can be open simultaneously. You might be able to open more +than this many streams, but that is not guaranteed. The value of this +constant is at least eight, which includes the three standard streams +@code{stdin}, @code{stdout}, and @code{stderr}. In POSIX.1 systems this +value is determined by the @code{OPEN_MAX} parameter; @pxref{General +Limits}. In BSD and GNU, it is controlled by the @code{RLIMIT_NOFILE} +resource limit; @pxref{Limits on Resources}. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypefun {FILE *} freopen (const char *@var{filename}, const char *@var{opentype}, FILE *@var{stream}) +This function is like a combination of @code{fclose} and @code{fopen}. +It first closes the stream referred to by @var{stream}, ignoring any +errors that are detected in the process. (Because errors are ignored, +you should not use @code{freopen} on an output stream if you have +actually done any output using the stream.) Then the file named by +@var{filename} is opened with mode @var{opentype} as for @code{fopen}, +and associated with the same stream object @var{stream}. + +If the operation fails, a null pointer is returned; otherwise, +@code{freopen} returns @var{stream}. + +@code{freopen} has traditionally been used to connect a standard stream +such as @code{stdin} with a file of your own choice. This is useful in +programs in which use of a standard stream for certain purposes is +hard-coded. In the GNU C library, you can simply close the standard +streams and open new ones with @code{fopen}. But other systems lack +this ability, so using @code{freopen} is more portable. +@end deftypefun + + +@node Closing Streams +@section Closing Streams + +@cindex closing a stream +When a stream is closed with @code{fclose}, the connection between the +stream and the file is cancelled. After you have closed a stream, you +cannot perform any additional operations on it. + +@comment stdio.h +@comment ANSI +@deftypefun int fclose (FILE *@var{stream}) +This function causes @var{stream} to be closed and the connection to +the corresponding file to be broken. Any buffered output is written +and any buffered input is discarded. The @code{fclose} function returns +a value of @code{0} if the file was closed successfully, and @code{EOF} +if an error was detected. + +It is important to check for errors when you call @code{fclose} to close +an output stream, because real, everyday errors can be detected at this +time. For example, when @code{fclose} writes the remaining buffered +output, it might get an error because the disk is full. Even if you +know the buffer is empty, errors can still occur when closing a file if +you are using NFS. + +The function @code{fclose} is declared in @file{stdio.h}. +@end deftypefun + +If the @code{main} function to your program returns, or if you call the +@code{exit} function (@pxref{Normal Termination}), all open streams are +automatically closed properly. If your program terminates in any other +manner, such as by calling the @code{abort} function (@pxref{Aborting a +Program}) or from a fatal signal (@pxref{Signal Handling}), open streams +might not be closed properly. Buffered output might not be flushed and +files may be incomplete. For more information on buffering of streams, +see @ref{Stream Buffering}. + +@node Simple Output +@section Simple Output by Characters or Lines + +@cindex writing to a stream, by characters +This section describes functions for performing character- and +line-oriented output. + +These functions are declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int fputc (int @var{c}, FILE *@var{stream}) +The @code{fputc} function converts the character @var{c} to type +@code{unsigned char}, and writes it to the stream @var{stream}. +@code{EOF} is returned if a write error occurs; otherwise the +character @var{c} is returned. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int putc (int @var{c}, FILE *@var{stream}) +This is just like @code{fputc}, except that most systems implement it as +a macro, making it faster. One consequence is that it may evaluate the +@var{stream} argument more than once, which is an exception to the +general rule for macros. @code{putc} is usually the best function to +use for writing a single character. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int putchar (int @var{c}) +The @code{putchar} function is equivalent to @code{putc} with +@code{stdout} as the value of the @var{stream} argument. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int fputs (const char *@var{s}, FILE *@var{stream}) +The function @code{fputs} writes the string @var{s} to the stream +@var{stream}. The terminating null character is not written. +This function does @emph{not} add a newline character, either. +It outputs only the characters in the string. + +This function returns @code{EOF} if a write error occurs, and otherwise +a non-negative value. + +For example: + +@smallexample +fputs ("Are ", stdout); +fputs ("you ", stdout); +fputs ("hungry?\n", stdout); +@end smallexample + +@noindent +outputs the text @samp{Are you hungry?} followed by a newline. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int puts (const char *@var{s}) +The @code{puts} function writes the string @var{s} to the stream +@code{stdout} followed by a newline. The terminating null character of +the string is not written. (Note that @code{fputs} does @emph{not} +write a newline as this function does.) + +@code{puts} is the most convenient function for printing simple +messages. For example: + +@smallexample +puts ("This is a message."); +@end smallexample +@end deftypefun + +@comment stdio.h +@comment SVID +@deftypefun int putw (int @var{w}, FILE *@var{stream}) +This function writes the word @var{w} (that is, an @code{int}) to +@var{stream}. It is provided for compatibility with SVID, but we +recommend you use @code{fwrite} instead (@pxref{Block Input/Output}). +@end deftypefun + +@node Character Input +@section Character Input + +@cindex reading from a stream, by characters +This section describes functions for performing character-oriented input. +These functions are declared in the header file @file{stdio.h}. +@pindex stdio.h + +These functions return an @code{int} value that is either a character of +input, or the special value @code{EOF} (usually -1). It is important to +store the result of these functions in a variable of type @code{int} +instead of @code{char}, even when you plan to use it only as a +character. Storing @code{EOF} in a @code{char} variable truncates its +value to the size of a character, so that it is no longer +distinguishable from the valid character @samp{(char) -1}. So always +use an @code{int} for the result of @code{getc} and friends, and check +for @code{EOF} after the call; once you've verified that the result is +not @code{EOF}, you can be sure that it will fit in a @samp{char} +variable without loss of information. + +@comment stdio.h +@comment ANSI +@deftypefun int fgetc (FILE *@var{stream}) +This function reads the next character as an @code{unsigned char} from +the stream @var{stream} and returns its value, converted to an +@code{int}. If an end-of-file condition or read error occurs, +@code{EOF} is returned instead. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int getc (FILE *@var{stream}) +This is just like @code{fgetc}, except that it is permissible (and +typical) for it to be implemented as a macro that evaluates the +@var{stream} argument more than once. @code{getc} is often highly +optimized, so it is usually the best function to use to read a single +character. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int getchar (void) +The @code{getchar} function is equivalent to @code{getc} with @code{stdin} +as the value of the @var{stream} argument. +@end deftypefun + +Here is an example of a function that does input using @code{fgetc}. It +would work just as well using @code{getc} instead, or using +@code{getchar ()} instead of @w{@code{fgetc (stdin)}}. + +@smallexample +int +y_or_n_p (const char *question) +@{ + fputs (question, stdout); + while (1) + @{ + int c, answer; + /* @r{Write a space to separate answer from question.} */ + fputc (' ', stdout); + /* @r{Read the first character of the line.} + @r{This should be the answer character, but might not be.} */ + c = tolower (fgetc (stdin)); + answer = c; + /* @r{Discard rest of input line.} */ + while (c != '\n' && c != EOF) + c = fgetc (stdin); + /* @r{Obey the answer if it was valid.} */ + if (answer == 'y') + return 1; + if (answer == 'n') + return 0; + /* @r{Answer was invalid: ask for valid answer.} */ + fputs ("Please answer y or n:", stdout); + @} +@} +@end smallexample + +@comment stdio.h +@comment SVID +@deftypefun int getw (FILE *@var{stream}) +This function reads a word (that is, an @code{int}) from @var{stream}. +It's provided for compatibility with SVID. We recommend you use +@code{fread} instead (@pxref{Block Input/Output}). Unlike @code{getc}, +any @code{int} value could be a valid result. @code{getw} returns +@code{EOF} when it encounters end-of-file or an error, but there is no +way to distinguish this from an input word with value -1. +@end deftypefun + +@node Line Input +@section Line-Oriented Input + +Since many programs interpret input on the basis of lines, it's +convenient to have functions to read a line of text from a stream. + +Standard C has functions to do this, but they aren't very safe: null +characters and even (for @code{gets}) long lines can confuse them. So +the GNU library provides the nonstandard @code{getline} function that +makes it easy to read lines reliably. + +Another GNU extension, @code{getdelim}, generalizes @code{getline}. It +reads a delimited record, defined as everything through the next +occurrence of a specified delimiter character. + +All these functions are declared in @file{stdio.h}. + +@comment stdio.h +@comment GNU +@deftypefun ssize_t getline (char **@var{lineptr}, size_t *@var{n}, FILE *@var{stream}) +This function reads an entire line from @var{stream}, storing the text +(including the newline and a terminating null character) in a buffer +and storing the buffer address in @code{*@var{lineptr}}. + +Before calling @code{getline}, you should place in @code{*@var{lineptr}} +the address of a buffer @code{*@var{n}} bytes long, allocated with +@code{malloc}. If this buffer is long enough to hold the line, +@code{getline} stores the line in this buffer. Otherwise, +@code{getline} makes the buffer bigger using @code{realloc}, storing the +new buffer address back in @code{*@var{lineptr}} and the increased size +back in @code{*@var{n}}. +@xref{Unconstrained Allocation}. + +If you set @code{*@var{lineptr}} to a null pointer, and @code{*@var{n}} +to zero, before the call, then @code{getline} allocates the initial +buffer for you by calling @code{malloc}. + +In either case, when @code{getline} returns, @code{*@var{lineptr}} is +a @code{char *} which points to the text of the line. + +When @code{getline} is successful, it returns the number of characters +read (including the newline, but not including the terminating null). +This value enables you to distinguish null characters that are part of +the line from the null character inserted as a terminator. + +This function is a GNU extension, but it is the recommended way to read +lines from a stream. The alternative standard functions are unreliable. + +If an error occurs or end of file is reached, @code{getline} returns +@code{-1}. +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun ssize_t getdelim (char **@var{lineptr}, size_t *@var{n}, int @var{delimiter}, FILE *@var{stream}) +This function is like @code{getline} except that the character which +tells it to stop reading is not necessarily newline. The argument +@var{delimiter} specifies the delimiter character; @code{getdelim} keeps +reading until it sees that character (or end of file). + +The text is stored in @var{lineptr}, including the delimiter character +and a terminating null. Like @code{getline}, @code{getdelim} makes +@var{lineptr} bigger if it isn't big enough. + +@code{getline} is in fact implemented in terms of @code{getdelim}, just +like this: + +@smallexample +ssize_t +getline (char **lineptr, size_t *n, FILE *stream) +@{ + return getdelim (lineptr, n, '\n', stream); +@} +@end smallexample +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun {char *} fgets (char *@var{s}, int @var{count}, FILE *@var{stream}) +The @code{fgets} function reads characters from the stream @var{stream} +up to and including a newline character and stores them in the string +@var{s}, adding a null character to mark the end of the string. You +must supply @var{count} characters worth of space in @var{s}, but the +number of characters read is at most @var{count} @minus{} 1. The extra +character space is used to hold the null character at the end of the +string. + +If the system is already at end of file when you call @code{fgets}, then +the contents of the array @var{s} are unchanged and a null pointer is +returned. A null pointer is also returned if a read error occurs. +Otherwise, the return value is the pointer @var{s}. + +@strong{Warning:} If the input data has a null character, you can't tell. +So don't use @code{fgets} unless you know the data cannot contain a null. +Don't use it to read files edited by the user because, if the user inserts +a null character, you should either handle it properly or print a clear +error message. We recommend using @code{getline} instead of @code{fgets}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefn {Deprecated function} {char *} gets (char *@var{s}) +The function @code{gets} reads characters from the stream @code{stdin} +up to the next newline character, and stores them in the string @var{s}. +The newline character is discarded (note that this differs from the +behavior of @code{fgets}, which copies the newline character into the +string). If @code{gets} encounters a read error or end-of-file, it +returns a null pointer; otherwise it returns @var{s}. + +@strong{Warning:} The @code{gets} function is @strong{very dangerous} +because it provides no protection against overflowing the string +@var{s}. The GNU library includes it for compatibility only. You +should @strong{always} use @code{fgets} or @code{getline} instead. To +remind you of this, the linker (if using GNU @code{ld}) will issue a +warning whenever you use @code{gets}. +@end deftypefn + +@node Unreading +@section Unreading +@cindex peeking at input +@cindex unreading characters +@cindex pushing input back + +In parser programs it is often useful to examine the next character in +the input stream without removing it from the stream. This is called +``peeking ahead'' at the input because your program gets a glimpse of +the input it will read next. + +Using stream I/O, you can peek ahead at input by first reading it and +then @dfn{unreading} it (also called @dfn{pushing it back} on the stream). +Unreading a character makes it available to be input again from the stream, +by the next call to @code{fgetc} or other input function on that stream. + +@menu +* Unreading Idea:: An explanation of unreading with pictures. +* How Unread:: How to call @code{ungetc} to do unreading. +@end menu + +@node Unreading Idea +@subsection What Unreading Means + +Here is a pictorial explanation of unreading. Suppose you have a +stream reading a file that contains just six characters, the letters +@samp{foobar}. Suppose you have read three characters so far. The +situation looks like this: + +@smallexample +f o o b a r + ^ +@end smallexample + +@noindent +so the next input character will be @samp{b}. + +@c @group Invalid outside @example +If instead of reading @samp{b} you unread the letter @samp{o}, you get a +situation like this: + +@smallexample +f o o b a r + | + o-- + ^ +@end smallexample + +@noindent +so that the next input characters will be @samp{o} and @samp{b}. +@c @end group + +@c @group +If you unread @samp{9} instead of @samp{o}, you get this situation: + +@smallexample +f o o b a r + | + 9-- + ^ +@end smallexample + +@noindent +so that the next input characters will be @samp{9} and @samp{b}. +@c @end group + +@node How Unread +@subsection Using @code{ungetc} To Do Unreading + +The function to unread a character is called @code{ungetc}, because it +reverses the action of @code{getc}. + +@comment stdio.h +@comment ANSI +@deftypefun int ungetc (int @var{c}, FILE *@var{stream}) +The @code{ungetc} function pushes back the character @var{c} onto the +input stream @var{stream}. So the next input from @var{stream} will +read @var{c} before anything else. + +If @var{c} is @code{EOF}, @code{ungetc} does nothing and just returns +@code{EOF}. This lets you call @code{ungetc} with the return value of +@code{getc} without needing to check for an error from @code{getc}. + +The character that you push back doesn't have to be the same as the last +character that was actually read from the stream. In fact, it isn't +necessary to actually read any characters from the stream before +unreading them with @code{ungetc}! But that is a strange way to write +a program; usually @code{ungetc} is used only to unread a character +that was just read from the same stream. + +The GNU C library only supports one character of pushback---in other +words, it does not work to call @code{ungetc} twice without doing input +in between. Other systems might let you push back multiple characters; +then reading from the stream retrieves the characters in the reverse +order that they were pushed. + +Pushing back characters doesn't alter the file; only the internal +buffering for the stream is affected. If a file positioning function +(such as @code{fseek} or @code{rewind}; @pxref{File Positioning}) is +called, any pending pushed-back characters are discarded. + +Unreading a character on a stream that is at end of file clears the +end-of-file indicator for the stream, because it makes the character of +input available. After you read that character, trying to read again +will encounter end of file. +@end deftypefun + +Here is an example showing the use of @code{getc} and @code{ungetc} to +skip over whitespace characters. When this function reaches a +non-whitespace character, it unreads that character to be seen again on +the next read operation on the stream. + +@smallexample +#include <stdio.h> +#include <ctype.h> + +void +skip_whitespace (FILE *stream) +@{ + int c; + do + /* @r{No need to check for @code{EOF} because it is not} + @r{@code{isspace}, and @code{ungetc} ignores @code{EOF}.} */ + c = getc (stream); + while (isspace (c)); + ungetc (c, stream); +@} +@end smallexample + +@node Block Input/Output +@section Block Input/Output + +This section describes how to do input and output operations on blocks +of data. You can use these functions to read and write binary data, as +well as to read and write text in fixed-size blocks instead of by +characters or lines. +@cindex binary I/O to a stream +@cindex block I/O to a stream +@cindex reading from a stream, by blocks +@cindex writing to a stream, by blocks + +Binary files are typically used to read and write blocks of data in the +same format as is used to represent the data in a running program. In +other words, arbitrary blocks of memory---not just character or string +objects---can be written to a binary file, and meaningfully read in +again by the same program. + +Storing data in binary form is often considerably more efficient than +using the formatted I/O functions. Also, for floating-point numbers, +the binary form avoids possible loss of precision in the conversion +process. On the other hand, binary files can't be examined or modified +easily using many standard file utilities (such as text editors), and +are not portable between different implementations of the language, or +different kinds of computers. + +These functions are declared in @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun size_t fread (void *@var{data}, size_t @var{size}, size_t @var{count}, FILE *@var{stream}) +This function reads up to @var{count} objects of size @var{size} into +the array @var{data}, from the stream @var{stream}. It returns the +number of objects actually read, which might be less than @var{count} if +a read error occurs or the end of the file is reached. This function +returns a value of zero (and doesn't read anything) if either @var{size} +or @var{count} is zero. + +If @code{fread} encounters end of file in the middle of an object, it +returns the number of complete objects read, and discards the partial +object. Therefore, the stream remains at the actual end of the file. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun size_t fwrite (const void *@var{data}, size_t @var{size}, size_t @var{count}, FILE *@var{stream}) +This function writes up to @var{count} objects of size @var{size} from +the array @var{data}, to the stream @var{stream}. The return value is +normally @var{count}, if the call succeeds. Any other value indicates +some sort of error, such as running out of space. +@end deftypefun + +@node Formatted Output +@section Formatted Output + +@cindex format string, for @code{printf} +@cindex template, for @code{printf} +@cindex formatted output to a stream +@cindex writing to a stream, formatted +The functions described in this section (@code{printf} and related +functions) provide a convenient way to perform formatted output. You +call @code{printf} with a @dfn{format string} or @dfn{template string} +that specifies how to format the values of the remaining arguments. + +Unless your program is a filter that specifically performs line- or +character-oriented processing, using @code{printf} or one of the other +related functions described in this section is usually the easiest and +most concise way to perform output. These functions are especially +useful for printing error messages, tables of data, and the like. + +@menu +* Formatted Output Basics:: Some examples to get you started. +* Output Conversion Syntax:: General syntax of conversion + specifications. +* Table of Output Conversions:: Summary of output conversions and + what they do. +* Integer Conversions:: Details about formatting of integers. +* Floating-Point Conversions:: Details about formatting of + floating-point numbers. +* Other Output Conversions:: Details about formatting of strings, + characters, pointers, and the like. +* Formatted Output Functions:: Descriptions of the actual functions. +* Dynamic Output:: Functions that allocate memory for the output. +* Variable Arguments Output:: @code{vprintf} and friends. +* Parsing a Template String:: What kinds of args does a given template + call for? +* Example of Parsing:: Sample program using @code{parse_printf_format}. +@end menu + +@node Formatted Output Basics +@subsection Formatted Output Basics + +The @code{printf} function can be used to print any number of arguments. +The template string argument you supply in a call provides +information not only about the number of additional arguments, but also +about their types and what style should be used for printing them. + +Ordinary characters in the template string are simply written to the +output stream as-is, while @dfn{conversion specifications} introduced by +a @samp{%} character in the template cause subsequent arguments to be +formatted and written to the output stream. For example, +@cindex conversion specifications (@code{printf}) + +@smallexample +int pct = 37; +char filename[] = "foo.txt"; +printf ("Processing of `%s' is %d%% finished.\nPlease be patient.\n", + filename, pct); +@end smallexample + +@noindent +produces output like + +@smallexample +Processing of `foo.txt' is 37% finished. +Please be patient. +@end smallexample + +This example shows the use of the @samp{%d} conversion to specify that +an @code{int} argument should be printed in decimal notation, the +@samp{%s} conversion to specify printing of a string argument, and +the @samp{%%} conversion to print a literal @samp{%} character. + +There are also conversions for printing an integer argument as an +unsigned value in octal, decimal, or hexadecimal radix (@samp{%o}, +@samp{%u}, or @samp{%x}, respectively); or as a character value +(@samp{%c}). + +Floating-point numbers can be printed in normal, fixed-point notation +using the @samp{%f} conversion or in exponential notation using the +@samp{%e} conversion. The @samp{%g} conversion uses either @samp{%e} +or @samp{%f} format, depending on what is more appropriate for the +magnitude of the particular number. + +You can control formatting more precisely by writing @dfn{modifiers} +between the @samp{%} and the character that indicates which conversion +to apply. These slightly alter the ordinary behavior of the conversion. +For example, most conversion specifications permit you to specify a +minimum field width and a flag indicating whether you want the result +left- or right-justified within the field. + +The specific flags and modifiers that are permitted and their +interpretation vary depending on the particular conversion. They're all +described in more detail in the following sections. Don't worry if this +all seems excessively complicated at first; you can almost always get +reasonable free-format output without using any of the modifiers at all. +The modifiers are mostly used to make the output look ``prettier'' in +tables. + +@node Output Conversion Syntax +@subsection Output Conversion Syntax + +This section provides details about the precise syntax of conversion +specifications that can appear in a @code{printf} template +string. + +Characters in the template string that are not part of a +conversion specification are printed as-is to the output stream. +Multibyte character sequences (@pxref{Extended Characters}) are permitted in +a template string. + +The conversion specifications in a @code{printf} template string have +the general form: + +@example +% @var{flags} @var{width} @r{[} . @var{precision} @r{]} @var{type} @var{conversion} +@end example + +For example, in the conversion specifier @samp{%-10.8ld}, the @samp{-} +is a flag, @samp{10} specifies the field width, the precision is +@samp{8}, the letter @samp{l} is a type modifier, and @samp{d} specifies +the conversion style. (This particular type specifier says to +print a @code{long int} argument in decimal notation, with a minimum of +8 digits left-justified in a field at least 10 characters wide.) + +In more detail, output conversion specifications consist of an +initial @samp{%} character followed in sequence by: + +@itemize @bullet +@item +Zero or more @dfn{flag characters} that modify the normal behavior of +the conversion specification. +@cindex flag character (@code{printf}) + +@item +An optional decimal integer specifying the @dfn{minimum field width}. +If the normal conversion produces fewer characters than this, the field +is padded with spaces to the specified width. This is a @emph{minimum} +value; if the normal conversion produces more characters than this, the +field is @emph{not} truncated. Normally, the output is right-justified +within the field. +@cindex minimum field width (@code{printf}) + +You can also specify a field width of @samp{*}. This means that the +next argument in the argument list (before the actual value to be +printed) is used as the field width. The value must be an @code{int}. +If the value is negative, this means to set the @samp{-} flag (see +below) and to use the absolute value as the field width. + +@item +An optional @dfn{precision} to specify the number of digits to be +written for the numeric conversions. If the precision is specified, it +consists of a period (@samp{.}) followed optionally by a decimal integer +(which defaults to zero if omitted). +@cindex precision (@code{printf}) + +You can also specify a precision of @samp{*}. This means that the next +argument in the argument list (before the actual value to be printed) is +used as the precision. The value must be an @code{int}, and is ignored +if it is negative. If you specify @samp{*} for both the field width and +precision, the field width argument precedes the precision argument. +Other C library versions may not recognize this syntax. + +@item +An optional @dfn{type modifier character}, which is used to specify the +data type of the corresponding argument if it differs from the default +type. (For example, the integer conversions assume a type of @code{int}, +but you can specify @samp{h}, @samp{l}, or @samp{L} for other integer +types.) +@cindex type modifier character (@code{printf}) + +@item +A character that specifies the conversion to be applied. +@end itemize + +The exact options that are permitted and how they are interpreted vary +between the different conversion specifiers. See the descriptions of the +individual conversions for information about the particular options that +they use. + +With the @samp{-Wformat} option, the GNU C compiler checks calls to +@code{printf} and related functions. It examines the format string and +verifies that the correct number and types of arguments are supplied. +There is also a GNU C syntax to tell the compiler that a function you +write uses a @code{printf}-style format string. +@xref{Function Attributes, , Declaring Attributes of Functions, +gcc.info, Using GNU CC}, for more information. + +@node Table of Output Conversions +@subsection Table of Output Conversions +@cindex output conversions, for @code{printf} + +Here is a table summarizing what all the different conversions do: + +@table @asis +@item @samp{%d}, @samp{%i} +Print an integer as a signed decimal number. @xref{Integer +Conversions}, for details. @samp{%d} and @samp{%i} are synonymous for +output, but are different when used with @code{scanf} for input +(@pxref{Table of Input Conversions}). + +@item @samp{%o} +Print an integer as an unsigned octal number. @xref{Integer +Conversions}, for details. + +@item @samp{%u} +Print an integer as an unsigned decimal number. @xref{Integer +Conversions}, for details. + +@item @samp{%x}, @samp{%X} +Print an integer as an unsigned hexadecimal number. @samp{%x} uses +lower-case letters and @samp{%X} uses upper-case. @xref{Integer +Conversions}, for details. + +@item @samp{%f} +Print a floating-point number in normal (fixed-point) notation. +@xref{Floating-Point Conversions}, for details. + +@item @samp{%e}, @samp{%E} +Print a floating-point number in exponential notation. @samp{%e} uses +lower-case letters and @samp{%E} uses upper-case. @xref{Floating-Point +Conversions}, for details. + +@item @samp{%g}, @samp{%G} +Print a floating-point number in either normal or exponential notation, +whichever is more appropriate for its magnitude. @samp{%g} uses +lower-case letters and @samp{%G} uses upper-case. @xref{Floating-Point +Conversions}, for details. + +@item @samp{%c} +Print a single character. @xref{Other Output Conversions}. + +@item @samp{%s} +Print a string. @xref{Other Output Conversions}. + +@item @samp{%p} +Print the value of a pointer. @xref{Other Output Conversions}. + +@item @samp{%n} +Get the number of characters printed so far. @xref{Other Output Conversions}. +Note that this conversion specification never produces any output. + +@item @samp{%m} +Print the string corresponding to the value of @code{errno}. +(This is a GNU extension.) +@xref{Other Output Conversions}. + +@item @samp{%%} +Print a literal @samp{%} character. @xref{Other Output Conversions}. +@end table + +If the syntax of a conversion specification is invalid, unpredictable +things will happen, so don't do this. If there aren't enough function +arguments provided to supply values for all the conversion +specifications in the template string, or if the arguments are not of +the correct types, the results are unpredictable. If you supply more +arguments than conversion specifications, the extra argument values are +simply ignored; this is sometimes useful. + +@node Integer Conversions +@subsection Integer Conversions + +This section describes the options for the @samp{%d}, @samp{%i}, +@samp{%o}, @samp{%u}, @samp{%x}, and @samp{%X} conversion +specifications. These conversions print integers in various formats. + +The @samp{%d} and @samp{%i} conversion specifications both print an +@code{int} argument as a signed decimal number; while @samp{%o}, +@samp{%u}, and @samp{%x} print the argument as an unsigned octal, +decimal, or hexadecimal number (respectively). The @samp{%X} conversion +specification is just like @samp{%x} except that it uses the characters +@samp{ABCDEF} as digits instead of @samp{abcdef}. + +The following flags are meaningful: + +@table @asis +@item @samp{-} +Left-justify the result in the field (instead of the normal +right-justification). + +@item @samp{+} +For the signed @samp{%d} and @samp{%i} conversions, print a +plus sign if the value is positive. + +@item @samp{ } +For the signed @samp{%d} and @samp{%i} conversions, if the result +doesn't start with a plus or minus sign, prefix it with a space +character instead. Since the @samp{+} flag ensures that the result +includes a sign, this flag is ignored if you supply both of them. + +@item @samp{#} +For the @samp{%o} conversion, this forces the leading digit to be +@samp{0}, as if by increasing the precision. For @samp{%x} or +@samp{%X}, this prefixes a leading @samp{0x} or @samp{0X} (respectively) +to the result. This doesn't do anything useful for the @samp{%d}, +@samp{%i}, or @samp{%u} conversions. Using this flag produces output +which can be parsed by the @code{strtoul} function (@pxref{Parsing of +Integers}) and @code{scanf} with the @samp{%i} conversion +(@pxref{Numeric Input Conversions}). + +@item @samp{'} +Separate the digits into groups as specified by the locale specified for +the @code{LC_NUMERIC} category; @pxref{General Numeric}. This flag is a +GNU extension. + +@item @samp{0} +Pad the field with zeros instead of spaces. The zeros are placed after +any indication of sign or base. This flag is ignored if the @samp{-} +flag is also specified, or if a precision is specified. +@end table + +If a precision is supplied, it specifies the minimum number of digits to +appear; leading zeros are produced if necessary. If you don't specify a +precision, the number is printed with as many digits as it needs. If +you convert a value of zero with an explicit precision of zero, then no +characters at all are produced. + +Without a type modifier, the corresponding argument is treated as an +@code{int} (for the signed conversions @samp{%i} and @samp{%d}) or +@code{unsigned int} (for the unsigned conversions @samp{%o}, @samp{%u}, +@samp{%x}, and @samp{%X}). Recall that since @code{printf} and friends +are variadic, any @code{char} and @code{short} arguments are +automatically converted to @code{int} by the default argument +promotions. For arguments of other integer types, you can use these +modifiers: + +@table @samp +@item h +Specifies that the argument is a @code{short int} or @code{unsigned +short int}, as appropriate. A @code{short} argument is converted to an +@code{int} or @code{unsigned int} by the default argument promotions +anyway, but the @samp{h} modifier says to convert it back to a +@code{short} again. + +@item l +Specifies that the argument is a @code{long int} or @code{unsigned long +int}, as appropriate. Two @samp{l} characters is like the @samp{L} +modifier, below. + +@item L +@itemx ll +@itemx q +Specifies that the argument is a @code{long long int}. (This type is +an extension supported by the GNU C compiler. On systems that don't +support extra-long integers, this is the same as @code{long int}.) + +The @samp{q} modifier is another name for the same thing, which comes +from 4.4 BSD; a @w{@code{long long int}} is sometimes called a ``quad'' +@code{int}. + +@item Z +Specifies that the argument is a @code{size_t}. This is a GNU extension. +@end table + +Here is an example. Using the template string: + +@smallexample +"|%5d|%-5d|%+5d|%+-5d|% 5d|%05d|%5.0d|%5.2d|%d|\n" +@end smallexample + +@noindent +to print numbers using the different options for the @samp{%d} +conversion gives results like: + +@smallexample +| 0|0 | +0|+0 | 0|00000| | 00|0| +| 1|1 | +1|+1 | 1|00001| 1| 01|1| +| -1|-1 | -1|-1 | -1|-0001| -1| -01|-1| +|100000|100000|+100000| 100000|100000|100000|100000|100000| +@end smallexample + +In particular, notice what happens in the last case where the number +is too large to fit in the minimum field width specified. + +Here are some more examples showing how unsigned integers print under +various format options, using the template string: + +@smallexample +"|%5u|%5o|%5x|%5X|%#5o|%#5x|%#5X|%#10.8x|\n" +@end smallexample + +@smallexample +| 0| 0| 0| 0| 0| 0x0| 0X0|0x00000000| +| 1| 1| 1| 1| 01| 0x1| 0X1|0x00000001| +|100000|303240|186a0|186A0|0303240|0x186a0|0X186A0|0x000186a0| +@end smallexample + + +@node Floating-Point Conversions +@subsection Floating-Point Conversions + +This section discusses the conversion specifications for floating-point +numbers: the @samp{%f}, @samp{%e}, @samp{%E}, @samp{%g}, and @samp{%G} +conversions. + +The @samp{%f} conversion prints its argument in fixed-point notation, +producing output of the form +@w{[@code{-}]@var{ddd}@code{.}@var{ddd}}, +where the number of digits following the decimal point is controlled +by the precision you specify. + +The @samp{%e} conversion prints its argument in exponential notation, +producing output of the form +@w{[@code{-}]@var{d}@code{.}@var{ddd}@code{e}[@code{+}|@code{-}]@var{dd}}. +Again, the number of digits following the decimal point is controlled by +the precision. The exponent always contains at least two digits. The +@samp{%E} conversion is similar but the exponent is marked with the letter +@samp{E} instead of @samp{e}. + +The @samp{%g} and @samp{%G} conversions print the argument in the style +of @samp{%e} or @samp{%E} (respectively) if the exponent would be less +than -4 or greater than or equal to the precision; otherwise they use the +@samp{%f} style. Trailing zeros are removed from the fractional portion +of the result and a decimal-point character appears only if it is +followed by a digit. + +The following flags can be used to modify the behavior: + +@comment We use @asis instead of @samp so we can have ` ' as an item. +@table @asis +@item @samp{-} +Left-justify the result in the field. Normally the result is +right-justified. + +@item @samp{+} +Always include a plus or minus sign in the result. + +@item @samp{ } +If the result doesn't start with a plus or minus sign, prefix it with a +space instead. Since the @samp{+} flag ensures that the result includes +a sign, this flag is ignored if you supply both of them. + +@item @samp{#} +Specifies that the result should always include a decimal point, even +if no digits follow it. For the @samp{%g} and @samp{%G} conversions, +this also forces trailing zeros after the decimal point to be left +in place where they would otherwise be removed. + +@item @samp{'} +Separate the digits of the integer part of the result into groups as +specified by the locale specified for the @code{LC_NUMERIC} category; +@pxref{General Numeric}. This flag is a GNU extension. + +@item @samp{0} +Pad the field with zeros instead of spaces; the zeros are placed +after any sign. This flag is ignored if the @samp{-} flag is also +specified. +@end table + +The precision specifies how many digits follow the decimal-point +character for the @samp{%f}, @samp{%e}, and @samp{%E} conversions. For +these conversions, the default precision is @code{6}. If the precision +is explicitly @code{0}, this suppresses the decimal point character +entirely. For the @samp{%g} and @samp{%G} conversions, the precision +specifies how many significant digits to print. Significant digits are +the first digit before the decimal point, and all the digits after it. +If the precision @code{0} or not specified for @samp{%g} or @samp{%G}, +it is treated like a value of @code{1}. If the value being printed +cannot be expressed accurately in the specified number of digits, the +value is rounded to the nearest number that fits. + +Without a type modifier, the floating-point conversions use an argument +of type @code{double}. (By the default argument promotions, any +@code{float} arguments are automatically converted to @code{double}.) +The following type modifier is supported: + +@table @samp +@item L +An uppercase @samp{L} specifies that the argument is a @code{long +double}. +@end table + +Here are some examples showing how numbers print using the various +floating-point conversions. All of the numbers were printed using +this template string: + +@smallexample +"|%12.4f|%12.4e|%12.4g|\n" +@end smallexample + +Here is the output: + +@smallexample +| 0.0000| 0.0000e+00| 0| +| 1.0000| 1.0000e+00| 1| +| -1.0000| -1.0000e+00| -1| +| 100.0000| 1.0000e+02| 100| +| 1000.0000| 1.0000e+03| 1000| +| 10000.0000| 1.0000e+04| 1e+04| +| 12345.0000| 1.2345e+04| 1.234e+04| +| 100000.0000| 1.0000e+05| 1e+05| +| 123456.0000| 1.2346e+05| 1.234e+05| +@end smallexample + +Notice how the @samp{%g} conversion drops trailing zeros. + +@node Other Output Conversions +@subsection Other Output Conversions + +This section describes miscellaneous conversions for @code{printf}. + +The @samp{%c} conversion prints a single character. The @code{int} +argument is first converted to an @code{unsigned char}. The @samp{-} +flag can be used to specify left-justification in the field, but no +other flags are defined, and no precision or type modifier can be given. +For example: + +@smallexample +printf ("%c%c%c%c%c", 'h', 'e', 'l', 'l', 'o'); +@end smallexample + +@noindent +prints @samp{hello}. + +The @samp{%s} conversion prints a string. The corresponding argument +must be of type @code{char *} (or @code{const char *}). A precision can +be specified to indicate the maximum number of characters to write; +otherwise characters in the string up to but not including the +terminating null character are written to the output stream. The +@samp{-} flag can be used to specify left-justification in the field, +but no other flags or type modifiers are defined for this conversion. +For example: + +@smallexample +printf ("%3s%-6s", "no", "where"); +@end smallexample + +@noindent +prints @samp{ nowhere }. + +If you accidentally pass a null pointer as the argument for a @samp{%s} +conversion, the GNU library prints it as @samp{(null)}. We think this +is more useful than crashing. But it's not good practice to pass a null +argument intentionally. + +The @samp{%m} conversion prints the string corresponding to the error +code in @code{errno}. @xref{Error Messages}. Thus: + +@smallexample +fprintf (stderr, "can't open `%s': %m\n", filename); +@end smallexample + +@noindent +is equivalent to: + +@smallexample +fprintf (stderr, "can't open `%s': %s\n", filename, strerror (errno)); +@end smallexample + +@noindent +The @samp{%m} conversion is a GNU C library extension. + +The @samp{%p} conversion prints a pointer value. The corresponding +argument must be of type @code{void *}. In practice, you can use any +type of pointer. + +In the GNU system, non-null pointers are printed as unsigned integers, +as if a @samp{%#x} conversion were used. Null pointers print as +@samp{(nil)}. (Pointers might print differently in other systems.) + +For example: + +@smallexample +printf ("%p", "testing"); +@end smallexample + +@noindent +prints @samp{0x} followed by a hexadecimal number---the address of the +string constant @code{"testing"}. It does not print the word +@samp{testing}. + +You can supply the @samp{-} flag with the @samp{%p} conversion to +specify left-justification, but no other flags, precision, or type +modifiers are defined. + +The @samp{%n} conversion is unlike any of the other output conversions. +It uses an argument which must be a pointer to an @code{int}, but +instead of printing anything it stores the number of characters printed +so far by this call at that location. The @samp{h} and @samp{l} type +modifiers are permitted to specify that the argument is of type +@code{short int *} or @code{long int *} instead of @code{int *}, but no +flags, field width, or precision are permitted. + +For example, + +@smallexample +int nchar; +printf ("%d %s%n\n", 3, "bears", &nchar); +@end smallexample + +@noindent +prints: + +@smallexample +3 bears +@end smallexample + +@noindent +and sets @code{nchar} to @code{7}, because @samp{3 bears} is seven +characters. + + +The @samp{%%} conversion prints a literal @samp{%} character. This +conversion doesn't use an argument, and no flags, field width, +precision, or type modifiers are permitted. + + +@node Formatted Output Functions +@subsection Formatted Output Functions + +This section describes how to call @code{printf} and related functions. +Prototypes for these functions are in the header file @file{stdio.h}. +Because these functions take a variable number of arguments, you +@emph{must} declare prototypes for them before using them. Of course, +the easiest way to make sure you have all the right prototypes is to +just include @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int printf (const char *@var{template}, @dots{}) +The @code{printf} function prints the optional arguments under the +control of the template string @var{template} to the stream +@code{stdout}. It returns the number of characters printed, or a +negative value if there was an output error. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int fprintf (FILE *@var{stream}, const char *@var{template}, @dots{}) +This function is just like @code{printf}, except that the output is +written to the stream @var{stream} instead of @code{stdout}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int sprintf (char *@var{s}, const char *@var{template}, @dots{}) +This is like @code{printf}, except that the output is stored in the character +array @var{s} instead of written to a stream. A null character is written +to mark the end of the string. + +The @code{sprintf} function returns the number of characters stored in +the array @var{s}, not including the terminating null character. + +The behavior of this function is undefined if copying takes place +between objects that overlap---for example, if @var{s} is also given +as an argument to be printed under control of the @samp{%s} conversion. +@xref{Copying and Concatenation}. + +@strong{Warning:} The @code{sprintf} function can be @strong{dangerous} +because it can potentially output more characters than can fit in the +allocation size of the string @var{s}. Remember that the field width +given in a conversion specification is only a @emph{minimum} value. + +To avoid this problem, you can use @code{snprintf} or @code{asprintf}, +described below. +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int snprintf (char *@var{s}, size_t @var{size}, const char *@var{template}, @dots{}) +The @code{snprintf} function is similar to @code{sprintf}, except that +the @var{size} argument specifies the maximum number of characters to +produce. The trailing null character is counted towards this limit, so +you should allocate at least @var{size} characters for the string @var{s}. + +The return value is the number of characters stored, not including the +terminating null. If this value equals @code{@var{size} - 1}, then +there was not enough space in @var{s} for all the output. You should +try again with a bigger output string. Here is an example of doing +this: + +@smallexample +@group +/* @r{Construct a message describing the value of a variable} + @r{whose name is @var{name} and whose value is @var{value}.} */ +char * +make_message (char *name, char *value) +@{ + /* @r{Guess we need no more than 100 chars of space.} */ + int size = 100; + char *buffer = (char *) xmalloc (size); +@end group +@group + while (1) + @{ + /* @r{Try to print in the allocated space.} */ + int nchars = snprintf (buffer, size, + "value of %s is %s", + name, value); + /* @r{If that worked, return the string.} */ + if (nchars < size) + return buffer; + /* @r{Else try again with twice as much space.} */ + size *= 2; + buffer = (char *) xrealloc (size, buffer); + @} +@} +@end group +@end smallexample + +In practice, it is often easier just to use @code{asprintf}, below. +@end deftypefun + +@node Dynamic Output +@subsection Dynamically Allocating Formatted Output + +The functions in this section do formatted output and place the results +in dynamically allocated memory. + +@comment stdio.h +@comment GNU +@deftypefun int asprintf (char **@var{ptr}, const char *@var{template}, @dots{}) +This function is similar to @code{sprintf}, except that it dynamically +allocates a string (as with @code{malloc}; @pxref{Unconstrained +Allocation}) to hold the output, instead of putting the output in a +buffer you allocate in advance. The @var{ptr} argument should be the +address of a @code{char *} object, and @code{asprintf} stores a pointer +to the newly allocated string at that location. + +Here is how to use @code{asprintf} to get the same result as the +@code{snprintf} example, but more easily: + +@smallexample +/* @r{Construct a message describing the value of a variable} + @r{whose name is @var{name} and whose value is @var{value}.} */ +char * +make_message (char *name, char *value) +@{ + char *result; + asprintf (&result, "value of %s is %s", name, value); + return result; +@} +@end smallexample +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int obstack_printf (struct obstack *@var{obstack}, const char *@var{template}, @dots{}) +This function is similar to @code{asprintf}, except that it uses the +obstack @var{obstack} to allocate the space. @xref{Obstacks}. + +The characters are written onto the end of the current object. +To get at them, you must finish the object with @code{obstack_finish} +(@pxref{Growing Objects}).@refill +@end deftypefun + +@node Variable Arguments Output +@subsection Variable Arguments Output Functions + +The functions @code{vprintf} and friends are provided so that you can +define your own variadic @code{printf}-like functions that make use of +the same internals as the built-in formatted output functions. + +The most natural way to define such functions would be to use a language +construct to say, ``Call @code{printf} and pass this template plus all +of my arguments after the first five.'' But there is no way to do this +in C, and it would be hard to provide a way, since at the C language +level there is no way to tell how many arguments your function received. + +Since that method is impossible, we provide alternative functions, the +@code{vprintf} series, which lets you pass a @code{va_list} to describe +``all of my arguments after the first five.'' + +When it is sufficient to define a macro rather than a real function, +the GNU C compiler provides a way to do this much more easily with macros. +For example: + +@smallexample +#define myprintf(a, b, c, d, e, rest...) printf (mytemplate , ## rest...) +@end smallexample + +@noindent +@xref{Macro Varargs, , Macros with Variable Numbers of Arguments, +gcc.info, Using GNU CC}, for details. But this is limited to macros, +and does not apply to real functions at all. + +Before calling @code{vprintf} or the other functions listed in this +section, you @emph{must} call @code{va_start} (@pxref{Variadic +Functions}) to initialize a pointer to the variable arguments. Then you +can call @code{va_arg} to fetch the arguments that you want to handle +yourself. This advances the pointer past those arguments. + +Once your @code{va_list} pointer is pointing at the argument of your +choice, you are ready to call @code{vprintf}. That argument and all +subsequent arguments that were passed to your function are used by +@code{vprintf} along with the template that you specified separately. + +In some other systems, the @code{va_list} pointer may become invalid +after the call to @code{vprintf}, so you must not use @code{va_arg} +after you call @code{vprintf}. Instead, you should call @code{va_end} +to retire the pointer from service. However, you can safely call +@code{va_start} on another pointer variable and begin fetching the +arguments again through that pointer. Calling @code{vprintf} does not +destroy the argument list of your function, merely the particular +pointer that you passed to it. + +GNU C does not have such restrictions. You can safely continue to fetch +arguments from a @code{va_list} pointer after passing it to +@code{vprintf}, and @code{va_end} is a no-op. (Note, however, that +subsequent @code{va_arg} calls will fetch the same arguments which +@code{vprintf} previously used.) + +Prototypes for these functions are declared in @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int vprintf (const char *@var{template}, va_list @var{ap}) +This function is similar to @code{printf} except that, instead of taking +a variable number of arguments directly, it takes an argument list +pointer @var{ap}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int vfprintf (FILE *@var{stream}, const char *@var{template}, va_list @var{ap}) +This is the equivalent of @code{fprintf} with the variable argument list +specified directly as for @code{vprintf}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int vsprintf (char *@var{s}, const char *@var{template}, va_list @var{ap}) +This is the equivalent of @code{sprintf} with the variable argument list +specified directly as for @code{vprintf}. +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int vsnprintf (char *@var{s}, size_t @var{size}, const char *@var{template}, va_list @var{ap}) +This is the equivalent of @code{snprintf} with the variable argument list +specified directly as for @code{vprintf}. +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int vasprintf (char **@var{ptr}, const char *@var{template}, va_list @var{ap}) +The @code{vasprintf} function is the equivalent of @code{asprintf} with the +variable argument list specified directly as for @code{vprintf}. +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int obstack_vprintf (struct obstack *@var{obstack}, const char *@var{template}, va_list @var{ap}) +The @code{obstack_vprintf} function is the equivalent of +@code{obstack_printf} with the variable argument list specified directly +as for @code{vprintf}.@refill +@end deftypefun + +Here's an example showing how you might use @code{vfprintf}. This is a +function that prints error messages to the stream @code{stderr}, along +with a prefix indicating the name of the program +(@pxref{Error Messages}, for a description of +@code{program_invocation_short_name}). + +@smallexample +@group +#include <stdio.h> +#include <stdarg.h> + +void +eprintf (const char *template, ...) +@{ + va_list ap; + extern char *program_invocation_short_name; + + fprintf (stderr, "%s: ", program_invocation_short_name); + va_start (ap, count); + vfprintf (stderr, template, ap); + va_end (ap); +@} +@end group +@end smallexample + +@noindent +You could call @code{eprintf} like this: + +@smallexample +eprintf ("file `%s' does not exist\n", filename); +@end smallexample + +In GNU C, there is a special construct you can use to let the compiler +know that a function uses a @code{printf}-style format string. Then it +can check the number and types of arguments in each call to the +function, and warn you when they do not match the format string. +For example, take this declaration of @code{eprintf}: + +@smallexample +void eprintf (const char *template, ...) + __attribute__ ((format (printf, 1, 2))); +@end smallexample + +@noindent +This tells the compiler that @code{eprintf} uses a format string like +@code{printf} (as opposed to @code{scanf}; @pxref{Formatted Input}); +the format string appears as the first argument; +and the arguments to satisfy the format begin with the second. +@xref{Function Attributes, , Declaring Attributes of Functions, +gcc.info, Using GNU CC}, for more information. + +@node Parsing a Template String +@subsection Parsing a Template String +@cindex parsing a template string + +You can use the function @code{parse_printf_format} to obtain +information about the number and types of arguments that are expected by +a given template string. This function permits interpreters that +provide interfaces to @code{printf} to avoid passing along invalid +arguments from the user's program, which could cause a crash. + +All the symbols described in this section are declared in the header +file @file{printf.h}. + +@comment printf.h +@comment GNU +@deftypefun size_t parse_printf_format (const char *@var{template}, size_t @var{n}, int *@var{argtypes}) +This function returns information about the number and types of +arguments expected by the @code{printf} template string @var{template}. +The information is stored in the array @var{argtypes}; each element of +this array describes one argument. This information is encoded using +the various @samp{PA_} macros, listed below. + +The @var{n} argument specifies the number of elements in the array +@var{argtypes}. This is the most elements that +@code{parse_printf_format} will try to write. + +@code{parse_printf_format} returns the total number of arguments required +by @var{template}. If this number is greater than @var{n}, then the +information returned describes only the first @var{n} arguments. If you +want information about more than that many arguments, allocate a bigger +array and call @code{parse_printf_format} again. +@end deftypefun + +The argument types are encoded as a combination of a basic type and +modifier flag bits. + +@comment printf.h +@comment GNU +@deftypevr Macro int PA_FLAG_MASK +This macro is a bitmask for the type modifier flag bits. You can write +the expression @code{(argtypes[i] & PA_FLAG_MASK)} to extract just the +flag bits for an argument, or @code{(argtypes[i] & ~PA_FLAG_MASK)} to +extract just the basic type code. +@end deftypevr + +Here are symbolic constants that represent the basic types; they stand +for integer values. + +@table @code +@comment printf.h +@comment GNU +@item PA_INT +@vindex PA_INT +This specifies that the base type is @code{int}. + +@comment printf.h +@comment GNU +@item PA_CHAR +@vindex PA_CHAR +This specifies that the base type is @code{int}, cast to @code{char}. + +@comment printf.h +@comment GNU +@item PA_STRING +@vindex PA_STRING +This specifies that the base type is @code{char *}, a null-terminated string. + +@comment printf.h +@comment GNU +@item PA_POINTER +@vindex PA_POINTER +This specifies that the base type is @code{void *}, an arbitrary pointer. + +@comment printf.h +@comment GNU +@item PA_FLOAT +@vindex PA_FLOAT +This specifies that the base type is @code{float}. + +@comment printf.h +@comment GNU +@item PA_DOUBLE +@vindex PA_DOUBLE +This specifies that the base type is @code{double}. + +@comment printf.h +@comment GNU +@item PA_LAST +@vindex PA_LAST +You can define additional base types for your own programs as offsets +from @code{PA_LAST}. For example, if you have data types @samp{foo} +and @samp{bar} with their own specialized @code{printf} conversions, +you could define encodings for these types as: + +@smallexample +#define PA_FOO PA_LAST +#define PA_BAR (PA_LAST + 1) +@end smallexample +@end table + +Here are the flag bits that modify a basic type. They are combined with +the code for the basic type using inclusive-or. + +@table @code +@comment printf.h +@comment GNU +@item PA_FLAG_PTR +@vindex PA_FLAG_PTR +If this bit is set, it indicates that the encoded type is a pointer to +the base type, rather than an immediate value. +For example, @samp{PA_INT|PA_FLAG_PTR} represents the type @samp{int *}. + +@comment printf.h +@comment GNU +@item PA_FLAG_SHORT +@vindex PA_FLAG_SHORT +If this bit is set, it indicates that the base type is modified with +@code{short}. (This corresponds to the @samp{h} type modifier.) + +@comment printf.h +@comment GNU +@item PA_FLAG_LONG +@vindex PA_FLAG_LONG +If this bit is set, it indicates that the base type is modified with +@code{long}. (This corresponds to the @samp{l} type modifier.) + +@comment printf.h +@comment GNU +@item PA_FLAG_LONG_LONG +@vindex PA_FLAG_LONG_LONG +If this bit is set, it indicates that the base type is modified with +@code{long long}. (This corresponds to the @samp{L} type modifier.) + +@comment printf.h +@comment GNU +@item PA_FLAG_LONG_DOUBLE +@vindex PA_FLAG_LONG_DOUBLE +This is a synonym for @code{PA_FLAG_LONG_LONG}, used by convention with +a base type of @code{PA_DOUBLE} to indicate a type of @code{long double}. +@end table + +@ifinfo +For an example of using these facilitles, see @ref{Example of Parsing}. +@end ifinfo + +@node Example of Parsing +@subsection Example of Parsing a Template String + +Here is an example of decoding argument types for a format string. We +assume this is part of an interpreter which contains arguments of type +@code{NUMBER}, @code{CHAR}, @code{STRING} and @code{STRUCTURE} (and +perhaps others which are not valid here). + +@smallexample +/* @r{Test whether the @var{nargs} specified objects} + @r{in the vector @var{args} are valid} + @r{for the format string @var{format}:} + @r{if so, return 1.} + @r{If not, return 0 after printing an error message.} */ + +int +validate_args (char *format, int nargs, OBJECT *args) +@{ + int *argtypes; + int nwanted; + + /* @r{Get the information about the arguments.} + @r{Each conversion specification must be at least two characters} + @r{long, so there cannot be more specifications than half the} + @r{length of the string.} */ + + argtypes = (int *) alloca (strlen (format) / 2 * sizeof (int)); + nwanted = parse_printf_format (string, nelts, argtypes); + + /* @r{Check the number of arguments.} */ + if (nwanted > nargs) + @{ + error ("too few arguments (at least %d required)", nwanted); + return 0; + @} + + /* @r{Check the C type wanted for each argument} + @r{and see if the object given is suitable.} */ + for (i = 0; i < nwanted; i++) + @{ + int wanted; + + if (argtypes[i] & PA_FLAG_PTR) + wanted = STRUCTURE; + else + switch (argtypes[i] & ~PA_FLAG_MASK) + @{ + case PA_INT: + case PA_FLOAT: + case PA_DOUBLE: + wanted = NUMBER; + break; + case PA_CHAR: + wanted = CHAR; + break; + case PA_STRING: + wanted = STRING; + break; + case PA_POINTER: + wanted = STRUCTURE; + break; + @} + if (TYPE (args[i]) != wanted) + @{ + error ("type mismatch for arg number %d", i); + return 0; + @} + @} + return 1; +@} +@end smallexample + +@node Customizing Printf +@section Customizing @code{printf} +@cindex customizing @code{printf} +@cindex defining new @code{printf} conversions +@cindex extending @code{printf} + +The GNU C library lets you define your own custom conversion specifiers +for @code{printf} template strings, to teach @code{printf} clever ways +to print the important data structures of your program. + +The way you do this is by registering the conversion with the function +@code{register_printf_function}; see @ref{Registering New Conversions}. +One of the arguments you pass to this function is a pointer to a handler +function that produces the actual output; see @ref{Defining the Output +Handler}, for information on how to write this function. + +You can also install a function that just returns information about the +number and type of arguments expected by the conversion specifier. +@xref{Parsing a Template String}, for information about this. + +The facilities of this section are declared in the header file +@file{printf.h}. + +@menu +* Registering New Conversions:: Using @code{register_printf_function} + to register a new output conversion. +* Conversion Specifier Options:: The handler must be able to get + the options specified in the + template when it is called. +* Defining the Output Handler:: Defining the handler and arginfo + functions that are passed as arguments + to @code{register_printf_function}. +* Printf Extension Example:: How to define a @code{printf} + handler function. +@end menu + +@strong{Portability Note:} The ability to extend the syntax of +@code{printf} template strings is a GNU extension. ANSI standard C has +nothing similar. + +@node Registering New Conversions +@subsection Registering New Conversions + +The function to register a new output conversion is +@code{register_printf_function}, declared in @file{printf.h}. +@pindex printf.h + +@comment printf.h +@comment GNU +@deftypefun int register_printf_function (int @var{spec}, printf_function @var{handler-function}, printf_arginfo_function @var{arginfo-function}) +This function defines the conversion specifier character @var{spec}. +Thus, if @var{spec} is @code{'z'}, it defines the conversion @samp{%z}. +You can redefine the built-in conversions like @samp{%s}, but flag +characters like @samp{#} and type modifiers like @samp{l} can never be +used as conversions; calling @code{register_printf_function} for those +characters has no effect. + +The @var{handler-function} is the function called by @code{printf} and +friends when this conversion appears in a template string. +@xref{Defining the Output Handler}, for information about how to define +a function to pass as this argument. If you specify a null pointer, any +existing handler function for @var{spec} is removed. + +The @var{arginfo-function} is the function called by +@code{parse_printf_format} when this conversion appears in a +template string. @xref{Parsing a Template String}, for information +about this. + +Normally, you install both functions for a conversion at the same time, +but if you are never going to call @code{parse_printf_format}, you do +not need to define an arginfo function. + +The return value is @code{0} on success, and @code{-1} on failure +(which occurs if @var{spec} is out of range). + +You can redefine the standard output conversions, but this is probably +not a good idea because of the potential for confusion. Library routines +written by other people could break if you do this. +@end deftypefun + +@node Conversion Specifier Options +@subsection Conversion Specifier Options + +If you define a meaning for @samp{%q}, what if the template contains +@samp{%+23q} or @samp{%-#q}? To implement a sensible meaning for these, +the handler when called needs to be able to get the options specified in +the template. + +Both the @var{handler-function} and @var{arginfo-function} arguments +to @code{register_printf_function} accept an argument that points to a +@code{struct printf_info}, which contains information about the options +appearing in an instance of the conversion specifier. This data type +is declared in the header file @file{printf.h}. +@pindex printf.h + +@comment printf.h +@comment GNU +@deftp {Type} {struct printf_info} +This structure is used to pass information about the options appearing +in an instance of a conversion specifier in a @code{printf} template +string to the handler and arginfo functions for that specifier. It +contains the following members: + +@table @code +@item int prec +This is the precision specified. The value is @code{-1} if no precision +was specified. If the precision was given as @samp{*}, the +@code{printf_info} structure passed to the handler function contains the +actual value retrieved from the argument list. But the structure passed +to the arginfo function contains a value of @code{INT_MIN}, since the +actual value is not known. + +@item int width +This is the minimum field width specified. The value is @code{0} if no +width was specified. If the field width was given as @samp{*}, the +@code{printf_info} structure passed to the handler function contains the +actual value retrieved from the argument list. But the structure passed +to the arginfo function contains a value of @code{INT_MIN}, since the +actual value is not known. + +@item char spec +This is the conversion specifier character specified. It's stored in +the structure so that you can register the same handler function for +multiple characters, but still have a way to tell them apart when the +handler function is called. + +@item unsigned int is_long_double +This is a boolean that is true if the @samp{L}, @samp{ll}, or @samp{q} +type modifier was specified. For integer conversions, this indicates +@code{long long int}, as opposed to @code{long double} for floating +point conversions. + +@item unsigned int is_short +This is a boolean that is true if the @samp{h} type modifier was specified. + +@item unsigned int is_long +This is a boolean that is true if the @samp{l} type modifier was specified. + +@item unsigned int alt +This is a boolean that is true if the @samp{#} flag was specified. + +@item unsigned int space +This is a boolean that is true if the @samp{ } flag was specified. + +@item unsigned int left +This is a boolean that is true if the @samp{-} flag was specified. + +@item unsigned int showsign +This is a boolean that is true if the @samp{+} flag was specified. + +@item unsigned int group +This is a boolean that is true if the @samp{'} flag was specified. + +@item char pad +This is the character to use for padding the output to the minimum field +width. The value is @code{'0'} if the @samp{0} flag was specified, and +@code{' '} otherwise. +@end table +@end deftp + + +@node Defining the Output Handler +@subsection Defining the Output Handler + +Now let's look at how to define the handler and arginfo functions +which are passed as arguments to @code{register_printf_function}. + +You should define your handler functions with a prototype like: + +@smallexample +int @var{function} (FILE *stream, const struct printf_info *info, + va_list *ap_pointer) +@end smallexample + +The @code{stream} argument passed to the handler function is the stream to +which it should write output. + +The @code{info} argument is a pointer to a structure that contains +information about the various options that were included with the +conversion in the template string. You should not modify this structure +inside your handler function. @xref{Conversion Specifier Options}, for +a description of this data structure. + +The @code{ap_pointer} argument is used to pass the tail of the variable +argument list containing the values to be printed to your handler. +Unlike most other functions that can be passed an explicit variable +argument list, this is a @emph{pointer} to a @code{va_list}, rather than +the @code{va_list} itself. Thus, you should fetch arguments by +means of @code{va_arg (@var{type}, *ap_pointer)}. + +(Passing a pointer here allows the function that calls your handler +function to update its own @code{va_list} variable to account for the +arguments that your handler processes. @xref{Variadic Functions}.) + +Your handler function should return a value just like @code{printf} +does: it should return the number of characters it has written, or a +negative value to indicate an error. + +@comment printf.h +@comment GNU +@deftp {Data Type} printf_function +This is the data type that a handler function should have. +@end deftp + +If you are going to use @w{@code{parse_printf_format}} in your +application, you should also define a function to pass as the +@var{arginfo-function} argument for each new conversion you install with +@code{register_printf_function}. + +You should define these functions with a prototype like: + +@smallexample +int @var{function} (const struct printf_info *info, + size_t n, int *argtypes) +@end smallexample + +The return value from the function should be the number of arguments the +conversion expects. The function should also fill in no more than +@var{n} elements of the @var{argtypes} array with information about the +types of each of these arguments. This information is encoded using the +various @samp{PA_} macros. (You will notice that this is the same +calling convention @code{parse_printf_format} itself uses.) + +@comment printf.h +@comment GNU +@deftp {Data Type} printf_arginfo_function +This type is used to describe functions that return information about +the number and type of arguments used by a conversion specifier. +@end deftp + +@node Printf Extension Example +@subsection @code{printf} Extension Example + +Here is an example showing how to define a @code{printf} handler function. +This program defines a data structure called a @code{Widget} and +defines the @samp{%W} conversion to print information about @w{@code{Widget *}} +arguments, including the pointer value and the name stored in the data +structure. The @samp{%W} conversion supports the minimum field width and +left-justification options, but ignores everything else. + +@smallexample +@include rprintf.c.texi +@end smallexample + +The output produced by this program looks like: + +@smallexample +|<Widget 0xffeffb7c: mywidget>| +| <Widget 0xffeffb7c: mywidget>| +|<Widget 0xffeffb7c: mywidget> | +@end smallexample + +@node Formatted Input +@section Formatted Input + +@cindex formatted input from a stream +@cindex reading from a stream, formatted +@cindex format string, for @code{scanf} +@cindex template, for @code{scanf} +The functions described in this section (@code{scanf} and related +functions) provide facilities for formatted input analogous to the +formatted output facilities. These functions provide a mechanism for +reading arbitrary values under the control of a @dfn{format string} or +@dfn{template string}. + +@menu +* Formatted Input Basics:: Some basics to get you started. +* Input Conversion Syntax:: Syntax of conversion specifications. +* Table of Input Conversions:: Summary of input conversions and what they do. +* Numeric Input Conversions:: Details of conversions for reading numbers. +* String Input Conversions:: Details of conversions for reading strings. +* Dynamic String Input:: String conversions that @code{malloc} the buffer. +* Other Input Conversions:: Details of miscellaneous other conversions. +* Formatted Input Functions:: Descriptions of the actual functions. +* Variable Arguments Input:: @code{vscanf} and friends. +@end menu + +@node Formatted Input Basics +@subsection Formatted Input Basics + +Calls to @code{scanf} are superficially similar to calls to +@code{printf} in that arbitrary arguments are read under the control of +a template string. While the syntax of the conversion specifications in +the template is very similar to that for @code{printf}, the +interpretation of the template is oriented more towards free-format +input and simple pattern matching, rather than fixed-field formatting. +For example, most @code{scanf} conversions skip over any amount of +``white space'' (including spaces, tabs, and newlines) in the input +file, and there is no concept of precision for the numeric input +conversions as there is for the corresponding output conversions. +Ordinarily, non-whitespace characters in the template are expected to +match characters in the input stream exactly, but a matching failure is +distinct from an input error on the stream. +@cindex conversion specifications (@code{scanf}) + +Another area of difference between @code{scanf} and @code{printf} is +that you must remember to supply pointers rather than immediate values +as the optional arguments to @code{scanf}; the values that are read are +stored in the objects that the pointers point to. Even experienced +programmers tend to forget this occasionally, so if your program is +getting strange errors that seem to be related to @code{scanf}, you +might want to double-check this. + +When a @dfn{matching failure} occurs, @code{scanf} returns immediately, +leaving the first non-matching character as the next character to be +read from the stream. The normal return value from @code{scanf} is the +number of values that were assigned, so you can use this to determine if +a matching error happened before all the expected values were read. +@cindex matching failure, in @code{scanf} + +The @code{scanf} function is typically used for things like reading in +the contents of tables. For example, here is a function that uses +@code{scanf} to initialize an array of @code{double}: + +@smallexample +void +readarray (double *array, int n) +@{ + int i; + for (i=0; i<n; i++) + if (scanf (" %lf", &(array[i])) != 1) + invalid_input_error (); +@} +@end smallexample + +The formatted input functions are not used as frequently as the +formatted output functions. Partly, this is because it takes some care +to use them properly. Another reason is that it is difficult to recover +from a matching error. + +If you are trying to read input that doesn't match a single, fixed +pattern, you may be better off using a tool such as Flex to generate a +lexical scanner, or Bison to generate a parser, rather than using +@code{scanf}. For more information about these tools, see @ref{, , , +flex.info, Flex: The Lexical Scanner Generator}, and @ref{, , , +bison.info, The Bison Reference Manual}. + +@node Input Conversion Syntax +@subsection Input Conversion Syntax + +A @code{scanf} template string is a string that contains ordinary +multibyte characters interspersed with conversion specifications that +start with @samp{%}. + +Any whitespace character (as defined by the @code{isspace} function; +@pxref{Classification of Characters}) in the template causes any number +of whitespace characters in the input stream to be read and discarded. +The whitespace characters that are matched need not be exactly the same +whitespace characters that appear in the template string. For example, +write @samp{ , } in the template to recognize a comma with optional +whitespace before and after. + +Other characters in the template string that are not part of conversion +specifications must match characters in the input stream exactly; if +this is not the case, a matching failure occurs. + +The conversion specifications in a @code{scanf} template string +have the general form: + +@smallexample +% @var{flags} @var{width} @var{type} @var{conversion} +@end smallexample + +In more detail, an input conversion specification consists of an initial +@samp{%} character followed in sequence by: + +@itemize @bullet +@item +An optional @dfn{flag character} @samp{*}, which says to ignore the text +read for this specification. When @code{scanf} finds a conversion +specification that uses this flag, it reads input as directed by the +rest of the conversion specification, but it discards this input, does +not use a pointer argument, and does not increment the count of +successful assignments. +@cindex flag character (@code{scanf}) + +@item +An optional flag character @samp{a} (valid with string conversions only) +which requests allocation of a buffer long enough to store the string in. +(This is a GNU extension.) +@xref{Dynamic String Input}. + +@item +An optional decimal integer that specifies the @dfn{maximum field +width}. Reading of characters from the input stream stops either when +this maximum is reached or when a non-matching character is found, +whichever happens first. Most conversions discard initial whitespace +characters (those that don't are explicitly documented), and these +discarded characters don't count towards the maximum field width. +String input conversions store a null character to mark the end of the +input; the maximum field width does not include this terminator. +@cindex maximum field width (@code{scanf}) + +@item +An optional @dfn{type modifier character}. For example, you can +specify a type modifier of @samp{l} with integer conversions such as +@samp{%d} to specify that the argument is a pointer to a @code{long int} +rather than a pointer to an @code{int}. +@cindex type modifier character (@code{scanf}) + +@item +A character that specifies the conversion to be applied. +@end itemize + +The exact options that are permitted and how they are interpreted vary +between the different conversion specifiers. See the descriptions of the +individual conversions for information about the particular options that +they allow. + +With the @samp{-Wformat} option, the GNU C compiler checks calls to +@code{scanf} and related functions. It examines the format string and +verifies that the correct number and types of arguments are supplied. +There is also a GNU C syntax to tell the compiler that a function you +write uses a @code{scanf}-style format string. +@xref{Function Attributes, , Declaring Attributes of Functions, +gcc.info, Using GNU CC}, for more information. + +@node Table of Input Conversions +@subsection Table of Input Conversions +@cindex input conversions, for @code{scanf} + +Here is a table that summarizes the various conversion specifications: + +@table @asis +@item @samp{%d} +Matches an optionally signed integer written in decimal. @xref{Numeric +Input Conversions}. + +@item @samp{%i} +Matches an optionally signed integer in any of the formats that the C +language defines for specifying an integer constant. @xref{Numeric +Input Conversions}. + +@item @samp{%o} +Matches an unsigned integer written in octal radix. +@xref{Numeric Input Conversions}. + +@item @samp{%u} +Matches an unsigned integer written in decimal radix. +@xref{Numeric Input Conversions}. + +@item @samp{%x}, @samp{%X} +Matches an unsigned integer written in hexadecimal radix. +@xref{Numeric Input Conversions}. + +@item @samp{%e}, @samp{%f}, @samp{%g}, @samp{%E}, @samp{%G} +Matches an optionally signed floating-point number. @xref{Numeric Input +Conversions}. + +@item @samp{%s} +Matches a string containing only non-whitespace characters. +@xref{String Input Conversions}. + +@item @samp{%[} +Matches a string of characters that belong to a specified set. +@xref{String Input Conversions}. + +@item @samp{%c} +Matches a string of one or more characters; the number of characters +read is controlled by the maximum field width given for the conversion. +@xref{String Input Conversions}. + +@item @samp{%p} +Matches a pointer value in the same implementation-defined format used +by the @samp{%p} output conversion for @code{printf}. @xref{Other Input +Conversions}. + +@item @samp{%n} +This conversion doesn't read any characters; it records the number of +characters read so far by this call. @xref{Other Input Conversions}. + +@item @samp{%%} +This matches a literal @samp{%} character in the input stream. No +corresponding argument is used. @xref{Other Input Conversions}. +@end table + +If the syntax of a conversion specification is invalid, the behavior is +undefined. If there aren't enough function arguments provided to supply +addresses for all the conversion specifications in the template strings +that perform assignments, or if the arguments are not of the correct +types, the behavior is also undefined. On the other hand, extra +arguments are simply ignored. + +@node Numeric Input Conversions +@subsection Numeric Input Conversions + +This section describes the @code{scanf} conversions for reading numeric +values. + +The @samp{%d} conversion matches an optionally signed integer in decimal +radix. The syntax that is recognized is the same as that for the +@code{strtol} function (@pxref{Parsing of Integers}) with the value +@code{10} for the @var{base} argument. + +The @samp{%i} conversion matches an optionally signed integer in any of +the formats that the C language defines for specifying an integer +constant. The syntax that is recognized is the same as that for the +@code{strtol} function (@pxref{Parsing of Integers}) with the value +@code{0} for the @var{base} argument. (You can print integers in this +syntax with @code{printf} by using the @samp{#} flag character with the +@samp{%x}, @samp{%o}, or @samp{%d} conversion. @xref{Integer Conversions}.) + +For example, any of the strings @samp{10}, @samp{0xa}, or @samp{012} +could be read in as integers under the @samp{%i} conversion. Each of +these specifies a number with decimal value @code{10}. + +The @samp{%o}, @samp{%u}, and @samp{%x} conversions match unsigned +integers in octal, decimal, and hexadecimal radices, respectively. The +syntax that is recognized is the same as that for the @code{strtoul} +function (@pxref{Parsing of Integers}) with the appropriate value +(@code{8}, @code{10}, or @code{16}) for the @var{base} argument. + +The @samp{%X} conversion is identical to the @samp{%x} conversion. They +both permit either uppercase or lowercase letters to be used as digits. + +The default type of the corresponding argument for the @code{%d} and +@code{%i} conversions is @code{int *}, and @code{unsigned int *} for the +other integer conversions. You can use the following type modifiers to +specify other sizes of integer: + +@table @samp +@item h +Specifies that the argument is a @code{short int *} or @code{unsigned +short int *}. + +@item l +Specifies that the argument is a @code{long int *} or @code{unsigned +long int *}. Two @samp{l} characters is like the @samp{L} modifier, below. + +@need 100 +@item ll +@itemx L +@itemx q +Specifies that the argument is a @code{long long int *} or @code{unsigned long long int *}. (The @code{long long} type is an extension supported by the +GNU C compiler. For systems that don't provide extra-long integers, this +is the same as @code{long int}.) + +The @samp{q} modifier is another name for the same thing, which comes +from 4.4 BSD; a @w{@code{long long int}} is sometimes called a ``quad'' +@code{int}. +@end table + +All of the @samp{%e}, @samp{%f}, @samp{%g}, @samp{%E}, and @samp{%G} +input conversions are interchangeable. They all match an optionally +signed floating point number, in the same syntax as for the +@code{strtod} function (@pxref{Parsing of Floats}). + +For the floating-point input conversions, the default argument type is +@code{float *}. (This is different from the corresponding output +conversions, where the default type is @code{double}; remember that +@code{float} arguments to @code{printf} are converted to @code{double} +by the default argument promotions, but @code{float *} arguments are +not promoted to @code{double *}.) You can specify other sizes of float +using these type modifiers: + +@table @samp +@item l +Specifies that the argument is of type @code{double *}. + +@item L +Specifies that the argument is of type @code{long double *}. +@end table + +@node String Input Conversions +@subsection String Input Conversions + +This section describes the @code{scanf} input conversions for reading +string and character values: @samp{%s}, @samp{%[}, and @samp{%c}. + +You have two options for how to receive the input from these +conversions: + +@itemize @bullet +@item +Provide a buffer to store it in. This is the default. You +should provide an argument of type @code{char *}. + +@strong{Warning:} To make a robust program, you must make sure that the +input (plus its terminating null) cannot possibly exceed the size of the +buffer you provide. In general, the only way to do this is to specify a +maximum field width one less than the buffer size. @strong{If you +provide the buffer, always specify a maximum field width to prevent +overflow.} + +@item +Ask @code{scanf} to allocate a big enough buffer, by specifying the +@samp{a} flag character. This is a GNU extension. You should provide +an argument of type @code{char **} for the buffer address to be stored +in. @xref{Dynamic String Input}. +@end itemize + +The @samp{%c} conversion is the simplest: it matches a fixed number of +characters, always. The maximum field with says how many characters to +read; if you don't specify the maximum, the default is 1. This +conversion doesn't append a null character to the end of the text it +reads. It also does not skip over initial whitespace characters. It +reads precisely the next @var{n} characters, and fails if it cannot get +that many. Since there is always a maximum field width with @samp{%c} +(whether specified, or 1 by default), you can always prevent overflow by +making the buffer long enough. + +The @samp{%s} conversion matches a string of non-whitespace characters. +It skips and discards initial whitespace, but stops when it encounters +more whitespace after having read something. It stores a null character +at the end of the text that it reads. + +For example, reading the input: + +@smallexample + hello, world +@end smallexample + +@noindent +with the conversion @samp{%10c} produces @code{" hello, wo"}, but +reading the same input with the conversion @samp{%10s} produces +@code{"hello,"}. + +@strong{Warning:} If you do not specify a field width for @samp{%s}, +then the number of characters read is limited only by where the next +whitespace character appears. This almost certainly means that invalid +input can make your program crash---which is a bug. + +To read in characters that belong to an arbitrary set of your choice, +use the @samp{%[} conversion. You specify the set between the @samp{[} +character and a following @samp{]} character, using the same syntax used +in regular expressions. As special cases: + +@itemize @bullet +@item +A literal @samp{]} character can be specified as the first character +of the set. + +@item +An embedded @samp{-} character (that is, one that is not the first or +last character of the set) is used to specify a range of characters. + +@item +If a caret character @samp{^} immediately follows the initial @samp{[}, +then the set of allowed input characters is the everything @emph{except} +the characters listed. +@end itemize + +The @samp{%[} conversion does not skip over initial whitespace +characters. + +Here are some examples of @samp{%[} conversions and what they mean: + +@table @samp +@item %25[1234567890] +Matches a string of up to 25 digits. + +@item %25[][] +Matches a string of up to 25 square brackets. + +@item %25[^ \f\n\r\t\v] +Matches a string up to 25 characters long that doesn't contain any of +the standard whitespace characters. This is slightly different from +@samp{%s}, because if the input begins with a whitespace character, +@samp{%[} reports a matching failure while @samp{%s} simply discards the +initial whitespace. + +@item %25[a-z] +Matches up to 25 lowercase characters. +@end table + +One more reminder: the @samp{%s} and @samp{%[} conversions are +@strong{dangerous} if you don't specify a maximum width or use the +@samp{a} flag, because input too long would overflow whatever buffer you +have provided for it. No matter how long your buffer is, a user could +supply input that is longer. A well-written program reports invalid +input with a comprehensible error message, not with a crash. + +@node Dynamic String Input +@subsection Dynamically Allocating String Conversions + +A GNU extension to formatted input lets you safely read a string with no +maximum size. Using this feature, you don't supply a buffer; instead, +@code{scanf} allocates a buffer big enough to hold the data and gives +you its address. To use this feature, write @samp{a} as a flag +character, as in @samp{%as} or @samp{%a[0-9a-z]}. + +The pointer argument you supply for where to store the input should have +type @code{char **}. The @code{scanf} function allocates a buffer and +stores its address in the word that the argument points to. You should +free the buffer with @code{free} when you no longer need it. + +Here is an example of using the @samp{a} flag with the @samp{%[@dots{}]} +conversion specification to read a ``variable assignment'' of the form +@samp{@var{variable} = @var{value}}. + +@smallexample +@{ + char *variable, *value; + + if (2 > scanf ("%a[a-zA-Z0-9] = %a[^\n]\n", + &variable, &value)) + @{ + invalid_input_error (); + return 0; + @} + + @dots{} +@} +@end smallexample + +@node Other Input Conversions +@subsection Other Input Conversions + +This section describes the miscellaneous input conversions. + +The @samp{%p} conversion is used to read a pointer value. It recognizes +the same syntax as is used by the @samp{%p} output conversion for +@code{printf} (@pxref{Other Output Conversions}); that is, a hexadecimal +number just as the @samp{%x} conversion accepts. The corresponding +argument should be of type @code{void **}; that is, the address of a +place to store a pointer. + +The resulting pointer value is not guaranteed to be valid if it was not +originally written during the same program execution that reads it in. + +The @samp{%n} conversion produces the number of characters read so far +by this call. The corresponding argument should be of type @code{int *}. +This conversion works in the same way as the @samp{%n} conversion for +@code{printf}; see @ref{Other Output Conversions}, for an example. + +The @samp{%n} conversion is the only mechanism for determining the +success of literal matches or conversions with suppressed assignments. +If the @samp{%n} follows the locus of a matching failure, then no value +is stored for it since @code{scanf} returns before processing the +@samp{%n}. If you store @code{-1} in that argument slot before calling +@code{scanf}, the presence of @code{-1} after @code{scanf} indicates an +error occurred before the @samp{%n} was reached. + +Finally, the @samp{%%} conversion matches a literal @samp{%} character +in the input stream, without using an argument. This conversion does +not permit any flags, field width, or type modifier to be specified. + +@node Formatted Input Functions +@subsection Formatted Input Functions + +Here are the descriptions of the functions for performing formatted +input. +Prototypes for these functions are in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int scanf (const char *@var{template}, @dots{}) +The @code{scanf} function reads formatted input from the stream +@code{stdin} under the control of the template string @var{template}. +The optional arguments are pointers to the places which receive the +resulting values. + +The return value is normally the number of successful assignments. If +an end-of-file condition is detected before any matches are performed +(including matches against whitespace and literal characters in the +template), then @code{EOF} is returned. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int fscanf (FILE *@var{stream}, const char *@var{template}, @dots{}) +This function is just like @code{scanf}, except that the input is read +from the stream @var{stream} instead of @code{stdin}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int sscanf (const char *@var{s}, const char *@var{template}, @dots{}) +This is like @code{scanf}, except that the characters are taken from the +null-terminated string @var{s} instead of from a stream. Reaching the +end of the string is treated as an end-of-file condition. + +The behavior of this function is undefined if copying takes place +between objects that overlap---for example, if @var{s} is also given +as an argument to receive a string read under control of the @samp{%s} +conversion. +@end deftypefun + +@node Variable Arguments Input +@subsection Variable Arguments Input Functions + +The functions @code{vscanf} and friends are provided so that you can +define your own variadic @code{scanf}-like functions that make use of +the same internals as the built-in formatted output functions. +These functions are analogous to the @code{vprintf} series of output +functions. @xref{Variable Arguments Output}, for important +information on how to use them. + +@strong{Portability Note:} The functions listed in this section are GNU +extensions. + +@comment stdio.h +@comment GNU +@deftypefun int vscanf (const char *@var{template}, va_list @var{ap}) +This function is similar to @code{scanf} except that, instead of taking +a variable number of arguments directly, it takes an argument list +pointer @var{ap} of type @code{va_list} (@pxref{Variadic Functions}). +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int vfscanf (FILE *@var{stream}, const char *@var{template}, va_list @var{ap}) +This is the equivalent of @code{fscanf} with the variable argument list +specified directly as for @code{vscanf}. +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int vsscanf (const char *@var{s}, const char *@var{template}, va_list @var{ap}) +This is the equivalent of @code{sscanf} with the variable argument list +specified directly as for @code{vscanf}. +@end deftypefun + +In GNU C, there is a special construct you can use to let the compiler +know that a function uses a @code{scanf}-style format string. Then it +can check the number and types of arguments in each call to the +function, and warn you when they do not match the format string. +@xref{Function Attributes, , Declaring Attributes of Functions, +gcc.info, Using GNU CC}, for details. + +@node EOF and Errors +@section End-Of-File and Errors + +@cindex end of file, on a stream +Many of the functions described in this chapter return the value of the +macro @code{EOF} to indicate unsuccessful completion of the operation. +Since @code{EOF} is used to report both end of file and random errors, +it's often better to use the @code{feof} function to check explicitly +for end of file and @code{ferror} to check for errors. These functions +check indicators that are part of the internal state of the stream +object, indicators set if the appropriate condition was detected by a +previous I/O operation on that stream. + +These symbols are declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypevr Macro int EOF +This macro is an integer value that is returned by a number of functions +to indicate an end-of-file condition, or some other error situation. +With the GNU library, @code{EOF} is @code{-1}. In other libraries, its +value may be some other negative number. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypefun void clearerr (FILE *@var{stream}) +This function clears the end-of-file and error indicators for the +stream @var{stream}. + +The file positioning functions (@pxref{File Positioning}) also clear the +end-of-file indicator for the stream. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int feof (FILE *@var{stream}) +The @code{feof} function returns nonzero if and only if the end-of-file +indicator for the stream @var{stream} is set. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int ferror (FILE *@var{stream}) +The @code{ferror} function returns nonzero if and only if the error +indicator for the stream @var{stream} is set, indicating that an error +has occurred on a previous operation on the stream. +@end deftypefun + +In addition to setting the error indicator associated with the stream, +the functions that operate on streams also set @code{errno} in the same +way as the corresponding low-level functions that operate on file +descriptors. For example, all of the functions that perform output to a +stream---such as @code{fputc}, @code{printf}, and @code{fflush}---are +implemented in terms of @code{write}, and all of the @code{errno} error +conditions defined for @code{write} are meaningful for these functions. +For more information about the descriptor-level I/O functions, see +@ref{Low-Level I/O}. + +@node Binary Streams +@section Text and Binary Streams + +The GNU system and other POSIX-compatible operating systems organize all +files as uniform sequences of characters. However, some other systems +make a distinction between files containing text and files containing +binary data, and the input and output facilities of ANSI C provide for +this distinction. This section tells you how to write programs portable +to such systems. + +@cindex text stream +@cindex binary stream +When you open a stream, you can specify either a @dfn{text stream} or a +@dfn{binary stream}. You indicate that you want a binary stream by +specifying the @samp{b} modifier in the @var{opentype} argument to +@code{fopen}; see @ref{Opening Streams}. Without this +option, @code{fopen} opens the file as a text stream. + +Text and binary streams differ in several ways: + +@itemize @bullet +@item +The data read from a text stream is divided into @dfn{lines} which are +terminated by newline (@code{'\n'}) characters, while a binary stream is +simply a long series of characters. A text stream might on some systems +fail to handle lines more than 254 characters long (including the +terminating newline character). +@cindex lines (in a text file) + +@item +On some systems, text files can contain only printing characters, +horizontal tab characters, and newlines, and so text streams may not +support other characters. However, binary streams can handle any +character value. + +@item +Space characters that are written immediately preceding a newline +character in a text stream may disappear when the file is read in again. + +@item +More generally, there need not be a one-to-one mapping between +characters that are read from or written to a text stream, and the +characters in the actual file. +@end itemize + +Since a binary stream is always more capable and more predictable than a +text stream, you might wonder what purpose text streams serve. Why not +simply always use binary streams? The answer is that on these operating +systems, text and binary streams use different file formats, and the +only way to read or write ``an ordinary file of text'' that can work +with other text-oriented programs is through a text stream. + +In the GNU library, and on all POSIX systems, there is no difference +between text streams and binary streams. When you open a stream, you +get the same kind of stream regardless of whether you ask for binary. +This stream can handle any file content, and has none of the +restrictions that text streams sometimes have. + +@node File Positioning +@section File Positioning +@cindex file positioning on a stream +@cindex positioning a stream +@cindex seeking on a stream + +The @dfn{file position} of a stream describes where in the file the +stream is currently reading or writing. I/O on the stream advances the +file position through the file. In the GNU system, the file position is +represented as an integer, which counts the number of bytes from the +beginning of the file. @xref{File Position}. + +During I/O to an ordinary disk file, you can change the file position +whenever you wish, so as to read or write any portion of the file. Some +other kinds of files may also permit this. Files which support changing +the file position are sometimes referred to as @dfn{random-access} +files. + +You can use the functions in this section to examine or modify the file +position indicator associated with a stream. The symbols listed below +are declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun {long int} ftell (FILE *@var{stream}) +This function returns the current file position of the stream +@var{stream}. + +This function can fail if the stream doesn't support file positioning, +or if the file position can't be represented in a @code{long int}, and +possibly for other reasons as well. If a failure occurs, a value of +@code{-1} is returned. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int fseek (FILE *@var{stream}, long int @var{offset}, int @var{whence}) +The @code{fseek} function is used to change the file position of the +stream @var{stream}. The value of @var{whence} must be one of the +constants @code{SEEK_SET}, @code{SEEK_CUR}, or @code{SEEK_END}, to +indicate whether the @var{offset} is relative to the beginning of the +file, the current file position, or the end of the file, respectively. + +This function returns a value of zero if the operation was successful, +and a nonzero value to indicate failure. A successful call also clears +the end-of-file indicator of @var{stream} and discards any characters +that were ``pushed back'' by the use of @code{ungetc}. + +@code{fseek} either flushes any buffered output before setting the file +position or else remembers it so it will be written later in its proper +place in the file. +@end deftypefun + +@strong{Portability Note:} In non-POSIX systems, @code{ftell} and +@code{fseek} might work reliably only on binary streams. @xref{Binary +Streams}. + +The following symbolic constants are defined for use as the @var{whence} +argument to @code{fseek}. They are also used with the @code{lseek} +function (@pxref{I/O Primitives}) and to specify offsets for file locks +(@pxref{Control Operations}). + +@comment stdio.h +@comment ANSI +@deftypevr Macro int SEEK_SET +This is an integer constant which, when used as the @var{whence} +argument to the @code{fseek} function, specifies that the offset +provided is relative to the beginning of the file. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int SEEK_CUR +This is an integer constant which, when used as the @var{whence} +argument to the @code{fseek} function, specifies that the offset +provided is relative to the current file position. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int SEEK_END +This is an integer constant which, when used as the @var{whence} +argument to the @code{fseek} function, specifies that the offset +provided is relative to the end of the file. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypefun void rewind (FILE *@var{stream}) +The @code{rewind} function positions the stream @var{stream} at the +begining of the file. It is equivalent to calling @code{fseek} on the +@var{stream} with an @var{offset} argument of @code{0L} and a +@var{whence} argument of @code{SEEK_SET}, except that the return +value is discarded and the error indicator for the stream is reset. +@end deftypefun + +These three aliases for the @samp{SEEK_@dots{}} constants exist for the +sake of compatibility with older BSD systems. They are defined in two +different header files: @file{fcntl.h} and @file{sys/file.h}. + +@table @code +@comment sys/file.h +@comment BSD +@item L_SET +@vindex L_SET +An alias for @code{SEEK_SET}. + +@comment sys/file.h +@comment BSD +@item L_INCR +@vindex L_INCR +An alias for @code{SEEK_CUR}. + +@comment sys/file.h +@comment BSD +@item L_XTND +@vindex L_XTND +An alias for @code{SEEK_END}. +@end table + +@node Portable Positioning +@section Portable File-Position Functions + +On the GNU system, the file position is truly a character count. You +can specify any character count value as an argument to @code{fseek} and +get reliable results for any random access file. However, some ANSI C +systems do not represent file positions in this way. + +On some systems where text streams truly differ from binary streams, it +is impossible to represent the file position of a text stream as a count +of characters from the beginning of the file. For example, the file +position on some systems must encode both a record offset within the +file, and a character offset within the record. + +As a consequence, if you want your programs to be portable to these +systems, you must observe certain rules: + +@itemize @bullet +@item +The value returned from @code{ftell} on a text stream has no predictable +relationship to the number of characters you have read so far. The only +thing you can rely on is that you can use it subsequently as the +@var{offset} argument to @code{fseek} to move back to the same file +position. + +@item +In a call to @code{fseek} on a text stream, either the @var{offset} must +either be zero; or @var{whence} must be @code{SEEK_SET} and the +@var{offset} must be the result of an earlier call to @code{ftell} on +the same stream. + +@item +The value of the file position indicator of a text stream is undefined +while there are characters that have been pushed back with @code{ungetc} +that haven't been read or discarded. @xref{Unreading}. +@end itemize + +But even if you observe these rules, you may still have trouble for long +files, because @code{ftell} and @code{fseek} use a @code{long int} value +to represent the file position. This type may not have room to encode +all the file positions in a large file. + +So if you do want to support systems with peculiar encodings for the +file positions, it is better to use the functions @code{fgetpos} and +@code{fsetpos} instead. These functions represent the file position +using the data type @code{fpos_t}, whose internal representation varies +from system to system. + +These symbols are declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftp {Data Type} fpos_t +This is the type of an object that can encode information about the +file position of a stream, for use by the functions @code{fgetpos} and +@code{fsetpos}. + +In the GNU system, @code{fpos_t} is equivalent to @code{off_t} or +@code{long int}. In other systems, it might have a different internal +representation. +@end deftp + +@comment stdio.h +@comment ANSI +@deftypefun int fgetpos (FILE *@var{stream}, fpos_t *@var{position}) +This function stores the value of the file position indicator for the +stream @var{stream} in the @code{fpos_t} object pointed to by +@var{position}. If successful, @code{fgetpos} returns zero; otherwise +it returns a nonzero value and stores an implementation-defined positive +value in @code{errno}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int fsetpos (FILE *@var{stream}, const fpos_t @var{position}) +This function sets the file position indicator for the stream @var{stream} +to the position @var{position}, which must have been set by a previous +call to @code{fgetpos} on the same stream. If successful, @code{fsetpos} +clears the end-of-file indicator on the stream, discards any characters +that were ``pushed back'' by the use of @code{ungetc}, and returns a value +of zero. Otherwise, @code{fsetpos} returns a nonzero value and stores +an implementation-defined positive value in @code{errno}. +@end deftypefun + +@node Stream Buffering +@section Stream Buffering + +@cindex buffering of streams +Characters that are written to a stream are normally accumulated and +transmitted asynchronously to the file in a block, instead of appearing +as soon as they are output by the application program. Similarly, +streams often retrieve input from the host environment in blocks rather +than on a character-by-character basis. This is called @dfn{buffering}. + +If you are writing programs that do interactive input and output using +streams, you need to understand how buffering works when you design the +user interface to your program. Otherwise, you might find that output +(such as progress or prompt messages) doesn't appear when you intended +it to, or other unexpected behavior. + +This section deals only with controlling when characters are transmitted +between the stream and the file or device, and @emph{not} with how +things like echoing, flow control, and the like are handled on specific +classes of devices. For information on common control operations on +terminal devices, see @ref{Low-Level Terminal Interface}. + +You can bypass the stream buffering facilities altogether by using the +low-level input and output functions that operate on file descriptors +instead. @xref{Low-Level I/O}. + +@menu +* Buffering Concepts:: Terminology is defined here. +* Flushing Buffers:: How to ensure that output buffers are flushed. +* Controlling Buffering:: How to specify what kind of buffering to use. +@end menu + +@node Buffering Concepts +@subsection Buffering Concepts + +There are three different kinds of buffering strategies: + +@itemize @bullet +@item +Characters written to or read from an @dfn{unbuffered} stream are +transmitted individually to or from the file as soon as possible. +@cindex unbuffered stream + +@item +Characters written to a @dfn{line buffered} stream are transmitted to +the file in blocks when a newline character is encountered. +@cindex line buffered stream + +@item +Characters written to or read from a @dfn{fully buffered} stream are +transmitted to or from the file in blocks of arbitrary size. +@cindex fully buffered stream +@end itemize + +Newly opened streams are normally fully buffered, with one exception: a +stream connected to an interactive device such as a terminal is +initially line buffered. @xref{Controlling Buffering}, for information +on how to select a different kind of buffering. Usually the automatic +selection gives you the most convenient kind of buffering for the file +or device you open. + +The use of line buffering for interactive devices implies that output +messages ending in a newline will appear immediately---which is usually +what you want. Output that doesn't end in a newline might or might not +show up immediately, so if you want them to appear immediately, you +should flush buffered output explicitly with @code{fflush}, as described +in @ref{Flushing Buffers}. + +@node Flushing Buffers +@subsection Flushing Buffers + +@cindex flushing a stream +@dfn{Flushing} output on a buffered stream means transmitting all +accumulated characters to the file. There are many circumstances when +buffered output on a stream is flushed automatically: + +@itemize @bullet +@item +When you try to do output and the output buffer is full. + +@item +When the stream is closed. @xref{Closing Streams}. + +@item +When the program terminates by calling @code{exit}. +@xref{Normal Termination}. + +@item +When a newline is written, if the stream is line buffered. + +@item +Whenever an input operation on @emph{any} stream actually reads data +from its file. +@end itemize + +If you want to flush the buffered output at another time, call +@code{fflush}, which is declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int fflush (FILE *@var{stream}) +This function causes any buffered output on @var{stream} to be delivered +to the file. If @var{stream} is a null pointer, then +@code{fflush} causes buffered output on @emph{all} open output streams +to be flushed. + +This function returns @code{EOF} if a write error occurs, or zero +otherwise. +@end deftypefun + +@strong{Compatibility Note:} Some brain-damaged operating systems have +been known to be so thoroughly fixated on line-oriented input and output +that flushing a line buffered stream causes a newline to be written! +Fortunately, this ``feature'' seems to be becoming less common. You do +not need to worry about this in the GNU system. + + +@node Controlling Buffering +@subsection Controlling Which Kind of Buffering + +After opening a stream (but before any other operations have been +performed on it), you can explicitly specify what kind of buffering you +want it to have using the @code{setvbuf} function. +@cindex buffering, controlling + +The facilities listed in this section are declared in the header +file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int setvbuf (FILE *@var{stream}, char *@var{buf}, int @var{mode}, size_t @var{size}) +This function is used to specify that the stream @var{stream} should +have the buffering mode @var{mode}, which can be either @code{_IOFBF} +(for full buffering), @code{_IOLBF} (for line buffering), or +@code{_IONBF} (for unbuffered input/output). + +If you specify a null pointer as the @var{buf} argument, then @code{setvbuf} +allocates a buffer itself using @code{malloc}. This buffer will be freed +when you close the stream. + +Otherwise, @var{buf} should be a character array that can hold at least +@var{size} characters. You should not free the space for this array as +long as the stream remains open and this array remains its buffer. You +should usually either allocate it statically, or @code{malloc} +(@pxref{Unconstrained Allocation}) the buffer. Using an automatic array +is not a good idea unless you close the file before exiting the block +that declares the array. + +While the array remains a stream buffer, the stream I/O functions will +use the buffer for their internal purposes. You shouldn't try to access +the values in the array directly while the stream is using it for +buffering. + +The @code{setvbuf} function returns zero on success, or a nonzero value +if the value of @var{mode} is not valid or if the request could not +be honored. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypevr Macro int _IOFBF +The value of this macro is an integer constant expression that can be +used as the @var{mode} argument to the @code{setvbuf} function to +specify that the stream should be fully buffered. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int _IOLBF +The value of this macro is an integer constant expression that can be +used as the @var{mode} argument to the @code{setvbuf} function to +specify that the stream should be line buffered. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int _IONBF +The value of this macro is an integer constant expression that can be +used as the @var{mode} argument to the @code{setvbuf} function to +specify that the stream should be unbuffered. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int BUFSIZ +The value of this macro is an integer constant expression that is good +to use for the @var{size} argument to @code{setvbuf}. This value is +guaranteed to be at least @code{256}. + +The value of @code{BUFSIZ} is chosen on each system so as to make stream +I/O efficient. So it is a good idea to use @code{BUFSIZ} as the size +for the buffer when you call @code{setvbuf}. + +Actually, you can get an even better value to use for the buffer size +by means of the @code{fstat} system call: it is found in the +@code{st_blksize} field of the file attributes. @xref{Attribute Meanings}. + +Sometimes people also use @code{BUFSIZ} as the allocation size of +buffers used for related purposes, such as strings used to receive a +line of input with @code{fgets} (@pxref{Character Input}). There is no +particular reason to use @code{BUFSIZ} for this instead of any other +integer, except that it might lead to doing I/O in chunks of an +efficient size. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypefun void setbuf (FILE *@var{stream}, char *@var{buf}) +If @var{buf} is a null pointer, the effect of this function is +equivalent to calling @code{setvbuf} with a @var{mode} argument of +@code{_IONBF}. Otherwise, it is equivalent to calling @code{setvbuf} +with @var{buf}, and a @var{mode} of @code{_IOFBF} and a @var{size} +argument of @code{BUFSIZ}. + +The @code{setbuf} function is provided for compatibility with old code; +use @code{setvbuf} in all new programs. +@end deftypefun + +@comment stdio.h +@comment BSD +@deftypefun void setbuffer (FILE *@var{stream}, char *@var{buf}, size_t @var{size}) +If @var{buf} is a null pointer, this function makes @var{stream} unbuffered. +Otherwise, it makes @var{stream} fully buffered using @var{buf} as the +buffer. The @var{size} argument specifies the length of @var{buf}. + +This function is provided for compatibility with old BSD code. Use +@code{setvbuf} instead. +@end deftypefun + +@comment stdio.h +@comment BSD +@deftypefun void setlinebuf (FILE *@var{stream}) +This function makes @var{stream} be line buffered, and allocates the +buffer for you. + +This function is provided for compatibility with old BSD code. Use +@code{setvbuf} instead. +@end deftypefun + +@node Other Kinds of Streams +@section Other Kinds of Streams + +The GNU library provides ways for you to define additional kinds of +streams that do not necessarily correspond to an open file. + +One such type of stream takes input from or writes output to a string. +These kinds of streams are used internally to implement the +@code{sprintf} and @code{sscanf} functions. You can also create such a +stream explicitly, using the functions described in @ref{String Streams}. + +More generally, you can define streams that do input/output to arbitrary +objects using functions supplied by your program. This protocol is +discussed in @ref{Custom Streams}. + +@strong{Portability Note:} The facilities described in this section are +specific to GNU. Other systems or C implementations might or might not +provide equivalent functionality. + +@menu +* String Streams:: Streams that get data from or put data in + a string or memory buffer. +* Obstack Streams:: Streams that store data in an obstack. +* Custom Streams:: Defining your own streams with an arbitrary + input data source and/or output data sink. +@end menu + +@node String Streams +@subsection String Streams + +@cindex stream, for I/O to a string +@cindex string stream +The @code{fmemopen} and @code{open_memstream} functions allow you to do +I/O to a string or memory buffer. These facilities are declared in +@file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment GNU +@deftypefun {FILE *} fmemopen (void *@var{buf}, size_t @var{size}, const char *@var{opentype}) +This function opens a stream that allows the access specified by the +@var{opentype} argument, that reads from or writes to the buffer specified +by the argument @var{buf}. This array must be at least @var{size} bytes long. + +If you specify a null pointer as the @var{buf} argument, @code{fmemopen} +dynamically allocates (as with @code{malloc}; @pxref{Unconstrained +Allocation}) an array @var{size} bytes long. This is really only useful +if you are going to write things to the buffer and then read them back +in again, because you have no way of actually getting a pointer to the +buffer (for this, try @code{open_memstream}, below). The buffer is +freed when the stream is open. + +The argument @var{opentype} is the same as in @code{fopen} +(@xref{Opening Streams}). If the @var{opentype} specifies +append mode, then the initial file position is set to the first null +character in the buffer. Otherwise the initial file position is at the +beginning of the buffer. + +When a stream open for writing is flushed or closed, a null character +(zero byte) is written at the end of the buffer if it fits. You +should add an extra byte to the @var{size} argument to account for this. +Attempts to write more than @var{size} bytes to the buffer result +in an error. + +For a stream open for reading, null characters (zero bytes) in the +buffer do not count as ``end of file''. Read operations indicate end of +file only when the file position advances past @var{size} bytes. So, if +you want to read characters from a null-terminated string, you should +supply the length of the string as the @var{size} argument. +@end deftypefun + +Here is an example of using @code{fmemopen} to create a stream for +reading from a string: + +@smallexample +@include memopen.c.texi +@end smallexample + +This program produces the following output: + +@smallexample +Got f +Got o +Got o +Got b +Got a +Got r +@end smallexample + +@comment stdio.h +@comment GNU +@deftypefun {FILE *} open_memstream (char **@var{ptr}, size_t *@var{sizeloc}) +This function opens a stream for writing to a buffer. The buffer is +allocated dynamically (as with @code{malloc}; @pxref{Unconstrained +Allocation}) and grown as necessary. + +When the stream is closed with @code{fclose} or flushed with +@code{fflush}, the locations @var{ptr} and @var{sizeloc} are updated to +contain the pointer to the buffer and its size. The values thus stored +remain valid only as long as no further output on the stream takes +place. If you do more output, you must flush the stream again to store +new values before you use them again. + +A null character is written at the end of the buffer. This null character +is @emph{not} included in the size value stored at @var{sizeloc}. + +You can move the stream's file position with @code{fseek} (@pxref{File +Positioning}). Moving the file position past the end of the data +already written fills the intervening space with zeroes. +@end deftypefun + +Here is an example of using @code{open_memstream}: + +@smallexample +@include memstrm.c.texi +@end smallexample + +This program produces the following output: + +@smallexample +buf = `hello', size = 5 +buf = `hello, world', size = 12 +@end smallexample + +@c @group Invalid outside @example. +@node Obstack Streams +@subsection Obstack Streams + +You can open an output stream that puts it data in an obstack. +@xref{Obstacks}. + +@comment stdio.h +@comment GNU +@deftypefun {FILE *} open_obstack_stream (struct obstack *@var{obstack}) +This function opens a stream for writing data into the obstack @var{obstack}. +This starts an object in the obstack and makes it grow as data is +written (@pxref{Growing Objects}). +@c @end group Doubly invalid because not nested right. + +Calling @code{fflush} on this stream updates the current size of the +object to match the amount of data that has been written. After a call +to @code{fflush}, you can examine the object temporarily. + +You can move the file position of an obstack stream with @code{fseek} +(@pxref{File Positioning}). Moving the file position past the end of +the data written fills the intervening space with zeros. + +To make the object permanent, update the obstack with @code{fflush}, and +then use @code{obstack_finish} to finalize the object and get its address. +The following write to the stream starts a new object in the obstack, +and later writes add to that object until you do another @code{fflush} +and @code{obstack_finish}. + +But how do you find out how long the object is? You can get the length +in bytes by calling @code{obstack_object_size} (@pxref{Status of an +Obstack}), or you can null-terminate the object like this: + +@smallexample +obstack_1grow (@var{obstack}, 0); +@end smallexample + +Whichever one you do, you must do it @emph{before} calling +@code{obstack_finish}. (You can do both if you wish.) +@end deftypefun + +Here is a sample function that uses @code{open_obstack_stream}: + +@smallexample +char * +make_message_string (const char *a, int b) +@{ + FILE *stream = open_obstack_stream (&message_obstack); + output_task (stream); + fprintf (stream, ": "); + fprintf (stream, a, b); + fprintf (stream, "\n"); + fclose (stream); + obstack_1grow (&message_obstack, 0); + return obstack_finish (&message_obstack); +@} +@end smallexample + +@node Custom Streams +@subsection Programming Your Own Custom Streams +@cindex custom streams +@cindex programming your own streams + +This section describes how you can make a stream that gets input from an +arbitrary data source or writes output to an arbitrary data sink +programmed by you. We call these @dfn{custom streams}. + +@c !!! this does not talk at all about the higher-level hooks + +@menu +* Streams and Cookies:: The @dfn{cookie} records where to fetch or + store data that is read or written. +* Hook Functions:: How you should define the four @dfn{hook + functions} that a custom stream needs. +@end menu + +@node Streams and Cookies +@subsubsection Custom Streams and Cookies +@cindex cookie, for custom stream + +Inside every custom stream is a special object called the @dfn{cookie}. +This is an object supplied by you which records where to fetch or store +the data read or written. It is up to you to define a data type to use +for the cookie. The stream functions in the library never refer +directly to its contents, and they don't even know what the type is; +they record its address with type @code{void *}. + +To implement a custom stream, you must specify @emph{how} to fetch or +store the data in the specified place. You do this by defining +@dfn{hook functions} to read, write, change ``file position'', and close +the stream. All four of these functions will be passed the stream's +cookie so they can tell where to fetch or store the data. The library +functions don't know what's inside the cookie, but your functions will +know. + +When you create a custom stream, you must specify the cookie pointer, +and also the four hook functions stored in a structure of type +@code{cookie_io_functions_t}. + +These facilities are declared in @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment GNU +@deftp {Data Type} {cookie_io_functions_t} +This is a structure type that holds the functions that define the +communications protocol between the stream and its cookie. It has +the following members: + +@table @code +@item cookie_read_function_t *read +This is the function that reads data from the cookie. If the value is a +null pointer instead of a function, then read operations on ths stream +always return @code{EOF}. + +@item cookie_write_function_t *write +This is the function that writes data to the cookie. If the value is a +null pointer instead of a function, then data written to the stream is +discarded. + +@item cookie_seek_function_t *seek +This is the function that performs the equivalent of file positioning on +the cookie. If the value is a null pointer instead of a function, calls +to @code{fseek} on this stream can only seek to locations within the +buffer; any attempt to seek outside the buffer will return an +@code{ESPIPE} error. + +@item cookie_close_function_t *close +This function performs any appropriate cleanup on the cookie when +closing the stream. If the value is a null pointer instead of a +function, nothing special is done to close the cookie when the stream is +closed. +@end table +@end deftp + +@comment stdio.h +@comment GNU +@deftypefun {FILE *} fopencookie (void *@var{cookie}, const char *@var{opentype}, cookie_io_functions_t @var{io-functions}) +This function actually creates the stream for communicating with the +@var{cookie} using the functions in the @var{io-functions} argument. +The @var{opentype} argument is interpreted as for @code{fopen}; +see @ref{Opening Streams}. (But note that the ``truncate on +open'' option is ignored.) The new stream is fully buffered. + +The @code{fopencookie} function returns the newly created stream, or a null +pointer in case of an error. +@end deftypefun + +@node Hook Functions +@subsubsection Custom Stream Hook Functions +@cindex hook functions (of custom streams) + +Here are more details on how you should define the four hook functions +that a custom stream needs. + +You should define the function to read data from the cookie as: + +@smallexample +ssize_t @var{reader} (void *@var{cookie}, void *@var{buffer}, size_t @var{size}) +@end smallexample + +This is very similar to the @code{read} function; see @ref{I/O +Primitives}. Your function should transfer up to @var{size} bytes into +the @var{buffer}, and return the number of bytes read, or zero to +indicate end-of-file. You can return a value of @code{-1} to indicate +an error. + +You should define the function to write data to the cookie as: + +@smallexample +ssize_t @var{writer} (void *@var{cookie}, const void *@var{buffer}, size_t @var{size}) +@end smallexample + +This is very similar to the @code{write} function; see @ref{I/O +Primitives}. Your function should transfer up to @var{size} bytes from +the buffer, and return the number of bytes written. You can return a +value of @code{-1} to indicate an error. + +You should define the function to perform seek operations on the cookie +as: + +@smallexample +int @var{seeker} (void *@var{cookie}, fpos_t *@var{position}, int @var{whence}) +@end smallexample + +For this function, the @var{position} and @var{whence} arguments are +interpreted as for @code{fgetpos}; see @ref{Portable Positioning}. In +the GNU library, @code{fpos_t} is equivalent to @code{off_t} or +@code{long int}, and simply represents the number of bytes from the +beginning of the file. + +After doing the seek operation, your function should store the resulting +file position relative to the beginning of the file in @var{position}. +Your function should return a value of @code{0} on success and @code{-1} +to indicate an error. + +You should define the function to do cleanup operations on the cookie +appropriate for closing the stream as: + +@smallexample +int @var{cleaner} (void *@var{cookie}) +@end smallexample + +Your function should return @code{-1} to indicate an error, and @code{0} +otherwise. + +@comment stdio.h +@comment GNU +@deftp {Data Type} cookie_read_function +This is the data type that the read function for a custom stream should have. +If you declare the function as shown above, this is the type it will have. +@end deftp + +@comment stdio.h +@comment GNU +@deftp {Data Type} cookie_write_function +The data type of the write function for a custom stream. +@end deftp + +@comment stdio.h +@comment GNU +@deftp {Data Type} cookie_seek_function +The data type of the seek function for a custom stream. +@end deftp + +@comment stdio.h +@comment GNU +@deftp {Data Type} cookie_close_function +The data type of the close function for a custom stream. +@end deftp + +@ignore +Roland says: + +@quotation +There is another set of functions one can give a stream, the +input-room and output-room functions. These functions must +understand stdio internals. To describe how to use these +functions, you also need to document lots of how stdio works +internally (which isn't relevant for other uses of stdio). +Perhaps I can write an interface spec from which you can write +good documentation. But it's pretty complex and deals with lots +of nitty-gritty details. I think it might be better to let this +wait until the rest of the manual is more done and polished. +@end quotation +@end ignore + +@c ??? This section could use an example. -- cgit v1.1