Diffstat (limited to 'manual')
-rw-r--r-- manual/intro.texi | 51
-rw-r--r-- manual/string.texi | 225
2 files changed, 264 insertions, 12 deletions
diff --git a/manual/intro.texi b/manual/intro.texi index e0447b6..b8f4c8c 100644 --- a/manual/intro.texi +++ b/manual/intro.texi @@ -77,12 +77,13 @@ other symbols provided by the library. This list also states which standards each function or symbol comes from. @menu -* ISO C:: The international standard for the C +* ISO C:: The international standard for the C programming language. * POSIX:: The ISO/IEC 9945 (aka IEEE 1003) standards for operating systems. * Berkeley Unix:: BSD and SunOS. * SVID:: The System V Interface Description. +* XPG:: The X/Open Portability Guide. @end menu @node ISO C, POSIX, , Standards and Portability @@ -118,18 +119,22 @@ differences between @w{ISO C} and older dialects. It gives advice on how to write programs to work portably under multiple C dialects, but does not aim for completeness. + @node POSIX, Berkeley Unix, ISO C, Standards and Portability @subsection POSIX (The Portable Operating System Interface) @cindex POSIX @cindex POSIX.1 @cindex IEEE Std 1003.1 +@cindex ISO/IEC 9945-1 @cindex POSIX.2 @cindex IEEE Std 1003.2 +@cindex ISO/IEC 9945-2 -The GNU library is also compatible with the IEEE @dfn{POSIX} family of +The GNU library is also compatible with the ISO @dfn{POSIX} family of standards, known more formally as the @dfn{Portable Operating System -Interface for Computer Environments}. POSIX is derived mostly from -various versions of the Unix operating system. +Interface for Computer Environments} (ISO/IEC 9945). They were also +published as ANSI/IEEE Std 1003. POSIX is derived mostly from various +versions of the Unix operating system. 
The library facilities specified by the POSIX standards are a superset of those required by @w{ISO C}; POSIX specifies additional features for @@ -141,14 +146,14 @@ programming language support which can run in many diverse operating system environments.@refill The GNU C library implements all of the functions specified in -@cite{IEEE Std 1003.1-1990, the POSIX System Application Program +@cite{ISO/IEC 9945-1:1996, the POSIX System Application Program Interface}, commonly referred to as POSIX.1. The primary extensions to the @w{ISO C} facilities specified by this standard include file system interface primitives (@pxref{File System Interface}), device-specific terminal control functions (@pxref{Low-Level Terminal Interface}), and process control functions (@pxref{Processes}). -Some facilities from @cite{IEEE Std 1003.2-1992, the POSIX Shell and +Some facilities from @cite{ISO/IEC 9945-2:1993, the POSIX Shell and Utilities standard} (POSIX.2) are also implemented in the GNU library. These include utilities for dealing with regular expressions and other pattern matching facilities (@pxref{Pattern Matching}). @@ -186,7 +191,7 @@ The BSD facilities include symbolic links (@pxref{Symbolic Links}), the @code{select} function (@pxref{Waiting for I/O}), the BSD signal functions (@pxref{BSD Signal Handling}), and sockets (@pxref{Sockets}). -@node SVID, , Berkeley Unix, Standards and Portability +@node SVID, XPG, Berkeley Unix, Standards and Portability @subsection SVID (The System V Interface Description) @cindex SVID @cindex System V Unix @@ -196,14 +201,36 @@ The @dfn{System V Interface Description} (SVID) is a document describing the AT&T Unix System V operating system. It is to some extent a superset of the POSIX standard (@pxref{POSIX}). 
-The GNU C library defines some of the facilities required by the SVID +The GNU C library defines most of the facilities required by the SVID that are not also required by the @w{ISO C} or POSIX standards, for compatibility with System V Unix and other Unix systems (such as SunOS) which include these facilities. However, many of the more obscure and less generally useful facilities required by the SVID are not included. (In fact, Unix System V itself does not provide them all.) -@c !!! mention sysv ipc/shmem when it is there. +The supported facilities from System V include the methods for +inter-process communication and shared memory, the @code{hsearch} and +@code{drand48} families of functions, @code{fmtmsg} and several of the +mathematical functions. + +@node XPG, , SVID, Standards and Portability +@subsection XPG (The X/Open Portability Guide) + +The X/Open Portability Guide, published by the X/Open Company, Ltd., is +a more general standard than POSIX. X/Open owns the Unix copyright and +the XPG specifies the requirements for systems which are intended to be +a Unix system. + +The GNU C library complies to the X/Open Portability Guide, Issue 4.2, +with the with all extensions common to XSI (X/Open System Interface) +compliant systems and also all X/Open UNIX extensions. + +The additions on top of POSIX are mainly derived from functionality +available in @w{System V} and BSD systems. Some of the really bad +mistakes in @w{System V} systems were corrected, though. Since +fulfilling the XPG standard with the Unix extensions is a +precondition for getting the Unix brand chances are good that the +functionality is available on commercial systems. @node Using the Library, Roadmap to the Manual, Standards and Portability, Introduction @@ -360,7 +387,7 @@ and also provides a macro definition for @code{abs}. 
Then, in: @smallexample #include <stdlib.h> -int f (int *i) @{ return (abs (++*i)); @} +int f (int *i) @{ return abs (++*i); @} @end smallexample @noindent @@ -370,10 +397,10 @@ to a function and not a macro. @smallexample #include <stdlib.h> -int g (int *i) @{ return ((abs)(++*i)); @} +int g (int *i) @{ return (abs) (++*i); @} #undef abs -int h (int *i) @{ return (abs (++*i)); @} +int h (int *i) @{ return abs (++*i); @} @end smallexample Since macro definitions that double for a function behave in diff --git a/manual/string.texi b/manual/string.texi index e358b20..af95925 100644 --- a/manual/string.texi +++ b/manual/string.texi @@ -33,6 +33,7 @@ too. * Finding Tokens in a String:: Splitting a string into tokens by looking for delimiters. * Encode Binary Data:: Encoding and Decoding of Binary Data. +* Argz and Envz Vectors:: Null-separated string vectors. @end menu @node Representation of Strings @@ -1200,3 +1201,227 @@ sure the buffer pointer is update after each call to @code{a64l} since this function does not modify the buffer pointer. Every call consumes 6 characters. @end deftypefun + +@node Argz and Envz Vectors +@section Argz and Envz Vectors + +@cindex argz vectors +@cindex string vectors, null-character separated +@cindex argument vectors, null-character separated +@dfn{argz vectors} are vectors of strings in a contiguous block of +memory, each element separated from its neighbors by null-characters +(@code{'\0'}). + +@cindex envz vectors +@cindex environment vectors, null-character separated +@dfn{Envz vectors} are an extension of argz vectors where each element is a +name-value pair, separated by a @code{'='} character (as in a unix +environment). + +@menu +* Argz Functions:: Operations on argz vectors. +* Envz Functions:: Additional operations on environment vectors. 
+@end menu + +@node Argz Functions, Envz Functions, , Argz and Envz Vectors +@subsection Argz Functions + +Each argz vector is represented by a pointer to the first element, of +type @code{char *}, and a size, of type @code{size_t}, both of which can +be initialized to @code{0} to represent an empty argz vector. All argz +functions accept either a pointer and a size argument, or pointers to +them, if they will be modified. + +The argz functions use @code{malloc}/@code{realloc} to allocate/grow +argz vectors, and so any argz vector creating using these functions may +be freed by using @code{free}; conversely, any argz function that may +grow a string expects that string to have been allocated using +@code{malloc} (those argz functions that only examine their arguments or +modify them in place will work on any sort of memory). +@xref{Unconstrained Allocation}. + +All argz functions that do memory allocation have a return type of +@code{error_t}, and return @code{0} for success, and @code{ENOMEM} if an +allocation error occurs. + +@pindex argz.h +These functions are declared in the standard include file @file{argz.h}. + +@deftypefun {error_t} argz_create (char *const @var{argv}[], char **@var{argz}, size_t *@var{argz_len}) +The @code{argz_create} function converts the unix-style argument vector +@var{argv} (a vector of pointers to normal C strings, terminated by +@code{(char *)0}; @pxref{Program Arguments}) into an argz vector with +the same elements, which is returned in @var{argz} and @var{argz_len}. +@end deftypefun + +@deftypefun {error_t} argz_create_sep (const char *@var{string}, int @var{sep}, char **@var{argz}, size_t *@var{argz_len}) +The @code{argz_create_sep} function converts the null-terminated string +@var{string} into an argz vector (returned in @var{argz} and +@var{argz_len}) by splitting it into elements at every occurance of the +character @var{sep}. 
+@end deftypefun + +@deftypefun {size_t} argz_count (const char *@var{argz}, size_t @var{arg_len}) +Returns the number of elements in the argz vector @var{argz} and +@var{argz_len}. +@end deftypefun + +@deftypefun {void} argz_extract (char *@var{argz}, size_t @var{argz_len}, char **@var{argv}) +The @code{argz_extract} function converts the argz vector @var{argz} and +@var{argz_len} into a unix-style argument vector stored in @var{argv}, +by putting pointers to every element in @var{argz} into successive +positions in @var{argv}, followed by a terminator of @code{0}. +@var{Argv} must be pre-allocated with enough space to hold all the +elements in @var{argz} plus the terminating @code{(char *)0} +(@code{(argz_count (@var{argz}, @var{argz_len}) + 1) * sizeof (char *)} +bytes should be enough). Note that the string pointers stored into +@var{argv} point into @var{argz}---they are not copies---and so +@var{argz} must be copied if it will be changed while @var{argv} is +still active. This function is useful for passing the elements in +@var{argz} to an exec function (@pxref{Executing a File}). +@end deftypefun + +@deftypefun {void} argz_stringify (char *@var{argz}, size_t @var{len}, int @var{sep}) +The @code{argz_stringify} converts @var{argz} into a normal string with +the elements separated by the character @var{sep}, by replacing each +@code{'\0'} inside @var{argz} (except the last one, which terminates the +string) with @var{sep}. This is handy for printing @var{argz} in a +readable manner. +@end deftypefun + +@deftypefun {error_t} argz_add (char **@var{argz}, size_t *@var{argz_len}, const char *@var{str}) +The @code{argz_add} function adds the string @var{str} to the end of the +argz vector @code{*@var{argz}}, and updates @code{*@var{argz}} and +@code{*@var{argz_len}} accordingly. 
+@end deftypefun + +@deftypefun {error_t} argz_add_sep (char **@var{argz}, size_t *@var{argz_len}, const char *@var{str}, int @var{delim}) +The @code{argz_add_sep} function is similar to @code{argz_add}, but +@var{str} is split into separate elements in the result at occurances of +the character @var{delim}. This is useful, for instance, for +adding the components of a unix search path to an argz vector, by using +a value of @code{':'} for @var{delim}. +@end deftypefun + +@deftypefun {error_t} argz_append (char **@var{argz}, size_t *@var{argz_len}, const char *@var{buf}, size_t @var{buf_len}) +The @code{argz_append} function appends @var{buf_len} bytes starting at +@var{buf} to the argz vector @code{*@var{argz}}, reallocating +@code{*@var{argz}} to accommodate it, and adding @var{buf_len} to +@code{*@var{argz_len}}. +@end deftypefun + +@deftypefun {error_t} argz_delete (char **@var{argz}, size_t *@var{argz_len}, char *@var{entry}) +If @var{entry} points to the beginning of one of the elements in the +argz vector @code{*@var{argz}}, the @code{argz_delete} function will +remove this entry and reallocate @code{*@var{argz}}, modifying +@code{*@var{argz}} and @code{*@var{argz_len}} accordingly. Note that as +destructive argz functions usually reallocate their argz argument, +pointers into argz vectors such as @var{entry} will then become invalid. +@end deftypefun + +@deftypefun {error_t} argz_insert (char **@var{argz}, size_t *@var{argz_len}, char *@var{before}, const char *@var{entry}) +The @code{argz_insert} function inserts the string @var{entry} into the +argz vector @code{*@var{argz}} at a point just before the existing +element pointed to by @var{before}, reallocating @code{*@var{argz}} and +updating @code{*@var{argz}} and @code{*@var{argz_len}}. If @var{before} +is @code{0}, @var{entry} is added to the end instead (as if by +@code{argz_add}). 
Since the first element is in fact the same as +@code{*@var{argz}}, passing in @code{*@var{argz}} as the value of +@var{before} will result in @var{entry} being inserted at the beginning. +@end deftypefun + +@deftypefun {char *} argz_next (char *@var{argz}, size_t @var{argz_len}, const char *@var{entry}) +The @code{argz_next} function provides a convenient way of iterating +over the elements in the argz vector @var{argz}. It returns a pointer +to the next element in @var{argz} after the element @var{entry}, or +@code{0} if there are no elements following @var{entry}. If @var{entry} +is @code{0}, the first element of @var{argz} is returned. + +This behavior suggests two styles of iteration: + +@smallexample + char *entry = 0; + while ((entry = argz_next (@var{argz}, @var{argz_len}, entry))) + @var{action}; +@end smallexample + +(the double parentheses are necessary to make some C compilers shut up +about what they consider a questionable @code{while}-test) and: + +@smallexample + char *entry; + for (entry = @var{argz}; + entry; + entry = argz_next (@var{argz}, @var{argz_len}, entry)) + @var{action}; +@end smallexample + +Note that the latter depends on @var{argz} having a value of @code{0} if +it is empty (rather than a pointer to an empty block of memory); this +invariant is maintained for argz vectors created by the functions here. +@end deftypefun + +@node Envz Functions, , Argz Functions, Argz and Envz Vectors +@subsection Envz Functions + +Envz vectors are just argz vectors with additional constraints on the form +of each element; as such, argz functions can also be used on them, where it +makes sense. + +Each element in an envz vector is a name-value pair, separated by a @code{'='} +character; if multiple @code{'='} characters are present in an element, those +after the first are considered part of the value, and treated like all other +non-@code{'\0'} characters. 
+ +If @emph{no} @code{'='} characters are present in an element, that element is +considered the name of a ``null'' entry, as distinct from an entry with an +empty value: @code{envz_get} will return @code{0} if given the name of null +entry, whereas an entry with an empty value would result in a value of +@code{""}; @code{envz_entry} will still find such entries, however. Null +entries can be removed with @code{envz_strip} function. + +As with argz functions, envz functions that may allocate memory (and thus +fail) have a return type of @code{error_t}, and return either @code{0} or +@code{ENOMEM}. + +@pindex envz.h +These functions are declared in the standard include file @file{envz.h}. + +@deftypefun {char *} envz_entry (const char *@var{envz}, size_t @var{envz_len}, const char *@var{name}) +The @code{envz_entry} function finds the entry in @var{envz} with the name +@var{name}, and returns a pointer to the whole entry---that is, the argz +element which begins with @var{name} followed by a @code{'='} character. If +there is no entry with that name, @code{0} is returned. +@end deftypefun + +@deftypefun {char *} envz_get (const char *@var{envz}, size_t @var{envz_len}, const char *@var{name}) +The @code{envz_get} function finds the entry in @var{envz} with the name +@var{name} (like @code{envz_entry}), and returns a pointer to the value +portion of that entry (following the @code{'='}). If there is no entry with +that name (or only a null entry), @code{0} is returned. +@end deftypefun + +@deftypefun {error_t} envz_add (char **@var{envz}, size_t *@var{envz_len}, const char *@var{name}, const char *@var{value}) +The @code{envz_add} function adds an entry to @code{*@var{envz}} +(updating @code{*@var{envz}} and @code{*@var{envz_len}}) with the name +@var{name}, and value @var{value}. If an entry with the same name +already exists in @var{envz}, it is removed first. If @var{value} is +@code{0}, then the new entry will the special null type of entry +(mentioned above). 
+@end deftypefun + +@deftypefun {error_t} envz_merge (char **@var{envz}, size_t *@var{envz_len}, const char *@var{envz2}, size_t @var{envz2_len}, int @var{override}) +The @code{envz_merge} function adds each entry in @var{envz2} to @var{envz}, +as if with @code{envz_add}, updating @code{*@var{envz}} and +@code{*@var{envz_len}}. If @var{override} is true, then values in @var{envz2} +will supersede those with the same name in @var{envz}, otherwise not. + +Null entries are treated just like other entries in this respect, so a null +entry in @var{envz} can prevent an entry of the same name in @var{envz2} from +being added to @var{envz}, if @var{override} is false. +@end deftypefun + +@deftypefun {void} envz_strip (char **@var{envz}, size_t *@var{envz_len}) +The @code{envz_strip} function removes any null entries from @var{envz}, +updating @code{*@var{envz}} and @code{*@var{envz_len}}. +@end deftypefun