diff options
Diffstat (limited to 'libio/dbz/dbz.3z')
-rw-r--r-- | libio/dbz/dbz.3z | 547 |
1 files changed, 0 insertions, 547 deletions
diff --git a/libio/dbz/dbz.3z b/libio/dbz/dbz.3z deleted file mode 100644 index 6df2531..0000000 --- a/libio/dbz/dbz.3z +++ /dev/null @@ -1,547 +0,0 @@ -.TH DBZ 3Z "3 Feb 1991" -.BY "C News" -.SH NAME -dbminit, fetch, store, dbmclose \- somewhat dbm-compatible database routines -.br -dbzfresh, dbzagain, dbzfetch, dbzstore \- database routines -.br -dbzsync, dbzsize, dbzincore, dbzcancel, dbzdebug \- database routines -.SH SYNOPSIS -.nf -.B #include <dbz.h> -.PP -.B dbminit(base) -.B char *base; -.PP -.B datum -.B fetch(key) -.B datum key; -.PP -.B store(key, value) -.B datum key; -.B datum value; -.PP -.B dbmclose() -.PP -.B dbzfresh(base, size, fieldsep, cmap, tagmask) -.B char *base; -.B long size; -.B int fieldsep; -.B int cmap; -.B long tagmask; -.PP -.B dbzagain(base, oldbase) -.B char *base; -.B char *oldbase; -.PP -.B datum -.B dbzfetch(key) -.B datum key; -.PP -.B dbzstore(key, value) -.B datum key; -.B datum value; -.PP -.B dbzsync() -.PP -.B long -.B dbzsize(nentries) -.B long nentries; -.PP -.B dbzincore(newvalue) -.PP -.B dbzcancel() -.PP -.B dbzdebug(newvalue) -.SH DESCRIPTION -These functions provide an indexing system for rapid random access to a -text file (the -.I base -.IR file ). -Subject to certain constraints, they are call-compatible with -.IR dbm (3), -although they also provide some extensions. -(Note that they are -.I not -file-compatible with -.I dbm -or any variant thereof.) -.PP -In principle, -.I dbz -stores key-value pairs, where both key and value are arbitrary sequences -of bytes, specified to the functions by -values of type -.IR datum , -typedefed in the header file to be a structure with members -.I dptr -(a value of type -.I char * -pointing to the bytes) -and -.I dsize -(a value of type -.I int -indicating how long the byte sequence is). -.PP -In practice, -.I dbz -is more restricted than -.IR dbm . -A -.I dbz -database -must be an index into a base file, -with the database -.IR value s -being -.IR fseek (3) -offsets into the base file. -Each such -.I value -must ``point to'' a place in the base file where the corresponding -.I key -sequence is found. -A key can be no longer than -.SM DBZMAXKEY -(a constant defined in the header file) bytes. -No key can be an initial subsequence of another, -which in most applications requires that keys be -either bracketed or terminated in some way (see the -discussion of the -.I fieldsep -parameter of -.IR dbzfresh , -below, -for a fine point on terminators). -.PP -.I Dbminit -opens a database, -an index into the base file -.IR base , -consisting of files -.IB base .dir -and -.IB base .pag -which must already exist. -(If the database is new, they should be zero-length files.) -Subsequent accesses go to that database until -.I dbmclose -is called to close the database. -The base file need not exist at the time of the -.IR dbminit , -but it must exist before accesses are attempted. -.PP -.I Fetch -searches the database for the specified -.IR key , -returning the corresponding -.IR value -if any. -.I Store -stores the -.IR key - value -pair in the database. -.I Store -will fail unless the database files are writeable. -See below for a complication arising from case mapping. -.PP -.I Dbzfresh -is a variant of -.I dbminit -for creating a new database with more control over details. -Unlike for -.IR dbminit , -the database files need not exist: -they will be created if necessary, -and truncated in any case. -.PP -.IR Dbzfresh 's -.I size -parameter specifies the size of the first hash table within the database, -in key-value pairs. -Performance will be best if -.I size -is a prime number and -the number of key-value pairs stored in the database does not exceed -about 2/3 of -.IR size . -(The -.I dbzsize -function, given the expected number of key-value pairs, -will suggest a database size that meets these criteria.) -Assuming that an -.I fseek -offset is 4 bytes, -the -.B .pag -file will be -.RI 4* size -bytes -(the -.B .dir -file is tiny and roughly constant in size) -until -the number of key-value pairs exceeds about 80% of -.IR size . -(Nothing awful will happen if the database grows beyond 100% of -.IR size , -but accesses will slow down somewhat and the -.B .pag -file will grow somewhat.) -.PP -.IR Dbzfresh 's -.I fieldsep -parameter specifies the field separator in the base file. -If this is not -NUL (0), and the last character of a -.I key -argument is NUL, that NUL compares equal to either a NUL or a -.I fieldsep -in the base file. -This permits use of NUL to terminate key strings without requiring that -NULs appear in the base file. -The -.I fieldsep -of a database created with -.I dbminit -is the horizontal-tab character. -.PP -For use in news systems, various forms of case mapping (e.g. uppercase to -lowercase) in keys are available. -The -.I cmap -parameter to -.I dbzfresh -is a single character specifying which of several mapping algorithms to use. -Available algorithms are: -.RS -.TP -.B 0 -case-sensitive: no case mapping -.TP -.B B -same as -.B 0 -.TP -.B NUL -same as -.B 0 -.TP -.B = -case-insensitive: uppercase and lowercase equivalent -.TP -.B b -same as -.B = -.TP -.B C -RFC822 message-ID rules, case-sensitive before `@' (with certain exceptions) -and case-insensitive after -.TP -.B ? -whatever the local default is, normally -.B C -.RE -.PP -Mapping algorithm -.B 0 -(no mapping) is faster than the others and is overwhelmingly the correct -choice for most applications. -Unless compatibility constraints interfere, it is more efficient to pre-map -the keys, storing mapped keys in the base file, than to have -.I dbz -do the mapping on every search. -.PP -For historical reasons, -.I fetch -and -.I store -expect their -.I key -arguments to be pre-mapped, but expect unmapped keys in the base file. -.I Dbzfetch -and -.I dbzstore -do the same jobs but handle all case mapping internally, -so the customer need not worry about it. -.PP -.I Dbz -stores only the database -.IR value s -in its files, relying on reference to the base file to confirm a hit on a key. -References to the base file can be minimized, greatly speeding up searches, -if a little bit of information about the keys can be stored in the -.I dbz -files. -This is ``free'' if there are some unused bits in an -.I fseek -offset, -so that the offset can be -.I tagged -with some information about the key. -The -.I tagmask -parameter of -.I dbzfresh -allows specifying the location of unused bits. -.I Tagmask -should be a mask with -one group of -contiguous -.B 1 -bits. -The bits in the mask should -be unused (0) in -.I most -offsets. -The bit immediately above the mask (the -.I flag -bit) should be unused (0) in -.I all -offsets; -.I (dbz)store -will reject attempts to store a key-value pair in which the -.I value -has the flag bit on. -Apart from this restriction, tagging is invisible to the user. -As a special case, a -.I tagmask -of 1 means ``no tagging'', for use with enormous base files or -on systems with unusual offset representations. -.PP -A -.I size -of 0 -given to -.I dbzfresh -is synonymous with the local default; -the normal default is suitable for tables of 90-100,000 -key-value pairs. -A -.I cmap -of 0 (NUL) is synonymous with the character -.BR 0 , -signifying no case mapping -(note that the character -.B ? -specifies the local default mapping, -normally -.BR C ). -A -.I tagmask -of 0 is synonymous with the local default tag mask, -normally 0x7f000000 (specifying the top bit in a 32-bit offset -as the flag bit, and the next 7 bits as the mask, -which is suitable for base files up to circa 24MB). -Calling -.I dbminit(name) -with the database files empty is equivalent to calling -.IR dbzfresh(name,0,'\et','?',0) . -.PP -When databases are regenerated periodically, as in news, -it is simplest to pick the parameters for a new database based on the old one. -This also permits some memory of past sizes of the old database, so that -a new database size can be chosen to cover expected fluctuations. -.I Dbzagain -is a variant of -.I dbminit -for creating a new database as a new generation of an old database. -The database files for -.I oldbase -must exist. -.I Dbzagain -is equivalent to calling -.I dbzfresh -with the same field separator, case mapping, and tag mask as the old database, -and a -.I size -equal to the result of applying -.I dbzsize -to the largest number of entries in the -.I oldbase -database and its previous 10 generations. -.PP -When many accesses are being done by the same program, -.I dbz -is massively faster if its first hash table is in memory. -If an internal flag is 1, -an attempt is made to read the table in when -the database is opened, and -.I dbmclose -writes it out to disk again (if it was read successfully and -has been modified). -.I Dbzincore -sets the flag to -.I newvalue -(which should be 0 or 1) -and returns the previous value; -this does not affect the status of a database that has already been opened. -The default is 0. -The attempt to read the table in may fail due to memory shortage; -in this case -.I dbz -quietly falls back on its default behavior. -.IR Store s -to an in-memory database are not (in general) written out to the file -until -.IR dbmclose -or -.IR dbzsync , -so if robustness in the presence of crashes -or concurrent accesses -is crucial, in-memory databases -should probably be avoided. -.PP -.I Dbzsync -causes all buffers etc. to be flushed out to the files. -It is typically used as a precaution against crashes or concurrent accesses -when a -.IR dbz -using -process will be running for a long time. -It is a somewhat expensive operation, -especially -for an in-memory database. -.PP -.I Dbzcancel -cancels any pending writes from buffers. -This is typically useful only for in-core databases, since writes are -otherwise done immediately. -Its main purpose is to let a child process, in the wake of a -.IR fork , -do a -.I dbmclose -without writing its parent's data to disk. -.PP -If -.I dbz -has been compiled with debugging facilities available (which makes it -bigger and a bit slower), -.I dbzdebug -alters the value (and returns the previous value) of an internal flag -which (when 1; default is 0) causes -verbose and cryptic debugging output on standard output. -.PP -Concurrent reading of databases is fairly safe, -but there is no (inter)locking, -so concurrent updating is not. -.PP -The database files include a record of the byte order of the processor -creating the database, and accesses by processors with different byte -order will work, although they will be slightly slower. -Byte order is preserved by -.IR dbzagain . -However, -agreement on the size and internal structure of an -.I fseek -offset is necessary, as is consensus on -the character set. -.PP -An open database occupies three -.I stdio -streams and their corresponding file descriptors; -a fourth is needed for an in-memory database. -Memory consumption is negligible (except for -.I stdio -buffers) except for in-memory databases. -.SH SEE ALSO -dbz(1), dbm(3) -.SH DIAGNOSTICS -Functions returning -.I int -values return 0 for success, \-1 for failure. -Functions returning -.I datum -values return a value with -.I dptr -set to NULL for failure. -.I Dbminit -attempts to have -.I errno -set plausibly on return, but otherwise this is not guaranteed. -An -.I errno -of -.B EDOM -from -.I dbminit -indicates that the database did not appear to be in -.I dbz -format. -.SH HISTORY -The original -.I dbz -was written by -Jon Zeeff (zeeff@b-tech.ann-arbor.mi.us). -Later contributions by David Butler and Mark Moraes. -Extensive reworking, -including this documentation, -by Henry Spencer (henry@zoo.toronto.edu) as -part of the C News project. -Hashing function by Peter Honeyman. -.SH BUGS -The -.I dptr -members of returned -.I datum -values point to static storage which is overwritten by later calls. -.PP -Unlike -.IR dbm , -.I dbz -will misbehave if an existing key-value pair is `overwritten' by -a new -.I (dbz)store -with the same key. -The user is responsible for avoiding this by using -.I (dbz)fetch -first to check for duplicates; -an internal optimization remembers the result of the -first search so there is minimal overhead in this. -.PP -Waiting until after -.I dbminit -to bring the base file into existence -will fail if -.IR chdir (2) -has been used meanwhile. -.PP -The RFC822 case mapper implements only a first approximation to the -hideously-complex RFC822 case rules. -.PP -The prime finder in -.I dbzsize -is not particularly quick. -.PP -Should implement the -.I dbm -functions -.IR delete , -.IR firstkey , -and -.IR nextkey . -.PP -On C implementations which trap integer overflow, -.I dbz -will refuse to -.I (dbz)store -an -.I fseek -offset equal to the greatest -representable -positive number, -as this would cause overflow in the biased representation used. -.PP -.I Dbzagain -perhaps ought to notice when many offsets -in the old database were -too big for -tagging, and shrink the tag mask to match. -.PP -Marking -.IR dbz 's -file descriptors -.RI close-on- exec -would be a better approach to the problem -.I dbzcancel -tries to address, but that's harder to do portably. |