MetaKit for Tcl

The structured database which fits in the palm of your hand            

[ Overview | Terminology | Installation | Getting started | Mk4tcl Reference ]

What it is - MetaKit is an embeddable database which runs on Unix, Windows, Macintosh, and other platforms. It lets you build applications which store their data efficiently, in a portable way, and which will not need a complex runtime installation. In terms of the data model, MetaKit takes the middle ground between RDBMS, OODBMS, and flat-file databases - yet it is quite different from each of them.

What it isn't - MetaKit is not: 1) an SQL database, 2) multi-user, 3) scalable to gigabytes, 4) proprietary software, 5) a toy.

Technology - Everything is stored variable-sized yet with efficient positional row access. Changing an existing datafile structure is as simple as re-opening it with that new structure. All changes are transacted. You can mix and match software written in C++, Python, and Tcl. Things can't get much more flexible...

Tcl/Tk - The extension for Tcl is called "Mk4tcl". It is being used in a number of commercial projects, for in-house use as well as in commercially distributed products.

Mk4tcl 2.01 - is the latest release. The homepage points to a download area with pre-compiled shared libraries for Unix, Windows, and Macintosh. The MetaKit source distribution includes this documentation, the Mk4tcl C++ source code, a small Tcl test suite, a "mkshow.tcl" utility which lets you examine data in any MetaKit datafile from the command line, and a few more goodies.

Changes since 2.0 - Mk4tcl is now part of MetaKit 2.0, and adds:

  • a fix for the "mk::file views" command, which forgot to list the first view
  • Tequila can act as proxy server for most mk::* commands (see tcl/tequila/)
  • a fix for a free space management problem in the C++ core library
  • see the change log for further details (it's in the file called "CHANGES")

    License and support - MetaKit 2.01 is distributed under the liberal X/MIT-style open source license. Commercial support is available through an Enterprise License. See the license page for details.

    Credits - Are due to Mark Roseman for providing the initial incentive and feedback, and to Matt Newman for a range of suggestions and ideas. Evidently, Mk4tcl could not exist without the Tcl/Tk scripting platform and its superb extensibility.

    Updates - The latest version of this document is at http://www.equi4.com/metakit/tcl.html.


    Overview

    MetaKit is a machine- and language-independent toolkit for storing and managing structured data. This is a description of the Mk4tcl extension, which allows you to create, access, and manipulate MetaKit datafiles using Tcl. Here is a Tcl script which selects, sorts, and displays some previously stored results:
        mk::file open db phonebook.dat -readonly
        foreach i [mk::select db.persons -glob name Jon* -sort date] {
            puts "Found [mk::get db.persons!$i name phone date]"
        }
    This script illustrates how easy it is to access stored data from Tcl. What it does not show, however, is that numeric data can be stored in binary format (yet remain fully portable), that datafiles can contain complex (nested) datastructures, that the structure of datafiles can be adjusted at any time, and that all modifications use the commit / rollback transaction model.

    In actual use, MetaKit resembles an array manipulation package more than a database, with the main access mechanism being 'by position', not by primary key. The Tcl interface does not yet cover all operations provided by the complete C++ interface of MetaKit, but as the mk::select command illustrates, it does include quite flexible forms of searching and sorting.


    Terminology

    There are several ways to say the same thing, depending on where you're coming from. For example, the terms table, list, collection, array, sequence, and vector all denote a more or less similar concept. To help avoid confusion, MetaKit uses a simple (but hopefully precise) terminology.

    The terms adopted by MetaKit can be summarized as follows:

    The Mk4tcl extension adds several notational conventions:

    A few more comments about the semantics of MetaKit:


    Installation

    1. Download the latest version from http://www.equi4.com/pub/download.html
    2. On Unix, rename the appropriate compiled extension to "Mk4tcl.so" (on Win/Mac, use the corresponding file)
    3. Do a small test, by running "demo.tcl". If all is well, you should get some self-explanatory output
    4. Place the extension somewhere on Tcl's package search path (or just leave it in ".")


    Getting started

    Create a database:
        package require Mk4tcl
        mk::file open db datafile.mk
    Create a view (this is the MetaKit term for "table"):
        set vw [mk::view layout db.people {first last shoesize:I}]
    Add two rows (this is the MetaKit term for "record"):
        mk::row append $vw first John last Lennon shoesize 44
        mk::row append $vw first Flash last Gordon shoesize 42
    Commit the changes to file:
        mk::file commit db
    Show a list of all people:
        mk::loop c $vw {puts [mk::get $c first last shoesize]}
    Show a list of all people, sorted by last name:
        foreach r [mk::select $vw -sort last] {puts [mk::get $vw!$r]}
    Show a list of all people with first name 'John':
        foreach r [mk::select $vw first John] {puts [mk::get $vw!$r]}


    Mk4tcl Reference

    mk::file      Opening, closing, and saving datafiles
    mk::view      View structure and size operations
    mk::cursor    Cursor variables for positioning
    mk::row       Create, insert, and delete rows
    mk::get       Fetch values
    mk::set       Store values
    mk::loop      Iterate over the rows of a view
    mk::select    Selection and sorting
    mk::channel   Channel interface (new in 1.2)


    mk::file

    Opening, closing, and saving datafiles

    SYNOPSIS
    mk::file  open
    mk::file  open  tag
    mk::file  open  tag  filename  ?-readonly?  ?-nocommit?  
    mk::file  views  tag  
    mk::file  close  tag  
    mk::file  commit  tag  
    mk::file  rollback  tag  
    mk::file  load  tag  channel  
    mk::file  save  tag  channel  

    DESCRIPTION
    The mk::file command is used to open and close MetaKit datafiles. It is also used to force pending changes to disk (commit), to cancel the last changes (rollback), and to send/receive the entire contents of a datafile over a Tcl channel, including sockets (load/save).

    Without arguments, 'mk::file open' returns the list of tags and filenames of all datasets which are currently open (of the form tag1 name1 tag2 name2 ...).

    The 'mk::file open' command associates a datafile with a unique symbolic tag. A tag must consist of alphanumeric characters, and is used in the other commands to refer to a specific open datafile. If filename is omitted, a temporary in-memory dataset is created (which cannot use commit, but which you could save to an I/O channel). When a datafile is closed, all pending changes will be written to file, unless the -nocommit option is specified. In that case, only an explicit commit will save changes. To open a file only for reading, use the -readonly option. Datafiles can be opened read-only by any number of readers, or by a single writer (no other combinations are allowed).

    The 'mk::file views' command returns a list with the views currently defined in the open datafile associated with tag. You can use the 'mk::view layout' command to determine the current structure of each view.

    The 'mk::file close' command closes the datafile and releases all associated resources. If not opened with -readonly or -nocommit, all pending changes will be saved to file before closing it. A tag loses its special meaning after the corresponding datafile has been closed.

    The 'mk::file commit' command flushes all pending changes to disk. It should not be used on a file opened with the -readonly option. The 'mk::file rollback' command cancels all pending changes and reverts the situation to match what was last stored on file.

    The 'mk::file load' command replaces all views with data read from any Tcl channel. This data must have been generated using 'mk::file save'. Changes are made permanent when commit is called (explicitly or implicitly, when a datafile is closed), or they can be reverted by calling rollback.

    EXAMPLES
    Open a datafile (create it if necessary), for read-write access:
        mk::file open db test.dat
    Display the structure of every view in the datafile:
        foreach v [mk::file views db] {
            puts [mk::view layout db.$v]
        }
    Send all data across a TCP/IP socket connection:
        set chan [socket 127.0.0.1 12345]
        mk::file save db $chan
        close $chan
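
    The save and load subcommands can also be paired through an ordinary file channel instead of a socket. The following is a minimal sketch under stated assumptions, not an excerpt from the MetaKit distribution: the tags "src" and "copy", the view "items", and the filename "backup.bin" are made up for illustration, and both datasets are opened without a filename so that they exist in memory only.

```tcl
package require Mk4tcl

# Hypothetical in-memory source dataset with one small view.
mk::file open src
mk::view layout src.items {name qty:I}
mk::row append src.items name apple qty 3
mk::row append src.items name pear qty 5

# Save the complete dataset to a file channel in binary mode.
set out [open backup.bin w]
fconfigure $out -translation binary
mk::file save src $out
close $out

# Load the saved data into a second in-memory dataset.
mk::file open copy
set in [open backup.bin r]
fconfigure $in -translation binary
mk::file load copy $in
close $in

# Inspect what arrived in the copy.
set views [mk::file views copy]
set count [mk::view size copy.items]
puts "views: $views, rows in items: $count"

# Clean up the datasets and the scratch file.
mk::file close src
mk::file close copy
file delete backup.bin
```

    Setting the channels to binary translation is a precaution here, since the saved data is not text.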


    mk::view

    View structure and size operations

    SYNOPSIS
    mk::view  layout  tag.view  
    mk::view  layout  tag.view  {structure}  
    mk::view  delete  tag.view  
    mk::view  size  path  
    mk::view  size  path  size  
    mk::view  info  path  

    DESCRIPTION
    The mk::view command is used to query or alter the structure of a view in a datafile (layout, delete), as well as the number of rows it contains (size). The last command (info) returns the list of properties currently defined for a view.

    The 'mk::view layout' command returns a description of the current datastructure of tag.view. If a structure is specified, the current data is restructured to match that, by adding new properties with a default value, deleting obsolete ones, and reordering them.

    Structure definitions consist of a list of properties. Subviews are specified as a sublist of two entries: the name and the list of properties in that subview. Note that subviews add two levels of nesting (see phones in the phonebook example below). The type of a property is specified by appending a suffix to the property name (the default type is string):

      :S
      A string property for storing strings of any size, but no null bytes.
      :I
      An integer property for efficiently storing values as integers (1..32 bits).
      :F
      A float property for storing single-precision floating point values (32 bits).
      :D
      A double property for storing double-precision floating point values (64 bits).
      :B
      A binary property for untyped binary data (including null bytes).
      :M
      A memo property for large amounts of binary data (currently treated as :B).

    Properties which are not listed in the layout will only remain set while the datafile is open, but will not be stored. To make properties persist, you must list them in the layout definition, and do so before setting them.

    The 'mk::view delete' command completely removes a view and all the data it contains from a datafile.

    The 'mk::view size' command returns the number of rows contained in the view identified by path. If a size argument is specified, the size of the view is adjusted accordingly, dropping the highest rows if the size is decreased or adding new empty ones if the size is increased. The command 'mk::view size path 0' deletes all rows from a view, but keeps the view in the datafile so rows can be added again later (unlike 'mk::view delete').

    The 'mk::view info' command returns the list of properties which are currently defined for path.

    Note that the layout and delete sub-commands operate only on top-level views (of the form tag.view), whereas size and info take a path as arguments, which is either a top-level view or a nested subview (of the form 'tag.view!index.subview!subindex...etc...subview').

    EXAMPLES
    Define a phonebook view which can store more than one phone number for each person:
        mk::view layout db.book {name address {phones {type phone}}}
    Restructure the view in the datafile, adding an integer date field:
        mk::view layout db.book {name address {phones {type phone}} date:I}
    Delete all phonebook entries as well as its definition from the datafile:
        mk::view delete db.book
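
    The nested path notation ('tag.view!index.subview') is easiest to see with a little data in place. The sketch below reuses the phonebook layout from above in a temporary in-memory dataset. The names and phone numbers are invented for illustration, and the script only assumes the commands described in this document.

```tcl
package require Mk4tcl

# In-memory dataset, so nothing is written to disk.
mk::file open db
mk::view layout db.book {name address {phones {type phone}}}

# One top-level row, then two rows in its "phones" subview.
mk::row append db.book name Ann address "1 Main St"
mk::row append db.book!0.phones type home phone 555-0100
mk::row append db.book!0.phones type work phone 555-0199

# size and info accept nested paths; layout and delete do not.
set nphones [mk::view size db.book!0.phones]
set second  [mk::get db.book!0.phones!1 phone]
puts "Ann has $nphones phone entries, the second one is $second"
puts "subview properties: [mk::view info db.book!0.phones]"

mk::file close db
```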


    mk::cursor

    Cursor variables for positioning

    SYNOPSIS
    mk::cursor  create  cursorName  ?path?  ?index?  
    mk::cursor  position  cursorName  
    mk::cursor  position  cursorName  0  
    mk::cursor  position  cursorName  end  
    mk::cursor  position  cursorName  index  
    mk::cursor  incr  cursorName  ?step?  

    DESCRIPTION
    The mk::cursor command is used to manipulate 'cursor variables', which offer an efficient means of iterating and repositioning a 'reference to a row in a view'. Though cursors are equivalent to strings of the form somepath!N, it is much more efficient to keep a cursor around in a variable and to adjust it (using the position subcommand), than evaluating a 'somepath!$index' expression every time a cursor is expected.

    The 'mk::cursor create' command defines (or redefines) a cursor variable. The index argument defaults to zero. This is a convenience function, since 'mk::cursor create X somePath N' is equivalent to 'set X somePath!N'.

    When both path and index arguments are omitted from the 'mk::cursor create' command, a cursor pointing to an empty temporary view is created, which can be used as buffer for data not stored on file.

    The 'mk::cursor position' command returns the current position of a cursor, i.e. the 0-based index of the row it is pointing to. If an extra argument is specified, the cursor position will be adjusted accordingly. The 'end' pseudo-position is the index of the last row (or -1 if the view is currently empty). Note that if 'X' is a cursor equivalent to somePath!N, then 'mk::cursor position X M' is equivalent to the far less efficient 'set X somePath!M'.

    The 'mk::cursor incr' command adjusts the current position of a cursor with a specified relative step, which can be positive as well as negative. If step is zero, then this command does nothing. The command 'mk::cursor incr X N' is equivalent to 'mk::cursor position X [expr {[mk::cursor position X] + N}]'.
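
    As an illustration of the subcommands above, the following sketch walks a cursor over a small view and then jumps to the last row. The view "nums" and its contents are hypothetical, and the dataset is an in-memory one opened without a filename.

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.nums {n:I}
foreach n {10 20 30} {
    mk::row append db.nums n $n
}

# Equivalent to: set c db.nums!0
mk::cursor create c db.nums 0

# Step through the rows by adjusting the cursor in place.
set values {}
while {[mk::cursor position c] < [mk::view size db.nums]} {
    lappend values [mk::get $c n]
    mk::cursor incr c
}

# Reposition to the last row using the 'end' pseudo-position.
mk::cursor position c end
set last [mk::get $c n]
puts "values: $values, last: $last"

mk::file close db
```

    Note that the cursor subcommands take the name of the cursor variable (c), whereas mk::get takes its value ($c).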


    mk::row

    Create, insert, and delete rows

    SYNOPSIS
    mk::row  create  ?prop  value  ...?  
    mk::row  append  path  ?prop  value  ...?  
    mk::row  insert  cursor  cursor2  ?count?  
    mk::row  delete  cursor  ?count?  
    mk::row  replace  cursor  ?cursor2?  

    DESCRIPTION
    The mk::row command deals with one or more rows of information. There is a command to allocate a temporary row which is not part of any datafile (create), and the usual set of container operations: appending, inserting, deleting, and replacing rows.

    The 'mk::row create' command creates an empty temporary row, which is not stored in any datafile. Each temporary row starts out without any properties. Setting a property in a row will implicitly add that property if necessary. The return value is a unique cursor, pointing to this temporary row. The row (and all data stored in it) will cease to exist when no cursor references to it remain.

    The 'mk::row append' command extends the view with a new row, optionally setting some properties in it to the specified values.

    The 'mk::row insert' command is similar to the append sub-command, inserting the new row in a specified position instead of at the end. The optional count argument can be used to efficiently insert multiple copies of a row.

    The 'mk::row delete' command deletes one or more rows from a view, starting at the row pointed to by cursor.

    The 'mk::row replace' command replaces one row with a copy of another one, or clears its contents if cursor2 is not specified.

    EXAMPLES
    Define a cursor pointing to a new empty row:
        set cursor [mk::row create]
    Initialize a temporary view with 100 copies of the string "Hello":
        mk::cursor create cursor 
        mk::row insert $cursor [mk::row create text Hello] 100


    mk::get

    Fetch values

    SYNOPSIS
    mk::get  cursor  ?-size?
    mk::get  cursor  ?-size?  prop  ...  

    DESCRIPTION
    The mk::get command fetches values from the row specified by cursor.

    Without property arguments, mk::get returns a list of 'prop1 value1 prop2 value2 ...'. This format is most convenient for setting an array variable, as the following example illustrates:

        array set v [mk::get db.phonebook!0]
        parray v
    Note that the cursor argument can be the value of a cursor variable, or it can be synthesized on the spot, as in the above example.

    If the -size option is specified, the size of property values is returned instead of their contents. This is normally in bytes, but for integers it can be a negative value indicating the number of bits used to store ints (-1, -2, or -4). This is an efficient way to determine the sizes of property values without fetching them.

    If arguments are specified in the get command, they are interpreted as property names and a list will be returned containing the values of these properties in the specified order.

    If cursor does not point to a valid row, default values are returned instead (no properties, and empty strings or numeric zeros, according to the property types).

    EXAMPLES
    Set up an array containing all the fields in the third row:
        array set fields [mk::get db.phonebook!2]
    Create a line with some formatted fields:
        puts [eval [list format {%-20s %d}] [mk::get db.phonebook!2 name date]]


    mk::set

    Store values

    SYNOPSIS
    mk::set  cursor  ?prop  value  ...?  

    DESCRIPTION
    The mk::set command stores values into the row specified by cursor.

    If a property is specified which does not exist, it will be appended as a new definition for the containing view. As an important side effect, all other rows in this view will now also have such a property, with an appropriate default value for the property. Note that when new properties are defined in this way, they will be created as string properties unless qualified by a type suffix (see 'mk::view layout' for details on property types and their default values).

    Using the mk::set command without specifying properties returns the current value and is identical to mk::get.

    If cursor points to a non-existent row past the end of the view, an appropriate number of empty rows will be inserted first.
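
    The automatic insertion of empty rows can be sketched as follows. The view "people" and the values stored are hypothetical, and the dataset is a temporary in-memory one.

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.people {name age:I}

# The view is empty, so storing into row 2 first creates rows 0 and 1.
mk::set db.people!2 name Carol age 41

set size [mk::view size db.people]
set row0 [mk::get db.people!0 name age]
set row2 [mk::get db.people!2 name age]
puts "size: $size"
puts "row 0 holds defaults: $row0"
puts "row 2 holds: $row2"

mk::file close db
```

    The rows created along the way hold default values: an empty string for name and zero for age.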


    mk::loop

    Iterate over the rows of a view

    SYNOPSIS
    mk::loop  cursorName  {body}  
    mk::loop  cursorName  path  {body}  
    mk::loop  cursorName  path  first  ?limit?  ?step?  {body}  

    DESCRIPTION
    The mk::loop command offers a convenient way to iterate over the rows of a view. Iteration can be restricted to a certain range, and can optionally use a forward or backward step. This is a convenience function which is more efficient than performing explicit iteration over an index and positioning a cursor.

    When called with just a path argument, the loop will iterate over all the rows in the corresponding view. The cursorName loop variable will be set (or reset) on each iteration, and is created if it did not yet exist.

    When path is not specified, the cursorName variable must exist and be a valid cursor, although its current position will be ignored. The command 'mk::loop X {...}' is identical to 'mk::loop X $X {...}'.

    The first argument specifies the first index position to use (default 0), the limit argument specifies the index position at which iteration stops (default 'end'), and the step argument specifies the increment (default 1). If step is negative and limit exceeds first, then the loop body will never be executed. A zero step value can lead to infinite looping unless the break command is called inside the loop.

    The first, limit, and step arguments may be arbitrary integer expressions and are evaluated exactly once when the loop is entered.

    Note that you cannot easily use a loop to insert or delete rows, since changes to views do not adjust cursors pointing into that view. Instead, you can use tricks like moving backwards (for deletions), or splitting the work into two separate passes.
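
    One way to apply the "moving backwards" trick is sketched below. It uses a plain Tcl for loop over explicit row indices rather than mk::loop, so that it does not depend on how the limit argument is interpreted. The view "nums" and its contents are hypothetical.

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.nums {n:I}
foreach n {1 2 3 4 5 6} {
    mk::row append db.nums n $n
}

# Visit rows from last to first: deleting row i only shifts rows
# above i, which have already been visited.
for {set i [expr {[mk::view size db.nums] - 1}]} {$i >= 0} {incr i -1} {
    if {[mk::get db.nums!$i n] % 2 == 0} {
        mk::row delete db.nums!$i
    }
}

# A read-only pass over the remaining rows can use mk::loop safely.
set left {}
mk::loop c db.nums {
    lappend left [mk::get $c n]
}
puts "remaining: $left"

mk::file close db
```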


    mk::select

    Selection and sorting

    SYNOPSIS
    mk::select  path  ?options  ...?  

    DESCRIPTION
    The mk::select command combines a flexible selection operation with a way to sort the resulting set of rows. The result is a list of row index numbers (possibly empty), which can be used to reposition a cursor and to address rows directly.

    A selection is specified using any combination of these criteria:

      prop value
      Numeric or case-insensitive match
      -min prop value
      Property must be greater than or equal to value (case is ignored)
      -max prop value
      Property must be less than or equal to value (case is ignored)
      -exact prop value
      Exact case-sensitive string match
      -glob prop pattern
      Match "glob-style" wildcard pattern
      -globnc prop pattern
      Match "glob-style" expression, ignoring case
      -regexp prop pattern
      Match specified regular expression
      -keyword prop word
      Match word as free text or partial prefix
    If multiple criteria are specified, then selection succeeds only if all criteria are satisfied. If prop is a list, selection succeeds if any of the given properties satisfies the corresponding match.

    Optional selection constraints:

      -first pos
      Selection starts at specified row index
      -count num
      Return no more than this many results
    Note: not yet very useful with sorting, which is done after these constraints have been applied.

    To sort the set of rows (with or without preliminary selection), use:

      -sort prop
      -sort {prop ...}
      Sort on one or more properties, ascending
      -rsort prop
      -rsort {prop ...}
      Sort on one or more properties, descending
    Multiple sort options are combined in the order given.

    EXAMPLES
    Select a range of entries:
        foreach i [mk::select db.phonebook -min date 19980101 -max date 19980131] {
            puts "Dated Jan 1998: [mk::get db.phonebook!$i name]"
        }
    Search for a unique match ('-count 2' speeds up selection when many entries match):
        set v [mk::select db.phonebook -count 2 -glob name John*]
        switch [llength $v] {
            0       {puts "not found"}
            1       {puts "found: [mk::get db.phonebook![lindex $v 0] name]"}
            2       {puts "there is more than one entry matching 'John*'"}
        }
    Sort by descending date and by ascending name:
        foreach i [mk::select db.phonebook -rsort date -sort name] {
            puts "Change log: [mk::get db.phonebook!$i date name]"
        }
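
    The examples above assume an existing phonebook. For experimenting, the following self-contained sketch builds a small in-memory view first and then combines several of the criteria described earlier. The view "people" and its rows are made up for illustration.

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.people {first last shoesize:I}
mk::row append db.people first John  last Lennon shoesize 44
mk::row append db.people first Flash last Gordon shoesize 42
mk::row append db.people first John  last Doe    shoesize 40

# Both criteria must hold: first name John AND shoesize of at least 41.
set rows [mk::select db.people -exact first John -min shoesize 41]

# No selection criteria, just sorting on last name.
set sorted [mk::select db.people -sort last]

# A list of properties: either first OR last may match the pattern.
set either [mk::select db.people -glob {first last} G*]

foreach i $rows {
    puts "match: [mk::get db.people!$i first last shoesize]"
}
puts "row order by last name: $sorted"
puts "rows with a name starting with G: $either"

mk::file close db
```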


    mk::channel

    Channel interface

    SYNOPSIS
    mk::channel  path  prop  ?mode?  

    DESCRIPTION
    The mk::channel command provides a channel interface to memo fields. It needs the path of a row and the name of a memo prop, and returns a channel descriptor which can be read from or written to.

    Channels are opened in one of three modes:

      read - open for reading existing contents (default)
      write - clear contents and start saving data
      append - keep contents, set seek pointer to end

    Note: do not insert or delete rows in a view within which there are open channels, because subsequent reads and writes may end up going to the wrong memo property.

    EXAMPLES
    Write a few values (with line separators):
        mk::view layout db.v {m:M}
        mk::view size db.v 1
    
        set fd [mk::channel db.v!0 m w]
        puts $fd one
        puts $fd two
        puts $fd three
        close $fd
    Read values back, line by line:
        set fd [mk::channel db.v!0 m]
        while {[gets $fd text] >= 0} {
            puts $text
        }
        close $fd


    © 1999 Jean-Claude Wippler <jcw@equi4.com>