path: root/boehm-gc/doc
author    Tom Tromey <tromey@gcc.gnu.org>    2001-08-17 18:39:15 +0000
committer Tom Tromey <tromey@gcc.gnu.org>    2001-08-17 18:39:15 +0000
commit    61f38a77a0c4f467c59707ce434d34eae52d6692 (patch)
tree      b8e0afefff8ad4f413b4d19a28a69f1e15de225b /boehm-gc/doc
parent    b38a75e57d32c479904ae185ad716f66ab44c591 (diff)
Initial revision
From-SVN: r44969
Diffstat (limited to 'boehm-gc/doc')
-rw-r--r--  boehm-gc/doc/README.autoconf    59
-rw-r--r--  boehm-gc/doc/README.macros      78
-rw-r--r--  boehm-gc/doc/debugging.html    289
-rw-r--r--  boehm-gc/doc/gcdescr.html      438
-rw-r--r--  boehm-gc/doc/tree.html         198
5 files changed, 1062 insertions, 0 deletions
diff --git a/boehm-gc/doc/README.autoconf b/boehm-gc/doc/README.autoconf
new file mode 100644
index 0000000..53fcf5a
--- /dev/null
+++ b/boehm-gc/doc/README.autoconf
@@ -0,0 +1,59 @@
+As of GC6.0alpha8, we attempt to support GNU-style builds based on automake,
+autoconf and libtool. This is based almost entirely on Tom Tromey's work
+with gcj.
+
+To build and install libraries use
+
+configure; make; make install
+
+The advantages of this process are:
+
+1) It should eventually do a better job of automatically determining the
+right compiler to use, etc. It probably already does in some cases.
+
+2) It tries to automatically set a good set of default GC parameters for
+the platform (e.g. thread support). It provides an easier way to configure
+some of the others.
+
+3) It integrates better with other projects using a GNU-style build process.
+
+4) It builds both dynamic and static libraries.
+
+The known disadvantages are:
+
+1) The build scripts are much more complex and harder to debug (though largely
+standard). I don't understand them all, and there's probably lots of redundant
+stuff.
+
+2) It probably doesn't work on all Un*x-like platforms yet. It probably will
+never work on the rest.
+
+3) The scripts are not yet complete. Some of the standard GNU targets don't
+yet work. (Corrections/additions are very welcome.)
+
+The distribution should contain all files needed to run "configure" and "make",
+as well as the sources needed to regenerate the derived files. (If I missed
+some, please let me know.)
+
+Note that the distribution comes with a "Makefile" which will be overwritten
+by "configure" with one that is not at all equiavelent to the original. The
+distribution contains a copy of the original "Makefile" in "Makefile.direct".
+
+Important options to configure:
+
+ --prefix=PREFIX install architecture-independent files in PREFIX
+ [/usr/local]
+ --exec-prefix=EPREFIX install architecture-dependent files in EPREFIX
+ [same as prefix]
+ --enable-threads=TYPE choose threading package
+ --enable-parallel-mark parallelize marking and free list construction
+ --enable-full-debug include full support for pointer backtracing etc.
+
+Unless --prefix is set (or --exec-prefix or one of the more obscure options),
+make install will install libgc.a and libgc.so in /usr/local/lib, which
+would typically require the "make install" to be run as root.
+
+Most commonly --enable-threads=posix will be needed.  --enable-parallel-mark
+is recommended for multiprocessors if it is supported on the platform.
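+
+For example, a multiprocessor build with POSIX threads and parallel marking,
+installed under a private prefix, might be configured like this (the prefix
+and the option choices here are illustrative only):
+
+configure --prefix=/opt/gc --enable-threads=posix --enable-parallel-mark
+make
+make install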
+
+
diff --git a/boehm-gc/doc/README.macros b/boehm-gc/doc/README.macros
new file mode 100644
index 0000000..d9df8dd
--- /dev/null
+++ b/boehm-gc/doc/README.macros
@@ -0,0 +1,78 @@
+The collector uses a large amount of conditional compilation in order to
+deal with platform dependencies. This violates a number of known coding
+standards. On the other hand, it seems to be the only practical way to
+support this many platforms without excessive code duplication.
+
+A few guidelines have mostly been followed in order to keep this manageable:
+
+1) #if and #ifdef directives are properly indented whenever easily possible.
+All known C compilers allow whitespace between the "#" and the "if" to make
+this possible. ANSI C also allows white space before the "#", though we
+avoid that. It has the known disadvantages that it differs from the normal
+GNU conventions, and that it makes patches larger than otherwise necessary.
+In my opinion, it's still well worth it, for the same reason that we indent
+ordinary "if" statements.
+
+2) Whenever possible, tests are performed on the macros defined in gcconfig.h
+instead of directly testing platform-specific predefined macros.  This makes it
+relatively easy to adapt to new compilers with a different set of predefined
+macros. Currently these macros generally identify platforms instead of
+features. In many cases, this is a mistake.
+
+3) The code currently avoids #elif, even though that would make it more
+readable. This was done since #elif would need to be understood by ALL
+compilers used to build the collector, and that hasn't always been the case.
+It makes sense to reconsider this decision at some point, since #elif has been
+standardized at least since 1989.
+
+Many of the tested configuration macros are at least somewhat defined in
+either include/private/gcconfig.h or in Makefile.direct. Here is an attempt
+at defining some of the remainder: (Thanks to Walter Bright for suggesting
+this. This is a work in progress)
+
+MACRO EXPLANATION
+----- -----------
+
+__DMC__ Always #define'd by the Digital Mars compiler. Expands
+ to the compiler version number in hex, i.e. 0x810 is
+ version 8.1b0
+
+_ENABLE_ARRAYNEW
+ #define'd by the Digital Mars C++ compiler when
+ operator new[] and delete[] are separately
+ overloadable. Used in gc_cpp.h.
+
+_MSC_VER Expands to the Visual C++ compiler version. Assumed to
+ not be defined for other compilers (at least if they behave
+ appreciably differently).
+
+_DLL Defined by Visual C++ if dynamic libraries are being built
+ or used. Used to test whether __declspec(dllimport) or
+ __declspec(dllexport) needs to be added to declarations
+		to support the case in which the collector is in a dll
+		(a sketch of this logic appears after this table).
+
+GC_DLL User-settable macro that forces the effect of _DLL.
+
+GC_NOT_DLL User-settable macro that overrides _DLL, e.g. if dynamic
+ libraries are used, but the collector is in a static library.
+
+__STDC__ Assumed to be defined only by compilers that understand
+ prototypes and other C89 features. Its value is generally
+ not used, since we are fine with most nonconforming extensions.
+
+SUNOS5SIGS Solaris-like signal handling. This is probably misnamed,
+ since it really doesn't guarantee much more than Posix.
+ Currently set only for Solaris2.X, HPUX, and DRSNX. Should
+ probably be set for some other platforms.
+
+PCR Set if the collector is being built as part of the Xerox
+ Portable Common Runtime.
+
+SRC_M3 Set if the collector is being built as a replacement of the
+ one in the DEC/Compaq SRC Modula-3 runtime. I suspect this
+ was last used around 1994, and no doubt broke a long time ago.
+		It's there primarily in case someone wants to port to a similar
+ system.
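+
+As an illustration of how the _DLL, GC_DLL and GC_NOT_DLL entries above
+typically interact, a header might select the right __declspec roughly as
+sketched below.  The GC_API and GC_BUILD names are illustrative only; the
+actual header may differ.
+
+  #include <stddef.h>
+
+  #if defined(GC_DLL) || (defined(_DLL) && !defined(GC_NOT_DLL))
+  # ifdef GC_BUILD
+      /* Compiling the collector itself: export its entry points.   */
+  #   define GC_API __declspec(dllexport)
+  # else
+      /* Compiling client code: import the collector's entry points. */
+  #   define GC_API __declspec(dllimport)
+  # endif
+  #else
+    /* Static library: no decoration is needed.                      */
+  # define GC_API
+  #endif
+
+  GC_API void * GC_malloc(size_t size_in_bytes);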
+
+
+
diff --git a/boehm-gc/doc/debugging.html b/boehm-gc/doc/debugging.html
new file mode 100644
index 0000000..a186ff5
--- /dev/null
+++ b/boehm-gc/doc/debugging.html
@@ -0,0 +1,289 @@
+<HTML>
+<HEAD>
+<TITLE>Debugging Garbage Collector Related Problems</title>
+</head>
+<BODY>
+<H1>Debugging Garbage Collector Related Problems</h1>
+This page contains some hints on
+debugging issues specific to
+the Boehm-Demers-Weiser conservative garbage collector.
+It applies both to debugging issues in client code that manifest themselves
+as collector misbehavior, and to debugging the collector itself.
+<P>
+If you suspect a bug in the collector itself, it is strongly recommended
+that you try the latest collector release, even if it is labelled as "alpha",
+before proceeding.
+<H2>Bus Errors and Segmentation Violations</h2>
+<P>
+If the fault occurred in GC_find_limit, or with incremental collection enabled,
+this is probably normal. The collector installs handlers to take care of
+these. You will not see these unless you are using a debugger.
+Your debugger <I>should</i> allow you to continue.
+It's often preferable to tell the debugger to ignore SIGBUS and SIGSEGV
+("<TT>handle SIGSEGV SIGBUS nostop noprint</tt>" in gdb,
+"<TT>ignore SIGSEGV SIGBUS</tt>" in most versions of dbx)
+and set a breakpoint in <TT>abort</tt>.
+The collector will call abort if the signal had another cause,
+and there was no other handler previously installed.
+<P>
+We recommend debugging without incremental collection if possible.
+(This applies directly to UNIX systems.
+Debugging with incremental collection under win32 is worse. See README.win32.)
+<P>
+If the application generates an unhandled SIGSEGV or equivalent, it may
+often be easiest to set the environment variable GC_LOOP_ON_ABORT. On many
+platforms, this will cause the collector to loop in a handler when the
+SIGSEGV is encountered (or when the collector aborts for some other reason),
+and a debugger can then be attached to the looping
+process. This sidesteps common operating system problems related
+to incomplete core files for multithreaded applications, etc.
+<H2>Other Signals</h2>
+On most platforms, the multithreaded version of the collector needs one or
+two other signals for internal use by the collector in stopping threads.
+It is normally wise to tell the debugger to ignore these. On Linux,
+the collector currently uses SIGPWR and SIGXCPU by default.
+<H2>Warning Messages About Needing to Allocate Blacklisted Blocks</h2>
+The garbage collector generates warning messages of the form
+<PRE>
+Needed to allocate blacklisted block at 0x...
+</pre>
+when it needs to allocate a block at a location that it knows to be
+referenced by a false pointer. These false pointers can be either permanent
+(<I>e.g.</i> a static integer variable that never changes) or temporary.
+In the latter case, the warning is largely spurious, and the block will
+eventually be reclaimed normally.
+In the former case, the program will still run correctly, but the block
+will never be reclaimed. Unless the block is intended to be
+permanent, the warning indicates a memory leak.
+<OL>
+<LI>Ignore these warnings while you are using GC_DEBUG. Some of the routines
+mentioned below don't have debugging equivalents. (Alternatively, write
+the missing routines and send them to me.)
+<LI>Replace allocator calls that request large blocks with calls to
+<TT>GC_malloc_ignore_off_page</tt> or
+<TT>GC_malloc_atomic_ignore_off_page</tt>. You may want to set a
+breakpoint in <TT>GC_default_warn_proc</tt> to help you identify such calls.
+Make sure that a pointer to somewhere near the beginning of the resulting block
+is maintained in a (preferably volatile) variable as long as
+the block is needed.
+<LI>
+If the large blocks are allocated with realloc, we suggest instead allocating
+them with something like the following. Note that the realloc size increment
+should be fairly large (e.g. a factor of 3/2) for this to exhibit reasonable
+performance. But we all know we should do that anyway.
+<PRE>
+void * big_realloc(void *p, size_t new_size)
+{
+    size_t old_size = GC_size(p);  /* size as actually allocated by the GC */
+    void * result;
+
+    /* Small requests: the ordinary realloc path is fine. */
+    if (new_size <= 10000) return(GC_realloc(p, new_size));
+    /* The existing block is already big enough: keep it. */
+    if (new_size <= old_size) return(p);
+    result = GC_malloc_ignore_off_page(new_size);
+    if (result == 0) return(0);
+    memcpy(result,p,old_size);
+    GC_free(p);
+    return(result);
+}
+</pre>
+
+<LI> In the unlikely case that even relatively small object
+(&lt;20KB) allocations are triggering these warnings, then your address
+space contains lots of "bogus pointers", i.e. values that appear to
+be pointers but aren't. Usually this can be solved by using GC_malloc_atomic
+or the routines in gc_typed.h to allocate large pointer-free regions for
+bitmaps, etc.  Sometimes the problem can be solved with trivial changes of
+encoding in certain values.  It is possible to identify the source of the
+bogus pointers by building the collector with <TT>-DPRINT_BLACK_LIST</tt>,
+which will cause it to print the "bogus pointers", along with their location.
+
+<LI> If you get only a fixed number of these warnings, you are probably only
+introducing a bounded leak by ignoring them. If the data structures being
+allocated are intended to be permanent, then it is also safe to ignore them.
+The warnings can be turned off by calling GC_set_warn_proc with a procedure
+that ignores these warnings (e.g. by doing absolutely nothing); a sketch
+appears after this list.
+</ol>
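+<P>
+For instance, a warning procedure that simply discards these messages might
+look like the following sketch; the exact <TT>GC_warn_proc</tt> typedef is
+declared in <TT>gc.h</tt>, and the function names here are arbitrary:
+<PRE>
+#include "gc.h"
+
+/* Discard collector warnings; use only if they are known to be harmless */
+/* for this particular application.                                      */
+static void ignore_gc_warning(char *msg, GC_word arg)
+{
+    /* do absolutely nothing */
+}
+
+/* Call this once, e.g. during program initialization. */
+static void silence_gc_warnings(void)
+{
+    GC_set_warn_proc(ignore_gc_warning);
+}
+</pre>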
+
+<H2>The Collector References a Bad Address in <TT>GC_malloc</tt></h2>
+
+This typically happens while the collector is trying to remove an entry from
+its free list, and the free list pointer is bad because the free list link
+in the last allocated object was bad.
+<P>
+With &gt; 99% probability, you wrote past the end of an allocated object.
+Try setting <TT>GC_DEBUG</tt> before including <TT>gc.h</tt> and
+allocating with <TT>GC_MALLOC</tt>. This will try to detect such
+overwrite errors.
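+<P>
+A minimal sketch of this setup (the <TT>node</tt> type here is purely
+illustrative):
+<PRE>
+#define GC_DEBUG
+#include "gc.h"
+
+struct node { struct node *next; int value; };
+
+struct node *new_node(void)
+{
+    /* With GC_DEBUG defined, GC_MALLOC records the allocation site, */
+    /* which helps in diagnosing overwrites and leaks.               */
+    return (struct node *)GC_MALLOC(sizeof(struct node));
+}
+</pre>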
+
+<H2>Unexpectedly Large Heap</h2>
+
+Unexpected heap growth can be due to one of the following:
+<OL>
+<LI> Data structures that are being unintentionally retained. This
+is commonly caused by data structures that are no longer being used,
+but were not cleared, or by caches growing without bounds.
+<LI> Pointer misidentification. The garbage collector is interpreting
+integers or other data as pointers and retaining the "referenced"
+objects.
+<LI> Heap fragmentation. This should never result in unbounded growth,
+but it may account for larger heaps. This is most commonly caused
+by allocation of large objects. On some platforms it can be reduced
+by building with -DUSE_MUNMAP, which will cause the collector to unmap
+memory corresponding to pages that have not been recently used.
+<LI> Per object overhead. This is usually a relatively minor effect, but
+it may be worth considering. If the collector recognizes interior
+pointers, object sizes are increased, so that one-past-the-end pointers
+are correctly recognized. The collector can be configured not to do this
+(<TT>-DDONT_ADD_BYTE_AT_END</tt>).
+<P>
+The collector rounds up object sizes so the result fits well into the
+chunk size (<TT>HBLKSIZE</tt>, normally 4K on 32 bit machines, 8K
+on 64 bit machines) used by the collector. Thus it may be worth avoiding
+objects of size 2K + 1 (or 2K if a byte is being added at the end.)
+</ol>
+The last two cases can often be identified by looking at the output
+of a call to <TT>GC_dump()</tt>. Among other things, it will print the
+list of free heap blocks, and a very brief description of all chunks in
+the heap, the object sizes they correspond to, and how many live objects
+were found in the chunk at the last collection.
+<P>
+Growing data structures can usually be identified by
+<OL>
+<LI> Building the collector with <TT>-DKEEP_BACK_PTRS</tt>,
+<LI> Preferably using debugging allocation (defining <TT>GC_DEBUG</tt>
+before including <TT>gc.h</tt> and allocating with <TT>GC_MALLOC</tt>),
+so that objects will be identified by their allocation site,
+<LI> Running the application long enough so
+that most of the heap is composed of "leaked" memory, and
+<LI> Then calling <TT>GC_generate_random_backtrace()</tt> from backptr.h
+a few times to determine why some randomly sampled objects in the heap are
+being retained.
+</ol>
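+<P>
+Concretely, once the collector has been built with <TT>-DKEEP_BACK_PTRS</tt>,
+steps 2 and 4 might look like the following sketch (the function name and the
+number of samples are arbitrary):
+<PRE>
+#define GC_DEBUG
+#include "gc.h"
+#include "backptr.h"
+
+/* Call once most of the heap consists of "leaked" memory. */
+void sample_retention_chains(void)
+{
+    int i;
+    for (i = 0; i < 10; i++) GC_generate_random_backtrace();
+}
+</pre>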
+<P>
+The same technique can often be used to identify problems with false
+pointers, by noting whether the reference chains printed by
+<TT>GC_generate_random_backtrace()</tt> involve any misidentified pointers.
+An alternate technique is to build the collector with
+<TT>-DPRINT_BLACK_LIST</tt>, which will cause it to report values that
+almost, but not quite, look like heap pointers.  It is very likely that
+actual false pointers will come from similar sources.
+<P>
+In the unlikely case that false pointers are an issue, it can usually
+be resolved using one or more of the following techniques:
+<OL>
+<LI> Use <TT>GC_malloc_atomic</tt> for objects containing no pointers.
+This is especially important for large arrays containing compressed data,
+pseudo-random numbers, and the like. It is also likely to improve GC
+performance, perhaps drastically so if the application is paging.  (A small
+example appears after this list.)
+<LI> If you allocate large objects containing only
+one or two pointers at the beginning, either try the typed allocation
+primitives in <TT>gc_typed.h</tt>, or separate out the pointer-free component.
+<LI> Consider using <TT>GC_malloc_ignore_off_page()</tt>
+to allocate large objects. (See <TT>gc.h</tt> and above for details.
+Large means &gt; 100K in most environments.)
+</ol>
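+<P>
+A small example of the first technique (the element type and the helper
+function are illustrative only):
+<PRE>
+#include "gc.h"
+
+double *alloc_samples(int n)
+{
+    /* The buffer holds no pointers, so allocate it atomically: its   */
+    /* (random-looking) contents will never be scanned as pointers.   */
+    return (double *)GC_malloc_atomic(n * sizeof(double));
+}
+</pre>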
+<H2>Prematurely Reclaimed Objects</h2>
+The usual symptom of this is a segmentation fault, or an obviously overwritten
+value in a heap object. This should, of course, be impossible. In practice,
+it may happen for reasons like the following:
+<OL>
+<LI> The collector did not intercept the creation of threads correctly in
+a multithreaded application, <I>e.g.</i> because the client called
+<TT>pthread_create</tt> without including <TT>gc.h</tt>, which redefines it.
+<LI> The last pointer to an object in the garbage collected heap was stored
+somewhere where the collector couldn't see it, <I>e.g.</i> in an
+object allocated with system <TT>malloc</tt>, in certain types of
+<TT>mmap</tt>ed files,
+or in some data structure visible only to the OS. (On some platforms,
+thread-local storage is one of these.)
+<LI> The last pointer to an object was somehow disguised, <I>e.g.</i> by
+XORing it with another pointer.
+<LI> Incorrect use of <TT>GC_malloc_atomic</tt> or typed allocation.
+<LI> An incorrect <TT>GC_free</tt> call.
+<LI> The client program overwrote an internal garbage collector data structure.
+<LI> A garbage collector bug.
+<LI> (Empirically less likely than any of the above.) A compiler optimization
+that disguised the last pointer.
+</ol>
+The following relatively simple techniques should be tried first to narrow
+down the problem:
+<OL>
+<LI> If you are using the incremental collector, try turning it off for
+debugging.
+<LI> Try to reproduce the problem with fully debuggable unoptimized code.
+This will eliminate the last possibility, as well as making debugging easier.
+<LI> Try replacing any suspect typed allocation and <TT>GC_malloc_atomic</tt>
+calls with calls to <TT>GC_malloc</tt>.
+<LI> Try removing any GC_free calls (<I>e.g.</i> with a suitable
+<TT>#define</tt>).
+<LI> Rebuild the collector with <TT>-DGC_ASSERTIONS</tt>.
+<LI> If the following works on your platform (i.e. if gctest still works
+if you do this), try building the collector with
+<TT>-DREDIRECT_MALLOC=GC_malloc_uncollectable</tt>. This will cause
+the collector to scan memory allocated with malloc.
+</ol>
+If all else fails, you will have to attack this with a debugger.
+Suggested steps:
+<OL>
+<LI> Call <TT>GC_dump()</tt> from the debugger around the time of the failure. Verify
+that the collector's idea of the root set (i.e. static data regions which
+it should scan for pointers) looks plausible. If not, i.e. if it doesn't
+include some static variables, report this as
+a collector bug. Be sure to describe your platform precisely, since this sort
+of problem is nearly always very platform dependent.
+<LI> Especially if the failure is not deterministic, try to isolate it to
+a relatively small test case.
+<LI> Set a breakpoint in <TT>GC_finish_collection</tt>.  This is a good
+point to examine what has been marked, i.e. found reachable, by the
+collector.
+<LI> If the failure is deterministic, run the process
+up to the last collection before the failure.
+Note that the variable <TT>GC_gc_no</tt> counts collections and can be used
+to set a conditional breakpoint in the right one. It is incremented just
+before the call to GC_finish_collection.
+If object <TT>p</tt> was prematurely recycled, it may be helpful to
+look at <TT>*GC_find_header(p)</tt> at the failure point.
+The <TT>hb_last_reclaimed</tt> field will identify the collection number
+during which its block was last swept.
+<LI> Verify that the offending object still has its correct contents at
+this point.
+Call <TT>GC_is_marked(p)</tt> from the debugger to verify that the
+object has not been marked, and is about to be reclaimed.
+<LI> Determine a path from a root, i.e. static variable, stack, or
+register variable,
+to the reclaimed object. Call <TT>GC_is_marked(q)</tt> for each object
+<TT>q</tt> along the path, trying to locate the first unmarked object, say
+<TT>r</tt>.
+<LI> If <TT>r</tt> is pointed to by a static root,
+verify that the location
+pointing to it is part of the root set printed by <TT>GC_dump()</tt>. If it
+is on the stack in the main (or only) thread, verify that
+<TT>GC_stackbottom</tt> is set correctly to the base of the stack. If it is
+in another thread's stack, check the collector's thread data structure
+(<TT>GC_thread[]</tt> on several platforms) to make sure that stack bounds
+are set correctly.
+<LI> If <TT>r</tt> is pointed to by heap object <TT>s</tt>, check that the
+collector's layout description for <TT>s</tt> is such that the pointer field
+will be scanned. Call <TT>*GC_find_header(s)</tt> to look at the descriptor
+for the heap chunk. The <TT>hb_descr</tt> field specifies the layout
+of objects in that chunk. See gc_mark.h for the meaning of the descriptor.
+(If its low-order 2 bits are zero, then it is just the length of the
+object prefix to be scanned. This form is always used for objects allocated
+with <TT>GC_malloc</tt> or <TT>GC_malloc_atomic</tt>.)
+<LI> If the failure is not deterministic, you may still be able to apply some
+of the above technique at the point of failure. But remember that objects
+allocated since the last collection will not have been marked, even if the
+collector is functioning properly. On some platforms, the collector
+can be configured to save call chains in objects for debugging.
+Enabling this feature will also cause it to save the call stack at the
+point of the last GC in GC_arrays._last_stack.
+<LI> When looking at GC internal data structures remember that a number
+of <TT>GC_</tt><I>xxx</i> variables are really macro defined to
+<TT>GC_arrays._</tt><I>xxx</i>, so that
+the collector can avoid scanning them.
+</ol>
+</body>
+</html>
+
+
+
+
diff --git a/boehm-gc/doc/gcdescr.html b/boehm-gc/doc/gcdescr.html
new file mode 100644
index 0000000..65e8a8f
--- /dev/null
+++ b/boehm-gc/doc/gcdescr.html
@@ -0,0 +1,438 @@
+<HTML>
+<HEAD>
+ <TITLE> Conservative GC Algorithmic Overview </TITLE>
+ <AUTHOR> Hans-J. Boehm, Silicon Graphics</author>
+</HEAD>
+<BODY>
+<H1> <I>This is under construction</i> </h1>
+<H1> Conservative GC Algorithmic Overview </h1>
+<P>
+This is a description of the algorithms and data structures used in our
+conservative garbage collector. I expect the level of detail to increase
+with time. For a survey of GC algorithms, see for example
+<A HREF="ftp://ftp.cs.utexas.edu/pub/garbage/gcsurvey.ps"> Paul Wilson's
+excellent paper</a>. For an overview of the collector interface,
+see <A HREF="gcinterface.html">here</a>.
+<P>
+This description is targeted primarily at someone trying to understand the
+source code. It specifically refers to variable and function names.
+It may also be useful for understanding the algorithms at a higher level.
+<P>
+The description here assumes that the collector is used in default mode.
+In particular, we assume that it is used as a garbage collector, and not just
+a leak detector. We initially assume that it is used in stop-the-world,
+non-incremental mode, though the presence of the incremental collector
+will be apparent in the design.
+We assume the default finalization model, but the code affected by that
+is very localized.
+<H2> Introduction </h2>
+The garbage collector uses a modified mark-sweep algorithm. Conceptually
+it operates roughly in four phases:
+
+<OL>
+
+<LI>
+<I>Preparation</i> Clear all mark bits, indicating that all objects
+are potentially unreachable.
+
+<LI>
+<I>Mark phase</i> Marks all objects that are reachable via chains of
+pointers from variables.  Normally the collector has no real information
+about the location of pointer variables in the heap, so it
+views all static data areas, stacks and registers as potentially
+containing pointers.  Any bit patterns that represent addresses inside
+heap objects managed by the collector are viewed as pointers.
+Unless the client program has made heap object layout information
+available to the collector, any heap objects found to be reachable from
+variables are again scanned similarly.
+
+<LI>
+<I>Sweep phase</i> Scans the heap for inaccessible, and hence unmarked,
+objects, and returns them to an appropriate free list for reuse. This is
+not really a separate phase; even in non-incremental mode this operation
+is usually performed on demand during an allocation that discovers an empty
+free list. Thus the sweep phase is very unlikely to touch a page that
+would not have been touched shortly thereafter anyway.
+
+<LI>
+<I>Finalization phase</i> Unreachable objects which had been registered
+for finalization are enqueued for finalization outside the collector.
+
+</ol>
+
+<P>
+The remaining sections describe the memory allocation data structures,
+and then the last 3 collection phases in more detail. We conclude by
+outlining some of the additional features implemented in the collector.
+
+<H2>Allocation</h2>
+The collector includes its own memory allocator. The allocator obtains
+memory from the system in a platform-dependent way. Under UNIX, it
+uses either <TT>malloc</tt>, <TT>sbrk</tt>, or <TT>mmap</tt>.
+<P>
+Most static data used by the allocator, as well as that needed by the
+rest of the garbage collector is stored inside the
+<TT>_GC_arrays</tt> structure.
+This allows the garbage collector to easily ignore the collector's own
+data structures when it searches for root pointers. Other allocator
+and collector internal data structures are allocated dynamically
+with <TT>GC_scratch_alloc</tt>. <TT>GC_scratch_alloc</tt> does not
+allow for deallocation, and is therefore used only for permanent data
+structures.
+<P>
+The allocator allocates objects of different <I>kinds</i>.
+Different kinds are handled somewhat differently by certain parts
+of the garbage collector. Certain kinds are scanned for pointers,
+others are not. Some may have per-object type descriptors that
+determine pointer locations. Or a specific kind may correspond
+to one specific object layout. Two built-in kinds are uncollectable.
+One (<TT>STUBBORN</tt>) is immutable without special precautions.
+In spite of that, it is very likely that most applications currently
+use at most two kinds: <TT>NORMAL</tt> and <TT>PTRFREE</tt> objects.
+<P>
+The collector uses a two level allocator. A large block is defined to
+be one larger than half of <TT>HBLKSIZE</tt>, which is a power of 2,
+typically on the order of the page size.
+<P>
+Large block sizes are rounded up to
+the next multiple of <TT>HBLKSIZE</tt> and then allocated by
+<TT>GC_allochblk</tt>. This uses roughly what Paul Wilson has termed
+a "next fit" algorithm, i.e. first-fit with a rotating pointer.
+The implementation does check for a better fitting immediately
+adjacent block, which gives it somewhat better fragmentation characteristics.
+I'm now convinced it should use a best fit algorithm. The actual
+implementation of <TT>GC_allochblk</tt>
+is significantly complicated by black-listing issues
+(see below).
+<P>
+Small blocks are allocated in blocks of size <TT>HBLKSIZE</tt>.
+Each block is
+dedicated to only one object size and kind. The allocator maintains
+separate free lists for each size and kind of object.
+<P>
+In order to avoid allocating blocks for too many distinct object sizes,
+the collector normally does not directly allocate objects of every possible
+request size. Instead requests are rounded up to one of a smaller number
+of allocated sizes, for which free lists are maintained. The exact
+allocated sizes are computed on demand, but subject to the constraint
+that they increase roughly in geometric progression. Thus objects
+requested early in the execution are likely to be allocated with exactly
+the requested size, subject to alignment constraints.
+See <TT>GC_init_size_map</tt> for details.
+<P>
+The actual size rounding operation during small object allocation is
+implemented as a table lookup in <TT>GC_size_map</tt>.
+<P>
+Both collector initialization and computation of allocated sizes are
+handled carefully so that they do not slow down the small object fast
+allocation path. An attempt to allocate before the collector is initialized,
+or before the appropriate <TT>GC_size_map</tt> entry is computed,
+will take the same path as an allocation attempt with an empty free list.
+This results in a call to the slow path code (<TT>GC_generic_malloc_inner</tt>)
+which performs the appropriate initialization checks.
+<P>
+In non-incremental mode, we make a decision about whether to garbage collect
+whenever an allocation would otherwise have failed with the current heap size.
+If the total amount of allocation since the last collection is less than
+the heap size divided by <TT>GC_free_space_divisor</tt>, we try to
+expand the heap. Otherwise, we initiate a garbage collection. This ensures
+that the amount of garbage collection work per allocated byte remains
+constant.
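+<P>
+In outline, the decision is roughly the following (a simplification with
+invented helper names; the real logic lives in <TT>GC_collect_or_expand</tt>
+and its callers):
+<PRE>
+/* Invoked when an allocation cannot be satisfied from the current heap. */
+if (bytes_allocated_since_gc < heap_size / GC_free_space_divisor) {
+    expand_heap();       /* too little allocation yet to justify a GC */
+} else {
+    collect_garbage();   /* keep GC work proportional to allocation   */
+}
+</pre>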
+<P>
+The above is in fact an oversimplification of the real heap expansion
+heuristic, which adjusts slightly for root size and certain kinds of
+fragmentation. In particular, programs with a large root set size and
+little live heap memory will expand the heap to amortize the cost of
+scanning the roots.
+<P>
+Versions 5.x of the collector actually collect more frequently in
+nonincremental mode. The large block allocator usually refuses to split
+large heap blocks once the garbage collection threshold is
+reached. This often has the effect of collecting well before the
+heap fills up, thus reducing fragmentation and working set size at the
+expense of GC time. 6.x will choose an intermediate strategy depending
+on how much large object allocation has taken place in the past.
+(If the collector is configured to unmap unused pages, versions 6.x
+will use the 5.x strategy.)
+<P>
+(It has been suggested that this should be adjusted so that we favor
+expansion if the resulting heap still fits into physical memory.
+In many cases, that would no doubt help. But it is tricky to do this
+in a way that remains robust if multiple applications are contending
+for a single pool of physical memory.)
+
+<H2>Mark phase</h2>
+
+The marker maintains an explicit stack of memory regions that are known
+to be accessible, but that have not yet been searched for contained pointers.
+Each stack entry contains the starting address of the block to be scanned,
+as well as a descriptor of the block. If no layout information is
+available for the block, then the descriptor is simply a length.
+(For other possibilities, see <TT>gc_mark.h</tt>.)
+<P>
+At the beginning of the mark phase, all root segments are pushed on the
+stack by <TT>GC_push_roots</tt>. If <TT>ALL_INTERIOR_PTRS</tt> is not
+defined, then stack roots require special treatment. In this case, the
+normal marking code ignores interior pointers, but <TT>GC_push_all_stack</tt>
+explicitly checks for interior pointers and pushes descriptors for target
+objects.
+<P>
+The marker is structured to allow incremental marking.
+Each call to <TT>GC_mark_some</tt> performs a small amount of
+work towards marking the heap.
+It maintains
+explicit state in the form of <TT>GC_mark_state</tt>, which
+identifies a particular sub-phase. Some other pieces of state, most
+notably the mark stack, identify how much work remains to be done
+in each sub-phase. The normal progression of mark states for
+a stop-the-world collection is:
+<OL>
+<LI> <TT>MS_INVALID</tt> indicating that there may be accessible unmarked
+objects. In this case <TT>GC_objects_are_marked</tt> will simultaneously
+be false, so the mark state is advanced to
+<LI> <TT>MS_PUSH_UNCOLLECTABLE</tt> indicating that it suffices to push
+uncollectable objects, roots, and then mark everything reachable from them.
+<TT>Scan_ptr</tt> is advanced through the heap until all uncollectable
+objects are pushed, and objects reachable from them are marked.
+At that point, the next call to <TT>GC_mark_some</tt> calls
+<TT>GC_push_roots</tt> to push the roots.  It then advances the
+mark state to
+<LI> <TT>MS_ROOTS_PUSHED</tt> asserting that once the mark stack is
+empty, all reachable objects are marked. Once in this state, we work
+only on emptying the mark stack. Once this is completed, the state
+changes to
+<LI> <TT>MS_NONE</tt> indicating that reachable objects are marked.
+</ol>
+
+The core mark routine, <TT>GC_mark_from_mark_stack</tt>, is called
+repeatedly by several of the sub-phases when the mark stack starts to fill
+up. It is also called repeatedly in <TT>MS_ROOTS_PUSHED</tt> state
+to empty the mark stack.
+The routine is designed to only perform a limited amount of marking at
+each call, so that it can also be used by the incremental collector.
+It is fairly carefully tuned, since it usually consumes a large majority
+of the garbage collection time.
+<P>
+The marker correctly handles mark stack overflows. Whenever the mark stack
+overflows, the mark state is reset to <TT>MS_INVALID</tt>.
+Since there are already marked objects in the heap,
+this eventually forces a complete
+scan of the heap, searching for pointers, during which any unmarked objects
+referenced by marked objects are again pushed on the mark stack. This
+process is repeated until the mark phase completes without a stack overflow.
+Each time the stack overflows, an attempt is made to grow the mark stack.
+All pieces of the collector that push regions onto the mark stack have to be
+careful to ensure forward progress, even in case of repeated mark stack
+overflows. Every mark attempt results in additional marked objects.
+<P>
+Each mark stack entry is processed by examining all candidate pointers
+in the range described by the entry. If the region has no associated
+type information, then this typically requires that each 4-byte aligned
+quantity (8-byte aligned with 64-bit pointers) be considered a candidate
+pointer.
+<P>
+We determine whether a candidate pointer is actually the address of
+a heap block. This is done in the following steps:
+<OL>
+<LI> The candidate pointer is checked against rough heap bounds.
+These heap bounds are maintained such that all actual heap objects
+fall between them. In order to facilitate black-listing (see below)
+we also include address regions that the heap is likely to expand into.
+Most non-pointers fail this initial test.
+<LI> The candidate pointer is divided into two pieces; the most significant
+bits identify a <TT>HBLKSIZE</tt>-sized page in the address space, and
+the least significant bits specify an offset within that page.
+(A hardware page may actually consist of multiple such pages.
+HBLKSIZE is usually the page size divided by a small power of two.)
+<LI>
+The page address part of the candidate pointer is looked up in a
+<A HREF="tree.html">table</a>.
+Each table entry contains either 0, indicating that the page is not part
+of the garbage collected heap, a small integer <I>n</i>, indicating
+that the page is part of a large object, starting at least <I>n</i> pages
+back, or a pointer to a descriptor for the page. In the first case,
+the candidate pointer is not a true pointer and can be safely ignored.
+In the last two cases, we can obtain a descriptor for the page containing
+the beginning of the object.
+<LI>
+The starting address of the referenced object is computed.
+The page descriptor contains the size of the object(s)
+in that page, the object kind, and the necessary mark bits for those
+objects. The size information can be used to map the candidate pointer
+to the object starting address. To accelerate this process, the page header
+also contains a pointer to a precomputed map of page offsets to displacements
+from the beginning of an object. The use of this map avoids a
+potentially slow integer remainder operation in computing the object
+start address.
+<LI>
+The mark bit for the target object is checked and set. If the object
+was previously unmarked, the object is pushed on the mark stack.
+The descriptor is read from the page descriptor. (This is computed
+from information in <TT>GC_obj_kinds</tt> when the page is first allocated.)
+</ol>
+<P>
+At the end of the mark phase, mark bits for left-over free lists are cleared,
+in case a free list was accidentally marked due to a stray pointer.
+
+<H2>Sweep phase</h2>
+
+At the end of the mark phase, all blocks in the heap are examined.
+Unmarked large objects are immediately returned to the large object free list.
+Each small object page is checked to see if all mark bits are clear.
+If so, the entire page is returned to the large object free list.
+Small object pages containing some reachable object are queued for later
+sweeping.
+<P>
+This initial sweep pass touches only block headers, not
+the blocks themselves. Thus it does not require significant paging, even
+if large sections of the heap are not in physical memory.
+<P>
+Nonempty small object pages are swept when an allocation attempt
+encounters an empty free list for that object size and kind.
+Pages for the correct size and kind are repeatedly swept until at
+least one empty block is found. Sweeping such a page involves
+scanning the mark bit array in the page header, and building a free
+list linked through the first words in the objects themselves.
+This does involve touching the appropriate data page, but in most cases
+it will be touched only just before it is used for allocation.
+Hence any paging is essentially unavoidable.
+<P>
+Except in the case of pointer-free objects, we maintain the invariant
+that any object in a small object free list is cleared (except possibly
+for the link field). Thus it becomes the burden of the small object
+sweep routine to clear objects. This has the advantage that we can
+easily recover from accidentally marking a free list, though that could
+also be handled by other means. The collector currently spends a fair
+amount of time clearing objects, and this approach should probably be
+revisited.
+<P>
+In most configurations, we use specialized sweep routines to handle common
+small object sizes. Since we allocate one mark bit per word, it becomes
+easier to examine the relevant mark bits if the object size divides
+the word length evenly. We also suitably unroll the inner sweep loop
+in each case. (It is conceivable that profile-based procedure cloning
+in the compiler could make this unnecessary and counterproductive. I
+know of no existing compiler to which this applies.)
+<P>
+The sweeping of small object pages could be avoided completely at the expense
+of examining mark bits directly in the allocator. This would probably
+be more expensive, since each allocation call would have to reload
+a large amount of state (e.g. next object address to be swept, position
+in mark bit table) before it could do its work. The current scheme
+keeps the allocator simple and allows useful optimizations in the sweeper.
+
+<H2>Finalization</h2>
+Both <TT>GC_register_disappearing_link</tt> and
+<TT>GC_register_finalizer</tt> add the request to a corresponding hash
+table. The hash table is allocated out of collected memory, but
+the reference to the finalizable object is hidden from the collector.
+Currently finalization requests are processed non-incrementally at the
+end of a mark cycle.
+<P>
+The collector makes an initial pass over the table of finalizable objects,
+pushing the contents of unmarked objects onto the mark stack.
+After pushing each object, the marker is invoked to mark all objects
+reachable from it. The object itself is not explicitly marked.
+This assures that objects on which a finalizer depends are neither
+collected nor finalized.
+<P>
+If in the process of marking from an object the
+object itself becomes marked, we have uncovered
+a cycle involving the object. This usually results in a warning from the
+collector. Such objects are not finalized, since it may be
+unsafe to do so. See the more detailed
+<A HREF="finalization.html"> discussion of finalization semantics</a>.
+<P>
+Any objects remaining unmarked at the end of this process are added to
+a queue of objects whose finalizers can be run. Depending on collector
+configuration, finalizers are dequeued and run either implicitly during
+allocation calls, or explicitly in response to a user request.
+<P>
+The collector provides a mechanism for replacing the procedure that is
+used to mark through objects. This is used both to provide support for
+Java-style unordered finalization, and to ignore certain kinds of cycles,
+<I>e.g.</i> those arising from C++ implementations of virtual inheritance.
+
+<H2>Generational Collection and Dirty Bits</h2>
+We basically use the parallel and generational GC algorithm described in
+<A HREF="papers/pldi91.ps.gz">"Mostly Parallel Garbage Collection"</a>,
+by Boehm, Demers, and Shenker.
+<P>
+The most significant modification is that
+the collector always runs in the allocating thread.
+There is no separate garbage collector thread.
+If an allocation attempt either requests a large object, or encounters
+an empty small object free list, and notices that there is a collection
+in progress, it immediately performs a small amount of marking work
+as described above.
+<P>
+This change was made both because we wanted to easily accommodate
+single-threaded environments, and because a separate GC thread requires
+very careful control over the scheduler to prevent the mutator from
+out-running the collector, and hence provoking unneeded heap growth.
+<P>
+In incremental mode, the heap is always expanded when we encounter
+insufficient space for an allocation. Garbage collection is triggered
+whenever we notice that more than
+<TT>GC_heap_size</tt>/(2 * <TT>GC_free_space_divisor</tt>)
+bytes of allocation have taken place.
+After <TT>GC_full_freq</tt> minor collections a major collection
+is started.
+<P>
+All collections initially run uninterrupted until a predetermined
+amount of time (50 msecs by default) has expired. If this allows
+the collection to complete entirely, we can avoid correcting
+for data structure modifications during the collection. If it does
+not complete, we return control to the mutator, and perform small
+amounts of additional GC work during those later allocations that
+cannot be satisfied from small object free lists. When marking completes,
+the set of modified pages is retrieved, and we mark once again from
+marked objects on those pages, this time with the mutator stopped.
+<P>
+We keep track of modified pages using one of three distinct mechanisms:
+<OL>
+<LI>
+Through explicit mutator cooperation. Currently this requires
+the use of <TT>GC_malloc_stubborn</tt>.
+<LI>
+By write-protecting physical pages and catching write faults. This is
+implemented for many Unix-like systems and for win32. It is not possible
+in a few environments.
+<LI>
+By retrieving dirty bit information from /proc. (Currently only Sun's
+Solaris supports this. Though this is considerably cleaner, performance
+may actually be better with mprotect and signals.)
+</ol>
+
+<H2>Thread support</h2>
+We support several different threading models. Unfortunately Pthreads,
+the only reasonably well standardized thread model, supports too narrow
+an interface for conservative garbage collection. There appears to be
+no portable way to allow the collector to coexist with various Pthreads
+implementations. Hence we currently support only a few of the more
+common Pthreads implementations.
+<P>
+In particular, it is very difficult for the collector to stop all other
+threads in the system and examine the register contents. This is currently
+accomplished with very different mechanisms for different Pthreads
+implementations. The Solaris implementation temporarily disables much
+of the user-level threads implementation by stopping kernel-level threads
+("lwp"s). The Irix implementation sends signals to individual Pthreads
+and has them wait in the signal handler. The Linux implementation
+is similar in spirit to the Irix one.
+<P>
+The Irix implementation uses
+only documented Pthreads calls, but relies on extensions to their semantics,
+notably the use of mutexes and condition variables from signal
+handlers. The Linux implementation should be far closer to
+portable, though empirically it is not completely portable.
+<P>
+All implementations must
+intercept thread creation and a few other thread-specific calls to allow
+enumeration of threads and location of thread stacks.  This is currently
+accomplished with <TT># define</tt>'s in <TT>gc.h</tt>, or optionally
+by using ld's function call wrapping mechanism under Linux.
+<P>
+Comments are appreciated. Please send mail to
+<A HREF="mailto:boehm@acm.org"><TT>boehm@acm.org</tt></a>
+</body>
diff --git a/boehm-gc/doc/tree.html b/boehm-gc/doc/tree.html
new file mode 100644
index 0000000..89c515d
--- /dev/null
+++ b/boehm-gc/doc/tree.html
@@ -0,0 +1,198 @@
+<HTML>
+<HEAD>
+ <TITLE> Two-Level Tree Structure for Fast Pointer Lookup</TITLE>
+ <AUTHOR> Hans-J. Boehm, Silicon Graphics</author>
+</HEAD>
+<BODY>
+<H1>Two-Level Tree Structure for Fast Pointer Lookup</h1>
+<P>
+The conservative garbage collector described
+<A HREF="gc.html">here</a> uses a 2-level tree
+data structure to aid in fast pointer identification.
+This data structure is described in a bit more detail here, since
+<OL>
+<LI> Variations of the data structure are more generally useful.
+<LI> It appears to be hard to understand by reading the code.
+<LI> Some other collectors appear to use inferior data structures to
+solve the same problem.
+<LI> It is central to fast collector operation.
+</ol>
+A candidate pointer is divided into three sections, the <I>high</i>,
+<I>middle</i>, and <I>low</i> bits. The exact division between these
+three groups of bits is dependent on the detailed collector configuration.
+<P>
+The high and middle bits are used to look up an entry in the table described
+here. The resulting table entry consists of either a block descriptor
+(<TT>struct hblkhdr *</tt> or <TT>hdr *</tt>)
+identifying the layout of objects in the block, or an indication that this
+address range corresponds to the middle of a large block, together with a
+hint for locating the actual block descriptor.  Such a hint consists
+of a displacement that can be subtracted from the middle bits of the candidate
+pointer without leaving the object.
+<P>
+In either case, the block descriptor (<TT>struct hblkhdr</tt>)
+refers to a table of object starting addresses (the <TT>hb_map</tt> field).
+The starting address table is indexed by the low bits of the candidate pointer.
+The resulting entry contains a displacement to the beginning of the object,
+or an indication that this cannot be a valid object pointer.
+(If all interior pointers are recognized, pointers into large objects
+are handled specially, as appropriate.)
+
+<H2>The Tree</h2>
+<P>
+The rest of this discussion focuses on the two level data structure
+used to map the high and middle bits to the block descriptor.
+<P>
+The high bits are used as an index into the <TT>GC_top_index</tt> (really
+<TT>GC_arrays._top_index</tt>) array. Each entry points to a
+<TT>bottom_index</tt> data structure. This structure in turn consists
+mostly of an array <TT>index</tt> indexed by the middle bits of
+the candidate pointer. The <TT>index</tt> array contains the actual
+<TT>hdr</tt> pointers.
+<P>
+Thus a pointer lookup consists primarily of a handful of memory references,
+and can be quite fast:
+<OL>
+<LI> The appropriate <TT>bottom_index</tt> pointer is looked up in
+<TT>GC_top_index</tt>, based on the high bits of the candidate pointer.
+<LI> The appropriate <TT>hdr</tt> pointer is looked up in the
+<TT>bottom_index</tt> structure, based on the middle bits.
+<LI> The block layout map pointer is retrieved from the <TT>hdr</tt>
+structure. (This memory reference is necessary since we try to share
+block layout maps.)
+<LI> The displacement to the beginning of the object is retrieved from the
+above map.
+</ol>
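+<P>
+In code, the lookup is roughly the following sketch (local variable names are
+invented; the field names and the <TT>LOG_BOTTOM_SZ</tt> and
+<TT>LOG_HBLKSIZE</tt> constants appear in the diagram at the end of this
+page).  The 64 bit hash step and the small-integer encoding for the middle of
+large blocks, both described elsewhere on this page, are omitted:
+<PRE>
+/* Split candidate pointer p into its high and middle sections.       */
+word hi  = (word)p >> (LOG_BOTTOM_SZ + LOG_HBLKSIZE);
+word mid = ((word)p >> LOG_HBLKSIZE) & ((1 << LOG_BOTTOM_SZ) - 1);
+
+struct bottom_index *bi = GC_top_index[hi];
+hdr *candidate_hdr = bi -> index[mid];
+/* candidate_hdr is a null pointer if the page is not in the GC heap;  */
+/* otherwise candidate_hdr -> hb_map maps the low bits of p to the     */
+/* displacement of p from the start of its object.                     */
+</pre>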
+<P>
+In order to conserve space, not all <TT>GC_top_index</tt> entries in fact
+point to distinct <TT>bottom_index</tt> structures. If no address with
+the corresponding high bits is part of the heap, then the entry points
+to <TT>GC_all_nils</tt>, a single <TT>bottom_index</tt> structure consisting
+only of NULL <TT>hdr</tt> pointers.
+<P>
+<TT>Bottom_index</tt> structures contain slightly more information than
+just <TT>hdr</tt> pointers. The <TT>asc_link</tt> field is used to link
+all <TT>bottom_index</tt> structures in ascending order for fast traversal.
+This list is pointed to by <TT>GC_all_bottom_indices</tt>.
+It is maintained with the aid of the <TT>key</tt> field that contains the
+high bits corresponding to the <TT>bottom_index</tt>.
+
+<H2>64 bit addresses</h2>
+<P>
+In the case of 64 bit addresses, this picture is complicated slightly
+by the fact that one of the index structures would have to be huge to
+cover the entire address space with a two level tree. We deal with this
+by turning <TT>GC_top_index</tt> into a chained hash table, instead of
+a simple array. This adds a <TT>hash_link</tt> field to the
+<TT>bottom_index</tt> structure.
+<P>
+The "hash function" consists of dropping the high bits. This is cheap to
+compute, and guarantees that there will be no collisions if the heap
+is contiguous and not excessively large.
+
+<H2>A picture</h2>
+<P>
+The following is an ASCII diagram of the data structure.
+This was contributed by Dave Barrett several years ago.
+<PRE>
+
+ Data Structure used by GC_base in gc3.7:
+ 21-Apr-94
+
+
+
+
+ 63 LOG_TOP_SZ[11] LOG_BOTTOM_SZ[10] LOG_HBLKSIZE[13]
+ +------------------+----------------+------------------+------------------+
+ p:| | TL_HASH(hi) | | HBLKDISPL(p) |
+ +------------------+----------------+------------------+------------------+
+ \-----------------------HBLKPTR(p)-------------------/
+ \------------hi-------------------/
+ \______ ________/ \________ _______/ \________ _______/
+ V V V
+ | | |
+ GC_top_index[] | | |
+ --- +--------------+ | | |
+ ^ | | | | |
+ | | | | | |
+ TOP +--------------+<--+ | |
+ _SZ +-<| [] | * | |
+(items)| +--------------+ if 0 < bi< HBLKSIZE | |
+ | | | | then large object | |
+ | | | | starts at the bi'th | |
+ v | | | HBLK before p. | i |
+ --- | +--------------+ | (word- |
+ v | aligned) |
+ bi= |GET_BI(p){->hash_link}->key==hi | |
+ v | |
+ | (bottom_index) \ scratch_alloc'd | |
+ | ( struct bi ) / by get_index() | |
+ --- +->+--------------+ | |
+ ^ | | | |
+ ^ | | | |
+ BOTTOM | | ha=GET_HDR_ADDR(p) | |
+_SZ(items)+--------------+<----------------------+ +-------+
+ | +--<| index[] | |
+ | | +--------------+ GC_obj_map: v
+ | | | | from / +-+-+-----+-+-+-+-+ ---
+ v | | | GC_add < 0| | | | | | | | ^
+ --- | +--------------+ _map_entry \ +-+-+-----+-+-+-+-+ |
+ | | asc_link | +-+-+-----+-+-+-+-+ MAXOBJSZ
+ | +--------------+ +-->| | | j | | | | | +1
+ | | key | | +-+-+-----+-+-+-+-+ |
+ | +--------------+ | +-+-+-----+-+-+-+-+ |
+ | | hash_link | | | | | | | | | | v
+ | +--------------+ | +-+-+-----+-+-+-+-+ ---
+ | | |<--MAX_OFFSET--->|
+ | | (bytes)
+HDR(p)| GC_find_header(p) | |<--MAP_ENTRIES-->|
+ | \ from | =HBLKSIZE/WORDSZ
+ | (hdr) (struct hblkhdr) / alloc_hdr() | (1024 on Alpha)
+ +-->+----------------------+ | (8/16 bits each)
+GET_HDR(p)| word hb_sz (words) | |
+ +----------------------+ |
+ | struct hblk *hb_next | |
+ +----------------------+ |
+ |mark_proc hb_mark_proc| |
+ +----------------------+ |
+ | char * hb_map |>-------------+
+ +----------------------+
+ | ushort hb_obj_kind |
+ +----------------------+
+ | hb_last_reclaimed |
+ --- +----------------------+
+ ^ | |
+ MARK_BITS| hb_marks[] | *if hdr is free, hb_sz + DISCARD_WORDS
+_SZ(words)| | is the size of a heap chunk (struct hblk)
+ v | | of at least MININCR*HBLKSIZE bytes (below),
+ --- +----------------------+ otherwise, size of each object in chunk.
+
+Dynamic data structures above are interleaved throughout the heap in blocks of
+size MININCR * HBLKSIZE bytes as done by gc_scratch_alloc which cannot be
+freed; free lists are used (e.g. alloc_hdr). HBLK's below are collected.
+
+ (struct hblk)
+ --- +----------------------+ < HBLKSIZE --- --- DISCARD_
+ ^ |garbage[DISCARD_WORDS]| aligned ^ ^ HDR_BYTES WORDS
+ | | | | v (bytes) (words)
+ | +-----hb_body----------+ < WORDSZ | --- ---
+ | | | aligned | ^ ^
+ | | Object 0 | | hb_sz |
+ | | | i |(word- (words)|
+ | | | (bytes)|aligned) v |
+ | + - - - - - - - - - - -+ --- | --- |
+ | | | ^ | ^ |
+ n * | | j (words) | hb_sz BODY_SZ
+ HBLKSIZE | Object 1 | v v | (words)
+ (bytes) | |--------------- v MAX_OFFSET
+ | + - - - - - - - - - - -+ --- (bytes)
+ | | | !All_INTERIOR_PTRS ^ |
+ | | | sets j only for hb_sz |
+ | | Object N | valid object offsets. | |
+ v | | All objects WORDSZ v v
+ --- +----------------------+ aligned. --- ---
+
+DISCARD_WORDS is normally zero. Indeed the collector has not been tested
+with another value in ages.
+</pre>
+</body>