From cec54960bbbfa351cab7dab75eb1418585e4fe64 Mon Sep 17 00:00:00 2001
From: John Hauser <jhauser@eecs.berkeley.edu>
Date: Wed, 17 Dec 2014 19:09:39 -0800
Subject: Finalized documentation for TestFloat Release 3.

---
 doc/TestFloat-general.html | 507 +++++++++++++++++++++++++++------------------
 1 file changed, 305 insertions(+), 202 deletions(-)

diff --git a/doc/TestFloat-general.html b/doc/TestFloat-general.html
index 1618d4a..d72807e 100644
--- a/doc/TestFloat-general.html
+++ b/doc/TestFloat-general.html
@@ -11,49 +11,38 @@
 
 <P>
 John R. Hauser<BR>
-2014 ______<BR>
-</P>
-
-<P>
-*** CONTENT DONE.
-</P>
-
-<P>
-*** REPLACE QUOTATION MARKS.
-<BR>
-*** REPLACE APOSTROPHES.
-<BR>
-*** REPLACE EM DASH.
+2014 Dec 17<BR>
 </P>
 
 
 <H2>Contents</H2>
 
-<P>
-*** CHECK.<BR>
-*** FIX FORMATTING.
-</P>
-
-<PRE>
-    Introduction
-    Limitations
-    Acknowledgments and License
-    What TestFloat Does
-    Executing TestFloat
-    Operations Tested by TestFloat
-        Conversion Operations
-        Basic Arithmetic Operations
-        Fused Multiply-Add Operations
-        Remainder Operations
-        Round-to-Integer Operations
-        Comparison Operations
-    Interpreting TestFloat Output
-    Variations Allowed by the IEEE Floating-Point Standard
-        Underflow
-        NaNs
-        Conversions to Integer
-    Contact Information
-</PRE>
+<BLOCKQUOTE>
+<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
+<COL WIDTH=25>
+<COL WIDTH=*>
+<TR><TD COLSPAN=2>1. Introduction</TD></TR>
+<TR><TD COLSPAN=2>2. Limitations</TD></TR>
+<TR><TD COLSPAN=2>3. Acknowledgments and License</TD></TR>
+<TR><TD COLSPAN=2>4. What TestFloat Does</TD></TR>
+<TR><TD COLSPAN=2>5. Executing TestFloat</TD></TR>
+<TR><TD COLSPAN=2>6. Operations Tested by TestFloat</TD></TR>
+<TR><TD></TD><TD>6.1. Conversion Operations</TD></TR>
+<TR><TD></TD><TD>6.2. Basic Arithmetic Operations</TD></TR>
+<TR><TD></TD><TD>6.3. Fused Multiply-Add Operations</TD></TR>
+<TR><TD></TD><TD>6.4. Remainder Operations</TD></TR>
+<TR><TD></TD><TD>6.5. Round-to-Integer Operations</TD></TR>
+<TR><TD></TD><TD>6.6. Comparison Operations</TD></TR>
+<TR><TD COLSPAN=2>7. Interpreting TestFloat Output</TD></TR>
+<TR>
+  <TD COLSPAN=2>8. Variations Allowed by the IEEE Floating-Point Standard</TD>
+</TR>
+<TR><TD></TD><TD>8.1. Underflow</TD></TR>
+<TR><TD></TD><TD>8.2. NaNs</TD></TR>
+<TR><TD></TD><TD>8.3. Conversions to Integer</TD></TR>
+<TR><TD COLSPAN=2>9. Contact Information</TD></TR>
+</TABLE>
+</BLOCKQUOTE>
 
 
 <H2>1. Introduction</H2>
@@ -89,8 +78,8 @@ Details about the standard are available elsewhere.
 
 <P>
 The current version of TestFloat is <NOBR>Release 3</NOBR>.
-The set of TestFloat programs as well as the programs' arguments and behavior
-have changed some compared to earlier TestFloat releases.
+The set of TestFloat programs as well as the programs&rsquo; arguments and
+behavior have changed some compared to earlier TestFloat releases.
 </P>
 
 
@@ -119,15 +108,20 @@ bugs can be found through links posted on the TestFloat Web page
 The TestFloat package was written by me, <NOBR>John R.</NOBR> Hauser.
 <NOBR>Release 3</NOBR> of TestFloat is a completely new implementation
 supplanting earlier releases.
-This project was done in the employ of the University of California, Berkeley,
-within the Department of Electrical Engineering and Computer Sciences, first
-for the Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab.
+This project (<NOBR>Release 3</NOBR> only, not earlier releases) was done in
+the employ of the University of California, Berkeley, within the Department of
+Electrical Engineering and Computer Sciences, first for the Parallel Computing
+Laboratory (Par Lab) and then for the ASPIRE Lab.
 The work was officially overseen by Prof. Krste Asanovic, with funding provided
 by these sources:
 <BLOCKQUOTE>
 <TABLE>
+<COL WIDTH=*>
+<COL WIDTH=10>
+<COL WIDTH=*>
 <TR>
-<TD><NOBR>Par Lab:</NOBR></TD>
+<TD VALIGN=TOP><NOBR>Par Lab:</NOBR></TD>
+<TD></TD>
 <TD>
 Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
 (Award #DIG07-10227), with additional support from Par Lab affiliates Nokia,
@@ -135,7 +129,8 @@ NVIDIA, Oracle, and Samsung.
 </TD>
 </TR>
 <TR>
-<TD><NOBR>ASPIRE Lab:</NOBR></TD>
+<TD VALIGN=TOP><NOBR>ASPIRE Lab:</NOBR></TD>
+<TD></TD>
 <TD>
 DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from
 ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA,
@@ -191,8 +186,8 @@ ENHANCEMENTS, OR MODIFICATIONS.
 
 <P>
 TestFloat is designed to test a floating-point implementation by comparing its
-behavior with that of TestFloat's own internal floating-point implemented in
-software.
+behavior with that of TestFloat&rsquo;s own internal floating-point implemented
+in software.
 For each operation to be tested, the TestFloat programs can generate a large
 number of test cases, made up of simple pattern tests intermixed with weighted
 random inputs.
@@ -263,19 +258,20 @@ for programs <CODE>testfloat_ver</CODE> and <CODE>testfloat</CODE>.
 TestFloat normally compares an implementation of floating-point against the
 Berkeley SoftFloat software implementation of floating-point, also created by
 me.
-The SoftFloat functions are linked into each TestFloat program's executable.
+The SoftFloat functions are linked into each TestFloat program&rsquo;s
+executable.
 Information about SoftFloat can be found at the Web page
 <A HREF="http://www.jhauser.us/arithmetic/SoftFloat.html"><CODE>http://www.jhauser.us/arithmetic/SoftFloat.html</CODE></A>.
 </P>
 
 <P>
 For testing SoftFloat itself, the TestFloat package includes a
-<CODE>testsoftfloat</CODE> program that compares SoftFloat's floating-point
-against <EM>another</EM> software floating-point implementation.
+<CODE>testsoftfloat</CODE> program that compares SoftFloat&rsquo;s
+floating-point against <EM>another</EM> software floating-point implementation.
 The second software floating-point is simpler and slower than SoftFloat, and is
 completely independent of SoftFloat.
 Although the second software floating-point cannot be guaranteed to be
-bug-free, the chance that it would mimic any of SoftFloat's bugs is low.
+bug-free, the chance that it would mimic any of SoftFloat&rsquo;s bugs is low.
 Consequently, an error in one or the other floating-point version should appear
 as an unexpected difference between the two implementations.
 Note that testing SoftFloat should be necessary only when compiling a new
@@ -347,9 +343,11 @@ These results can then be piped to <CODE>testfloat_ver</CODE> to be checked for
 correctness.
 Assuming a vertical bar (<CODE>|</CODE>) indicates a pipe between programs, the
 complete process could be written as a single command like so:
+<BLOCKQUOTE>
 <PRE>
-     testfloat_gen ... &lt;type&gt; | &lt;program-that-invokes-op&gt; | testfloat_ver ... &lt;function&gt;
+testfloat_gen ... &lt;type&gt; | &lt;program-that-invokes-op&gt; | testfloat_ver ... &lt;function&gt;
 </PRE>
+</BLOCKQUOTE>
 The program in the middle is not supplied by TestFloat but must be created
 independently.
 If for some reason this program cannot take command-line arguments, the
@@ -363,9 +361,11 @@ A second method for running TestFloat is similar but has
 expected results for each case.
 With this additional information, the job done by <CODE>testfloat_ver</CODE>
 can be folded into the invoking program to give the following command:
+<BLOCKQUOTE>
 <PRE>
-     testfloat_gen ... &lt;function&gt; | &lt;program-that-invokes-op-and-compares-results&gt;
+testfloat_gen ... &lt;function&gt; | &lt;program-that-invokes-op-and-compares-results&gt;
 </PRE>
+</BLOCKQUOTE>
 Again, the program that actually invokes the floating-point operation is not
 supplied by TestFloat but must be created independently.
 Depending on circumstance, it may be preferable either to let
@@ -429,8 +429,8 @@ multiplication, division, and square root operations;
 for each format, the floating-point remainder operation defined by the IEEE
 Standard;
 <LI>
-for each format, a ``round to integer'' operation that rounds to the nearest
-integer value in the same format; and
+for each format, a &ldquo;round to integer&rdquo; operation that rounds to the
+nearest integer value in the same format; and
 <LI>
 comparisons between two values in the same floating-point format.
 </UL>
@@ -451,8 +451,8 @@ is called <CODE>f32</CODE>, <NOBR>64-bit</NOBR> double-precision is
 <CODE>extF80</CODE>, and <NOBR>128-bit</NOBR> quadruple-precision is
 <CODE>f128</CODE>.
 TestFloat generally uses the same names for operations as Berkeley SoftFloat,
-except that TestFloat's names never include the <CODE>M</CODE> that SoftFloat
-uses to indicate that values are passed through pointers.
+except that TestFloat&rsquo;s names never include the <CODE>M</CODE> that
+SoftFloat uses to indicate that values are passed through pointers.
 </P>
 
 <H3>6.1. Conversion Operations</H3>
@@ -462,21 +462,23 @@ All conversions among the floating-point formats and all conversions between a
 floating-point format and <NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR> integers
 can be tested.
 The conversion operations are:
+<BLOCKQUOTE>
 <PRE>
-     ui32_to_f32      ui64_to_f32      i32_to_f32       i64_to_f32
-     ui32_to_f64      ui64_to_f64      i32_to_f64       i64_to_f64
-     ui32_to_extF80   ui64_to_extF80   i32_to_extF80    i64_to_extF80
-     ui32_to_f128     ui64_to_f128     i32_to_f128      i64_to_f128
-
-     f32_to_ui32      f64_to_ui32      extF80_to_ui32   f128_to_ui32
-     f32_to_ui64      f64_to_ui64      extF80_to_ui64   f128_to_ui64
-     f32_to_i32       f64_to_i32       extF80_to_i32    f128_to_i32
-     f32_to_i64       f64_to_i64       extF80_to_i64    f128_to_i64
-
-     f32_to_f64       f64_to_f32       extF80_to_f32    f128_to_f32
-     f32_to_extF80    f64_to_extF80    extF80_to_f64    f128_to_f64
-     f32_to_f128      f64_to_f128      extF80_to_f128   f128_to_extF80
+ui32_to_f32      ui64_to_f32      i32_to_f32       i64_to_f32
+ui32_to_f64      ui64_to_f64      i32_to_f64       i64_to_f64
+ui32_to_extF80   ui64_to_extF80   i32_to_extF80    i64_to_extF80
+ui32_to_f128     ui64_to_f128     i32_to_f128      i64_to_f128
+
+f32_to_ui32      f64_to_ui32      extF80_to_ui32   f128_to_ui32
+f32_to_ui64      f64_to_ui64      extF80_to_ui64   f128_to_ui64
+f32_to_i32       f64_to_i32       extF80_to_i32    f128_to_i32
+f32_to_i64       f64_to_i64       extF80_to_i64    f128_to_i64
+
+f32_to_f64       f64_to_f32       extF80_to_f32    f128_to_f32
+f32_to_extF80    f64_to_extF80    extF80_to_f64    f128_to_f64
+f32_to_f128      f64_to_f128      extF80_to_f128   f128_to_extF80
 </PRE>
+</BLOCKQUOTE>
 Abbreviations <CODE>ui32</CODE> and <CODE>ui64</CODE> indicate
 <NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR> unsigned integer types, while
 <CODE>i32</CODE> and <CODE>i64</CODE> indicate their signed counterparts.
@@ -495,22 +497,27 @@ operations requires amendment.
 For <CODE>testfloat</CODE> only, conversions to an integer type have names that
 explicitly specify the rounding mode and treatment of inexactness.
 Thus, instead of
+<BLOCKQUOTE>
 <PRE>
-     &lt;float&gt;_to_&lt;int&gt;
+&lt;float&gt;_to_&lt;int&gt;
 </PRE>
+</BLOCKQUOTE>
 as listed above, operations converting to integer type have names of these
 forms:
+<BLOCKQUOTE>
 <PRE>
-     &lt;float&gt;_to_&lt;int&gt;_r_&lt;round&gt;
-     &lt;float&gt;_to_&lt;int&gt;_rx_&lt;round&gt;
+&lt;float&gt;_to_&lt;int&gt;_r_&lt;round&gt;
+&lt;float&gt;_to_&lt;int&gt;_rx_&lt;round&gt;
 </PRE>
-The <CODE>&lt;round&gt;</CODE> component is one of `<CODE>near_even</CODE>',
-`<CODE>near_maxMag</CODE>', `<CODE>minMag</CODE>', `<CODE>min</CODE>', or
-`<CODE>max</CODE>', choosing the rounding mode.
+</BLOCKQUOTE>
+The <CODE>&lt;round&gt;</CODE> component is one of
+&lsquo;<CODE>near_even</CODE>&rsquo;, &lsquo;<CODE>near_maxMag</CODE>&rsquo;,
+&lsquo;<CODE>minMag</CODE>&rsquo;, &lsquo;<CODE>min</CODE>&rsquo;, or
+&lsquo;<CODE>max</CODE>&rsquo;, choosing the rounding mode.
 Any other indication of rounding mode is ignored.
-The operations with `<CODE>_r_</CODE>' in their names never raise the
-<I>inexact</I> exception, while those with `<CODE>_rx_</CODE>' raise the
-<I>inexact</I> exception whenever the result is not exact.
+The operations with &lsquo;<CODE>_r_</CODE>&rsquo; in their names never raise
+the <I>inexact</I> exception, while those with &lsquo;<CODE>_rx_</CODE>&rsquo;
+raise the <I>inexact</I> exception whenever the result is not exact.
 </P>
 
 <P>
@@ -518,7 +525,8 @@ TestFloat assumes that conversions from floating-point to an integer type
 should raise the <I>invalid</I> exception if the input cannot be rounded to an
 integer representable by the result format.
 In such a circumstance, if the result type is an unsigned integer, TestFloat
-expects the result of the operation to be the type's largest integer value.
+expects the result of the operation to be the type&rsquo;s largest integer
+value.
 If the result type is a signed integer and conversion overflows, TestFloat
 expects the result to be the largest-magnitude integer with the same sign as
 the input.
@@ -533,12 +541,14 @@ exception.
 
 <P>
 The following standard arithmetic operations can be tested:
+<BLOCKQUOTE>
 <PRE>
-     f32_add      f32_sub      f32_mul      f32_div      f32_sqrt
-     f64_add      f64_sub      f64_mul      f64_div      f64_sqrt
-     extF80_add   extF80_sub   extF80_mul   extF80_div   extF80_sqrt
-     f128_add     f128_sub     f128_mul     f128_div     f128_sqrt
+f32_add      f32_sub      f32_mul      f32_div      f32_sqrt
+f64_add      f64_sub      f64_mul      f64_div      f64_sqrt
+extF80_add   extF80_sub   extF80_mul   extF80_div   extF80_sqrt
+f128_add     f128_sub     f128_mul     f128_div     f128_sqrt
 </PRE>
+</BLOCKQUOTE>
 The double-extended-precision (<CODE>extF80</CODE>) operations can be rounded
 to reduced precision under rounding precision control.
 </P>
@@ -550,11 +560,13 @@ For all floating-point formats except <NOBR>80-bit</NOBR>
 double-extended-precision, TestFloat can test the fused multiply-add operation
 defined by the 2008 IEEE Floating-Point Standard.
 The fused multiply-add operations are:
+<BLOCKQUOTE>
 <PRE>
-     f32_mulAdd
-     f64_mulAdd
-     f128_mulAdd
+f32_mulAdd
+f64_mulAdd
+f128_mulAdd
 </PRE>
+</BLOCKQUOTE>
 </P>
 
 <P>
@@ -566,29 +578,34 @@ exception even if the third operand is a NaN.
 <H3>6.4. Remainder Operations</H3>
 
 <P>
-For each format, TestFloat can test the IEEE Standard's remainder operation.
+For each format, TestFloat can test the IEEE Standard&rsquo;s remainder
+operation.
 These operations are:
+<BLOCKQUOTE>
 <PRE>
-     f32_rem
-     f64_rem
-     extF80_rem
-     f128_rem
+f32_rem
+f64_rem
+extF80_rem
+f128_rem
 </PRE>
+</BLOCKQUOTE>
 The remainder operations are always exact and so require no rounding.
 </P>
 
 <H3>6.5. Round-to-Integer Operations</H3>
 
 <P>
-For each format, TestFloat can test the IEEE Standard's round-to-integer
+For each format, TestFloat can test the IEEE Standard&rsquo;s round-to-integer
 operation.
 For most TestFloat programs, these operations are:
+<BLOCKQUOTE>
 <PRE>
-     f32_roundToInt
-     f64_roundToInt
-     extF80_roundToInt
-     f128_roundToInt
+f32_roundToInt
+f64_roundToInt
+extF80_roundToInt
+f128_roundToInt
 </PRE>
+</BLOCKQUOTE>
 </P>
 
 <P>
@@ -596,35 +613,40 @@ Just as for conversions to integer types (<NOBR>section 6.1</NOBR> above), the
 all-in-one <CODE>testfloat</CODE> program is again an exception.
 For <CODE>testfloat</CODE> only, the round-to-integer operations have names of
 these forms:
+<BLOCKQUOTE>
 <PRE>
-     &lt;float&gt;_roundToInt_r_&lt;round&gt;
-     &lt;float&gt;_roundToInt_x
+&lt;float&gt;_roundToInt_r_&lt;round&gt;
+&lt;float&gt;_roundToInt_x
 </PRE>
-For the `<CODE>_r_</CODE>' versions, the <I>inexact</I> exception is never
-raised, and the <CODE>&lt;round&gt;</CODE> component specifies the rounding
-mode as one of `<CODE>near_even</CODE>', `<CODE>near_maxMag</CODE>',
-`<CODE>minMag</CODE>', `<CODE>min</CODE>', or `<CODE>max</CODE>'.
+</BLOCKQUOTE>
+For the &lsquo;<CODE>_r_</CODE>&rsquo; versions, the <I>inexact</I> exception
+is never raised, and the <CODE>&lt;round&gt;</CODE> component specifies the
+rounding mode as one of &lsquo;<CODE>near_even</CODE>&rsquo;,
+&lsquo;<CODE>near_maxMag</CODE>&rsquo;, &lsquo;<CODE>minMag</CODE>&rsquo;,
+&lsquo;<CODE>min</CODE>&rsquo;, or &lsquo;<CODE>max</CODE>&rsquo;.
 The usual indication of rounding mode is ignored.
-In contrast, the `<CODE>_x</CODE>' versions accept the usual indication of
-rounding mode and raise the <I>inexact</I> exception whenever the result is not
-exact.
-This irregular system follows the IEEE Standard's precise specification for the
-round-to-integer operations.
+In contrast, the &lsquo;<CODE>_x</CODE>&rsquo; versions accept the usual
+indication of rounding mode and raise the <I>inexact</I> exception whenever the
+result is not exact.
+This irregular system follows the IEEE Standard&rsquo;s precise specification
+for the round-to-integer operations.
 </P>
 
 <H3>6.6. Comparison Operations</H3>
 
 <P>
 The following floating-point comparison operations can be tested:
+<BLOCKQUOTE>
 <PRE>
-     f32_eq      f32_le      f32_lt
-     f64_eq      f64_le      f64_lt
-     extF80_eq   extF80_le   extF80_lt
-     f128_eq     f128_le     f128_lt
+f32_eq      f32_le      f32_lt
+f64_eq      f64_le      f64_lt
+extF80_eq   extF80_le   extF80_lt
+f128_eq     f128_le     f128_lt
 </PRE>
-The abbreviation <CODE>eq</CODE> stands for ``equal'' (=), <CODE>le</CODE>
-stands for ``less than or equal'' (&le;), and <CODE>lt</CODE> stands for
-``less than'' (&lt;).
+</BLOCKQUOTE>
+The abbreviation <CODE>eq</CODE> stands for &ldquo;equal&rdquo; (=),
+<CODE>le</CODE> stands for &ldquo;less than or equal&rdquo; (&le;), and
+<CODE>lt</CODE> stands for &ldquo;less than&rdquo; (&lt;).
 </P>
 
 <P>
@@ -635,12 +657,14 @@ The equality comparisons, on the other hand, are defined by default to raise
 the <I>invalid</I> exception only for signaling NaNs, not for quiet NaNs.
 For completeness, the following additional operations can be tested if
 supported:
+<BLOCKQUOTE>
 <PRE>
-     f32_eq_signaling      f32_le_quiet      f32_lt_quiet
-     f64_eq_signaling      f64_le_quiet      f64_lt_quiet
-     extF80_eq_signaling   extF80_le_quiet   extF80_lt_quiet
-     f128_eq_signaling     f128_le_quiet     f128_lt_quiet
+f32_eq_signaling      f32_le_quiet      f32_lt_quiet
+f64_eq_signaling      f64_le_quiet      f64_lt_quiet
+extF80_eq_signaling   extF80_le_quiet   extF80_lt_quiet
+f128_eq_signaling     f128_le_quiet     f128_lt_quiet
 </PRE>
+</BLOCKQUOTE>
 The <CODE>signaling</CODE> equality comparisons are identical to the standard
 operations except that the <I>invalid</I> exception should be raised for any
 NaN input.
@@ -658,8 +682,8 @@ Any rounding mode is ignored.
 <H2>7. Interpreting TestFloat Output</H2>
 
 <P>
-The ``errors'' reported by TestFloat programs may or may not really represent
-errors in the system being tested.
+The &ldquo;errors&rdquo; reported by TestFloat programs may or may not really
+represent errors in the system being tested.
 For each test case tried, the results from the floating-point implementation
 being tested could differ from the expected results for several reasons:
 <UL>
@@ -694,14 +718,16 @@ For each reported error (or apparent error), a line of text is written to the
 default output.
 If a line would be longer than 79 characters, it is divided.
 The first part of each error line begins in the leftmost column, and any
-subsequent ``continuation'' lines are indented with a tab.
+subsequent &ldquo;continuation&rdquo; lines are indented with a tab.
 </P>
 
 <P>
 Each error reported is of the form:
+<BLOCKQUOTE>
 <PRE>
-     &lt;inputs&gt;  => &lt;observed-output&gt;  expected: &lt;expected-output&gt;
+&lt;inputs&gt;  => &lt;observed-output&gt;  expected: &lt;expected-output&gt;
 </PRE>
+</BLOCKQUOTE>
 The <CODE>&lt;inputs&gt;</CODE> are the inputs to the operation.
 Each output (observed and expected) is shown as a pair:  the result value
 first, followed by the exception flags.
@@ -709,10 +735,12 @@ first, followed by the exception flags.
 
 <P>
 For example, two typical error lines could be
+<BLOCKQUOTE>
 <PRE>
-     800.7FFF00  87F.000100  => 001.000000 ...ux  expected: 001.000000 ....x
-     081.000004  000.1FFFFF  => 001.000000 ...ux  expected: 001.000000 ....x
+800.7FFF00  87F.000100  => 001.000000 ...ux  expected: 001.000000 ....x
+081.000004  000.1FFFFF  => 001.000000 ...ux  expected: 001.000000 ....x
 </PRE>
+</BLOCKQUOTE>
 In the first line, the inputs are <CODE>800.7FFF00</CODE> and
 <CODE>87F.000100</CODE>, and the observed result is <CODE>001.000000</CODE>
 with flags <CODE>...ux</CODE>.
@@ -732,8 +760,9 @@ Four are floating-point types:  <NOBR>32-bit</NOBR> single-precision,
 <NOBR>64-bit</NOBR> double-precision, <NOBR>80-bit</NOBR>
 double-extended-precision, and <NOBR>128-bit</NOBR> quadruple-precision.
 The remaining five types are <NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR>
-unsigned integers, <NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR> two's-complement
-signed integers, and Boolean values (the results of comparison operations).
+unsigned integers, <NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR>
+two&rsquo;s-complement signed integers, and Boolean values (the results of
+comparison operations).
 Boolean values are represented as a single character, either a <CODE>0</CODE>
 or a <CODE>1</CODE>.
 <NOBR>32-bit</NOBR> integers are represented as 8 hexadecimal digits.
@@ -749,47 +778,93 @@ hexadecimal digits that give the raw bits of the floating-point encoding.
 A period separates the 3rd and 4th hexadecimal digits to mark the division
 between the exponent bits and fraction bits.
 Some notable <NOBR>64-bit</NOBR> double-precision values include:
-<PRE>
-     000.0000000000000    +0
-     3FF.0000000000000     1
-     400.0000000000000     2
-     7FF.0000000000000    +infinity
-
-     800.0000000000000    -0
-     BFF.0000000000000    -1
-     C00.0000000000000    -2
-     FFF.0000000000000    -infinity
-
-     3FE.FFFFFFFFFFFFF    largest representable number less than +1
-</PRE>
+<BLOCKQUOTE>
+<TABLE CELLSPACING=0 CELLPADDING=0>
+<TR>
+  <TD><CODE>000.0000000000000&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
+  <TD>+0</TD>
+</TR>
+<TR><TD><CODE>3FF.0000000000000</CODE></TD><TD>&nbsp;1</TD></TR>
+<TR><TD><CODE>400.0000000000000</CODE></TD><TD>&nbsp;2</TD></TR>
+<TR><TD><CODE>7FF.0000000000000</CODE></TD><TD>+infinity</TD></TR>
+<TR><TD>&nbsp;</TD></TR>
+<TR><TD><CODE>800.0000000000000</CODE></TD><TD>&minus;0</TD></TR>
+<TR><TD><CODE>BFF.0000000000000</CODE></TD><TD>&minus;1</TD></TR>
+<TR><TD><CODE>C00.0000000000000</CODE></TD><TD>&minus;2</TD></TR>
+<TR><TD><CODE>FFF.0000000000000</CODE></TD><TD>&minus;infinity</TD></TR>
+<TR><TD>&nbsp;</TD></TR>
+<TR>
+  <TD><CODE>3FE.FFFFFFFFFFFFF</CODE></TD>
+  <TD>largest representable number less than +1</TD>
+</TR>
+</TABLE>
+</BLOCKQUOTE>
 The following categories are easily distinguished (assuming the
 <CODE>x</CODE>s are not all 0):
-<PRE>
-     000.xxxxxxxxxxxxx    positive subnormal (denormalized) numbers
-     7FF.xxxxxxxxxxxxx    positive NaNs
-     800.xxxxxxxxxxxxx    negative subnormal numbers
-     FFF.xxxxxxxxxxxxx    negative NaNs
-</PRE>
+<BLOCKQUOTE>
+<TABLE CELLSPACING=0 CELLPADDING=0>
+<TR>
+  <TD><CODE>000.xxxxxxxxxxxxx&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
+  <TD>positive subnormal (denormalized) numbers</TD>
+</TR>
+<TR><TD><CODE>7FF.xxxxxxxxxxxxx</CODE></TD><TD>positive NaNs</TD></TR>
+<TR>
+  <TD><CODE>800.xxxxxxxxxxxxx</CODE></TD>
+  <TD>negative subnormal numbers</TD>
+</TR>
+<TR><TD><CODE>FFF.xxxxxxxxxxxxx</CODE></TD><TD>negative NaNs</TD></TR>
+</TABLE>
+</BLOCKQUOTE>
 </P>
 
 <P>
 <NOBR>128-bit</NOBR> quadruple-precision values are written the same except
 with 4 hexadecimal digits for the sign and exponent and 28 for the fraction.
 Notable values include:
-<PRE>
-     0000.0000000000000000000000000000    +0
-     3FFF.0000000000000000000000000000     1
-     4000.0000000000000000000000000000     2
-     7FFF.0000000000000000000000000000    +infinity
-
-     8000.0000000000000000000000000000    -0
-     BFFF.0000000000000000000000000000    -1
-     C000.0000000000000000000000000000    -2
-     FFFF.0000000000000000000000000000    -infinity
-
-     3FFE.FFFFFFFFFFFFFFFFFFFFFFFFFFFF    largest representable number
-                                              less than +1
-</PRE>
+<BLOCKQUOTE>
+<TABLE CELLSPACING=0 CELLPADDING=0>
+<TR>
+  <TD>
+    <CODE>0000.0000000000000000000000000000&nbsp;&nbsp;&nbsp;&nbsp;</CODE>
+  </TD>
+  <TD>+0</TD>
+</TR>
+<TR>
+  <TD><CODE>3FFF.0000000000000000000000000000</CODE></TD>
+  <TD>&nbsp;1</TD>
+</TR>
+<TR>
+  <TD><CODE>4000.0000000000000000000000000000</CODE></TD>
+  <TD>&nbsp;2</TD>
+</TR>
+<TR>
+  <TD><CODE>7FFF.0000000000000000000000000000</CODE></TD>
+  <TD>+infinity</TD>
+</TR>
+<TR><TD>&nbsp;</TD></TR>
+<TR>
+  <TD><CODE>8000.0000000000000000000000000000</CODE></TD>
+  <TD>&minus;0</TD>
+</TR>
+<TR>
+  <TD><CODE>BFFF.0000000000000000000000000000</CODE></TD>
+  <TD>&minus;1</TD>
+</TR>
+<TR>
+  <TD><CODE>C000.0000000000000000000000000000</CODE></TD>
+  <TD>&minus;2</TD>
+</TR>
+<TR>
+  <TD><CODE>FFFF.0000000000000000000000000000</CODE></TD>
+  <TD>&minus;infinity</TD>
+</TR>
+<TR><TD>&nbsp;</TD></TR>
+<TR>
+  <TD><CODE>3FFE.FFFFFFFFFFFFFFFFFFFFFFFFFFFF</CODE></TD>
+  <TD>largest representable number less than +1</TD>
+</TR>
+</TABLE>
+</BLOCKQUOTE>
 </P>
 
 <P>
@@ -801,19 +876,27 @@ and will be 1 otherwise.
 Hence, the same values listed above appear in <NOBR>80-bit</NOBR>
 double-extended-precision as follows (note the leading <CODE>8</CODE> digit in
 the significands):
-<PRE>
-     0000.0000000000000000    +0
-     3FFF.8000000000000000     1
-     4000.8000000000000000     2
-     7FFF.8000000000000000    +infinity
-
-     8000.0000000000000000    -0
-     BFFF.8000000000000000    -1
-     C000.8000000000000000    -2
-     FFFF.8000000000000000    -infinity
-
-     3FFE.FFFFFFFFFFFFFFFF    largest representable number less than +1
-</PRE>
+<BLOCKQUOTE>
+<TABLE CELLSPACING=0 CELLPADDING=0>
+<TR>
+  <TD><CODE>0000.0000000000000000&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
+  <TD>+0</TD>
+</TR>
+<TR><TD><CODE>3FFF.8000000000000000</CODE></TD><TD>&nbsp;1</TD></TR>
+<TR><TD><CODE>4000.8000000000000000</CODE></TD><TD>&nbsp;2</TD></TR>
+<TR><TD><CODE>7FFF.8000000000000000</CODE></TD><TD>+infinity</TD></TR>
+<TR><TD>&nbsp;</TD></TR>
+<TR><TD><CODE>8000.0000000000000000</CODE></TD><TD>&minus;0</TD></TR>
+<TR><TD><CODE>BFFF.8000000000000000</CODE></TD><TD>&minus;1</TD></TR>
+<TR><TD><CODE>C000.8000000000000000</CODE></TD><TD>&minus;2</TD></TR>
+<TR><TD><CODE>FFFF.8000000000000000</CODE></TD><TD>&minus;infinity</TD></TR>
+<TR><TD>&nbsp;</TD></TR>
+<TR>
+  <TD><CODE>3FFE.FFFFFFFFFFFFFFFF</CODE></TD>
+  <TD>largest representable number less than +1</TD>
+</TR>
+</TABLE>
+</BLOCKQUOTE>
 </P>
 
 <P>
@@ -826,11 +909,13 @@ These are written as 9 hexadecimal digits, with a period separating the 3rd and
 4th hexadecimal digits.
 Broken out into bits, the 9 hexadecimal digits cover the <NOBR>32-bit</NOBR>
 single-precision subfields as follows:
+<BLOCKQUOTE>
 <PRE>
-     x000 .... ....  .  .... .... .... .... .... ....    sign       (1 bit)
-     .... xxxx xxxx  .  .... .... .... .... .... ....    exponent   (8 bits)
-     .... .... ....  .  0xxx xxxx xxxx xxxx xxxx xxxx    fraction  (23 bits)
+x000 .... ....  .  .... .... .... .... .... ....    sign       (1 bit)
+.... xxxx xxxx  .  .... .... .... .... .... ....    exponent   (8 bits)
+.... .... ....  .  0xxx xxxx xxxx xxxx xxxx xxxx    fraction  (23 bits)
 </PRE>
+</BLOCKQUOTE>
 As shown in this schematic, the first hexadecimal digit contains only the sign,
 and will be either <CODE>0</CODE> <NOBR>or <CODE>8</CODE></NOBR>.
 The next two digits give the biased exponent as an <NOBR>8-bit</NOBR> integer.
@@ -841,27 +926,37 @@ The most significant hexadecimal digit of the fraction can be at most
 
 <P>
 Notable single-precision values include:
-<PRE>
-     000.000000    +0
-     07F.000000     1
-     080.000000     2
-     0FF.000000    +infinity
-
-     800.000000    -0
-     87F.000000    -1
-     880.000000    -2
-     8FF.000000    -infinity
-
-     07E.7FFFFF    largest representable number less than +1
-</PRE>
+<BLOCKQUOTE>
+<TABLE CELLSPACING=0 CELLPADDING=0>
+<TR><TD><CODE>000.000000&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD><TD>+0</TD></TR>
+<TR><TD><CODE>07F.000000</CODE></TD><TD>&nbsp;1</TD></TR>
+<TR><TD><CODE>080.000000</CODE></TD><TD>&nbsp;2</TD></TR>
+<TR><TD><CODE>0FF.000000</CODE></TD><TD>+infinity</TD></TR>
+<TR><TD>&nbsp;</TD></TR>
+<TR><TD><CODE>800.000000</CODE></TD><TD>&minus;0</TD></TR>
+<TR><TD><CODE>87F.000000</CODE></TD><TD>&minus;1</TD></TR>
+<TR><TD><CODE>880.000000</CODE></TD><TD>&minus;2</TD></TR>
+<TR><TD><CODE>8FF.000000</CODE></TD><TD>&minus;infinity</TD></TR>
+<TR><TD>&nbsp;</TD></TR>
+<TR>
+  <TD><CODE>07E.7FFFFF</CODE></TD>
+  <TD>largest representable number less than +1</TD>
+</TR>
+</TABLE>
+</BLOCKQUOTE>
 Again, certain categories are easily distinguished (assuming the
 <CODE>x</CODE>s are not all 0):
-<PRE>
-     000.xxxxxx    positive subnormal (denormalized) numbers
-     0FF.xxxxxx    positive NaNs
-     800.xxxxxx    negative subnormal numbers
-     8FF.xxxxxx    negative NaNs
-</PRE>
+<BLOCKQUOTE>
+<TABLE CELLSPACING=0 CELLPADDING=0>
+<TR>
+  <TD><CODE>000.xxxxxx&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
+  <TD>positive subnormal (denormalized) numbers</TD>
+</TR>
+<TR><TD><CODE>0FF.xxxxxx</CODE></TD><TD>positive NaNs</TD></TR>
+<TR><TD><CODE>800.xxxxxx</CODE></TD><TD>negative subnormal numbers</TD></TR>
+<TR><TD><CODE>8FF.xxxxxx</CODE></TD><TD>negative NaNs</TD></TR>
+</TABLE>
+</BLOCKQUOTE>
 </P>
 
 <P>
@@ -871,13 +966,21 @@ Each flag is written as either a letter or a period (<CODE>.</CODE>) according
 to whether the flag was set or not by the operation.
 A period indicates the flag was not set.
 The letter used to indicate a set flag depends on the flag:
-<PRE>
-     v    invalid exception
-     i    infinite exception ("divide by zero")
-     o    overflow exception
-     u    underflow exception
-     x    inexact exception
-</PRE>
+<BLOCKQUOTE>
+<TABLE CELLSPACING=0 CELLPADDING=0>
+<TR>
+  <TD><CODE>v&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
+  <TD>invalid exception</TD>
+</TR>
+<TR>
+  <TD><CODE>i</CODE></TD>
+  <TD>infinite exception (&ldquo;divide by zero&rdquo;)</TD>
+</TR>
+<TR><TD><CODE>o</CODE></TD><TD>overflow exception</TD></TR>
+<TR><TD><CODE>u</CODE></TD><TD>underflow exception</TD></TR>
+<TR><TD><CODE>x</CODE></TD><TD>inexact exception</TD></TR>
+</TABLE>
+</BLOCKQUOTE>
 For example, the notation <CODE>...ux</CODE> indicates that the
 <I>underflow</I> and <I>inexact</I> exception flags were set and that the other
 three flags (<I>invalid</I>, <I>infinite</I>, and <I>overflow</I>) were not
-- 
cgit v1.1
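
The output notation described in section 7 of the patched document (9 hexadecimal digits for single-precision, such as `07F.000000`, and five flag characters, such as `...ux`) can be decoded mechanically. The following is a minimal Python sketch of that decoding, not part of TestFloat itself. The function names `decode_f32` and `decode_flags` are hypothetical, and the sketch assumes the layout is exactly as the documentation states: one sign digit (`0` or `8`), two exponent digits, a period, and six fraction digits whose leading digit is at most `7`.

```python
import struct

# Letter used for each set flag, as listed in the documentation's flag table.
FLAG_NAMES = {
    "v": "invalid",
    "i": "infinite",
    "o": "overflow",
    "u": "underflow",
    "x": "inexact",
}


def decode_f32(text):
    """Decode a TestFloat-style single-precision value such as '07F.000000'.

    Hypothetical helper: assumes 3 hex digits (sign + exponent), a period,
    then 6 hex digits holding the 23-bit fraction.
    """
    head, frac_hex = text.split(".")
    if len(head) != 3 or len(frac_hex) != 6:
        raise ValueError("expected the form 'SEE.FFFFFF': %r" % text)
    sign_digit = int(head[0], 16)
    if sign_digit not in (0, 8):
        raise ValueError("sign digit must be 0 or 8: %r" % text)
    exponent = int(head[1:], 16)   # biased exponent, 8 bits
    fraction = int(frac_hex, 16)   # fraction, must fit in 23 bits
    if fraction >= 1 << 23:
        raise ValueError("fraction exceeds 23 bits: %r" % text)
    # Reassemble the raw IEEE 754 binary32 encoding and reinterpret it.
    bits = ((sign_digit >> 3) << 31) | (exponent << 23) | fraction
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def decode_flags(text):
    """Return the names of the flags set in a string such as '...ux'."""
    if len(text) != 5:
        raise ValueError("expected five flag characters: %r" % text)
    # A period means the flag was not set; any letter names a set flag.
    return [FLAG_NAMES[ch] for ch in text if ch != "."]
```

For instance, `decode_f32("87F.000000")` yields the value the documentation lists as -1, and `decode_flags("...ux")` names the underflow and inexact flags. The same approach would extend to the other formats by changing the digit counts and field widths.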

