Finalized documentation for SoftFloat Release 3.

author: John Hauser <jhauser@eecs.berkeley.edu> 2014-12-17 19:08:03 -0800
committer: John Hauser <jhauser@eecs.berkeley.edu> 2014-12-17 19:08:03 -0800
commit: 7276b0022ec5f461af9c3b4a1fe2e5526825b58e
tree: 3bdcfb60a30c5161db65da8d110be9434e0ffad3 /doc/SoftFloat.html
parent: 437d9b9fb281962ea10d5e4475e3851eaa7ffd25
download: berkeley-softfloat-3-7276b0022ec5f461af9c3b4a1fe2e5526825b58e.zip
berkeley-softfloat-3-7276b0022ec5f461af9c3b4a1fe2e5526825b58e.tar.gz
berkeley-softfloat-3-7276b0022ec5f461af9c3b4a1fe2e5526825b58e.tar.bz2
1 files changed, 185 insertions, 165 deletions
diff --git a/doc/SoftFloat.html b/doc/SoftFloat.html
index fa3919a..d406d91 100644
--- a/doc/SoftFloat.html
+++ b/doc/SoftFloat.html
@@ -11,66 +11,59 @@
 
 <P>
 John R. Hauser<BR>
-2014 ______<BR>
-</P>
-
-<P>
-*** CONTENT DONE.
-</P>
-
-<P>
-*** REPLACE QUOTATION MARKS.
-<BR>
-*** REPLACE APOSTROPHES.
-<BR>
-*** REPLACE EM DASH.
+2014 Dec 17<BR>
 </P>
 
 
 <H2>Contents</H2>
 
-<P>
-*** CHECK.<BR>
-*** FIX FORMATTING.
-</P>
-
-<PRE>
-    Introduction
-    Limitations
-    Acknowledgments and License
-    Types and Functions
-        Boolean and Integer Types
-        Floating-Point Types
-        Supported Floating-Point Functions
-        Non-canonical Representations in extFloat80_t
-        Conventions for Passing Arguments and Results
-    Reserved Names
-    Mode Variables
-        Rounding Mode
-        Underflow Detection
-        Rounding Precision for 80-Bit Extended Format
-    Exceptions and Exception Flags
-    Function Details
-        Conversions from Integer to Floating-Point
-        Conversions from Floating-Point to Integer
-        Conversions Among Floating-Point Types
-        Basic Arithmetic Functions
-        Fused Multiply-Add Functions
-        Remainder Functions
-        Round-to-Integer Functions
-        Comparison Functions
-        Signaling NaN Test Functions
-        Raise-Exception Function
-    Changes from SoftFloat Release 2
-        Name Changes
-        Changes to Function Arguments
-        Added Capabilities
-        Better Compatibility with the C Language
-        New Organization as a Library
-        Optimization Gains (and Losses)
-    Future Directions
-    Contact Information
-</PRE>
+<BLOCKQUOTE>
+<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
+<COL WIDTH=25>
+<COL WIDTH=*>
+<TR><TD COLSPAN=2>1. Introduction</TD></TR>
+<TR><TD COLSPAN=2>2. Limitations</TD></TR>
+<TR><TD COLSPAN=2>3. Acknowledgments and License</TD></TR>
+<TR><TD COLSPAN=2>4. Types and Functions</TD></TR>
+<TR><TD></TD><TD>4.1. Boolean and Integer Types</TD></TR>
+<TR><TD></TD><TD>4.2. Floating-Point Types</TD></TR>
+<TR><TD></TD><TD>4.3. Supported Floating-Point Functions</TD></TR>
+<TR>
+  <TD></TD>
+  <TD>4.4. Non-canonical Representations in <CODE>extFloat80_t</CODE></TD>
+</TR>
+<TR><TD></TD><TD>4.5. Conventions for Passing Arguments and Results</TD></TR>
+<TR><TD COLSPAN=2>5. Reserved Names</TD></TR>
+<TR><TD COLSPAN=2>6. Mode Variables</TD></TR>
+<TR><TD></TD><TD>6.1. Rounding Mode</TD></TR>
+<TR><TD></TD><TD>6.2. Underflow Detection</TD></TR>
+<TR>
+  <TD></TD>
+  <TD>6.3. Rounding Precision for the <NOBR>80-Bit</NOBR> Extended Format</TD>
+</TR>
+<TR><TD COLSPAN=2>7. Exceptions and Exception Flags</TD></TR>
+<TR><TD COLSPAN=2>8. Function Details</TD></TR>
+<TR><TD></TD><TD>8.1. Conversions from Integer to Floating-Point</TD></TR>
+<TR><TD></TD><TD>8.2. Conversions from Floating-Point to Integer</TD></TR>
+<TR><TD></TD><TD>8.3. Conversions Among Floating-Point Types</TD></TR>
+<TR><TD></TD><TD>8.4. Basic Arithmetic Functions</TD></TR>
+<TR><TD></TD><TD>8.5. Fused Multiply-Add Functions</TD></TR>
+<TR><TD></TD><TD>8.6. Remainder Functions</TD></TR>
+<TR><TD></TD><TD>8.7. Round-to-Integer Functions</TD></TR>
+<TR><TD></TD><TD>8.8. Comparison Functions</TD></TR>
+<TR><TD></TD><TD>8.9. Signaling NaN Test Functions</TD></TR>
+<TR><TD></TD><TD>8.10. Raise-Exception Function</TD></TR>
+<TR><TD COLSPAN=2>9. Changes from SoftFloat <NOBR>Release 2</NOBR></TD></TR>
+<TR><TD></TD><TD>9.1. Name Changes</TD></TR>
+<TR><TD></TD><TD>9.2. Changes to Function Arguments</TD></TR>
+<TR><TD></TD><TD>9.3. Added Capabilities</TD></TR>
+<TR><TD></TD><TD>9.4. Better Compatibility with the C Language</TD></TR>
+<TR><TD></TD><TD>9.5. New Organization as a Library</TD></TR>
+<TR><TD></TD><TD>9.6. Optimization Gains (and Losses)</TD></TR>
+<TR><TD COLSPAN=2>10. Future Directions</TD></TR>
+<TR><TD COLSPAN=2>11. Contact Information</TD></TR>
+</TABLE>
+</BLOCKQUOTE>
 
 
 <H2>1. Introduction</H2>
@@ -156,15 +149,20 @@ SoftFloat <NOBR>Release 3</NOBR>.
 The SoftFloat package was written by me, <NOBR>John R.</NOBR> Hauser.
 <NOBR>Release 3</NOBR> of SoftFloat is a completely new implementation
 supplanting earlier releases.
-This project was done in the employ of the University of California, Berkeley,
-within the Department of Electrical Engineering and Computer Sciences, first
-for the Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab.
+This project (<NOBR>Release 3</NOBR> only, not earlier releases) was done in
+the employ of the University of California, Berkeley, within the Department of
+Electrical Engineering and Computer Sciences, first for the Parallel Computing
+Laboratory (Par Lab) and then for the ASPIRE Lab.
 The work was officially overseen by Prof. Krste Asanovic, with funding provided
 by these sources:
 <BLOCKQUOTE>
 <TABLE>
+<COL WIDTH=*>
+<COL WIDTH=10>
+<COL WIDTH=*>
 <TR>
-<TD><NOBR>Par Lab:</NOBR></TD>
+<TD VALIGN=TOP><NOBR>Par Lab:</NOBR></TD>
+<TD></TD>
 <TD>
 Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
 (Award #DIG07-10227), with additional support from Par Lab affiliates Nokia,
@@ -172,7 +170,8 @@ NVIDIA, Oracle, and Samsung.
 </TD>
 </TR>
 <TR>
-<TD><NOBR>ASPIRE Lab:</NOBR></TD>
+<TD VALIGN=TOP><NOBR>ASPIRE Lab:</NOBR></TD>
+<TD></TD>
 <TD>
 DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from
 ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA,
@@ -245,16 +244,18 @@ for these headers.
 Header <CODE>softfloat.h</CODE> depends only on the name <CODE>bool</CODE> from
 <CODE>&lt;stdbool.h&gt;</CODE> and on these type names from
 <CODE>&lt;stdint.h&gt;</CODE>:
+<BLOCKQUOTE>
 <PRE>
-     uint16_t
-     uint32_t
-     uint64_t
-     int32_t
-     int64_t
-     uint_fast8_t
-     uint_fast32_t
-     uint_fast64_t
+uint16_t
+uint32_t
+uint64_t
+int32_t
+int64_t
+uint_fast8_t
+uint_fast32_t
+uint_fast64_t
 </PRE>
+</BLOCKQUOTE>
 </P>
 
 
@@ -263,26 +264,22 @@ Header <CODE>softfloat.h</CODE> depends only on the name <CODE>bool</CODE> from
 <P>
 The <CODE>softfloat.h</CODE> header defines four floating-point types:
 <BLOCKQUOTE>
-<TABLE>
+<TABLE CELLSPACING=0 CELLPADDING=0>
 <TR>
 <TD><CODE>float32_t</CODE></TD>
-<TD>&nbsp;</TD>
 <TD><NOBR>32-bit</NOBR> single-precision binary format</TD>
 </TR>
 <TR>
 <TD><CODE>float64_t</CODE></TD>
-<TD>&nbsp;</TD>
 <TD><NOBR>64-bit</NOBR> double-precision binary format</TD>
 </TR>
 <TR>
-<TD><CODE>extFloat80_t</CODE></TD>
-<TD>&nbsp;</TD>
+<TD><CODE>extFloat80_t&nbsp;&nbsp;&nbsp;</CODE></TD>
 <TD><NOBR>80-bit</NOBR> double-extended-precision binary format (old Intel or
 Motorola format)</TD>
 </TR>
 <TR>
 <TD><CODE>float128_t</CODE></TD>
-<TD>&nbsp;</TD>
 <TD><NOBR>128-bit</NOBR> quadruple-precision binary format</TD>
 </TR>
 </TABLE>
@@ -304,10 +301,10 @@ Header file <CODE>softfloat.h</CODE> also defines a structure,
 This structure is the same size as type <CODE>extFloat80_t</CODE> and contains
 at least these two fields (not necessarily in this order):
 <BLOCKQUOTE>
-<TABLE>
-<TR><TD><CODE>uint16_t signExp;</CODE></TD></TR>
-<TR><TD><CODE>uint64_t signif;</CODE></TD></TR>
-</TABLE>
+<PRE>
+uint16_t signExp;
+uint64_t signif;
+</PRE>
 </BLOCKQUOTE>
 Field <CODE>signExp</CODE> contains the sign and exponent of the floating-point
 value, with the sign in the most significant bit (<NOBR>bit 15</NOBR>) and the
@@ -339,8 +336,8 @@ operation defined by the IEEE Standard;
 for each format, the floating-point remainder operation defined by the IEEE
 Standard;
 <LI>
-for each format, a ``round to integer'' operation that rounds to the nearest
-integer value in the same format; and
+for each format, a &ldquo;round to integer&rdquo; operation that rounds to the
+nearest integer value in the same format; and
 <LI>
 comparisons between two values in the same floating-point format.
 </UL>
@@ -357,12 +354,12 @@ not supported in SoftFloat <NOBR>Release 3</NOBR>:
 conversions between floating-point formats and decimal or hexadecimal character
 sequences;
 <LI>
-all ``quiet-computation'' operations (<B>copy</B>, <B>negate</B>, <B>abs</B>,
-and <B>copySign</B>, which all involve only simple copying and/or manipulation
-of the floating-point sign bit); and
+all &ldquo;quiet-computation&rdquo; operations (<B>copy</B>, <B>negate</B>,
+<B>abs</B>, and <B>copySign</B>, which all involve only simple copying and/or
+manipulation of the floating-point sign bit); and
 <LI>
-all ``non-computational'' operations other than <B>isSignaling</B> (which is
-supported).
+all &ldquo;non-computational&rdquo; operations other than <B>isSignaling</B>
+(which is supported).
 </UL>
 </P>
 
@@ -393,9 +390,9 @@ leading significand bit must <NOBR>be 1</NOBR> unless it is required to
 For <NOBR>Release 3</NOBR> of SoftFloat, functions are not guaranteed to
 operate as expected when inputs of type <CODE>extFloat80_t</CODE> are
 non-canonical.
-Assuming all of a function's <CODE>extFloat80_t</CODE> inputs (if any) are
-canonical, function outputs of type <CODE>extFloat80_t</CODE> will always be
-canonical.
+Assuming all of a function&rsquo;s <CODE>extFloat80_t</CODE> inputs (if any)
+are canonical, function outputs of type <CODE>extFloat80_t</CODE> will always
+be canonical.
 </P>
 
 <H3>4.5. Conventions for Passing Arguments and Results</H3>
@@ -426,8 +423,8 @@ SoftFloat supplies this function:
 The first two arguments point to the values to be added, and the last argument
 points to the location where the sum will be stored.
 The <CODE>M</CODE> in the name <CODE>f128M_add</CODE> is mnemonic for the fact
-that the <NOBR>128-bit</NOBR> inputs and outputs are ``in memory'', pointed to
-by pointer arguments.
+that the <NOBR>128-bit</NOBR> inputs and outputs are &ldquo;in memory&rdquo;,
+pointed to by pointer arguments.
 </P>
 
 <P>
@@ -464,10 +461,11 @@ platforms of interest, programmers can use whichever version they prefer.
 <P>
 In addition to the variables and functions documented here, SoftFloat defines
 some symbol names for its own private use.
-These private names always begin with the prefix `<CODE>softfloat_</CODE>'.
+These private names always begin with the prefix
+&lsquo;<CODE>softfloat_</CODE>&rsquo;.
 When a program includes header <CODE>softfloat.h</CODE> or links with the
-SoftFloat library, all names with prefix `<CODE>softfloat_</CODE>' are reserved
-for possible use by SoftFloat.
+SoftFloat library, all names with prefix &lsquo;<CODE>softfloat_</CODE>&rsquo;
+are reserved for possible use by SoftFloat.
 Applications that use SoftFloat should not define their own names with this
 prefix, and should reference only such names as are documented.
 </P>
@@ -477,7 +475,7 @@ prefix, and should reference only such names as are documented.
 
 <P>
 The following variables control rounding mode, underflow detection, and the
-<NOBR>80-bit</NOBR> extended format's rounding precision:
+<NOBR>80-bit</NOBR> extended format&rsquo;s rounding precision:
 <BLOCKQUOTE>
 <CODE>softfloat_roundingMode</CODE><BR>
 <CODE>softfloat_detectTininess</CODE><BR>
@@ -497,30 +495,25 @@ The rounding mode is selected by the global variable
 </BLOCKQUOTE>
 This variable may be set to one of the values
 <BLOCKQUOTE>
-<TABLE>
+<TABLE CELLSPACING=0 CELLPADDING=0>
 <TR>
 <TD><CODE>softfloat_round_near_even</CODE></TD>
-<TD>&nbsp;</TD>
 <TD>round to nearest, with ties to even</TD>
 </TR>
 <TR>
-<TD><CODE>softfloat_round_near_maxMag</CODE></TD>
-<TD>&nbsp;</TD>
+<TD><CODE>softfloat_round_near_maxMag&nbsp;&nbsp;</CODE></TD>
 <TD>round to nearest, with ties to maximum magnitude (away from zero)</TD>
 </TR>
 <TR>
 <TD><CODE>softfloat_round_minMag</CODE></TD>
-<TD>&nbsp;</TD>
 <TD>round to minimum magnitude (toward zero)</TD>
 </TR>
 <TR>
 <TD><CODE>softfloat_round_min</CODE></TD>
-<TD>&nbsp;</TD>
 <TD>round to minimum (down)</TD>
 </TR>
 <TR>
 <TD><CODE>softfloat_round_max</CODE></TD>
-<TD>&nbsp;</TD>
 <TD>round to maximum (up)</TD>
 </TR>
 </TABLE>
@@ -550,7 +543,7 @@ Like most systems (and as required by the newer 2008 IEEE Standard), SoftFloat
 always detects loss of accuracy for underflow as an inexact result.
 </P>
 
-<H3>6.3. Rounding Precision for 80-Bit Extended Format</H3>
+<H3>6.3. Rounding Precision for the <NOBR>80-Bit</NOBR> Extended Format</H3>
 
 <P>
 For <CODE>extFloat80_t</CODE> only, the rounding precision of the basic
@@ -639,7 +632,7 @@ It does always raise the <I>inexact</I> exception flag as required.
 In this section, <CODE>&lt;<I>float</I>&gt;</CODE> appears in function names as
 a substitute for one of these abbreviations:
 <BLOCKQUOTE>
-<TABLE>
+<TABLE CELLSPACING=0 CELLPADDING=0>
 <TR>
 <TD><CODE>f32</CODE></TD>
 <TD>indicates <CODE>float32_t</CODE>, passed by value</TD>
@@ -696,11 +689,14 @@ Each conversion function takes one input of the appropriate type and generates
 one output.
 The following illustrates the signatures of these functions in cases when the
 floating-point result is passed either by value or via pointers:
+<BLOCKQUOTE>
 <PRE>
-     float64_t i32_to_f64( int32_t <I>a</I> );
-
-     void i32_to_f128M( int32_t <I>a</I>, float128_t *<I>destPtr</I> );
+float64_t i32_to_f64( int32_t <I>a</I> );
 </PRE>
+<PRE>
+void i32_to_f128M( int32_t <I>a</I>, float128_t *<I>destPtr</I> );
+</PRE>
+</BLOCKQUOTE>
 </P>
 
 <H3>8.2. Conversions from Floating-Point to Integer</H3>
@@ -717,12 +713,15 @@ functions:
 </BLOCKQUOTE>
 The functions have signatures as follows, depending on whether the
 floating-point input is passed by value or via pointers:
+<BLOCKQUOTE>
 <PRE>
-     int32_t f64_to_i32( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
-
-     int32_t
-      f128M_to_i32( const float128_t *<I>aPtr</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
+int32_t f64_to_i32( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
 </PRE>
+<PRE>
+int32_t
+ f128M_to_i32( const float128_t *<I>aPtr</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
+</PRE>
+</BLOCKQUOTE>
 The <CODE><I>roundingMode</I></CODE> argument specifies the rounding mode for
 the conversion.
 The variable that usually indicates rounding mode,
@@ -768,12 +767,14 @@ and convenience:
 These functions round only toward zero (to minimum magnitude).
 The signatures for these functions are the same as above without the redundant
 <CODE><I>roundingMode</I></CODE> argument:
+<BLOCKQUOTE>
 <PRE>
-     int32_t f64_to_i32_r_minMag( float64_t <I>a</I>, bool <I>exact</I> );
+int32_t f64_to_i32_r_minMag( float64_t <I>a</I>, bool <I>exact</I> );
 </PRE>
 <PRE>
-     int32_t f128M_to_i32_r_minMag( const float128_t *<I>aPtr</I>, bool <I>exact</I> );
+int32_t f128M_to_i32_r_minMag( const float128_t *<I>aPtr</I>, bool <I>exact</I> );
 </PRE>
+</BLOCKQUOTE>
 </P>
 
 <H3>8.3. Conversions Among Floating-Point Types</H3>
@@ -789,18 +790,20 @@ result are different formats.
 There are four different styles of signature for these functions, depending on
 whether the input and the output floating-point values are passed by value or
 via pointers:
+<BLOCKQUOTE>
 <PRE>
-     float32_t f64_to_f32( float64_t <I>a</I> );
+float32_t f64_to_f32( float64_t <I>a</I> );
 </PRE>
 <PRE>
-     float32_t f128M_to_f32( const float128_t *<I>aPtr</I> );
+float32_t f128M_to_f32( const float128_t *<I>aPtr</I> );
 </PRE>
 <PRE>
-     void f32_to_f128M( float32_t <I>a</I>, float128_t *<I>destPtr</I> );
+void f32_to_f128M( float32_t <I>a</I>, float128_t *<I>destPtr</I> );
 </PRE>
 <PRE>
-     void extF80M_to_f128M( const extFloat80_t *<I>aPtr</I>, float128_t *<I>destPtr</I> );
+void extF80M_to_f128M( const extFloat80_t *<I>aPtr</I>, float128_t *<I>destPtr</I> );
 </PRE>
+</BLOCKQUOTE>
 </P>
 
 <P>
@@ -823,22 +826,22 @@ Each floating-point operation takes two operands, except for <CODE>sqrt</CODE>
 (square root) which takes only one.
 The operands and result are all of the same floating-point format.
 Signatures for these functions take the following forms:
+<BLOCKQUOTE>
 <PRE>
-     float64_t f64_add( float64_t <I>a</I>, float64_t <I>b</I> );
+float64_t f64_add( float64_t <I>a</I>, float64_t <I>b</I> );
 </PRE>
 <PRE>
-     void
-      f128M_add(
-          const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> );
+void
+ f128M_add(
+     const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> );
 </PRE>
-</P>
-<P>
 <PRE>
-     float64_t f64_sqrt( float64_t <I>a</I> );
+float64_t f64_sqrt( float64_t <I>a</I> );
 </PRE>
 <PRE>
-     void f128M_sqrt( const float128_t *<I>aPtr</I>, float128_t *<I>destPtr</I> );
+void f128M_sqrt( const float128_t *<I>aPtr</I>, float128_t *<I>destPtr</I> );
 </PRE>
+</BLOCKQUOTE>
 When floating-point values are passed indirectly through pointers, arguments
 <CODE><I>aPtr</I></CODE> and <CODE><I>bPtr</I></CODE> point to the input
 operands, and the last argument, <CODE><I>destPtr</I></CODE>, points to the
@@ -850,7 +853,7 @@ Rounding of the <NOBR>80-bit</NOBR> double-extended-precision
 (<CODE>extFloat80_t</CODE>) functions is affected by variable
 <CODE>extF80_roundingPrecision</CODE>, as explained earlier in
 <NOBR>section 6.3</NOBR>,
-<I>Rounding Precision for <NOBR>80-Bit</NOBR> Extended Format</I>.
+<I>Rounding Precision for the <NOBR>80-Bit</NOBR> Extended Format</I>.
 </P>
 
 <H3>8.5. Fused Multiply-Add Functions</H3>
@@ -873,18 +876,20 @@ No fused multiple-add function is currently provided for the
 <P>
 Depending on whether floating-point values are passed by value or via pointers,
 the fused multiply-add functions have signatures of these forms:
+<BLOCKQUOTE>
 <PRE>
-     float64_t f64_mulAdd( float64_t <I>a</I>, float64_t <I>b</I>, float64_t <I>c</I> );
+float64_t f64_mulAdd( float64_t <I>a</I>, float64_t <I>b</I>, float64_t <I>c</I> );
 </PRE>
 <PRE>
-     void
-      f128M_mulAdd(
-          const float128_t *<I>aPtr</I>,
-          const float128_t *<I>bPtr</I>,
-          const float128_t *<I>cPtr</I>,
-          float128_t *<I>destPtr</I>
-      );
+void
+ f128M_mulAdd(
+     const float128_t *<I>aPtr</I>,
+     const float128_t *<I>bPtr</I>,
+     const float128_t *<I>cPtr</I>,
+     float128_t *<I>destPtr</I>
+ );
 </PRE>
+</BLOCKQUOTE>
 The functions compute
 <NOBR>(<CODE><I>a</I></CODE> &times; <CODE><I>b</I></CODE>)
  + <CODE><I>c</I></CODE></NOBR>
@@ -915,14 +920,16 @@ Each remainder operation takes two floating-point operands of the same format
 and returns a result in the same format.
 Depending on whether floating-point values are passed by value or via pointers,
 the remainder functions have signatures of these forms:
+<BLOCKQUOTE>
 <PRE>
-     float64_t f64_rem( float64_t <I>a</I>, float64_t <I>b</I> );
+float64_t f64_rem( float64_t <I>a</I>, float64_t <I>b</I> );
 </PRE>
 <PRE>
-     void
-      f128M_rem(
-          const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> );
+void
+ f128M_rem(
+     const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> );
 </PRE>
+</BLOCKQUOTE>
 When floating-point values are passed indirectly through pointers, arguments
 <CODE><I>aPtr</I></CODE> and <CODE><I>bPtr</I></CODE> point to operands
 <CODE><I>a</I></CODE> and <CODE><I>b</I></CODE> respectively, and
@@ -938,8 +945,8 @@ where <I>n</I> is the integer closest to
 If <NOBR><CODE><I>a</I></CODE> &divide; <CODE><I>b</I></CODE></NOBR> is exactly
 halfway between two integers, <I>n</I> is the <EM>even</EM> integer closest to
 <NOBR><CODE><I>a</I></CODE> &divide; <CODE><I>b</I></CODE></NOBR>.
-The IEEE Standard's remainder operation is always exact and so requires no
-rounding.
+The IEEE Standard&rsquo;s remainder operation is always exact and so requires
+no rounding.
 </P>
 
 <P>
@@ -968,18 +975,20 @@ and the resulting integer value is returned in the same floating-point format.
 <P>
 The signatures of the round-to-integer functions are similar to those for
 conversions to an integer type:
+<BLOCKQUOTE>
 <PRE>
-     float64_t f64_roundToInt( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
+float64_t f64_roundToInt( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
 </PRE>
 <PRE>
-     void
-      f128M_roundToInt(
-          const float128_t *<I>aPtr</I>,
-          uint_fast8_t <I>roundingMode</I>,
-          bool <I>exact</I>,
-          float128_t *<I>destPtr</I>
-      );
+void
+ f128M_roundToInt(
+     const float128_t *<I>aPtr</I>,
+     uint_fast8_t <I>roundingMode</I>,
+     bool <I>exact</I>,
+     float128_t *<I>destPtr</I>
+ );
 </PRE>
+</BLOCKQUOTE>
 The <CODE><I>roundingMode</I></CODE> argument specifies the rounding mode to
 apply.
 The variable that usually indicates rounding mode,
@@ -1005,17 +1014,19 @@ provided:
 <CODE>&lt;<I>float</I>&gt;_lt</CODE>
 </BLOCKQUOTE>
 Each comparison takes two operands of the same type and returns a Boolean.
-The abbreviation <CODE>eq</CODE> stands for ``equal'' (=);
-<CODE>le</CODE> stands for ``less than or equal'' (&le;);
-and <CODE>lt</CODE> stands for ``less than'' (&lt;).
+The abbreviation <CODE>eq</CODE> stands for &ldquo;equal&rdquo; (=);
+<CODE>le</CODE> stands for &ldquo;less than or equal&rdquo; (&le;);
+and <CODE>lt</CODE> stands for &ldquo;less than&rdquo; (&lt;).
 Depending on whether the floating-point operands are passed by value or via
 pointers, the comparison functions have signatures of these forms:
+<BLOCKQUOTE>
 <PRE>
-     bool f64_eq( float64_t <I>a</I>, float64_t <I>b</I> );
+bool f64_eq( float64_t <I>a</I>, float64_t <I>b</I> );
 </PRE>
 <PRE>
-     bool f128M_eq( const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I> );
+bool f128M_eq( const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I> );
 </PRE>
+</BLOCKQUOTE>
 </P>
 
 <P>
@@ -1058,21 +1069,25 @@ provided with these names:
 The functions take one floating-point operand and return a Boolean indicating
 whether the operand is a signaling NaN.
 Accordingly, the functions have the forms
+<BLOCKQUOTE>
 <PRE>
-     bool f64_isSignalingNaN( float64_t <I>a</I> );
+bool f64_isSignalingNaN( float64_t <I>a</I> );
 </PRE>
 <PRE>
-     bool f128M_isSignalingNaN( const float128_t *<I>aPtr</I> );
+bool f128M_isSignalingNaN( const float128_t *<I>aPtr</I> );
 </PRE>
+</BLOCKQUOTE>
 </P>
 
 <H3>8.10. Raise-Exception Function</H3>
 
 <P>
 SoftFloat provides a single function for raising floating-point exceptions:
+<BLOCKQUOTE>
 <PRE>
-     void softfloat_raise( uint_fast8_t <I>exceptions</I> );
+void softfloat_raise( uint_fast8_t <I>exceptions</I> );
 </PRE>
+</BLOCKQUOTE>
 The <CODE><I>exceptions</I></CODE> argument is a mask indicating the set of
 exceptions to raise.
 (See earlier section 7, <I>Exceptions and Exception Flags</I>.)
@@ -1084,6 +1099,11 @@ function may cause a trap or abort appropriate for the current system.
 
 <H2>9. Changes from SoftFloat <NOBR>Release 2</NOBR></H2>
 
+<P>
+Apart from the change in the legal use license, there are numerous technical
+differences between <NOBR>Release 3</NOBR> of SoftFloat and earlier releases.
+</P>
+
 <H3>9.1. Name Changes</H3>
 
 <P>
@@ -1214,17 +1234,17 @@ Lastly, there are a few other changes to function names:
 <TR>
 <TD><CODE>_round_to_zero</CODE></TD>
 <TD><CODE>_r_minMag</CODE></TD>
-<TD>conversions from floating-point to integer, section 8.2</TD>
+<TD>conversions from floating-point to integer (<NOBR>section 8.2</NOBR>)</TD>
 </TR>
 <TR>
 <TD><CODE>round_to_int</CODE></TD>
 <TD><CODE>roundToInt</CODE></TD>
-<TD>round-to-integer functions, section 8.7</TD>
+<TD>round-to-integer functions (<NOBR>section 8.7</NOBR>)</TD>
 </TR>
 <TR>
 <TD><CODE>is_signaling_nan&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
 <TD><CODE>isSignalingNaN</CODE></TD>
-<TD>signaling NaN test functions, section 8.9</TD>
+<TD>signaling NaN test functions (<NOBR>section 8.9</NOBR>)</TD>
 </TR>
 </TABLE>
 </BLOCKQUOTE>
@@ -1296,7 +1316,7 @@ argument <CODE><I>exact</I></CODE>.
 <P>
 With <NOBR>Release 3</NOBR>, a port of SoftFloat can now define any of the
 floating-point types <CODE>float32_t</CODE>, <CODE>float64_t</CODE>,
-<CODE>extFloat80_t</CODE>, and <CODE>float128_t</CODE> as aliases for C's
+<CODE>extFloat80_t</CODE>, and <CODE>float128_t</CODE> as aliases for C&rsquo;s
 standard floating-point types <CODE>float</CODE>, <CODE>double</CODE>, and
 <CODE>long</CODE> <CODE>double</CODE>, using either <CODE>#define</CODE> or
 <CODE>typedef</CODE>.
@@ -1304,9 +1324,9 @@ This potential convenience was not supported under <NOBR>Release 2</NOBR>.
 </P>
 
 <P>
-(Note, however, that there may be a performance cost to defining SoftFloat's
-floating-point types this way, depending on the platform and the applications
-using SoftFloat.
+(Note, however, that there may be a performance cost to defining
+SoftFloat&rsquo;s floating-point types this way, depending on the platform and
+the applications using SoftFloat.
 Ports of SoftFloat may choose to forgo the convenience in favor of better
 speed.)
 </P>
@@ -1338,7 +1358,7 @@ Fused multiply-add functions have been added for the non-extended formats,
 
 <P>
 <NOBR>Release 3</NOBR> of SoftFloat is written to conform better to the ISO C
-Standard's rules for portability.
+Standard&rsquo;s rules for portability.
 For example, older releases of SoftFloat employed type conversions in ways
 that, while commonly practiced, are not fully defined by the C Standard.
 Such problematic type conversions have generally been replaced by the use of
@@ -1387,8 +1407,8 @@ Some loss of speed has been observed due to this change.
 The following improvements are anticipated for future releases of SoftFloat:
 <UL>
 <LI>
-support for the common <NOBR>16-bit</NOBR> ``half-precision'' floating-point
-format;
+support for the common <NOBR>16-bit</NOBR> &ldquo;half-precision&rdquo;
+floating-point format;
 <LI>
 more functions from the 2008 version of the IEEE Floating-Point Standard;
 <LI>
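
A minimal C sketch of the signExp layout described in the section 4.2 hunk of this patch: the sign sits in bit 15, and the sketch assumes the encoded exponent occupies the remaining 15 bits (the hunk context cuts off before saying so). The struct name ext80_fields and the helper names are hypothetical and are not part of softfloat.h; only the field names signExp and signif come from the documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the structure that softfloat.h defines. Only the
   two documented fields are modeled, and their order here is arbitrary. */
struct ext80_fields {
    uint16_t signExp;  /* sign in bit 15; encoded exponent assumed in bits 14..0 */
    uint64_t signif;   /* 64-bit significand */
};

/* True when the sign bit (bit 15 of signExp) is set. */
static bool ext80_sign(const struct ext80_fields *p)
{
    return (p->signExp >> 15) != 0;
}

/* Encoded (biased) exponent: the low 15 bits of signExp. */
static uint16_t ext80_exp(const struct ext80_fields *p)
{
    return (uint16_t)(p->signExp & 0x7FFF);
}

/* Packs a sign and a 15-bit encoded exponent into signExp form. */
static uint16_t ext80_pack_sign_exp(bool sign, uint16_t exp)
{
    return (uint16_t)(((uint16_t)sign << 15) | (exp & 0x7FFF));
}
```

Under these assumptions, a signExp of 0xBFFF splits into a set sign bit and an encoded exponent of 0x3FFF; paired with a signif of 0x8000000000000000, that is the 80-bit extended encoding of -1.0.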