author | John Hauser <jhauser@eecs.berkeley.edu> | 2016-07-22 18:03:04 -0700 |
---|---|---|
committer | John Hauser <jhauser@eecs.berkeley.edu> | 2016-07-22 18:03:04 -0700 |
commit | cb5087cd7403acf31ac24ac4be8e019a51904895 (patch) | |
tree | 3eeb55d6ad63e33dc8e3be33614e94bbe8a8cac5 /doc | |
parent | 45fdcf1c6583e4af380b147ac568f5aa721b7ba8 (diff) | |
Release 3b. See "doc/SoftFloat-history.html".
Diffstat (limited to 'doc')
-rw-r--r-- | doc/SoftFloat-history.html | 31 | ||||
-rw-r--r-- | doc/SoftFloat-source.html | 99 | ||||
-rw-r--r-- | doc/SoftFloat.html | 145 |
3 files changed, 172 insertions, 103 deletions
diff --git a/doc/SoftFloat-history.html b/doc/SoftFloat-history.html index 08cab39..f5f7c91 100644 --- a/doc/SoftFloat-history.html +++ b/doc/SoftFloat-history.html @@ -7,14 +7,41 @@ <BODY> -<H1>History of Berkeley SoftFloat, to Release 3a</H1> +<H1>History of Berkeley SoftFloat, to Release 3b</H1> <P> John R. Hauser<BR> -2015 October 23<BR> +2016 July 22<BR> </P> +<H3>Release 3b (2016 July)</H3> + +<UL> + +<LI> +Implemented the common <NOBR>16-bit</NOBR> “half-precision” +floating-point format (<CODE>float16_t</CODE>). + +<LI> +Made the integer values returned on invalid conversions to integer formats +be determined by the port-specific specialization instead of being the same for +all ports. + +<LI> +Added preprocessor macro <CODE>THREAD_LOCAL</CODE> to allow the floating-point +state (modes and exception flags) to be made per-thread. + +<LI> +Modified the provided Makefiles to allow some options to be overridden from the +<CODE>make</CODE> command. + +<LI> +Made other minor improvements. + +</UL> + + <H3>Release 3a (2015 October)</H3> <UL> diff --git a/doc/SoftFloat-source.html b/doc/SoftFloat-source.html index dff77aa..b69565f 100644 --- a/doc/SoftFloat-source.html +++ b/doc/SoftFloat-source.html @@ -7,11 +7,11 @@ <BODY> -<H1>Berkeley SoftFloat Release 3a: Source Documentation</H1> +<H1>Berkeley SoftFloat Release 3b: Source Documentation</H1> <P> John R. Hauser<BR> -2015 October 23<BR> +2016 July 22<BR> </P> @@ -53,7 +53,7 @@ This document gives information needed for compiling and/or porting Berkeley SoftFloat, a library of C functions implementing binary floating-point conforming to the IEEE Standard for Floating-Point Arithmetic. For basic documentation about SoftFloat refer to -<A HREF="SoftFloat.html"><CODE>SoftFloat.html</CODE></A>. +<A HREF="SoftFloat.html"><NOBR><CODE>SoftFloat.html</CODE></NOBR></A>. 
</P> <P> @@ -68,8 +68,8 @@ SoftFloat has been successfully compiled with the GNU C Compiler <NOBR>Release 3</NOBR> of SoftFloat was a complete rewrite relative to <NOBR>Release 2</NOBR> or earlier. Changes to the interface of SoftFloat functions are documented in -<A HREF="SoftFloat.html"><CODE>SoftFloat.html</CODE></A>. -The current version of SoftFloat is <NOBR>Release 3a</NOBR>. +<A HREF="SoftFloat.html"><NOBR><CODE>SoftFloat.html</CODE></NOBR></A>. +The current version of SoftFloat is <NOBR>Release 3b</NOBR>. </P> @@ -114,10 +114,10 @@ SoftFloat’s dependence on these headers is detailed later in The SoftFloat package was written by me, <NOBR>John R.</NOBR> Hauser. <NOBR>Release 3</NOBR> of SoftFloat was a completely new implementation supplanting earlier releases. -The project to create <NOBR>Release 3</NOBR> (and <NOBR>now 3a</NOBR>) was done -in the employ of the University of California, Berkeley, within the Department -of Electrical Engineering and Computer Sciences, first for the Parallel -Computing Laboratory (Par Lab) and then for the ASPIRE Lab. +The project to create <NOBR>Release 3</NOBR> (now <NOBR>through 3b</NOBR>) was +done in the employ of the University of California, Berkeley, within the +Department of Electrical Engineering and Computer Sciences, first for the +Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab. The work was officially overseen by Prof. Krste Asanovic, with funding provided by these sources: <BLOCKQUOTE> @@ -148,12 +148,12 @@ Oracle, and Samsung. </P> <P> -The following applies to the whole of SoftFloat <NOBR>Release 3a</NOBR> as well +The following applies to the whole of SoftFloat <NOBR>Release 3b</NOBR> as well as to each source file individually. </P> <P> -Copyright 2011, 2012, 2013, 2014, 2015 The Regents of the University of +Copyright 2011, 2012, 2013, 2014, 2015, 2016 The Regents of the University of California. All rights reserved. </P> @@ -236,9 +236,9 @@ processors. 
The files in directory <CODE>8086</CODE> give floating-point behavior consistent solely with Intel’s older, 8087-derived floating-point, while those in <NOBR><CODE>8086-SSE</CODE></NOBR> update the behavior of the -non-extended formats (<CODE>float32_t</CODE>, <CODE>float64_t</CODE>, and -<CODE>float128_t</CODE>) to mirror Intel’s more recent Streaming SIMD -Extensions (SSE) and other compatible extensions. +non-extended formats (<CODE>float16_t</CODE>, <CODE>float32_t</CODE>, +<CODE>float64_t</CODE>, and <CODE>float128_t</CODE>) to mirror Intel’s +more recent Streaming SIMD Extensions (SSE) and other compatible extensions. If other specializations are attempted, these would be expected to be other subdirectories of <CODE>source</CODE> alongside <CODE>8086</CODE> and <NOBR><CODE>8086-SSE</CODE></NOBR>. @@ -370,9 +370,12 @@ what (if anything) special happens when exceptions are raised; <LI> how signaling NaNs are distinguished from quiet NaNs; <LI> -the default generated quiet NaNs; and +the default generated quiet NaNs; <LI> -how NaNs are propagated from function inputs to output. +how NaNs are propagated from function inputs to output; and +<LI> +the integer results returned when conversions to integer type raise the +<I>invalid</I> exception. </UL> </P> @@ -418,6 +421,13 @@ For very small microprocessors whose buses and registers are <NOBR>8-bit</NOBR> or <NOBR>16-bit</NOBR> in size, this macro should usually not be defined. Whether this macro should be defined for a <NOBR>32-bit</NOBR> processor may depend on the target machine and the applications that will use SoftFloat. +<DT><CODE>SOFTFLOAT_FAST_DIV32TO16</CODE> +<DD> +Can be defined to indicate that the target’s division operator +<NOBR>in C</NOBR> (written as <CODE>/</CODE>) is reasonably efficient for +dividing a <NOBR>32-bit</NOBR> unsigned integer by a <NOBR>16-bit</NOBR> +unsigned integer. +Setting this macro may affect the performance of function <CODE>f16_div</CODE>. 
<DT><CODE>SOFTFLOAT_FAST_DIV64TO32</CODE> <DD> Can be defined to indicate that the target’s division operator @@ -425,7 +435,7 @@ Can be defined to indicate that the target’s division operator dividing a <NOBR>64-bit</NOBR> unsigned integer by a <NOBR>32-bit</NOBR> unsigned integer. Setting this macro may affect the performance of division, remainder, and -square root operations. +square root operations other than <CODE>f16_div</CODE>. <DT><CODE>INLINE_LEVEL</CODE> <DD> Can be defined to an integer to determine the degree of inlining requested of @@ -443,26 +453,41 @@ inlined. If macro <CODE>INLINE_LEVEL</CODE> is defined with a value of 1 or higher, this macro must be defined; otherwise, this macro is ignored and need not be defined. -For some compilers, this macro can be defined as the single keyword +For compilers that conform to the C Standard’s rules for inline +functions, this macro can be defined as the single keyword <CODE>inline</CODE>. +For other compilers that follow a convention pre-dating the standardization of +<CODE>inline</CODE>, this macro may need to be defined to <CODE>extern</CODE> <CODE>inline</CODE>. -Historically, the <CODE>gcc</CODE> compiler has required that this macro be -defined to <CODE>extern</CODE> <CODE>inline</CODE>. +<DT><CODE>THREAD_LOCAL</CODE> +<DD> +Can be defined to a sequence of tokens that, when appearing at the start of a +variable declaration, indicates to the C compiler that the variable is +<I>per-thread</I>, meaning that each execution thread gets its own separate +instance of the variable. +This macro is used in header <CODE>softfloat.h</CODE> in the declarations of +variables <CODE>softfloat_roundingMode</CODE>, +<CODE>softfloat_detectTininess</CODE>, <CODE>extF80_roundingPrecision</CODE>, +and <CODE>softfloat_exceptionFlags</CODE>. +If macro <CODE>THREAD_LOCAL</CODE> is left undefined, these variables will +default to being ordinary global variables. 
+Depending on the compiler, possible valid definitions of this macro include +<CODE>_Thread_local</CODE> and <CODE>__thread</CODE>. </DL> </BLOCKQUOTE> </P> <P> -Following the usual custom <NOBR>for C</NOBR>, for the first three macros (all -except <CODE>INLINE_LEVEL</CODE> and <CODE>INLINE</CODE>), the content of any -definition is irrelevant; +Following the usual custom <NOBR>for C</NOBR>, for the first four macros (all +except <CODE>INLINE_LEVEL</CODE>, <CODE>INLINE</CODE>, and +<CODE>THREAD_LOCAL</CODE>), the content of any definition is irrelevant; what matters is a macro’s effect on <CODE>#ifdef</CODE> directives. </P> <P> -It is recommended that any definitions of macros <CODE>LITTLEENDIAN</CODE> and -<CODE>INLINE</CODE> be made in a build target’s <CODE>platform.h</CODE> -header file, because these macros are expected to be determined inflexibly by -the target machine and compiler. +It is recommended that any definitions of macros <CODE>LITTLEENDIAN</CODE>, +<CODE>INLINE</CODE>, and <CODE>THREAD_LOCAL</CODE> be made in a build +target’s <CODE>platform.h</CODE> header file, because these macros are +expected to be determined inflexibly by the target machine and compiler. The other three macros control optimization and might be better located in the target’s Makefile (or its equivalent). </P> @@ -496,7 +521,7 @@ underlying arithmetic operations upon which many of SoftFloat’s floating-point functions are ultimately built. The SoftFloat sources include implementations of all of these functions/macros, written as standard C code, so a complete and correct SoftFloat library can be -built using only the supplied code for all functions. +created using only the supplied code for all functions. However, for many targets, SoftFloat’s performance can be improved by substituting target-specific implementations of some of the functions/macros declared in <CODE>primitives.h</CODE>. @@ -505,8 +530,8 @@ declared in <CODE>primitives.h</CODE>. 
<P> For example, <CODE>primitives.h</CODE> declares a function called <CODE>softfloat_countLeadingZeros32</CODE> that takes an unsigned -<NOBR>32-bit</NOBR> integer as an argument and returns the maximal number of -the integer’s most-significant bits that are all zeros. +<NOBR>32-bit</NOBR> integer as an argument and returns the number of the +integer’s most-significant bits that are zeros. While the SoftFloat sources include an implementation of this function written in <NOBR>standard C</NOBR>, many processors can perform this same function directly in only one or two machine instructions. @@ -534,7 +559,7 @@ where <NOBR><CODE><function-name></CODE></NOBR> is the name of the function. This technically defines <NOBR><CODE><function-name></CODE></NOBR> as a macro, but one that resolves to the same name, which may then be a function. -(A preprocessor conforming to the C Standard must limit recursive macro +(A preprocessor that conforms to the C Standard must limit recursive macro expansion from being applied more than once.) </P> @@ -546,7 +571,7 @@ SoftFloat can be tested using the <CODE>testsoftfloat</CODE> program by the same author. This program is part of the Berkeley TestFloat package available at the Web page -<A HREF="http://www.jhauser.us/arithmetic/TestFloat.html"><CODE>http://www.jhauser.us/arithmetic/TestFloat.html</CODE></A>. +<A HREF="http://www.jhauser.us/arithmetic/TestFloat.html"><NOBR><CODE>http://www.jhauser.us/arithmetic/TestFloat.html</CODE></NOBR></A>. The TestFloat package also has a program called <CODE>timesoftfloat</CODE> that measures the speed of SoftFloat’s floating-point functions. </P> @@ -566,10 +591,10 @@ As supplied, <CODE>softfloat.h</CODE> depends on another header, <CODE>softfloat_types.h</CODE>, that is not intended for public use but which must also be visible to the programmer’s compiler. 
<LI> -More troubling, at the time <CODE>softfloat.h</CODE> is included in a C -source file, macro <CODE>SOFTFLOAT_FAST_INT64</CODE> must be defined, or not -defined, consistent with whether this macro was defined when the SoftFloat -library was built. +More troubling, at the time <CODE>softfloat.h</CODE> is included in a C source +file, macros <CODE>SOFTFLOAT_FAST_INT64</CODE> and <CODE>THREAD_LOCAL</CODE> +must be defined, or not defined, consistent with how these macro were defined +when the SoftFloat library was built. </UL> In the situation that new programs may regularly <CODE>#include</CODE> header file <CODE>softfloat.h</CODE>, it is recommended that a custom, self-contained @@ -582,7 +607,7 @@ version of this header file be created that eliminates these issues. <P> At the time of this writing, the most up-to-date information about SoftFloat and the latest release can be found at the Web page -<A HREF="http://www.jhauser.us/arithmetic/SoftFloat.html"><CODE>http://www.jhauser.us/arithmetic/SoftFloat.html</CODE></A>. +<A HREF="http://www.jhauser.us/arithmetic/SoftFloat.html"><NOBR><CODE>http://www.jhauser.us/arithmetic/SoftFloat.html</CODE></NOBR></A>. </P> diff --git a/doc/SoftFloat.html b/doc/SoftFloat.html index 19176dc..b0ae66f 100644 --- a/doc/SoftFloat.html +++ b/doc/SoftFloat.html @@ -7,11 +7,11 @@ <BODY> -<H1>Berkeley SoftFloat Release 3a: Library Interface</H1> +<H1>Berkeley SoftFloat Release 3b: Library Interface</H1> <P> John R. Hauser<BR> -2015 October 23<BR> +2016 July 22<BR> </P> @@ -71,9 +71,10 @@ John R. Hauser<BR> <P> Berkeley SoftFloat is a software implementation of binary floating-point that conforms to the IEEE Standard for Floating-Point Arithmetic. -The current release supports four binary formats: <NOBR>32-bit</NOBR> -single-precision, <NOBR>64-bit</NOBR> double-precision, <NOBR>80-bit</NOBR> -double-extended-precision, and <NOBR>128-bit</NOBR> quadruple-precision. 
+The current release supports five binary formats: <NOBR>16-bit</NOBR> +half-precision, <NOBR>32-bit</NOBR> single-precision, <NOBR>64-bit</NOBR> +double-precision, <NOBR>80-bit</NOBR> double-extended-precision, and +<NOBR>128-bit</NOBR> quadruple-precision. The following functions are supported for each format: <UL> <LI> @@ -105,15 +106,19 @@ Information about the standard is available elsewhere. </P> <P> -The current version of SoftFloat is <NOBR>Release 3a</NOBR>. -The only difference between this version and the previous -<NOBR>Release 3</NOBR> is the replacement of the license text supplied by the -University of California. +The current version of SoftFloat is <NOBR>Release 3b</NOBR>. +This release differs from the previous <NOBR>Release 3a</NOBR> mainly in the +addition of support for the <NOBR>16-bit</NOBR> half-precision format. +Depending on the specific port of SoftFloat, this release may also change the +result obtained when conversion of a floating-point number to an integer format +overflows or is otherwise invalid. +For more about the evolution of SoftFloat releases, see +<A HREF="SoftFloat-history.html"><NOBR><CODE>SoftFloat-history.html</CODE></NOBR></A>. </P> <P> -The functional interface of SoftFloat <NOBR>Release 3</NOBR> and afterward -differs in many details from that of earlier releases. +The functional interface of SoftFloat <NOBR>Release 3</NOBR> and later differs +in many details from that of earlier releases. For specifics of these differences, see <NOBR>section 9</NOBR> below, <I>Changes from SoftFloat <NOBR>Release 2</NOBR></I>. </P> @@ -145,7 +150,7 @@ strictly required. <P> Most operations not required by the original 1985 version of the IEEE Floating-Point Standard but added in the 2008 version are not yet supported in -SoftFloat <NOBR>Release 3a</NOBR>. +SoftFloat <NOBR>Release 3b</NOBR>. </P> @@ -155,10 +160,10 @@ SoftFloat <NOBR>Release 3a</NOBR>. The SoftFloat package was written by me, <NOBR>John R.</NOBR> Hauser. 
<NOBR>Release 3</NOBR> of SoftFloat was a completely new implementation supplanting earlier releases. -The project to create <NOBR>Release 3</NOBR> (and <NOBR>now 3a</NOBR>) was done -in the employ of the University of California, Berkeley, within the Department -of Electrical Engineering and Computer Sciences, first for the Parallel -Computing Laboratory (Par Lab) and then for the ASPIRE Lab. +The project to create <NOBR>Release 3</NOBR> (now <NOBR>through 3b</NOBR>) was +done in the employ of the University of California, Berkeley, within the +Department of Electrical Engineering and Computer Sciences, first for the +Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab. The work was officially overseen by Prof. Krste Asanovic, with funding provided by these sources: <BLOCKQUOTE> @@ -189,12 +194,12 @@ Oracle, and Samsung. </P> <P> -The following applies to the whole of SoftFloat <NOBR>Release 3a</NOBR> as well +The following applies to the whole of SoftFloat <NOBR>Release 3b</NOBR> as well as to each source file individually. </P> <P> -Copyright 2011, 2012, 2013, 2014, 2015 The Regents of the University of +Copyright 2011, 2012, 2013, 2014, 2015, 2016 The Regents of the University of California. All rights reserved. </P> @@ -257,7 +262,7 @@ Header file <CODE>softfloat.h</CODE> depends on standard headers <CODE>bool</CODE> and several integer types. These standard headers have been part of the ISO C Standard Library since 1999. With any recent compiler, they are likely to be supported, even if the compiler -does not claim complete conformance to the ISO C Standard. +does not claim complete conformance to the latest ISO C Standard. For older or nonstandard compilers, a port of SoftFloat may have substitutes for these headers. 
Header <CODE>softfloat.h</CODE> depends only on the name <CODE>bool</CODE> from @@ -273,6 +278,8 @@ int64_t uint_fast8_t uint_fast32_t uint_fast64_t +int_fast32_t +int_fast64_t </PRE> </BLOCKQUOTE> </P> @@ -281,10 +288,14 @@ uint_fast64_t <H3>4.2. Floating-Point Types</H3> <P> -The <CODE>softfloat.h</CODE> header defines four floating-point types: +The <CODE>softfloat.h</CODE> header defines five floating-point types: <BLOCKQUOTE> <TABLE CELLSPACING=0 CELLPADDING=0> <TR> +<TD><CODE>float16_t</CODE></TD> +<TD><NOBR>16-bit</NOBR> half-precision binary format</TD> +</TR> +<TR> <TD><CODE>float32_t</CODE></TD> <TD><NOBR>32-bit</NOBR> single-precision binary format</TD> </TR> @@ -304,8 +315,9 @@ Motorola format)</TD> </TABLE> </BLOCKQUOTE> The non-extended types are each exactly the size specified: -<NOBR>32 bits</NOBR> for <CODE>float32_t</CODE>, <NOBR>64 bits</NOBR> for -<CODE>float64_t</CODE>, and <NOBR>128 bits</NOBR> for <CODE>float128_t</CODE>. +<NOBR>16 bits</NOBR> for <CODE>float16_t</CODE>, <NOBR>32 bits</NOBR> for +<CODE>float32_t</CODE>, <NOBR>64 bits</NOBR> for <CODE>float64_t</CODE>, and +<NOBR>128 bits</NOBR> for <CODE>float128_t</CODE>. Aside from these size requirements, the definitions of all these types may differ for different ports of SoftFloat to specific systems. A given port of SoftFloat may or may not define some of the floating-point @@ -364,7 +376,7 @@ comparisons between two values in the same floating-point format. <P> The following operations required by the 2008 IEEE Floating-Point Standard are -not supported in SoftFloat <NOBR>Release 3a</NOBR>: +not supported in SoftFloat <NOBR>Release 3b</NOBR>: <UL> <LI> <B>nextUp</B>, <B>nextDown</B>, <B>minNum</B>, <B>maxNum</B>, <B>minNumMag</B>, @@ -492,14 +504,17 @@ prefix, and should reference only such names as are documented. <H2>6. 
Mode Variables</H2> <P> -The following variables control rounding mode, underflow detection, and the -<NOBR>80-bit</NOBR> extended format’s rounding precision: +The following global variables control rounding mode, underflow detection, and +the <NOBR>80-bit</NOBR> extended format’s rounding precision: <BLOCKQUOTE> <CODE>softfloat_roundingMode</CODE><BR> <CODE>softfloat_detectTininess</CODE><BR> <CODE>extF80_roundingPrecision</CODE> </BLOCKQUOTE> These mode variables are covered in the next several subsections. +For some SoftFloat ports, these variables may be <I>per-thread</I> (declared +<CODE>thread_local</CODE>), meaning that different execution threads have their +own separate copies of the variables. </P> <H3>6.1. Rounding Mode</H3> @@ -616,30 +631,36 @@ meaning no exceptions. </P> <P> +For some SoftFloat ports, <CODE>softfloat_exceptionFlags</CODE> may be +<I>per-thread</I> (declared <CODE>thread_local</CODE>), meaning that different +execution threads have their own separate instances of it. +</P> + +<P> An individual exception flag can be cleared with the statement <BLOCKQUOTE> <CODE>softfloat_exceptionFlags &= ~softfloat_flag_<<I>exception</I>>;</CODE> </BLOCKQUOTE> where <CODE><<I>exception</I>></CODE> is the appropriate name. -To raise a floating-point exception, function <CODE>softfloat_raise</CODE> +To raise a floating-point exception, function <CODE>softfloat_raiseFlags</CODE> should normally be used. </P> <P> When SoftFloat detects an exception other than <I>inexact</I>, it calls -<CODE>softfloat_raise</CODE>. +<CODE>softfloat_raiseFlags</CODE>. The default version of this function simply raises the corresponding exception flags. Particular ports of SoftFloat may support alternate behavior, such as exception -traps, by modifying the default <CODE>softfloat_raise</CODE>. -A program may also supply its own <CODE>softfloat_raise</CODE> function to +traps, by modifying the default <CODE>softfloat_raiseFlags</CODE>. 
+A program may also supply its own <CODE>softfloat_raiseFlags</CODE> function to override the one from the SoftFloat library. </P> <P> Because inexact results occur frequently under most circumstances (and thus are hardly exceptional), SoftFloat does not ordinarily call -<CODE>softfloat_raise</CODE> for <I>inexact</I> exceptions. +<CODE>softfloat_raiseFlags</CODE> for <I>inexact</I> exceptions. It does always raise the <I>inexact</I> exception flag as required. </P> @@ -652,6 +673,10 @@ a substitute for one of these abbreviations: <BLOCKQUOTE> <TABLE CELLSPACING=0 CELLPADDING=0> <TR> +<TD><CODE>f16</CODE></TD> +<TD>indicates <CODE>float16_t</CODE>, passed by value</TD> +</TR> +<TR> <TD><CODE>f32</CODE></TD> <TD>indicates <CODE>float32_t</CODE>, passed by value</TD> </TR> @@ -752,24 +777,14 @@ otherwise, it will not be, even if the conversion is inexact. </P> <P> -Conversions from floating-point to integer raise the <I>invalid</I> exception -if the source value cannot be rounded to a representable integer of the desired -size (32 or 64 bits). -In such a circumstance, if the floating-point input is a NaN or if the -conversion is to an unsigned integer type, the largest positive integer is -returned; -otherwise, the largest integer with the same sign as the input is returned. -The functions that convert to integer types never raise the <I>overflow</I> -exception. -</P> - -<P> -Note that, when converting to an unsigned integer type, if the <I>invalid</I> -exception is raised because the input floating-point value would round to a -negative integer, the value returned is the <EM>maximum positive unsigned -integer</EM>. -Zero is not returned when the <I>invalid</I> exception is raised, even when -zero is the closest integer to the original floating-point value. +A conversion from floating-point to integer format raises the <I>invalid</I> +exception if the source value cannot be rounded to a representable integer of +the desired size (32 or 64 bits). 
+In such circumstances, the integer result returned is determined by the +particular port of SoftFloat, although typically this value will be either the +maximum or minimum value of the integer format. +The functions that convert to integer types never raise the floating-point +<I>overflow</I> exception. </P> <P> @@ -884,11 +899,9 @@ SoftFloat implements fused multiply-add with functions <BLOCKQUOTE> <CODE><<I>float</I>>_mulAdd</CODE> </BLOCKQUOTE> -Unlike other operations, fused multiple-add is supported only for the -non-extended formats, <CODE>float32_t</CODE>, <CODE>float64_t</CODE>, and -<CODE>float128_t</CODE>. -No fused multiple-add function is currently provided for the -<NOBR>80-bit</NOBR> double-extended-precision type, <CODE>extFloat80_t</CODE>. +Unlike other operations, fused multiple-add is not supported for the +<NOBR>80-bit</NOBR> double-extended-precision format, +<CODE>extFloat80_t</CODE>. </P> <P> @@ -971,8 +984,8 @@ no rounding. Depending on the relative magnitudes of the operands, the remainder functions can take considerably longer to execute than the other SoftFloat functions. -This is inherent in the remainder operation itself and is not a flaw in the -SoftFloat implementation. +This is an inherent characteristic of the remainder operation itself and is not +a flaw in the SoftFloat implementation. </P> <H3>8.7. Round-to-Integer Functions</H3> @@ -1103,14 +1116,14 @@ bool f128M_isSignalingNaN( const float128_t *<I>aPtr</I> ); SoftFloat provides a single function for raising floating-point exceptions: <BLOCKQUOTE> <PRE> -void softfloat_raise( uint_fast8_t <I>exceptions</I> ); +void softfloat_raiseFlags( uint_fast8_t <I>exceptions</I> ); </PRE> </BLOCKQUOTE> The <CODE><I>exceptions</I></CODE> argument is a mask indicating the set of exceptions to raise. (See earlier section 7, <I>Exceptions and Exception Flags</I>.) 
In addition to setting the specified exception flags in variable -<CODE>softfloat_exceptionFlags</CODE>, the <CODE>softfloat_raise</CODE> +<CODE>softfloat_exceptionFlags</CODE>, the <CODE>softfloat_raiseFlags</CODE> function may cause a trap or abort appropriate for the current system. </P> @@ -1216,7 +1229,7 @@ have been renamed as follows: </TR> <TR> <TD><CODE>float_raise</CODE></TD> -<TD><CODE>softfloat_raise</CODE></TD> +<TD><CODE>softfloat_raiseFlags</CODE></TD> </TR> </TABLE> </BLOCKQUOTE> @@ -1367,8 +1380,15 @@ all cases involving rounding. <P> <LI> -Fused multiply-add functions have been added for the non-extended formats, -<CODE>float32_t</CODE>, <CODE>float64_t</CODE>, and <CODE>float128_t</CODE>. +Fused multiply-add functions have been added for all floating-point formats +except <NOBR>80-bit</NOBR> double-extended-precision, +<CODE>extFloat80_t</CODE>. +</P> + +<P> +<LI> +As of <NOBR>Release 3b</NOBR>, <NOBR>16-bit</NOBR> half-precision, +<CODE>float16_t</CODE>, is supported. </P> </UL> @@ -1427,9 +1447,6 @@ Some loss of speed has been observed due to this change. The following improvements are anticipated for future releases of SoftFloat: <UL> <LI> -support for the common <NOBR>16-bit</NOBR> “half-precision” -floating-point format; -<LI> more functions from the 2008 version of the IEEE Floating-Point Standard; <LI> consistent, defined behavior for non-canonical representations of extended @@ -1445,7 +1462,7 @@ format <CODE>extFloat80_t</CODE> (discussed in <NOBR>section 4.4</NOBR>, <P> At the time of this writing, the most up-to-date information about SoftFloat and the latest release can be found at the Web page -<A HREF="http://www.jhauser.us/arithmetic/SoftFloat.html"><CODE>http://www.jhauser.us/arithmetic/SoftFloat.html</CODE></A>. +<A HREF="http://www.jhauser.us/arithmetic/SoftFloat.html"><NOBR><CODE>http://www.jhauser.us/arithmetic/SoftFloat.html</CODE></NOBR></A>. </P> |
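The documentation changes in this commit describe two related mechanisms: the `THREAD_LOCAL` macro, which can make the floating-point state per-thread, and the exception-flags variable that is set through `softfloat_raiseFlags` and cleared with a mask. The following is a minimal sketch of that pattern only, not code from SoftFloat. All `demo_` names and the numeric flag values are hypothetical stand-ins, and the empty fallback definition of `THREAD_LOCAL` is an assumption based on the statement that the variables default to ordinary globals when the macro is left undefined.

```c
#include <stdint.h>

/* Assumption: a port's platform.h might define THREAD_LOCAL as _Thread_local
   or __thread. When it is left undefined, this sketch falls back to an empty
   definition, so the state below is an ordinary global variable. */
#ifndef THREAD_LOCAL
#define THREAD_LOCAL
#endif

/* Hypothetical flag values for illustration only; the real constants are
   defined by softfloat.h and may differ. */
enum {
    demo_flag_inexact = 1,
    demo_flag_invalid = 16
};

/* Stand-in for softfloat_exceptionFlags. The macro appears at the start of
   the declaration, as the documentation describes. */
THREAD_LOCAL uint_fast8_t demo_exceptionFlags = 0;

/* Stand-in for softfloat_raiseFlags: this version only sets the requested
   flags. A port could replace it with a version that traps or aborts. */
void demo_raiseFlags(uint_fast8_t exceptions)
{
    demo_exceptionFlags |= exceptions;
}

/* Clears a single flag, mirroring the documented statement
   "softfloat_exceptionFlags &= ~softfloat_flag_<exception>;". */
void demo_clearFlag(uint_fast8_t flag)
{
    demo_exceptionFlags &= (uint_fast8_t)~flag;
}
```

If `THREAD_LOCAL` is defined for the library build, the same definition would also need to be in effect wherever the header declaring these variables is included, which is the consistency requirement the updated `SoftFloat-source.html` text calls out.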