Age | Commit message (Collapse) | Author | Files | Lines |
|
Fixed another precision bug in powf(). This one is in the computation
[t=p_l+p_h High]. We multiply t by lg2_h, and want the result to be
exact. For the bogus float case of the high-low decomposition trick, we
normally discard the lowest 12 bits of the fraction for the high part,
keeping 12 bits of precision. That was used for t here, but it doesnt't
work because for some reason we only discard the lowest 9 bits in the
fraction for lg2_h. Discard another 3 bits of the fraction for t to
compensate.
This bug gave wrong results like:
powf(0.9999999, -2.9999995) = 1.0000002 (should be 1.0000001)
hex values: 3F7FFFFF C03FFFFE 3F800002 3F800001
As explained in the log for the previous commit, the bug is normally
masked by doing float calculations in extra precision on i386's, but is
easily detected by ucbtest on systems that don't have accidental extra
precision.
Reference: https://github.com/freebsd/freebsd-src/commit/5f20e5ce7f396ad064bfc1f45b8075ea1c0580f9
Original Author: Bruce Evans
|
|
(1) The bit for the 1.0 part of bp[k] was right shifted by 4. This
seems to have been caused by a typo in converting e_pow.c to
e_powf.c.
(2) The lower 12 bits of ax+bp[k] were not discarded, so t_h was
actually plain ax+bp[k]. This seems to have been caused by a logic
error in the conversion.
These bugs gave wrong results like:
powf(-1.1, 101.0) = -15158.703 (should be -15158.707)
hex values: BF8CCCCD 42CA0000 C66CDAD0 C66CDAD4
Fixing (1) gives a result wrong in the opposite direction
(hex C66CDAD8), and fixing (2) gives the correct result.
ucbtest has been reporting this particular wrong result on i386 systems
with unpatched libraries for 9 years. I finally figured out the extent
of the bugs. On i386's they are normally hidden by extra precision.
We use the trick of representing floats as a sum of 2 floats (one much
smaller) to get extra precision in intermediate calculations without
explicitly using more than float precision. This trick is just a
pessimization when extra precision is available naturally (as it always
is when dealing with IEEE single precision, so the float precision part
of the library is mostly misimplemented). (1) and (2) break the trick
in different ways, except on i386's it turns out that the intermediate
calculations are done in enough precision to mask both the bugs and
the limited precision of the float variables (as far as ucbtest can
check).
ucbtest detects the bugs because it forces float precision, but this
is not a normal mode of operation so the bug normally has little effect
on i386's.
On systems that do float arithmetic in float precision, e.g., amd64's,
there is no accidental extra precision and the bugs just give wrong
results.
Reference: https://github.com/freebsd/freebsd-src/commit/12be4e0d5a54a6750913aee2564d164baa71f0dc
Original Author: Bruce Evans
|
|
The decomposition needs to be into 12+24 bits of precision for extra-
precision multiplication, but was into 13+24 bits. On i386 with -O1 the
bug was hidden by accidental extra precision, but on amd64, in 2^32
trials the bug caused about 200000 errors of more than 1 ulp, with a
maximum error of about 80 ulps. Now the maximum error in 2^32 trials
on amd64 is 0.8573 ulps. It is still 0.8316 ulps on i386 with -O1.
The nearby decomposition of 1/ln2 and the decomposition of 2/(3ln2) in
the double precision version seem to be sub-optimal but not broken.
Reference: https://github.com/freebsd/freebsd-src/commit/b4437c3d322a0f6d23d12b6f76d2fc72d2ff0ec2
Original Author: Bruce Evans
|
|
When creating a split manual with one-node-per-page, the main index.html
ends up getting clobbered by the page for the index() function because
it uses "@node index" which, for html, also creates an index.html page.
To remedy this, add "Function " to every function node so now we output
"Function-index.html" and avoid clobbering. It also namespaces every
other function and helps make sure we don't clobber anything else.
Otherwise, there isn't really much rendering difference as @node text
is mostly internal. Node title text comes from @section instead.
|
|
|
|
|
|
j is int32_t and thus j<<31 is undefined if j==1.
Taken from FreeBSD commit bdd8abc6d6a93ce3ab8ad5db716222ee3110c4a3
|
|
E.g. known errors:
x = 0x1.000002c5e2e99p+0, y = 0x1.c9eee35374af6p+31 had an error of 639
ULP and is now correctly rounded.
x = 0x1.fffffd2e3e669p-1, y = 0x1.344c9823eb66cp+32 had an error of 428
ULP and is now correctly rounded.
|
|
odd integer
|
|
|
|
The implementation of expf() explains how approximation in the range [0 - 0.34] is done. The comment describes the "Reme" algorithm for constructing the polynomial. This is a typo and should be the "Remez" algorithm. The remez algorithm (or minimax) is used to calculate the coefficients of polynomials in other implementations of exp(0 and log().
See more:
https://en.wikipedia.org/wiki/Remez_algorithm
|
|
By default, Newlib uses a huge object of type struct _reent to store
thread-specific data. This object is returned by __getreent() if the
__DYNAMIC_REENT__ Newlib configuration option is defined.
The reentrancy structure contains for example errno and the standard input,
output, and error file streams. This means that if an application only uses
errno it has a dependency on the file stream support even if it does not use
it. This is an issue for lower end targets and applications which need to
qualify the software according to safety standards (for example ECSS-E-ST-40C,
ECSS-Q-ST-80C, IEC 61508, ISO 26262, DO-178, DO-330, DO-333).
If the new _REENT_THREAD_LOCAL configuration option is enabled, then struct
_reent is replaced by dedicated thread-local objects for each struct _reent
member. The thread-local objects are defined in translation units which use
the corresponding object.
|
|
Convert all the libm/ subdir makes into the top-level Makefile. This
allows us to build all of libm from the top Makefile without using any
recursive make calls. This is faster and avoids the funky lib.a logic
where we unpack subdir archives to repack into a single libm.a. The
machine override logic is maintained though by way of Makefile include
ordering, and source file accumulation in libm_a_SOURCES.
One thing to note is that this will require GNU Make because of:
libm_a_CFLAGS = ... $(libm_a_CFLAGS_$(subst /,_,$(@D)))
This was the only way I could find to supporting per-dir compiler
settings, and I couldn't find a POSIX compatible way of transforming
the variable content. I don't think this is a big deal as other
Makefiles in the tree are using GNU Make-specific syntax, but I call
this out as it's the only one so far in the new automake code that
I've been writing.
Automake doesn't provide precise control over the output object names
(by design). This is fine by default as we get consistent names in all
the subdirs: libm_a-<source>.o. But this relies on using the same set
of compiler flags for all objects. We currently compile libm/common/
with different optimizations than the rest.
If we want to compile objects differently, we can create an intermediate
archive with the subset of objects with unique flags, and then add those
objects to the main archive. But Automake will use a different prefix
for the objects, and thus we can't rely on ordering to override.
But if we leverage $@, we can turn Automake's CFLAGS into a multiplex
on a per-dir (and even per-file if we wanted) basis. Unfortunately,
since $@ contains /, Automake complains it's an invalid name. While
GNU Make supports this, it's a POSIX extension, so Automake flags it.
Using $(subst) avoids the Automake warning to get a POSIX compliant
name, albeit with a GNU Make extension.
|
|
This is used in a bunch of places, but nowhere is it ever set, and
nowhere can I find any documentation, nor can I find any other project
using it. So delete the flags to simplify.
|
|
The original cut for small arguments at |x|<2**-70 (copied from the
double version) produces that when computing nadj we get a subnormal
number for t*x and thus, the division of pi/subnormal will be INF and
the logarithm of it too, which is wrong as a result for lgammaf in this
range.
The proposed new limit seems to be safe and has been tested to
produce accurate results.
(Courtesy of Andreas Jung, ESA)
|
|
Correct the overflow limit in the variable o_threshold to be consistent
with the FLT_UWORD_LOG_MAX variable used by the internal implementation
of the expf algorithm itself.
The u_threshold variable has also been modified to be written in the
same format.
Note that this fix improves the situation but does not completely
correct the inconsistencies regarding the overflow and underflow limits
between the expf wrapper (wf_exp.c) and the expf algorithm itself
(ef_exp.c).
Currently these limits are different for the
_FLT_LARGEST_EXPONENT_IS_NORMAL and _FLT_NO_DENORMALS cases as well as
for the case where __OBSOLETE_MATH is not defined (only for the
underflow limit in this case).
|
|
This kills off the last configure script under libm/ and folds it
into the top newlib configure script. The vast majority of logic
was already in the top configure script, so move the little that
is left into a libm/acinclude.m4 file.
|
|
This was needed only to support libtool in case objects ended in .lo
instead of .o, but we dropped libtool, so drop this too.
|
|
This was needed only to support libtool in case the library ended in
.la instead of .a, but we dropped libtool, so drop this too.
|
|
This was only ever used for i?86-pc-linux-gnu targets, but that's been
broken for years, and has since been dropped. So clean this up too.
This also deletes the funky objectlist logic since it only existed for
the libtool libraries. Since it was the only thing left in the small
Makefile.shared file, we can punt that too.
|
|
Now that we require automake-1.15, we can use this macro rather than
do the tool search ourselves.
|
|
Now that we use AC_NO_EXECUTABLES, and we require a recent version of
autoconf, we don't need to define our own copies of these macros. So
switch to the standard AC_PROG_CC.
|
|
Now that the top-level makefile handles these, don't need to copy
these into every single subdir.
|
|
This allows building the libc & libm pages in parallel, and drops
the duplication in the subdirs with the chew/chapter settings.
The unused rules in Makefile.shared are left in place to minimize
noise in the change.
|
|
This doesn't migrate all the docs, just the libm's manual (pdf/info).
This is to show the basic form of migrating the chew files.
|
|
When using the top-level configure script but subdir Makefiles, the
newlib_basedir value gets a bit out of sync: it's relative to where
configure lives, not where the Makefile lives. Move the abs setting
from the top-level configure script into acinclude.m4 so we can rely
on it being available everywhere. Although this commit doesn't use
it anywhere, just lays the groundwork.
|
|
The machine configure scripts are all effectively stub scripts that
pass the higher level options to its own makefile. The only one doing
any custom tests was nds32. The rest were all effectively the same as
the libm/ configure script.
So instead of recursively running configure in all of these subdirs,
generate their makefiles from the top-level configure. For nds32,
deploy a pattern of including subdir logic via m4:
m4_include([machine/nds32/acinclude.m4])
Even its set of checks are very small -- it does 2 preprocessor tests
and sets up 2 makefile conditionals.
Some of the generated machine makefiles have a bunch of extra stuff
added to them, but that's because they were inconsistent in their
configure libtool calls. The top-level has it, so it exports some
new vars to the ones that weren't already.
|
|
The machine/{configure,Makefile} files exist only to fan out to the
specific machine/$arch/ subdir. We already have all that same info
in the libm/ dir itself, so by moving the recursive configure and
make calls into it, we can cut off this logic entirely and save the
overhead.
For arches that don't have a machine subdir, it means they can skip
the logic entirely.
|
|
This matches what the other GNU toolchain projects have done already.
The generated diff in practice isn't terribly large. This will allow
more use of subdir local.mk includes due to fixes & improvements that
came after the 1.11 release series.
|
|
The newlib & libgloss dirs are already generated using autoconf-2.69.
To avoid merging new code and/or accidental regeneration using diff
versions, leverage config/override.m4 to pin to 2.69 exactly. This
matches what gcc/binutils/gdb are already doing.
The README file already says to use autoconf-2.69.
To accomplish this, it's just as simple as adding -I flags to the
top-level config/ dir when running aclocal. This is because the
override.m4 file overrides AC_INIT to first require the specific
autoconf version before calling the real AC_INIT.
|
|
Since automake deprecated the INCLUDES name in favor of AM_CPPFLAGS,
change all existing users over. The generated code is the same since
the two variables have been used in the same exact places by design.
There are other cleanups to be done, but lets focus on just renaming
here so we can upgrade to a newer automake version w/out triggering
new warnings.
|
|
|
|
The 'cygnus' option was removed from automake 1.13 in 2012, so the
presence of this option prevents that or a later version of automake
being used.
A check-list of the effects of '--cygnus' from the automake 1.12
documentation, and steps taken (where possible) to preserve those
effects (See also this thread [1] for discussion on that):
[1] https://lists.gnu.org/archive/html/bug-automake/2012-03/msg00048.html
1. The foreign strictness is implied.
Already present in AM_INIT_AUTOMAKE in newlib/acinclude.m4
2. The options no-installinfo, no-dependencies and no-dist are implied.
Already present in AM_INIT_AUTOMAKE in newlib/acinclude.m4
Future work: Remove no-dependencies and any explicit header dependencies,
and use automatic dependency tracking instead. Are there explicit rules
which are now redundant to removing no-installinfo and no-dist?
3. The macro AM_MAINTAINER_MODE is required.
Already present in newlib/acinclude.m4
Note that maintainer-mode is still disabled by default.
4. Info files are always created in the build directory, and not in the
source directory.
This appears to be an error in the automake documentation describing
'--cygnus' [2]. newlib's info files are generated in the source
directory, and no special steps are needed to keep doing that.
[2] https://lists.gnu.org/archive/html/bug-automake/2012-04/msg00028.html
5. texinfo.tex is not required if a Texinfo source file is specified.
(The assumption is that the file will be supplied, but in a place that
automake cannot find.)
This effect is overriden by an explicit setting of the TEXINFO_TEX
variable (the directory part of which is fed into texi2X via the
TEXINPUTS environment variable).
6. Certain tools will be searched for in the build tree as well as in the
user's PATH. These tools are runtest, expect, makeinfo and texi2dvi.
For obscure automake reasons, this effect of '--cygnus' is not active
for makeinfo in newlib's configury.
However, there appears to be top-level configury which selects in-tree
runtest, expect and makeinfo, if present. So, if that works as it
appears, this effect is preserved. If not, this may cause problem if
anyone is building those tools in-tree.
This effect is not preserved for texi2dvi. This may cause problems if
anyone is building texinfo in-tree.
If needed, explicit checks for those tools looking in places relative to
$(top_srcdir)/../ as well as in PATH could be added.
7. The check target doesn't depend on all.
This effect is not preseved. The check target now depends on the all
target.
This concern seems somewhat academic given the current state of the
testsuite.
Also note that this doesn't touch libgloss.
|
|
|
|
Regenerate all aclocal.m4, configure and Makefile.in files.
|
|
The maintainer rules refer to configure.in directly, so update that
after renaming all the configure.ac files.
|
|
- compiler is sometimes optimizing out the rounding check in
e_sqrt.c and ef_sqrt.c which uses two constants to create
an inexact operation
- there is a similar constant operation in s_tanh.c/sf_tanh.c
- make the one and tiny constants volatile to stop this
|
|
So far the build mechanism in newlib only allowed to either define
machine-specific headers, or headers shared between all machines.
In some cases, architectures are sufficiently alike to share header
files between them, but not with other architectures. A good example
is ix86 vs. x86_64, which share certain traits with each other, but
not with other architectures.
Introduce a new configure variable called "shared_machine_dir". This
dir can then be used for headers shared between architectures.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
|
|
|
|
- fixes issue with inf sign when x is -0
|
|
This patch fixes the error found by Paul Zimmermann (see
https://homepages.loria.fr/PZimmermann/papers/#accuracy) regarding x
close to 1 and rather large y (specifically he found the case
powf(0x1.ffffeep-1,-0x1.000002p+27) which returns +Inf instead of the
correct value). We found 2 more values for x which show the same faulty
behaviour, and all 3 are fixed with this patch. We have tested all
combinations for x in [+1.fffdfp-1, +1.00020p+0] and y in
[-1.000007p+27, -1.000002p+27] and [1.000002p+27,1.000007p+27].
|
|
Need to copy the argument sign to the output for tgamma(finite)
overflow case.
Signed-off-by: Keith Packard <keithp@keithp.com>
|
|
The current gamma, gamma_r, gammaf and gammaf_r functions return
|gamma(x)| instead of ln(|gamma(x)|) due to a change made back in 2002
to the __ieee754_gamma_r implementation. This patch fixes that, making
all of these functions map too their lgamma equivalents.
To fix the underlying bug, the __ieee754_gamma functions have been
changed to return gamma(x), removing the _r variants as those are no
longer necessary. Their names have been changed to __ieee754_tgamma to
avoid potential confusion from users.
Now that the __ieee754_tgamma functions return the correctly signed
value, the tgamma functions have been modified to use them.
libm.a now exposes the following gamma functions:
ln(|gamma(x)|):
__ieee754_lgamma_r
__ieee754_lgammaf_r
lgamma
lgamma_r
gamma
gamma_r
lgammaf
lgammaf_r
gammaf
gammaf_r
lgammal (on machines where long double is double)
gamma(x):
__ieee754_tgamma
__ieee754_tgammaf
tgamma
tgammaf
tgammal (on machines where long double is double)
Additional aliases for any of the above functions can be added if
necessary; in particular, I'm not sure if we need to include
__ieee754_gamma*_r functions (which would return ln(|(gamma(x)|).
Signed-off-by: Keith Packard <keithp@keithp.com>
----
v2:
Switch commit message to ASCII
|
|
It was calling __math_uflow(0) instead of __math_uflowf(0), which
resulted in no exception being set on machines with exception support
for float but not double.
Signed-off-by: Keith Packard <keithp@keithp.com>
|
|
The __ieee754 functions already return the right value in exception
cases, so don't modify those. Setting the library to _POSIX_/_IEEE_
mode now only affects whether errno is modified.
Signed-off-by: Keith Packard <keithp@keithp.com>
|
|
POSIX says that the errno for pow(0, -y) should be ERANGE instead of
EDOM.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/pow.html
Signed-off-by: Keith Packard <keithp@keithp.com>
|
|
The y0, y1 and yn functions need separate conditions when x is zero as
that returns ERANGE instead of EDOM.
Also stop adjusting the return value from the __ieee754_y* functions
as that is already correct and we were just breaking it.
Signed-off-by: Keith Packard <keithp@keithp.com>
|
|
For POSIX, gamma(i) (i non-positive integer) should set errno to
ERANGE instead of EDOM.
Signed-off-by: Keith Packard <keithp@keithp.com>
|
|
C compilers may fold const values at compile time, so expressions
which try to elicit underflow/overflow by performing simple
arithemetic on suitable values will not generate the required
exceptions.
Work around this by replacing code which does these arithmetic
operations with calls to the existing __math_xflow functions that are
designed to do this correctly.
Signed-off-by: Keith Packard <keithp@keithp.com>
----
v2:
libm/math: Pass sign to __math_xflow instead of muliplying result
|
|
The IEEE spec for pow only has special case for x**0 and 1**y when x/y
are quiet NaN. For signaling NaN, the general case applies and these functions
should signal the invalid exception and return a quiet NaN.
Signed-off-by: Keith Packard <keithp@keithp.com>
|