Age | Commit message | Author | Files | Lines |
|
The vector ilogb routines, including the ones inlined into fmod, had a bug
in which the conditional masks were not properly applied, causing the value of
one lane to be affected by conditional choices of another lane. The problem
was not immediately obvious because all values were calculated correctly when
no lane contained a subnormal input.
The problem is fixed by proper use of VECTOR_COND_MOVE and VECTOR_WHILE.
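As a minimal illustration (not the newlib macros themselves), a per-lane
conditional move written with GCC vector extensions has to select each lane
from its own condition lane, e.g.:

  typedef int v4si __attribute__ ((vector_size (16)));

  /* cond lanes are all-ones (-1) or all-zeros (0), as produced by a vector
     comparison such as (a < b); every lane of the result is chosen by its
     own lane of cond, never by a neighbouring lane's condition. */
  static inline v4si
  cond_move (v4si cond, v4si if_true, v4si if_false)
  {
    return (cond & if_true) | (~cond & if_false);
  }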
|
|
The end condition on this loop, unlike all the other similar loops, is
"i >= 0", which is a problem because "i <<= 1" can go negative, and then zero
if you keep shifting, making the condition true again. This isn't a problem for
the loop in the scalar implementation, but it means we need to mask the shift
in the vector implementation.
This fixes GCC PR#121392.
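For reference, a reconstruction of the scalar loop shape in question
(illustrative, following the single-precision fmod code, not quoted verbatim):

  #include <stdint.h>

  /* Count how far a subnormal significand must be shifted: ix starts at
     -126 and the loop exits as soon as the top bit reaches the sign
     position and i goes negative. */
  static int
  ilogb_subnormal (int32_t hx)   /* raw float bits, biased exponent == 0 */
  {
    int ix = -126;
    int32_t i;
    for (i = hx << 8; i >= 0; i <<= 1)
      ix -= 1;
    /* If the shifting continued, i would eventually become 0 and the
       "i >= 0" condition would be true again; that is exactly what a lane
       in the vector version sees unless its shift is masked off. */
    return ix;
  }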
|
|
This patch introduces dummy implementations of fegetprec and fesetprec for the
Cygwin build, as those symbols are exported by cygwin1.dll and AArch64 does not
support changing the floating-point precision at runtime.
Signed-off-by: Radek Bartoň <radek.barton@microsoft.com>
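A minimal sketch of what such dummy implementations could look like, assuming
the Cygwin prototypes int fegetprec(void) and int fesetprec(int); the actual
return values chosen in the patch may differ:

  int
  fegetprec (void)
  {
    return 0;          /* precision is fixed on AArch64 */
  }

  int
  fesetprec (int prec)
  {
    (void) prec;
    return -1;         /* changing the precision is not supported */
  }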
|
|
The patch merged below modifies the __ieee754_sqrtf and __ieee754_sqrt functions to use a shared implementation, replacing the original fsqrt.d[s] instruction usage.
patch:https://sourceware.org/git/?p=newlib-cygwin.git;a=commit;h=d572c4482b473d7725be0f9380d4f5d8342e4390
Signed-off-by: Songhe Zhu <zhusonghe@eswincomputing.com>
|
|
Since January, GCC has been miscompiling Newlib libm on AMD GCN due to
undefined behaviour in the RESIZE_VECTOR macro. It was "working", but expanding
the size of a vector no longer zeroed the additional lanes, as the code expected.
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325
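A hypothetical, well-defined way to widen a vector while explicitly zeroing the
new lanes with GCC vector extensions (the real RESIZE_VECTOR macro is defined
differently; this only illustrates the intended behaviour):

  typedef float v2sf __attribute__ ((vector_size (2 * sizeof (float))));
  typedef float v4sf __attribute__ ((vector_size (4 * sizeof (float))));

  static inline v4sf
  resize_v2sf_to_v4sf (v2sf in)
  {
    v4sf out = { 0.0f, 0.0f, 0.0f, 0.0f };  /* zero every lane first */
    out[0] = in[0];
    out[1] = in[1];
    return out;
  }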
|
|
- bump up release to 4.5.0
|
|
According to C99/POSIX, nextafter(x,y) should return y if x==y.
[1] NetBSD fix for this: https://github.com/IIJ-NetBSD/netbsd-src/commit/3bc685224189d2b7dfb68da52d9725a256a667bd
[2] glibc fix for this: https://github.com/bminor/glibc/commit/bc9f6000f6752153e5e1902259d5f491a88a1ae5#diff-bcc0628a39c3c2003047dcb5a40a8b50c00f01a74b1c8c1100d770a8e48b1ce2
[3] Linux man page: https://man7.org/linux/man-pages/man3/nextafter.3.html
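A small illustration of the required behaviour (a sketch, not the newlib diff);
returning y rather than x matters for signed zeros, e.g. nextafter(+0.0, -0.0)
must yield -0.0:

  #include <math.h>

  double
  nextafter_equal_case (double x, double y)
  {
    if (x == y)
      return y;                /* C99/POSIX: return y, not x */
    return nextafter (x, y);   /* otherwise step one ulp towards y */
  }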
|
|
Fixed another precision bug in powf(). This one is in the computation
[t=p_l+p_h High]. We multiply t by lg2_h, and want the result to be
exact. For the bogus float case of the high-low decomposition trick, we
normally discard the lowest 12 bits of the fraction for the high part,
keeping 12 bits of precision. That was used for t here, but it doesn't
work because for some reason we only discard the lowest 9 bits in the
fraction for lg2_h. Discard another 3 bits of the fraction for t to
compensate.
This bug gave wrong results like:
powf(0.9999999, -2.9999995) = 1.0000002 (should be 1.0000001)
hex values: 3F7FFFFF C03FFFFE 3F800002 3F800001
As explained in the log for the previous commit, the bug is normally
masked by doing float calculations in extra precision on i386's, but is
easily detected by ucbtest on systems that don't have accidental extra
precision.
Reference: https://github.com/freebsd/freebsd-src/commit/5f20e5ce7f396ad064bfc1f45b8075ea1c0580f9
Original Author: Bruce Evans
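For readers unfamiliar with the trick, a minimal sketch of the kind of
high/low split involved; the masks here are chosen for illustration and are
not the exact ones used in e_powf.c:

  #include <stdint.h>
  #include <string.h>

  /* Return x with its low fraction bits cleared, so that products with a
     similarly truncated constant are exact in single precision. */
  static float
  high_part (float x, uint32_t keep_mask)
  {
    uint32_t bits;
    memcpy (&bits, &x, sizeof bits);
    bits &= keep_mask;
    memcpy (&x, &bits, sizeof bits);
    return x;
  }

  /* Discarding the lowest 12 bits: high_part (x, 0xfffff000).
     Discarding 3 more (15 in total), as the fix does for t:
     high_part (x, 0xffff8000). */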
|
|
(1) The bit for the 1.0 part of bp[k] was right shifted by 4. This
seems to have been caused by a typo in converting e_pow.c to
e_powf.c.
(2) The lower 12 bits of ax+bp[k] were not discarded, so t_h was
actually plain ax+bp[k]. This seems to have been caused by a logic
error in the conversion.
These bugs gave wrong results like:
powf(-1.1, 101.0) = -15158.703 (should be -15158.707)
hex values: BF8CCCCD 42CA0000 C66CDAD0 C66CDAD4
Fixing (1) gives a result wrong in the opposite direction
(hex C66CDAD8), and fixing (2) gives the correct result.
ucbtest has been reporting this particular wrong result on i386 systems
with unpatched libraries for 9 years. I finally figured out the extent
of the bugs. On i386's they are normally hidden by extra precision.
We use the trick of representing floats as a sum of 2 floats (one much
smaller) to get extra precision in intermediate calculations without
explicitly using more than float precision. This trick is just a
pessimization when extra precision is available naturally (as it always
is when dealing with IEEE single precision, so the float precision part
of the library is mostly misimplemented). (1) and (2) break the trick
in different ways, except on i386's it turns out that the intermediate
calculations are done in enough precision to mask both the bugs and
the limited precision of the float variables (as far as ucbtest can
check).
ucbtest detects the bugs because it forces float precision, but this
is not a normal mode of operation so the bug normally has little effect
on i386's.
On systems that do float arithmetic in float precision, e.g., amd64's,
there is no accidental extra precision and the bugs just give wrong
results.
Reference: https://github.com/freebsd/freebsd-src/commit/12be4e0d5a54a6750913aee2564d164baa71f0dc
Original Author: Bruce Evans
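The "accidental extra precision" on i386 comes from x87 excess-precision
evaluation; a quick way to see which evaluation method a build uses (standard
C, not part of the patch):

  #include <float.h>
  #include <stdio.h>

  int
  main (void)
  {
    /* 0: float expressions evaluated in float (e.g. amd64/SSE)
       2: evaluated in long double (classic i386/x87), which masks the
          float-precision bugs described above. */
    printf ("FLT_EVAL_METHOD = %d\n", (int) FLT_EVAL_METHOD);
    return 0;
  }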
|
|
The decomposition needs to be into 12+24 bits of precision for extra-
precision multiplication, but was into 13+24 bits. On i386 with -O1 the
bug was hidden by accidental extra precision, but on amd64, in 2^32
trials the bug caused about 200000 errors of more than 1 ulp, with a
maximum error of about 80 ulps. Now the maximum error in 2^32 trials
on amd64 is 0.8573 ulps. It is still 0.8316 ulps on i386 with -O1.
The nearby decomposition of 1/ln2 and the decomposition of 2/(3ln2) in
the double precision version seem to be sub-optimal but not broken.
Reference: https://github.com/freebsd/freebsd-src/commit/b4437c3d322a0f6d23d12b6f76d2fc72d2ff0ec2
Original Author: Bruce Evans
|
|
Remove sys/xtensa, which actually duplicates newlib's code.
Move the code that is used to machine/xtensa or to libgloss.
|
|
For RISC-V, AArch64, i386, and x86_64
- k_tanl.c is required for tanl; the linker will report `__kernel_tanl`
not found when tanl is called.
- s_cospil.c and s_sinpil.c are added for cospil and sinpil, which are already
available for ld80 but not for ld128.
|
|
The vfma.f32|64 z, x, y instruction performs the operation
z += x * y without intermediate rounding.
The register used for z is both read and written by the instruction.
The inline assembly must therefore use the "+" constraint modifier.
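A sketch of the resulting inline assembly for the single-precision case,
assuming an ARM target with VFPv4 and the "t" single-precision register
constraint; the exact newlib code may differ:

  static inline float
  fmaf_vfma (float x, float y, float z)
  {
    /* vfma.f32 Sd, Sn, Sm computes Sd += Sn * Sm, so the accumulator z
       must be declared read-write with the "+" modifier. */
    __asm__ ("vfma.f32 %0, %1, %2" : "+t" (z) : "t" (x), "t" (y));
    return z;
  }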
|
|
The rest of the "long double" functions aren't compiled when long double
and double are not the same. Do the same for all complex functions.
Signed-off-by: Pietro Monteiro <pietro@sociotechnical.xyz>
|
|
When creating a split manual with one-node-per-page, the main index.html
ends up getting clobbered by the page for the index() function because
it uses "@node index" which, for html, also creates an index.html page.
To remedy this, add "Function " to every function node so now we output
"Function-index.html" and avoid clobbering. It also namespaces every
other function and helps make sure we don't clobber anything else.
Otherwise, there isn't really much rendering difference as @node text
is mostly internal. Node title text comes from @section instead.
|
|
The generated function documentation makes sure to include entries for
every function in the function index via @findex, but then the manuals
forget to actually print the index.
|
|
- bump up version to 4.4.0
|
|
RISC-V newlib fails to build for soft float multilibs since long double
support was enabled in:
commit 04798b7bb69571452d2cfc7e0b052a9bbd3b619d
Author: Kito Cheng <kito.cheng@sifive.com>
Date: Mon Dec 4 15:41:39 2023 +0800
RISC-V: Support long double math
Long double for RISC-V is using 128 bit IEEE 754 format like Aarch64,
so we reference AArch64 to support that.
The RISC-V soft floating point environment only supports the
FE_TONEAREST rounding mode and does not support exceptions. Guard long
double rounding and exception support with ifdefs based on the presence
of the relevant rounding modes and exceptions.
Tested on gcc/g++ testsuite using RISC-V GNU Newlib Toolchain built by
riscv-gnu-toolchain with multilibs:
riscv-sim/-march=rv32i/-mabi=ilp32/-mcmodel=medlow
riscv-sim/-march=rv32iac/-mabi=ilp32/-mcmodel=medlow
riscv-sim/-march=rv32im/-mabi=ilp32/-mcmodel=medlow
riscv-sim/-march=rv32imac/-mabi=ilp32/-mcmodel=medlow
riscv-sim/-march=rv32imafc/-mabi=ilp32f/-mcmodel=medlow
riscv-sim/-march=rv64imac/-mabi=lp64/-mcmodel=medlow
Co-authored-by: Simon Cook <simon.cook@embecosm.com>
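An illustrative guard in the spirit described above (a sketch, not the actual
diff): code that needs a non-default rounding mode is only compiled when the
target's <fenv.h> defines it.

  #include <fenv.h>

  static int
  set_round_upward_if_supported (void)
  {
  #ifdef FE_UPWARD
    return fesetround (FE_UPWARD);
  #else
    return -1;   /* soft-float RISC-V: only FE_TONEAREST is available */
  #endif
  }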
|
|
Curly braces cause a documentation build failure with texinfo 7.1 (works fine
up to 7.0.3):
|
|
Upstream GCC changed the -Wimplicit-function-declaration warning into an
error. The resulting build break about the missing fpclassifyf function prototype
exposed a bug in the PRU port of libm.
The fix is to use the fpclassify macro for both double and float types.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
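For illustration, fpclassify is a type-generic macro in <math.h>, so the same
spelling works for float and double; there is no fpclassifyf function to
declare (the helper below is hypothetical):

  #include <math.h>

  int
  is_finite_nonzero (float x)
  {
    int c = fpclassify (x);   /* macro: works for float and double alike */
    return c != FP_NAN && c != FP_INFINITE && c != FP_ZERO;
  }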
|
|
Long double for RISC-V uses the 128-bit IEEE 754 format like AArch64,
so we reference AArch64 to support that.
|
|
The check incorrectly results in catan returning nan + inf i when the real part
is +/-1 and the imaginary part is 0. The same occurs for a real part of 0.8 and
an imaginary part of 0.6.
The change ends up matching glibc behaviour.
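A small standalone example of the affected inputs (points on the unit circle
such as 1+0i or 0.8+0.6i), which should produce a finite result rather than
nan + inf*I:

  #include <complex.h>
  #include <stdio.h>

  int
  main (void)
  {
    double complex z = 0.8 + 0.6 * I;   /* |z| == 1 */
    double complex w = catan (z);
    printf ("catan(0.8+0.6i) = %g %+gi\n", creal (w), cimag (w));
    return 0;
  }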
|
|
|
|
soft-fp should round floating-point numbers according to the current
rounding mode. However, in the current code of lrint() and llrint(),
there are if statements before the actual rounding computation:
  if (j0 < -1)
    return 0;
where j0 is the exponent of the floating-point number.
This means any number with an exponent less than -1
(i.e. in the interval (-0.5, 0.5)) is rounded to 0 regardless of the
rounding mode.
The bug was already fixed in glibc in 2006 by moving the check after
the rounding computation, but it still persists in newlib.
This patch fixes it in a similar way to glibc.
Reference commit in glibc: 6624dbc07b5a9fb316ed188ef01f65b8eea8b47c
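A quick demonstration of the behaviour being fixed (standalone example, not
part of the patch): with a non-default rounding mode, values inside
(-0.5, 0.5) must still honour that mode.

  #include <fenv.h>
  #include <math.h>
  #include <stdio.h>

  int
  main (void)
  {
    fesetround (FE_UPWARD);
    /* With the early return, this printed 0; rounding upward it must be 1. */
    printf ("lrint(0.3) = %ld\n", lrint (0.3));
    return 0;
  }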
|
|
Zfinx/Zdinx are new extensions ratified in 2022. They are similar to the F/D
extensions, providing hard-float operations for single/double precision, but
the difference is that Zfinx/Zdinx operate on the general-purpose registers
rather than on dedicated floating-point registers.
This patch improves the hard-float support detection for the RISC-V port, so
that Zfinx/Zdinx builds get correct and better-performing code.
Co-authored-by: Jesse Huang <jesse.huang@sifive.com>
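A sketch of the kind of detection this implies, using GCC's RISC-V extension
test macros; the HAVE_* names below are placeholders, not the macros used in
the port:

  #if defined (__riscv_flen) || defined (__riscv_zfinx)
  # define HAVE_HARD_FLOAT_SINGLE 1   /* F or Zfinx provides hard float */
  #endif

  #if (defined (__riscv_flen) && __riscv_flen >= 64) || defined (__riscv_zdinx)
  # define HAVE_HARD_FLOAT_DOUBLE 1   /* D or Zdinx provides hard double */
  #endif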
|
|
Rename s_nearbyint.c, s_fdim.c and s_scalbln.c to remove conflicts
Remove functions that are not needed from the above files
Modify include paths
Add includes missing in cygwin build
Add missing types
Create Makefiles
Create header files to resolve dependencies between directories
Modify some instances of unsigned long to uint64_t for 32-bit platforms
Add HAVE_FPMATH_H
|
|
FreeBSD files to add long double support for i386,
aarch64 and x86_64.
|
|
|
|
|
|
j is int32_t and thus j<<31 is undefined if j==1.
Taken from FreeBSD commit bdd8abc6d6a93ce3ab8ad5db716222ee3110c4a3
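A sketch of the kind of fix this refers to: perform the shift in an unsigned
type so that moving a bit into the sign position is well defined (the helper
name is illustrative):

  #include <stdint.h>

  static inline uint32_t
  shift_into_sign_bit (int32_t j)
  {
    return (uint32_t) j << 31;   /* defined even when j == 1 */
  }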
|
|
E.g. known errors:
x = 0x1.000002c5e2e99p+0, y = 0x1.c9eee35374af6p+31 had an error of 639
ULP and is now correctly rounded.
x = 0x1.fffffd2e3e669p-1, y = 0x1.344c9823eb66cp+32 had an error of 428
ULP and is now correctly rounded.
|
|
odd integer
|
|
While commit 9e09d6ed8 (tag: newlib-4.3.0) bumped the newlib version to 4.3.0,
this commit updates the version/date in the libc/libm manuals to match.
|
|
This implements a set of vectorized math routines to be used by the
compiler auto-vectorizer. Versions for vectors with 2 lanes up to
64 lanes (in powers of 2) are provided.
These routines are based on the scalar versions of the math routines in
libm/common, libm/math and libm/mathfp. They make extensive use of the GCC
C vector extensions and GCN-specific builtins in GCC.
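For context, a tiny illustration of the GCC C vector extensions these routines
are built on; the type names are placeholders, not the ones used in the port:

  typedef float v2sf  __attribute__ ((vector_size (2  * sizeof (float))));
  typedef float v64sf __attribute__ ((vector_size (64 * sizeof (float))));

  /* Ordinary operators act lane-wise on such types. */
  static inline v2sf
  v2sf_fma (v2sf x, v2sf y, v2sf z)
  {
    return x * y + z;   /* two independent lane computations */
  }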
|
|
|
|
This reverts commit 125e39bfea1a39341a60348c93a65cf4894e0f2a.
|
|
The implementation of expf() explains how the approximation in the range [0, 0.34]
is done. The comment describes the "Reme" algorithm for constructing the polynomial.
This is a typo and should be the "Remez" algorithm. The Remez (minimax) algorithm is
used to calculate the coefficients of polynomials in other implementations of exp()
and log().
See more:
https://en.wikipedia.org/wiki/Remez_algorithm
|
|
This implements a set of vectorized math routines to be used by the
compiler auto-vectorizer. Versions for vectors with 2 lanes up to
64 lanes (in powers of 2) are provided.
These routines are based on the scalar versions of the math routines in
libm/common, libm/math and libm/mathfp. They make extensive use of the GCC
C vector extensions and GCN-specific builtins in GCC.
|
|
By default, Newlib uses a huge object of type struct _reent to store
thread-specific data. This object is returned by __getreent() if the
__DYNAMIC_REENT__ Newlib configuration option is defined.
The reentrancy structure contains for example errno and the standard input,
output, and error file streams. This means that if an application only uses
errno it has a dependency on the file stream support even if it does not use
it. This is an issue for lower end targets and applications which need to
qualify the software according to safety standards (for example ECSS-E-ST-40C,
ECSS-Q-ST-80C, IEC 61508, ISO 26262, DO-178, DO-330, DO-333).
If the new _REENT_THREAD_LOCAL configuration option is enabled, then struct
_reent is replaced by dedicated thread-local objects for each struct _reent
member. The thread-local objects are defined in translation units which use
the corresponding object.
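A rough sketch contrasting the two configurations (names simplified; these are
not the newlib definitions):

  /* Default: one large per-thread aggregate, reached via __getreent(). */
  struct reent_like
  {
    int err;                       /* errno */
    void *in, *out, *errstream;    /* stdin/stdout/stderr: drags in stdio */
    /* ... many more members ... */
  };

  /* _REENT_THREAD_LOCAL: each member becomes its own thread-local object,
     so a program that only uses errno links only something like this: */
  extern _Thread_local int errno_tls;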
|
|
Unless make is invoked with V=1, have xmlto pass the parameter
'man.output.quietly=1' to xsltproc to suppress "Note: Writing foo.N"
output from the manpages stylesheet.
(This doesn't quite do what it says: The output is not silenced if V has
any value, including 0. You could consider that either a bug or a
feature.)
|
|
This will generate multiple manpage files as an output, but we don't
know what they will be called, so use a timestamp for build avoidance.
|
|
Simplify rules for creating docbook XML used to create manpages:
Updating the output using move-if-change and then unconditionally
touching the .stamp file doesn't make much sense.
|
|
Integrate the old libm/test/ subdir into the main build. It hasn't
been used in a long time, causing the code to rot a bit. I've fixed
some of those issues, but it still fails for many ports, so it's disabled
by default. People who want to take a closer look can run:
$ make libm/test/test
|
|
Convert all the libm/ subdir makes into the top-level Makefile. This
allows us to build all of libm from the top Makefile without using any
recursive make calls. This is faster and avoids the funky lib.a logic
where we unpack subdir archives to repack into a single libm.a. The
machine override logic is maintained though by way of Makefile include
ordering, and source file accumulation in libm_a_SOURCES.
One thing to note is that this will require GNU Make because of:
libm_a_CFLAGS = ... $(libm_a_CFLAGS_$(subst /,_,$(@D)))
This was the only way I could find to support per-dir compiler
settings, and I couldn't find a POSIX-compatible way of transforming
the variable content. I don't think this is a big deal as other
Makefiles in the tree are using GNU Make-specific syntax, but I call
this out as it's the only one so far in the new automake code that
I've been writing.
Automake doesn't provide precise control over the output object names
(by design). This is fine by default as we get consistent names in all
the subdirs: libm_a-<source>.o. But this relies on using the same set
of compiler flags for all objects. We currently compile libm/common/
with different optimizations than the rest.
If we want to compile objects differently, we can create an intermediate
archive with the subset of objects with unique flags, and then add those
objects to the main archive. But Automake will use a different prefix
for the objects, and thus we can't rely on ordering to override.
But if we leverage $@, we can turn Automake's CFLAGS into a multiplex
on a per-dir (and even per-file if we wanted) basis. Unfortunately,
since $@ contains /, Automake complains it's an invalid name. While
GNU Make supports this, it's a POSIX extension, so Automake flags it.
Using $(subst) avoids the Automake warning to get a POSIX compliant
name, albeit with a GNU Make extension.
|
|
Commit 8fa73a9f8414a4926365324c2fe32a237c2eb91d changed how fenv.c is
compiled wrt mips16 targets, but used the wrong variable to add fenv.o to
libm.a. Fix that thinko so it's included in the build again.
|
|
This is used in a bunch of places, but nowhere is it ever set, and
nowhere can I find any documentation, nor can I find any other project
using it. So delete the flags to simplify.
|
|
The original cut for small arguments at |x|<2**-70 (copied from the
double version) means that when computing nadj we get a subnormal
number for t*x; the division pi/subnormal then overflows to INF, and so
does its logarithm, which is a wrong result for lgammaf in this
range.
The proposed new limit seems to be safe and has been tested to
produce accurate results.
(Courtesy of Andreas Jung, ESA)
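A standalone numeric illustration of the failure mode (mirroring the t*x,
pi/subnormal, logarithm steps described above, not the newlib source): for |x|
just above the old 2**-70 cut, t*x is subnormal, pi/|t*x| overflows, and its
log is +inf.

  #include <math.h>
  #include <stdio.h>

  int
  main (void)
  {
    const float pi = 3.14159265f;
    float x = -0x1p-69f;             /* just above the old cut */
    float t = sinf (pi * x);         /* ~ pi*x */
    float q = pi / fabsf (t * x);    /* t*x is subnormal -> q overflows */
    printf ("t*x = %a  pi/|t*x| = %a  logf = %a\n",
            (double) (t * x), (double) q, (double) logf (q));
    return 0;
  }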
|
|
These look like they were just copied & pasted from common/Makefile.am.
The funcs in this dir are all stubs that don't actually call any math
or builtin functions, and a simple compile shows they produce identical
object code. So delete to simplify the build rules.
|
|
Correct the overflow limit in the variable o_threshold to be consistent
with the FLT_UWORD_LOG_MAX variable used by the internal implementation
of the expf algorithm itself.
The u_threshold variable has also been modified to be written in the
same format.
Note that this fix improves the situation but does not completely
correct the inconsistencies regarding the overflow and underflow limits
between the expf wrapper (wf_exp.c) and the expf algorithm itself
(ef_exp.c).
Currently these limits are different for the
_FLT_LARGEST_EXPONENT_IS_NORMAL and _FLT_NO_DENORMALS cases as well as
for the case where __OBSOLETE_MATH is not defined (only for the
underflow limit in this case).
|
|
I missed this cleanup when rebasing on top of the latest branch.
|