Age | Commit message (Collapse) | Author | Files | Lines |
|
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
|
|
glibc has had exp10 functions since long before they were
standardized; now they are standardized in TS 18661-4 and C2X, they
are also specified there to have a corresponding type-generic macro.
Add one to <tgmath.h>, so fixing bug 26108.
glibc doesn't have other functions from TS 18661-4 yet, but when
added, it will be natural to add the type-generic macro for each
function family at the same time as the functions.
Tested for x86_64.
|
|
C2X adds new <math.h> functions for floating-point maximum and
minimum, corresponding to the new operations that were added in IEEE
754-2019 because of concerns about the old operations not being
associative in the presence of signaling NaNs. fmaximum and fminimum
handle NaNs like most <math.h> functions (any NaN argument means the
result is a quiet NaN). fmaximum_num and fminimum_num handle both
quiet and signaling NaNs the way fmax and fmin handle quiet NaNs (if
one argument is a number and the other is a NaN, return the number),
but still raise "invalid" for a signaling NaN argument, making them
exceptions to the normal rule that a function with a floating-point
result raising "invalid" also returns a quiet NaN. fmaximum_mag,
fminimum_mag, fmaximum_mag_num and fminimum_mag_num are corresponding
functions returning the argument with greatest or least absolute
value. All these functions also treat +0 as greater than -0. There
are also corresponding <tgmath.h> type-generic macros.
Add these functions to glibc. The implementations use type-generic
templates based on those for fmax, fmin, fmaxmag and fminmag, and test
inputs are based on those for those functions with appropriate
adjustments to the expected results. The RISC-V maintainers might
wish to add optimized versions of fmaximum_num and fminimum_num (for
float and double), since RISC-V (F extension version 2.2 and later)
provides instructions corresponding to those functions - though it
might be at least as useful to add architecture-independent built-in
functions to GCC and teach the RISC-V back end to expand those
functions inline, which is what you generally want for functions that
can be implemented with a single instruction.
Tested for x86_64 and x86, and with build-many-glibcs.py.
|
|
This patch adds the narrowing fused multiply-add functions from TS
18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal,
f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x,
f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128,
f64xfmaf128 for configurations with _Float64x and _Float128;
__f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case
(for calls to ffmal and dfmal when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, especially that for sqrt, so the
description of those generally applies to this patch as well. As with
sqrt, I reused the same test inputs in auto-libm-test-in as for
non-narrowing fma rather than adding extra or separate inputs for
narrowing fma. The tests in libm-test-narrow-fma.inc also follow
those for non-narrowing fma.
The non-narrowing fma has a known bug (bug 6801) that it does not set
errno on errors (overflow, underflow, Inf * 0, Inf - Inf). Rather
than fixing this or having narrowing fma check for errors when
non-narrowing does not (complicating the cases when narrowing fma can
otherwise be an alias for a non-narrowing function), this patch does
not attempt to check for errors from narrowing fma and set errno; the
CHECK_NARROW_FMA macro is still present, but as a placeholder that
does nothing, and this missing errno setting is considered to be
covered by the existing bug rather than needing a separate open bug.
missing-errno annotations are duly added to many of the
auto-libm-test-in test inputs for fma.
This completes adding all the new functions from TS 18661-1 to glibc,
so will be followed by corresponding stdc-predef.h changes to define
__STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support
for TS 18661-1 will be at a similar level to that for C standard
floating-point facilities up to C11 (pragmas not implemented, but
library functions done). (There are still further changes to be done
to implement changes to the types of fromfp functions from N2548.)
Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
|
|
This patch adds the narrowing square root functions from TS 18661-1 /
TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64,
f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x,
f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128,
f64xsqrtf128 for configurations with _Float64x and _Float128;
__f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case
(for calls to fsqrtl and dsqrtl when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, so the description of those generally
applies to this patch as well. However, the not-actually-narrowing
cases (where the two types involved in the function have the same
floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather
than needing a separately built not-actually-narrowing function such
as was needed for add / sub / mul / div. Thus, there is no
__nldbl_dsqrtl name for ldbl-opt because no such name was needed
(whereas the other functions needed such a name since the only other
name for that entry point was e.g. f32xaddf64, not reserved by TS
18661-1); the headers are made to arrange for sqrt to be called in
that case instead.
The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because
they were observed to be needed in GCC 7 testing of
riscv32-linux-gnu-rv32imac-ilp32. The other sysdeps/ieee754/soft-fp/
files added didn't need such DIAG_* in any configuration I tested with
build-many-glibcs.py, but if they do turn out to be needed in more
files with some other configuration / GCC version, they can always be
added there.
I reused the same test inputs in auto-libm-test-in as for
non-narrowing sqrt rather than adding extra or separate inputs for
narrowing sqrt. The tests in libm-test-narrow-sqrt.inc also follow
those for non-narrowing sqrt.
Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
|
|
Some math functions (such as __isnan*) are built into both libm and
libc because they are needed in libc. The symbol gets exported from
libc.so and not libm.so, because of which dynamic linking works fine;
the symbols are always resolved from libc.so and libm.so uses its
internal copy of the same function if needed.
When linking statically though, the libm variants get used throughout
because the symbols are exported in both archives and libm.a is
searched first.
This patch removes these duplicate objects from the libm.a archive so
that programs always link to libc in both, the static and dynamic
case. The difference this will cause is that libm uses of these
functions will start using the libc versions in the !SHARED case.
This is harmless at the moment because the objects are identical
except for their names.
Some of these duplicates could be removed from libm.so too, but I
avoided that in the interest of retaining an internal reference if at
all those functions get used within libm in future.
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
|
|
Finally remove all mpa related files, headers, declarations, probes, unused
tables and update makefiles.
Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
|
|
compat_symbol_reference is now available without tests-internal.
Do not build the test at all on glibc versions that lack the symbols,
to avoid spurious UNSUPPORTED results.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
tests-internal is no longer needed because compat_symbol_reference
now works in regular tests.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
|
|
With fenvinline.h removal the flag is not used anymore.
Checked on x86_64-linux-gnu.
|
|
Similar to string2.h (18b10de7ce) and string3.h (09a596cc2c) this
patch removes the fenvinline.h on all architectures. Currently
only powerpc implements some optimizations. This kind of optimization
is better implemented by the compiler (which handles the architecture
ISA transparently).
Also, for the specific optimized powerpc implementation the code is
becoming convoluted and these micro-optimization are hardly wildly
used, even more being a possible hotspot in realword cases
(non-default rounding are used only on specific cases and exception
handling are done most likely only on errors path). Only x86
implements similar optimization (on fenv.h) also indicates that
these should no be on libc.
The math/test-fenv already covers all math/test-fenvinline tests,
so it is safe to remove it.
The powerpc fegetround optimization is moved to internal
fenv_libc.h.
The BZ#94193 [1] the corresponding GCC bug for adding replacements
for these on powerpc.
Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94193
|
|
With mathinline removal there is no need to keep building and testing
inline math tests.
The gen-libm-tests.py support to generate ULP_I_* is removed and all
libm-test-ulps files are updated to longer have the
i{float,double,ldouble} entries. The support for no-test-inline is
also removed from both gen-auto-libm-tests and the
auto-libm-test-out-* were regenerated.
Checked on x86_64-linux-gnu and i686-linux-gnu.
|
|
With m68k mathinline.h removal the flag is not used anymore.
Checked with a m68k-linux-gnu build/check.
|
|
With m68k bits moved to internal headers, no architectures export
additional optimizations on mathinline.
|
|
On platforms where long double has the same ABI as double, glibc
defines long double functions as aliases for the corresponding double
functions. The declarations of those functions in <math.h> are
disabled to avoid problems with aliases having incompatible types, but
GCC 10 now gives errors for incompatible types when the long double
function is known to GCC as a built-in function, not just when there
is an incompatible header declaration.
This patch fixes those errors by using appropriate
-fno-builtin-<function> options to compile the double functions. The
list of CFLAGS-* settings is an appropriately adapted version of that
in sysdeps/ieee754/ldbl-opt/Makefile used there for building nldbl-*.c
files; in particular, the options are used even if GCC does not
currently have a built-in function of a given function, so that adding
such a built-in function in future will not break the glibc build.
Thus, various of the CFLAGS-* settings are only for future-proofing
and may not currently be needed (and it's possible some could be
irrelevant for other reasons).
Tested with build-many-glibcs.py for arm-linux-gnueabi (compilers and
glibcs builds), where it fixes the build that previously failed.
|
|
This patch creates test-ibm128* tests from the long double function tests.
In order to explicitly test IBM long double functions -mabi=ibmlongdouble is
added to CFLAGS.
Likewise, update the test headers to correct choose ULPs when redirects
are enabled.
Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Co-authored-by: Paul E. Murphy <murphyp@linux.vnet.ibm.com>
|
|
This is a preparatory patch to enable building a _Float128
variant to ease reuse when building a _Float128 variant to
alias this long double only symbol.
Notably, stubs are added where missing to the native _Float128
sysdep dir to prevent building these newly templated variants
created inside the build directories.
Also noteworthy are the changes around LIBM_SVID_COMPAT. These
changes are not intuitive. The templated version is only
enabled when !LIBM_SVID_COMPAT, and the compat version is
predicated entirely on LIBM_SVID_COMPAT. Thus, exactly one is
stubbed out entirely when building. The nldbl scalb compat
files are updated to account for this.
Likewise, fixup the reuse of m68k's e_scalb{f,l}.c to include
it's override of e_scalb.c. Otherwise, the search path finds
the templated copy in the build directory. This could be
futher simplified by providing an overridden template, but I
lack the hardware to verify.
|
|
|
|
Remove _finite tests and references from x86_64. Rather than calling
__exp_finite, use exp directly (since it's the same entry point).
x86_64 builds and passes testsuite.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Remove the finite-math tests from the testsuite - these are no longer
useful after removing math-finite.h header.
Passes buildmanyglibc, build&test on x86_64 and AArch64.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Remove math-finite.h redirections for math functions.
Passes buildmanyglibc.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:
sed -ri '
s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
$(find $(git ls-files) -prune -type f \
! -name '*.po' \
! -name 'ChangeLog*' \
! -path COPYING ! -path COPYING.LIB \
! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
! -path manual/texinfo.tex ! -path scripts/config.guess \
! -path scripts/config.sub ! -path scripts/install-sh \
! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
! -path INSTALL ! -path locale/programs/charmap-kw.h \
! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
! '(' -name configure \
-execdir test -f configure.ac -o -f configure.in ';' ')' \
! '(' -name preconfigure \
-execdir test -f preconfigure.ac ';' ')' \
-print)
and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:
chmod a+x sysdeps/unix/sysv/linux/riscv/configure
# Omit irrelevant whitespace and comment-only changes,
# perhaps from a slightly-different Autoconf version.
git checkout -f \
sysdeps/csky/configure \
sysdeps/hppa/configure \
sysdeps/riscv/configure \
sysdeps/unix/sysv/linux/csky/configure
# Omit changes that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
git checkout -f \
sysdeps/powerpc/powerpc64/ppc-mcount.S \
sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
# Omit change that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
|
|
When adding some of the TS 18661 narrowing functions for glibc 2.28, I
deferred adding corresponding <tgmath.h> support because of unresolved
questions about the specification for those type-generic macros,
especially in relation to _FloatN and _FloatNx types.
Those issues are now clarified in the response to Clarification
Request 13 to TS 18661-3, and this patch adds the deferred tgmath.h
support. As with other tgmath.h macros, there are fairly
straightforward implementations based on __builtin_tgmath for GCC 8
and later, which result in exactly the right function being called in
each case, and more complicated implementations for GCC 7 and earlier,
which generally result in a function being called whose arguments have
the right format (i.e. an alias for the right function), but which
might not be exactly the function name specified by TS 18661.
In one case with older compilers (f32x* macros, where the type
_Float64x exists and all the arguments have type _Float32 or
_Float32x), there is a further relaxation and the function called may
have arguments narrower than the one specified by the TS, but still
wide enough to represent the arguments exactly, so the result of the
call is unchanged (as this does not affect any case where rounding of
integer arguments might be involved). With GCC 6 or before this is
inherently unavoidable (but still harmless and not detectable by how
the compiled program behaves, unless it redefines the functions in
question like the testcases do) because _Float32x and _Float64 are
both typedefs for double in that case but the specified semantics
result in different functions, with different argument formats, being
called for those two argument types.
Tests for the new macros are handled through gen-tgmath-tests.py,
which deals with the special-case handling for older GCC.
Tested as follows: with the full glibc testsuite on x86_64 and x86
(with GCC 6, 7 and 8); with the math/ tests on aarch64 and arm (with
GCC 6, 7 and 8); with build-many-glibcs.py (with GCC 6, 7 and 9).
* math/tgmath.h [__HAVE_FLOAT128X]: Give error.
[(__HAVE_FLOAT64X && !__HAVE_FLOAT128)
|| (__HAVE_FLOAT128 && !__HAVE_FLOAT64X)]: Likewise.
(__TGMATH_2_NARROW_F): Likewise.
(__TGMATH_2_NARROW_D): New macro.
(__TGMATH_2_NARROW_F16): Likewise.
(__TGMATH_2_NARROW_F32): Likewise.
(__TGMATH_2_NARROW_F64): Likewise.
(__TGMATH_2_NARROW_F32X): Likewise.
(__TGMATH_2_NARROW_F64X): Likewise.
[__HAVE_BUILTIN_TGMATH] (__TGMATH_NARROW_FUNCS_F): Likewise.
[__HAVE_BUILTIN_TGMATH] (__TGMATH_NARROW_FUNCS_F16): Likewise.
[__HAVE_BUILTIN_TGMATH] (__TGMATH_NARROW_FUNCS_F32): Likewise.
[__HAVE_BUILTIN_TGMATH] (__TGMATH_NARROW_FUNCS_F64): Likewise.
[__HAVE_BUILTIN_TGMATH] (__TGMATH_NARROW_FUNCS_F32X): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT_C2X)] (fadd): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT_C2X)] (dadd): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT_C2X)] (fdiv): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT_C2X)] (ddiv): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT_C2X)] (fmul): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT_C2X)] (dmul): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT_C2X)] (fsub): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT_C2X)] (dsub): Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT16] (f16add):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT16] (f16div):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT16] (f16mul):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT16] (f16sub):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT32] (f32add):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT32] (f32div):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT32] (f32mul):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT32] (f32sub):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT64
&& (__HAVE_FLOAT64X || __HAVE_FLOAT128)] (f64add): Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT64
&& (__HAVE_FLOAT64X || __HAVE_FLOAT128)] (f64div): Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT64
&& (__HAVE_FLOAT64X || __HAVE_FLOAT128)] (f64mul): Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT64
&& (__HAVE_FLOAT64X || __HAVE_FLOAT128)] (f64sub): Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT32X] (f32xadd):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT32X] (f32xdiv):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT32X] (f32xmul):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT32X] (f32xsub):
Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT64X
&& (__HAVE_FLOAT128X || __HAVE_FLOAT128)] (f64xadd): Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT64X
&& (__HAVE_FLOAT128X || __HAVE_FLOAT128)] (f64xdiv): Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT64X
&& (__HAVE_FLOAT128X || __HAVE_FLOAT128)] (f64xmul): Likewise.
[__GLIBC_USE (IEC_60559_TYPES_EXT) && __HAVE_FLOAT64X
&& (__HAVE_FLOAT128X || __HAVE_FLOAT128)] (f64xsub): Likewise.
* math/gen-tgmath-tests.py (Type): Add members
non_standard_real_argument_types_list, long_double_type,
complex_float64_type and float32x_ext_type.
(Type.__init__): Set the new members.
(Type.floating_type): Add new argument floatn.
(Type.real_floating_type): Likewise.
(Type.can_combine_types): Likewise.
(Type.combine_types): Likewise.
(Type.init_types): Create internal Float32x_ext type.
(Tests.__init__): Define Float32x_ext in generated C code.
(Tests.add_tests): Handle narrowing functions.
(Tests.add_all_tests): Likewise.
(Tests.tests_text): Allow variation in mant_dig for narrowing
functions with compilers before GCC 8.
* math/Makefile (tgmath3-narrow-types): New variable.
(tgmath3-narrow-macros): Likewise.
(tgmath3-macros): Add $(tgmath3-narrow-macros).
|
|
The resolution of C floating-point Clarification Request 25
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2397.htm#dr_25> is
that the totalorder and totalordermag functions should take pointer
arguments, and this has been adopted in C2X (with const added; note
that the integration of this change into C2X is present in the C
standard git repository but postdates the most recent public PDF
draft).
This patch updates glibc accordingly. As a defect resolution, the API
is changed unconditionally rather than supporting any sort of TS
18661-1 mode for compilation with the old version of the API. There
are compat symbols for existing binaries that pass floating-point
arguments directly. As a consequence of changing to pointer
arguments, there are no longer type-generic macros in tgmath.h for
these functions.
Because of the fairly complicated logic for creating libm function
aliases and determining the set of aliases to create in a given glibc
configuration, rather than duplicating all that in individual source
files to create the versioned and compat symbols, the source files for
the various versions of totalorder functions are set up to redefine
weak_alias before using libm_alias_* macros to create the symbols
required. In turn, this requires creating a separate alias for each
symbol version pointing to the same implementation (see binutils bug
<https://sourceware.org/bugzilla/show_bug.cgi?id=23840>), which is
done automatically using __COUNTER__. (As I noted in
<https://sourceware.org/ml/libc-alpha/2018-10/msg00631.html>, it might
well make sense for glibc's symbol versioning macros to do that alias
creation with __COUNTER__ themselves, which would somewhat simplify
the logic in the totalorder source files.)
It is of course desirable to test the compat symbols. I did this with
the generic libm-test machinery, but didn't wish to duplicate the
actual tables of test inputs and outputs, and thought it risky to
attempt to have a single object file refer to both default and compat
versions of the same function in order to test them together. Thus, I
created libm-test-compat_totalorder.inc and
libm-test-compat_totalordermag.inc which include the generated .c
files (with the processed version of those tables of inputs) from the
non-compat tests, and added appropriate dependencies. I think this
provides sufficient test coverage for the compat symbols without also
needing to make the special ldbl-96 and ldbl-128ibm tests (of
peculiarities relating to the representations of those formats that
can't be covered in the generic tests) run for the compat symbols.
Tests of compat symbols need to be internal tests, meaning _ISOMAC is
not defined. Making some libm-test tests into internal tests showed
up two other issues. GCC diagnoses duplicate macro definitions of
__STDC_* macros, including __STDC_WANT_IEC_60559_TYPES_EXT__; I added
an appropriate conditional and filed
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91451> for this issue.
On ia64, include/setjmp.h ends up getting included indirectly from
libm-symbols.h, resulting in conflicting definitions of the STR macro
(also defined in libm-test-driver.c); I renamed the macros in
include/setjmp.h. (It's arguable that we should have common internal
headers used everywhere for stringizing and concatenation macros.)
Tested for x86_64 and x86, and with build-many-glibcs.py.
* math/bits/mathcalls.h
[__GLIBC_USE (IEC_60559_BFP_EXT) || __MATH_DECLARING_FLOATN]
(totalorder): Take pointer arguments.
[__GLIBC_USE (IEC_60559_BFP_EXT) || __MATH_DECLARING_FLOATN]
(totalordermag): Likewise.
* manual/arith.texi (totalorder): Likewise.
(totalorderf): Likewise.
(totalorderl): Likewise.
(totalorderfN): Likewise.
(totalorderfNx): Likewise.
(totalordermag): Likewise.
(totalordermagf): Likewise.
(totalordermagl): Likewise.
(totalordermagfN): Likewise.
(totalordermagfNx): Likewise.
* math/tgmath.h (__TGMATH_BINARY_REAL_RET_ONLY): Remove macro.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (totalorder): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (totalordermag): Likewise.
* math/Versions (GLIBC_2.31): Add totalorder, totalorderf,
totalorderl, totalordermag, totalordermagf, totalordermagl,
totalorderf32, totalorderf64, totalorderf32x, totalordermagf32,
totalordermagf64, totalordermagf32x, totalorderf64x,
totalordermagf64x, totalorderf128 and totalordermagf128.
* math/Makefile (libm-test-funcs-noauto): Add compat_totalorder
and compat_totalordermag.
(libm-test-funcs-compat): New variable.
(libm-tests-compat): Likewise.
(tests): Do not include compat tests.
(tests-internal): Add compat tests.
($(foreach t,$(libm-tests-base),
$(objpfx)$(t)-compat_totalorder.o)): Depend
on $(objpfx)libm-test-totalorder.c.
($(foreach t,$(libm-tests-base),
$(objpfx)$(t)-compat_totalordermag.o): Depend on
$(objpfx)libm-test-totalordermag.c.
(tgmath3-macros): Remove totalorder and totalordermag.
* math/libm-test-compat_totalorder.inc: New file.
* math/libm-test-compat_totalordermag.inc: Likewise.
* math/libm-test-driver.c (struct test_ff_i_data): Update comment.
(RUN_TEST_fpfp_b): New macro.
(RUN_TEST_LOOP_fpfp_b): Likewise.
* math/libm-test-totalorder.inc (totalorder_test_data): Use
TEST_fpfp_b.
(totalorder_test): Condition on [!COMPAT_TEST].
(do_test): Likewise.
* math/libm-test-totalordermag.inc (totalordermag_test_data): Use
TEST_fpfp_b.
(totalordermag_test): Condition on [!COMPAT_TEST].
(do_test): Likewise.
* math/gen-tgmath-tests.py (Tests.add_all_tests): Remove
totalorder and totalordermag.
* math/test-tgmath.c (NCALLS): Change to 132.
(F(compile_test)): Do not call totalorder or totalordermag.
(F(totalorder)): Remove.
(F(totalordermag)): Likewise.
* include/float.h (__STDC_WANT_IEC_60559_TYPES_EXT__): Do not
define if [__STDC_WANT_IEC_60559_TYPES_EXT__].
* include/setjmp.h [!_ISOMAC] (STR_HELPER): Rename to
SJSTR_HELPER.
[!_ISOMAC] (STR): Rename to SJSTR. Update call to STR_HELPER.
[!_ISOMAC] (TEST_SIZE): Update call to STR.
[!_ISOMAC] (TEST_ALIGN): Likewise.
[!_ISOMAC] (TEST_OFFSET): Likewise.
* sysdeps/ieee754/dbl-64/s_totalorder.c: Include <shlib-compat.h>
and <first-versions.h>.
(__totalorder): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/dbl-64/s_totalordermag.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalordermag): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/dbl-64/wordsize-64/s_totalorder.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalorder): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/dbl-64/wordsize-64/s_totalordermag.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalordermag): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/float128/float128_private.h
(__totalorder_compatl): New macro.
(__totalordermag_compatl): Likewise.
* sysdeps/ieee754/flt-32/s_totalorderf.c: Include <shlib-compat.h>
and <first-versions.h>.
(__totalorderf): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/flt-32/s_totalordermagf.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalordermagf): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/ldbl-128/s_totalorderl.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalorderl): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/ldbl-128/s_totalordermagl.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalordermagl): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/ldbl-128ibm/s_totalorderl.c: Include
<shlib-compat.h>.
(__totalorderl): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/ldbl-128ibm/s_totalordermagl.c: Include
<shlib-compat.h>.
(__totalordermagl): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/ldbl-96/s_totalorderl.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalorderl): Take pointer arguments. Add symbol versions and
compat symbols.
* sysdeps/ieee754/ldbl-96/s_totalordermagl.c: Include
<shlib-compat.h> and <first-versions.h>.
(__totalordermagl): Take pointer arguments. Add symbol versions
and compat symbols.
* sysdeps/ieee754/ldbl-opt/nldbl-totalorder.c (totalorderl): Take
pointer arguments.
* sysdeps/ieee754/ldbl-opt/nldbl-totalordermag.c (totalordermagl):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c
(do_test): Update calls to totalorderl and totalordermagl.
* sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c (do_test):
Update calls to totalorderl and totalordermagl.
* sysdeps/mach/hurd/i386/libm.abilist: Update.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/csky/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
|
|
2019-03-07 Martin Liska <mliska@suse.cz>
* math/Makefile: Change location where math-vector-fortran.h is
installed.
* math/finclude/math-vector-fortran.h: Move from bits/math-vector-fortran.h.
* sysdeps/x86/fpu/finclude/math-vector-fortran.h: Move
from sysdeps/x86/fpu/bits/math-vector-fortran.h.
* scripts/check-installed-headers.sh: Skip Fortran header files.
* scripts/check-wrapper-headers.py: Likewise.
|
|
|
|
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
This patch makes Python 3.4 or later a required tool for building
glibc, so allowing changes of awk, perl etc. code used in the build
and test to Python code without any such changes needing makefile
conditionals or to handle older Python versions.
This patch makes the configure test for Python check the version and
give an error if Python is missing or too old, and removes makefile
conditionals that are no longer needed. It does not itself convert
any code from another language to Python, and does not remove any
compatibility with older Python versions from existing scripts.
Tested for x86_64.
* configure.ac (PYTHON_PROG): Use AC_CHECK_PROG_VER. Set
critic_missing for versions before 3.4.
* configure: Regenerated.
* manual/install.texi (Tools for Compilation): Document
requirement for Python to build glibc.
* INSTALL: Regenerated.
* Rules [PYTHON]: Make code unconditional.
* benchtests/Makefile [PYTHON]: Likewise.
* conform/Makefile [PYTHON]: Likewise.
* manual/Makefile [PYTHON]: Likewise.
* math/Makefile [PYTHON]: Likewise.
|
|
The algorithm is exp(y * log(x)), where log(x) is computed with about
1.3*2^-68 relative error (1.5*2^-68 without fma), returning the result
in two doubles, and the exp part uses the same algorithm (and lookup
tables) as exp, but takes the input as two doubles and a sign (to handle
negative bases with odd integer exponent). The __exp1 internal symbol
is no longer necessary.
There is separate code path when fma is not available but the worst case
error is about 0.54 ULP in both cases. The lookup table and consts for
log are 4168 bytes. The .rodata+.text is decreased by 37908 bytes on
aarch64. The non-nearest rounding error is less than 1 ULP.
Improvements on Cortex-A72 compared to current glibc master:
pow thruput: 2.40x in [0.01 11.1]x[0.01 11.1]
pow latency: 1.84x in [0.01 11.1]x[0.01 11.1]
Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and
arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets.
* NEWS: Mention pow improvements.
* math/Makefile (type-double-routines): Add e_pow_log_data.
* sysdeps/generic/math_private.h (__exp1): Remove.
* sysdeps/i386/fpu/e_pow_log_data.c: New file.
* sysdeps/ia64/fpu/e_pow_log_data.c: New file.
* sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma
contraction.
* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove.
(exp_inline): Remove.
(__ieee754_exp): Only single double input is handled.
* sysdeps/ieee754/dbl-64/e_pow.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file.
* sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define.
(__pow_log_data): Define.
* sysdeps/ieee754/dbl-64/upow.h: Remove.
* sysdeps/ieee754/dbl-64/upow.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file.
* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma
contraction.
(CFLAGS-e_pow-fma4.c): Likewise.
|
|
Similar algorithm is used as in log: log2(2^k x) = k + log2(c) + log2(x/c)
where the last term is approximated by a polynomial of x/c - 1, the first
order coefficient is about 1/ln2 in this case.
There is separate code path when fma instruction is not available for
computing x/c - 1 precisely, for which the table size is doubled.
The worst case error is 0.547 ULP (0.55 without fma), the read only
global data size is 1168 bytes (2192 without fma) on aarch64. The
non-nearest rounding error is less than 1 ULP.
Improvements on Cortex-A72 compared to current glibc master:
log2 thruput: 2.00x in [0.01 11.1]
log2 latency: 2.04x in [0.01 11.1]
log2 thruput: 2.17x in [0.999 1.001]
log2 latency: 2.88x in [0.999 1.001]
Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA)
arm-linux-gnueabihf (!defined __FP_FAST_FMA)
x86_64-linux-gnu (!defined __FP_FAST_FMA)
powerpc64le-linxu-gnu (defined __FP_FAST_FMA)
targets.
* NEWS: Mention log2 improvements.
* math/Makefile (type-double-routines): Add e_log2_data.
* sysdeps/i386/fpu/e_log2_data.c: New file.
* sysdeps/ia64/fpu/e_log2_data.c: New file.
* sysdeps/ieee754/dbl-64/e_log2.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_log2_data.c: New file.
* sysdeps/ieee754/dbl-64/math_config.h (__log2_data): Add.
* sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c: Remove.
* sysdeps/m68k/m680x0/fpu/e_log2_data.c: New file.
|
|
Optimized log using carefully generated lookup table with 1/c and log(c)
values for small intervalls around 1. The log(c) is very near a double
precision value, it has about 62 bits precision. The algorithm is
log(2^k x) = k log(2) + log(c) + log(x/c), where the last term is
approximated by a polynomial of x/c - 1. Near 1 a single polynomial of
x - 1 is used.
There is separate code path when fma instruction is not available for
computing x/c - 1 precisely, in which case the table size is doubled.
The code uses __builtin_fma under __FP_FAST_FMA to ensure it is inlined
as an instruction.
With the default configuration settings the worst case error is 0.519 ULP
(and 0.520 without fma), the rodata size is 2192 bytes (4240 without fma).
The non-nearest rounding error is less than 1 ULP.
Improvements on Cortex-A72 compared to current glibc master:
log thruput: 3.28x in [0.01 11.1]
log latency: 2.23x in [0.01 11.1]
log thruput: 1.56x in [0.999 1.001]
log latency: 1.57x in [0.999 1.001]
Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA)
arm-linux-gnueabihf (!defined __FP_FAST_FMA)
x86_64-linux-gnu (!defined __FP_FAST_FMA)
powerpc64le-linux-gnu (defined __FP_FAST_FMA)
targets.
* NEWS: Mention log improvement.
* math/Makefile (type-double-routines): Add e_log_data.
* sysdeps/i386/fpu/e_log_data.c: New file.
* sysdeps/ia64/fpu/e_log_data.c: New file.
* sysdeps/ieee754/dbl-64/e_log.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_log_data.c: New file.
* sysdeps/ieee754/dbl-64/math_config.h (__log_data): Add.
* sysdeps/ieee754/dbl-64/ulog.h: Remove.
* sysdeps/ieee754/dbl-64/ulog.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_log_data.c: New file.
|
|
Optimized exp and exp2 implementations using a lookup table for
fractional powers of 2. There are several variants, see e_exp_data.c,
they can be selected by modifying math_config.h allowing different
tradeoffs.
The default selection should be acceptable as generic libm code.
Worst case error is 0.509 ULP for exp and 0.507 ULP for exp2, on
aarch64 the rodata size is 2160 bytes, shared between exp and exp2.
On aarch64 .text + .rodata size decreased by 24912 bytes.
The non-nearest rounding error is less than 1 ULP even on targets
without efficient round implementation (although the error rate is
higher in that case). Targets with single instruction, rounding mode
independent, to nearest integer rounding and conversion can use them
by setting TOINT_INTRINSICS and adding the necessary code to their
math_private.h.
The __exp1 code uses the same algorithm, so the error bound of pow
increased a bit.
New double precision error handling code was added following the
style of the single precision error handling code.
Improvements on Cortex-A72 compared to current glibc master:
exp thruput: 1.61x in [-9.9 9.9]
exp latency: 1.53x in [-9.9 9.9]
exp thruput: 1.13x in [0.5 1]
exp latency: 1.30x in [0.5 1]
exp2 thruput: 2.03x in [-9.9 9.9]
exp2 latency: 1.64x in [-9.9 9.9]
For small (< 1) inputs the current exp code uses a separate algorithm
so the speed up there is less.
Was tested on
aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and
arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction) and
x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and
powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets,
only non-nearest rounding ulp errors increase and they are within
acceptable bounds (ulp updates are in separate patches).
* NEWS: Mention exp and exp2 improvements.
* math/Makefile (libm-support): Remove t_exp.
(type-double-routines): Add math_err and e_exp_data.
* sysdeps/aarch64/libm-test-ulps: Update.
* sysdeps/arm/libm-test-ulps: Update.
* sysdeps/i386/fpu/e_exp_data.c: New file.
* sysdeps/i386/fpu/math_err.c: New file.
* sysdeps/i386/fpu/t_exp.c: Remove.
* sysdeps/ia64/fpu/e_exp_data.c: New file.
* sysdeps/ia64/fpu/math_err.c: New file.
* sysdeps/ia64/fpu/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/e_exp.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp_data.c: New file.
* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound.
* sysdeps/ieee754/dbl-64/eexp.tbl: Remove.
* sysdeps/ieee754/dbl-64/math_config.h: New file.
* sysdeps/ieee754/dbl-64/math_err.c: New file.
* sysdeps/ieee754/dbl-64/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/t_exp2.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file.
* sysdeps/m68k/m680x0/fpu/math_err.c: New file.
* sysdeps/m68k/m680x0/fpu/t_exp.c: Remove.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
|
|
Remove empty files due to the sin/cos improvements: k_sinf.c, k_cosf.c,
k_cos.c, k_sin.c. After the tanf change s_rem_pio2f.c and k_rem_pio2f.c
(and the ia64, m68k and powerpc equivalents) are no longer used,
so remove them. All e_rem_pio2.c files were already empty or commented
out, so remove them too. Passes build-many-glibcs.
* math/Makefile: Remove empty files k_sin(f).c, k_cos(f).c.
Remove unused files e_rem_pio2(f).c, k_rem_pio2f.c.
* sysdeps/i386/fpu/e_rem_pio2.c: Delete file.
* sysdeps/ia64/fpu/e_rem_pio2.c: Likewise.
* sysdeps/ia64/fpu/e_rem_pio2f.c: Likewise.
* sysdeps/ia64/fpu/k_rem_pio2f.c: Likewise.
* sysdeps/ieee754/dbl-64/e_rem_pio2.c: Likewise.
* sysdeps/ieee754/dbl-64/k_cos.c: Likewise.
* sysdeps/ieee754/dbl-64/k_sin.c: Likewise.
* sysdeps/ieee754/flt-32/e_rem_pio2f.c: Likewise.
* sysdeps/ieee754/flt-32/k_cosf.c: Likewise.
* sysdeps/ieee754/flt-32/k_rem_pio2f.c: Likewise.
* sysdeps/ieee754/flt-32/k_sinf.c: Likewise.
* sysdeps/m68k/m680x0/fpu/e_rem_pio2.c: Likewise
* sysdeps/m68k/m680x0/fpu/e_rem_pio2f.c: Likewise
* sysdeps/m68k/m680x0/fpu/k_rem_pio2f.c: Likewise
* sysdeps/powerpc/fpu/e_rem_pio2f.c: Likewise.
* sysdeps/powerpc/fpu/k_rem_pio2f.c: Likewise.
|
|
This patch is a complete rewrite of sincosf. The new version is
significantly faster, as well as simple and accurate.
The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over
all 4 billion inputs. In non-nearest rounding modes the error is 1ULP.
The algorithm uses 3 main cases: small inputs which don't need argument
reduction, small inputs which need a simple range reduction and large inputs
requiring complex range reduction. The code uses approximate integer
comparisons to quickly decide between these cases.
The small range reducer uses a single reduction step to handle values up to
120.0. It is fastest on targets which support inlined round instructions.
The large range reducer uses integer arithmetic for simplicity. It does a
32x96 bit multiply to compute a 64-bit modulo result. This is more than
accurate enough to handle the worst-case cancellation for values close to
an integer multiple of PI/4. It could be further optimized, however it is
already much faster than necessary.
sincosf throughput gains on Cortex-A72:
* |x| < 0x1p-12 : 1.6x
* |x| < M_PI_4 : 1.7x
* |x| < 2 * M_PI: 1.5x
* |x| < 120.0 : 1.8x
* |x| < Inf : 2.3x
* math/Makefile: Add s_sincosf_data.c.
* sysdeps/ia64/fpu/s_sincosf_data.c: New file.
* sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function.
(sincosf_poly): Likewise.
(reduce_small): Likewise.
(reduce_large): Likewise.
* sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite.
* sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data.
* sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file.
* sysdeps/x86_64/fpu/s_sincosf_data.c: New file.
|
|
Following the recent discussion of using Python instead of Perl and
Awk for glibc build / test, this patch replaces gen-libm-test.pl with
a new gen-libm-test.py script. This script should work with all
Python versions supported by glibc (tested by hand with Python 2.7,
tested in the build system with Python 3.5; configure prefers Python 3
if available).
This script is designed to give identical output to gen-libm-test.pl
for ease of verification of the change, except for generated comments
referring to .py instead of .pl. (That is, identical for actual
inputs passed to the script, not necessarily for all possible input;
for example, this version more precisely follows the C standard syntax
for floating-point constants when deciding when to add LIT macro
calls.) In one place a comment notes that the generation of
NON_FINITE flags is replicating a bug in the Perl script to assist in
such comparisons (with the expectation that this bug can then be
separately fixed in the Python script later).
Tested for x86_64, including comparison of generated files (and hand
testing of the case of generating a sorted libm-test-ulps file, which
isn't covered by normal "make check").
I'd expect to follow this up by extending the new script to produce
the ulps tables for the manual as well (replacing
manual/libm-err-tab.pl, so that then we just have one ulps file
parser) - at which point the manual build would depend on both Perl
and Python (eliminating the Perl dependency would require someone to
rewrite summary.pl in Python, and that would only eliminate the
*direct* Perl dependency; current makeinfo is written in Perl so there
would still be an indirect dependency).
I think install.texi is more or less equally out-of-date regarding
Perl and Python uses before and after this patch, so I don't think
this patch depends on my patch
<https://sourceware.org/ml/libc-alpha/2018-08/msg00133.html> to update
install.texi regarding such uses (pending review).
* math/gen-libm-test.py: New file.
* math/gen-libm-test.pl: Remove.
* math/Makefile [$(PERL) != no]: Change condition to [PYTHON].
($(objpfx)libm-test-ulps.h): Use gen-libm-test.py instead of
gen-libm-test.pl.
($(libm-test-c-noauto-obj)): Likewise.
($(libm-test-c-auto-obj)): Likewise.
($(libm-test-c-narrow-obj)): Likewise.
(regen-ulps): Likewise.
* math/README.libm-test: Update references to gen-libm-test.pl.
* math/libm-test-driver.c (struct test_fj_f_data): Update comment
referencing gen-libm-test.pl.
* math/libm-test-nexttoward.inc (nexttoward_test_data): Likewise.
* math/libm-test-support.c: Likewise.
* math/libm-test-support.h: Likewise.
* sysdeps/generic/libm-test-ulps: Likewise.
|
|
Create a template for significand.
* math/Makefile (libm-calls): Move s_significandF to...
(gen-libm-calls): ... here.
* math/s_significand_template.c: New file.
* math/s_significand.c: Removed.
* math/s_significandf.c: Removed.
* math/s_significandl.c: Removed.
* sysdeps/ieee754/ldbl-opt/s_significand.c: Removed.
* sysdeps/ieee754/ldbl-opt/s_significandl.c: Removed.
Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
|
|
As in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86113 for
__builtin_nan, bits/mathcalls.h wrongly declares the nan function with
the __const__ attribute. Because the function reads memory pointed to
by an argument, it's only pure, not const. This patch removes the
incorrect attribute and adds a testcase for the bug. No __pure__
attribute is added to replace the incorrect __const__ one, since that
would introduce problems when using GCC versions that have the
incorrect built-in __const__ attribute and warn for the combination of
those two attributes.
Tested for x86_64.
[BZ #23277]
* math/bits/mathcalls.h [__USE_ISOC99] (nan): Do not use __const__
attribute.
* math/test-nan-const.c: New file.
* math/Makefile (tests): Add test-nan-const.
(CFLAGS-test-nan-const.c): New variable.
|
|
It has been noted that test-tgmath3 is slow to compile, and to link on
some systems
<https://sourceware.org/ml/libc-alpha/2018-02/msg00477.html>, because
of the size of the test.
I'm working on tgmath.h support for the TS 18661-1 / 18661-3 functions
that round their results to a narrower type. For the functions
already present in glibc, this wouldn't make test-tgmath3 much bigger,
because those functions only have two arguments. For the narrowing
versions of fma (for which I've not yet added the functions to glibc),
however, it would result in many configurations building tests of the
type-generic macros f32fma, f64fma, f32xfma, f64xfma, each with 21
possible types for each of three arguments (float, double, long double
aren't valid argument types for these macros when they return a
_FloatN / _FloatNx type), so substantially increasing the size of the
testcase.
To avoid further increasing the size of a single test when adding the
type-generic narrowing fma macros, this patch arranges for the
test-tgmath3 tests to be run separately for each function tested. The
fma tests are still by far the largest (next is pow, as that has two
arguments that can be real or complex; after that, the two-argument
real-only functions), but each type-generic fma macro for a different
return type would end up with its tests being run separately, rather
than increasing the size of a single test.
To avoid accidentally missing testing a macro because
gen-tgmath-tests.py supports testing it but the makefile fails to call
it for that function, a test is also added that verifies that the
lists of macros in the makefile and gen-tgmath-tests.py agree.
Tested for x86_64.
* math/gen-tgmath-tests.py: Import sys.
(Tests.__init__): Initialize macros_seen.
(Tests.add_tests): Add macro to macros_seen. Only generate tests
if requested to do so for this macro.
(Tests.add_all_tests): Take argument for macro for which to
generate tests.
(Tests.check_macro_list): New function.
(main): Handle check-list argument and argument specifying macro
for which to generate tests.
* math/Makefile [PYTHON] (tgmath3-macros): New variable.
[PYTHON] (tgmath3-macro-tests): Likewise.
[PYTHON] (tests): Add $(tgmath3-macro-tests) not test-tgmath3.
[PYTHON] (generated): Add $(addsuffix .c,$(tgmath3-macro-tests))
not test-tgmath3.c.
[PYTHON] (CFLAGS-test-tgmath3.c): Remove.
[PYTHON] ($(tgmath3-macro-tests:%=$(objpfx)%.o): Add -fno-builtin
to CFLAGS.
[PYTHON] ($(objpfx)test-tgmath3.c): Replace rule by....
[PYTHON] ($(foreach
m,$(tgmath3-macros),$(objpfx)test-tgmath3-$(m).c): ... this. New
rule.
[PYTHON] (tests-special): Add
$(objpfx)test-tgmath3-macro-list.out.
[PYTHON] ($(objpfx)test-tgmath3-macro-list.out): New rule.
|
|
This patch adds the narrowing divide functions from TS 18661-1 to
glibc's libm: fdiv, fdivl, ddivl, f32divf64, f32divf32x, f32xdivf64
for all configurations; f32divf64x, f32divf128, f64divf64x,
f64divf128, f32xdivf64x, f32xdivf128, f64xdivf128 for configurations
with _Float64x and _Float128; __nldbl_ddivl for ldbl-opt.
The changes are mostly essentially the same as for the other narrowing
functions, so the description of those generally applies to this patch
as well.
Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.
* math/Makefile (libm-narrow-fns): Add div.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing divide functions.
* math/bits/mathcalls-narrow.h (div): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add div.
* math/math-narrow.h (CHECK_NARROW_DIV): New macro.
(NARROW_DIV_ROUND_TO_ODD): Likewise.
(NARROW_DIV_TRIVIAL): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__fdivl): New
macro.
(__ddivl): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fdiv and
ddiv.
(CFLAGS-nldbl-ddiv.c): New variable.
(CFLAGS-nldbl-fdiv.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_ddivl.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_ddivl): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fdiv, fdivl,
ddivl, fMdivfN, fMdivfNx, fMxdivfN and fMxdivfNx.
* math/auto-libm-test-in: Add tests of div.
* math/auto-libm-test-out-narrow-div: New generated file.
* math/libm-test-narrow-div.inc: New file.
* sysdeps/i386/fpu/s_f32xdivf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xdivf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fdiv.c: Likewise.
* sysdeps/ieee754/float128/s_f32divf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64divf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xdivf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_ddivl.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xdivf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_fdivl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_ddivl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fdivl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_ddivl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fdivl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-ddiv.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fdiv.c: Likewise.
* sysdeps/ieee754/soft-fp/s_ddivl.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fdiv.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fdivl.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
|
|
This patch adds the narrowing multiply functions from TS 18661-1 to
glibc's libm: fmul, fmull, dmull, f32mulf64, f32mulf32x, f32xmulf64
for all configurations; f32mulf64x, f32mulf128, f64mulf64x,
f64mulf128, f32xmulf64x, f32xmulf128, f64xmulf128 for configurations
with _Float64x and _Float128; __nldbl_dmull for ldbl-opt.
The changes are mostly essentially the same as for the narrowing add
functions, so the description of those generally applies to this patch
as well. f32xmulf64 for i386 cannot use precision control as used for
add and subtract, because that would result in double rounding for
subnormal results, so that uses round-to-odd with long double
intermediate result instead. The soft-fp support involves adding a
new FP_TRUNC_COOKED since soft-fp multiplication uses cooked inputs
and outputs.
Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.
* math/Makefile (libm-narrow-fns): Add mul.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing multiply functions.
* math/bits/mathcalls-narrow.h (mul): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add mul.
* math/math-narrow.h (CHECK_NARROW_MUL): New macro.
(NARROW_MUL_ROUND_TO_ODD): Likewise.
(NARROW_MUL_TRIVIAL): Likewise.
* soft-fp/op-common.h (FP_TRUNC_COOKED): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__fmull): New
macro.
(__dmull): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fmul and
dmul.
(CFLAGS-nldbl-dmul.c): New variable.
(CFLAGS-nldbl-fmul.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_dmull.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_dmull): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fmul, fmull,
dmull, fMmulfN, fMmulfNx, fMxmulfN and fMxmulfNx.
* math/auto-libm-test-in: Add tests of mul.
* math/auto-libm-test-out-narrow-mul: New generated file.
* math/libm-test-narrow-mul.inc: New file.
* sysdeps/i386/fpu/s_f32xmulf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xmulf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fmul.c: Likewise.
* sysdeps/ieee754/float128/s_f32mulf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64mulf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xmulf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_dmull.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xmulf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_fmull.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_dmull.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fmull.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_dmull.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fmull.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-dmul.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fmul.c: Likewise.
* sysdeps/ieee754/soft-fp/s_dmull.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fmul.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fmull.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
|
|
This patch adds the narrowing subtract functions from TS 18661-1 to
glibc's libm: fsub, fsubl, dsubl, f32subf64, f32subf32x, f32xsubf64
for all configurations; f32subf64x, f32subf128, f64subf64x,
f64subf128, f32xsubf64x, f32xsubf128, f64xsubf128 for configurations
with _Float64x and _Float128; __nldbl_dsubl for ldbl-opt.
The changes are essentially the same as for the narrowing add
functions, so the description of those generally applies to this patch
as well.
Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.
* math/Makefile (libm-narrow-fns): Add sub.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing subtract functions.
* math/bits/mathcalls-narrow.h (sub): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add sub.
* math/math-narrow.h (CHECK_NARROW_SUB): New macro.
(NARROW_SUB_ROUND_TO_ODD): Likewise.
(NARROW_SUB_TRIVIAL): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__fsubl): New
macro.
(__dsubl): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fsub and
dsub.
(CFLAGS-nldbl-dsub.c): New variable.
(CFLAGS-nldbl-fsub.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_dsubl.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_dsubl): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fsub, fsubl,
dsubl, fMsubfN, fMsubfNx, fMxsubfN and fMxsubfNx.
* math/auto-libm-test-in: Add tests of sub.
* math/auto-libm-test-out-narrow-sub: New generated file.
* math/libm-test-narrow-sub.inc: New file.
* sysdeps/i386/fpu/s_f32xsubf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xsubf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fsub.c: Likewise.
* sysdeps/ieee754/float128/s_f32subf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64subf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xsubf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xsubf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-dsub.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fsub.c: Likewise.
* sysdeps/ieee754/soft-fp/s_dsubl.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fsub.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fsubl.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
|
|
Remove the now unused mplog and mpexp files.
* math/Makefile: Remove mpexp.c and mplog.c
* sysdeps/i386/fpu/mpexp.c: Delete file.
* sysdeps/i386/fpu/mplog.c: Likewise.
* sysdeps/ia64/fpu/mpexp.c: Likewise.
* sysdeps/ia64/fpu/mplog.c: Likewise.
* sysdeps/ieee754/dbl-64/e_exp.c: Remove mention of mpexp and mplog.
* sysdeps/ieee754/dbl-64/mpa.h (__pow_mp): Remove unused function.
* sysdeps/ieee754/dbl-64/mpexp.c: Delete file.
* sysdeps/ieee754/dbl-64/mplog.c: Likewise.
* sysdeps/m68k/m680x0/fpu/mpexp.c: Likewise.
* sysdeps/m68k/m680x0/fpu/mplog.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/Makefile: Remove mpexp* and mplog*.
* sysdeps/x86_64/fpu/multiarch/e_log-avx.c: Remove unused defines.
* sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/e_log-fma4.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/mpexp-avx.c: Delete file.
* sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/mpexp-fma4.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/mplog-avx.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/mplog-fma4.c: Likewise.
|
|
Remove the __slowexp code, so exp is no longer correctly rounded. The
result is computed to about 70 bits precision so the worst case ulp
error is about 0.500007 in nearest rounding mode.
* manual/probes.texi: Remove slowexp probes.
* math/Makefile: Remove slowexp.
* sysdeps/generic/math_private.h (__slowexp): Remove.
* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Remove __slowexp and
document error bounds.
* sysdeps/i386/fpu/slowexp.c: Remove.
* sysdeps/ia64/fpu/slowexp.c: Remove.
* sysdeps/ieee754/dbl-64/slowexp.c: Remove.
* sysdeps/ieee754/dbl-64/uexp.h (err_0): Remove.
* sysdeps/m68k/m680x0/fpu/slowexp.c: Remove.
* sysdeps/powerpc/power4/fpu/Makefile (CPPFLAGS-slowexp.c): Remove.
* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowexp-fma.
* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Remove.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Remove.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Remove.
* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Remove.
* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Remove.
* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Remove.
|
|
Remove the slow paths from pow. Like several other double precision math
functions, pow is exactly rounded. This is not required from math functions
and causes major overheads as it requires multiple fallbacks using higher
precision arithmetic if a result is close to 0.5ULP. Ridiculous slowdowns
of up to 100000x have been reported when the highest precision path triggers.
All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1).
The worst case error is ~0.506ULP. A simple test over a few hundred million
values shows pow is 10% faster on average. This fixes BZ #13932.
[BZ #13932]
* sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove.
* benchtests/pow-inputs: Update comment for slow path cases.
* manual/probes.texi (slowpow_p10): Delete removed probe.
(slowpow_p10): Likewise.
* math/Makefile: Remove halfulp.c and slowpow.c.
* sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1.
* sysdeps/generic/math_private.h (__exp1): Remove error argument.
(__halfulp): Remove.
(__slowpow): Remove.
* sysdeps/i386/fpu/halfulp.c: Delete file.
* sysdeps/i386/fpu/slowpow.c: Likewise.
* sysdeps/ia64/fpu/halfulp.c: Likewise.
* sysdeps/ia64/fpu/slowpow.c: Likewise.
* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument,
improve comments and add error analysis.
* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis.
(power1): Remove function:
(log1): Remove error argument, add error analysis.
(my_log2): Remove function.
* sysdeps/ieee754/dbl-64/halfulp.c: Delete file.
* sysdeps/ieee754/dbl-64/slowpow.c: Likewise.
* sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise.
* sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise.
* sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c.
* sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1.
* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c,
slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c.
* sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define.
* sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise.
* sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file.
* sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise.
|
|
This patch adds the narrowing add functions from TS 18661-1 to glibc's
libm: fadd, faddl, daddl, f32addf64, f32addf32x, f32xaddf64 for all
configurations; f32addf64x, f32addf128, f64addf64x, f64addf128,
f32xaddf64x, f32xaddf128, f64xaddf128 for configurations with
_Float64x and _Float128; __nldbl_daddl for ldbl-opt. As discussed for
the build infrastructure patch, tgmath.h support is deliberately
deferred, and FP_FAST_* macros are not applicable without optimized
function implementations.
Function implementations are added for all relevant pairs of formats
(including certain cases of a format and itself where more than one
type has that format). The main implementations use round-to-odd, or
a trivial computation in the case where both formats are the same or
where the wider format is IBM long double (in which case we don't
attempt to be correctly rounding). The sysdeps/ieee754/soft-fp
implementations use soft-fp, and are used automatically for
configurations without exceptions and rounding modes by virtue of
existing Implies files. As previously discussed, optimized versions
for particular architectures are possible, but not included.
i386 gets a special version of f32xaddf64 to avoid problems with
double rounding (similar to the existing fdim version), since this
function must round just once without an intermediate rounding to long
double. (No such special version is needed for any other function,
because the nontrivial functions use round-to-odd, which does the
intermediate computation with the rounding mode set to round-to-zero,
and double rounding is OK except in round-to-nearest mode, so is OK
for that intermediate round-to-zero computation.) mul and div will
need slightly different special versions for i386 (using round-to-odd
on long double instead of precision control) because of the
possibility of inexact intermediate results in the subnormal range for
double.
To reduce duplication among the different function implementations,
math-narrow.h gets macros CHECK_NARROW_ADD, NARROW_ADD_ROUND_TO_ODD
and NARROW_ADD_TRIVIAL.
In the trivial cases and for any architecture-specific optimized
implementations, the overhead of the errno setting might be
significant, but I think that's best handled through compiler built-in
functions rather than providing separate no-errno versions in glibc
(and likewise there are no __*_finite entry points for these function
provided, __*_finite effectively being no-errno versions at present in
most cases).
Tested for x86_64 and x86, with both GCC 6 and GCC 7. Tested for
mips64 (all three ABIs, both hard and soft float) and powerpc with GCC
7. Tested with build-many-glibcs.py with both GCC 6 and GCC 7.
* math/Makefile (libm-narrow-fns): Add add.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing add functions.
* math/bits/mathcalls-narrow.h (add): Use __MATHCALL_NARROW .
* math/gen-auto-libm-tests.c (test_functions): Add add.
* math/math-narrow.h (CHECK_NARROW_ADD): New macro.
(NARROW_ADD_ROUND_TO_ODD): Likewise.
(NARROW_ADD_TRIVIAL): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__faddl): New
macro.
(__daddl): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fadd and
dadd.
(CFLAGS-nldbl-dadd.c): New variable.
(CFLAGS-nldbl-fadd.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_daddl.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_daddl): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fadd, faddl,
daddl, fMaddfN, fMaddfNx, fMxaddfN and fMxaddfNx.
* math/auto-libm-test-in: Add tests of add.
* math/auto-libm-test-out-narrow-add: New generated file.
* math/libm-test-narrow-add.inc: New file.
* sysdeps/i386/fpu/s_f32xaddf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xaddf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fadd.c: Likewise.
* sysdeps/ieee754/float128/s_f32addf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64addf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xaddf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xaddf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-dadd.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fadd.c: Likewise.
* sysdeps/ieee754/soft-fp/s_daddl.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fadd.c: Likewise.
* sysdeps/ieee754/soft-fp/s_faddl.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
|
|
This patch continues preparations for adding TS 18661-1 narrowing libm
functions by adding the required testsuite infrastructure to test such
functions through the libm-test infrastructure.
That infrastructure is based around testing for a single type, FLOAT.
For the narrowing functions, FLOAT, the "main" type for testing, is
the function return type; the argument type is ARG_FLOAT. This is
consistent with how the code built once for each type,
libm-test-support.c, depends on FLOAT for such things as calculating
ulps errors in results but can already handle different argument types
(pointers, integers, long double for nexttoward).
Makefile machinery is added to handle building tests for all pairs of
types for which there are narrowing functions (as with non-narrowing
functions, aliases are tested just the same as the functions they
alias). gen-auto-libm-tests gains a --narrow option for building
outputs for narrowing functions (so narrowing sqrt and fma will share
the same inputs as non-narrowing, but gen-auto-libm-tests will be run
with and without that option to generate different output files). In
the narrowing case, the auto-libm-test-out-narrow-* files include
annotations for each test about what properties ARG_FLOAT must have to
be able to represent all the inputs for that test; those annotations
result in calls to the TEST_COND_arg_fmt macro.
gen-libm-test.pl has some minor updates to handle narrowing tests (for
example, arguments in such tests must be surrounded by ARG_LIT calls
instead of LIT calls). Various new macros are added to the C test
support code (for example, sNaN initializers need to be properly
typed, so arg_snan_value is added; other such arg_* macros are added
as it seems cleanest to do so, though some are not strictly required).
Special-casing of the ibm128 format to allow for its limitations is
adjusted to handle it as the argument format as well as as the result
format; thus, the tests of the new functions allow nonzero ulps only
in the case where ibm128 is the argument format, as otherwise the
functions correspond to fully-defined IEEE operations. The ulps in
question appear as e.g. 'Function: "add_ldouble"' in libm-test-ulps
(with 1ulp errors then listed for double and float for that function
in powerpc); no support is added to generate corresponding faddl /
daddl ulps listings in the ulps table in the manual.
For the previous patch, I noted the need to avoid spurious macro
expansions of identifiers such as "add". A test test-narrow-macros.c
is added to verify such macro expansions are successfully avoided, and
there is also a -mlong-double-64 version of that test for ldbl-opt.
This test is set up to cover the full set of relevant identifiers from
the start rather than adding functions one at a time as each function
group is added.
Tested for x86_64 (this patch in isolation, as well as testing for
various configurations in conjunction with the actual addition of
"add" functions).
* math/Makefile (test-type-pairs): New variable.
(test-type-pairs-f64xf128-yes): Likewise.
(tests): Add test-narrow-macros.
(libm-test-funcs-narrow): New variable.
(libm-test-c-narrow): Likewise.
(generated): Add $(libm-test-c-narrow).
(libm-tests-base-narrow): New variable.
(libm-tests-narrow): Likewise.
(libm-tests): Add $(libm-tests-narrow).
(libm-tests-for-type): Handle $(libm-tests-narrow).
(libm-test-c-narrow-obj): New variable.
($(libm-test-c-narrow-obj)): New rule.
($(foreach t,$(libm-tests-narrow),$(objpfx)$(t).c)): Likewise.
($(foreach f,$(libm-test-funcs-narrow),$(objpfx)$(o)-$(f).o)): Use
$(o-iterator) to set dependencies and CFLAGS.
* math/gen-auto-libm-tests.c: Document use for narrowing
functions.
(output_for_one_input_case): Take argument NARROW.
(generate_output): Likewise. Update call to
output_for_one_input_case.
(main): Take --narrow option. Update call to generate_output.
* math/gen-libm-test.pl (_apply_lit): Take macro name as argument.
(apply_lit): Update call to _apply_lit.
(apply_arglit): New function.
(parse_args): Handle "a" arguments.
(parse_auto_input): Handle format names using ":".
* math/README.libm-test: Document "a" parameter type.
* math/libm-test-support.h (ARG_TYPE_MIN): New macro.
(ARG_TYPE_TRUE_MIN): Likewise.
(ARG_TYPE_MAX): Likwise.
(ARG_MIN_EXP): Likewise.
(ARG_MAX_EXP): Likewise.
(ARG_MANT_DIG): Likewise.
(TEST_COND_arg_ibm128): Likewise.
(TEST_COND_ibm128_libgcc): Define conditional on [ARG_FLOAT].
(TEST_COND_arg_fmt): New macro.
(init_max_error): Update prototype.
* math/libm-test-support.c (test_ibm128): New variable.
(init_max_error): Take argument testing_ibm128 and set test_ibm128
instead of using [TEST_COND_ibm128] conditional.
(test_exceptions): Use test_ibm128 instead of TEST_COND_ibm128.
* math/libm-test-driver.c (STR_ARG_FLOAT): New macro.
[TEST_NARROW] (TEST_MSG): New definition.
(arg_plus_zero): New macro.
(arg_minus_zero): Likewise.
(arg_plus_infty): Likewise.
(arg_minus_infty): Likewise.
(arg_qnan_value_pl): Likewise.
(arg_qnan_value): Likewise.
(arg_snan_value_pl): Likewise.
(arg_snan_value): Likewise.
(arg_max_value): Likewise.
(arg_min_value): Likewise.
(arg_min_subnorm_value): Likewise.
[ARG_FLOAT] (struct test_aa_f_data): New struct type.
(RUN_TEST_LOOP_aa_f): New macro.
(TEST_SUFF): New macro.
(TEST_SUFF_STR): Likewise.
[!TEST_MATHVEC] (VEC_SUFF): Don't define.
(TEST_COND_any_ibm128): New macro.
(START): Use TEST_SUFF and TEST_SUFF_STR in initializer for
this_func. Update call to init_max_error.
* math/test-double.h (FUNC_NARROW_PREFIX): New macro.
* math/test-float.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-float128.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-float32.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-float32x.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-float64.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-float64x.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-math-scalar.h (TEST_NARROW): Likewise.
* math/test-math-vector.h (TEST_NARROW): Likewise.
* math/test-arg-double.h: New file.
* math/test-arg-float128.h: Likewise.
* math/test-arg-float32x.h: Likewise.
* math/test-arg-float64.h: Likewise.
* math/test-arg-float64x.h: Likewise.
* math/test-arg-ldouble.h: Likewise.
* math/test-math-narrow.h: Likewise.
* math/test-narrow-macros.c: Likewise.
* sysdeps/ieee754/ldbl-opt/test-narrow-macros-ldbl-64.c: Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (tests): Add
test-narrow-macros-ldbl-64.
(CFLAGS-test-narrow-macros-ldbl-64.c): New variable.
|
|
TS 18661-1 defines libm functions that carry out an operation (+ - * /
sqrt fma) on their arguments and return a result rounded to a
(usually) narrower type, as if the original result were computed to
infinite precision and then rounded directly to the result type
without any intermediate rounding to the argument type. For example,
fadd, faddl and daddl for addition. These are the last remaining TS
18661-1 functions left to be added to glibc. TS 18661-3 extends this
to corresponding functions for _FloatN and _FloatNx types.
As functions parametrized by two rather than one varying
floating-point types, these functions require infrastructure in glibc
that was not required for previous libm functions. This patch
provides such infrastructure - excluding test support, and actual
function implementations, which will be in subsequent patches.
Declaring the functions uses a header bits/mathcalls-narrow.h, which
is included many times, for each relevant pair of types. This will
end up containing macro calls of the form
__MATHCALL_NARROW (__MATHCALL_NAME (add), __MATHCALL_REDIR_NAME (add), 2);
for each family of narrowing functions. (The structure of this macro
call, with the calls to __MATHCALL_NAME and __MATHCALL_REDIR_NAME
there rather than in the definition of __MATHCALL_NARROW, arises from
the names such as "add" *not* themselves being reserved identifiers -
meaning it's necessary to avoid any indirection that would result in a
user-defined "add" macro being expanded.) Whereas for existing
functions declaring long double functions is disabled if _LIBC in the
case where they alias double functions, to facilitate defining the
long double functions as aliases of the double ones, there is no such
logic for the narrowing functions in this patch. Rather, the files
defining such functions are expected to use #define to hide the
original declarations of the alias names, to avoid errors about
defining aliases with incompatible types.
math/Makefile support is added for building the functions (listed in
libm-narrow-fns, currently empty) for all relevant pairs of types. An
internal header math-narrow.h is added for macros shared between
multiple function implementations - currently a ROUND_TO_ODD macro to
facilitate writing functions using the round-to-odd implementation
approach, and alias macros to create all the required function
aliases. libc_feholdexcept_setroundf128 and libc_feupdateenv_testf128
are added for use when required (only for x86_64). float128_private.h
support is added for ldbl-128 narrowing functions to be used for
_Float128.
Certain things are specifically omitted from this patch and the
immediate followups. tgmath.h support is deferred; there remain
unresolved questions about how the type-generic macros for these
functions are supposed to work, especially in the case of arguments of
integer type. The math.h / bits/mathcalls-narrow.h logic, and the
logic for determining what functions / aliases to define, will need
some adjustments to support the sqrt and fma functions, where
e.g. f32xsqrtf64 can just be an alias for sqrt rather than a separate
function. TS 18661-1 defines FP_FAST_* macros but no support is
included for defining them (they won't in general be true without
architecture-specific optimized function versions).
For each of the function groups (add sub mul div sqrt fma) there are
always six functions present (e.g. fadd, faddl, daddl, f32addf64,
f32addf32x, f32xaddf64). When _Float64x and _Float128 are supported,
there are seven more (e.g. f32addf64x, f32addf128, f64addf64x,
f64addf128, f32xaddf64x, f32xaddf128, f64xaddf128). In addition, in
the ldbl-opt case there are function names such as __nldbl_daddl (an
alias for f32xaddf64, which is not a reserved name in TS 18661-1, only
in TS 18661-3), for calls to daddl to be mapped to in the
-mlong-double-64 case. (Calls to faddl just get mapped to fadd, and
for sqrt and fma there won't be __nldbl_* functions because dsqrtl and
dfmal can just be mapped to sqrt and fma with -mlong-double-64.)
While there are six or thirteen functions present in each group (plus
__nldbl_* names only as an ABI, not an API), not all are distinct;
they fall in various groups of aliases. There are two distinct
versions built if long double has the same format as double; four if
they have distinct formats but there is no _Float64x or _Float128
support; five if long double has binary128 format; seven when
_Float128 is distinct from long double.
Architecture-specific optimized versions are possible, but not
included in my patches. For example, IA64 generally supports
narrowing the result of most floating-point instructions; Power ISA
2.07 (POWER8) supports double values as arguments to float
instructions, with the results narrowed as expected; Power ISA 3
(POWER9) supports round-to-odd for float128 instructions, so meaning
that approach can be used without needing to set and restore the
rounding mode and test "inexact". I intend to leave any such
optimized versions to the architecture maintainers. Generally in such
cases it would also make sense for calls to these functions to be
expanded inline (given -fno-math-errno); I put a suggestion for TS
18661-1 built-in functions at <https://gcc.gnu.org/wiki/SummerOfCode>.
Tested for x86_64 (this patch in isolation, as well as testing for
various configurations in conjunction with further patches).
* math/bits/mathcalls-narrow.h: New file.
* include/bits/mathcalls-narrow.h: Likewise.
* math/math-narrow.h: Likewise.
* math/math.h (__MATHCALL_NARROW_ARGS_1): New macro.
(__MATHCALL_NARROW_ARGS_2): Likewise.
(__MATHCALL_NARROW_ARGS_3): Likewise.
(__MATHCALL_NARROW_NORMAL): Likewise.
(__MATHCALL_NARROW_REDIR): Likewise.
(__MATHCALL_NARROW): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)]: Repeatedly include
<bits/mathcalls-narrow.h> with _Mret_, _Marg_ and __MATHCALL_NAME
defined.
[__GLIBC_USE (IEC_60559_TYPES_EXT)]: Likewise.
* math/Makefile (headers): Add bits/mathcalls-narrow.h.
(libm-narrow-fns): New variable.
(libm-narrow-types-basic): Likewise.
(libm-narrow-types-ldouble-yes): Likewise.
(libm-narrow-types-float128-yes): Likewise.
(libm-narrow-types-float128-alias-yes): Likewise.
(libm-narrow-types): Likewise.
(libm-routines): Add narrowing functions.
* sysdeps/i386/fpu/fenv_private.h [__x86_64__]
(libc_feholdexcept_setroundf128): New macro.
[__x86_64__] (libc_feupdateenv_testf128): Likewise.
* sysdeps/ieee754/float128/float128_private.h: Include
<math/math-narrow.h>.
[libc_feholdexcept_setroundf128] (libc_feholdexcept_setroundl):
Undefine and redefine.
[libc_feupdateenv_testf128] (libc_feupdateenv_testl): Likewise.
(libm_alias_float_ldouble): Undefine and redefine.
(libm_alias_double_ldouble): Likewise.
|
|
The math/Makefile variable libm-test-incs was formerly used, but no
longer is. This patch removes it.
Tested for x86_64.
* math/Makefile [$(PERL) != no] (libm-test-incs): Remove variable.
|
|
I found that "make regen-ulps" failed when building with unmodified
GNU make 4.1, and an objdir /some/where/math/ longer than about 37
characters, because the list of tests in the "for run in $^" loop
exceeded the Linux kernel's MAX_ARG_STRLEN limit (131072 bytes) on the
length of a single argument passed to a command.
Some GNU/Linux distributions have a patch to make to work around this
limit (see e.g. Debian bug 688601), but clearly this ought to work
without needing such a patch. This patch arranges for the shell loop
to be over the test names without a $(objdir) prefix, which reduces
the space used to less than half MAX_ARG_STRLEN.
(I think we ought to aim to get rid of bits/mathinline.h completely -
filing GCC bugs for any optimizations GCC can't currently do with
-ffast-math - which would mean we could halve the number of libm tests
run because separate inline function tests would no longer be needed.
However, with a long directory name even half the number of tests
could make this command exceed MAX_ARG_STRLEN without my patch.)
Tested regen-ulps on a system where it failed before this patch.
* math/Makefile (run-regen-ulps): Add $(objpfx) to test name here.
(regen-ulps): Use $(libm-tests) not $^ in shell loop.
|