aboutsummaryrefslogtreecommitdiff
path: root/math
AgeCommit message (Collapse)AuthorFilesLines
8 daysreplace tgammaf by the CORE-MATH implementationPaul Zimmermann1-3/+3
The CORE-MATH implementation is correctly rounded (for any rounding mode). This can be checked by exhaustive tests in a few minutes since there are less than 2^32 values to check against for example GNU MPFR. This patch also adds some bench values for tgammaf. Tested on x86_64 and x86 (cfarm26). With the initial GNU libc code it gave on an Intel(R) Core(TM) i7-8700: "tgammaf": { "": { "duration": 3.50188e+09, "iterations": 2e+07, "max": 602.891, "min": 65.1415, "mean": 175.094 } } With the new code: "tgammaf": { "": { "duration": 3.30825e+09, "iterations": 5e+07, "max": 211.592, "min": 32.0325, "mean": 66.1649 } } With the initial GNU libc code it gave on cfarm26 (i686): "tgammaf": { "": { "duration": 3.70505e+09, "iterations": 6e+06, "max": 2420.23, "min": 243.154, "mean": 617.509 } } With the new code: "tgammaf": { "": { "duration": 3.24497e+09, "iterations": 1.8e+07, "max": 1238.15, "min": 101.155, "mean": 180.276 } } Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Changes in v2: - include <math.h> (fix the linknamespace failures) - restored original benchtests/strcoll-inputs/filelist#en_US.UTF-8 file - restored original wrapper code (math/w_tgammaf_compat.c), except for the dealing with the sign - removed the tgammaf/float entries in all libm-test-ulps files - address other comments from Joseph Myers (https://sourceware.org/pipermail/libc-alpha/2024-July/158736.html) Changes in v3: - pass NULL argument for signgam from w_tgammaf_compat.c - use of math_narrow_eval - added more comments Changes in v4: - initialize local_signgam to 0 in math/w_tgamma_template.c - replace sysdeps/ieee754/dbl-64/gamma_productf.c by dummy file Changes in v5: - do not mention local_signgam any more in math/w_tgammaf_compat.c - initialize local_signgam to 1 instead of 0 in w_tgamma_template.c and added comment Changes in v6: - pass NULL as 2nd argument of __ieee754_gammaf_r in w_tgammaf_compat.c, and check for NULL in e_gammaf_r.c Changes in v7: - added Signed-off-by line for Alexei Sibidanov (author of the code) Changes in v8: - added Signed-off-by line for Paul Zimmermann (submitted of the patch) Changes in v9: - address comments from review by Adhemerval Zanella Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-09-19AArch64: Add vector logp1 alias for log1pJoe Ramsay1-1/+1
This enables vectorisation of C23 logp1, which is an alias for log1p. There are no new tests or ulp entries because the new symbols are simply aliases. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2024-09-04Do not set errno for overflowing NaN payload in strtod/nan (bug 32045)Joseph Myers2-0/+54
As reported in bug 32045, it's incorrect for strtod/nan functions to set errno based on overflowing payload (strtod should only set errno for overflow / underflow of its actual result, and potentially if nothing in the string can be parsed as a number at all; nan should be a pure function that never sets it). Save and restore errno around the internal strtoull call and add associated test coverage. Tested for x86_64.
2024-09-04Improve NaN payload testingJoseph Myers1-5/+54
There are two separate sets of tests of NaN payloads in glibc: * libm-test-{get,set}payload* verify that getpayload, setpayload, setpayloadsig and __builtin_nan functions are consistent in their payload handling. * test-nan-payload verifies that strtod-family functions and the not-built-in nan functions are consistent in their payload handling. Nothing, however, connects the two sets of functions (i.e., verifies that strtod / nan are consistent with getpayload / setpayload / __builtin_nan). Improve test-nan-payload to check actual payload value with getpayload rather than just verifying that the strtod and nan functions produce the same NaN. Also check that the NaNs produced aren't signaling and extend the tests to cover _FloatN / _FloatNx. Tested for x86_64.
2024-08-07added inputs giving large errors on x86_64 for new C23 functionsPaul Zimmermann4-1/+5432
These functions are exp10m1, exp2m1, log10p1, log2p1. Also regenerated ulps on x86_64. For each format, there are 4 values, one for each rounding mode. (For the intel96 format, there are 8 values, 4 for Intel hardware, and 4 for AMD hardware. However, regen-ulps was only run on Intel. It should be run in a separate patch on a AMD x86_64.) Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-07-26support: Add FAIL test failure helperMaciej W. Rozycki1-10/+3
Add a FAIL test failure helper analogous to FAIL_RET, that does not cause the current function to return, providing a standardized way to report a test failure with a message supplied while permitting the caller to continue executing, for further reporting, cleaning up, etc. Update existing test cases that provide a conflicting definition of FAIL by removing the local FAIL definition and then as follows: - tst-fortify-syslog: provide a meaningful message in addition to the file name already added by <support/check.h>; 'support_record_failure' is already called by 'support_print_failure_impl' invoked by the new FAIL test failure helper. - tst-ctype: no update to FAIL calls required, with the name of the file and the line number within of the failure site additionally included by the new FAIL test failure helper, and error counting plus count reporting upon test program termination also already provided by 'support_record_failure' and 'support_report_failure' respectively, called by 'support_print_failure_impl' and 'adjust_exit_status' also respectively. However in a number of places 'printf' is called and the error count adjusted by hand, so update these places to make use of FAIL instead. And last but not least adjust the final summary just to report completion, with any error count following as reported by the test driver. - test-tgmath2: no update to FAIL calls required, with the name of the file of the failure site additionally included by the new FAIL test failure helper. Also there is no need to track the return status by hand as any call to FAIL will eventually cause the test case to return an unsuccesful exit status regardless of the return status from the test function, via a call to 'adjust_exit_status' made by the test driver. Reviewed-by: DJ Delorie <dj@redhat.com>
2024-07-22This patch adds larger ulp errors for the log2p1 function.Paul Zimmermann2-0/+345
Changes in v2: - added larger error for long double on AMD reported by Adhemerval (https://sourceware.org/pipermail/libc-alpha/2024-June/157755.html) Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-06-17Implement C23 exp2m1, exp10m1Joseph Myers14-2/+9302
C23 adds various <math.h> function families originally defined in TS 18661-4. Add the exp2m1 and exp10m1 functions (exp2(x)-1 and exp10(x)-1, like expm1). As with other such functions, these use type-generic templates that could be replaced with faster and more accurate type-specific implementations in future. Test inputs are copied from those for expm1, plus some additions close to the overflow threshold (copied from exp2 and exp10) and also some near the underflow threshold. exp2m1 has the unusual property of having an input (M_MAX_EXP) where whether the function overflows (under IEEE semantics) depends on the rounding mode. Although these could reasonably be XFAILed in the testsuite (as we do in some cases for arguments very close to a function's overflow threshold when an error of a few ulps in the implementation can result in the implementation not agreeing with an ideal one on whether overflow takes place - the testsuite isn't smart enough to handle this automatically), since these functions aren't required to be correctly rounding, I made the implementation check for and handle this case specially. The Makefile ordering expected by lint-makefiles for the new functions is a bit peculiar, but I implemented it in this patch so that the test passes; I don't know why log2 also needed moving in one Makefile variable setting when it didn't in my previous patches, but the failure showed a different place was expected for that function as well. The powerpc64le IFUNC setup seems not to be as self-contained as one might hope; it shouldn't be necessary to add IFUNCs for new functions such as these simply to get them building, but without setting up IFUNCs for the new functions, there were undefined references to __GI___expm1f128 (that IFUNC machinery results in no such function being defined, but doesn't stop include/math.h from doing the redirection resulting in the exp2m1f128 and exp10m1f128 implementations expecting to call it). Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-06-17Implement C23 log10p1Joseph Myers11-1/+3249
C23 adds various <math.h> function families originally defined in TS 18661-4. Add the log10p1 functions (log10(1+x): like log1p, but for base-10 logarithms). This is directly analogous to the log2p1 implementation (except that whereas log2p1 has a smaller underflow range than log1p, log10p1 has a larger underflow range). The test inputs are copied from those for log1p and log2p1, plus a few more inputs in that wider underflow range. Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-06-17Implement C23 logp1Joseph Myers8-5/+39
C23 adds various <math.h> function families originally defined in TS 18661-4. Add the logp1 functions (aliases for log1p functions - the name is intended to be more consistent with the new log2p1 and log10p1, where clearly it would have been very confusing to name those functions log21p and log101p). As aliases rather than new functions, the content of this patch is somewhat different from those actually adding new functions. Tests are shared with log1p, so this patch *does* mechanically update all affected libm-test-ulps files to expect the same errors for both functions. The vector versions of log1p on aarch64 and x86_64 are *not* updated to have logp1 aliases (and thus there are no corresponding header, tests, abilist or ulps changes for vector functions either). It would be reasonable for such vector aliases and corresponding changes to other files to be made separately. For now, the log1p tests instead avoid testing logp1 in the vector case (a Makefile change is needed to avoid problems with grep, used in generating the .c files for vector function tests, matching more than one ALL_RM_TEST line in a file testing multiple functions with the same inputs, when it assumes that the .inc file only has a single such line). Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-05-22Don't provide scalb/significand _FloatN aliases [BZ #31760]H.J. Lu2-0/+24
scalb is a deprecated interface which was obsolescent in POSIX.1-2001, removed in POSIX.1-2008, never made to C standard. significant was originally from BSD and never made in any standard. Fix BZ #31760 by not providing _FloatN aliases for them. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-05-21math: Add support for auto static math testsAdhemerval Zanella10-5/+124
It basically copy the already in place rules for dynamic tests for auto-generated math functions for all support types. To avoid the need to duplicate .inc files, a .SECONDEXPANSION rules is adeed for the gen-libm-test.py generation. New tests are added on the new rules 'libm-test-funcs-auto-static', 'libm-test-funcs-noauto-static', and 'libm-test-funcs-narrow-static'; similar to the non-static counterparts. To avoid add extra build and disk requirement, the new math static tests are only enable with a new define 'build-math-static-tests'. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-20math: Add more details to the test driver output.Joe Simmons-Talbott3-49/+162
Add start and end indicators that identify the test being run in the verbose output. Better identify the tests for max errors in the summary output. Count each exception checked for each test. Remove double counting of tests for the check_<type> functions other than check_float_internal. Rename print_max_error and print_complex_max_error to check_max_error and check_complex_max_error respectively since they have side effects. Co-Authored-By: Carlos O'Donell <carlos@redhat.com> Reviewed-By: Joseph Myers <josmyers@redhat.com>
2024-05-20Implement C23 log2p1Joseph Myers11-1/+2891
C23 adds various <math.h> function families originally defined in TS 18661-4. Add the log2p1 functions (log2(1+x): like log1p, but for base-2 logarithms). This illustrates the intended structure of implementations of all these function families: define them initially with a type-generic template implementation. If someone wishes to add type-specific implementations, it is likely such implementations can be both faster and more accurate than the type-generic one and can then override it for types for which they are implemented (adding benchmarks would be desirable in such cases to demonstrate that a new implementation is indeed faster). The test inputs are copied from those for log1p. Note that these changes make gen-auto-libm-tests depend on MPFR 4.2 (or later). The bulk of the changes are fairly generic for any such new function. (sysdeps/powerpc/nofpu/Makefile only needs changing for those type-generic templates that use fabs.) Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-05-14math: Add GLIBC_TEST_LIBM_VERBOSE environment variable support.Joe Talbott1-2/+21
Allow the libm-test-driver based tests to have their verbosity set based on the GLIBC_TEST_LIBM_VERBOSE environment variable. This allows the entire testsuite to be run with a non-default verbosity. While here check the conversion for the verbose option as well. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-05-06Add crt1-2.0.o for glibc 2.0 compatibility testsH.J. Lu1-0/+3
Starting from glibc 2.1, crt1.o contains _IO_stdin_used which is checked by _IO_check_libio to provide binary compatibility for glibc 2.0. Add crt1-2.0.o for tests against glibc 2.0. Define tests-2.0 for glibc 2.0 compatibility tests. Add and update glibc 2.0 compatibility tests for stderr, matherr and pthread_kill. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-04-04math: x86 trunc traps when FE_INEXACT is enabled (BZ 31603)Adhemerval Zanella2-0/+69
The implementations of trunc functions using x87 floating point (i386 and x86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-04math: x86 floor traps when FE_INEXACT is enabled (BZ 31601)Adhemerval Zanella2-0/+69
The implementations of floor functions using x87 floating point (i386 and 86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-04math: x86 ceill traps when FE_INEXACT is enabled (BZ 31600)Adhemerval Zanella2-0/+70
The implementations of ceil functions using x87 floating point (i386 and x86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-04aarch64/fpu: Add vector variants of tanhJoe Ramsay2-26/+26
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-02math: Reformat Makefile.Adhemerval Zanella1-159/+685
Reflow all long lines adding comment terminators. Sort all reflowed text using scripts/sort-makefile-lines.py. No code generation changes observed in binary artifacts. No regressions on x86_64 and i686. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-02-23treewide: python-scripts: use `is None` for none-equalityKonstantin Kharlamov1-3/+3
Testing for `None`-ness with `==` operator is frowned upon and causes warnings in at least "LGTM" python linter. Fix that. Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-02-01math: Remove bogus math implementationsAdhemerval Zanella5-164/+0
The exp10, exp10l, fma, fmaf, and fmal default implementation do not implement the appropriate semantics nor with an reasonable accuracy. They are also not used by any supported port.
2024-02-01Refer to C23 in place of C2X in glibcJoseph Myers6-49/+49
WG14 decided to use the name C23 as the informal name of the next revision of the C standard (notwithstanding the publication date in 2024). Update references to C2X in glibc to use the C23 name. This is intended to update everything *except* where it involves renaming files (the changes involving renaming tests are intended to be done separately). In the case of the _ISOC2X_SOURCE feature test macro - the only user-visible interface involved - support for that macro is kept for backwards compatibility, while adding _ISOC23_SOURCE. Tested for x86_64.
2024-01-12math: remove exp10 wrappersWilco Dijkstra2-5/+31
Remove the error handling wrapper from exp10. This is very similar to the changes done to exp and exp2, except that we also need to handle pow10 and pow10l. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-01-10math: Fix test-fenv.c feupdateenv testsAdhemerval Zanella1-0/+1
The feupdateenv tests added by 802aef27b2 do not restore the floating point mask, which might keep some floating point exception enabled and thus make the feupdateenv_single_test raise an unexpected signal. Checked on x86_64-linux-gnu and aarch64-linux-gnu (on Apple M1 trapping is supported). Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-01-08Remove ia64-linux-gnuAdhemerval Zanella1-1/+1
Linux 6.7 removed ia64 from the official tree [1], following the general principle that a glibc port needs upstream support for the architecture in all the components it depends on (binutils, GCC, and the Linux kernel). Apart from the removal of sysdeps/ia64 and sysdeps/unix/sysv/linux/ia64, there are updates to various comments referencing ia64 for which removal of those references seemed appropriate. The configuration is removed from README and build-many-glibcs.py. The CONTRIBUTED-BY, elf/elf.h, manual/contrib.texi (the porting mention), *.po files, config.guess, and longlong.h are not changed. For Linux it allows cleanup some clone2 support on multiple files. The following bug can be closed as WONTFIX: BZ 22634 [2], BZ 14250 [3], BZ 21634 [4], BZ 10163 [5], BZ 16401 [6], and BZ 11585 [7]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=43ff221426d33db909f7159fdf620c3b052e2d1c [2] https://sourceware.org/bugzilla/show_bug.cgi?id=22634 [3] https://sourceware.org/bugzilla/show_bug.cgi?id=14250 [4] https://sourceware.org/bugzilla/show_bug.cgi?id=21634 [5] https://sourceware.org/bugzilla/show_bug.cgi?id=10163 [6] https://sourceware.org/bugzilla/show_bug.cgi?id=16401 [7] https://sourceware.org/bugzilla/show_bug.cgi?id=11585 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert390-390/+390
2023-12-19riscv: Fix feenvupdate with FE_DFL_ENV (BZ 31022)Adhemerval Zanella1-10/+121
libc_feupdateenv_riscv should check for FE_DFL_ENV, similar to libc_fesetenv_riscv. Also extend the test-fenv.c to test fenvupdate. Checked on riscv under qemu-system. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19x86: Do not raises floating-point exception traps on fesetexceptflag (BZ 30990)Bruno Haible1-1/+22
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). The flags can be set in the 387 unit or in the SSE unit. When we need to clear a flag, we need to do so in both units, due to the way fetestexcept is implemented. When we need to set a flag, it is sufficient to do it in the SSE unit, because that is guaranteed to not trap. However, on i386 CPUs that have only a 387 unit, set the flags in the 387, as long as this cannot trap. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19i686: Do not raise exception traps on fesetexcept (BZ 30989)Adhemerval Zanella1-3/+23
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). The flags can be set in the 387 unit or in the SSE unit. To set a flag, it is sufficient to do it in the SSE unit, because that is guaranteed to not trap. However, on i386 CPUs that have only a 387 unit, set the flags in the 387, as long as this cannot trap. Checked on i686-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19powerpc: Do not raise exception traps for fesetexcept/fesetexceptflag (BZ 30988)Adhemerval Zanella2-15/+25
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). This is a side-effect of how we implement the GNU extension feenableexcept, where feenableexcept/fesetenv/fesetmode/feupdateenv might issue prctl (PR_SET_FPEXC, PR_FP_EXC_PRECISE) depending of the argument. And on PR_FP_EXC_PRECISE, setting a floating-point exception flag triggers a trap. To make the both functions follow the C23, fesetexcept and fesetexceptflag now fail if the argument may trigger a trap. The math tests now check for an value different than 0, instead of bail out as unsupported for EXCEPTION_SET_FORCES_TRAP. Checked on powerpc64le-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-11-20aarch64: Add vector implementations of expm1 routinesJoe Ramsay2-108/+108
May discard sign of 0 - auto tests for -0 and -0x1p-10000 updated accordingly.
2023-11-10aarch64: Add vector implementations of log1p routinesJoe Ramsay2-26/+26
May discard sign of zero.
2023-10-23aarch64: Add vector implementations of tan routinesJoe Ramsay2-26/+26
This includes some utility headers for evaluating polynomials using various schemes.
2023-09-18math: Add a no-mathvec flag for sin (-0.0)Wilco Dijkstra4-28/+33
Add support for a no-mathvec flag to gen-auto-libm-tests.c. Update input test sin (-0.0) to be skipped in vector math libraries and regenerate testcases. Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2023-06-30Stop applying a GCC-specific workaround on clang [BZ #30550]Tulio Magno Quites Machado Filho1-1/+2
GCC was the only compiler affected by the issue with __builtin_isinf_sign and float128. Fix BZ #30550. Reported-by: Qiu Chaofan <qiucofan@cn.ibm.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-06-02Fix all the remaining misspellings -- BZ 25337Paul Pluzhnikov6-6/+6
2023-04-03math: Remove the error handling wrapper from fmod and fmodfAdhemerval Zanella Netto3-6/+17
The error handling is moved to sysdeps/ieee754 version with no SVID support. The compatibility symbol versions still use the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). The ia64 is unchanged, since it still uses the arch specific __libm_error_region on its implementation. For both i686 and m68k, which provive arch specific implementation, wrappers are added so no new symbol are added (which would require to change the implementations). It shows an small improvement, the results for fmod: Architecture | Input | master | patch -----------------|-----------------|----------|-------- x86_64 (Ryzen 9) | subnormals | 12.5049 | 9.40992 x86_64 (Ryzen 9) | normal | 296.939 | 296.738 x86_64 (Ryzen 9) | close-exponents | 16.0244 | 13.119 aarch64 (N1) | subnormal | 6.81778 | 4.33313 aarch64 (N1) | normal | 155.620 | 152.915 aarch64 (N1) | close-exponents | 8.21306 | 5.76138 armhf (N1) | subnormal | 15.1083 | 14.5746 armhf (N1) | normal | 244.833 | 241.738 armhf (N1) | close-exponents | 21.8182 | 22.457 Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-04-03math: Improve fmodfAdhemerval Zanella Netto1-0/+4
This uses a new algorithm similar to already proposed earlier [1]. With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers), the simplest implementation is: mx * 2^ex == 2 * mx * 2^(ex - 1) while (ex > ey) { mx *= 2; --ex; mx %= my; } With mx/my being mantissa of double floating pointer, on each step the argument reduction can be improved 8 (which is sizeof of uint32_t minus MANTISSA_WIDTH plus the signal bit): while (ex > ey) { mx << 8; ex -= 8; mx %= my; } */ The implementation uses builtin clz and ctz, along with shifts to convert hx/hy back to doubles. Different than the original patch, this path assume modulo/divide operation is slow, so use multiplication with invert values. I see the following performance improvements using fmod benchtests (result only show the 'mean' result): Architecture | Input | master | patch -----------------|-----------------|----------|-------- x86_64 (Ryzen 9) | subnormals | 17.2549 | 12.0318 x86_64 (Ryzen 9) | normal | 85.4096 | 49.9641 x86_64 (Ryzen 9) | close-exponents | 19.1072 | 15.8224 aarch64 (N1) | subnormal | 10.2182 | 6.81778 aarch64 (N1) | normal | 60.0616 | 20.3667 aarch64 (N1) | close-exponents | 11.5256 | 8.39685 I also see similar improvements on arm-linux-gnueabihf when running on the N1 aarch64 chips, where it a lot of soft-fp implementation (for modulo, and multiplication): Architecture | Input | master | patch -----------------|-----------------|----------|-------- armhf (N1) | subnormal | 11.6662 | 10.8955 armhf (N1) | normal | 69.2759 | 34.1524 armhf (N1) | close-exponents | 13.6472 | 18.2131 Instead of using the math_private.h definitions, I used the math_config.h instead which is used on newer math implementations. Co-authored-by: kirill <kirill.okhotnikov@gmail.com> [1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-04-03math: Improve fmodAdhemerval Zanella Netto1-0/+14
This uses a new algorithm similar to already proposed earlier [1]. With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers), the simplest implementation is: mx * 2^ex == 2 * mx * 2^(ex - 1) while (ex > ey) { mx *= 2; --ex; mx %= my; } With mx/my being mantissa of double floating pointer, on each step the argument reduction can be improved 11 (which is sizeo of uint64_t minus MANTISSA_WIDTH plus the signal bit): while (ex > ey) { mx << 11; ex -= 11; mx %= my; } */ The implementation uses builtin clz and ctz, along with shifts to convert hx/hy back to doubles. Different than the original patch, this path assume modulo/divide operation is slow, so use multiplication with invert values. I see the following performance improvements using fmod benchtests (result only show the 'mean' result): Architecture | Input | master | patch -----------------|-----------------|----------|-------- x86_64 (Ryzen 9) | subnormals | 19.1584 | 12.5049 x86_64 (Ryzen 9) | normal | 1016.51 | 296.939 x86_64 (Ryzen 9) | close-exponents | 18.4428 | 16.0244 aarch64 (N1) | subnormal | 11.153 | 6.81778 aarch64 (N1) | normal | 528.649 | 155.62 aarch64 (N1) | close-exponents | 11.4517 | 8.21306 I also see similar improvements on arm-linux-gnueabihf when running on the N1 aarch64 chips, where it a lot of soft-fp implementation (for modulo, clz, ctz, and multiplication): Architecture | Input | master | patch -----------------|-----------------|----------|-------- armhf (N1) | subnormal | 15.908 | 15.1083 armhf (N1) | normal | 837.525 | 244.833 armhf (N1) | close-exponents | 16.2111 | 21.8182 Instead of using the math_private.h definitions, I used the math_config.h instead which is used on newer math implementations. Co-authored-by: kirill <kirill.okhotnikov@gmail.com> [1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-02-14update auto-libm-test-out-hypotPaul Zimmermann1-0/+25
This change was forgotten in commit cf7ffdd. Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-02-14added pair of inputs for hypotf in binary32Paul Zimmermann1-0/+2
This pair yields an error of 1 ulp in binary32, whereas the current maximal known error for hypotf on x86_64 is zero: Checking hypot with glibc-2.37 hypot 0 -1 -0x1.003222p-20,-0x1.6a2d58p-32 [0.501] 0.500001 0.500000001392678 libm gives 0x1.003224p-20 mpfr gives 0x1.003222p-20 See https://sourceware.org/pipermail/libc-alpha/2023-February/145432.html and https://sourceware.org/pipermail/libc-alpha/2023-February/145442.html Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-01-06Update copyright dates with scripts/update-copyrightsJoseph Myers390-390/+390
2023-01-06C2x semantics for <tgmath.h>Joseph Myers2-162/+297
<tgmath.h> implements semantics for integer generic arguments that handle cases involving _FloatN / _FloatNx types as specified in TS 18661-3 plus some defect fixes. C2x has further changes to the semantics for <tgmath.h> macros with such types, which should also be considered defect fixes (although handled through the integration of TS 18661-3 in C2x rather than through an issue tracking process). Specifically, the rules were changed because of problems raised with using the macros with the evaluation format types such as float_t and _Float32_t: the older version of the rules didn't allow passing _FloatN / _FloatNx types to the narrowing macros returning float or double, or passing float / double / long double to the narrowing macros returning _FloatN / _FloatNx, which was a problem with the evaluation format types which could be either kind of type depending on the value of FLT_EVAL_METHOD. Thus the new rules allow cases of mixing types which were not allowed before, and, as part of the changes, the handling of integer arguments was also changed: if there is any _FloatNx generic argument, integer generic arguments are treated as _Float32x (not double), while the rule about treating integer arguments to narrowing macros returning _FloatN or _FloatNx as _Float64 not double was removed (no longer needed now double is a valid argument to such macros). I've implemented the changes in GCC's __builtin_tgmath, which thus requires updates to glibc's test expectations so that the tests continue to build with GCC 13 (the test is also updated to test the argument types that weren't allowed before but are now valid under C2x rules). Given those test changes, it's then also necessary to fix the implementations in <tgmath.h> to have appropriate semantics with older GCC so that the tests pass with GCC versions before GCC 13 as well. For some cases (non-narrowing macros with two or three generic arguments; narrowing macros returning _Float32x), the older version of __builtin_tgmath doesn't correspond sufficiently well to C2x semantics, so in those cases <tgmath.h> is adjusted to use the older macro implementation instead of __builtin_tgmath. The older macro implementation is itself adjusted to give the desired semantics, with GCC 7 and later. (It's not possible to get the right semantics in all cases for the narrowing macros with GCC 6 and before when the _FloatN / _FloatNx names are typedefs rather than distinct types.) Tested as follows: with the full glibc testsuite for x86_64, GCC 6, 7, 11, 13; with execution of the math/tests for aarch64, arm, powerpc and powerpc64le, GCC 6, 7, 12 and 13 (powerpc64le only with GCC 12 and 13); with build-many-glibcs.py with GCC 6, 7, 12 and 13.
2022-11-01Disable use of -fsignaling-nans if compiler does not support itAdhemerval Zanella9-10/+33
Reviewed-by: Fangrui Song <maskray@google.com>
2022-10-31Fix build with GCC 13 _FloatN, _FloatNx built-in functionsJoseph Myers1-7/+240
GCC 13 has added more _FloatN and _FloatNx versions of existing <math.h> and <complex.h> built-in functions, for use in libstdc++-v3. This breaks the glibc build because of how those functions are defined as aliases to functions with the same ABI but different types. Add appropriate -fno-builtin-* options for compiling relevant files, as already done for the case of long double functions aliasing double ones and based on the list of files used there. I fixed some mistakes in that list of double files that I noticed while implementing this fix, but there may well be more such (harmless) cases, in this list or the new one (files that don't actually exist or don't define the named functions as aliases so don't need the options). I did try to exclude cases where glibc doesn't define certain functions for _FloatN or _FloatNx types at all from the new uses of -fno-builtin-* options. As with the options for double files (see the commit message for commit 49348beafe9ba150c9bd48595b3f372299bddbb0, "Fix build with GCC 10 when long double = double."), it's deliberate that the options are used even if GCC currently doesn't have a built-in version of a given functions, so providing some level of future-proofing against more such built-in functions being added in future. Tested with build-many-glibcs.py for aarch64-linux-gnu powerpc-linux-gnu powerpc64le-linux-gnu x86_64-linux-gnu (compilers and glibcs builds) with GCC mainline.
2022-09-30Fix iseqsig for _FloatN and _FloatNx in C++ with GCC 13Joseph Myers2-1/+339
With GCC 13, _FloatN and _FloatNx types, when they exist, are distinct types like they are in C with GCC 7 and later, rather than typedefs for types such as float, double or long double. This breaks the templated iseqsig implementation for C++ in <math.h>, when used with types that were previously implemented as aliases. Add the necessary definitions for _Float32, _Float64, _Float128 (when the same format as long double), _Float32x and _Float64x in this case, so that iseqsig can be used with such types in C++ with GCC 13 as it could with previous GCC versions. Also add tests for calling iseqsig in C++ with arguments of such types (more minimal than existing tests, so that they can work with older GCC versions and without relying on any C++ library support for the types or on hardcoding details of their formats). The LDBL_MANT_DIG != 106 conditionals on some tests are because the type-generic comparison macros have undefined behavior when neither argument has a type whose set of values is a subset of those for the type of the other argument, which applies when one argument is IBM long double and the other is an IEEE format wider than binary64. Tested with build-many-glibcs.py glibcs build for aarch64-linux-gnu i686-linux-gnu mips-linux-gnu mips64-linux-gnu-n32 powerpc-linux-gnu powerpc64le-linux-gnu x86_64-linux-gnu.
2022-09-22Use '%z' instead of '%Z' on printf functionsAdhemerval Zanella Netto1-1/+1
The Z modifier is a nonstandard synonymn for z (that predates z itself) and compiler might issue an warning for in invalid conversion specifier. Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-02-24math: Add more input to atanh accuracy testsSunil K Pandey2-0/+28
This patch adds following input to atanh accuracy test. 0x1.f80094p-8 Tested on x86-64 and i686 platforms. Other platforms may have to regenerate ulps file. Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>