Age | Commit message | Author | Files | Lines
2024-12-22math: Exclude tgmath3-macro-tests for ClangH.J. Lu1-0/+5
tgmath3-macro-tests won't compile with <float.h> and <tgmath.h> from Clang due to missing C23 support: https://github.com/llvm/llvm-project/issues/97335 Disable them for now when Clang is used for testing so that "make check" can finish. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-12-22Check if -mamx-tile works for testingH.J. Lu2-63/+32
Since -mamx-tile is used only for testing, use LIBC_TRY_TEST_CC_COMMAND, instead of LIBC_TRY_CC_AND_TEST_CC_COMMAND to check it and don't check __builtin_ia32_ldtilecfg for Clang. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Sam James <sam@gentoo.org>
2024-12-22assert: Sort tests in MakefileH.J. Lu1-1/+1
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-12-20assert: ensure posix compliance, add tests for suchDJ Delorie3-2/+198
Fix assert.c so that even the fallback case conforms to POSIX, although not exactly the same as the default case so a test can tell the difference. Add a test that verifies that abort is called, and that the message printed to stderr has all the info that POSIX requires. Verify this even when malloc isn't usable. Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
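As an illustration of the information POSIX requires in the diagnostic (the text of the failed expression, the source file name, the line number, and the enclosing function name), here is a minimal sketch that formats those four items into a caller-supplied buffer, so it performs no explicit heap allocation. The helper name `format_assert_message` and the message layout are assumptions made for this example; this is not the code in glibc's assert.c.

```c
#include <stdio.h>

/* Hypothetical helper, not glibc's implementation: write the four items
   POSIX requires in an assertion diagnostic into BUF.  The exact layout
   of the message is an assumption made for this sketch.  Returns the
   value of snprintf, i.e. the length the full message needs.  */
static int
format_assert_message (char *buf, size_t size, const char *expr,
                       const char *file, unsigned int line,
                       const char *function)
{
  return snprintf (buf, size, "%s:%u: %s: Assertion `%s' failed.\n",
                   file, line, function, expr);
}
```

A caller would typically pass a fixed-size stack buffer, write the result to stderr, and then call abort.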
2024-12-21cet: Drop '#pragma GCC target' in tst-cet-legacy-10a[-static].cAdhemerval Zanella2-2/+0
After

commit 215447f5cbcf1a494cded57734f68d7f9c2b0dc0
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Dec 17 06:18:55 2024 +0800

    cet: Pass -mshstk to compiler for tst-cet-legacy-10a[-static].c

we can remove '#pragma GCC target' in tst-cet-legacy-10a[-static].c.

Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
2024-12-20posix: fix system when a child cannot be created [BZ #32450]Aurelien Jarno2-3/+28
POSIX states that "if a child process cannot be created, or if the termination status for the command language interpreter cannot be obtained, system() shall return -1 and set errno to indicate the error." In the glibc implementation it could happen when posix_spawn fails, which happens when the underlying fork, vfork, or clone call fails. They could fail with EAGAIN and ENOMEM. Resolves: BZ #32450 Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
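From the caller's side, the POSIX rule quoted above means a return value of -1 has to be handled separately from an ordinary wait status. The following is a minimal sketch; the helper names `interpret_system_status` and `run_command` are hypothetical, and mapping a fatal signal to 128 plus the signal number is a shell-style convention chosen for this example only.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>

/* Hypothetical helper: classify the value returned by system ().
   Returns -1 if no child could be created or its status could not be
   obtained (errno then holds the cause), the exit status if the shell
   exited normally, or 128 + signal number if it was killed by a signal
   (a convention assumed for this sketch).  */
static int
interpret_system_status (int status)
{
  if (status == -1)
    return -1;
  if (WIFEXITED (status))
    return WEXITSTATUS (status);
  if (WIFSIGNALED (status))
    return 128 + WTERMSIG (status);
  return -1;
}

/* Run COMMAND through system () and report a failure to create the
   child (for example EAGAIN or ENOMEM) on stderr.  */
static int
run_command (const char *command)
{
  errno = 0;
  int status = system (command);
  if (status == -1)
    fprintf (stderr, "system: %s\n", strerror (errno));
  return interpret_system_status (status);
}
```

With this split, `run_command ("exit 3")` yields 3, while a failed process creation yields -1 with errno describing the cause.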
2024-12-21Don't use glibc <tgmath.h> when testing with ClangH.J. Lu3-8/+27
Clang has its own <tgmath.h> and doesn't use <tgmath.h> from glibc. Pass "-I." to the compiler only if $($(<F)-no-include-dot) is undefined. Define it to yes for tgmath tests when testing with Clang. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Sam James <sam@gentoo.org>
2024-12-21stdio-common: Exclude bug28 when clang is usedH.J. Lu1-1/+10
Clang 19 takes a very long time to compile bug28.c (it ran for more than 27 minutes on an Intel Core i7-1195G7 before the process was killed): https://github.com/llvm/llvm-project/issues/120462 Exclude it when Clang is used for testing. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Sam James <sam@gentoo.org>
2024-12-21Fix elf: Introduce is_rtld_link_map [BZ #32488]H.J. Lu1-2/+2
Also use is_rtld_link_map in dl-cet.c. This fixes BZ #32488. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-12-20math: xfail some tanpi tests for ibm128-libgccAdhemerval Zanella3-364/+364
On powerpc math/test-ibm128-tanpi shows multiple failures: testing long double (without inline functions) Failure: tanpi_downward (0xfffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_downward (0xfffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_downward (0xfffffffffffffffdp-1) Result: is: 4.68843873182857939141363635204365e+28 0x1.2efbb6629d1d59b032520400df8p+95 should be: inf inf Failure: tanpi_downward (0x3fffffffffffffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_downward (0x3fffffffffffffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_downward (0x3fffffffffffffffffffffffffdp-1) Result: is: 1.41444453325831960404472183124793e+16 0x1.9202627cbf98e052d5fdbeee1f8p+53 should be: inf inf Failure: tanpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): Exception "Invalid operation" set Failure: tanpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): Exception "Overflow" set Failure: tanpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): errno set to 33, expected 0 (unchanged) Failure: Test: tanpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020) Result: is: qNaN should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 Failure: Test: tanpi_downward (0x3.fffffffffffffffcp+108) Result: is: 2.91356019227449116879287504834896e-15 0x1.a3e365fee24d4632f95a2235698p-49 should be: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 difference: 2.91356019227449116879287504834896e-15 0x1.a3e365fee24d4632f95a2235698p-49 ulp : 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321 max.ulp : 8.0000 Failure: Test: tanpi_downward (0x3.ffffffffffffffffffffffffffp+108) Result: is: 7.94911926685664643005642781870827e-16 0x1.ca3c4b83eb5688e1474146dc338p-51 should be: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 difference: 
7.94911926685664643005642781870827e-16 0x1.ca3c4b83eb5688e1474146dc338p-51 ulp : 160891965142034222272327839154722485473479235229008379884749401713481320342777314570400076204240982703218835644458374555276642 max.ulp : 8.0000 Failure: tanpi_towardzero (0xfffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_towardzero (0xfffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_towardzero (0xfffffffffffffffdp-1) Result: is: 2.14718475310122677917055904836884e+28 0x1.1584624c14882fff76592b4ec10p+94 should be: inf inf Failure: tanpi_towardzero (-0xfffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_towardzero (-0xfffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_towardzero (-0xfffffffffffffffdp-1) Result: is: -2.14718475310122677917055904836884e+28 -0x1.1584624c14882fff76592b4ec10p+94 should be: -inf -inf Failure: tanpi_towardzero (0x3fffffffffffffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_towardzero (0x3fffffffffffffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_towardzero (0x3fffffffffffffffffffffffffdp-1) Result: is: 6.60739946234609289593176521179840e+15 0x1.7796511d79d6ce55bc8bf083fe0p+52 should be: inf inf Failure: tanpi_towardzero (-0x3fffffffffffffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_towardzero (-0x3fffffffffffffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_towardzero (-0x3fffffffffffffffffffffffffdp-1) Result: is: -6.60739946234609289593176521179840e+15 -0x1.7796511d79d6ce55bc8bf083fe0p+52 should be: -inf -inf Failure: Test: tanpi_towardzero (-0x3.fffffffffffffffcp+108) Result: is: -1.17953443892757434921819283936141e-14 -0x1.a8f8d97fb893518cbe5688935c0p-47 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 1.17953443892757434921819283936141e-14 0x1.a8f8d97fb893518cbe5688935c0p-47 ulp : 
179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321 max.ulp : 8.0000 Failure: Test: tanpi_towardzero (-0x3.ffffffffffffffffffffffffffp+108) Result: is: -1.85584803206881692897837494734542e-14 -0x1.4e51e25c1f5ab4470a3a0a42c24p-46 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 1.85584803206881692897837494734542e-14 0x1.4e51e25c1f5ab4470a3a0a42c24p-46 ulp : 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321 max.ulp : 8.0000 Failure: Test: tanpi_towardzero (0x3.fffffffffffffffcp+108) Result: is: 1.17953443892757434921819283936141e-14 0x1.a8f8d97fb893518cbe5688935c0p-47 should be: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 difference: 1.17953443892757434921819283936141e-14 0x1.a8f8d97fb893518cbe5688935c0p-47 ulp : 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321 max.ulp : 8.0000 Failure: Test: tanpi_towardzero (0x3.ffffffffffffffffffffffffffp+108) Result: is: 1.85584803206881692897837494734542e-14 0x1.4e51e25c1f5ab4470a3a0a42c24p-46 should be: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 difference: 1.85584803206881692897837494734542e-14 0x1.4e51e25c1f5ab4470a3a0a42c24p-46 ulp : 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321 max.ulp : 8.0000 Failure: tanpi_upward (-0xfffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_upward (-0xfffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_upward (-0xfffffffffffffffdp-1) Result: is: -2.14718475310122677917055904836884e+28 -0x1.1584624c14882fff76592b4ec10p+94 should be: -inf -inf Failure: tanpi_upward (-0x3fffffffffffffffffffffffffdp-1): Exception "Divide by zero" not 
set Failure: tanpi_upward (-0x3fffffffffffffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_upward (-0x3fffffffffffffffffffffffffdp-1) Result: is: -6.60739946234609289593176521179829e+15 -0x1.7796511d79d6ce55bc8bf083fdbp+52 should be: -inf -inf Failure: Test: tanpi_upward (-0x3.fffffffffffffffcp+108) Result: is: -1.17953443892757434921819283936138e-14 -0x1.a8f8d97fb893518cbe5688935b0p-47 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 1.17953443892757434921819283936139e-14 0x1.a8f8d97fb893518cbe5688935b0p-47 ulp : inf max.ulp : 8.0000 Failure: Test: tanpi_upward (-0x3.ffffffffffffffffffffffffffp+108) Result: is: -1.85584803206881692897837494734542e-14 -0x1.4e51e25c1f5ab4470a3a0a42c24p-46 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 1.85584803206881692897837494734543e-14 0x1.4e51e25c1f5ab4470a3a0a42c24p-46 ulp : inf max.ulp : 8.0000 Failure: tanpi_upward (0xf.ffffffffffffbffffffffffffcp+1020): Exception "Invalid operation" set Failure: tanpi_upward (0xf.ffffffffffffbffffffffffffcp+1020): Exception "Overflow" set Failure: tanpi_upward (0xf.ffffffffffffbffffffffffffcp+1020): errno set to 33, expected 0 (unchanged) Failure: Test: tanpi_upward (0xf.ffffffffffffbffffffffffffcp+1020) Result: is: qNaN should be: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0
2024-12-20elf: Reorder audit events in dlclose to match _dl_fini (bug 32066)Florian Weimer2-16/+37
This was discovered after extending elf/tst-audit23 to cover dlclose of the dlmopen namespace. Auditors already experience the new order during process shutdown (_dl_fini), so no LAV_CURRENT bump or backwards compatibility code seems necessary. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-12-20elf: Call la_objclose for proxy link maps in _dl_fini (bug 32065)Florian Weimer2-3/+25
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-12-20elf: Signal la_objopen for the proxy link map in dlmopen (bug 31985)Florian Weimer2-29/+40
Previously, the ld.so link map was silently added to the namespace. This change produces an auditing event for it. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-12-20elf: Add the endswith function to <endswith.h>Florian Weimer1-0/+8
And include <stdbool.h> for a definition of bool. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
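For readers unfamiliar with the helper, a suffix test of this kind can be sketched as follows. This is an illustrative version written for this note, not a copy of the function in glibc's <endswith.h>; only the name `endswith` and the use of <stdbool.h> come from the commit message.

```c
#include <stdbool.h>
#include <string.h>

/* Sketch of a suffix test: return true if STR ends with SUFFIX.
   An empty SUFFIX matches every string.  */
static bool
endswith (const char *str, const char *suffix)
{
  size_t len = strlen (str);
  size_t suffix_len = strlen (suffix);
  return len >= suffix_len
         && memcmp (str + len - suffix_len, suffix, suffix_len) == 0;
}
```

The length comparison comes first so that the pointer arithmetic never steps before the start of the string.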
2024-12-20elf: Update DSO list, write audit log to elf/tst-audit23.outFlorian Weimer1-5/+22
After commit 1d5024f4f052c12e404d42d3b5bfe9c3e9fd27c4 ("support: Build with exceptions and asynchronous unwind tables [BZ #30587]"), libgcc_s is expected to show up in the DSO list on 32-bit Arm. Do not update max_objs because the vdso is not tracked (which is the reason why the test currently passes even with libgcc_s present). Also write the log output from the auditor to standard output, for easier test debugging. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-12-20elf: Move _dl_rtld_map, _dl_rtld_audit_state out of GLFlorian Weimer4-82/+82
This avoids immediate GLIBC_PRIVATE ABI issues if the size of struct link_map or struct auditstate changes. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-12-20elf: Introduce is_rtld_link_mapFlorian Weimer10-56/+28
Unconditionally define it to false for static builds. This avoids the awkward use of weak_extern for _dl_rtld_map in checks that cannot be possibly true on static builds. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-12-20Add F_CREATED_QUERY from Linux 6.12 to bits/fcntl-linux.hJoseph Myers1-0/+1
Linux 6.12 adds a new constant F_CREATED_QUERY. Add it to glibc's bits/fcntl-linux.h. Tested for x86_64.
2024-12-20Add HWCAP_LOONGARCH_LSPW from Linux 6.12 to bits/hwcap.hJoseph Myers1-0/+1
Add the new Linux 6.12 HWCAP_LOONGARCH_LSPW to the corresponding bits/hwcap.h. Tested with build-many-glibcs.py for loongarch64-linux-gnu-lp64d.
2024-12-20Add MSG_SOCK_DEVMEM from Linux 6.12 to bits/socket.hJoseph Myers1-0/+2
Linux 6.12 adds a constant MSG_SOCK_DEVMEM (recall that various constants such as this one are defined in the non-uapi linux/socket.h but still form part of the kernel/userspace interface, so that non-uapi header is one that needs checking each release for new such constants). Add it to glibc's bits/socket.h. Tested for x86_64.
2024-12-20i386: Regenerate ulpsFlorian Weimer2-0/+4
As seen on an Intel i9-9900K CPU, with glibc built with GCC 11.5, configured with and without --disable-multi-arch.
2024-12-20x86_64: Regenerate ulpsFlorian Weimer1-0/+2
As seen with an AMD 7950X CPU, on a glibc built with GCC 11.5.
2024-12-20aarch64: Regenerate ulpsFlorian Weimer1-0/+2
Results from running on Neoverse-V2, built with GCC 11.5.
2024-12-19elf: Remove code dependent on __rtld_lock_default_lock_recursive macroFlorian Weimer2-27/+0
Neither NPTL nor Hurd define this macro anymore. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-12-19Linux: Accept null arguments for utimensat pathnameFlorian Weimer3-5/+34
This matches kernel behavior. With this change, it is possible to use utimensat as a replacement for the futimens interface, similar to what glibc does internally. Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
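A minimal sketch of the usage this enables: updating the timestamps of an already open file descriptor through utimensat with a null pathname. The helper name `touch_fd` is hypothetical. The fallback to futimens assumes that a C library without this change rejects the null pointer with EINVAL; that is an assumption of the example, not something stated in the commit.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: set both timestamps of the file open on FD to
   the current time.  Tries utimensat with a null pathname first and
   falls back to futimens if the C library reports EINVAL (assumed to
   be how an older library rejects the null pointer).  */
static int
touch_fd (int fd)
{
  const struct timespec now[2] = {
    { .tv_sec = 0, .tv_nsec = UTIME_NOW },   /* access time */
    { .tv_sec = 0, .tv_nsec = UTIME_NOW },   /* modification time */
  };
  const char *no_path = NULL;   /* operate on FD itself */

  if (utimensat (fd, no_path, now, 0) == 0)
    return 0;
  if (errno != EINVAL)
    return -1;
  return futimens (fd, now);
}
```

Keeping the fallback means the same source behaves the same way whether or not the C library in use accepts the null pathname.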
2024-12-19x86_64: Remove unused padding from tcbhead_tFlorian Weimer1-12/+0
This padding is difficult to use for preserving the internal GLIBC_PRIVATE ABI. The comment is misleading. Current Address Sanitizer uses heuristics to determine struct pthread size. It does not depend on its precise layout. It merely scans for pointers allocated using malloc. Due to the removal of the padding, the assert for its start is no longer required. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-12-19Add further DSO dependency sorting testsJoseph Myers2-1/+243
The current DSO dependency sorting tests are for a limited number of specific cases, including some from particular bug reports.

Add tests that systematically cover all possible DAGs for an executable and the shared libraries it depends on, directly or indirectly, up to four objects (an executable and three shared libraries). (For this kind of DAG - ones with a single source vertex from which all others are reachable, and an ordering on the edges from each vertex - there are 57 DAGs on four vertices, 3399 on five vertices and 1026944 on six vertices; see https://arxiv.org/pdf/2303.14710 for more details on this enumeration. I've tested that the 3399 cases with five vertices do all pass if enabled.)

These tests are replicating the sorting logic from the dynamic linker (thereby, for example, asserting that it doesn't accidentally change); I'm not claiming that the logic in the dynamic linker is in some abstract sense optimal.

Note that these tests do illustrate how in some cases the two sorting algorithms produce different results for a DAG (I think all the existing tests for such differences are ones involving cycles, and the motivation for the new algorithm was also to improve the handling of cycles):

  tst-dso-ordering-all4-44: a->[bc];{}->[cba]
  output(glibc.rtld.dynamic_sort=1): c>b>a>{}<a<b<c
  output(glibc.rtld.dynamic_sort=2): b>c>a>{}<a<c<b

They also illustrate that sometimes the sorting algorithms do not follow the order in which dependencies are listed in DT_NEEDED even though there is a valid topological sort that does follow that, which might be counterintuitive considering that the DT_NEEDED ordering is followed in the simplest cases:

  tst-dso-ordering-all4-56: {}->[abc]
  output: c>b>a>{}<a<b<c

shows such a simple case following DT_NEEDED order for destructor execution (the reverse of it for constructor execution), but

  tst-dso-ordering-all4-41: a->[cb];{}->[cba]
  output: c>b>a>{}<a<b<c

shows that c and b are in the opposite order to what might be expected from the simplest case, though there is no dependency requiring such an opposite order to be used.

(I'm not asserting that either of those things is a problem, simply observing them as less obvious properties of the sorting algorithms shown up by these tests.)

Tested for x86_64.
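To make the point about tst-dso-ordering-all4-44 concrete, the sketch below checks that both reported constructor orders respect the dependencies, reading `c>b>a>{}` as "constructors of c, then b, then a, then the executable". It is a small validity checker written for this note, not the dynamic linker's sorting code; all names in it are hypothetical, and it deliberately ignores the DT_NEEDED listing order, checking only that dependencies are initialized first.

```c
#include <stdbool.h>

/* Objects in the tst-dso-ordering-all4-44 graph; OBJ_MAIN is "{}".  */
enum { OBJ_MAIN, OBJ_A, OBJ_B, OBJ_C, OBJ_COUNT };

/* needs[x][y] is true when object x lists object y as a dependency.  */
typedef bool dep_matrix[OBJ_COUNT][OBJ_COUNT];

/* Return true if ORDER (a permutation of all objects) runs every
   object's constructor after those of its direct dependencies.  */
static bool
is_valid_ctor_order (dep_matrix needs, const int order[OBJ_COUNT])
{
  int position[OBJ_COUNT];
  for (int i = 0; i < OBJ_COUNT; i++)
    position[order[i]] = i;
  for (int x = 0; x < OBJ_COUNT; x++)
    for (int y = 0; y < OBJ_COUNT; y++)
      if (needs[x][y] && position[y] > position[x])
        return false;
  return true;
}

/* Graph a->[bc];{}->[cba] with the constructor orders reported for
   glibc.rtld.dynamic_sort=1 (c, b, a, {}) and =2 (b, c, a, {}).  */
static bool
case_44_orders_are_both_valid (void)
{
  dep_matrix needs = {
    [OBJ_MAIN] = { [OBJ_A] = true, [OBJ_B] = true, [OBJ_C] = true },
    [OBJ_A] = { [OBJ_B] = true, [OBJ_C] = true },
  };
  const int sort1[OBJ_COUNT] = { OBJ_C, OBJ_B, OBJ_A, OBJ_MAIN };
  const int sort2[OBJ_COUNT] = { OBJ_B, OBJ_C, OBJ_A, OBJ_MAIN };
  return is_valid_ctor_order (needs, sort1)
         && is_valid_ctor_order (needs, sort2);
}
```

Both orders pass the check, which matches the observation that the two algorithms can differ without either violating a dependency.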
2024-12-19Add NT_X86_XSAVE_LAYOUT and NT_ARM_POE from Linux 6.12 to elf.hJoseph Myers1-0/+2
Linux 6.12 adds new ELF note types NT_X86_XSAVE_LAYOUT and NT_ARM_POE. Add these to glibc's elf.h. Tested for x86_64.
2024-12-19Add SCHED_EXT from Linux 6.12 to bits/sched.hJoseph Myers2-1/+2
Linux 6.12 adds the SCHED_EXT constant. Add it to glibc's bits/sched.h and update the kernel version in tst-sched-consts.py. Tested for x86_64.
2024-12-19hppa: Fix strace detach-vfork testJohn David Anglin2-47/+64
This change implements vfork.S for direct support of the vfork syscall. clone.S is revised to correct child support for the vfork case. The main bug was creating a frame prior to the clone syscall. This was done to allow the rp and r4 registers to be saved and restored from the stack frame. r4 was used to save and restore the PIC register, r19, across the system call and the call to set errno. But in the vfork case, it is undefined behavior for the child to return from the function in which vfork was called. It is surprising that this usually worked. Syscalls on hppa save and restore rp and r19, so we don't need to create a frame prior to the clone syscall. We only need a frame when __syscall_error is called. We also don't need to save and restore r19 around the call to $$dyncall as r19 is not used in the code after $$dyncall. This considerably simplifies clone.S. Signed-off-by: John David Anglin <dave.anglin@bell.net>
2024-12-19Update kernel version to 6.12 in header constant testsJoseph Myers3-4/+4
There are no new constants covered by tst-mman-consts.py, tst-mount-consts.py or tst-pidfd-consts.py in Linux 6.12 that need any header changes, so update the kernel version in those tests. (tst-sched-consts.py will need updating separately along with adding SCHED_EXT.) Tested with build-many-glibcs.py.
2024-12-18added url of CORE-MATH projectPaul Zimmermann1-1/+2
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-12-18math: Use tanhf from CORE-MATHAdhemerval Zanella27-144/+84
The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic tanhf. The code was adapted to glibc style and to use the definitions of math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                 master     patched  improvement
x86_64                 51.5273     41.0951       20.25%
x86_64v2               47.7021     39.1526       17.92%
x86_64v3               45.0373     34.2737       23.90%
i686                  133.9970     83.8596       37.42%
aarch64 (Neoverse)     21.5439     14.7961       31.32%
power10                13.3301      8.4406       36.68%

reciprocal-throughput   master     patched  improvement
x86_64                 24.9493     12.8547       48.48%
x86_64v2               20.7051     12.7761       38.29%
x86_64v3               19.2492     11.0851       42.41%
i686                   78.6498     29.8211       62.08%
aarch64 (Neoverse)     11.6026     7.11487       38.68%
power10                 6.3328      2.8746       54.61%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
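The "improvement" figures in these benchmark results appear to be the relative reduction in time from master to patched; for example, 51.5273 to 41.0951 gives roughly 20.25%. The sketch below shows that arithmetic. The function name is hypothetical, and the formula is inferred from the numbers rather than taken from the benchmark scripts.

```c
/* Percentage improvement as inferred from the reported figures:
   positive when the patched timing is lower than the master timing,
   negative when the patched version is slower.  */
static double
improvement_percent (double master, double patched)
{
  return (master - patched) / master * 100.0;
}
```

A negative value, as in some of the coshf rows, therefore means the patched implementation took longer than master on that target.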
2024-12-18math: Use sinhf from CORE-MATHAdhemerval Zanella26-139/+124
The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic sinhf. The code was adapted to glibc style and to use the definitions of math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                 master     patched  improvement
x86_64                 52.6819     49.1489        6.71%
x86_64v2               49.1162     42.9447       12.57%
x86_64v3               46.9732     39.9157       15.02%
i686                  141.1470    129.6410        8.15%
aarch64 (Neoverse)     20.8539     17.1288       17.86%
power10                14.5258      9.1906       36.73%

reciprocal-throughput   master     patched  improvement
x86_64                 27.5553     23.9395       13.12%
x86_64v2               21.6423     20.3219        6.10%
x86_64v3               21.4842     16.0224       25.42%
i686                   87.9709     86.1626        2.06%
aarch64 (Neoverse)     15.1919     12.2744       19.20%
power10                 7.2188      5.2611       27.12%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18math: Use coshf from CORE-MATHAdhemerval Zanella27-146/+114
The CORE-MATH implementation is correctly rounded (for any rounding mode), although it shows worse performance than the current one. The current implementation's performance comes mainly from the internal usage of the optimized expf implementation, and it shows a maximum of 2 ULPs for FE_TONEAREST and 3 for other rounding modes. The code was adapted to glibc style and to use the definitions of math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                 master     patched  improvement
x86_64                 40.6995     49.0737      -20.58%
x86_64v2               40.5841     44.3604       -9.30%
x86_64v3               39.3879     39.7502       -0.92%
i686                  112.3380    129.8570      -15.59%
aarch64 (Neoverse)     18.6914     17.0946        8.54%
power10                11.1343      9.3245       16.25%

reciprocal-throughput   master     patched  improvement
x86_64                 18.6471     24.1077      -29.28%
x86_64v2               17.7501     20.2946      -14.34%
x86_64v3               17.8262     17.1877        3.58%
i686                   64.1454     86.5645      -34.95%
aarch64 (Neoverse)     9.77226     12.2314      -25.16%
power10                 4.0200      5.3316      -32.63%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18math: Use atanhf from CORE-MATHAdhemerval Zanella28-252/+161
The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic atanhf. The code was adapted to glibc style and to use the definitions of math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                 master     patched  improvement
x86_64                 59.4930     45.8568       22.92%
x86_64v2               59.5705     45.5804       23.48%
x86_64v3               53.1838     37.7155       29.08%
i686                   169.354    133.5940       21.12%
aarch64 (Neoverse)     26.0781     16.9829       34.88%
power10                15.6591     10.7623       31.27%

reciprocal-throughput   master     patched  improvement
x86_64                 23.5903     18.5766       21.25%
x86_64v2               22.6489     18.2683       19.34%
x86_64v3               19.0401     13.9474       26.75%
i686                   97.6034    107.3260       -9.96%
aarch64 (Neoverse)     15.3664     9.57846       37.67%
power10                 6.8877      4.6242       32.86%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18math: Use atan2f from CORE-MATHAdhemerval Zanella29-296/+269
The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic atan2f. The code was adapted to glibc style and to use the definitions of math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                 master     patched  improvement
x86_64                 68.1175     69.2014       -1.59%
x86_64v2               66.9884     66.0081        1.46%
x86_64v3               57.7034     61.6407       -6.82%
i686                  189.8690    152.7560       19.55%
aarch64 (Neoverse)     32.6151     24.5382       24.76%
power10                21.7282     17.1896       20.89%

reciprocal-throughput   master     patched  improvement
x86_64                 34.5202     31.6155        8.41%
x86_64v2               32.6379     30.3372        7.05%
x86_64v3               34.3677     23.6455       31.20%
i686                  157.7290     75.8308       51.92%
aarch64 (Neoverse)     27.7788     16.2671       41.44%
power10                15.5715      8.1588       47.60%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18math: Use atanf from CORE-MATHAdhemerval Zanella27-225/+124
The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic atanf. The code was adapted to glibc style and to use the definitions of math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                 master     patched  improvement
x86_64                 56.8265     53.6842        5.53%
x86_64v2               54.8177     53.6842        2.07%
x86_64v3               46.2915     48.7034       -5.21%
i686                  158.3760    108.9560       31.20%
aarch64 (Neoverse)      21.687     20.5893        5.06%
power10                13.1903     13.5012       -2.36%

reciprocal-throughput   master     patched  improvement
x86_64                 16.6787     16.7601       -0.49%
x86_64v2               16.6983     16.7601       -0.37%
x86_64v3               16.2268     12.1391       25.19%
i686                  138.6840     36.0640       74.00%
aarch64 (Neoverse)     11.8012     10.3565       12.24%
power10                 5.3212      4.2894       19.39%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18math: Use asinhf from CORE-MATHAdhemerval Zanella28-270/+184
The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic asinhf. The code was adapted to glibc style and to use the definitions of math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                 master     patched  improvement
x86_64                 64.5128     56.9717       11.69%
x86_64v2               63.3065     57.2666        9.54%
x86_64v3               62.8719     51.4170       18.22%
i686                  189.1630     137.635       27.24%
aarch64 (Neoverse)     25.3551     20.5757       18.85%
power10                17.9712     13.3302       25.82%

reciprocal-throughput   master     patched  improvement
x86_64                 20.0844     15.4731       22.96%
x86_64v2               19.2919     15.4000       20.17%
x86_64v3               18.7226     11.9009       36.44%
i686                  103.7670     80.2681       22.65%
aarch64 (Neoverse)     12.5005     8.68969       30.49%
power10                 7.2220     5.03617       30.27%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18math: Use asinf from CORE-MATHAdhemerval Zanella27-212/+122
The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic asinf. The code was adapted to glibc style and to use the definitions of math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                 master     patched  improvement
x86_64                 42.8237     35.2460       17.70%
x86_64v2               43.3711     35.9406       17.13%
x86_64v3               35.0335     30.5744       12.73%
i686                  213.8780    104.4710       51.15%
aarch64 (Neoverse)     17.2937     13.6025       21.34%
power10                12.0227      7.4241       38.25%

reciprocal-throughput   master     patched  improvement
x86_64                 13.6770     15.5231      -13.50%
x86_64v2               13.8722     16.0446      -15.66%
x86_64v3               13.6211     13.2753        2.54%
i686                  186.7670     45.4388       75.67%
aarch64 (Neoverse)     9.96089     9.39285        5.70%
power10                 4.9862      3.7819       24.15%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18math: Use acoshf from CORE-MATHAdhemerval Zanella26-225/+196
The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic acoshf. The code was adapted to glibc style and to use the definitions of math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                 master     patched  improvement
x86_64                 61.2471     58.7742        4.04%
x86_64-v2              62.6519     59.0523        5.75%
x86_64-v3              58.7408     50.1393       14.64%
aarch64                24.8580     21.3317       14.19%
power10                17.0469     13.1345       22.95%

reciprocal-throughput   master     patched  improvement
x86_64                 16.1618     15.1864        6.04%
x86_64-v2              15.7729     14.7563        6.45%
x86_64-v3              14.1669     11.9568       15.60%
aarch64                 10.911      9.5486       12.49%
power10                6.38196     5.06734       20.60%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18math: Use acosf from CORE-MATHAdhemerval Zanella27-174/+136
The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic acosf. The code was adapted to glibc style and to use the definitions of math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                 master     patched  improvement
x86_64                 52.5098     36.6312       30.24%
x86_64v2               53.0217     37.3091       29.63%
x86_64v3               42.8501     32.3977       24.39%
i686                  207.3960    109.4000       47.25%
aarch64                21.3694     13.7871       35.48%
power10                14.5542      7.2891       49.92%

reciprocal-throughput   master     patched  improvement
x86_64                 14.1487     15.9508      -12.74%
x86_64v2               14.3293     16.1899      -12.98%
x86_64v3               13.6563     12.6161        7.62%
i686                  158.4060     45.7354       71.13%
aarch64                12.5515     9.19233       26.76%
power10                 5.7868      3.3487       42.13%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18math: Fix the expected carg (inf) resultsAdhemerval Zanella3-28/+261
The predefined pi constants are not the expected values for carg in non-default rounding modes (similar to atan). Instead use the autogenerated values.
2024-12-18math: Fix the expected atan2f (inf) resultsAdhemerval Zanella3-56/+2366
The predefined pi constants are not the expected values for atan2 in non-default rounding modes. Instead use the autogenerated values. Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18math: Fix the expected atanf (inf) resultsAdhemerval Zanella3-2/+52
The M_PI_2 (lit_pi_2_d) constant is not the expected value for atanf on non-default rounding modes. Instead use the autogenerated value.
2024-12-18math: Add inf support on gen-auto-libm-tests.cAdhemerval Zanella1-4/+27
For some correctly rounded functions where an infinite input produces a finite result (like atanf), comparing to a predefined constant does not yield the expected result in all rounding modes. The most straightforward way to handle this is to get the expected result from MPFR, which handles all the rounding modes.
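The effect can be seen with pi/2, the mathematical value of atan at positive infinity. It lies strictly between two adjacent float values, so a correctly rounded atanf (inf) has to return the lower neighbour when rounding downward or toward zero, while a constant such as M_PI_2 converted to float always yields the upper one. The sketch below encodes this with hexadecimal literals; the macro, enum, and function names are hypothetical and this is not code from gen-auto-libm-tests.c.

```c
/* pi/2 to double precision, and the two float values adjacent to it.  */
#define PI_2_DOUBLE      0x1.921fb54442d18p+0
#define PI_2_FLOAT_BELOW 0x1.921fb4p+0f
#define PI_2_FLOAT_ABOVE 0x1.921fb6p+0f

/* Rounding directions, named for this sketch only.  */
enum rounding { TO_NEAREST, DOWNWARD, UPWARD, TOWARD_ZERO };

/* Expected float result of a correctly rounded atanf (+inf), that is,
   pi/2 rounded to float in the requested direction.  */
static float
expected_atanf_inf (enum rounding mode)
{
  switch (mode)
    {
    case DOWNWARD:
    case TOWARD_ZERO:
      return PI_2_FLOAT_BELOW;
    case UPWARD:
    case TO_NEAREST:
    default:
      return PI_2_FLOAT_ABOVE;
    }
}
```

Because the round-to-nearest value happens to be the upper neighbour, a single predefined constant matches only the to-nearest and upward cases, which is why per-mode expected values are needed.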
2024-12-18math: Fix spurious-divbyzero flag nameAdhemerval Zanella1-1/+1
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18benchtests: Add tanhf benchmarkAdhemerval Zanella2-0/+2006
Random inputs in the range [-10,10]. Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18benchtests: Add sinhf benchmarkAdhemerval Zanella2-0/+2006
Random inputs in the range [-10,10]. Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18benchtests: Add coshf benchmarkAdhemerval Zanella2-0/+2006
Random inputs in the range [-10,10]. Reviewed-by: DJ Delorie <dj@redhat.com>