aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-04-04Cleanup __tls_get_addr on alpha/microblaze localplt.dataAdhemerval Zanella2-4/+0
They are not required. Checked with a make check for both ABIs.
2024-04-04arm: Remove ld.so __tls_get_addr plt usageAdhemerval Zanella2-3/+2
Use the hidden alias instead. Checked on arm-linux-gnueabihf.
2024-04-04aarch64: Remove ld.so __tls_get_addr plt usageAdhemerval Zanella2-3/+2
Use the hidden alias instead. Checked on aarch64-linux-gnu.
2024-04-04math: x86 trunc traps when FE_INEXACT is enabled (BZ 31603)Adhemerval Zanella6-93/+87
The implementations of trunc functions using x87 floating point (i386 and x86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-04math: x86 floor traps when FE_INEXACT is enabled (BZ 31601)Adhemerval Zanella9-140/+144
The implementations of floor functions using x87 floating point (i386 and 86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-04math: x86 ceill traps when FE_INEXACT is enabled (BZ 31600)Adhemerval Zanella10-141/+181
The implementations of ceil functions using x87 floating point (i386 and x86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-04aarch64/fpu: Add vector variants of erfcJoe Ramsay17-1/+4897
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04aarch64/fpu: Add vector variants of tanhJoe Ramsay16-27/+405
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04aarch64/fpu: Add vector variants of sinhJoe Ramsay16-0/+572
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04aarch64/fpu: Add vector variants of atanhJoe Ramsay14-0/+288
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04aarch64/fpu: Add vector variants of asinhJoe Ramsay14-0/+489
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04aarch64/fpu: Add vector variants of acoshJoe Ramsay19-0/+653
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04aarch64/fpu: Add vector variants of coshJoe Ramsay18-1/+648
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04aarch64/fpu: Add vector variants of erfJoe Ramsay19-1/+4531
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04misc: Add support for Linux uio.h RWF_NOAPPEND flagStafford Horne3-1/+9
In Linux 6.9 a new flag is added to allow for Per-io operations to disable append mode even if a file was opened with the flag O_APPEND. This is done with the new RWF_NOAPPEND flag. This caused two test failures as these tests expected the flag 0x00000020 to be unused. Adding the flag definition now fixes these tests on Linux 6.9 (v6.9-rc1). FAIL: misc/tst-preadvwritev2 FAIL: misc/tst-preadvwritev64v2 This patch adds the flag, adjusts the test and adds details to documentation. Link: https://lore.kernel.org/all/20200831153207.GO3265@brightrain.aerifal.cx/ Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-04-03manual: significand() uses FLT_RADIX, not 2Alejandro Colomar1-1/+1
It's implemented using scalb(), which uses FLT_RADIX, AFAIK. Link: <https://lore.kernel.org/linux-man/ZeYKUOKYS7G90SaV@debian/T/#mf21ab57e16b92eb6be6c7df79dc0eb43d4454056> Reported-by: Morten Welinder <mwelinder@gmail.com> Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Vincent Lefevre <vincent@vinc17.net> Cc: DJ Delorie <dj@redhat.com> Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alejandro Colomar <alx@kernel.org> Reviewed-by: DJ Delorie <dj@redhat.com>
2024-04-03manual: Clarify return value of cbrt(3)Alejandro Colomar1-2/+6
Link: <https://lore.kernel.org/linux-man/ZeYKUOKYS7G90SaV@debian/T/#mff0ab388000c6afdb5e5162804d4a0073de481de> Reported-by: Morten Welinder <mwelinder@gmail.com> Cowritten-by: Morten Welinder <mwelinder@gmail.com> Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Vincent Lefevre <vincent@vinc17.net> Cc: DJ Delorie <dj@redhat.com> Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alejandro Colomar <alx@kernel.org> Reviewed-by: DJ Delorie <dj@redhat.com>
2024-04-03manual: floor(log2(fabs(x))) has rounding errorsAlejandro Colomar1-2/+5
Link: <https://inbox.sourceware.org/libc-alpha/20240305150131.GD3653@qaa.vinc17.org/T/#m3ceecda630012995339bcc5448fee451cf277a8b> Reported-by: Vincent Lefevre <vincent@vinc17.net> Suggested-by: Vincent Lefevre <vincent@vinc17.net> Reviewed-by: DJ Delorie <dj@redhat.com> Cc: Morten Welinder <mwelinder@gmail.com> Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alejandro Colomar <alx@kernel.org>
2024-04-03manual: logb(x) is floor(log2(fabs(x)))Alejandro Colomar1-1/+1
log2(3) doesn't accept negative input, but it seems logb(3) does accept it. Link: <https://lore.kernel.org/linux-man/ZeYKUOKYS7G90SaV@debian/T/#u> Reported-by: Morten Welinder <mwelinder@gmail.com> Reviewed-by: DJ Delorie <dj@redhat.com> Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Vincent Lefevre <vincent@vinc17.net> Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alejandro Colomar <alx@kernel.org>
2024-04-02powerpc: Add missing arch flags on rounding ifunc variantsAdhemerval Zanella1-0/+6
The ifunc variants now uses the powerpc implementation which in turn uses the compiler builtin. Without the proper -mcpu switch the builtin does not generate the expected optimization. Checked on powerpc-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
2024-04-02math: Reformat Makefile.Adhemerval Zanella1-159/+685
Reflow all long lines adding comment terminators. Sort all reflowed text using scripts/sort-makefile-lines.py. No code generation changes observed in binary artifacts. No regressions on x86_64 and i686. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-04-02Always define __USE_TIME_BITS64 when 64 bit time_t is usedAdhemerval Zanella75-182/+178
It was raised on libc-help [1] that some Linux kernel interfaces expect the libc to define __USE_TIME_BITS64 to indicate the time_t size for the kABI. Different than defined by the initial y2038 design document [2], the __USE_TIME_BITS64 is only defined for ABIs that support more than one time_t size (by defining the _TIME_BITS for each module). The 64 bit time_t redirects are now enabled using a different internal define (__USE_TIME64_REDIRECTS). There is no expected change in semantic or code generation. Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and arm-linux-gnueabi [1] https://sourceware.org/pipermail/libc-help/2024-January/006557.html [2] https://sourceware.org/glibc/wiki/Y2038ProofnessDesign Reviewed-by: DJ Delorie <dj@redhat.com>
2024-04-01benchtests: Improve benchtests for strstrAdhemerval Zanella1-76/+273
Use same strategy as bench-strstr.c (93eebae5168e5cf2 and 80b2bfb53504) and use json_ctx for output to help standardize format across all benchtests. Reviewed-by: Arjun Shankar <arjun@redhat.com>
2024-03-27x86_64: Remove avx512 strstr implementationAdhemerval Zanella4-248/+4
As indicated in a recent thread, this it is a simple brute-force algorithm that checks the whole needle at a matching character pair (and does so 1 byte at a time after the first 64 bytes of a needle). Also it never skips ahead and thus can match at every haystack position after trying to match all of the needle, which generic implementation avoids. As indicated by Wilco, a 4x larger needle and 16x larger haystack gives a clear 65x slowdown both basic_strstr and __strstr_avx512: "ifuncs": ["basic_strstr", "twoway_strstr", "__strstr_avx512", "__strstr_sse2_unaligned", "__strstr_generic"], { "len_haystack": 65536, "len_needle": 1024, "align_haystack": 0, "align_needle": 0, "fail": 1, "desc": "Difficult bruteforce needle", "timings": [4.0948e+07, 15094.5, 3.20818e+07, 108558, 10839.2] }, { "len_haystack": 1048576, "len_needle": 4096, "align_haystack": 0, "align_needle": 0, "fail": 1, "desc": "Difficult bruteforce needle", "timings": [2.69767e+09, 100797, 2.08535e+09, 495706, 82666.9] } PS: I don't have an AVX512 capable machine to verify this issues, but skimming through the code it does seems to follow what Wilco has described. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-03-27signal: Avoid system signal disposition to interfere with testsAdhemerval Zanella2-0/+7
Both tst-sigset2 and tst-signal1 expectes that SIGINT disposition is set to SIG_DFL.
2024-03-25RISC-V: Fix the static-PIE non-relocated object checkPalmer Dabbelt1-1/+1
The value of l_scope is only valid post relocation, so this original check was triggering undefined behavior. Instead just directly check to see if the object has been relocated, at which point using l_scope is safe. Reported-by: Andreas Schwab <schwab@suse.de> Closes: BZ #31317 Fixes: e0590f41fe ("RISC-V: Enable static-pie.") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-23htl: Implement some support for TLS_DTV_AT_TPSergey Bugaev3-2/+25
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-19-bugaevc@gmail.com>
2024-03-23htl: Respect GL(dl_stack_flags) when allocating stacksSergey Bugaev2-2/+11
Previously, HTL would always allocate non-executable stacks. This has never been noticed, since GNU Mach on x86 ignores VM_PROT_EXECUTE and makes all pages implicitly executable. Since GNU Mach on AArch64 supports non-executable pages, HTL forgetting to pass VM_PROT_EXECUTE immediately breaks any code that (unfortunately, still) relies on executable stacks. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-7-bugaevc@gmail.com>
2024-03-23hurd: Use the RETURN_ADDRESS macroSergey Bugaev1-1/+1
This gives us PAC stripping on AArch64. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-6-bugaevc@gmail.com>
2024-03-23hurd: Disable Prefer_MAP_32BIT_EXEC on non-x86_64 for nowSergey Bugaev2-2/+2
While we could support it on any architecture, the tunable is currently only defined on x86_64. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-5-bugaevc@gmail.com>
2024-03-23Allow glibc to be compiled without EXEC_PAGESIZESergey Bugaev3-2/+8
We would like to avoid statically defining any specific page size on aarch64-gnu, and instead make sure that everything uses the dynamic page size, available via vm_page_size and GLRO(dl_pagesize). There are currently a few places in glibc that require EXEC_PAGESIZE to be defined. Per Roland's suggestion [0], drop the static GLRO(dl_pagesize) initializers (for now, only if EXEC_PAGESIZE is not defined), and don't require EXEC_PAGESIZE definition for libio to enable mmap usage. [0]: https://mail.gnu.org/archive/html/bug-hurd/2011-10/msg00035.html Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-4-bugaevc@gmail.com>
2024-03-23hurd: Stop relying on VM_MAX_ADDRESSSergey Bugaev1-4/+4
We'd like to avoid committing to a specific size of virtual address space (i.e. the value of VM_AARCH64_T0SZ) on AArch64. While the current version of GNU Mach still exports VM_MAX_ADDRESS for compatibility, we should try to avoid relying on it when we can. This piece of logic in _hurdsig_getenv () doesn't actually care about the size of user- accessible virtual address space, it just wants to preempt faults on any addresses starting from the value of the P pointer and above. So, use (unsigned long int) -1 instead of VM_MAX_ADDRESS. While at it, change the casts to (unsigned long int) and not just (long int), since the type of struct hurd_signal_preemptor.{first,last} is unsigned long int. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-3-bugaevc@gmail.com>
2024-03-23hurd: Move internal functions to internal headerSergey Bugaev2-87/+78
Move _hurd_self_sigstate (), _hurd_critical_section_lock (), and _hurd_critical_section_unlock () inline implementations (that were already guarded by #if defined _LIBC) to the internal version of the header. While at it, add <tls.h> to the includes, and use __LIBC_NO_TLS () unconditionally. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-2-bugaevc@gmail.com>
2024-03-23stdlib: Fix tst-makecontext2 log when swapcontext failsStafford Horne1-1/+1
The log incorrectly prints, setcontext failed. Update this to indicate that actually swapcontext failed. Signed-off-by: Stafford Horne <shorne@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-03-22or1k: Add prctl wrapper to unwrap variadic argsStafford Horne1-0/+42
On OpenRISC variadic functions and regular functions have different calling conventions so this wrapper is needed to translate. This wrapper is copied from x86_64/x32. I don't know the build system enough to find a cleaner way to share the code between x86_64/x32 and or1k (maybe Implies?), so I went with the straight copy. This fixes test failures: misc/tst-prctl nptl/tst-setgetname
2024-03-22or1k: Only define fpu rouding and exceptions with hard-floatStafford Horne1-0/+19
This test failure: math/test-fenv If rounding mode and exception macros are defined then the fenv tests run and always fail. This patch adds an ifdef using the __or1k_hard_float__ macro provided by gcc to avoid defining these fenv macros when they cnnot be used. This is similar to what is done in csky. Note, I will post the or1k hard-float support soon. So, I prefer to leave the hard-float bits here for now.
2024-03-22or1k: Update libm test ulpsStafford Horne1-0/+1
To fix test failures: FAIL: math/test-float-hypot FAIL: math/test-float32-hypot
2024-03-21AArch64: Check kernel version for SVE ifuncsWilco Dijkstra5-2/+53
Old Linux kernels disable SVE after every system call. Calling the SVE-optimized memcpy afterwards will then cause a trap to reenable SVE. As a result, applications with a high use of syscalls may run slower with the SVE memcpy. This is true for kernels between 4.15.0 and before 6.2.0, except for 5.14.0 which was patched. Avoid this by checking the kernel version and selecting the SVE ifunc on modern kernels. Parse the kernel version reported by uname() into a 24-bit kernel.major.minor value without calling any library functions. If uname() is not supported or if the version format is not recognized, assume the kernel is modern. Tested-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-03-19powerpc: Placeholder and infrastructure/build support to add Power11 ↵Amrita H S15-5/+27
related changes. The following three changes have been added to provide initial Power11 support. 1. Add the directories to hold Power11 files. 2. Add support to select Power11 libraries based on AT_PLATFORM. 3. Let submachine=power11 be set automatically. Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
2024-03-19powerpc: Add HWCAP3/HWCAP4 data to TCB for Power Architecture.Manjunath Matti12-19/+74
This patch adds a new feature for powerpc. In order to get faster access to the HWCAP3/HWCAP4 masks, similar to HWCAP/HWCAP2 (i.e. for implementing __builtin_cpu_supports() in GCC) without the overhead of reading them from the auxiliary vector, we now reserve space for them in the TCB. This is an ABI change for GLIBC 2.39. Suggested-by: Peter Bergner <bergner@linux.ibm.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
2024-03-19elf: Enable TLS descriptor tests on aarch64Adhemerval Zanella5-33/+40
The aarch64 uses 'trad' for traditional tls and 'desc' for tls descriptors, but unlike other targets it defaults to 'desc'. The gnutls2 configure check does not set aarch64 as an ABI that uses TLS descriptors, which then disable somes stests. Also rename the internal machinery fron gnu2 to tls descriptors. Checked on aarch64-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-03-19arm: Update _dl_tlsdesc_dynamic to preserve caller-saved registers (BZ 31372)Adhemerval Zanella10-15/+250
ARM _dl_tlsdesc_dynamic slow path has two issues: * The ip/r12 is defined by AAPCS as a scratch register, and gcc is used to save the stack pointer before on some function calls. So it should also be saved/restored as well. It fixes the tst-gnu2-tls2. * None of the possible VFP registers are saved/restored. ARM has the additional complexity to have different VFP bank sizes (depending of VFP support by the chip). The tst-gnu2-tls2 test is extended to check for VFP registers, although only for hardfp builds. Different than setcontext, _dl_tlsdesc_dynamic does not have HWCAP_ARM_IWMMXT (I don't have a way to properly test it and it is almost a decade since newer hardware was released). With this patch there is no need to mark tst-gnu2-tls2 as XFAIL. Checked on arm-linux-gnueabihf. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-03-19Ignore undefined symbols for -mtls-dialect=gnu2Adhemerval Zanella2-2/+2
So it does not fail for arm config that defaults to -mtp=soft (which issues a call to __aeabi_read_tp). Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-03-19Add tst-gnu2-tls2mod1 to test-internal-extrasAndreas Schwab1-0/+2
That allows sysdeps/x86_64/tst-gnu2-tls2mod1.S to use internal headers. Fixes: 717ebfa85c ("x86-64: Allocate state buffer space for RDI, RSI and RBX")
2024-03-18x86-64: Allocate state buffer space for RDI, RSI and RBXH.J. Lu3-11/+147
_dl_tlsdesc_dynamic preserves RDI, RSI and RBX before realigning stack. After realigning stack, it saves RCX, RDX, R8, R9, R10 and R11. Define TLSDESC_CALL_REGISTER_SAVE_AREA to allocate space for RDI, RSI and RBX to avoid clobbering saved RDI, RSI and RBX values on stack by xsave to STATE_SAVE_OFFSET(%rsp). +==================+<- stack frame start aligned at 8 or 16 bytes | |<- RDI saved in the red zone | |<- RSI saved in the red zone | |<- RBX saved in the red zone | |<- paddings for stack realignment of 64 bytes |------------------|<- xsave buffer end aligned at 64 bytes | |<- | |<- | |<- |------------------|<- xsave buffer start at STATE_SAVE_OFFSET(%rsp) | |<- 8-byte padding for 64-byte alignment | |<- 8-byte padding for 64-byte alignment | |<- R11 | |<- R10 | |<- R9 | |<- R8 | |<- RDX | |<- RCX +==================+<- RSP aligned at 64 bytes Define TLSDESC_CALL_REGISTER_SAVE_AREA, the total register save area size for all integer registers by adding 24 to STATE_SAVE_OFFSET since RDI, RSI and RBX are saved onto stack without adjusting stack pointer first, using the red-zone. This fixes BZ #31501. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2024-03-18riscv: Update nofpu libm test ulpsDarius Rad1-0/+1
Fix two test failures. Reviewed-by: Florian Weimer <fweimer@redhat.com>
2024-03-15Add STATX_MNT_ID_UNIQUE from Linux 6.8 to bits/statx-generic.hJoseph Myers1-0/+1
Linux 6.8 adds a new STATX_MNT_ID_UNIQUE constant. Add it to glibc's bits/statx-generic.h. Tested for x86_64.
2024-03-15linux: Use rseq area unconditionally in sched_getcpu (bug 31479)Florian Weimer1-8/+0
Originally, nptl/descr.h included <sys/rseq.h>, but we removed that in commit 2c6b4b272e6b4d07303af25709051c3e96288f2d ("nptl: Unconditionally use a 32-byte rseq area"). After that, it was not ensured that the RSEQ_SIG macro was defined during sched_getcpu.c compilation that provided a definition. This commit always checks the rseq area for CPU number information before using the other approaches. This adds an unnecessary (but well-predictable) branch on architectures which do not define RSEQ_SIG, but its cost is small compared to the system call. Most architectures that have vDSO acceleration for getcpu also have rseq support. Fixes: 2c6b4b272e6b4d07303af25709051c3e96288f2d Fixes: 1d350aa06091211863e41169729cee1bca39f72f Reviewed-by: Arjun Shankar <arjun@redhat.com>
2024-03-14aarch64: fix check for SVE support in assemblerSzabolcs Nagy2-4/+6
Due to GCC bug 110901 -mcpu can override -march setting when compiling asm code and thus a compiler targetting a specific cpu can fail the configure check even when binutils gas supports SVE. The workaround is that explicit .arch directive overrides both -mcpu and -march, and since that's what the actual SVE memcpy uses the configure check should use that too even if the GCC issue is fixed independently. Reviewed-by: Florian Weimer <fweimer@redhat.com>
2024-03-13Update kernel version to 6.8 in header constant testsJoseph Myers3-4/+4
This patch updates the kernel version in the tests tst-mman-consts.py, tst-mount-consts.py and tst-pidfd-consts.py to 6.8. (There are no new constants covered by these tests in 6.8 that need any other header changes.) Tested with build-many-glibcs.py.