Age | Commit message | Author | Files | Lines
2023-04-04 | compare_strings.py : Add --gmean flag | Nisha Menon | 1 | -2/+18
To calculate geometric mean for string benchmark results. Signed-off-by: Nisha Poyarekar <nisha.s.menon@gmail.com>
2023-04-04 | x86/dl-cacheinfo: remove unsused parameter from handle_amd | Andreas Schwab | 1 | -36/+30
Also replace an unreachable assert with __builtin_unreachable.
2023-04-03 | powerpc: Disable stack protector in early static initialization | Adhemerval Zanella | 1 | -0/+3
Similar to fb95c316382679c0826cc8399760977cd95f15c9, also disable the stack protector for string-ppc64.c (pulled into rtld as the default string implementation). Checked on powerpc64-linux-gnu.
2023-04-03 | nptl: Fix tst-cancel30 on sparc64 | Adhemerval Zanella | 1 | -3/+1
As indicated by sparc kernel-features.h, even though sparc64 defines __NR_pause, it is not supported (ENOSYS). Always use ppoll or the 64 bit time_t variant instead.
2023-04-03 | math: Remove the error handling wrapper from fmod and fmodf | Adhemerval Zanella Netto | 38 | -13/+172
The error handling is moved to the sysdeps/ieee754 version with no SVID support. The compatibility symbol versions still use the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

The ia64 implementation is unchanged, since it still uses the arch-specific __libm_error_region. For both i686 and m68k, which provide arch-specific implementations, wrappers are added so that no new symbols are added (which would require changing the implementations).

It shows a small improvement. The results for fmod:

Architecture     | Input           | master   | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals      | 12.5049  | 9.40992
x86_64 (Ryzen 9) | normal          | 296.939  | 296.738
x86_64 (Ryzen 9) | close-exponents | 16.0244  | 13.119
aarch64 (N1)     | subnormal       | 6.81778  | 4.33313
aarch64 (N1)     | normal          | 155.620  | 152.915
aarch64 (N1)     | close-exponents | 8.21306  | 5.76138
armhf (N1)       | subnormal       | 15.1083  | 14.5746
armhf (N1)       | normal          | 244.833  | 241.738
armhf (N1)       | close-exponents | 21.8182  | 22.457

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-04-03 | math: Improve fmodf | Adhemerval Zanella Netto | 3 | -93/+191
This uses a new algorithm similar to one already proposed earlier [1]. With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers), the simplest implementation is:

    mx * 2^ex == 2 * mx * 2^(ex - 1)

    while (ex > ey)
      {
        mx *= 2;
        --ex;
        mx %= my;
      }

With mx/my being the mantissas of single-precision floating-point values, each step of the argument reduction can be improved to handle 8 bits (the bit width of uint32_t minus the sum of MANTISSA_WIDTH and the sign bit):

    while (ex > ey)
      {
        mx <<= 8;
        ex -= 8;
        mx %= my;
      }

The implementation uses builtin clz and ctz, along with shifts to convert hx/hy back to floats. Unlike the original patch, this patch assumes the modulo/divide operation is slow, so it uses multiplication by inverse values.

I see the following performance improvements using fmod benchtests (results only show the 'mean' value):

Architecture     | Input           | master   | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals      | 17.2549  | 12.0318
x86_64 (Ryzen 9) | normal          | 85.4096  | 49.9641
x86_64 (Ryzen 9) | close-exponents | 19.1072  | 15.8224
aarch64 (N1)     | subnormal       | 10.2182  | 6.81778
aarch64 (N1)     | normal          | 60.0616  | 20.3667
aarch64 (N1)     | close-exponents | 11.5256  | 8.39685

I also see similar improvements on arm-linux-gnueabihf when running on the N1 aarch64 chips, where it uses a lot of soft-fp implementations (for modulo and multiplication):

Architecture     | Input           | master   | patch
-----------------|-----------------|----------|--------
armhf (N1)       | subnormal       | 11.6662  | 10.8955
armhf (N1)       | normal          | 69.2759  | 34.1524
armhf (N1)       | close-exponents | 13.6472  | 18.2131

Instead of the math_private.h definitions, I used math_config.h, which is used by newer math implementations.

Co-authored-by: kirill <kirill.okhotnikov@gmail.com>
[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-04-03 | math: Improve fmod | Adhemerval Zanella Netto | 3 | -96/+222
This uses a new algorithm similar to one already proposed earlier [1]. With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers), the simplest implementation is:

    mx * 2^ex == 2 * mx * 2^(ex - 1)

    while (ex > ey)
      {
        mx *= 2;
        --ex;
        mx %= my;
      }

With mx/my being the mantissas of double-precision floating-point values, each step of the argument reduction can be improved to handle 11 bits (the bit width of uint64_t minus the sum of MANTISSA_WIDTH and the sign bit):

    while (ex > ey)
      {
        mx <<= 11;
        ex -= 11;
        mx %= my;
      }

The implementation uses builtin clz and ctz, along with shifts to convert hx/hy back to doubles. Unlike the original patch, this patch assumes the modulo/divide operation is slow, so it uses multiplication by inverse values.

I see the following performance improvements using fmod benchtests (results only show the 'mean' value):

Architecture     | Input           | master   | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals      | 19.1584  | 12.5049
x86_64 (Ryzen 9) | normal          | 1016.51  | 296.939
x86_64 (Ryzen 9) | close-exponents | 18.4428  | 16.0244
aarch64 (N1)     | subnormal       | 11.153   | 6.81778
aarch64 (N1)     | normal          | 528.649  | 155.62
aarch64 (N1)     | close-exponents | 11.4517  | 8.21306

I also see similar improvements on arm-linux-gnueabihf when running on the N1 aarch64 chips, where it uses a lot of soft-fp implementations (for modulo, clz, ctz, and multiplication):

Architecture     | Input           | master   | patch
-----------------|-----------------|----------|--------
armhf (N1)       | subnormal       | 15.908   | 15.1083
armhf (N1)       | normal          | 837.525  | 244.833
armhf (N1)       | close-exponents | 16.2111  | 21.8182

Instead of the math_private.h definitions, I used math_config.h, which is used by newer math implementations.

Co-authored-by: kirill <kirill.okhotnikov@gmail.com>
[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
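The reduction loop described in this commit message can be sketched on plain integers. This is only an illustration under stated assumptions, not the code from the patch: the names reduce_bitwise and reduce_blocked are hypothetical, both compute (mx * 2^shift) mod my with shift standing in for ex - ey, and my is assumed to be below 2^53 so that shifting a reduced value left by 11 bits cannot overflow a uint64_t.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per step: returns (mx * 2^shift) mod my.  */
uint64_t
reduce_bitwise (uint64_t mx, unsigned int shift, uint64_t my)
{
  assert (my != 0 && my < (UINT64_C (1) << 53));
  mx %= my;
  while (shift > 0)
    {
      mx <<= 1;
      --shift;
      mx %= my;
    }
  return mx;
}

/* Up to 11 bits per step.  After every modulo, mx < my < 2^53, so
   mx << 11 still fits in 64 bits.  */
uint64_t
reduce_blocked (uint64_t mx, unsigned int shift, uint64_t my)
{
  assert (my != 0 && my < (UINT64_C (1) << 53));
  mx %= my;
  while (shift >= 11)
    {
      mx <<= 11;
      shift -= 11;
      mx %= my;
    }
  mx <<= shift;   /* Remaining 0..10 bits.  */
  return mx % my;
}
```

Both functions return the same value; the blocked variant simply needs about one eleventh as many modulo operations, which is where the speedup for inputs with a large exponent difference would come from.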
2023-04-03 | benchtests: Add fmodf benchmark | Adhemerval Zanella Netto | 2 | -0/+2183
1. Subnormals: 128 inputs.
2. Normal numbers with large exponent difference (|x/y| > 2^8): 1024 inputs between FLT_MIN and FLT_MAX.
3. Close exponents (ey >= -103 and |x/y| < 2^8): 1024 inputs with exponents between -10 and 10.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-04-03 | benchtests: Add fmod benchmark | Adhemerval Zanella Netto | 2 | -0/+2183
Add three different datasets, from random floating-point numbers:

1. Subnormals: 128 inputs.
2. Normal numbers with large exponent difference (|x/y| > 2^52): 1024 inputs between DBL_MIN and DBL_MAX.
3. Close exponents (ey >= -907 and |x/y| < 2^52): 1024 inputs with exponents between -10 and 10.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-04-03 | x86: Set FSGSBASE to active if enabled by kernel | H.J. Lu | 4 | -0/+58
Linux kernel uses AT_HWCAP2 to indicate if FSGSBASE instructions are enabled. If the HWCAP2_FSGSBASE bit in AT_HWCAP2 is set, FSGSBASE instructions can be used in user space. Define dl_check_hwcap2 to set the FSGSBASE feature to active on Linux when the HWCAP2_FSGSBASE bit is set. Add a test to verify that FSGSBASE is active on current kernels. NB: This test will fail if the kernel doesn't set the HWCAP2_FSGSBASE bit in AT_HWCAP2 while fsgsbase shows up in /proc/cpuinfo. Reviewed-by: Florian Weimer <fweimer@redhat.com>
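The bit test this commit message describes might look roughly like the sketch below. It is not the dl_check_hwcap2 code from the patch: the function name is hypothetical, and the constant assumes that HWCAP2_FSGSBASE is bit 1 of AT_HWCAP2 on x86 Linux (real code would take it from the kernel header).

```c
#include <stdbool.h>

/* Assumed value of HWCAP2_FSGSBASE on x86 Linux (bit 1); real code
   would use the definition from the kernel's <asm/hwcap2.h>.  */
#define SKETCH_HWCAP2_FSGSBASE (1UL << 1)

/* Returns true if the given AT_HWCAP2 word says that the kernel has
   enabled FSGSBASE instructions for user space.  */
bool
fsgsbase_enabled_by_kernel (unsigned long hwcap2)
{
  return (hwcap2 & SKETCH_HWCAP2_FSGSBASE) != 0;
}
```

On Linux the argument would typically come from getauxval (AT_HWCAP2), declared in <sys/auxv.h>; it is kept as a parameter here so the sketch stays portable.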
2023-04-03 | x86_64: Fix asm constraints in feraiseexcept (bug 30305) | Florian Weimer | 1 | -2/+2
The divss instruction clobbers its first argument, and the constraints need to reflect that. Fortunately, with GCC 12, generated code does not actually change, so there is no externally visible bug. Suggested-by: Jakub Jelinek <jakub@redhat.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
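To illustrate the kind of constraint involved, here is a minimal sketch, assuming GCC-style extended asm. It is not the feraiseexcept code; the function name sse_divide is hypothetical. Because divss overwrites its destination operand, that operand is declared read-write with "+x" rather than as a plain input.

```c
/* Divide a by b with divss on x86-64; fall back to plain C elsewhere.  */
float
sse_divide (float a, float b)
{
#if defined (__x86_64__) && defined (__GNUC__)
  /* AT&T syntax: "divss src, dst" computes dst = dst / src.  The
     destination is both read and written, hence "+x" for a.  */
  __asm__ ("divss %1, %0" : "+x" (a) : "x" (b));
  return a;
#else
  return a / b;
#endif
}
```

Declaring a only as an input would tell the compiler that its register still holds the old value after the instruction, which is the kind of mismatch the commit message refers to.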
2023-04-03 | manual: Document __wur usage under _FORTIFY_SOURCE | Siddhesh Poyarekar | 1 | -0/+3
The __warn_unused_result__ attribute is only enabled when fortification is enabled. Mention that in the document. The rationale for this is essentially to mitigate against CWE-252: [1] https://cwe.mitre.org/data/definitions/252.html Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Florian Weimer <fweimer@redhat.com>
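As a rough illustration of the behavior being documented, the sketch below gates the attribute on fortification. SKETCH_WUR and copy_checked are hypothetical names used only here; glibc's own __wur macro and its exact condition are not reproduced.

```c
#include <stddef.h>

/* Hypothetical stand-in for __wur: expands to the attribute only when
   fortification is requested.  */
#if defined (_FORTIFY_SOURCE) && _FORTIFY_SOURCE > 0
# define SKETCH_WUR __attribute__ ((__warn_unused_result__))
#else
# define SKETCH_WUR
#endif

/* Copies src into dst (at most n - 1 bytes plus the terminator).
   Returns 0 on success and -1 if the string had to be truncated.  */
SKETCH_WUR int
copy_checked (char *dst, size_t n, const char *src)
{
  size_t i = 0;
  if (n == 0)
    return -1;
  while (src[i] != '\0' && i + 1 < n)
    {
      dst[i] = src[i];
      ++i;
    }
  dst[i] = '\0';
  return src[i] == '\0' ? 0 : -1;
}
```

With fortification enabled, a call that ignores the return value would draw a compiler warning, which is the CWE-252 mitigation the commit message mentions.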
2023-04-03 | hurd: Microoptimize _hurd_self_sigstate () | Sergey Bugaev | 1 | -3/+5
When THREAD_GETMEM is defined with inline assembly, the compiler may not optimize away the two reads of _hurd_sigstate. Help it out a little bit by only reading it once. This also makes for slightly cleaner code. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-32-bugaevc@gmail.com>
2023-04-03 | hurd: Add vm_param.h for x86_64 | Sergey Bugaev | 1 | -0/+24
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-30-bugaevc@gmail.com>
2023-04-03 | hurd: Implement _hurd_longjmp_thread_state for x86_64 | Sergey Bugaev | 1 | -0/+41
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-29-bugaevc@gmail.com>
2023-04-03 | htl: Implement thread_set_pcsptp for x86_64 | Sergey Bugaev | 1 | -0/+73
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-23-bugaevc@gmail.com>
2023-04-03 | x86_64: Add rtld-stpncpy & rtld-strncpy | Sergey Bugaev | 2 | -0/+36
Just like the other existing rtld-str* files, this provides rtld with usable versions of stpncpy and strncpy. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-22-bugaevc@gmail.com>
2023-04-03 | htl: Add tcb-offsets.sym for x86_64 | Sergey Bugaev | 2 | -0/+28
The source code is the same as sysdeps/i386/htl/tcb-offsets.sym, but of course the produced tcb-offsets.h will be different. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-21-bugaevc@gmail.com>
2023-04-03 | hurd: Move a couple of signal-related files to x86 | Sergey Bugaev | 2 | -0/+0
These do not need any changes to be used on x86_64. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-20-bugaevc@gmail.com>
2023-04-03 | hurd: Use uintptr_t for register values in trampoline.c | Sergey Bugaev | 1 | -7/+6
This is more correct, if only because these fields are defined as having the type unsigned int in the Mach headers, so casting them to a signed int and then back is suboptimal. Also, remove an extra reassignment of uesp -- this is another remnant of the ecx kludge. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-16-bugaevc@gmail.com>
2023-04-03 | hurd: Move rtld-strncpy-c.c out of mach/hurd/ | Sergey Bugaev | 1 | -0/+0
There's nothing Mach- or Hurd-specific about it; any port that ends up with rtld pulling in strncpy will need this. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-15-bugaevc@gmail.com>
2023-04-03 | hurd: More 64-bit integer casting fixes | Sergey Bugaev | 2 | -4/+4
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-13-bugaevc@gmail.com>
2023-04-03 | mach, hurd: Drop __libc_lock_self0 | Sergey Bugaev | 3 | -8/+3
This was used for the value of libc-lock's owner when TLS is not yet set up, so THREAD_SELF can not be used. Since the value need not be anything specific -- it just has to be non-NULL -- we can just use a plain constant, such as (void *) 1, for this. This avoids accessing the symbol through GOT, and exporting it from libc.so in the first place. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-12-bugaevc@gmail.com>
2023-04-03 | stdio-common: Fix building when !IS_IN (libc) | Sergey Bugaev | 1 | -0/+2
In this case, _itoa_word () is already defined inline in the header (see sysdeps/generic/_itoa.h), and the second definition causes an error. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-11-bugaevc@gmail.com>
2023-04-03 | hurd: Fix _hurd_setup_sighandler () signature | Sergey Bugaev | 1 | -5/+5
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-10-bugaevc@gmail.com>
2023-04-03 | hurd: Disable O_TRUNC and FS_RETRY_MAGICAL in rtld | Sergey Bugaev | 1 | -2/+5
hurd/lookup-retry.c is compiled into rtld, the dynamic linker/loader. To avoid pulling in file_set_size, file_utimens, tty/ctty stuff, more string/memory code (memmove, strncpy, strcpy), and more strtoul/itoa code, compile out support for O_TRUNC and FS_RETRY_MAGICAL when building hurd/lookup-retry.c for rtld. None of that functionality is useful to rtld during startup anyway. Keep support for FS_RETRY_MAGICAL("/"), since that does not pull in much, and is required for following absolute symlinks. The large amount of extra code being pulled into rtld was noticed by reviewing librtld.map & elf/librtld.os.map in the build tree. It is worth noting that once libc.so is loaded, the real __open, __stat, etc. replace the minimal versions used initially by rtld -- this is especially important in the Hurd port, where the minimal rtld versions do not use the dtable and just pass real Mach port names as fds. Thus, once libc.so is loaded, rtld will gain access to the full __hurd_file_name_lookup_retry () version, complete with FS_RETRY_MAGICAL support, which is important in case the program decides to dlopen ("/proc/self/fd/...") or some such. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-9-bugaevc@gmail.com>
2023-04-03 | hurd: Fix file name in #error | Sergey Bugaev | 1 | -1/+1
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-8-bugaevc@gmail.com>
2023-04-03 | hurd: Swap around two function calls | Sergey Bugaev | 1 | -4/+4
...to keep `sigexc' port initialization in one place, and match what the comments say. No functional change. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-7-bugaevc@gmail.com>
2023-04-03 | hurd: Remove __hurd_threadvar_stack_{offset,mask} | Sergey Bugaev | 6 | -46/+2
No one is or should be using __hurd_threadvar_stack_{offset,mask}, we have proper TLS now. These two remaining variables are never set to anything other than zero, so any code that would try to use them as described would just dereference a zero pointer and crash. So remove them entirely. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-6-bugaevc@gmail.com>
2023-04-03 | hurd: Make exception subcode a long | Sergey Bugaev | 3 | -4/+5
On EXC_BAD_ACCESS, exception subcode is used to pass the faulting memory address, so it needs to be (at least) pointer-sized. Thus, make it into a long. This matches the corresponding change in GNU Mach. Message-Id: <20230319151017.531737-5-bugaevc@gmail.com>
2023-03-31 | time: Fix strftime(3) API regarding nullability | Alejandro Colomar | 1 | -1/+2
strftime(3) doesn't accept null pointers in any of the parameters. Cc: Paul Eggert <eggert@cs.ucla.edu> Signed-off-by: Alejandro Colomar <alx@kernel.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
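A caller-side sketch of that contract, using a hypothetical helper name (format_utc_date), could look as follows: every pointer handed to strftime is checked or known to be non-null beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Formats WHEN as a UTC date (YYYY-MM-DD) into BUF.  Returns the
   number of bytes written, or 0 on failure.  */
size_t
format_utc_date (char *buf, size_t size, time_t when)
{
  /* strftime must not be given null pointers, so check up front.  */
  assert (buf != NULL);
  struct tm *tm = gmtime (&when);
  if (tm == NULL)
    return 0;
  return strftime (buf, size, "%Y-%m-%d", tm);
}
```

The format string is a literal and tm is checked before use, so none of the pointer arguments can be null at the call.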
2023-03-30 | Update arm libm-tests-ulps | Adhemerval Zanella | 1 | -0/+1
For the next test from cf7ffdd8a5f6da55397e10b3860062944312824c.
2023-03-30 | getlogin_r: fix missing fallback if loginuid is unset (bug 30235) | Andreas Schwab | 1 | -4/+1
When /proc/self/loginuid is not set, we should still fall back to using the traditional utmp lookup, instead of failing right away.
2023-03-29 | memalign: Support scanning for aligned chunks. | DJ Delorie | 3 | -28/+390
This patch adds a chunk scanning algorithm to the _int_memalign code path that reduces heap fragmentation by reusing already aligned chunks instead of always looking for chunks of larger sizes and splitting them. The tcache macros are extended to allow removing a chunk from the middle of the list. The goal is to fix the pathological use cases where heaps grow continuously in workloads that are heavy users of memalign. Note that tst-memalign-2 checks for tcache operation, which malloc-check bypasses. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
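The core idea of looking for an already aligned chunk before splitting a larger one can be sketched as below. This is a heavily simplified illustration, not the _int_memalign code: it scans a plain array of addresses (a hypothetical stand-in for malloc's free lists), ignores chunk sizes, and assumes the alignment is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first address that is a multiple of
   ALIGNMENT (which must be a power of two), or -1 if there is none.  */
ptrdiff_t
find_aligned_chunk (const uintptr_t *addrs, size_t count, size_t alignment)
{
  for (size_t i = 0; i < count; ++i)
    if ((addrs[i] & (alignment - 1)) == 0)
      return (ptrdiff_t) i;
  return -1;
}
```

A real allocator would also have to check that the candidate is large enough and then unlink it from its list, which is why the commit extends the tcache macros to remove a chunk from the middle of a list.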
2023-03-29 | malloc: Use C11 atomics on memusage | Adhemerval Zanella | 1 | -82/+111
Checked on x86_64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
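For readers unfamiliar with C11 atomics, usage counters of the kind a memory profiler keeps could be written as in the sketch below. The names and the relaxed memory ordering are assumptions made for illustration; this is not the memusage code itself.

```c
#include <stdatomic.h>
#include <stddef.h>

static _Atomic size_t current_bytes;
static _Atomic size_t peak_bytes;

/* Account for an allocation of N bytes and update the peak.  */
void
record_alloc (size_t n)
{
  size_t now = atomic_fetch_add_explicit (&current_bytes, n,
                                          memory_order_relaxed) + n;
  size_t peak = atomic_load_explicit (&peak_bytes, memory_order_relaxed);
  /* Raise the peak with a compare-and-swap loop; on failure PEAK is
     reloaded with the value another thread stored.  */
  while (now > peak
         && !atomic_compare_exchange_weak_explicit (&peak_bytes, &peak, now,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed))
    ;
}

/* Account for freeing N bytes.  */
void
record_free (size_t n)
{
  atomic_fetch_sub_explicit (&current_bytes, n, memory_order_relaxed);
}

size_t
current_usage (void)
{
  return atomic_load_explicit (&current_bytes, memory_order_relaxed);
}

size_t
peak_usage (void)
{
  return atomic_load_explicit (&peak_bytes, memory_order_relaxed);
}
```

Relaxed ordering is enough here because the counters are independent statistics and no other data is published through them.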
2023-03-29 | Remove --enable-tunables configure option | Adhemerval Zanella Netto | 48 | -513/+75
And make tunables always supported. The configure option was added in glibc 2.25 and some features require it (such as hwcap mask, huge pages support, and lock elision tuning). Removing it also simplifies the build permutations.

Changes from v1:
 * Remove glibc.rtld.dynamic_sort changes, it is orthogonal and needs more discussion.
 * Cleanup more code.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-03-29 | Remove --disable-experimental-malloc option | Adhemerval Zanella | 8 | -39/+4
It has been the default since 2.26 and has bitrotted over the years. When it is used, multiple malloc tests fail:

FAIL: malloc/tst-memalign-2
FAIL: malloc/tst-memalign-2-malloc-hugetlb1
FAIL: malloc/tst-memalign-2-malloc-hugetlb2
FAIL: malloc/tst-memalign-2-mcheck
FAIL: malloc/tst-mxfast-malloc-hugetlb1
FAIL: malloc/tst-mxfast-malloc-hugetlb2
FAIL: malloc/tst-tcfree2
FAIL: malloc/tst-tcfree2-malloc-hugetlb1
FAIL: malloc/tst-tcfree2-malloc-hugetlb2

Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
2023-03-28 | Allow building with --disable-nscd again | Flavio Cruz | 1 | -0/+6
The change 88677348b4de breaks the build with undefined references to the NSCD functions.
2023-03-28 | system: Add "--" after "-c" for sh (BZ #28519) | Joe Simmons-Talbott | 4 | -2/+22
Prevent sh from interpreting a user string as shell options if it starts with '-' or '+'. Since the version of /bin/sh used for testing system() is different from the full-fledged system /bin/sh, add support to it for handling "--" after "-c". Add a testcase to ensure the expected behavior. Signed-off-by: Joe Simmons-Talbott <josimmon@redhat.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
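The argument vector that results from this change can be sketched as follows. The helper name build_sh_argv is hypothetical and the code only builds the vector; it does not reproduce how system() spawns the shell.

```c
#include <stddef.h>

/* Fills ARGV with: sh -c -- COMMAND, terminated by NULL.  The "--"
   ends option parsing, so COMMAND is never read as shell options even
   if it starts with '-' or '+'.  */
void
build_sh_argv (const char *command, const char *argv[5])
{
  argv[0] = "sh";
  argv[1] = "-c";
  argv[2] = "--";
  argv[3] = command;
  argv[4] = NULL;
}
```

Such a vector could then be handed to an exec-family or spawn call for /bin/sh; without the "--" entry, a command string such as "-x foo" might be parsed as options instead of being run.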
2023-03-28 | posix: Fix some crashes in wordexp [BZ #18096] | Julian Squires | 2 | -7/+8
Without these fixes, the first three included tests segfault (on a NULL dereference); the fourth aborts on an assertion, which is itself unnecessary. Signed-off-by: Julian Squires <julian@cipht.net> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2023-03-28 | LoongArch: ldconfig: Add comments for using EF_LARCH_OBJABI_V1 | caiyinyu | 1 | -0/+6
We added Adhemerval Zanella's comment to explain the reason for using EF_LARCH_OBJABI_V1.
2023-03-27 | elf: Take into account ${sysconfdir} in elf/tst-ldconfig-p.sh | Romain Geissler | 2 | -6/+7
Take into account ${sysconfdir} in elf/tst-ldconfig-p.sh. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2023-03-27 | Fix tst-glibc-hwcaps-prepend-cache with custom configure prefix value | Romain Geissler | 1 | -3/+7
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2023-03-27 | Fix tst-ldconfig-ld_so_conf-update with custom configure prefix value | Romain Geissler | 1 | -5/+8
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2023-03-27 | support: introduce support_sysconfdir_prefix | Romain Geissler | 3 | -1/+11
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2023-03-27 | Remove set-hooks.h from generic includes | Adhemerval Zanella Netto | 1 | -0/+0
The hooks mechanism uses symbol sets for running lists of functions, which requires either extra linker directives to provide any hardening (such as RELRO) or additional code (such as pointer obfuscation via mangling with random value). Currently only hurd uses set-hooks.h so we remove it from the generic includes. The generic implementation uses direct function calls which provide hardening and good code generation, observability and debugging without the need for extra linking options or special code handling. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-03-27 | Remove --with-default-link configure option | Adhemerval Zanella Netto | 8 | -51/+7
Now that there is no need to use a special linker script to harden internal data structures, remove the --with-default-link configure option and associated definitions. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-03-27 | libio: Remove the usage of __libc_IO_vtables | Adhemerval Zanella Netto | 22 | -513/+637
Instead of using a special ELF section along with a linker script directive to put the IO vtables within the RELRO section, the libio vtables are all moved to an array marked as data.relro (so the linker will place it in the RELRO segment without the need for extra directives). To avoid static linking namespace issues and including all vtable referenced objects, all required function pointers are set as weak aliases. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-03-27 | libio: Do not autogenerate stdio_lim.h | Adhemerval Zanella Netto | 6 | -69/+43
Instead, define the required fields in system-dependent files. The only system-dependent definition is FILENAME_MAX, which should match POSIX PATH_MAX, and it is obtained from either kernel UAPI or mach headers. It is currently set to the pre-defined value from current kernels. This avoids a circular dependency when including stdio.h in gen-as-const-headers files. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-03-27 | Move libc_freeres_ptrs and libc_subfreeres to hidden/weak functions | Adhemerval Zanella Netto | 90 | -243/+584
They are both used by __libc_freeres to free all library malloc allocated resources to help tooling like mtrace or valgrind with memory leak tracking. The current scheme uses assembly markers and linker script entries to consolidate the free routine function pointers in the RELRO segment and the to-be-freed buffers in BSS. This patch changes it to use specific free functions for libc_freeres_ptrs buffers and call the function pointer array directly with call_function_static_weak. It allows the removal of both the internal macros and the linker script sections. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>