Remove generic tlsdesc code related to lazy tlsdesc processing since
lazy tlsdesc relocation is no longer supported. This includes removing
the GL(dl_load_lock) locking from _dl_make_tlsdesc_dynamic, which is only
called at load time when that lock is already held.
A documentation comment was added as well.
|
|
Like in commit e75711ebfa976d5468ec292282566a18b07e4d67 for x86_64,
remove unused lazy tlsdesc relocation processing code:
_dl_tlsdesc_resolve_abs_plus_addend
_dl_tlsdesc_resolve_rel
_dl_tlsdesc_resolve_rela
_dl_tlsdesc_resolve_hold
|
|
_dl_tlsdesc_resolve_rela and _dl_tlsdesc_resolve_hold are only used for
lazy tlsdesc relocation processing which is no longer supported.
|
|
Lazy tlsdesc relocation is racy because the static tls optimization and
tlsdesc management operations are done without holding the dlopen lock.
This is similar to commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67
for aarch64, but it fixes a different race: bug 27137.
On i386 the code is a bit more complicated than on x86_64 because both
rel and rela relocs are supported.
|
|
Lazy tlsdesc relocation is racy because the static tls optimization and
tlsdesc management operations are done without holding the dlopen lock.
This is similar to commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67
for aarch64, but it fixes a different race: bug 27137.
|
|
For some reason only dlopen failure caused dtv gaps to be reused.
It is possible that the intent was to never reuse modids for a
different module, but after dlopen failure all gaps are reused,
not just the ones caused by the unfinished dlopen.
So the code already has to handle reused modids, which seems to
work; however, the data races at thread creation and tls access
(see bug 19329 and bug 27111) may be more severe if slots are
reused, so this is scheduled after those fixes. I think fixing
the races is not simpler if reuse is disallowed, and reuse has
other benefits, so set GL(dl_tls_dtv_gaps) whenever entries are
removed from the middle of the slotinfo list. The value does
not have to be correct: an incorrect true value causes the next
modid query to do a slotinfo walk, an incorrect false value
leaves gaps unused and new entries are added at the end.
Fixes bug 27135.
|
|
This is a follow-up patch to the fix for bug 19329. It adds
relaxed MO atomics to accesses that are racy but where relaxed
MO is enough.
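As a standalone illustration only (glibc uses its own atomic_load_relaxed/
atomic_store_relaxed macros internally; the counter and function names
below are hypothetical), this is the kind of change a relaxed MO
conversion makes:

/* Sketch: a shared counter whose plain, racy accesses are replaced by
   relaxed MO atomic accesses.  Relaxed MO removes the data race on the
   counter itself; any ordering against other data must come from
   elsewhere (a lock or separate acquire/release operations).  */
#include <stdatomic.h>
#include <stddef.h>

static _Atomic size_t shared_counter;

void
bump_counter (void)
{
  atomic_store_explicit (&shared_counter,
                         atomic_load_explicit (&shared_counter,
                                               memory_order_relaxed) + 1,
                         memory_order_relaxed);
}

size_t
read_counter (void)
{
  return atomic_load_explicit (&shared_counter, memory_order_relaxed);
}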
|
|
DTV setup at thread creation (_dl_allocate_tls_init) is changed
to take the dlopen lock, GL(dl_load_lock). Avoiding data races
here without locks would require design changes: the map that is
accessed for static TLS initialization here may be concurrently
freed by dlclose. That use after free may be solved by only
locking around static TLS setup or by ensuring dlclose does not
free modules with static TLS, however currently every link map
with TLS has to be accessed at least to see if it needs static
TLS. And even if that's solved, still a lot of atomics would be
needed to synchronize DTV related globals without a lock. So fix
both bug 19329 and bug 27111 with a lock that prevents DTV setup
running concurrently with dlopen or dlclose.
_dl_update_slotinfo at TLS access still does not use any locks
so CONCURRENCY NOTES are added to explain the synchronization.
The early exit from the slotinfo walk when max_modid is reached
is not strictly necessary, but does not hurt either.
An incorrect acquire load was removed from _dl_resize_dtv: it
did not synchronize with any release store or fence and
synchronization is now handled separately at thread creation
and TLS access time.
There are still a number of racy read accesses to globals that
will be changed to relaxed MO atomics in a followup patch. This
should not introduce regressions compared to existing behaviour
and avoid cluttering the main part of the fix.
Not all TLS access related data races are fixed here: there are
additional races at lazy tlsdesc relocations; see bug 27137.
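A minimal sketch of the synchronization scheme described above, with all
names hypothetical (this is not the actual glibc code): writers such as
dlopen/dlclose and thread creation serialize on a lock, while readers at
TLS access time rely on an acquire load of a generation counter pairing
with the writer's release store:

#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static pthread_mutex_t load_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic size_t current_generation;

/* Writer: update the shared (slotinfo/DTV-like) data under the lock and
   publish it by bumping the generation with a release store.  */
void
writer_update (void)
{
  pthread_mutex_lock (&load_lock);
  /* ... modify the shared data here ... */
  size_t gen = atomic_load_explicit (&current_generation,
                                     memory_order_relaxed);
  atomic_store_explicit (&current_generation, gen + 1,
                         memory_order_release);
  pthread_mutex_unlock (&load_lock);
}

/* Lock-free reader: the acquire load synchronizes with the release
   store above, so data written before that store is visible here.  */
size_t
reader_generation (void)
{
  return atomic_load_explicit (&current_generation, memory_order_acquire);
}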
|
|
The map is not valid to access here because it can be freed by a
concurrent dlclose, so don't check the modid.
The map == 0 and map != 0 code paths can be shared (avoiding
the dtv resize in the map == 0 case is just an optimization:
a larger dtv than necessary would be fine too).
|
|
Since commit a509eb117fac1d764b15eba64993f4bdb63d7f3c
("Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]"),
the generation counter update is not needed in the failure path.
|
|
The max modid is a valid index in the dtv; it should not be skipped.
The bug is observable if the last module has modid == 64 and its
generation is the same as or less than the max generation of the
previous modules. Then dtv[0].counter implies dtv[64] is initialized
but it isn't. Fixes bug 27136.
|
|
The test relies on reusing slotinfo entries after dlclose, which
can result in non-monotonically increasing generation counters in
the slotinfo list. It can trigger bug 27136.
The test requires a large number of modules with TLS, so the
modules of tst-tls7 are used instead of duplicating them.
|
|
Test concurrent dlopen and pthread_create when the loaded
modules have TLS. This triggers dl-tls assertion failures
more reliably than the tst-stack4 test.
The dlopened module has 100 DT_NEEDED dependencies; the
number of concurrent threads created during dlopen depends
on filesystem speed and hardware. At most 3 threads exist
at a time to limit resource usage.
Doing the test in a fork loop can make it more reliable.
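A rough sketch of the shape of such a test, not the actual
nptl/tst-tls7.c (the module name is taken from the ChangeLog entry below;
the loop counts and error handling are illustrative):

#include <dlfcn.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static void *
dlopen_thread (void *arg)
{
  /* Loading a module with TLS (and its DT_NEEDED dependencies) updates
     the slotinfo list while the main thread keeps creating threads.  */
  void *h = dlopen ((const char *) arg, RTLD_NOW);
  if (h == NULL)
    fprintf (stderr, "dlopen: %s\n", dlerror ());
  return h;
}

static void *
nop_thread (void *arg)
{
  /* Each new thread initializes its DTV from the (possibly changing)
     slotinfo list in _dl_allocate_tls_init.  */
  return NULL;
}

int
main (void)
{
  pthread_t opener, t[3];
  if (pthread_create (&opener, NULL, dlopen_thread, "tst-tls7mod.so") != 0)
    exit (1);

  /* Keep at most 3 short-lived threads alive at a time to limit
     resource usage, as the description above suggests.  */
  for (int round = 0; round < 100; round++)
    {
      for (int i = 0; i < 3; i++)
        if (pthread_create (&t[i], NULL, nop_thread, NULL) != 0)
          exit (1);
      for (int i = 0; i < 3; i++)
        pthread_join (t[i], NULL);
    }

  void *h;
  pthread_join (opener, &h);
  if (h != NULL)
    dlclose (h);
  return 0;
}

Build with -pthread (and -ldl on older glibc); the dlopened module and its
dependencies need TLS variables for the race to be exercised.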
---
v4:
- rebased, updated copyright year.
- moved to tests-internal because of <atomic.h>
v3:
- use the new test support code.
- better Makefile usage so modules are cleaned properly.
v2:
- undef NDEBUG.
- join nop threads so at most 3 threads exist at a time.
- remove stack size setting (resource usage is no longer a concern).
- stop creating threads after dlopen observably finished.
- print the number of threads created for debugging.
2016-12-13 Szabolcs Nagy <szabolcs.nagy@arm.com>
* nptl/Makefile (tests): Add tst-tls7.
(modules-names): Add tst-tls7mod, tst-tls7mod-dep.
* nptl/tst-tls7.c: New file.
* nptl/tst-tls7mod-dep.c: New file.
* nptl/tst-tls7mod.c: New file.
|
|
dlopen updates libname_list by writing to lastp->next, but concurrent
reads in _dl_name_match_p were not synchronized when it was called
without holding GL(dl_load_lock), which can happen during lazy symbol
resolution.
This patch fixes the race between _dl_name_match_p reading lastp->next
and add_name_to_object writing to it. The race could cause a segfault
on targets with weak memory ordering when lastp->next->name is read,
which was observed on an arm system. Fixes bug 21349.
(Code is from Maninder Singh, comments and description are from
Szabolcs Nagy.)
Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
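A hedged standalone sketch of the publish/consume pattern involved
(hypothetical list type, not the actual libname_list code):

#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

struct name_node
{
  const char *name;
  struct name_node *_Atomic next;
};

/* Writer (writer/writer exclusion is provided by a lock elsewhere):
   initialize the node fully, then publish it with a release store so
   lock-free readers that observe the pointer also observe node->name.  */
void
append_name (struct name_node *lastp, const char *name)
{
  struct name_node *n = malloc (sizeof *n);
  if (n == NULL)
    return;
  n->name = name;
  atomic_store_explicit (&n->next, NULL, memory_order_relaxed);
  atomic_store_explicit (&lastp->next, n, memory_order_release);
}

/* Lock-free reader: the acquire load pairs with the release store
   above, so dereferencing n->name is safe even on weakly ordered
   machines.  The head node is assumed to be already published.  */
int
name_matches (struct name_node *head, const char *name)
{
  for (struct name_node *n = head; n != NULL;
       n = atomic_load_explicit (&n->next, memory_order_acquire))
    if (strcmp (n->name, name) == 0)
      return 1;
  return 0;
}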
|
|
DL_UNMAP_IS_SPECIAL and DL_UNMAP were not defined. The definitions are
now copied from arm, since the same is needed on aarch64. The cleanup
of tlsdesc data is handled by the custom _dl_unmap.
Fixes bug 27403.
|
|
The kernel does not put the vDSO at special addresses, so writev can
write the name directly. Also remove the incorrect comment about not
setting l_name.
Andy Lutomirski confirmed in
<https://lore.kernel.org/linux-api/442A16C0-AE5A-4A44-B261-FE6F817EAF3C@amacapital.net/>
that this copy is not necessary.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Remove the extra space in "# endif" left over from
commit f380868f6dcfdeae8d449d556298d9c41012ed8d
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Thu Dec 24 15:43:34 2020 -0800
Remove _ISOMAC check from <cpu-features.h>
|
|
It was added by commit 1bfbaf7130, which added a libc_hidden_proto for
__fstatfs but did not update the Hurd version as well.
Checked with a build for i686-gnu.
|
|
The check is moved to LFS fstatat implementation (since it is the
code that actually implements the syscall).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
The header is not used anywhere.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
Remove internal_statvfs64.c and open-code the implementation
in internal_statvfs.c. The alpha version is no longer required;
the generic implementation also handles it.
Also remove unused includes in internal_statvfs.c and unused
arguments from __internal_statvfs{64}.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
There is no need to handle ENOSYS from the fstatfs64 call; it is
required only for alpha (which already falls back to fstatfs).
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
There is no need to handle ENOSYS from the fstatfs64 call; it is
required only for alpha (which already falls back to fstatfs). The
wordsize internal_statvfs64.c is removed, since the LFS support is
now provided by fstatvfs64.c (used on 64-bit architectures as well).
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
The __NR_statfs64 syscall is supported on all architectures but
aarch64, mips64, riscv64, and x86_64. Newer ABIs also use the
new statfs64 interface (where the struct size is passed as the
second argument).
So the default implementation now uses:
1. __NR_statfs64 for the non-LFS call, handling the overflow
   directly (see the sketch below). There is no need to handle
   __NR_statfs, since all architectures that only support it are
   LFS-only.
2. __NR_statfs if defined, or __NR_statfs64 otherwise, for the
   LFS call.
Alpha is the only outlier, since it is a 64-bit architecture which
provides a non-LFS interface and only provides __NR_statfs64 on
newer kernels (v5.1+).
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
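A rough sketch of the overflow handling from item 1 above; this is not
the actual glibc implementation and the helper name is hypothetical:

#define _GNU_SOURCE
#include <errno.h>
#include <sys/statfs.h>

/* Perform the LFS call and reject results that do not fit the non-LFS
   struct with EOVERFLOW.  */
static int
statfs_from_statfs64 (const char *path, struct statfs *buf)
{
  struct statfs64 buf64;
  if (statfs64 (path, &buf64) != 0)
    return -1;

  buf->f_type = buf64.f_type;
  buf->f_bsize = buf64.f_bsize;
  buf->f_blocks = buf64.f_blocks;
  buf->f_bfree = buf64.f_bfree;
  buf->f_bavail = buf64.f_bavail;
  buf->f_files = buf64.f_files;
  buf->f_ffree = buf64.f_ffree;
  buf->f_fsid = buf64.f_fsid;
  buf->f_namelen = buf64.f_namelen;
  buf->f_frsize = buf64.f_frsize;

  /* If any 64-bit count was truncated by the narrower non-LFS fields,
     report EOVERFLOW instead of returning a bogus value.  */
  if (buf->f_blocks != buf64.f_blocks
      || buf->f_bfree != buf64.f_bfree
      || buf->f_bavail != buf64.f_bavail
      || buf->f_files != buf64.f_files
      || buf->f_ffree != buf64.f_ffree)
    {
      errno = EOVERFLOW;
      return -1;
    }
  return 0;
}

int
main (void)
{
  struct statfs sf;
  return statfs_from_statfs64 ("/", &sf) == 0 ? 0 : 1;
}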
|
|
The __NR_fstatfs64 syscall is supported on all architectures but
aarch64, mips64, riscv64, and x86_64. Newer ABIs also use the
new fstatfs64 interface (where the struct size is passed as the
first argument).
So the default implementation now uses:
1. __NR_fstatfs64 for the non-LFS call, handling the overflow
   directly. There is no need to handle __NR_fstatfs, since all
   architectures that only support it are LFS-only.
2. __NR_fstatfs if defined, or __NR_fstatfs64 otherwise, for the
   LFS call.
Alpha is the only outlier: it is a 64-bit architecture which
provides a non-LFS interface and only provides __NR_fstatfs64 on
newer kernels (5.1+).
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
Currently glibc has three different struct statfs{64} definitions:
1. Non-LFS support where the non-LFS and LFS structs have different
   sizes: alpha, arm, hppa, i686, m68k, microblaze, mips (all ABIs),
   powerpc32, s390, sh4, and sparc.
2. Non-LFS support where the non-LFS and LFS structs have the same
   size: csky and nios2.
3. Only LFS support (where both structs have the same size): arc,
   ia64, powerpc64 (including LE), riscv (both 32 and 64 bits),
   s390x, sparc64, and x86 (including x32).
STATFS_IS_STATFS64/__STATFS_MATCHES_STATFS64 does not distinguish
between 1. and 2., since for both the only difference is the struct
size (for 2. both non-LFS and LFS use the same syscall, while for
1. the old non-LFS syscall is used for [f]statfs).
This patch moves the generic statfs.h for both csky and nios2, and
makes the default definitions for newer ABIs assume that only LFS
will be supported (so there is no need to keep non-LFS and LFS
struct statfs with the same size; it is implicit).
This patch does not change the code generation.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
The XSTAT_IS_XSTAT64 and STAT_IS_KERNEL_STAT flags are now set to 1 and
STATFS_IS_STATFS64 is set to __STATFS_MATCHES_STATFS64. This makes the
default ABI for newer ports provide only LFS calls.
A copy of the non-LFS support is provided for the 32-bit ABIs with
non-LFS support (arm, csky, i386, m68k, nios2, s390, and sh). It also
allows removing the 64-bit ports' definitions, which already use the
default values.
This patch does not change the code generation.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
aarch64, arc, ia64, mips64, powerpc64, riscv32, riscv64, s390x, sparc64,
and x86_64 define STATFS_IS_STATFS64 to 0, but all of them alias
statfs to statfs64 and their struct statfs has the same size and
layout as struct statfs64.
The correct definition will be used in the [f]statfs[64] consolidation.
This patch does not change code generation, since the symbols are
implemented using the auto-generated syscall wrappers for all the
aforementioned ABIs.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
The glibc.malloc.mmap_max tunable, as well as all of the INT_32
tunables, has no use for negative values, so pin the hardcoded limits
to the non-negative range of int. There is no real benefit in any of
those use cases for the extended range of unsigned, so I have avoided
adding a new type to keep things simple.
|
|
The tunable types are SIZE_T, so set the ranges to the correct maximum
value, i.e. SIZE_MAX.
|
|
The TUNABLE_SET interface took a primitive C type argument, which
resulted in inconsistent type conversions internally due to incorrect
dereferencing of types, especially on 32-bit architectures. This
change simplifies the TUNABLE setting logic along with the interfaces.
Now all numeric tunable values are stored as signed numbers in
tunable_num_t, which is intmax_t. All calls to set tunables cast the
input value to its primitive type and then to tunable_num_t for
storage. This relies on the gcc-specific (although I suspect other
compilers would also do the same) unsigned-to-signed integer conversion
semantics, i.e. the bit pattern is preserved. The reverse conversion
is guaranteed by the standard.
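A small standalone illustration of the conversion this relies on (not
glibc code; the value is arbitrary). The unsigned-to-signed cast is
implementation-defined but preserves the bit pattern with GCC; the cast
back to the unsigned primitive type is guaranteed by the standard
(reduction modulo 2^N):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int
main (void)
{
  size_t original = SIZE_MAX;              /* e.g. a SIZE_T tunable value */
  intmax_t stored = (intmax_t) original;   /* storage as tunable_num_t */
  size_t recovered = (size_t) stored;      /* read back at the primitive type */
  assert (recovered == original);
  printf ("stored=%jd recovered=%zu\n", stored, recovered);
  return 0;
}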
|
|
Add __nonnull((2)) to the setrlimit()/getrlimit() function declaration
to avoid null pointer access.
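For illustration only, with a hypothetical prototype rather than the real
declaration: the attribute makes GCC diagnose a literal NULL argument at
compile time (-Wnonnull) without adding any runtime check:

#include <stddef.h>
#include <stdio.h>

struct rlimit_like { long cur, max; };

/* Hypothetical prototype mirroring the annotated declaration:
   argument 2 must not be NULL.  */
static int my_getrlimit (int resource, struct rlimit_like *rlp)
    __attribute__ ((nonnull (2)));

static int
my_getrlimit (int resource, struct rlimit_like *rlp)
{
  (void) resource;
  rlp->cur = rlp->max = -1;     /* stand-in for the real syscall */
  return 0;
}

int
main (void)
{
  struct rlimit_like r;
  my_getrlimit (0, &r);         /* fine */
  /* my_getrlimit (0, NULL); */ /* would trigger -Wnonnull */
  printf ("%ld %ld\n", r.cur, r.max);
  return 0;
}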
-----
v2
According to the suggestions of Adhemerval Zanella and Zack Weinberg:
use __nonnull() to check for null pointers at compile time;
do not add pointer check code to setrlimit()/getrlimit().
The validity of the "resource" parameter is checked in the syscall.
v1
https://public-inbox.org/libc-alpha/20201230114131.47589-1-nixiaoming@huawei.com/
-----
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
This patch updates the json "bench-variant" attribute of "bench-memset.c"
to "default" so that the script "benchtests/scripts/plot_strings.py"
can generate a file "memset_time_default_linear.png".
Without this patch, the script generates a file
"memset_time__linear.png", whose name is inconsistent with
"memcpy_time_default_linear.png" and
"memmove_time_default_linear.png".
|
|
It syncs with gnulib version 87ed1f9c4. No functional changes are
expected.
Checked on x86_64-linux-gnu.
|
|
It syncs with gnulib version 1731fef3d. In build_trtable, prevent
inlining so that it doesn't bloat the caller's stack, and use auto
variables instead of alloca/malloc.
After these changes, build_trtable's total stack allocation is
only 20 KiB on a 64-bit machine, and this is less than glibc's 64
KiB cutoff, so there's little point in using alloca to shrink it.
Checked on x86_64-linux-gnu.
|
|
It syncs with gnulib version b6207ab38. It replaces the regmatch_t
with a dynarray list.
Checked on x86_64-linux-gnu.
|
|
It syncs with gnulib version a8bac4d49. The main changes are:
- Remove the usage of anonymous union within DYNARRAY_STRUCT.
- Use DYNARRAY_FREE instead of DYNARRAY_NAME (free) so that
Gnulib does not change 'free' to 'rpl_free'.
- Use __nonnull instead of __attribute__ ((nonnull ())).
- Use __attribute_maybe_unused__ instead of
__attribute__ ((unused, nonnull (1))).
- Use _Noreturn instead of __attribute__ ((noreturn)).
The only difference with gnulib is:
--- glibc
+++ gnulib
@@ -18,6 +18,7 @@
#include <dynarray.h>
#include <stdio.h>
+#include <stdlib.h>
void
__libc_dynarray_at_failure (size_t size, size_t index)
@@ -27,7 +28,6 @@
__snprintf (buf, sizeof (buf), "Fatal glibc error: "
"array index %zu not less than array length %zu\n",
index, size);
- __libc_fatal (buf);
#else
abort ();
#endif
This seems to be a wrong sync from gnulib (the code is used in the
loader and thus it requires __libc_fatal instead of abort).
Checked on x86_64-linux-gnu.
|
|
It adds __glibc_has_builtin, __glibc_has_extension, and
__attribute_maybe_unused__, alongside some fixes.
The differences are:
--- glibc
+++ gnulib
@@ -259,7 +259,9 @@
# define __attribute_const__ /* Ignore */
#endif
-#if __GNUC_PREREQ (2,7) || __glibc_has_attribute (__unused__)
+#if defined __STDC_VERSION__ && 201710L < __STDC_VERSION__
+# define __attribute_maybe_unused__ [[__maybe_unused__]]
+#elif __GNUC_PREREQ (2,7) || __glibc_has_attribute (__unused__)
# define __attribute_maybe_unused__ __attribute__ ((__unused__))
#else
# define __attribute_maybe_unused__ /* Ignore */
@@ -485,7 +487,7 @@
/* The #ifndef lets Gnulib avoid including these on non-glibc
platforms, where the includes typically do not exist. */
-#ifdef __GLIBC__
+#ifndef __WORDSIZE
# include <bits/wordsize.h>
# include <bits/long-double.h>
#endif
The [[__maybe_unused__]] attribute removal is due to Joseph
questioning gcc support with -std=c2x or -std=gnu2x [1].
The __WORDSIZE replacement by __GLIBC__ is because it does not play
well with the internal cdefs.h that also uses
__LDOUBLE_REDIRECTS_TO_FLOAT128_ABI.
Checked on x86_64-linux-gnu.
[1] https://sourceware.org/pipermail/libc-alpha/2021-January/121600.html
|
|
Similar to the __sem_check_add_mapping fix, take into consideration
the trailing NULL.
Checked on x86_64-linux-gnu.
|
|
Take into consideration the trailing NULL, since sem_search uses
strcmp to compare entries.
Checked on x86_64-linux-gnu and powerpc-linux-gnu (where it triggered
a nptl/tst-sem7 regression).
|
|
Linux 5.10 adds PTRACE_PEEKMTETAGS and PTRACE_POKEMTETAGS for AArch64.
Adding those shows that glibc is also missing PTRACE_SYSEMU and
PTRACE_SYSEMU_SINGLESTEP, for AArch64 (where they were added to Linux
in 5.3) and for PowerPC (where they were added in Linux 4.20); it
already has those two defines for x86. Add all those defines to
glibc's headers.
Tested with build-many-glibcs.py for aarch64-linux-gnu and
powerpc-linux-gnu.
|
|
This patch adds additional benchmarks and tests for a string size of
4096, and several benchmarks for string size 256 with different
alignments.
|
|
No bug. Just seemed the performance could be improved a bit. Observed
and expected behavior are unchanged. Optimized body of main
loop. Updated page cross logic and optimized accordingly. Made a few
minor instruction selection modifications. No regressions in test
suite. Both test-strchrnul and test-strchr passed.
|
|
sem_open already returns EINVAL for input names larger than NAME_MAX,
so the tfind lookup can assume a bounded name length.
Checked on x86_64-linux-gnu.
|
|
The internal semaphore list code is moved to a specific file,
sem_routine.c, and the internal usage is simplified to only two
functions (one to insert a new semaphore and one to remove it
from the internal list). There is no need to expose the
internal locking, nor how the semaphore mapping is implemented.
No functional or semantic change is expected, tested on
x86_64-linux-gnu.
|
|
Previously, glibc would pick an arbitrary tmpfs file system from
/proc/mounts if /dev/shm was not available. This could lead to
an unsuitable file system being picked for the backing storage for
shm_open, sem_open, and related functions.
This patch introduces a new function, __shm_get_name, which builds
the file name under the appropriate (now hard-coded) directory. It is
called from the various shm_* and sem_* functions. Unlike the
SHM_GET_NAME macro it replaces, the callers handle the return values
and errno updates. shm-directory.c is moved directly into the posix
subdirectory because it can be implemented directly using POSIX
functionality. It resides in libc because it is needed by both
librt and nptl/htl.
In the sem_open implementation, tmpfname is initialized directly
from a string constant. This happens to remove one alloca call.
Checked on x86_64-linux-gnu.
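A hedged sketch of this kind of name construction, not the actual
__shm_get_name (the helper name and exact checks are illustrative; the
/dev/shm directory and the "sem." prefix match the documented file
layout for shared memory objects and named semaphores):

#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SHMDIR "/dev/shm/"

/* Validate a POSIX shm/sem name and build the path under the
   hard-coded directory; returns 0 or an errno value.  */
static int
build_shm_name (char *buf, size_t buflen, const char *name, bool sem_prefix)
{
  /* Skip leading slashes, as the POSIX interfaces allow "/name".  */
  while (*name == '/')
    ++name;
  size_t namelen = strlen (name);

  /* Reject empty names and names containing further slashes.  */
  if (namelen == 0 || memchr (name, '/', namelen) != NULL)
    return EINVAL;
  if (namelen > NAME_MAX)
    return ENAMETOOLONG;

  int n = snprintf (buf, buflen, "%s%s%s",
                    SHMDIR, sem_prefix ? "sem." : "", name);
  if (n < 0 || (size_t) n >= buflen)
    return ENAMETOOLONG;
  return 0;
}

int
main (void)
{
  char path[PATH_MAX];
  if (build_shm_name (path, sizeof path, "/example", true) == 0)
    printf ("%s\n", path);   /* /dev/shm/sem.example */
  return 0;
}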
|
|
This change adds a new test to assess ppoll()'s timeout-related
functionality (the struct pollfd does not provide a valid fd to wait
for, so the call just waits for the timeout).
To be more specific, two use cases are checked:
- whether ppoll() times out immediately when the passed struct timespec
  has zero values of tv_nsec and tv_sec.
- whether ppoll() times out after the timeout specified in the passed
  argument.
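A sketch of the second use case using only the public API (this is not
the actual glibc test):

#define _GNU_SOURCE
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int
main (void)
{
  struct pollfd fds = { .fd = -1, .events = 0, .revents = 0 };
  struct timespec timeout = { .tv_sec = 0, .tv_nsec = 100000000 }; /* 100 ms */

  struct timespec before, after;
  clock_gettime (CLOCK_MONOTONIC, &before);
  /* A negative fd is ignored, so this call can only time out.  */
  int r = ppoll (&fds, 1, &timeout, NULL);
  clock_gettime (CLOCK_MONOTONIC, &after);

  if (r != 0)
    {
      perror ("ppoll");
      exit (1);
    }

  long elapsed_ms = (after.tv_sec - before.tv_sec) * 1000
                    + (after.tv_nsec - before.tv_nsec) / 1000000;
  printf ("ppoll timed out after ~%ld ms (requested 100 ms)\n", elapsed_ms);
  return 0;
}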
|
|
This change adds a new test to assess the functionality of the
timerfd_* functions.
It creates a new timer (operating on its file descriptor) and checks
whether the time measured before and after the sleep is within the
expected range.
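A sketch of such a test using the public timerfd API (not the actual
glibc test):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

int
main (void)
{
  int fd = timerfd_create (CLOCK_MONOTONIC, 0);
  if (fd < 0)
    { perror ("timerfd_create"); exit (1); }

  /* Arm a 100 ms one-shot timer (it_interval stays zero).  */
  struct itimerspec its = { .it_value = { .tv_sec = 0, .tv_nsec = 100000000 } };
  if (timerfd_settime (fd, 0, &its, NULL) != 0)
    { perror ("timerfd_settime"); exit (1); }

  struct timespec before, after;
  clock_gettime (CLOCK_MONOTONIC, &before);
  uint64_t expirations;
  /* read blocks until the timer expires and reports the expiration count.  */
  if (read (fd, &expirations, sizeof expirations)
      != (ssize_t) sizeof expirations)
    { perror ("read"); exit (1); }
  clock_gettime (CLOCK_MONOTONIC, &after);

  long elapsed_ms = (after.tv_sec - before.tv_sec) * 1000
                    + (after.tv_nsec - before.tv_nsec) / 1000000;
  printf ("timer expired %llu time(s) after ~%ld ms (expected >= 100 ms)\n",
          (unsigned long long) expirations, elapsed_ms);
  close (fd);
  return 0;
}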
|
|
1. Add CPUID_INDEX_14_ECX_0 for CPUID leaf 0x14 to detect the PTWRITE
feature in EBX of CPUID leaf 0x14 with ECX == 0 (a user-level sketch of
the check follows below).
2. Add PTWRITE detection to CPU feature tests.
3. Add 2 static CPU feature tests.
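A user-level sketch of the detection in item 1, using the
compiler-provided <cpuid.h> rather than glibc's internal cpu-features
machinery (this is not the glibc code):

#include <cpuid.h>
#include <stdio.h>

int
main (void)
{
  unsigned int eax, ebx, ecx, edx;

  /* __get_cpuid_count returns 0 if the maximum basic leaf is below
     0x14, so no separate max-leaf check is needed here.  */
  if (!__get_cpuid_count (0x14, 0, &eax, &ebx, &ecx, &edx))
    {
      puts ("CPUID leaf 0x14 not supported");
      return 0;
    }

  /* PTWRITE is reported in EBX bit 4 of leaf 0x14, subleaf 0 (per the
     Intel SDM).  */
  printf ("PTWRITE: %s\n", (ebx & (1u << 4)) ? "yes" : "no");
  return 0;
}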
|