aboutsummaryrefslogtreecommitdiff
path: root/elf/dl-tls.c
AgeCommit message (Collapse)AuthorFilesLines
2025-01-16Consolidate TLS block allocation for static binaries with ld.soFlorian Weimer1-43/+7
Use the same code to compute the TLS block size and its alignment. The code in elf/dl-tls.c is linked in anyway for all binaries due to the reference to _dl_tls_static_surplus_init. It is not possible to call _dl_allocate_tls_storage directly because malloc is not available in the static case. (The dynamic linker uses the minimal malloc at this stage.) Therefore, split _dl_tls_block_size_with_pre and _dl_tls_block_align from _dl_allocate_tls_storage, and call those new functions from __libc_setup_tls. This fixes extra TLS allocation for the static case, and apparently some pre-existing bugs as well (the independent recomputation of TLS block sizes in init_static_tls looks rather suspect). Fixes commit 0e411c5d3098982d67cd2d7a233eaa6c977a1869 ("Add generic 'extra TLS'").
2025-01-16elf: Iterate over loaded object list in _dl_determine_tlsoffsetFlorian Weimer1-38/+35
The old code used the slotinfo array as a scratch area to pass the list of TLS-using objects to _dl_determine_tlsoffset. All array entries are subsequently overwritten by _dl_add_to_slotinfo, except the first one. The link maps are usually not at their right position for their module ID in the slotinfo array, so the initial use of the slotinfo array would be incorrect if not for scratch purposes only. In _dl_tls_initial_modid_limit_setup, the old code relied that some link map was written to the first slotinfo entry. After the change, this no longer happens because TLS module ID zero is unused. It's also necessary to move the call after the real initialization of the slotinfo array.
2025-01-10Add generic 'extra TLS'Michael Jeanson1-0/+72
Add the logic to append an 'extra TLS' block in the TLS block allocator with a generic stub implementation. The duplicated code in 'csu/libc-tls.c' and 'elf/dl-tls.c' is to handle both statically linked applications and the ELF dynamic loader. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-01-02elf: Use TLS_DTV_OFFSET in __tls_get_addrFlorian Weimer1-4/+16
This fixes commit 5e249192cac7354af02a7347a0d8c984e0c88ed3 ("elf: Remove the GET_ADDR_ARGS and related macros from the TLS code"): GET_ADDR_ARGS was indeed unused, but GET_ADDR_OFFSET was used on several targets, those that define TLS_DTV_OFFSET. Instead of reintroducing GET_ADDR_OFFSET, use TLS_DTV_OFFSET directly, now that it is defined on all targets. In the new tls_get_addr_adjust helper function, add a cast to uintptr_t to help the s390 case, where the offset can be positive or negative, depending on the addresses malloc returns. The cast avoids pointer wraparound/overflow. The outer uintptr_t cast is needed to suppress a warning on x86-64 x32 about mismatched integer/pointer sizes. Eventually this offset should be folded into the DTV addresses themselves, to eliminate the subtraction on the TLS fast path. This will require an adjustment to libthread_db because the debugger interface currently returns unadjusted pointers. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert1-1/+1
2024-12-27elf: Remove the GET_ADDR_ARGS and related macros from the TLS codeFlorian Weimer1-36/+19
This was used to manage an IA-64 ABI divergence is no longere needed after the IA-64 removal. (It should be possible to encode all the required information in one machine word, so the pointer indirection is really unnecessary. Technically, none of this is part of the ABI, so perhaps it's possible to do this retroactively. See bug 27404.) Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-08-05elf: Avoid re-initializing already allocated TLS in dlopen (bug 31717)Florian Weimer1-7/+37
The old code used l_init_called as an indicator for whether TLS initialization was complete. However, it is possible that TLS for an object is initialized, written to, and then dlopen for this object is called again, and l_init_called is not true at this point. Previously, this resulted in TLS being initialized twice, discarding any interim writes (technically introducing a use-after-free bug even). This commit introduces an explicit per-object flag, l_tls_in_slotinfo. It indicates whether _dl_add_to_slotinfo has been called for this object. This flag is used to avoid double-initialization of TLS. In update_tls_slotinfo, the first_static_tls micro-optimization is removed because preserving the initalization flag for subsequent use by the second loop for static TLS is a bit complicated, and another per-object flag does not seem to be worth it. Furthermore, the l_init_called flag is dropped from the second loop (for static TLS initialization) because l_need_tls_init on its own prevents double-initialization. The remaining l_init_called usage in resize_scopes and update_scopes is just an optimization due to the use of scope_has_map, so it is not changed in this commit. The isupper check ensures that libc.so.6 is TLS is not reverted. Such a revert happens if l_need_tls_init is not cleared in _dl_allocate_tls_init for the main_thread case, now that l_init_called is not checked anymore in update_tls_slotinfo in elf/dl-open.c. Reported-by: Jonathon Anderson <janderson@rice.edu> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-08-05elf: Clarify and invert second argument of _dl_allocate_tls_initFlorian Weimer1-4/+9
Also remove an outdated comment: _dl_allocate_tls_init is called as part of pthread_create. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-07-01elf: Support recursive use of dynamic TLS in interposed mallocFlorian Weimer1-9/+86
It turns out that quite a few applications use bundled mallocs that have been built to use global-dynamic TLS (instead of the recommended initial-exec TLS). The previous workaround from commit afe42e935b3ee97bac9a7064157587777259c60e ("elf: Avoid some free (NULL) calls in _dl_update_slotinfo") does not fix all encountered cases unfortunatelly. This change avoids the TLS generation update for recursive use of TLS from a malloc that was called during a TLS update. This is possible because an interposed malloc has a fixed module ID and TLS slot. (It cannot be unloaded.) If an initially-loaded module ID is encountered in __tls_get_addr and the dynamic linker is already in the middle of a TLS update, use the outdated DTV, thus avoiding another call into malloc. It's still necessary to update the DTV to the most recent generation, to get out of the slow path, which is why the check for recursion is needed. The bookkeeping is done using a global counter instead of per-thread flag because TLS access in the dynamic linker is tricky. All this will go away once the dynamic linker stops using malloc for TLS, likely as part of a change that pre-allocates all TLS during pthread_create/dlopen. Fixes commit d2123d68275acc0f061e73d5f86ca504e0d5a344 ("elf: Fix slow tls access after dlopen [BZ #19924]"). Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-06-03elf: Avoid some free (NULL) calls in _dl_update_slotinfoFlorian Weimer1-1/+8
This has been confirmed to work around some interposed mallocs. Here is a discussion of the impact test ust/libc-wrapper/test_libc-wrapper in lttng-tools: New TLS usage in libgcc_s.so.1, compatibility impact <https://inbox.sourceware.org/libc-alpha/8734v1ieke.fsf@oldenburg.str.redhat.com/> Reportedly, this patch also papers over a similar issue when tcmalloc 2.9.1 is not compiled with -ftls-model=initial-exec. Of course the goal really should be to compile mallocs with the initial-exec TLS model, but this commit appears to be a useful interim workaround. Fixes commit d2123d68275acc0f061e73d5f86ca504e0d5a344 ("elf: Fix slow tls access after dlopen [BZ #19924]"). Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert1-1/+1
2023-11-28elf: Fix TLS modid reuse generation assignment (BZ 29039)Hector Martin1-0/+1
_dl_assign_tls_modid() assigns a slotinfo entry for a new module, but does *not* do anything to the generation counter. The first time this happens, the generation is zero and map_generation() returns the current generation to be used during relocation processing. However, if a slotinfo entry is later reused, it will already have a generation assigned. If this generation has fallen behind the current global max generation, then this causes an obsolete generation to be assigned during relocation processing, as map_generation() returns this generation if nonzero. _dl_add_to_slotinfo() eventually resets the generation, but by then it is too late. This causes DTV updates to be skipped, leading to NULL or broken TLS slot pointers and segfaults. Fix this by resetting the generation to zero in _dl_assign_tls_modid(), so it behaves the same as the first time a slot is assigned. _dl_add_to_slotinfo() will still assign the correct static generation later during module load, but relocation processing will no longer use an obsolete generation. Note that slotinfo entry (aka modid) reuse typically happens after a dlclose and only TLS access via dynamic tlsdesc is affected. Because tlsdesc is optimized to use the optional part of static TLS, dynamic tlsdesc can be avoided by increasing the glibc.rtld.optional_static_tls tunable to a large enough value, or by LD_PRELOAD-ing the affected modules. Fixes bug 29039. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-09-01elf: Fix slow tls access after dlopen [BZ #19924]Szabolcs Nagy1-55/+62
In short: __tls_get_addr checks the global generation counter and if the current dtv is older then _dl_update_slotinfo updates dtv up to the generation of the accessed module. So if the global generation is newer than generation of the module then __tls_get_addr keeps hitting the slow dtv update path. The dtv update path includes a number of checks to see if any update is needed and this already causes measurable tls access slow down after dlopen. It may be possible to detect up-to-date dtv faster. But if there are many modules loaded (> TLS_SLOTINFO_SURPLUS) then this requires at least walking the slotinfo list. This patch tries to update the dtv to the global generation instead, so after a dlopen the tls access slow path is only hit once. The modules with larger generation than the accessed one were not necessarily synchronized before, so additional synchronization is needed. This patch uses acquire/release synchronization when accessing the generation counter. Note: in the x86_64 version of dl-tls.c the generation is only loaded once, since relaxed mo is not faster than acquire mo load. I have not benchmarked this. Tested by Adhemerval Zanella on aarch64, powerpc, sparc, x86 who reported that it fixes the performance issue of bug 19924. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2023-05-29Fix misspellings in elf/ -- BZ 25337Paul Pluzhnikov1-1/+1
Applying this commit results in bit-identical libc.so.6. The elf/ld-linux-x86-64.so.2 does change, but only in .note.gnu.build-id Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-03-29Remove --enable-tunables configure optionAdhemerval Zanella Netto1-6/+0
And make always supported. The configure option was added on glibc 2.25 and some features require it (such as hwcap mask, huge pages support, and lock elisition tuning). It also simplifies the build permutations. Changes from v1: * Remove glibc.rtld.dynamic_sort changes, it is orthogonal and needs more discussion. * Cleanup more code. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-01-06Update copyright dates with scripts/update-copyrightsJoseph Myers1-1/+1
2022-02-01elf: Fix initial-exec TLS access on audit modules (BZ #28096)Adhemerval Zanella1-3/+14
For audit modules and dependencies with initial-exec TLS, we can not set the initial TLS image on default loader initialization because it would already be set by the audit setup. However, subsequent thread creation would need to follow the default behaviour. This patch fixes it by setting l_auditing link_map field not only for the audit modules, but also for all its dependencies. This is used on _dl_allocate_tls_init to avoid the static TLS initialization at load time. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
2022-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert1-1/+1
I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: *** 912-#endif remote: *** 913: remote: *** 914- remote: *** error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2021-12-09Remove TLS_TCB_ALIGN and TLS_INIT_TCB_ALIGNFlorian Weimer1-2/+2
TLS_INIT_TCB_ALIGN is not actually used. TLS_TCB_ALIGN was likely introduced to support a configuration where the thread pointer has not the same alignment as THREAD_SELF. Only ia64 seems to use that, but for the stack/pointer guard, not for storing tcbhead_t. Some ports use TLS_TCB_OFFSET and TLS_PRE_TCB_SIZE to shift the thread pointer, potentially landing in a different residue class modulo the alignment, but the changes should not impact that. In general, given that TLS variables have their own alignment requirements, having different alignment for the (unshifted) thread pointer and struct pthread would potentially result in dynamic offsets, leading to more complexity. hppa had different values before: __alignof__ (tcbhead_t), which seems to be 4, and __alignof__ (struct pthread), which was 8 (old default) and is now 32. However, it defines THREAD_SELF as: /* Return the thread descriptor for the current thread. */ # define THREAD_SELF \ ({ struct pthread *__self; \ __self = __get_cr27(); \ __self - 1; \ }) So the thread pointer points after struct pthread (hence __self - 1), and they have to have the same alignment on hppa as well. Similarly, on ia64, the definitions were different. We have: # define TLS_PRE_TCB_SIZE \ (sizeof (struct pthread) \ + (PTHREAD_STRUCT_END_PADDING < 2 * sizeof (uintptr_t) \ ? ((2 * sizeof (uintptr_t) + __alignof__ (struct pthread) - 1) \ & ~(__alignof__ (struct pthread) - 1)) \ : 0)) # define THREAD_SELF \ ((struct pthread *) ((char *) __thread_self - TLS_PRE_TCB_SIZE)) And TLS_PRE_TCB_SIZE is a multiple of the struct pthread alignment (confirmed by the new _Static_assert in sysdeps/ia64/libc-tls.c). On m68k, we have a larger gap between tcbhead_t and struct pthread. But as far as I can tell, the port is fine with that. The definition of TCB_OFFSET is sufficient to handle the shifted TCB scenario. This fixes commit 23c77f60181eb549f11ec2f913b4270af29eee38 ("nptl: Increase default TCB alignment to 32"). Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-10-04elf: Avoid deadlock between pthread_create and ctors [BZ #28357]Szabolcs Nagy1-8/+8
The fix for bug 19329 caused a regression such that pthread_create can deadlock when concurrent ctors from dlopen are waiting for it to finish. Use a new GL(dl_load_tls_lock) in pthread_create that is not taken around ctors in dlopen. The new lock is also used in __tls_get_addr instead of GL(dl_load_lock). The new lock is held in _dl_open_worker and _dl_close_worker around most of the logic before/after the init/fini routines. When init/fini routines are running then TLS is in a consistent, usable state. In _dl_open_worker the new lock requires catching and reraising dlopen failures that happen in the critical section. The new lock is reinitialized in a fork child, to keep the existing behaviour and it is kept recursive in case malloc interposition or TLS access from signal handlers can retake it. It is not obvious if this is necessary or helps, but avoids changing the preexisting behaviour. The new lock may be more appropriate for dl_iterate_phdr too than GL(dl_load_write_lock), since TLS state of an incompletely loaded module may be accessed. If the new lock can replace the old one, that can be a separate change. Fixes bug 28357. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-09-15elf: Replace most uses of THREAD_GSCOPE_IN_TCBSergey Bugaev1-3/+3
While originally this definition was indeed used to distinguish between the cases where the GSCOPE flag was stored in TCB or not, it has since become used as a general way to distinguish between HTL and NPTL. THREAD_GSCOPE_IN_TCB will be removed in the following commits, as HTL, which currently is the only port that does not put the flag into TCB, will get ported to put the GSCOPE flag into the TCB as well. To prepare for that change, migrate all code that wants to distinguish between HTL and NPTL to use PTHREAD_IN_LIBC instead, which is a better choice since the distinction mostly has to do with whether libc has access to the list of thread structures and therefore can initialize thread-local storage. The parts of code that actually depend on whether the GSCOPE flag is in TCB are left unchanged. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20210907133325.255690-2-bugaevc@gmail.com> Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2021-07-14elf: Fix DTV gap reuse logic (BZ #27135)Adhemerval Zanella1-8/+9
This is updated version of the 572bd547d57a (reverted by 40ebfd016ad2) that fixes the _dl_next_tls_modid issues. This issue with 572bd547d57a patch is the DTV entry will be only update on dl_open_worker() with the update_tls_slotinfo() call after all dependencies are being processed by _dl_map_object_deps(). However _dl_map_object_deps() itself might call _dl_next_tls_modid(), and since the _dl_tls_dtv_slotinfo_list::map is not yet set the entry will be wrongly reused. This patch fixes by renaming the _dl_next_tls_modid() function to _dl_assign_tls_modid() and by passing the link_map so it can set the slotinfo value so a subsequente _dl_next_tls_modid() call will see the entry as allocated. The intermediary value is cleared up on remove_slotinfo() for the case a library fails to load with RTLD_NOW. This patch fixes BZ #27135. Checked on x86_64-linux-gnu. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2021-06-25elf: Disable most of TLS modid gaps processing [BZ #27135]Florian Weimer1-1/+4
Revert "elf: Fix DTV gap reuse logic [BZ #27135]" This reverts commit 572bd547d57a39b6cf0ea072545dc4048921f4c3. It turns out that the _dl_next_tls_modid in _dl_map_object_from_fd keeps returning the same modid over and over again if there is a gap and more than TLS-using module is loaded in one dlopen call. This corrupts TLS data structures. The bug is still present after a revert, but empirically it is much more difficult to trigger (because it involves a dlopen failure).
2021-05-21nptl: Eliminate the __static_tls_size, __static_tls_align_m1 variablesFlorian Weimer1-2/+3
Use the __nptl_tls_static_size_for_stack inline function instead, and the GLRO (dl_tls_static_align) value directly. The computation of GLRO (dl_tls_static_align) in _dl_determine_tlsoffset ensures that the alignment is at least TLS_TCB_ALIGN, which at least STACK_ALIGN (see allocate_stack). Therefore, the additional rounding-up step is removed. ALso move the initialization of the default stack size from __pthread_initialize_minimal_internal to __pthread_early_init. This introduces an extra system call during single-threaded startup, but this simplifies the initialization sequence. No locking is needed around the writes to __default_pthread_attr because the process is single-threaded at this point. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-05-17elf: Move static TLS size and alignment into _rtld_global_roFlorian Weimer1-11/+11
This helps to clarify that the caching of these fields in libpthread (in __static_tls_size, __static_tls_align_m1) is unnecessary. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-05-11elf: Fix DTV gap reuse logic [BZ #27135]Szabolcs Nagy1-4/+1
For some reason only dlopen failure caused dtv gaps to be reused. It is possible that the intent was to never reuse modids for a different module, but after dlopen failure all gaps are reused not just the ones caused by the unfinished dlopened. So the code has to handle reused modids already which seems to work, however the data races at thread creation and tls access (see bug 19329 and bug 27111) may be more severe if slots are reused so this is scheduled after those fixes. I think fixing the races are not simpler if reuse is disallowed and reuse has other benefits, so set GL(dl_tls_dtv_gaps) whenever entries are removed from the middle of the slotinfo list. The value does not have to be correct: incorrect true value causes the next modid query to do a slotinfo walk, incorrect false will leave gaps and new entries are added at the end. Fixes bug 27135. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-05-11elf: Use relaxed atomics for racy accesses [BZ #19329]Szabolcs Nagy1-8/+23
This is a follow up patch to the fix for bug 19329. This adds relaxed MO atomics to accesses that were previously data races but are now race conditions, and where relaxed MO is sufficient. The race conditions all follow the pattern that the write is behind the dlopen lock, but a read can happen concurrently (e.g. during tls access) without holding the lock. For slotinfo entries the read value only matters if it reads from a synchronized write in dlopen or dlclose, otherwise the related dtv entry is not valid to access so it is fine to leave it in an inconsistent state. The same applies for GL(dl_tls_max_dtv_idx) and GL(dl_tls_generation), but there the algorithm relies on the fact that the read of the last synchronized write is an increasing value. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-05-11elf: Fix data races in pthread_create and TLS access [BZ #19329]Szabolcs Nagy1-16/+47
DTV setup at thread creation (_dl_allocate_tls_init) is changed to take the dlopen lock, GL(dl_load_lock). Avoiding data races here without locks would require design changes: the map that is accessed for static TLS initialization here may be concurrently freed by dlclose. That use after free may be solved by only locking around static TLS setup or by ensuring dlclose does not free modules with static TLS, however currently every link map with TLS has to be accessed at least to see if it needs static TLS. And even if that's solved, still a lot of atomics would be needed to synchronize DTV related globals without a lock. So fix both bug 19329 and bug 27111 with a lock that prevents DTV setup running concurrently with dlopen or dlclose. _dl_update_slotinfo at TLS access still does not use any locks so CONCURRENCY NOTES are added to explain the synchronization. The early exit from the slotinfo walk when max_modid is reached is not strictly necessary, but does not hurt either. An incorrect acquire load was removed from _dl_resize_dtv: it did not synchronize with any release store or fence and synchronization is now handled separately at thread creation and TLS access time. There are still a number of racy read accesses to globals that will be changed to relaxed MO atomics in a followup patch. This should not introduce regressions compared to existing behaviour and avoid cluttering the main part of the fix. Not all TLS access related data races got fixed here: there are additional races at lazy tlsdesc relocations see bug 27137. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-05-05elf, nptl: Initialize static TLS directly in ld.soFlorian Weimer1-0/+39
The stack list is available in ld.so since commit 1daccf403b1bd86370eb94edca794dc106d02039 ("nptl: Move stack list variables into _rtld_global"), so it's possible to walk the stack list directly in ld.so and perform the initialization there. This eliminates an unprotected function pointer from _rtld_global and reduces the libpthread initialization code.
2021-04-15elf: Refactor _dl_update_slotinfo to avoid use after freeSzabolcs Nagy1-16/+5
map is not valid to access here because it can be freed by a concurrent dlclose: during tls access (via __tls_get_addr) _dl_update_slotinfo is called without holding dlopen locks. So don't check the modid of map. The map == 0 and map != 0 code paths can be shared (avoiding the dtv resize in case of map == 0 is just an optimization: larger dtv than necessary would be fine too). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-04-15elf: Fix comments and logic in _dl_add_to_slotinfoSzabolcs Nagy1-10/+1
Since commit a509eb117fac1d764b15eba64993f4bdb63d7f3c Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112] the generation counter update is not needed in the failure path. That commit ensures allocation in _dl_add_to_slotinfo happens before the demarcation point in dlopen (it is called twice, first time is for allocation only where dlopen can still be reverted on failure, then second time actual dtv updates are done which then cannot fail). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-04-15elf: Fix a DTV setup issue [BZ #27136]Szabolcs Nagy1-1/+1
The max modid is a valid index in the dtv, it should not be skipped. The bug is observable if the last module has modid == 64 and its generation is same or less than the max generation of the previous modules. Then dtv[0].counter implies dtv[64] is initialized but it isn't. Fixes bug 27136. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-01-02Update copyright dates with scripts/update-copyrightsPaul Eggert1-1/+1
I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: *** pre-commit check failed ... remote: *** error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master
2020-07-20elf: Change TLS static surplus default back to 1664Florian Weimer1-7/+30
Make the computation in elf/dl-tls.c more transparent, and add an explicit test for the historic value. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-07-16Linux: Remove rseq supportFlorian Weimer1-7/+1
The kernel ABI is not finalized, and there are now various proposals to change the size of struct rseq, which would make the glibc ABI dependent on the version of the kernels used for building glibc. This is of course not acceptable. This reverts commit 48699da1c468543ade14777819bd1b4d652709de ("elf: Support at least 32-byte alignment in static dlopen"), commit 8f4632deb3545b2949cec5454afc3cb21a0024ea ("Linux: rseq registration tests"), commit 6e29cb3f61ff5432c78a1c84b0d9b123a350ab36 ("Linux: Use rseq in sched_getcpu if available"), and commit 0c76fc3c2b346dc5401dc055d97d4279632b0fb3 ("Linux: Perform rseq registration at C startup and thread creation"), resolving the conflicts introduced by the ARC port and the TLS static surplus changes. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-07-08rtld: Avoid using up static TLS surplus for optimizations [BZ #25051]Szabolcs Nagy1-4/+5
On some targets static TLS surplus area can be used opportunistically for dynamically loaded modules such that the TLS access then becomes faster (TLSDESC and powerpc TLS optimization). However we don't want all surplus TLS to be used for this optimization because dynamically loaded modules with initial-exec model TLS can only use surplus TLS. The new contract for surplus static TLS use is: - libc.so can have up to 192 bytes of IE TLS, - other system libraries together can have up to 144 bytes of IE TLS. - Some "optional" static TLS is available for opportunistic use. The optional TLS is now tunable: rtld.optional_static_tls, so users can directly affect the allocated static TLS size. (Note that module unloading with dlclose does not reclaim static TLS. After the optional TLS runs out, TLS access is no longer optimized to use static TLS.) The default setting of rtld.optional_static_tls is 512 so the surplus TLS is 3*192 + 4*144 + 512 = 1664 by default, the same as before. Fixes BZ #25051. Tested on aarch64-linux-gnu and x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-07-08rtld: Account static TLS surplus for audit modulesSzabolcs Nagy1-2/+13
The new static TLS surplus size computation is surplus_tls = 192 * (nns-1) + 144 * nns + 512 where nns is controlled via the rtld.nns tunable. This commit accounts audit modules too so nns = rtld.nns + audit modules. rtld.nns should only include the namespaces required by the application, namespaces for audit modules are accounted on top of that so audit modules don't use up the static TLS that is reserved for the application. This allows loading many audit modules without tuning rtld.nns or using up static TLS, and it fixes FAIL: elf/tst-auditmany Note that DL_NNS is currently a hard upper limit for nns, and if rtld.nns + audit modules go over the limit that's a fatal error. By default rtld.nns is 4 which allows 12 audit modules. Counting the audit modules is based on existing audit string parsing code, we cannot use GLRO(dl_naudit) before the modules are actually loaded.
2020-07-08rtld: Add rtld.nns tunable for the number of supported namespacesSzabolcs Nagy1-5/+50
TLS_STATIC_SURPLUS is 1664 bytes currently which is not enough to support DL_NNS (== 16) number of dynamic link namespaces, if we assume 192 bytes of TLS are reserved for libc use and 144 bytes are reserved for other system libraries that use IE TLS. A new tunable is introduced to control the number of supported namespaces and to adjust the surplus static TLS size as follows: surplus_tls = 192 * (rtld.nns-1) + 144 * rtld.nns + 512 The default is rtld.nns == 4 and then the surplus TLS size is the same as before, so the behaviour is unchanged by default. If an application creates more namespaces than the rtld.nns setting allows, then it is not guaranteed to work, but the limit is not checked. So existing usage will continue to work, but in the future if an application creates more than 4 dynamic link namespaces then the tunable will need to be set. In this patch DL_NNS is a fixed value and provides a maximum to the rtld.nns setting. Static linking used fixed 2048 bytes surplus TLS, this is changed so the same contract is used as for dynamic linking. With static linking DL_NNS == 1 so rtld.nns tunable is forced to 1, so by default the surplus TLS is reduced to 144 + 512 = 656 bytes. This change is not expected to cause problems. Tested on aarch64-linux-gnu and x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-07-06Linux: Perform rseq registration at C startup and thread creationMathieu Desnoyers1-1/+7
Register rseq TLS for each thread (including main), and unregister for each thread (excluding main). "rseq" stands for Restartable Sequences. See the rseq(2) man page proposed here: https://lkml.org/lkml/2018/9/19/647 Those are based on glibc master branch commit 3ee1e0ec5c. The rseq system call was merged into Linux 4.18. The TLS_STATIC_SURPLUS define is increased to leave additional room for dlopen'd initial-exec TLS, which keeps elf/tst-auditmany working. The increase (76 bytes) is larger than 32 bytes because it has not been increased in quite a while. The cost in terms of additional TLS storage is quite significant, but it will also obscure some initial-exec-related dlopen failures.
2020-01-01Update copyright dates with scripts/update-copyrights.Joseph Myers1-1/+1
2019-11-27Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]Florian Weimer1-3/+6
This change splits the scope and TLS slotinfo updates in dlopen into two parts: one to resize the data structures, and one to actually apply the update. The call to add_to_global_resize in dl_open_worker is moved before the demarcation point at which no further memory allocations are allowed. _dl_add_to_slotinfo is adjusted to make the list update optional. There is some optimization possibility here because we could grow the slotinfo list of arrays in a single call, one the largest TLS modid is known. This commit does not fix the fatal meory allocation failure in _dl_update_slotinfo. Ideally, this error during dlopen should be recoverable. The update order of scopes and TLS data structures is retained, although it appears to be more correct to fully initialize TLS first, and then expose symbols in the newly loaded objects via the scope update. Tested on x86_64-linux-gnu. Change-Id: I240c58387dabda3ca1bcab48b02115175fa83d6c
2019-09-07Prefer https to http for gnu.org and fsf.org URLsPaul Eggert1-1/+1
Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '*.po' \ ! -name 'ChangeLog*' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-01-01Update copyright dates with scripts/update-copyrights.Joseph Myers1-1/+1
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.
2018-01-01Update copyright dates with scripts/update-copyrights.Joseph Myers1-1/+1
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.
2017-08-31elf: Remove internal_function attributeFlorian Weimer1-6/+0
2017-08-13ld.so: Remove internal_function attribute from various functionsFlorian Weimer1-3/+0
These functions are invoked from other DSOs and should therefore use the standard calling convention.
2017-01-01Update copyright dates with scripts/update-copyrights.Joseph Myers1-1/+1
2016-12-21Remove unused function _dl_tls_setupFlorian Weimer1-33/+1
Commit 7a5e3d9d633c828d84a9535f26b202a6179978e7 (elf: Assume TLS is initialized in _dl_map_object_from_fd) removed the last call of _dl_tls_setup, but did not remove the function itself.
2016-09-21[PR19826] fix non-LE TLS in static programsAlexandre Oliva1-0/+4
An earlier fix for TLS dropped early initialization of DTV entries for modules using static TLS, leaving it for __tls_get_addr to set them up. That worked on platforms that require the GD access model to be relaxed to LE in the main executable, but it caused a regression on platforms that allow GD in the main executable, particularly in statically-linked programs: they use a custom __tls_get_addr that does not update the DTV, which fails when the DTV early initialization is not performed. In static programs, __libc_setup_tls performs the DTV initialization for the main thread, but the DTV of other threads is set up in _dl_allocate_tls_init, so that's the fix that matters. Restoring the initialization in the remaining functions modified by this patch was just for uniformity. It's not clear that it is ever needed: even on platforms that allow GD in the main executable, the dynamically-linked version of __tls_get_addr would set up the DTV entries, even for static TLS modules, while updating the DTV counter. for ChangeLog [BZ #19826] * elf/dl-tls.c (_dl_allocate_tls_init): Restore DTV early initialization of static TLS entries. * elf/dl-reloc.c (_dl_nothread_init_static_tls): Likewise. * nptl/allocatestack.c (init_one_static_tls): Likewise.
2016-08-03elf: Do not use memalign for TCB/TLS blocks allocation [BZ #17730]Florian Weimer1-36/+53
Instead, call malloc and explicitly align the pointer. There is no external location to store the original (unaligned) pointer, and this commit increases the allocation size to store the pointer at a fixed location relative to the TCB pointer. The manual alignment means that some space goes unused which was previously made available for subsequent allocations. However, in the TLS_DTV_AT_TP case, the manual alignment code avoids aligning the pre-TCB to the TLS block alignment. (Even while using memalign, the allocation had some unused padding in front.) This concludes the removal of memalign calls from the TLS code, and the new tst-tls3-malloc test verifies that only core malloc routines are used.