riscv-gnu-toolchain/glibc.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2025-04-15	malloc: Use tailcalls in __libc_free	Wilco Dijkstra	1	-7/+24
	Use tailcalls to avoid the overhead of a frame on the free fastpath. Move tcache initialization to _int_free_chunk(). Add malloc_printerr_tail() which can be tailcalled without forcing a frame like no-return functions. Change tcache_double_free_verify() to retry via __libc_free() after clearing the key. Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-04-15	malloc: Inline tcache_free	Wilco Dijkstra	1	-30/+14
	Inline tcache_free since it's only used by __libc_free. Add __glibc_likely for the tcache checks. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-04-15	malloc: Improve free checks	Wilco Dijkstra	1	-9/+6
	The checks on size can be merged and use __builtin_add_overflow. Since tcache only handles small sizes (and rejects sizes < MINSIZE), delay this check until after tcache. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-04-15	malloc: Inline _int_free_check	Wilco Dijkstra	1	-19/+12
	Inline _int_free_check since it is only used by __libc_free. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-04-14	malloc: Inline _int_free	Wilco Dijkstra	2	-28/+12
	Inline _int_free since it is a small function and only really used by __libc_free. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-04-14	malloc: Move mmap code out of __libc_free hotpath	Wilco Dijkstra	1	-30/+40
	Currently __libc_free checks for a freed mmap chunk in the fast path. Also errno is always saved and restored to preserve it. Since mmap chunks are larger than the largest tcache chunk, it is safe to delay this and handle tcache, smallbin and medium bin blocks first. Move saving of errno to cases that actually need it. Remove a safety check that fails on mmap chunks and a check that mmap chunks cannot be added to tcache. Performance of bench-malloc-thread improves by 9.2% for 1 thread and 6.9% for 32 threads on Neoverse V2. Reviewed-by: DJ Delorie <dj@redhat.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-03-28	malloc: Improve performance of __libc_malloc	Wilco Dijkstra	1	-13/+20
	Improve performance of __libc_malloc by splitting it into 2 parts: first handle the tcache fastpath, then do the rest in a separate tailcalled function. This results in significant performance gains since __libc_malloc doesn't need to setup a frame and we delay tcache initialization and setting of errno until later. On Neoverse V2, bench-malloc-simple improves by 6.7% overall (up to 8.5% for ST case) and bench-malloc-thread improves by 20.3% for 1 thread and 14.4% for 32 threads. Reviewed-by: DJ Delorie <dj@redhat.com>
2025-03-26	malloc: Use __always_inline for simple functions	Wilco Dijkstra	2	-11/+11
	Use __always_inline for small helper functions that are critical for performance. This ensures inlining always happens when expected. Performance of bench-malloc-simple improves by 0.6% on average on Neoverse V2. Reviewed-by: DJ Delorie <dj@redhat.com>
2025-03-25	malloc: Use _int_free_chunk for remainders	Wilco Dijkstra	1	-9/+6
	When splitting a chunk, release the tail part by calling int_free_chunk. This avoids inserting random blocks into tcache that were never requested by the user. Fragmentation will be worse if they are never used again. Note if the tail is fairly small, we could avoid splitting it at all. Also remove an oddly placed initialization of tcache in _libc_realloc. Reviewed-by: DJ Delorie <dj@redhat.com>
2025-03-21	malloc: missing initialization of tcache in _mid_memalign	Cupertino Miranda	1	-0/+2
	_mid_memalign includes tcache code but does not attempt to initialize tcaches. Reviewed-by: DJ Delorie <dj@redhat.com>
2025-03-18	malloc: Improve csize2tidx	Wilco Dijkstra	1	-1/+1
	Remove the alignment rounding up from csize2tidx - this makes no sense since the input should be a chunk size. Removing it enables further optimizations, for example chunksize_nomask can be safely used and invalid sizes < MINSIZE are not mapped to a valid tidx. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-03-18	malloc: Improve arena_for_chunk()	Wilco Dijkstra	1	-6/+7
	Change heap_max_size() to improve performance of arena_for_chunk(). Instead of a complex calculation, using a simple mask operation to get the arena base pointer. HEAP_MAX_SIZE should be larger than the huge page size, otherwise heaps will use not huge pages. On AArch64 this removes 6 instructions from arena_for_chunk(), and bench-malloc-thread improves by 1.1% - 1.8%. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-03-03	malloc: Add integrity check to largebin nextsizes	Ben Kallus	1	-0/+3
	If attacker overwrites the bk_nextsize link in the first chunk of a largebin that later has a smaller chunk inserted into it, malloc will write a heap pointer into an attacker-controlled address [0]. This patch adds an integrity check to mitigate this attack. [0]: https://github.com/shellphish/how2heap/blob/master/glibc_2.39/large_bin_attack.c Signed-off-by: Ben Kallus <benjamin.p.kallus.gr@dartmouth.edu> Reviewed-by: DJ Delorie <dj@redhat.com>
2025-02-13	malloc: Add size check when moving fastbin->tcache	Ben Kallus	1	-0/+3
	By overwriting a forward link in a fastbin chunk that is subsequently moved into the tcache, it's possible to get malloc to return an arbitrary address [0]. When a chunk is fetched from a fastbin, its size is checked against the expected chunk size for that fastbin (see malloc.c:3991). This patch adds a similar check for chunks being moved from a fastbin to tcache, which renders obsolete the exploitation technique described above. Now updated to use __glibc_unlikely instead of __builtin_expect, as requested. [0]: https://github.com/shellphish/how2heap/blob/master/glibc_2.39/fastbin_reverse_into_tcache.c Signed-off-by: Ben Kallus <benjamin.p.kallus.gr@dartmouth.edu> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-02-02	elf: Merge __dl_libc_freemem into __rtld_libc_freeres	Florian Weimer	1	-2/+0
	The functions serve very similar purposes. The advantage of __rtld_libc_freeres is that it is located within ld.so, so it is more natural to poke at link map internals there. This slightly regresses cleanup capabilities for statically linked binaries. If that becomes a problem, we should start calling __rtld_libc_freeres from __libc_freeres (perhaps after renaming it).
2025-01-25	malloc: cleanup casts in tst-calloc	Sam James	1	-2/+2
	Followup to c3d1dac96bdd10250aa37bb367d5ef8334a093a1. As pointed out by Maciej W. Rozycki, the casts are obviously useless now. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2025-01-10	malloc: obscure calloc use in tst-calloc	Sam James	1	-4/+8
	Similar to a9944a52c967ce76a5894c30d0274b824df43c7a and f9493a15ea9cfb63a815c00c23142369ec09d8ce, we need to hide calloc use from the compiler to accommodate GCC's r15-6566-g804e9d55d9e54c change. First, include tst-malloc-aux.h, but then use `volatile` variables for size. The test passes without the tst-malloc-aux.h change but IMO we want it there for consistency and to avoid future problems (possibly silent). Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2025-01-01	Update copyright dates not handled by scripts/update-copyrights	Paul Eggert	1	-1/+1
	I've updated copyright dates in glibc for 2025. This is the patch for the changes not generated by scripts/update-copyrights and subsequent build / regeneration of generated files.
2025-01-01	Update copyright dates with scripts/update-copyrights	Paul Eggert	97	-97/+97

2024-12-23	include/sys/cdefs.h: Add __attribute_optimization_barrier__	Adhemerval Zanella	3	-3/+3
	Add __attribute_optimization_barrier__ to disable inlining and cloning on a function. For Clang, expand it to __attribute__ ((optnone)) Otherwise, expand it to __attribute__ ((noinline, clone)) Co-Authored-By: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Sam James <sam@gentoo.org>
2024-12-22	malloc: Only enable -Waggressive-loop-optimizations suppression for gcc	Adhemerval Zanella	1	-2/+2
	Reviewed-by: Sam James <sam@gentoo.org>
2024-12-17	Hide all malloc functions from compiler [BZ #32366]	H.J. Lu	6	-5/+30
	Since -1 isn't a power of two, compiler may reject it, hide memalign from Clang 19 which issues an error: tst-memalign.c:86:31: error: requested alignment is not a power of 2 [-Werror,-Wnon-power-of-two-alignment] 86 \| p = memalign (-1, pagesize); \| ^~ tst-memalign.c:86:31: error: requested alignment must be 4294967296 bytes or smaller; maximum alignment assumed [-Werror,-Wbuiltin-assume-aligned-alignment] 86 \| p = memalign (-1, pagesize); \| ^~ Update tst-malloc-aux.h to hide all malloc functions and include it in all malloc tests to prevent compiler from optimizing out any malloc functions. Tested with Clang 19.1.5 and GCC 15 20241206 for BZ #32366. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Sam James <sam@gentoo.org>
2024-12-11	malloc: Add tcache path for calloc	Wangyang Guo	2	-38/+132
	This commit add tcache support in calloc() which can largely improve the performance of small size allocation, especially in multi-thread scenario. tcache_available() and tcache_try_malloc() are split out as a helper function for better reusing the code. Also fix tst-safe-linking failure after enabling tcache. In previous, calloc() is used as a way to by-pass tcache in memory allocation and trigger safe-linking check in fastbins path. With tcache enabled, it needs extra workarounds to bypass tcache. Result of bench-calloc-thread benchmark Test Platform: Xeon-8380 Ratio: New / Original time_per_iteration (Lower is Better) Threads# \| Ratio -----------\|------ 1 thread \| 0.656 4 threads \| 0.470 Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-12-10	malloc: add indirection for malloc(-like) functions in tests [BZ #32366]	Sam James	7	-0/+52
	GCC 15 introduces allocation dead code removal (DCE) for PR117370 in r15-5255-g7828dc070510f8. This breaks various glibc tests which want to assert various properties of the allocator without doing anything obviously useful with the allocated memory. Alexander Monakov rightly pointed out that we can and should do better than passing -fno-malloc-dce to paper over the problem. Not least because GCC 14 already does such DCE where there's no testing of malloc's return value against NULL, and LLVM has such optimisations too. Handle this by providing malloc (and friends) wrappers with a volatile function pointer to obscure that we're calling malloc (et. al) from the compiler. Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
2024-12-04	malloc: Optimize small memory clearing for calloc	H.J. Lu	2	-35/+2
	Add calloc-clear-memory.h to clear memory size up to 36 bytes (72 bytes on 64-bit targets) for calloc. Use repeated stores with 1 branch, instead of up to 3 branches. On x86-64, it is faster than memset since calling memset needs 1 indirect branch, 1 broadcast, and up to 4 branches. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2024-11-29	malloc: send freed small chunks to smallbin	k4lizen	1	-19/+34
	Large chunks get added to the unsorted bin since sorting them takes time, for small chunks the benefit of adding them to the unsorted bin is non-existant, actually hurting performance. Splitting and malloc_consolidate still add small chunks to unsorted, but we can hint the compiler that that is a relatively rare occurance. Benchmarking shows this to be consistently good. Authored-by: k4lizen <k4lizen@proton.me> Signed-off-by: Aleksa Siriški <sir@tmina.org>
2024-11-27	malloc: Avoid func call for tcache quick path in free()	Wangyang Guo	1	-1/+1
	Tcache is an important optimzation to accelerate memory free(), things within this code path should be kept as simple as possible. This commit try to remove the function call when free() invokes tcache code path by inlining _int_free(). Result of bench-malloc-thread benchmark Test Platform: Xeon-8380 Ratio: New / Original time_per_iteration (Lower is Better) Threads# \| Ratio -----------\|------ 1 thread \| 0.879 4 threads \| 0.874 The performance data shows it can improve bench-malloc-thread benchmark by ~12% in both single thread and multi-thread scenario. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-11-25	Silence most -Wzero-as-null-pointer-constant diagnostics	Alejandro Colomar	5	-46/+46
	Replace 0 by NULL and {0} by {}. Omit a few cases that aren't so trivial to fix. Link: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117059> Link: <https://software.codidact.com/posts/292718/292759#answer-292759> Signed-off-by: Alejandro Colomar <alx@kernel.org>
2024-11-25	malloc: Split _int_free() into 3 sub functions	Wangyang Guo	1	-49/+86
	Split _int_free() into 3 smaller functions for flexible combination: * _int_free_check -- sanity check for free * tcache_free -- free memory to tcache (quick path) * _int_free_chunk -- free memory chunk (slow path)
2024-11-12	linux: Add support for getrandom vDSO	Adhemerval Zanella	1	-2/+2
	Linux 6.11 has getrandom() in vDSO. It operates on a thread-local opaque state allocated with mmap using flags specified by the vDSO. Multiple states are allocated at once, as many as fit into a page, and these are held in an array of available states to be doled out to each thread upon first use, and recycled when a thread terminates. As these states run low, more are allocated. To make this procedure async-signal-safe, a simple guard is used in the LSB of the opaque state address, falling back to the syscall if there's reentrancy contention. Also, _Fork() is handled by blocking signals on opaque state allocation (so _Fork() always sees a consistent state even if it interrupts a getrandom() call) and by iterating over the thread stack cache on reclaim_stack. Each opaque state will be in the free states list (grnd_alloc.states) or allocated to a running thread. The cancellation is handled by always using GRND_NONBLOCK flags while calling the vDSO, and falling back to the cancellable syscall if the kernel returns EAGAIN (would block). Since getrandom is not defined by POSIX and cancellation is supported as an extension, the cancellation is handled as 'may occur' instead of 'shall occur' [1], meaning that if vDSO does not block (the expected behavior) getrandom will not act as a cancellation entrypoint. It avoids a pthread_testcancel call on the fast path (different than 'shall occur' functions, like sem_wait()). It is currently enabled for x86_64, which is available in Linux 6.11, and aarch64, powerpc32, powerpc64, loongarch64, and s390x, which are available in Linux 6.12. Link: https://pubs.opengroup.org/onlinepubs/9799919799/nframe.html [1] Co-developed-by: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Jason A. Donenfeld <Jason@zx2c4.com> # x86_64 Tested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> # x86_64, aarch64 Tested-by: Xi Ruoyao <xry111@xry111.site> # x86_64, aarch64, loongarch64 Tested-by: Stefan Liebler <stli@linux.ibm.com> # s390x
2024-08-20	malloc: Link threading tests with $(shared-thread-library)	Samuel Thibault	1	-0/+6
	Fixes build failures on Hurd.
2024-07-27	malloc: Link threading tests with $(shared-thread-library)	Florian Weimer	1	-0/+2
	Fixes build failures on Hurd.
2024-07-22	malloc: add multi-threaded tests for aligned_alloc/calloc/malloc	Miguel Martín	3	-0/+172
	Improve aligned_alloc/calloc/malloc test coverage by adding multi-threaded tests with random memory allocations and with/without cross-thread memory deallocations. Perform a number of memory allocation calls with random sizes limited to 0xffff. Use the existing DSO ('malloc/tst-aligned_alloc-lib.c') to randomize allocator selection. The multi-threaded allocation/deallocation is staged as described below: - Stage 1: Half of the threads will be allocating memory and the other half will be waiting for them to finish the allocation. - Stage 2: Half of the threads will be allocating memory and the other half will be deallocating memory. - Stage 3: Half of the threads will be deallocating memory and the second half waiting on them to finish. Add 'malloc/tst-aligned-alloc-random-thread.c' where each thread will deallocate only the memory that was previously allocated by itself. Add 'malloc/tst-aligned-alloc-random-thread-cross.c' where each thread will deallocate memory that was previously allocated by another thread. The intention is to be able to utilize existing malloc testing to ensure that similar allocation APIs are also exposed to the same rigors. Reviewed-by: Arjun Shankar <arjun@redhat.com>
2024-07-22	malloc: avoid global locks in tst-aligned_alloc-lib.c	Miguel Martín	1	-19/+20
	Make sure the DSO used by aligned_alloc/calloc/malloc tests does not get a global lock on multithreaded tests. Reviewed-by: Arjun Shankar <arjun@redhat.com>
2024-07-19	Fix usage of _STACK_GROWS_DOWN and _STACK_GROWS_UP defines [BZ 31989]	John David Anglin	1	-1/+1
	Signed-off-by: John David Anglin <dave.anglin@bell.net> Reviewed-By: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-06-24	mtrace: make shell commands robust against meta characters	Andreas Schwab	1	-2/+2
	Use the list form of the open function to avoid interpreting meta characters in the arguments.
2024-06-20	malloc: Replace shell/Perl gate in mtrace	Florian Weimer	1	-4/+17
	The previous version expanded $0 and $@ twice. The new version defines a q no-op shell command. The Perl syntax error is masked by the eval Perl function. The q { … } construct is executed by the shell without errors because the q shell function was defined, but treated as a non-expanding quoted string by Perl, effectively hiding its context from the Perl interpreter. As before the script is read by require instead of executed directly, to avoid infinite recursion because the #! line contains /bin/sh. Introduce the “fatal” function to produce diagnostics that are not suppressed by “do”. Use “do” instead of “require” because it has fewer requirements on the executed script than “require”. Prefix relative paths with './' because “do” (and “require“ before) searches for the script in @INC if the path is relative and does not start with './'. Use $_ to make the trampoline shorter. Add an Emacs mode marker to indentify the script as a Perl script.
2024-06-20	malloc: Always install mtrace (bug 31892)	Florian Weimer	2	-7/+4
	Generation of the Perl script does not depend on Perl, so we can always install it even if $(PERL) is not set during the build. Change the malloc/mtrace.pl text substition not to rely on $(PERL). Instead use PATH at run time to find the Perl interpreter. The Perl interpreter cannot execute directly a script that starts with “#! /bin/sh”: it always executes it with /bin/sh. There is no perl command line switch to disable this behavior. Instead, use the Perl require function to execute the script. The additional shift calls remove the “.” shell arguments. Perl interprets the “.” as a string concatenation operator, making the expression syntactically valid. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-06-04	malloc: New test to check malloc alternate path using memory obstruction	sayan paul	2	-0/+73
	The test aims to ensure that malloc uses the alternate path to allocate memory when sbrk() or brk() fails.To achieve this, the test first creates an obstruction at current program break, tests that obstruction with a failing sbrk(), then checks if malloc is still returning a valid ptr thus inferring that malloc() used mmap() instead of brk() or sbrk() to allocate the memory. Reviewed-by: Arjun Shankar <arjun@redhat.com> Reviewed-by: Zack Weinberg <zack@owlfolio.org>
2024-05-14	malloc: Improve aligned_alloc and calloc test coverage.	Joe Simmons-Talbott	5	-0/+151
	Add a DSO (malloc/tst-aligned_alloc-lib.so) that can be used during testing to interpose malloc with a call that randomly uses either aligned_alloc, __libc_malloc, or __libc_calloc in the place of malloc. Use LD_PRELOAD with the DSO to mirror malloc/tst-malloc.c testing as an example in malloc/tst-malloc-random.c. Add malloc/tst-aligned-alloc-random.c as another example that does a number of malloc calls with randomly sized, but limited to 0xffff, requests. The intention is to be able to utilize existing malloc testing to ensure that similar allocation APIs are also exposed to the same rigors. Reviewed-by: DJ Delorie <dj@redhat.com>
2024-05-10	malloc/Makefile: Split and sort tests	H.J. Lu	1	-64/+102
	Put each test on a separate line and sort tests. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-01-12	Make __getrandom_nocancel set errno and add a _nostatus version	Xi Ruoyao	1	-1/+3
	The __getrandom_nocancel function returns errors as negative values instead of errno. This is inconsistent with other _nocancel functions and it breaks "TEMP_FAILURE_RETRY (__getrandom_nocancel (p, n, 0))" in __arc4random_buf. Use INLINE_SYSCALL_CALL instead of INTERNAL_SYSCALL_CALL to fix this issue. But __getrandom_nocancel has been avoiding from touching errno for a reason, see BZ 29624. So add a __getrandom_nocancel_nostatus function and use it in tcache_key_initialize. Signed-off-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-01-01	Update copyright dates not handled by scripts/update-copyrights	Paul Eggert	3	-3/+3
	I've updated copyright dates in glibc for 2024. This is the patch for the changes not generated by scripts/update-copyrights and subsequent build / regeneration of generated files.
2024-01-01	Update copyright dates with scripts/update-copyrights	Paul Eggert	90	-90/+90

2023-11-29	malloc: Improve MAP_HUGETLB with glibc.malloc.hugetlb=2	Adhemerval Zanella	1	-3/+10
	Even for explicit large page support, allocation might use mmap without the hugepage bit set if the requested size is smaller than mmap_threshold. For this case where mmap is issued, MAP_HUGETLB is set iff the allocation size is larger than the used large page. To force such allocations to use large pages, also tune the mmap_threhold (if it is not explicit set by a tunable). This forces allocation to follow the sbrk path, which will fall back to mmap (which will try large pages before galling back to default mmap). Checked on x86_64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
2023-11-22	malloc: Use __get_nprocs on arena_get2 (BZ 30945)	Adhemerval Zanella	1	-1/+1
	This restore the 2.33 semantic for arena_get2. It was changed by 11a02b035b46 to avoid arena_get2 call malloc (back when __get_nproc was refactored to use an scratch_buffer - 903bc7dcc2acafc). The __get_nproc was refactored over then and now it also avoid to call malloc. The 11a02b035b46 did not take in consideration any performance implication, which should have been discussed properly. The __get_nprocs_sched is still used as a fallback mechanism if procfs and sysfs is not acessible. Checked on x86_64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
2023-11-07	malloc: Decorate malloc maps	Adhemerval Zanella	2	-0/+9
	Add anonymous mmap annotations on loader malloc, malloc when it allocates memory with mmap, and on malloc arena. The /proc/self/maps will now print: [anon: glibc: malloc arena] [anon: glibc: malloc] [anon: glibc: loader malloc] On arena allocation, glibc annotates only the read/write mapping. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
2023-10-23	malloc: Fix tst-tcfree3 build csky-linux-gnuabiv2 with fortify source	Adhemerval Zanella	2	-3/+2
	With gcc 13.1 with --enable-fortify-source=2, tst-tcfree3 fails to build on csky-linux-gnuabiv2 with: ../string/bits/string_fortified.h: In function ‘do_test’: ../string/bits/string_fortified.h:26:8: error: inlining failed in call to ‘always_inline’ ‘memcpy’: target specific option mismatch 26 \| __NTH (memcpy (void __restrict __dest, const void __restrict __src, \| ^~~~~~ ../misc/sys/cdefs.h:81:62: note: in definition of macro ‘__NTH’ 81 \| # define __NTH(fct) __attribute__ ((__nothrow__ __LEAF)) fct \| ^~~ tst-tcfree3.c:45:3: note: called from here 45 \| memcpy (c, a, 32); \| ^~~~~~~~~~~~~~~~~ Instead of relying on -O0 to avoid malloc/free to be optimized away, disable the builtin. Reviewed-by: DJ Delorie <dj@redhat.com>
2023-08-15	malloc: Remove bin scanning from memalign (bug 30723)	Florian Weimer	2	-166/+10
	On the test workload (mpv --cache=yes with VP9 video decoding), the bin scanning has a very poor success rate (less than 2%). The tcache scanning has about 50% success rate, so keep that. Update comments in malloc/tst-memalign-2 to indicate the purpose of the tests. Even with the scanning removed, the additional merging opportunities since commit 542b1105852568c3ebc712225ae78b ("malloc: Enable merging of remainders in memalign (bug 30723)") are sufficient to pass the existing large bins test. Remove leftover variables from _int_free from refactoring in the same commit. Reviewed-by: DJ Delorie <dj@redhat.com>
2023-08-11	malloc: Enable merging of remainders in memalign (bug 30723)	Florian Weimer	1	-76/+121
	Previously, calling _int_free from _int_memalign could put remainders into the tcache or into fastbins, where they are invisible to the low-level allocator. This results in missed merge opportunities because once these freed chunks become available to the low-level allocator, further memalign allocations (even of the same size are) likely obstructing merges. Furthermore, during forwards merging in _int_memalign, do not completely give up when the remainder is too small to serve as a chunk on its own. We can still give it back if it can be merged with the following unused chunk. This makes it more likely that memalign calls in a loop achieve a compact memory layout, independently of initial heap layout. Drop some useless (unsigned long) casts along the way, and tweak the style to more closely match GNU on changed lines. Reviewed-by: DJ Delorie <dj@redhat.com>