aboutsummaryrefslogtreecommitdiff
path: root/malloc/malloc.c
AgeCommit message (Collapse)AuthorFilesLines
2017-10-24Add single-threaded path to _int_mallocWilco Dijkstra1-25/+38
This patch adds single-threaded fast paths to _int_malloc. * malloc/malloc.c (_int_malloc): Add SINGLE_THREAD_P path.
2017-10-24Add single-threaded path to malloc/realloc/calloc/memallocWilco Dijkstra1-9/+41
This patch adds a single-threaded fast path to malloc, realloc, calloc and memalloc. When we're single-threaded, we can bypass arena_get (which always locks the arena it returns) and just use the main arena. Also avoid retrying a different arena since there is just the main arena. * malloc/malloc.c (__libc_malloc): Add SINGLE_THREAD_P path. (__libc_realloc): Likewise. (_mid_memalign): Likewise. (__libc_calloc): Likewise.
2017-10-20Fix build issue with SINGLE_THREAD_PWilco Dijkstra1-0/+3
Add sysdep-cancel.h include. * malloc/malloc.c (sysdep-cancel.h): Add include.
2017-10-20Add single-threaded path to _int_freeWilco Dijkstra1-14/+29
This patch adds single-threaded fast paths to _int_free. Bypass the explicit locking for larger allocations. * malloc/malloc.c (_int_free): Add SINGLE_THREAD_P fast paths.
2017-10-19Fix deadlock in _int_free consistency checkWilco Dijkstra1-9/+12
This patch fixes a deadlock in the fastbin consistency check. If we fail the fast check due to concurrent modifications to the next chunk or system_mem, we should not lock if we already have the arena lock. Simplify the check to make it obviously correct. * malloc/malloc.c (_int_free): Fix deadlock bug in consistency check.
2017-10-18Fix build failure on tilepro due to unsupported atomicsWilco Dijkstra1-1/+2
* malloc/malloc.c (malloc_state): Use int for have_fastchunks since not all targets support atomics on bool.
2017-10-17Improve malloc initialization sequenceWilco Dijkstra1-109/+88
The current malloc initialization is quite convoluted. Instead of sometimes calling malloc_consolidate from ptmalloc_init, call malloc_init_state early so that the main_arena is always initialized. The special initialization can now be removed from malloc_consolidate. This also fixes BZ #22159. Check all calls to malloc_consolidate and remove calls that are redundant initialization after ptmalloc_init, like in int_mallinfo and __libc_mallopt (but keep the latter as consolidation is required for set_max_fast). Update comments to improve clarity. Remove impossible initialization check from _int_malloc, fix assert in do_check_malloc_state to ensure arena->top != 0. Fix the obvious bugs in do_check_free_chunk and do_check_remalloced_chunk to enable single threaded malloc debugging (do_check_malloc_state is not thread safe!). [BZ #22159] * malloc/arena.c (ptmalloc_init): Call malloc_init_state. * malloc/malloc.c (do_check_free_chunk): Fix build bug. (do_check_remalloced_chunk): Fix build bug. (do_check_malloc_state): Add assert that checks arena->top. (malloc_consolidate): Remove initialization. (int_mallinfo): Remove call to malloc_consolidate. (__libc_mallopt): Clarify why malloc_consolidate is needed.
2017-10-17Use relaxed atomics for malloc have_fastchunksWilco Dijkstra1-32/+20
Currently free typically uses 2 atomic operations per call. The have_fastchunks flag indicates whether there are recently freed blocks in the fastbins. This is purely an optimization to avoid calling malloc_consolidate too often and avoiding the overhead of walking all fast bins even if all are empty during a sequence of allocations. However using catomic_or to update the flag is completely unnecessary since it can be changed into a simple boolean and accessed using relaxed atomics. There is no change in multi-threaded behaviour given the flag is already approximate (it may be set when there are no blocks in any fast bins, or it may be clear when there are free blocks that could be consolidated). Performance of malloc/free improves by 27% on a simple benchmark on AArch64 (both single and multithreaded). The number of load/store exclusive instructions is reduced by 33%. Bench-malloc-thread speeds up by ~3% in all cases. * malloc/malloc.c (FASTCHUNKS_BIT): Remove. (have_fastchunks): Remove. (clear_fastchunks): Remove. (set_fastchunks): Remove. (malloc_state): Add have_fastchunks. (malloc_init_state): Use have_fastchunks. (do_check_malloc_state): Remove incorrect invariant checks. (_int_malloc): Use have_fastchunks. (_int_free): Likewise. (malloc_consolidate): Likewise.
2017-10-17Inline tcache functionsWilco Dijkstra1-2/+2
The functions tcache_get and tcache_put show up in profiles as they are a critical part of the tcache code. Inline them to give tcache a 16% performance gain. Since this improves multi-threaded cases as well, it helps offset any potential performance loss due to adding single-threaded fast paths. * malloc/malloc.c (tcache_put): Inline. (tcache_get): Inline.
2017-10-06malloc: Fix tcache leak after thread destruction [BZ #22111]Carlos O'Donell1-3/+5
The malloc tcache added in 2.26 will leak all of the elements remaining in the cache and the cache structure itself when a thread exits. The defect is that we do not set tcache_shutting_down early enough, and the thread simply recreates the tcache and places the elements back onto a new tcache which is subsequently lost as the thread exits (unfreed memory). The fix is relatively simple, move the setting of tcache_shutting_down earlier in tcache_thread_freeres. We add a test case which uses mallinfo and some heuristics to look for unaccounted for memory usage between the start and end of a thread start/join loop. It is very reliable at detecting that there is a leak given the number of iterations. Without the fix the test will consume 122MiB of leaked memory.
2017-08-31malloc: Remove the internal_function attributeFlorian Weimer1-13/+4
2017-08-31malloc: Resolve compilation failure in NDEBUG modeFlorian Weimer1-18/+7
In _int_free, the locked variable is not used if NDEBUG is defined.
2017-08-31malloc: Change top_check return type to voidFlorian Weimer1-1/+1
After commit ec2c1fcefb200c6cb7e09553f3c6af8815013d83, (malloc: Abort on heap corruption, without a backtrace), the function always returns 0.
2017-08-30malloc: Remove corrupt arena flagFlorian Weimer1-13/+0
This is no longer needed because we now abort immediately once heap corruption is detected.
2017-08-30malloc: Remove check_action variable [BZ #21754]Florian Weimer1-122/+30
Clean up calls to malloc_printerr and trim its argument list. This also removes a few bits of work done before calling malloc_printerr (such as unlocking operations). The tunable/environment variable still enables the lightweight additional malloc checking, but mallopt (M_CHECK_ACTION) no longer has any effect.
2017-08-30malloc: Abort on heap corruption, without a backtrace [BZ #21754]Florian Weimer1-19/+4
The stack trace printing caused deadlocks and has been itself been targeted by code execution exploits.
2017-08-10malloc: Avoid optimizer warning with GCC 7 and -O3Florian Weimer1-4/+16
2017-07-11Avoid backtrace from __stack_chk_fail [BZ #12189]H.J. Lu1-2/+4
__stack_chk_fail is called on corrupted stack. Stack backtrace is very unreliable against corrupted stack. __libc_message is changed to accept enum __libc_message_action and call BEFORE_ABORT only if action includes do_backtrace. __fortify_fail_abort is added to avoid backtrace from __stack_chk_fail. [BZ #12189] * debug/Makefile (CFLAGS-tst-ssp-1.c): New. (tests): Add tst-ssp-1 if -fstack-protector works. * debug/fortify_fail.c: Include <stdbool.h>. (_fortify_fail_abort): New function. (__fortify_fail): Call _fortify_fail_abort. (__fortify_fail_abort): Add a hidden definition. * debug/stack_chk_fail.c: Include <stdbool.h>. (__stack_chk_fail): Call __fortify_fail_abort, instead of __fortify_fail. * debug/tst-ssp-1.c: New file. * include/stdio.h (__libc_message_action): New enum. (__libc_message): Replace int with enum __libc_message_action. (__fortify_fail_abort): New hidden prototype. * malloc/malloc.c (malloc_printerr): Update __libc_message calls. * sysdeps/posix/libc_fatal.c (__libc_message): Replace int with enum __libc_message_action. Call BEFORE_ABORT only if action includes do_backtrace. (__libc_fatal): Update __libc_message call.
2017-07-06Add per-thread cache to mallocDJ Delorie1-9/+341
* config.make.in: Enable experimental malloc option. * configure.ac: Likewise. * configure: Regenerate. * manual/install.texi: Document it. * INSTALL: Regenerate. * malloc/Makefile: Likewise. * malloc/malloc.c: Add per-thread cache (tcache). (tcache_put): New. (tcache_get): New. (tcache_thread_freeres): New. (tcache_init): New. (__libc_malloc): Use cached chunks if available. (__libc_free): Initialize tcache if needed. (__libc_realloc): Likewise. (__libc_calloc): Likewise. (_int_malloc): Prefill tcache when appropriate. (_int_free): Likewise. (do_set_tcache_max): New. (do_set_tcache_count): New. (do_set_tcache_unsorted_limit): New. * manual/probes.texi: Document new probes. * malloc/arena.c: Add new tcache tunables. * elf/dl-tunables.list: Likewise. * manual/tunables.texi: Document them. * NEWS: Mention the per-thread cache.
2017-05-03Tweak realloc/MREMAP comment to be more accurate.DJ Delorie1-3/+3
MMap'd memory isn't shrunk without MREMAP, but IIRC this is intentional for performance reasons. Regardless, this patch tweaks the existing comment to be more accurate wrt the existing code. [BZ #21411] * malloc/malloc.c: Tweak realloc/MREMAP comment to be more accurate.
2017-04-18malloc: Turn cfree into a compatibility symbolFlorian Weimer1-2/+3
2017-04-01Call the right helper function when setting mallopt M_ARENA_MAX (BZ #21338)Wladimir J. van der Laan1-1/+1
Fixes a typo introduced in commit be7991c0705e35b4d70a419d117addcd6c627319. This caused mallopt(M_ARENA_MAX) as well as the environment variable MALLOC_ARENA_MAX to not work as intended because it set the wrong internal parameter. [BZ #21338] * malloc/malloc.c: Call do_set_arena_max for M_ARENA_MAX instead of incorrect do_set_arena_test
2017-03-17Further harden glibc malloc metadata against 1-byte overflows.DJ Delorie1-0/+2
Additional check for chunk_size == next->prev->chunk_size in unlink() 2017-03-17 Chris Evans <scarybeasts@gmail.com> * malloc/malloc.c (unlink): Add consistency check between size and next->prev->size, to further harden against 1-byte overflows.
2017-03-01Narrowing the visibility of libc-internal.h even further.Zack Weinberg1-1/+1
posix/wordexp-test.c used libc-internal.h for PTR_ALIGN_DOWN; similar to what was done with libc-diag.h, I have split the definitions of cast_to_integer, ALIGN_UP, ALIGN_DOWN, PTR_ALIGN_UP, and PTR_ALIGN_DOWN to a new header, libc-pointer-arith.h. It then occurred to me that the remaining declarations in libc-internal.h are mostly to do with early initialization, and probably most of the files including it, even in the core code, don't need it anymore. Indeed, only 19 files actually need what remains of libc-internal.h. 23 others need libc-diag.h instead, and 12 need libc-pointer-arith.h instead. No file needs more than one of them, and 16 don't need any of them! So, with this patch, libc-internal.h stops including libc-diag.h as well as losing the pointer arithmetic macros, and all including files are adjusted. * include/libc-pointer-arith.h: New file. Define cast_to_integer, ALIGN_UP, ALIGN_DOWN, PTR_ALIGN_UP, and PTR_ALIGN_DOWN here. * include/libc-internal.h: Definitions of above macros moved from here. Don't include libc-diag.h anymore either. * posix/wordexp-test.c: Include stdint.h and libc-pointer-arith.h. Don't include libc-internal.h. * debug/pcprofile.c, elf/dl-tunables.c, elf/soinit.c, io/openat.c * io/openat64.c, misc/ptrace.c, nptl/pthread_clock_gettime.c * nptl/pthread_clock_settime.c, nptl/pthread_cond_common.c * string/strcoll_l.c, sysdeps/nacl/brk.c * sysdeps/unix/clock_settime.c * sysdeps/unix/sysv/linux/i386/get_clockfreq.c * sysdeps/unix/sysv/linux/ia64/get_clockfreq.c * sysdeps/unix/sysv/linux/powerpc/get_clockfreq.c * sysdeps/unix/sysv/linux/sparc/sparc64/get_clockfreq.c: Don't include libc-internal.h. * elf/get-dynamic-info.h, iconv/loop.c * iconvdata/iso-2022-cn-ext.c, locale/weight.h, locale/weightwc.h * misc/reboot.c, nis/nis_table.c, nptl_db/thread_dbP.h * nscd/connections.c, resolv/res_send.c, soft-fp/fmadf4.c * soft-fp/fmasf4.c, soft-fp/fmatf4.c, stdio-common/vfscanf.c * sysdeps/ieee754/dbl-64/e_lgamma_r.c * sysdeps/ieee754/dbl-64/k_rem_pio2.c * sysdeps/ieee754/flt-32/e_lgammaf_r.c * sysdeps/ieee754/flt-32/k_rem_pio2f.c * sysdeps/ieee754/ldbl-128/k_tanl.c * sysdeps/ieee754/ldbl-128ibm/k_tanl.c * sysdeps/ieee754/ldbl-96/e_lgammal_r.c * sysdeps/ieee754/ldbl-96/k_tanl.c, sysdeps/nptl/futex-internal.h: Include libc-diag.h instead of libc-internal.h. * elf/dl-load.c, elf/dl-reloc.c, locale/programs/locarchive.c * nptl/nptl-init.c, string/strcspn.c, string/strspn.c * malloc/malloc.c, sysdeps/i386/nptl/tls.h * sysdeps/nacl/dl-map-segments.h, sysdeps/x86_64/atomic-machine.h * sysdeps/unix/sysv/linux/spawni.c * sysdeps/x86_64/nptl/tls.h: Include libc-pointer-arith.h instead of libc-internal.h. * elf/get-dynamic-info.h, sysdeps/nacl/dl-map-segments.h * sysdeps/x86_64/atomic-machine.h: Add multiple include guard.
2017-01-01Update copyright dates with scripts/update-copyrights.Joseph Myers1-1/+1
2016-10-28malloc: Update comments about chunk layoutFlorian Weimer1-10/+30
2016-10-28sysmalloc: Initialize previous size field of mmaped chunksFlorian Weimer1-0/+1
With different encodings of the header, the previous zero initialization may be insufficient and produce an invalid encoding.
2016-10-28malloc: Use accessors for chunk metadata accessFlorian Weimer1-63/+84
This change allows us to change the encoding of these struct members in a centralized fashion.
2016-10-27Static inline functions for mallopt helpersSiddhesh Poyarekar1-34/+93
Make mallopt helper functions for each mallopt parameter so that it can be called consistently in other areas, like setting tunables. * malloc/malloc.c (do_set_mallopt_check): New function. (do_set_mmap_threshold): Likewise. (do_set_mmaps_max): Likewise. (do_set_top_pad): Likewise. (do_set_perturb_byte): Likewise. (do_set_trim_threshold): Likewise. (do_set_arena_max): Likewise. (do_set_arena_test): Likewise. (__libc_mallopt): Use them.
2016-10-26malloc: Remove malloc_get_state, malloc_set_state [BZ #19473]Florian Weimer1-2/+0
After the removal of __malloc_initialize_hook, newly compiled Emacs binaries are no longer able to use these interfaces. malloc_get_state is only used during the Emacs build process, so we provide a stub implementation only. Existing Emacs binaries will not call this stub function, but still reference the symbol. The rewritten tst-mallocstate test constructs a dumped heap which should approximates what existing Emacs binaries pass to glibc malloc.
2016-10-26Remove redundant definitions of M_ARENA_* macrosSiddhesh Poyarekar1-5/+0
The M_ARENA_MAX and M_ARENA_TEST macros are defined in malloc.c as well as malloc.h, and the former is unnecessary. This patch removes the duplicate. Tested on x86_64 to verify that the generated code remains unchanged barring changed line numbers to __malloc_assert. * malloc/malloc.c (M_ARENA_TEST, M_ARENA_MAX): Remove.
2016-10-26Document the M_ARENA_* mallopt parametersSiddhesh Poyarekar1-1/+0
The M_ARENA_* mallopt parameters are in wide use in production to control the number of arenas that a long lived process creates and hence there is no point in stating that this interface is non-public. Document this interface and remove the obsolete comment. * manual/memory.texi (M_ARENA_TEST): Add documentation. (M_ARENA_MAX): Likewise. * malloc/malloc.c: Remove obsolete comment.
2016-09-21malloc: Manual part of conversion to __libc_lockFlorian Weimer1-1/+1
This removes the old mutex_t-related definitions from malloc-machine.h, too.
2016-09-06malloc: Automated part of conversion to __libc_lockFlorian Weimer1-20/+20
2016-08-03elf: dl-minimal malloc needs to respect fundamental alignmentFlorian Weimer1-63/+0
The dynamic linker currently uses __libc_memalign for TLS-related allocations. The goal is to switch to malloc instead. If the minimal malloc follows the ABI fundamental alignment, we can assume that malloc provides this alignment, and thus skip explicit alignment in a few cases as an optimization. It was requested on libc-alpha that MALLOC_ALIGNMENT should be used, although this results in wasted space if MALLOC_ALIGNMENT is larger than the fundamental alignment. (The dynamic linker cannot assume that the non-minimal malloc will provide an alignment of MALLOC_ALIGNMENT; the ABI provides _Alignof (max_align_t) only.)
2016-06-20Revert __malloc_initialize_hook symbol poisoningFlorian Weimer1-3/+3
It turns out the Emacs-internal malloc implementation uses __malloc_* symbols. If glibc poisons them in <stdc-pre.h>, Emacs will no longer compile.
2016-06-11malloc_usable_size: Use correct size for dumped fake mapped chunksFlorian Weimer1-1/+6
The adjustment for the size computation in commit 1e8a8875d69e36d2890b223ffe8853a8ff0c9512 is needed in malloc_usable_size, too.
2016-06-10malloc: Remove __malloc_initialize_hook from the API [BZ #19564]Florian Weimer1-1/+15
__malloc_initialize_hook is interposed by application code, so the usual approach to define a compatibility symbol does not work. This commit adds a new mechanism based on #pragma GCC poison in <stdc-predef.h>.
2016-06-08malloc: Correct size computation in realloc for dumped fake mmapped chunksFlorian Weimer1-4/+8
For regular mmapped chunks there are two size fields (hence a reduction by 2 * SIZE_SZ bytes), but for fake chunks, we only have one size field, so we need to subtract SIZE_SZ bytes. This was initially reported as Emacs bug 23726.
2016-05-24malloc: Correct malloc alignment on 32-bit architectures [BZ #6527]Florian Weimer1-14/+2
After the heap rewriting added in commit 4cf6c72fd2a482e7499c29162349810029632c3f (malloc: Rewrite dumped heap for compatibility in __malloc_set_state), we can change malloc alignment for new allocations because the alignment of old allocations no longer matters. We need to increase the malloc state version number, so that binaries containing dumped heaps of the new layout will not try to run on previous versions of glibc, resulting in obscure crashes. This commit addresses a failure of tst-malloc-thread-fail on the affected architectures (32-bit ppc and mips) because the test checks pointer alignment.
2016-05-13malloc: Rewrite dumped heap for compatibility in __malloc_set_stateFlorian Weimer1-9/+46
This will allow us to change many aspects of the malloc implementation while preserving compatibility with existing Emacs binaries. As a result, existing Emacs binaries will have a larger RSS, and Emacs needs a few more milliseconds to start. This overhead is specific to Emacs (and will go away once Emacs switches to its internal malloc). The new checks to make free and realloc compatible with the dumped heap are confined to the mmap paths, which are already quite slow due to the munmap overhead. This commit weakens some security checks, but only for heap pointers in the dumped main arena. By default, this area is empty, so those checks are as effective as before.
2016-04-14malloc: Remove malloc hooks from fork handlerFlorian Weimer1-2/+0
The fork handler now runs so late that there is no risk anymore that other fork handlers in the same thread use malloc, so it is no longer necessary to install malloc hooks which made a subset of malloc functionality available to the thread that called fork.
2016-04-14malloc: Run fork handler as late as possible [BZ #19431]Florian Weimer1-0/+1
Previously, a thread M invoking fork would acquire locks in this order: (M1) malloc arena locks (in the registered fork handler) (M2) libio list lock A thread F invoking flush (NULL) would acquire locks in this order: (F1) libio list lock (F2) individual _IO_FILE locks A thread G running getdelim would use this order: (G1) _IO_FILE lock (G2) malloc arena lock After executing (M1), (F1), (G1), none of the threads can make progress. This commit changes the fork lock order to: (M'1) libio list lock (M'2) malloc arena locks It explicitly encodes the lock order in the implementations of fork, and does not rely on the registration order, thus avoiding the deadlock.
2016-03-11Fix type of parameter passed by malloc_consolidateTulio Magno Quites Machado Filho1-1/+1
atomic_exchange_acq() expected a pointer, but was receiving an integer.
2016-02-19malloc: Remove NO_THREADSFlorian Weimer1-2/+0
No functional change. It was not possible to build without threading support before.
2016-02-19malloc: Remove max_total_mem member form struct malloc_parFlorian Weimer1-6/+2
Also note that sumblks in struct mallinfo is always 0. No functional change.
2016-02-19malloc: Remove arena_mem variableFlorian Weimer1-2/+0
The computed value is never used. The accesses were data races.
2016-01-04Update copyright dates with scripts/update-copyrights.Joseph Myers1-1/+1
2015-12-21malloc: Fix list_lock/arena lock deadlock [BZ #19182]Florian Weimer1-3/+3
* malloc/arena.c (list_lock): Document lock ordering requirements. (free_list_lock): New lock. (ptmalloc_lock_all): Comment on free_list_lock. (ptmalloc_unlock_all2): Reinitialize free_list_lock. (detach_arena): Update comment. free_list_lock is now needed. (_int_new_arena): Use free_list_lock around detach_arena call. Acquire arena lock after list_lock. Add comment, including FIXME about incorrect synchronization. (get_free_list): Switch to free_list_lock. (reused_arena): Acquire free_list_lock around detach_arena call and attached threads counter update. Add two FIXMEs about incorrect synchronization. (arena_thread_freeres): Switch to free_list_lock. * malloc/malloc.c (struct malloc_state): Update comments to mention free_list_lock.
2015-11-24Replace MUTEX_INITIALIZER with _LIBC_LOCK_INITIALIZER in generic codeFlorian Weimer1-1/+1
* sysdeps/mach/hurd/libc-lock.h (_LIBC_LOCK_INITIALIZER): Define. (__libc_lock_define_initialized): Use it. * sysdeps/nptl/libc-lockP.h (_LIBC_LOCK_INITIALIZER): Define. * malloc/arena.c (list_lock): Use _LIBC_LOCK_INITIALIZER. * malloc/malloc.c (main_arena): Likewise. * sysdeps/generic/malloc-machine.h (MUTEX_INITIALIZER): Remove. * sysdeps/nptl/malloc-machine.h (MUTEX_INITIALIZER): Remove.