Age | Commit message (Collapse) | Author | Files | Lines |
|
GCC 7 changed the definition of max_align_t on i386:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9b5c49ef97e63cc63f1ffa13baf771368105ebe2
As a result, glibc malloc no longer returns memory blocks which are as
aligned as max_align_t requires.
This causes malloc/tst-malloc-thread-fail to fail with an error like this
one:
error: allocation function 0, size 144 not aligned to 16
This patch moves the MALLOC_ALIGNMENT definition to <malloc-alignment.h>
and increases the malloc alignment to 16 for i386.
[BZ #21120]
* malloc/malloc.c (MALLOC_ALIGNMENT): Moved to ...
* sysdeps/generic/malloc-alignment.h: Here. New file.
* sysdeps/i386/malloc-alignment.h: Likewise.
* sysdeps/generic/malloc-machine.h: Include <malloc-alignment.h>.
(cherry picked from commit 4e61a6be446026c327aa70cef221c9082bf0085d)
|
|
Compiling resolv/tst-resolv-qtypes.c with GCC 7 results in:
tst-resolv-qtypes.c:53:14: error: duplicate ‘const’ declaration specifier [-Werror=duplicate-decl-specifier]
static const const char *domain = "www.example.com";
This patch removes the duplicate const and makes domain a const pointer
to const char literal.
ChangeLog:
* resolv/tst-resolv-qtypes.c (domain):
Change type to const pointer to const char.
(cherry picked from commit d4f94368a96541db2b38b6535402a941f5aff975)
|
|
GCC 7 has a -Wstringop-overflow= warning that includes warning for
strncat with a size specified that is larger than the size of the
buffer (which is dubious usage, but valid at runtime if in fact there
isn't an overflow with the particular buffer contents present).
string/tester.c tests such cases; this patch arranges for this warning
to be ignored around relevant strncat calls.
Tested compilation for aarch64 (GCC mainline) with
build-many-glibcs.py; did execution testing for x86_64 (GCC 5).
* string/tester.c (test_strncat): Disable -Wstringop-overflow=
around tests of strncat with large sizes.
(cherry picked from commit 3ecd616cc1782210d09c9678ec1a48899f19145b)
|
|
GCC 7 has a -Walloc-size-larger-than= warning for allocations of half
the address space or more. This causes errors building glibc tests
that deliberately test failure of very large allocations. This patch
arranges for this warning to be ignored around the problematic
function calls.
Tested compilation for aarch64 (GCC mainline) with
build-many-glibcs.py; did execution testing for x86_64 (GCC 5).
* malloc/tst-malloc.c: Include <libc-internal.h>.
(do_test): Disable -Walloc-size-larger-than= around tests of
malloc with negative sizes.
* malloc/tst-mcheck.c: Include <libc-internal.h>.
(do_test): Disable -Walloc-size-larger-than= around tests of
malloc and realloc with negative sizes.
* malloc/tst-realloc.c: Include <libc-internal.h>.
(do_test): Disable -Walloc-size-larger-than= around tests of
realloc with negative sizes.
(cherry picked from commit 3d7229c2507be1daf0c3e15e1f134076fa8b9025)
|
|
This patch fixes the glibc testsuite build for GCC 7
-Wformat-truncation, newly moved out of -Wformat-length and with some
further warnings that didn't previously appear. Two tests that
previously disabled -Wformat-length are changed to disable
-Wformat-truncation instead; two others are made to disable that
option as well.
Tested (compilation only) with build-many-glibcs.py for aarch64 with
GCC mainline.
* stdio-common/tst-printf.c [__GNUC_PREREQ (7, 0)]: Ignore
-Wformat-truncation instead of -Wformat-length.
* time/tst-strptime2.c (mkbuf) [__GNUC_PREREQ (7, 0)]: Likewise.
* stdio-common/tstdiomisc.c (F): Ignore -Wformat-truncation for
GCC 7.
* wcsmbs/tst-wcstof.c: Include <libc-internal.h>.
(do_test): Ignore -Wformat-truncation for GCC 7.
(cherry picked from commit 3c9378265a8633e2c85a393b54a16abcf64fe616)
|
|
* stdio-common/tst-printf.c: Ignore -Wformat-length warning.
(cherry picked from commit 9032070deaa03431921315f973c548c2c403fecc)
|
|
* time/tst-strptime2.c: Ignore -Wformat-length warning.
(cherry picked from commit 26d7185d6f0a79188fdf02c5eec6e52bb29112f8)
|
|
glibc build with current mainline GCC fails because
nis/nss_nisplus/nisplus-alias.c contains code
if (name != NULL)
{
*errnop = EINVAL;
return NSS_STATUS_UNAVAIL;
}
char buf[strlen (name) + 9 + tablename_len];
producing an error about strlen being called on a pointer that is
always NULL (and a subsequent use of that pointer with a %s format in
snprintf).
As Andreas noted, the bogus conditional comes from a 1997 change:
- if (name == NULL || strlen(name) > 8)
- return NSS_STATUS_NOTFOUND;
- else
+ if (name != NULL || strlen(name) <= 8)
So the intention is clearly to return an error for NULL name.
This patch duly inverts the sense of the conditional. It fixes the
build with GCC mainline, and passes usual glibc testsuite testing for
x86_64. However, I have not tried any actual substantive nisplus
testing, do not have an environment for such testing, and do not know
whether it is possible that strlen (name) or tablename_len might be
large so that the VLA for buf is actually a security issue. However,
if it is a security issue, there are plenty of other similar instances
in the nisplus code (that haven't been hidden by a bogus comparison
with NULL) - and nis_table.c:__create_ib_request uses strdupa on the
string passed to nis_list, so a local fix in the caller wouldn't
suffice anyway (see bug 20987). (Calls to strdupa and other such
macros that use alloca must be considered equally questionable
regarding stack overflow issues as direct calls to alloca and VLA
declarations.)
[BZ #20978]
* nis/nss_nisplus/nisplus-alias.c (_nss_nisplus_getaliasbyname_r):
Compare name == NULL, not name != NULL.
(cherry picked from commit f88759ea9bd3c8d8fef28f123ba9767cb0e421a3)
|
|
Building with GCC 7 produces an error building rpcgen:
rpc_parse.c: In function 'get_prog_declaration':
rpc_parse.c:543:25: error: may write a terminating nul past the end of the destination [-Werror=format-length=]
sprintf (name, "%s%d", ARGNAME, num); /* default name of argument */
~~~~^
rpc_parse.c:543:5: note: format output between 5 and 14 bytes into a destination of size 10
sprintf (name, "%s%d", ARGNAME, num); /* default name of argument */
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
That buffer overrun is for the case where the .x file declares a
program with a million arguments. The strcpy two lines above can
generate a buffer overrun much more simply for a long argument name.
The limit on length of line read by rpcgen (MAXLINESIZE == 1024)
provides a bound on the buffer size needed, so this patch just changes
the buffer size to MAXLINESIZE to avoid both possible buffer
overruns. A testcase is added that rpcgen does not crash with a
500-character argument name, where it previously crashed.
It would not at all surprise me if there are many other ways of
crashing rpcgen with either valid or invalid input; fuzz testing would
likely find various such bugs, though I don't think they are that
important to fix (rpcgen is not that likely to be used with untrusted
.x files as input). (As well as fuzz-findable bugs there are probably
also issues when various int variables get overflowed on very large
input.) The test infrastructure for rpcgen-not-crashing tests would
need extending if tests are to be added for cases where rpcgen should
produce an error, as opposed to cases where it should succeed.
Tested for x86_64 and x86.
[BZ #20790]
* sunrpc/rpc_parse.c (get_prog_declaration): Increase buffer size
to MAXLINESIZE.
* sunrpc/bug20790.x: New file.
* sunrpc/Makefile [$(run-built-tests) = yes] (rpcgen-tests): New
variable.
[$(run-built-tests) = yes] (tests-special): Add $(rpcgen-tests).
[$(run-built-tests) = yes] ($(rpcgen-tests)): New rule.
(cherry picked from commit 5874510faaf3cbd0bb112aaacab9f225002beed1)
|
|
(cherry picked from commit c2528fef3b05bcffb1ac27c6c09cc3ff24b7f03f)
|
|
<bits/std_abs.h> from GCC 7 will include /usr/include/stdlib.h from
"#include_next" (instead of stdlib/stdlib.h in the glibc source
directory), and this turns up as a make dependency. Also make a copy
of <bits/std_abs.h> to prevent it from including /usr/include/stdlib.h.
[BZ #21573]
* Makerules [$(c++-bits-std_abs-h) != ""] (before-compile): Add
$(common-objpfx)bits/std_abs.h.
[$(c++-bits-std_abs-h) != ""] ($(common-objpfx)bits/std_abs.h):
New target.
* config.make.in (c++-bits-std_abs-h): New.
* configure.ac (find_cxx_header): Use "\,$1," with sed.
(CXX_BITS_STD_ABS_H): New.
(AC_SUBST(CXX_BITS_STD_ABS_H)): Likewise.
* configure: Regenerated.
(cherry picked from commit a65ea28d1833d3502c5070472e43bda04410e6b5)
|
|
This reduces the build time somewhat and is particularly noticeable
during rebuilds with few code changes.
(cherry picked from commit fc3e1337be1c6935ab58bd13520f97a535cf70cc)
|
|
The .symver directive on common symbol just creates a new common symbol,
not an alias and the newer assembler with the bug fix for
https://sourceware.org/bugzilla/show_bug.cgi?id=21661
will issue an error. Before the fix, we got
$ readelf -sW libc.so | grep "loc[12s]"
5109: 00000000003a0608 8 OBJECT LOCAL DEFAULT 36 loc1
5188: 00000000003a0610 8 OBJECT LOCAL DEFAULT 36 loc2
5455: 00000000003a0618 8 OBJECT LOCAL DEFAULT 36 locs
6575: 00000000003a05f0 8 OBJECT GLOBAL DEFAULT 36 locs@GLIBC_2.2.5
7156: 00000000003a05f8 8 OBJECT GLOBAL DEFAULT 36 loc1@GLIBC_2.2.5
7312: 00000000003a0600 8 OBJECT GLOBAL DEFAULT 36 loc2@GLIBC_2.2.5
in libc.so. The versioned loc1, loc2 and locs have the wrong addresses.
After the fix, we got
$ readelf -sW libc.so | grep "loc[12s]"
6570: 000000000039e3b8 8 OBJECT GLOBAL DEFAULT 34 locs@GLIBC_2.2.5
7151: 000000000039e3c8 8 OBJECT GLOBAL DEFAULT 34 loc1@GLIBC_2.2.5
7307: 000000000039e3c0 8 OBJECT GLOBAL DEFAULT 34 loc2@GLIBC_2.2.5
[BZ #21666]
* misc/regexp.c (loc1): Add __attribute__ ((nocommon));
(loc2): Likewise.
(locs): Likewise.
(cherry picked from commit 388b4f1a02f3a801965028bbfcd48d905638b797)
|
|
Since commit d957c4d3fa48d685ff2726c605c988127ef99395 (i386: Compile
rtld-*.os with -mno-sse -mno-mmx -mfpmath=387), vector intrinsics can
no longer be used in ld.so, even if the compiled code never makes it
into the final ld.so link. This commit adds the missing IS_IN (libc)
guard to the SSE 4.2 strcspn implementation, so that it can be used from
ld.so in the future.
(cherry picked from commit 69052a3a95da37169a08f9e59b2cc1808312753c)
|
|
The LD_HWCAP_MASK environment variable may alter the selection of
function variants for some architectures. For AT_SECURE process it
means that if an outdated routine has a bug that would otherwise not
affect newer platforms by default, LD_HWCAP_MASK will allow that bug
to be exploited.
To be on the safe side, ignore and disable LD_HWCAP_MASK for setuid
binaries.
[BZ #21209]
* elf/rtld.c (process_envvars): Ignore LD_HWCAP_MASK for
AT_SECURE processes.
* sysdeps/generic/unsecvars.h: Add LD_HWCAP_MASK.
(cherry picked from commit 1c1243b6fc33c029488add276e56570a07803bfd)
|
|
Also only process the last LD_AUDIT entry.
(cherry picked from commit 81b82fb966ffbd94353f793ad17116c6088dedd9)
|
|
(cherry picked from commit 6d0ba622891bed9d8394eef1935add53003b12e8)
|
|
LD_LIBRARY_PATH can only be used to reorder system search paths, which
is not useful functionality.
This makes an exploitable unbounded alloca in _dl_init_paths unreachable
for AT_SECURE=1 programs.
(cherry picked from commit f6110a8fee2ca36f8e2d2abecf3cba9fa7b8ea7d)
|
|
(cherry picked from commit 64ae9fe45662c8994b0e56ab469b01967408a154)
|
|
[BZ #19922]
* locales/iso14651_t1_common: Add collation rules for U+07DA to U+07DF.
[BZ #19919]
* locales/iso14651_t1_common: Correct collation of U+0D36 and U+0D37.
|
|
(cherry picked from commit 1d2bc2eae969543b89850e35e532f3144122d80a)
|
|
(cherry picked from commit fdc543919a3d8578631a492e1227c2cd8f5ecec7)
|
|
This patch remove the PID cache and usage in current GLIBC code. Current
usage is mainly used a performance optimization to avoid the syscall,
however it adds some issues:
- The exposed clone syscall will try to set pid/tid to make the new
thread somewhat compatible with current GLIBC assumptions. This cause
a set of issue with new workloads and usecases (such as BZ#17214 and
[1]) as well for new internal usage of clone to optimize other algorithms
(such as clone plus CLONE_VM for posix_spawn, BZ#19957).
- The caching complexity also added some bugs in the past [2] [3] and
requires more effort of each port to handle such requirements (for
both clone and vfork implementation).
- Caching performance gain in mainly on getpid and some specific
code paths. The getpid performance leverage is questionable [4],
either by the idea of getpid being a hotspot as for the getpid
implementation itself (if it is indeed a justifiable hotspot a
vDSO symbol could let to a much more simpler solution).
Other usage is mainly for non usual code paths, such as pthread
cancellation signal and handling.
For thread creation (on stack allocation) the code simplification in fact
adds some performance gain due the no need of transverse the stack cache
and invalidate each element pid.
Other thread usages will require a direct getpid syscall, such as
cancellation/setxid signal, thread cancellation, thread fail path (at
create_thread), and thread signal (pthread_kill and pthread_sigqueue).
However these are hardly usual hotspots and I think adding a syscall is
justifiable.
It also simplifies both the clone and vfork arch-specific implementation.
And by review each fork implementation there are some discrepancies that
this patch also solves:
- microblaze clone/vfork does not set/reset the pid/tid field
- hppa uses the default vfork implementation that fallback to fork.
Since vfork is deprecated I do not think we should bother with it.
The patch also removes the TID caching in clone. My understanding for
such semantic is try provide some pthread usage after a user program
issue clone directly (as done by thread creation with CLONE_PARENT_SETTID
and pthread tid member). However, as stated before in multiple discussions
threads, GLIBC provides clone syscalls without further supporting all this
semantics.
I ran a full make check on x86_64, x32, i686, armhf, aarch64, and powerpc64le.
For sparc32, sparc64, and mips I ran the basic fork and vfork tests from
posix/ folder (on a qemu system). So it would require further testing
on alpha, hppa, ia64, m68k, nios2, s390, sh, and tile (I excluded microblaze
because it is already implementing the patch semantic regarding clone/vfork).
[1] https://codereview.chromium.org/800183004/
[2] https://sourceware.org/ml/libc-alpha/2006-07/msg00123.html
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=15368
[4] http://yarchive.net/comp/linux/getpid_caching.html
* sysdeps/nptl/fork.c (__libc_fork): Remove pid cache setting.
* nptl/allocatestack.c (allocate_stack): Likewise.
(__reclaim_stacks): Likewise.
(setxid_signal_thread): Obtain pid through syscall.
* nptl/nptl-init.c (sigcancel_handler): Likewise.
(sighandle_setxid): Likewise.
* nptl/pthread_cancel.c (pthread_cancel): Likewise.
* sysdeps/unix/sysv/linux/pthread_kill.c (__pthread_kill): Likewise.
* sysdeps/unix/sysv/linux/pthread_sigqueue.c (pthread_sigqueue):
Likewise.
* sysdeps/unix/sysv/linux/createthread.c (create_thread): Likewise.
* sysdeps/unix/sysv/linux/getpid.c: Remove file.
* nptl/descr.h (struct pthread): Change comment about pid value.
* nptl/pthread_getattr_np.c (pthread_getattr_np): Remove thread
pid assert.
* sysdeps/unix/sysv/linux/pthread-pids.h (__pthread_initialize_pids):
Do not set pid value.
* nptl_db/td_ta_thr_iter.c (iterate_thread_list): Remove thread
pid cache check.
* nptl_db/td_thr_validate.c (td_thr_validate): Likewise.
* sysdeps/aarch64/nptl/tcb-offsets.sym: Remove pid offset.
* sysdeps/alpha/nptl/tcb-offsets.sym: Likewise.
* sysdeps/arm/nptl/tcb-offsets.sym: Likewise.
* sysdeps/hppa/nptl/tcb-offsets.sym: Likewise.
* sysdeps/i386/nptl/tcb-offsets.sym: Likewise.
* sysdeps/ia64/nptl/tcb-offsets.sym: Likewise.
* sysdeps/m68k/nptl/tcb-offsets.sym: Likewise.
* sysdeps/microblaze/nptl/tcb-offsets.sym: Likewise.
* sysdeps/mips/nptl/tcb-offsets.sym: Likewise.
* sysdeps/nios2/nptl/tcb-offsets.sym: Likewise.
* sysdeps/powerpc/nptl/tcb-offsets.sym: Likewise.
* sysdeps/s390/nptl/tcb-offsets.sym: Likewise.
* sysdeps/sh/nptl/tcb-offsets.sym: Likewise.
* sysdeps/sparc/nptl/tcb-offsets.sym: Likewise.
* sysdeps/tile/nptl/tcb-offsets.sym: Likewise.
* sysdeps/x86_64/nptl/tcb-offsets.sym: Likewise.
* sysdeps/unix/sysv/linux/aarch64/clone.S: Remove pid and tid caching.
* sysdeps/unix/sysv/linux/alpha/clone.S: Likewise.
* sysdeps/unix/sysv/linux/arm/clone.S: Likewise.
* sysdeps/unix/sysv/linux/hppa/clone.S: Likewise.
* sysdeps/unix/sysv/linux/i386/clone.S: Likewise.
* sysdeps/unix/sysv/linux/ia64/clone2.S: Likewise.
* sysdeps/unix/sysv/linux/mips/clone.S: Likewise.
* sysdeps/unix/sysv/linux/nios2/clone.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/clone.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/sh/clone.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/clone.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/tile/clone.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/aarch64/vfork.S: Remove pid set and reset.
* sysdeps/unix/sysv/linux/alpha/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/arm/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/i386/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/ia64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/m68k/clone.S: Likewise.
* sysdeps/unix/sysv/linux/m68k/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/mips/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/nios2/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/sh/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/tile/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/tst-clone2.c (f): Remove direct pthread
struct access.
(clone_test): Remove function.
(do_test): Rewrite to take in consideration pid is not cached anymore.
(cherry picked from commit c579f48edba88380635ab98cb612030e3ed8691e)
|
|
This patch adds two new macros for internal and inline syscall to use
within GLIBC: INTERNAL_SYSCALL_CALL and INLINE_SYSCALL_CALL. They are
similar to the old INTERNAL_SYSCALL and INLINE_SYSCALL with the difference
the new macros accept a variable argument call and do not require to pass
the expected argument size.
The advantage is it is possible to use variable argument macros like
SYSCALL_LL{64} without the need to also handle the argument size. So
for an ABI where SYSCALL_LL might split the argument in high and low
parts, instead of:
INTERNAL_SYSCALL_DECL (err);
#if ...
INTERNAL_SYSCALL (syscall, err, 2, SYSCALL_LL (len));
#else
INTERNAL_SYSCALL (syscall, err, 1, SYSCALL_LL (len));
#endif
It will be just:
INTERNAL_SYSCALL_CALL (syscall, err, SYSCALL_LL (len));
The INLINE_SYSCALL_CALL follows the same semanthic regarding the argument
and is similar to INLINE_SYSCALL regarding setting errno.
Checked with a build for x86_64, i386, aach64, armhf, powerpc64le, powerpc32,
and mips32. No code generation changed.
* sysdeps/unix/sysdep.h (__INTERNAL_SYSCALL0): New macro.
(__INTERNAL_SYSCALL1): Likewise.
(__INTERNAL_SYSCALL2): Likewise.
(__INTERNAL_SYSCALL3): Likewise.
(__INTERNAL_SYSCALL4): Likewise.
(__INTERNAL_SYSCALL5): Likewise.
(__INTERNAL_SYSCALL6): Likewise.
(__INTERNAL_SYSCALL7): Likewise.
(__INTERNAL_SYSCALL_NARGS_X): Likewise.
(__INTERNAL_SYSCALL_NARGS): Likewise.
(__INTERNAL_SYSCALL_DISP): Likewise.
(INTERNAL_SYSCALL_CALL): Likewise.
(__SYSCALL0): Rename to __INLINE_SYSCALL0.
(__SYSCALL1): Rename to __INLINE_SYSCALL1.
(__SYSCALL2): Rename to __INLINE_SYSCALL2.
(__SYSCALL3): Rename to __INLINE_SYSCALL3.
(__SYSCALL4): Rename to __INLINE_SYSCALL4.
(__SYSCALL5): Rename to __INLINE_SYSCALL5.
(__SYSCALL6): Rename to __INLINE_SYSCALL6.
(__SYSCALL7): Rename to __INLINE_SYSCALL7.
(__SYSCALL_NARGS_X): Rename to __INLINE_SYSCALL_NARGS_X.
(__SYSCALL_NARGS): Rename to __INLINE_SYSCALL_NARGS.
(__SYSCALL_DISP): Rename to __INLINE_SYSCALL_DISP.
(__SYSCALL_CALL): Rename to INLINE_SYSCALL_CALL.
(SYSCALL_CANCEL): Replace __SYSCALL_CALL with INLINE_SYSCALL_CALL.
(cherry picked from commit e33a23fbe8c2dba04fe05678c584d3efcb6c9951)
|
|
This commit updates the support/ subdirectory to
commit 2714c5f3c95f90977167c1d21326d907fb76b419
on the master branch and modifies Makeconfig,
Rules, and extra-lib.mk accordingly.
|
|
On Skylake server, AVX512 load/store instructions in memcpy/memset may
lead to lower CPU turbo frequency in certain situations. Use of AVX2
in memcpy/memset has been observed to have improved overall performance
in many workloads due to the higher frequency.
Since AVX512ER is unique to Xeon Phi, this patch sets Prefer_No_AVX512
if AVX512ER isn't available so that AVX2 versions of memcpy/memset are
used on Skylake server.
[BZ #21396]
* sysdeps/x86/cpu-features.c (init_cpu_features): Set
Prefer_No_AVX512 if AVX512ER isn't available.
* sysdeps/x86/cpu-features.h (bit_arch_Prefer_No_AVX512): New.
(index_arch_Prefer_No_AVX512): Likewise.
* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Don't use
AVX512 version if Prefer_No_AVX512 is set.
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk):
Likewise.
* sysdeps/x86_64/multiarch/memmove.S (__libc_memmove): Likewise.
* sysdeps/x86_64/multiarch/memmove_chk.S (__memmove_chk):
Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk):
Likewise.
* sysdeps/x86_64/multiarch/memset.S (memset): Likewise.
* sysdeps/x86_64/multiarch/memset_chk.S (__memset_chk):
Likewise.
(cherry picked from commit 4cb334c4d6249686653137ec273d081371b3672d)
|
|
AVX512ER won't be implemented in any Xeon processors and will be in
all Xeon Phi processors. Don't check CPU model number when setting
Prefer_No_VZEROUPPER for Xeon Phi. Instead, set Prefer_No_VZEROUPPER
if AVX512ER is available. It works with current and future Xeon Phi
and non-Xeon Phi processors.
* sysdeps/x86/cpu-features.c (init_cpu_features): Set
Prefer_No_VZEROUPPER if AVX512ER is available.
* sysdeps/x86/cpu-features.h
(bit_cpu_AVX512PF): New.
(bit_cpu_AVX512ER): Likewise.
(bit_cpu_AVX512CD): Likewise.
(bit_cpu_AVX512BW): Likewise.
(bit_cpu_AVX512VL): Likewise.
(index_cpu_AVX512PF): Likewise.
(index_cpu_AVX512ER): Likewise.
(index_cpu_AVX512CD): Likewise.
(index_cpu_AVX512BW): Likewise.
(index_cpu_AVX512VL): Likewise.
(reg_AVX512PF): Likewise.
(reg_AVX512ER): Likewise.
(reg_AVX512CD): Likewise.
(reg_AVX512BW): Likewise.
(reg_AVX512VL): Likewise.
(cherry picked from commit 1c53cb49de6d82d9469ccbd5aa0c55924502bd8b)
|
|
As noted in bug 20126, MIPS n64 uses an incorrect implementation of
readahead intended for 32-bit systems. This patch adds a
syscalls.list entry to fix this. An updated version of the
consolidation patch
<https://sourceware.org/ml/libc-alpha/2016-09/msg00527.html> could
remove this syscalls.list entry again.
Tested with compilation (only) for mips64; the nature of the syscall
doesn't allow for a glibc test to detect this issue.
[BZ #21026]
* sysdeps/unix/sysv/linux/mips/mips64/n64/syscalls.list
(readahead): New syscall entry.
(cherry picked from commit 30733525c6867c160261db1afade6326000f9f75)
|
|
On Skylake server, _dl_runtime_resolve_avx512_opt is used to preserve
the first 8 vector registers. The code layout is
if only %xmm0 - %xmm7 registers are used
preserve %xmm0 - %xmm7 registers
if only %ymm0 - %ymm7 registers are used
preserve %ymm0 - %ymm7 registers
preserve %zmm0 - %zmm7 registers
Branch predication always executes the fallthrough code path to preserve
%zmm0 - %zmm7 registers speculatively, even though only %xmm0 - %xmm7
registers are used. This leads to lower CPU frequency on Skylake
server. This patch changes the fallthrough code path to preserve
%xmm0 - %xmm7 registers instead:
if whole %zmm0 - %zmm7 registers are used
preserve %zmm0 - %zmm7 registers
if only %ymm0 - %ymm7 registers are used
preserve %ymm0 - %ymm7 registers
preserve %xmm0 - %xmm7 registers
Tested on Skylake server.
[BZ #21258]
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve_opt):
Define only if _dl_runtime_resolve is defined to
_dl_runtime_resolve_sse_vex.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_opt):
Fallthrough to _dl_runtime_resolve_sse_vex.
(cherry picked from commit c15f8eb50cea7ad1a4ccece6e0982bf426d52c00)
|
|
When glibc is built with -fstack-check, trying to use posix_spawn can
lead to segfaults due to gcc internally probing stack memory too far.
The new spawn API will allocate a minimum of 1 page, but the stack
checking logic might probe a couple of pages. When it tries to walk
them, everything falls apart.
The gcc internal docs [1] state the default interval checking is one
page. Which means we need two pages (the current one, and the next
probed). No target currently defines it larger.
Further, it mentions that the default minimum stack size needed to
recover from an overflow is 4/8KiB for sjlj or 8/12KiB for others.
But some Linux targets (like mips and ppc) go up to 16KiB (and some
non-Linux targets go up to 24KiB).
Let's create each child with a minimum of 32KiB slack space to support
them all, and give us future breathing room.
No test is added as existing ones crash. Even a simple call is
enough to trigger the problem:
char *argv[] = { "/bin/ls", NULL };
posix_spawn(NULL, "/bin/ls", NULL, NULL, argv, NULL);
[1] https://gcc.gnu.org/onlinedocs/gcc-6.3.0/gccint/Stack-Checking.html
(cherry picked from commit 21f042c804835d1f7a4a8e06f2c93ca35a182042)
|
|
In a 32-bit environment with _FILE_OFFSET_BITS=64, the __REDIRECT macro
combined with __THROW generates an invalid C++ declaration.
(cherry picked from commit ce39613205dc47ceaeea76710d49e7a483b503ab)
|
|
The ia64-specific clone2 call expects the base of the stack mapping and
the stack size as sep arguments, not an initial stack value as on other
stack-grows-down architectures. Reuse the stack-grows-up macro so we
pass in the right stack base.
Reported-by: Matt Turner <mattst88@gentoo.org>
(cherry picked from commit ddc3fb333469c2997798742dc0509dc1e3201d91)
|
|
The binutils package was recently changed to fix -z relro support on hppa.
See ld/21000 for details:
https://sourceware.org/bugzilla/show_bug.cgi?id=21000
This exposed a problem with the _dl_start_user function in the RTLD_START
define. We need to set __libc_stack_end before it is made read only. For
this, we need to define DL_STACK_END. The offset of 0x160 gives the same
stack end as the code in _dl_start_user.
A build log with the attached patch is here:
https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=hppa&ver=2.24-9&stamp=1487639205&raw=0
(cherry picked from commit 5d20a49aaccef5ef7adac93d5ca159f6b7ba0105)
|
|
Since memset-vec-unaligned-erms.S has VDUP_TO_VEC0_AND_SET_RETURN at
function entry, memset optimized for AVX2 and AVX512 will always use
ymm/zmm register. VZEROUPPER should be placed before ret in
L(stosb):
movq %rdx, %rcx
movzbl %sil, %eax
movq %rdi, %rdx
rep stosb
movq %rdx, %rax
ret
since it can be reached from
L(stosb_more_2x_vec):
cmpq $REP_STOSB_THRESHOLD, %rdx
ja L(stosb)
[BZ #21081]
* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
(L(stosb)): Add VZEROUPPER before ret.
(cherry picked from commit 02b78ff749f0c88771713368dbb2a09b1979814f)
|
|
There is no need to use PLT nor GOT in static archives to branch to a
function, regardless whether static archives is compiled with PIC or
not. When static archives are used to create dynamic executable,
PLT/GOT may be used. The resulting executable still works correctly.
[BZ #20750]
* sysdeps/x86_64/sysdep.h (JUMPTARGET): Check SHARED instead
of PIC.
(cherry picked from commit c9070e6305c08365c5f8b604bdca39c699d70cfa)
|
|
Also rename T_UNSPEC because an upcoming public header file
update will use that name.
(cherry picked from commit fc82b0a2dfe7dbd35671c10510a8da1043d746a5)
|
|
Drop the GLIBC_TUNABLES environment variable from the environment of
setxid processes to avoid passing it on to non-setxid children. This
prevents potentially insecure tunables in the GLIBC_TUNABLES envvar
from crossing over into a child that may use a libc that has tunables
support.
* sysdeps/generic/unsecvars.h: Add GLIBC_TUNABLES.
|
|
The update of *adapt_count after the release of the lock causes a race
condition when thread A unlocks, thread B continues and destroys the
mutex, and thread A writes to *adapt_count.
(cherry picked from commit e9a96ea1aca4ebaa7c86e8b83b766f118d689d0f)
(with changes from commit eb1321f291515dae75c83a40c39e775fdd38e97a)
|
|
Both regexes end with a "*." which means the previous match can be
omitted, and then the . allows them to match any input at all.
This means tools like coreutils' `rm -i` will always delete things
when prompted because the yesexpr regex matches all inputs (even
the negative ones).
(cherry picked from commit a035eb6928bc63fb798dcc1421529f933122d74f)
|
|
There is at least one use case where during exit a library destructor
might call dlclose() on a valid handle and have it fail with an
assertion. We must allow this case, it is a valid handle, and dlclose()
should not fail with an assert. In the future we might be able to return
an error that the dlclose() could not be completed because the opened
library has already been unloaded and destructors have run as part of
exit processing.
For more details see:
https://www.sourceware.org/ml/libc-alpha/2016-12/msg00859.html
(cherry picked from commit 57707b7fcc38855869321f8c7827bfe21d729f37)
|
|
The alpha specific version of trunc and truncf always add and subtract
0x1.0p23 or 0x1.0p52 even for big values. This causes this kind of
errors in the testsuite:
Failure: Test: trunc_towardzero (0x1p107)
Result:
is: 1.6225927682921334e+32 0x1.fffffffffffffp+106
should be: 1.6225927682921336e+32 0x1.0000000000000p+107
difference: 1.8014398509481984e+16 0x1.0000000000000p+54
ulp : 0.5000
max.ulp : 0.0000
Change this by returning the input value when its absolute value is
greater than 0x1.0p23 or 0x1.0p52. NaN have to go through the add and
subtract operations to get possibly silenced.
Finally remove the code to handle inexact exception, trunc should never
generate such an exception.
Changelog:
* sysdeps/alpha/fpu/s_trunc.c (__trunc): Return the input value
when its absolute value is greater than 0x1.0p52.
[_IEEE_FP_INEXACT] Remove.
* sysdeps/alpha/fpu/s_truncf.c (__truncf): Return the input value
when its absolute value is greater than 0x1.0p23.
[_IEEE_FP_INEXACT] Remove.
(cherry picked from commit b74d259fe793499134eb743222cd8dd7c74a31ce)
|
|
The alpha version of rint wrongly return sNaN for sNaN input. Fix that
by checking for NaN and by returning the input value added with itself
in that case.
Changelog:
* sysdeps/alpha/fpu/s_rint.c (__rint): Add argument with itself
when it is a NaN.
* sysdeps/alpha/fpu/s_rintf.c (__rintf): Likewise.
(cherry picked from commit cb7f9d63b921ea1a1cbb4ab377a8484fd5da9a2b)
|
|
The alpha version of floor wrongly return sNaN for sNaN input. Fix that
by checking for NaN and by returning the input value added with itself
in that case.
Finally remove the code to handle inexact exception, floor should never
generate such an exception.
Changelog:
* sysdeps/alpha/fpu/s_floor.c (__floor): Add argument with itself
when it is a NaN.
[_IEEE_FP_INEXACT] Remove.
* sysdeps/alpha/fpu/s_floorf.c (__floorf): Likewise.
(cherry picked from commit 65cc568cf57156e5230db9a061645e54ff028a41)
|
|
The alpha version of ceil wrongly return sNaN for sNaN input. Fix that
by checking for NaN and by returning the input value added with itself
in that case.
Finally remove the code to handle inexact exception, ceil should never
generate such an exception.
Changelog:
* sysdeps/alpha/fpu/s_ceil.c (__ceil): Add argument with itself
when it is a NaN.
[_IEEE_FP_INEXACT] Remove.
* sysdeps/alpha/fpu/s_ceilf.c (__ceilf): Likewise.
(cherry picked from commit 062e53c195b4a87754632c7d51254867247698b4)
|
|
There is transition penalty when SSE instructions are mixed with 256-bit
AVX or 512-bit AVX512 load instructions. Since _dl_runtime_resolve_avx
and _dl_runtime_profile_avx512 save/restore 256-bit YMM/512-bit ZMM
registers, there is transition penalty when SSE instructions are used
with lazy binding on AVX and AVX512 processors.
To avoid SSE transition penalty, if only the lower 128 bits of the first
8 vector registers are non-zero, we can preserve %xmm0 - %xmm7 registers
with the zero upper bits.
For AVX and AVX512 processors which support XGETBV with ECX == 1, we can
use XGETBV with ECX == 1 to check if the upper 128 bits of YMM registers
or the upper 256 bits of ZMM registers are zero. We can restore only the
non-zero portion of vector registers with AVX/AVX512 load instructions
which will zero-extend upper bits of vector registers.
This patch adds _dl_runtime_resolve_sse_vex which saves and restores
XMM registers with 128-bit AVX store/load instructions. It is used to
preserve YMM/ZMM registers when only the lower 128 bits are non-zero.
_dl_runtime_resolve_avx_opt and _dl_runtime_resolve_avx512_opt are added
and used on AVX/AVX512 processors supporting XGETBV with ECX == 1 so
that we store and load only the non-zero portion of vector registers.
This avoids SSE transition penalty caused by _dl_runtime_resolve_avx and
_dl_runtime_profile_avx512 when only the lower 128 bits of vector
registers are used.
_dl_runtime_resolve_avx_slow is added and used for AVX processors which
don't support XGETBV with ECX == 1. Since there is no SSE transition
penalty on AVX512 processors which don't support XGETBV with ECX == 1,
_dl_runtime_resolve_avx512_slow isn't provided.
[BZ #20495]
[BZ #20508]
* sysdeps/x86/cpu-features.c (init_cpu_features): For Intel
processors, set Use_dl_runtime_resolve_slow and set
Use_dl_runtime_resolve_opt if XGETBV suports ECX == 1.
* sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt):
New.
(bit_arch_Use_dl_runtime_resolve_slow): Likewise.
(index_arch_Use_dl_runtime_resolve_opt): Likewise.
(index_arch_Use_dl_runtime_resolve_slow): Likewise.
* sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup): Use
_dl_runtime_resolve_avx512_opt and _dl_runtime_resolve_avx_opt
if Use_dl_runtime_resolve_opt is set. Use
_dl_runtime_resolve_slow if Use_dl_runtime_resolve_slow is set.
* sysdeps/x86_64/dl-trampoline.S: Include <cpu-features.h>.
(_dl_runtime_resolve_opt): New. Defined for AVX and AVX512.
(_dl_runtime_resolve): Add one for _dl_runtime_resolve_sse_vex.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx_slow):
New.
(_dl_runtime_resolve_opt): Likewise.
(_dl_runtime_profile): Define only if _dl_runtime_profile is
defined.
(cherry picked from commit fb0f7a6755c1bfaec38f490fbfcaa39a66ee3604)
|
|
When glibc is compiled with gcc 6.2 that has been configured with
to default to PIC/PIE, the static version of __memcpy_chk is not built,
as the test is done on PIC instead of SHARED. Fix the test to check for
SHARED, like it is done for similar functions like memmove_chk.
Changelog:
* sysdeps/x86_64/memcpy_chk.S (__memcpy_chk): Check for SHARED
instead of PIC.
(cherry picked from commit 380ec16d62f459d5a28cfc25b7b20990c45e1cc9)
|
|
Avoid a build error with microMIPS compilation and recent versions of
GAS which complain if a branch targets a label which is marked as data
rather than microMIPS code:
../sysdeps/mips/mips32/crti.S: Assembler messages:
../sysdeps/mips/mips32/crti.S:72: Error: branch to a symbol in another ISA mode
make[2]: *** [.../csu/crti.o] Error 1
as commit 9d862524f6ae ("MIPS: Verify the ISA mode and alignment of
branch and jump targets") closed a hole in branch processing, making
relocation calculation respect the ISA mode of the symbol referred.
This allowed diagnosing the situation where an attempt is made to pass
control from code assembled for one ISA mode to code assembled for a
different ISA mode and either relaxing the branch to a cross-mode jump
or if that is not possible, then reporting this as an error rather than
letting such code build and then fail unpredictably at the run time.
This however requires the correct annotation of branch targets as code,
because the ISA mode is not relevant for data symbols and is therefore
not recorded for them. The `.insn' pseudo-op is used for this purpose
and has been supported by GAS since:
Wed Feb 12 14:36:29 1997 Ian Lance Taylor <ian@cygnus.com>
* config/tc-mips.c (mips_pseudo_table): Add "insn".
(s_insn): New static function.
* doc/c-mips.texi: Document .insn.
so there has been no reason to avoid it where required. More recently
this pseudo-op has been documented, by the microMIPS architecture
specification[1][2], as required for the correct interpretation of any
code label which is not followed by an actual instruction in an assembly
source.
Use it in our crti.S files then, to mark that the trailing label there
with no instructions following is indeed not a code bug and the branch
is legitimate.
References:
[1] "MIPS Architecture for Programmers, Volume II-B: The microMIPS32
Instruction Set", MIPS Technologies, Inc., Document Number: MD00582,
Revision 5.04, January 15, 2014, Section 7.1 "Assembly-Level
Compatibility", p. 533
[2] "MIPS Architecture for Programmers, Volume II-B: The microMIPS64
Instruction Set", MIPS Technologies, Inc., Document Number: MD00594,
Revision 5.04, January 15, 2014, Section 8.1 "Assembly-Level
Compatibility", p. 623
2016-11-23 Matthew Fortune <Matthew.Fortune@imgtec.com>
Maciej W. Rozycki <macro@imgtec.com>
* sysdeps/mips/mips32/crti.S (_init): Add `.insn' pseudo-op at
`.Lno_weak_fn' label.
* sysdeps/mips/mips64/n32/crti.S (_init): Likewise.
* sysdeps/mips/mips64/n64/crti.S (_init): Likewise.
(cherry picked from commit cfaf1949ff1f8336b54c43796d0e2531bc8a40a2)
|
|
This patch fixes an invalid write out or stack allocated buffer in
2 places at execvpe implementation:
1. On 'maybe_script_execute' function where it allocates the new
argument list and it does not account that a minimum of argc
plus 3 elements (default shell path, script name, arguments,
and ending null pointer) should be considered. The straightforward
fix is just to take account of the correct list size on argument
copy.
2. On '__execvpe' where the executable file name lenght may not
account for ending '\0' and thus subsequent path creation may
write past array bounds because it requires to add the terminating
null. The fix is to change how to calculate the executable name
size to add the final '\0' and adjust the rest of the code
accordingly.
As described in GCC bug report 78433 [1], these issues were masked off by
GCC because it allocated several bytes more than necessary so that many
off-by-one bugs went unnoticed.
Checked on x86_64 with a latest GCC (7.0.0 20161121) with -O3 on CFLAGS.
[BZ #20847]
* posix/execvpe.c (maybe_script_execute): Remove write past allocated
array bounds.
(__execvpe): Likewise.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78433
|
|
When glibc is compiled with gcc 6.2 that has been configured with
--enable-default-pie and --enable-default-ssp, the configure script
fails to detect that the compiler has ssp turned on by default when
being built for i686-linux-gnu.
This is because gcc is emitting __stack_chk_fail_local but the
script is only looking for __stack_chk_fail. Support both.
Example output:
checking whether x86_64-pc-linux-gnu-gcc -m32 -Wl,-O1 -Wl,--as-needed
implicitly enables -fstack-protector... no
(cherry picked from commit c7409aded44634411a19b0b7178b7faa237835e6)
|
|
Having found that with my script to build many glibc variants I could
reproduce the linknamespace test failures in parallel builds (that
various people had previously reported but I hadn't seen myself), I
investigated those failures further. This patch adds a missing
dependency to those tests.
Tested for x86_64, including the configuration where I saw those
failures and where I don't see them with this patch.
* conform/Makefile ($(linknamespace-header-tests)): Also depend on
$(linknamespace-symlists-tests).
(cherry picked from commit 6c50bb532bb1f47084762e16fbf52af9b5a752d8)
|