Age | Commit message (Collapse) | Author | Files | Lines |
|
It should say "string", not "blob".
|
|
|
|
Clarify the byte order of the the struct fields and document
sin6_flowinfo and sin6_scope_id.
|
|
Clock_gettime, settime and getres implementations are unncessarily
complex due to using defines and C file inclusion. Simplify the
code by replacing the redundant defines and removing the inclusion,
making it much easier to understand. No functional changes.
* sysdeps/posix/clock_getres.c (__clock_getres): Cleanup.
* sysdeps/unix/clock_gettime.c (__clock_gettime): Cleanup.
* sysdeps/unix/clock_settime.c (__clock_settime): Cleanup.
* sysdeps/unix/sysv/linux/clock_getres.c (__clock_getres): Cleanup.
* sysdeps/unix/sysv/linux/clock_gettime.c (__clock_gettime): Cleanup.
* sysdeps/unix/sysv/linux/clock_settime.c (__clock_settime): Cleanup.
|
|
This version uses general register based memory instruction to load
data, because vector register based is slightly slower in emag.
Character-matching is performed on 16-byte (both size and alignment)
memory block in parallel each iteration.
* sysdeps/aarch64/memchr.S (__memchr): Rename to MEMCHR.
[!MEMCHR](MEMCHR): Set to __memchr.
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines):
Add memchr_generic and memchr_nosimd.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add memchr ifuncs.
* sysdeps/aarch64/multiarch/memchr.c: New file.
* sysdeps/aarch64/multiarch/memchr_generic.S: Likewise.
* sysdeps/aarch64/multiarch/memchr_nosimd.S: Likewise.
|
|
This version uses general register based memory store instead of
vector register based, for the former is faster than the latter
in emag.
The fact that DC ZVA size in emag is 64-byte, is used by IFUNC
dispatch to select this memset, so that cost of runtime-check on
DC ZVA size can be saved.
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines):
Add memset_emag.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add __memset_emag to memset ifunc.
* sysdeps/aarch64/multiarch/memset.c (libc_ifunc):
Add IS_EMAG check for ifunc dispatch.
* sysdeps/aarch64/multiarch/memset_base64.S: New file.
* sysdeps/aarch64/multiarch/memset_emag.S: New file.
|
|
Emag is a 64-bit CPU core released by AmpereComputing.
Add its name to cpu list, and corresponding macro as utilities for
later IFUNC dispatch.
* manual/tunables.texi (Tunable glibc.cpu.name): Add emag.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list):
Add emag.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_EMAG):
New macro.
|
|
From time to time I get fails in tst-spawn like:
tst-spawn.c:111: numeric comparison failure
left: 0 (0x0); from: xlseek (fd2, 0, SEEK_CUR)
right: 28 (0x1c); from: strlen (fd2string)
error: 1 test failures
tst-spawn.c:252: numeric comparison failure
left: 1 (0x1); from: WEXITSTATUS (status)
right: 0 (0x0); from: 0
error: 1 test failures
It turned out, that a child process is testing it's open file descriptors
with e.g. a sequence of testing the current position, setting the position
to zero and reading a specific amount of bytes.
Unfortunately starting with commit 2a69f853c03034c2e383e0f9c35b5402ce8b5473
the test is spawning a second child process which is sharing some of the
file descriptors. If the test sequence as mentioned above is running in parallel
it leads to test failures.
As the second call of posix_spawn shall test a NULL pid argument,
this patch is just moving the waitpid of the first child
before the posix_spawn of the second child.
ChangeLog:
* posix/tst-spawn do_test(): Move waitpid before posix_spawn.
|
|
|
|
For a full analysis of both the pthread_rwlock_tryrdlock() stall
and the pthread_rwlock_trywrlock() stall see:
https://sourceware.org/bugzilla/show_bug.cgi?id=23844#c14
In the pthread_rwlock_trydlock() function we fail to inspect for
PTHREAD_RWLOCK_FUTEX_USED in __wrphase_futex and wake the waiting
readers.
In the pthread_rwlock_trywrlock() function we write 1 to
__wrphase_futex and loose the setting of the PTHREAD_RWLOCK_FUTEX_USED
bit, again failing to wake waiting readers during unlock.
The fix in the case of pthread_rwlock_trydlock() is to check for
PTHREAD_RWLOCK_FUTEX_USED and wake the readers.
The fix in the case of pthread_rwlock_trywrlock() is to only write
1 to __wrphase_futex if we installed the write phase, since all other
readers would be spinning waiting for this step.
We add two new tests, one exercises the stall for
pthread_rwlock_trywrlock() which is easy to exercise, and one exercises
the stall for pthread_rwlock_trydlock() which is harder to exercise.
The pthread_rwlock_trywrlock() test fails consistently without the fix,
and passes after. The pthread_rwlock_tryrdlock() test fails roughly
5-10% of the time without the fix, and passes all the time after.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Torvald Riegel <triegel@redhat.com>
Signed-off-by: Rik Prohaska <prohaska7@gmail.com>
Co-authored-by: Torvald Riegel <triegel@redhat.com>
Co-authored-by: Rik Prohaska <prohaska7@gmail.com>
|
|
* scripts/build-many-glibcs.py (Context.checkout): Default MPFR
version to 4.0.2.
|
|
GLIBC explicitly allows one to assign a new FILE pointer to stdout and
other standard streams. printf and wprintf were honouring assignment to
stdout and using the new value, but puts, putchar, and wide char variants
did not.
The stdout part is fixed here. The stdin part will be fixed in a followup.
|
|
Problem found by AddressSanitizer, reported by Hongxu Chen in:
https://debbugs.gnu.org/34140
* posix/regexec.c (proceed_next_node):
Do not read past end of input buffer.
|
|
If /etc/aliases ends with a continuation line (a line that starts
with whitespace) which does not have a trailing newline character,
the file parser would crash due to a null pointer dereference.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
|
|
* version.h (RELEASE): Set to "stable".
(VERSION): Set to "2.29".
* include/features.h (__GLIBC_MINOR__): Set to 2.29.
|
|
* NEWS: Add the list of bugs fixed in 2.29.
* manual/contrib.texi: Update contributors list with some more
names.
* manual/install.texi: Update latest versions of packages
tested.
* INSTALL: Regenerated.
|
|
Update translations from translationproject.org for 2.28.9000.
|
|
It triggers an invalid build issue on GCC8+ and does not covers all
corner cases.
|
|
Previous commit fixed
[BZ #24110]
|
|
This patch fixes 36 math related test failures.
|
|
There was missing restore of $f3 before the return from the function
via the $y_is_neg path. This caused the math/big testcase from Go-1.11
testsuite (that includes lots of corner cases that exercise remqu) FAIL.
[BZ #24130]
* sysdeps/alpha/remqu.S (__remqu): Add missing restore
of $f3 register on $y_is_neg path.
|
|
* hurd/hurdsig.c (_hurd_thread_sigstate): Set SS_DISABLE in
sigaltstack.ss_flags.
|
|
The full representation of the alternative calendar year (%EY)
typically includes an internal use of "%Ey". As a GNU extension,
apply any flags on "%EY" (e.g. "%_EY", "%-EY") to the internal "%Ey",
allowing users of "%EY" to control how the year is padded.
Reviewed-by: Rafal Luzynski <digitalfreak@lingonborough.com>
Reviewed-by: Zack Weinberg <zackw@panix.com>
ChangeLog:
[BZ #24096]
* manual/time.texi (strftime): Document "%EC" and "%EY".
* time/Makefile (tests): Add tst-strftime2.
(LOCALES): Add ja_JP.UTF-8, lo_LA.UTF-8, and th_TH.UTF-8.
* time/strftime_l.c (__strftime_internal): Add argument yr_spec to
override padding for "%Ey".
If an optional flag ('_' or '-') is specified to "%EY", interpret the
"%Ey" in the subformat as if decorated with that flag.
* time/tst-strftime2.c: New file.
|
|
In Japanese locales, strftime's alternative year format (%Ey) produces
a year numbered within a time period called an _era_. A new era
typically begins when a new emperor is enthroned. The result of "%Ey"
is therefore usually a one- or two-digit number.
Many programs that display Japanese era dates assume that the era year
is two digits wide. To improve how these programs display dates
during the first nine years of a new era, change "%Ey" to pad one-
digit numbers on the left with a zero. This change applies to all
locales. It is expected to be harmless for other locales that use the
alternative year format (e.g. lo_LA and th_TH, in which "%Ey" produces
the year of the Buddhist calendar) as those calendars' year numbers
are already more than two digits wide, and this is not expected to
change.
This change needs to be in place before 2019-05-01 CE, as a new era is
scheduled to begin on that date.
Reviewed-by: Zack Weinberg <zackw@panix.com>
Reviewed-by: Rafal Luzynski <digitalfreak@lingonborough.com>
ChangeLog:
[BZ #23758]
* manual/time.texi (strftime): Document "%Ey".
* time/strftime_l.c (__strftime_internal): Set the default width
padding with zero of "%Ey" to 2.
|
|
Hurd does not support MAP_NORESERVE and MAP_STACK.
Checked on i686-gnu build.
* support/xsigstack.c (MAP_NORESERVE, MAP_STACK): Define if they
are not defined.
|
|
The error handling patch for invalid audit modules version access
invalid memory:
elf/rtld.c:
1454 unsigned int (*laversion) (unsigned int);
1455 unsigned int lav;
1456 if (err_str == NULL
1457 && (laversion = largs.result) != NULL
1458 && (lav = laversion (LAV_CURRENT)) > 0
1459 && lav <= LAV_CURRENT)
1460 {
[...]
1526 else
1527 {
1528 /* We cannot use the DSO, it does not have the
1529 appropriate interfaces or it expects something
1530 more recent. */
1531 #ifndef NDEBUG
1532 Lmid_t ns = dlmargs.map->l_ns;
1533 #endif
1534 _dl_close (dlmargs.map);
1535
1536 /* Make sure the namespace has been cleared entirely. */
1537 assert (GL(dl_ns)[ns]._ns_loaded == NULL);
1538 assert (GL(dl_ns)[ns]._ns_nloaded == 0);
1539
1540 GL(dl_tls_max_dtv_idx) = tls_idx;
1541 goto not_loaded;
1542 }
1431 const char *err_str = NULL;
1432 bool malloced;
1433 (void) _dl_catch_error (&objname, &err_str, &malloced, dlmopen_doit,
1434 &dlmargs);
1435 if (__glibc_unlikely (err_str != NULL))
1436 {
1437 not_loaded:
1438 _dl_error_printf ("\
1439 ERROR: ld.so: object '%s' cannot be loaded as audit interface: %s; ignored.\n",
1440 name, err_str);
1441 if (malloced)
1442 free ((char *) err_str);
1443 }
On failure the err_str will be NULL and _dl_debug_vdprintf does not handle
it properly:
elf/dl-misc.c:
200 case 's':
201 /* Get the string argument. */
202 iov[niov].iov_base = va_arg (arg, char *);
203 iov[niov].iov_len = strlen (iov[niov].iov_base);
204 if (prec != -1)
205 iov[niov].iov_len = MIN ((size_t) prec, iov[niov].iov_len);
206 ++niov;
207 break;
This patch fixes the issues and improves the error message.
Checked on x86_64-linux-gnu and i686-linux-gnu
[BZ #24122]
* elf/Makefile (tests): Add tst-audit13.
(modules-names): Add tst-audit13mod1.
(tst-audit13.out, LDFLAGS-tst-audit13mod1.so, tst-audit13-ENV): New
rule.
* elf/rtld.c (dl_main): Handle invalid audit module version.
* elf/tst-audit13.c: New file.
* elf/tst-audit13mod1.c: Likewise.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
* hurd/lookup-at.c (__file_name_lookup_at): When at_flags contains
AT_EMPTY_PATH, call __dir_lookup and __hurd_file_name_lookup_retry
directly instead of __hurd_file_name_lookup.
|
|
* sysdeps/mach/hurd/faccessat.c (__faccessat_common): Check for errors
returned by __hurd_at_flags.
|
|
* scripts/build-many-glibcs.py (Context.checkout): Default
binutils version to 2.32 branch.
|
|
The IPv4 address parser in the getaddrinfo function is changed so that
it does not ignore trailing whitespace and all characters after it.
For backwards compatibility, the getaddrinfo function still recognizes
legacy name syntax, such as 192.000.002.010 interpreted as 192.0.2.8
(octal).
This commit does not change the behavior of inet_addr and inet_aton.
gethostbyname already had additional sanity checks (but is switched
over to the new __inet_aton_exact function for completeness as well).
To avoid sending the problematic query names over DNS, commit
6ca53a2453598804a2559a548a08424fca96434a ("resolv: Do not send queries
for non-host-names in nss_dns [BZ #24112]") is needed.
|
|
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes strnlen/wcsnlen for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strlen-avx2.S: Use RSI_LP for length.
Clear the upper 32 bits of RSI register.
* sysdeps/x86_64/strlen.S: Use RSI_LP for length.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strnlen
and tst-size_t-wcsnlen.
* sysdeps/x86_64/x32/tst-size_t-strnlen.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wcsnlen.c: Likewise.
|
|
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes strncpy for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strcpy-avx2.S: Use RDX_LP for length.
* sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S: Likewise.
* sysdeps/x86_64/multiarch/strcpy-ssse3.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncpy.
* sysdeps/x86_64/x32/tst-size_t-strncpy.c: New file.
|
|
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes the strncmp family for x32. Tested on x86-64 and x32.
On x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strcmp-avx2.S: Use RDX_LP for length.
* sysdeps/x86_64/multiarch/strcmp-sse42.S: Likewise.
* sysdeps/x86_64/strcmp.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncasecmp,
tst-size_t-strncmp and tst-size_t-wcsncmp.
* sysdeps/x86_64/x32/tst-size_t-strncasecmp.c: New file.
* sysdeps/x86_64/x32/tst-size_t-strncmp.c: Likewise.
* sysdeps/x86_64/x32/tst-size_t-wcsncmp.c: Likewise.
|
|
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memset/wmemset for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Use
RDX_LP for length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-wmemset.
* sysdeps/x86_64/x32/tst-size_t-memset.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wmemset.c: Likewise.
|
|
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memrchr for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/memrchr.S: Use RDX_LP for length.
* sysdeps/x86_64/multiarch/memrchr-avx2.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memrchr.
* sysdeps/x86_64/x32/tst-size_t-memrchr.c: New file.
|
|
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memcpy for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memcpy-ssse3-back.S: Use RDX_LP for
length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memcpy-ssse3.S: Likewise.
* sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S:
Likewise.
* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcpy.
tst-size_t-wmemchr.
* sysdeps/x86_64/x32/tst-size_t-memcpy.c: New file.
|
|
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memcmp/wmemcmp for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S: Use RDX_LP for
length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memcmp-sse4.S: Likewise.
* sysdeps/x86_64/multiarch/memcmp-ssse3.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp and
tst-size_t-wmemcmp.
* sysdeps/x86_64/x32/tst-size_t-memcmp.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wmemcmp.c: Likewise.
|
|
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memchr/wmemchr for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ# 24097]
CVE-2019-6488
* sysdeps/x86_64/memchr.S: Use RDX_LP for length. Clear the
upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memchr-avx2.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memchr and
tst-size_t-wmemchr.
* sysdeps/x86_64/x32/test-size_t.h: New file.
* sysdeps/x86_64/x32/tst-size_t-memchr.c: Likewise.
* sysdeps/x86_64/x32/tst-size_t-wmemchr.c: Likewise.
|
|
Before this commit, nss_dns would send a query which did not contain a
host name as the query name (such as invalid\032name.example.com) and
then reject the answer in getanswer_r and gaih_getanswer_slice, using
a check based on res_hnok. With this commit, no query is sent, and a
host-not-found error is returned to NSS without network interaction.
|
|
|
|
Commit 6923f6db1e688dedcf3a6556da76e0bf24a41872 ("malloc: Use current
(C11-style) atomics for fastbin access") caused a substantial
performance regression on POWER and Aarch64, and the old atomics,
while hard to prove correct, seem to work in practice.
|
|
Since MINSIGSTKSZ may not have sufficent stack space to allow lazy
binding, build tests for minimal signal handler with -Wl,-z,now to
disable lazy binding.
* signal/Makefile (LDFLAGS-tst-minsigstksz-1): New. Set to
-Wl,-z,now.
(LDFLAGS-tst-minsigstksz-2): Likewise.
(LDFLAGS-tst-minsigstksz-3): Likewise.
(LDFLAGS-tst-minsigstksz-3a): Likewise.
(LDFLAGS-tst-minsigstksz-4): Likewise.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
ChangeLog:
* manual/time.texi (strftime): Fix the wording to "alternative" rather
than "alternate".
|
|
A single underscore was omitted in
sysdeps/powerpc/powerpc64/multiarch/strncmp.c, resulting in use of
power8 version of strncmp instead of power9 version, with significant
performance degradation.
* sysdeps/powerpc/powerpc64/multiarch/strncmp.c: Fix #ifdef.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
|
|
There is general agreement that the very short list of things that ISO
C says you can do in an async signal handler should all work when the
handler is running on an alternate signal stack with only MINSIGSTKSZ
space. This patch adds tests to make sure those things do work.
To facilitate this, there is a new set of test support routines for
setting up alternate signal stacks; see support/xsignal.h for the API.
* support/xsignal.h (xalloc_sigstack, xfree_sigstack)
(xget_sigstack_location): New test support functions.
* support/xsigstack.c: New file, implementing them.
* support/tst-xsigstack.c: New test for them.
* support/Makefile: Update.
* signal/tst-minsigstksz-1.c
* signal/tst-minsigstksz-2.c
* signal/tst-minsigstksz-3.c
* signal/tst-minsigstksz-3a.c
* signal/tst-minsigstksz-4.c: New tests.
* signal/Makefile: Run them.
|
|
|
|
Ignore 112 errors in math/test-ldouble-fma and math/test-ildouble-fma
when IBM 128-bit long double used.
These errors are caused by spurious overflows from libgcc.
* math/libm-test-fma.inc (fma_test_data): Set
XFAIL_ROUNDING_IBM128_LIBGCC to more tests.
Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
|
|
An error "impossible register constraint in 'asm'" was raised on POWER
5 and due to __vector __int128_t being used as operands without passing the
option -msvx to gcc.
This patch replaces "__vector __int128_t" with "__vector unsigned int"
which requires only -maltivec, available since POWER ISA 2.03, and which
is already passed to the compiler.
* sysdeps/powerpc/powerpc64/tst-ucontext-ppc64-vscr.c:
(do_test): Changed __vector __int128_t to __vector unsigned int.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
|
|
Optimize x86-64 strcat/strncat, strcpy/strncpy and stpcpy/stpncpy with AVX2.
It uses vector comparison as much as possible. In general, the larger the
source string, the greater performance gain observed, reaching speedups of
1.6x compared to SSE2 unaligned routines. Select AVX2 strcat/strncat,
strcpy/strncpy and stpcpy/stpncpy on AVX2 machines where vzeroupper is
preferred and AVX unaligned load is fast.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
strcat-avx2, strncat-avx2, strcpy-avx2, strncpy-avx2,
stpcpy-avx2 and stpncpy-avx2.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c:
(__libc_ifunc_impl_list): Add tests for __strcat_avx2,
__strncat_avx2, __strcpy_avx2, __strncpy_avx2, __stpcpy_avx2
and __stpncpy_avx2.
* sysdeps/x86_64/multiarch/{ifunc-unaligned-ssse3.h =>
ifunc-strcpy.h}: rename header for a more generic name.
* sysdeps/x86_64/multiarch/ifunc-strcpy.h:
(IFUNC_SELECTOR): Return OPTIMIZE (avx2) on AVX 2 machines if
AVX unaligned load is fast and vzeroupper is preferred.
* sysdeps/x86_64/multiarch/stpcpy-avx2.S: New file
* sysdeps/x86_64/multiarch/stpncpy-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strcat-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strcpy-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strncat-avx2.S: Likewise
* sysdeps/x86_64/multiarch/strncpy-avx2.S: Likewise
|