|
When cleaning up conformtest expectations for POSIX for locale.h in
<https://sourceware.org/ml/libc-alpha/2012-11/msg00382.html>, I missed
that locale.h had contents defined in POSIX.2:1993 as well as
POSIX.1:1995/6. Thus, LC_MESSAGES *should* in fact be required for
POSIX, because POSIX.2 says so; this patch adds that expectation
back. Tested for x86_64.
* conform/data/locale.h-data [POSIX] (LC_MESSAGES): Require.
|
|
Concluding the series of patches to clean up conformtest expectations
for "POSIX" (POSIX.1:1995/6, union with POSIX.2:1993), this patch
cleans up expectations for unistd.h. Tested x86_64; the new XFAIL is
for missing _POSIX2_C_VERSION.
* conform/data/unistd.h-data (_POSIX_VERSION): Require.
(_POSIX2_C_VERSION): Require if [POSIX || XPG3 || XPG4 || UNIX98].
Do not mention otherwise.
[POSIX] (_XOPEN_VERSION): Do not expect.
[POSIX] (_XOPEN_XCU_VERSION): Likewise.
[POSIX] (_POSIX2_C_BIND): Likewise.
[POSIX] (_POSIX2_VERSION): Likewise.
[POSIX] (_XOPEN_XPG2): Likewise.
[POSIX] (_XOPEN_XPG3): Likewise.
[POSIX] (_XOPEN_XPG4): Likewise.
[POSIX] (_XOPEN_UNIX): Likewise.
[POSIX] (_POSIX_ADVISORY_INFO): Likewise.
[POSIX] (_POSIX_BARRIERS): Likewise.
[POSIX] (_POSIX_CLOCK_SELECTION): Likewise.
[POSIX] (_POSIX_CPUTIME): Likewise.
[POSIX] (_POSIX_MONOTONIC_CLOCK): Likewise.
[POSIX] (_POSIX_READER_WRITER_LOCKS): Likewise.
[POSIX] (_POSIX_SHELL): Likewise.
[POSIX] (_POSIX_SPAWN): Likewise.
[POSIX] (_POSIX_SPIN_LOCKS): Likewise.
[POSIX] (_POSIX_SPORADIC_SERVER): Likewise.
[POSIX] (_POSIX_THREAD_CPUTIME): Likewise.
[POSIX] (_POSIX_TYPED_MEMORY_OBJECTS): Likewise.
[POSIX] (_POSIX_THREAD_SPORADIC_SERVER): Likewise.
[POSIX] (_XBS5_ILP32_OFF32): Likewise.
[POSIX] (_XBS5_ILP32_OFBIG): Likewise.
[POSIX] (_XBS5_LP64_OFF64): Likewise.
[POSIX] (_XBS5_LPBIG_OFFBIG): Likewise.
[POSIX] (_POSIX_TIMEOUTS): Likewise.
[POSIX] (_POSIX2_PBS): Likewise.
[POSIX] (_POSIX2_PBS_ACCOUNTING): Likewise.
[POSIX] (_POSIX2_PBS_CHECKPOINT): Likewise.
[POSIX] (_POSIX2_PBS_LOCATE): Likewise.
[POSIX] (_POSIX2_PBS_MESSAGE): Likewise.
[POSIX] (_POSIX2_PBS_TRACK): Likewise.
[POSIX] (_POSIX_TIMESTAMP_RESOLUTION): Likewise.
[POSIX] (_CS_XBS5_ILP32_OFF32_CFLAGS): Likewise.
[POSIX] (_CS_XBS5_ILP32_OFF32_LDFLAGS): Likewise.
[POSIX] (_CS_XBS5_ILP32_OFF32_LIBS): Likewise.
[POSIX] (_CS_XBS5_ILP32_OFF32_LINTFLAGS): Likewise.
[POSIX] (_CS_XBS5_ILP32_OFFBIG_CFLAGS): Likewise.
[POSIX] (_CS_XBS5_ILP32_OFFBIG_LDFLAGS): Likewise.
[POSIX] (_CS_XBS5_ILP32_OFFBIG_LIBS): Likewise.
[POSIX] (_CS_XBS5_ILP32_OFFBIG_LINTFLAGS): Likewise.
[POSIX] (_CS_XBS5_LP64_OFF64_CFLAGS): Likewise.
[POSIX] (_CS_XBS5_LP64_OFF64_LDFLAGS): Likewise.
[POSIX] (_CS_XBS5_LP64_OFF64_LIBS): Likewise.
[POSIX] (_CS_XBS5_LP64_OFF64_LINTFLAGS): Likewise.
[POSIX] (_CS_XBS5_LPBIG_OFFBIG_CFLAGS): Likewise.
[POSIX] (_CS_XBS5_LPBIG_OFFBIG_LDFLAGS): Likewise.
[POSIX] (_CS_XBS5_LPBIG_OFFBIG_LIBS): Likewise.
[POSIX] (_CS_XBS5_LPBIG_OFFBIG_LINTFLAGS): Likewise.
[POSIX] (_SC_2_C_BIND): Likewise.
[POSIX] (_SC_2_C_VERSION): Likewise.
[POSIX] (_SC_2_PBS): Likewise.
[POSIX] (_SC_2_PBS_ACCOUNTING): Likewise.
[POSIX] (_SC_2_PBS_CHECKPOINT): Likewise.
[POSIX] (_SC_2_PBS_LOCATE): Likewise.
[POSIX] (_SC_2_PBS_MESSAGE): Likewise.
[POSIX] (_SC_2_PBS_TRACK): Likewise.
[POSIX] (_SC_ATEXIT_MAX): Likewise.
[POSIX] (_SC_BARRIERS): Likewise.
[POSIX] (_SC_BASE): Likewise.
[POSIX] (_SC_CLOCK_SELECTION): Likewise.
[POSIX] (_SC_DEVICE_IO): Likewise.
[POSIX] (_SC_DEVICE_SPECIFIC): Likewise.
[POSIX] (_SC_DEVICE_SPECIFIC_R): Likewise.
[POSIX] (_SC_FD_MGMT): Likewise.
[POSIX] (_SC_FIFO): Likewise.
[POSIX] (_SC_FILE_ATTRIBUTES): Likewise.
[POSIX] (_SC_FILE_LOCKING): Likewise.
[POSIX] (_SC_FILE_SYSTEM): Likewise.
[POSIX] (_SC_IOV_MAX): Likewise.
[POSIX] (_SC_MONOTONIC_CLOCK): Likewise.
[POSIX] (_SC_NETWORKING): Likewise.
[POSIX] (_SC_PAGE_SIZE): Likewise.
[POSIX] (_SC_PASS_MAX): Likewise.
[POSIX] (_SC_PIPE): Likewise.
[POSIX] (_SC_READER_WRITER_LOCKS): Likewise.
[POSIX] (_SC_REGEXP): Likewise.
[POSIX] (_SC_SHELL): Likewise.
[POSIX] (_SC_SIGNALS): Likewise.
[POSIX] (_SC_SINGLE_PROCESS): Likewise.
[POSIX] (_SC_SPIN_LOCKS): Likewise.
[POSIX] (_SC_TYPED_MEMORY_OBJECTS): Likewise.
[POSIX] (_SC_USER_GROUPS): Likewise.
[POSIX] (_SC_USER_GROUPS_R): Likewise.
[POSIX] (_SC_STREAMS): Likewise.
[POSIX] (_SC_XBS5_ILP32_OFF32): Likewise.
[POSIX] (_SC_XBS5_ILP32_OFFBIG): Likewise.
[POSIX] (_SC_XBS5_LP64_OFF64): Likewise.
[POSIX] (_SC_XBS5_LPBIG_OFFBIG): Likewise.
[POSIX] (_SC_THREAD_ROBUST_PRIO_INHERIT): Likewise.
[POSIX] (_SC_THREAD_ROBUST_PRIO_PROTECT): Likewise.
[POSIX] (_PC_FILESIZEBITS): Likewise.
[POSIX] (_PC_REC_INCR_XFER_SIZE): Likewise.
[POSIX] (_PC_REC_MAX_XFER_SIZE): Likewise.
[POSIX] (_PC_REC_MIN_XFER_SIZE): Likewise.
[POSIX] (_PC_REC_XFER_ALIGN): Likewise.
[POSIX] (uid_t): Likewise.
[POSIX] (gid_t): Likewise.
[POSIX] (off_t): Likewise.
[POSIX] (pid_t): Likewise.
[POSIX] (cuserid): Allow.
(_SC_2_CHAR_TERM): Require constant.
(_POSIX_ASYNCHRONOUS_IO): Remove duplicate optional-constant.
* conform/Makefile (test-xfail-POSIX/unistd.h/conform): New
variable.
|
|
|
|
|
|
This patch removes the specialized i386 assembly implementations for
fallocate{64}, pselect, and sync_file_range now that i386 has
support for six-argument syscalls.
|
|
|
|
|
|
ldbl-96 remquol wrongly handles the case where the first argument is
finite and the second infinite, because the check for the second
argument being a NaN fails to disregard the explicit high mantissa bit
and so wrongly interprets an infinity as being a NaN. This patch
fixes this by masking off that bit, and improves test coverage for
both remainder and remquo (various cases were missing tests, or, as in
the case of the bug, were tested only for one of the two functions).
Tested for x86_64 and x86.
[BZ #18244]
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Ignore explicit
high mantissa bit when testing whether P is a NaN.
* math/libm-test.inc (remainder_test_data): Add more tests.
(remquo_test_data): Likewise.
|
|
The i386 implementation of atanhl, for small arguments, does a
calculation that involves computing twice the square of the argument,
resulting in spurious underflows for some arguments. This patch fixes
this by just returning the argument when its exponent is below -32,
with underflow being forced as needed for subnormal arguments.
Tested for x86 and x86_64.
[BZ #18049]
* sysdeps/i386/fpu/e_atanhl.S (__ieee754_atanhl): For exponents
below -32, return the argument, with underflow if subnormal.
* math/auto-libm-test-in: Add more tests of atanh.
* math/auto-libm-test-out: Regenerated.
|
|
|
|
|
|
in order to avoid strict alias warnings.
(iruserok_af): Ditto for ra.
|
|
[BZ #17581] The checking chain of unused chunks was terminated by a hash of
the block pointer, which was sometimes confused with the chunk length byte.
We now avoid using a length byte equal to the magic byte.
|
|
* soft-fp/op-common.h (_FP_FROM_INT): Don't write to R.
|
|
|
|
When the malloc subsystem detects some kind of memory corruption,
depending on the configuration it prints the error, a backtrace, a
memory map and then aborts the process. In this process, the
backtrace() call may result in a call to malloc, resulting in
various kinds of problematic behavior.
In one case, the malloc it calls may detect a corruption and call
backtrace again, and a stack overflow may result due to the infinite
recursion. In another case, the malloc it calls may deadlock on an
arena lock with the malloc (or free, realloc, etc.) that detected the
corruption. In yet another case, if the program is linked with
pthreads, backtrace may do a pthread_once initialization, which
deadlocks on itself.
In all these cases, the program exit is not as intended. This is
avoidable by marking the arena that malloc detected a corruption on,
as unusable. The following patch does that. Features of this patch
are as follows:
- A flag is added to the mstate struct of the arena to indicate if the
arena is corrupt.
- The flag is checked whenever malloc functions try to get a lock on
an arena. If the arena is unusable, a NULL is returned, causing the
malloc to use mmap or try the next arena.
- malloc_printerr sets the corrupt flag on the arena when it detects a
corruption.
- free does not concern itself with the flag at all. It is not
important since the backtrace workflow does not need free. A free
in a parallel thread may cause another corruption, but that is not
new.
- The flag check and set are not atomic and may race. This is fine
since we don't care about contention during the flag check. We want
to make sure that the malloc call in the backtrace does not trip on
itself and all that action happens in the same thread and not across
threads.
I verified that the test case does not show any regressions due to
this patch. I also ran the malloc benchmarks and found an
insignificant difference in timings (< 2%).
* malloc/Makefile (tests): New test case tst-malloc-backtrace.
* malloc/arena.c (arena_lock): Check if arena is corrupt.
(reused_arena): Find a non-corrupt arena.
(heap_trim): Pass arena to unlink.
* malloc/hooks.c (malloc_check_get_size): Pass arena to
malloc_printerr.
(top_check): Likewise.
(free_check): Likewise.
(realloc_check): Likewise.
* malloc/malloc.c (malloc_printerr): Add arena argument.
(unlink): Likewise.
(munmap_chunk): Adjust.
(ARENA_CORRUPTION_BIT): New macro.
(arena_is_corrupt): Likewise.
(set_arena_corrupt): Likewise.
(sysmalloc): Use mmap if there are no usable arenas.
(_int_malloc): Likewise.
(__libc_malloc): Don't fail if arena_get returns NULL.
(_mid_memalign): Likewise.
(__libc_calloc): Likewise.
(__libc_realloc): Adjust for additional argument to
malloc_printerr.
(_int_free): Likewise.
(malloc_consolidate): Likewise.
(_int_realloc): Likewise.
(_int_memalign): Don't touch corrupt arenas.
* malloc/tst-malloc-backtrace.c: New test case.
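The flag-and-skip idea described above can be sketched as follows. This is a
simplified model, not the malloc.c code: `struct demo_arena`, the bit value,
and `pick_usable_arena` are assumptions made for illustration; only the names
ARENA_CORRUPTION_BIT, arena_is_corrupt and set_arena_corrupt come from the
ChangeLog entry.

```c
#include <stddef.h>

/* Assumed bit value; the real one lives in malloc/malloc.c.  */
#define ARENA_CORRUPTION_BIT 4U

/* Hypothetical stand-in for the arena state, linked in a circular list.  */
struct demo_arena
{
  unsigned int flags;
  struct demo_arena *next;
};

#define arena_is_corrupt(A) (((A)->flags & ARENA_CORRUPTION_BIT) != 0)
#define set_arena_corrupt(A) ((A)->flags |= ARENA_CORRUPTION_BIT)

/* Walk the circular list once and return the first arena that is not
   marked corrupt, or NULL so that the caller can fall back to mmap.  */
static struct demo_arena *
pick_usable_arena (struct demo_arena *start)
{
  struct demo_arena *a = start;
  do
    {
      if (!arena_is_corrupt (a))
        return a;
      a = a->next;
    }
  while (a != start);
  return NULL;
}
```

The plain read and write of `flags` mirror the non-atomic check and set that
the commit message accepts as harmless.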
|
|
The conditional that evaluates if there are any FAILed test cases
currently always fails, since we ensure it fails if we find any
unexpected results in tests.sum and it would obviously fail if it does
not find failed results in tests.sum. This patch fixes this by simply
inverting the result of the egrep, i.e. succeed if egrep fails (to
find failed results) and fail if it succeeds.
Tested with 'make subdirs=localedata check' and 'make subdirs=locale
check' where all tests succeed and with 'make subdirs=elf check' where
a couple of tests fail for me.
* Makefile (summarize-tests): Fix return value on success.
|
|
I was told that Ma Shimao submitted a patch to add envz_remove to the
libc manual, but the patch could not be accepted since he does not
have a copyright assignment in place. I have been woefully behind on
libc-alpha recently and have not seen the patch or the discussion
thread. I have also not read the man page for envz_remove, so
Alexandre Oliva asked me if I could write this independently and post
a patch. The patch below is the result of the same - I have written
it based on the implementation in string/envz.c and Alex told me via
email that the function is AS, AC and MT-safe like envz_strip.
I assume Alex and Carlos cannot review this since they have been
tainted by the original patch (I haven't even tried to look for a link
to it since I don't want to be tainted) so someone else will have to
review this. If there are no reviewers till the end of the week, I
will commit this since I believe there is a chance that there are no
other reviewers who haven't read that thread.
* manual/string.texi (Envz Functions): Add envz_remove.
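For readers unfamiliar with the function, a minimal usage sketch follows. It
relies on the glibc-specific <envz.h> interface; the variable names and the
"HOME"/"TERM" entries are made up for the example, and the wrapper function
itself is hypothetical.

```c
#include <envz.h>
#include <stdlib.h>

/* Build a two-entry envz vector, remove one entry, and report whether
   only that entry disappeared.  Returns 1 on success, 0 if the vector
   is not in the expected state, -1 if an allocation failed.  */
static int
envz_remove_demo (void)
{
  char *envz = NULL;
  size_t envz_len = 0;
  int ok;

  if (envz_add (&envz, &envz_len, "HOME", "/root") != 0
      || envz_add (&envz, &envz_len, "TERM", "xterm") != 0)
    {
      free (envz);
      return -1;
    }

  envz_remove (&envz, &envz_len, "HOME");

  ok = envz_get (envz, envz_len, "HOME") == NULL
       && envz_get (envz, envz_len, "TERM") != NULL;
  free (envz);
  return ok;
}
```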
|
|
|
|
Ignore generated *.pyc files, particularly in the benchtests
directory.
|
|
While trying to get nptl/tst-initializers1.c to include the test skeleton, I
came across a couple of speed bumps. Firstly: after making the appropriate
changes to the test, running `make check' led to this error:
> In file included from ../malloc/malloc.h:24:0,
..
> from tst-initializers1.c:60:
> ../include/stdio.h:111:1: error: unknown type name `wint_t'
> extern wint_t __getwc_unlocked (FILE *__fp);
So, `wint_t' is used before being defined. Question: Why did test-skeleton.c
not cause this error in any of the other tests that include it?
Anyway, I noticed include/stdio.h includes stddef.h, which in turn defines
`wint_t', but only if `__need_wint_t' is defined. So I put in a
`#define __need_wint_t' before the include to get rid of the error. Is that
the correct fix?
A subsequent `make && make check' led to this second error:
> from tst-initializers1-c89.c:1:
> ../test-skeleton.c: In function `main':
> ../test-skeleton.c:356:11: error: `for' loop initial declarations are only
> allowed in C99 mode
> for (struct temp_name_list *n = temp_name_list;
Although there seem to be several other C89 no-nos in test-skeleton.c, I
needed only to fix this specific one for gcc-4.8.3 to stop complaining.
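The kind of change involved in the second error can be sketched like this.
The struct layout and the function are hypothetical stand-ins, not the
test-skeleton.c code; the point is only that the loop variable is declared
before the `for' statement so that the code is also valid C89.

```c
#include <stddef.h>

/* Hypothetical list node, loosely modelled on the name in the error.  */
struct temp_name_list
{
  struct temp_name_list *next;
  const char *name;
};

static size_t
count_temp_names (struct temp_name_list *head)
{
  struct temp_name_list *n;   /* Declared outside the loop for C89.  */
  size_t count = 0;

  for (n = head; n != NULL; n = n->next)
    ++count;
  return count;
}
```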
|
|
This patch removes the non-portable array usage in the tst-setcontext3.sh
script.
|
|
If any locale fails to compile then the installation
of locales via `make localedata/install-locales`
also fails.
|
|
Both bo_CN and bo_IN were not compiling. The following fix
gets them into a usable state again giving a clean build
result for `make localedata/install-locales`.
|
|
Similar to various other bugs in this area, some atanh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes. (No change in this regard is needed
for the i386 implementation; special handling to force underflows in
these cases will only be needed there when the spurious underflows,
bug 18049, get fixed.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16352]
* sysdeps/i386/fpu/e_atanh.S (dbl_min): New object.
(__ieee754_atanh): Force underflow exception for results with
small absolute value.
* sysdeps/i386/fpu/e_atanhf.S (flt_min): New object.
(__ieee754_atanhf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/e_atanh.c: Include <float.h>.
(__ieee754_atanh): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/flt-32/e_atanhf.c: Include <float.h>.
(__ieee754_atanhf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-96/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* math/auto-libm-test-in: Do not allow missing underflow
exceptions from atanh.
* math/auto-libm-test-out: Regenerated.
|
|
The flt-32 implementation of tanf produces spurious underflow
exceptions for some small arguments, through computing values on the
order of x^5. This patch fixes this by adjusting the threshold for
returning x (or, as applicable, +/- 1/x) to 2**-13 (the next term in
the power series being x^3/3).
Tested for x86_64 and x86.
[BZ #18221]
* sysdeps/ieee754/flt-32/k_tanf.c (__kernel_tanf): Use 2**-13 not
2**-28 as threshold for returning x or +/- 1/x.
* math/auto-libm-test-in: Add more tests of tan.
* math/auto-libm-test-out: Regenerated.
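A back-of-the-envelope check of the new threshold, assuming round-to-nearest
and the 24-bit significand of single precision (this reasoning is a sketch
and is not taken from the patch):

```latex
\tan x = x + \frac{x^{3}}{3} + \frac{2x^{5}}{15} + \cdots,
\qquad
\left|\frac{x^{3}/3}{x}\right| = \frac{x^{2}}{3}
  < \frac{2^{-26}}{3} < 2^{-27}
  \quad\text{for } |x| < 2^{-13}.
```

The dominant neglected term is therefore below the smallest relative
half-ulp of float, 2^{-25}, so returning x loses nothing at this precision.
For comparison, x^5 drops below the smallest normal float, 2^{-126}, once
|x| < 2^{-25.2}, which is the range where the old 2**-28 threshold still
went through the polynomial.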
|
|
The flt-32 implementation of lgammaf produces spurious underflow
exceptions for some large arguments, because of calculations involving
x^-2 multiplied by small constants. This patch fixes this by
adjusting the threshold for a simpler computation to 2**26 (the error
in the simpler computation is on the order of 0.5 * log (x), for a
result on the order of x * log (x)).
Tested for x86_64 and x86.
[BZ #18220]
* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Use
2**26 not 2**58 as threshold for returning x * (log (x) - 1).
* math/auto-libm-test-in: Add another test of lgamma.
* math/auto-libm-test-out: Regenerated.
|
|
|
|
which is more efficient on all targets.
|
|
The flt-32 implementation of erfcf produces spurious underflow
exceptions for some arguments close to 0, because of calculations
squaring the argument and then multiplying by small constants. This
patch fixes this by adjusting the threshold for arguments for which
the result is so close to 1 that 1 - x will give the right result from
2**-56 to 2**-26. (If 1 - x * 2/sqrt(pi) were used, the errors would be
on the order of x^3 and a much larger threshold could be used.)
Tested for x86_64 and x86.
[BZ #18217]
* sysdeps/ieee754/flt-32/s_erff.c (__erfcf): Use 2**-26 not 2**-56
as threshold for returning 1 - x.
* math/auto-libm-test-in: Add more tests of erfc.
* math/auto-libm-test-out: Regenerated.
|
|
The sysdeps/ieee754/flt-32 version of atanf produces spurious
underflow exceptions for some large arguments, because of computations
that compute x^-4. This patch fixes this by adjusting the threshold
for large arguments (for which +/- pi/2 can just be returned, the
correct result being roughly +/- pi/2 - 1/x) from 2^34 to 2^25.
Tested for x86_64 and x86.
[BZ #18196]
* sysdeps/ieee754/flt-32/s_atanf.c (__atanf): Use 2^25 not 2^34 as
threshold for large arguments.
* math/auto-libm-test-in: Add another test of atan.
* math/auto-libm-test-out: Regenerated.
|
|
Similar to various other bugs in this area, some log1p implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes. (The ldbl-128ibm implementation
doesn't currently need any change as it already generates this
exception, albeit through code that would generate spurious exceptions
in other cases; special code for this issue will only be needed there
when fixing the spurious exceptions.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16339]
* sysdeps/i386/fpu/s_log1p.S (dbl_min): New object.
(__log1p): Force underflow exception for results with small
absolute value.
* sysdeps/i386/fpu/s_log1pf.S (flt_min): New object.
(__log1pf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/dbl-64/s_log1p.c: Include <float.h>.
(__log1p): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/flt-32/s_log1pf.c: Include <float.h>.
(__log1pf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_log1pl.c: Include <float.h>.
(__log1pl): Force underflow exception for results with small
absolute value.
* math/auto-libm-test-in: Do not allow missing underflow
exceptions from log1p.
* math/auto-libm-test-out: Regenerated.
|
|
This patch changes the way the make-syscalls.sh script uses echo so that
it follows the POSIX spec.
|
|
Install libm.so as a linker script only when libmvec.so is built.
2015-05-14 Andrew Senkevich <andrew.n.senkevich@gmail.com>
* Makeconfig (rpath-dirs, all-subdirs): Added mathvec folder.
(libmvec): New variable.
* configure.ac: Added option for mathvec build.
* configure: Regenerated.
* mathvec/Depend: New file.
* mathvec/Makefile: New file.
* shlib-versions: Added libmvec.
* math/Makefile: Added rule for libm.so installation.
|
|
declarations for math functions in math.h. Added new headers math-vector.h
(only a generic version for now) and libm-simd-decl-stubs.h, with the empty
definitions required for proper expansion of the new macro __MATHCALL_VEC,
which will be used to declare vector math functions.
2015-05-14 Andrew Senkevich <andrew.senkevich@intel.com>
* bits/math-vector.h: New file.
* bits/libm-simd-decl-stubs.h: New header.
* math/Makefile (headers): Added new header libm-simd-decl-stubs.h.
* math/math.h (__MATHCALL_VEC): New macro.
|
|
of a method for selecting exactly which testing functions need to run, with
the help of a header, generated during make check, that contains a series of
conditional definitions.
2015-05-14 Andrew Senkevich <andrew.senkevich@intel.com>
* math/gen-libm-have-vector-test.sh: Script generates series of macros
for conditions in testing functions.
* math/Makefile: Added call of libm-have-vector-test.sh.
* math/libm-test.inc (HAVE_VECTOR): New macros.
|
|
and addition of macros used for runtime architecture check.
2015-05-14 Andrew Senkevich <andrew.senkevich@intel.com>
* math/libm-test.inc: START refactored.
* math/test-double.c (TEST_MATHVEC): Add define.
* math/test-float.c: Likewise.
* math/test-idouble.c: Likewise.
* math/test-ifloat.c: Likewise.
* math/test-ildoubl.c: Likewise.
* math/test-ldouble.c: Likewise.
* sysdeps/generic/math-tests-arch.h (INIT_ARCH_EXT, CHECK_ARCH_EXT):
New helper macros for runtime architecture check.
|
|
of vector math functions infrastructure and several x86_64 implementations.
This patch is a preparatory change in libm-test.c: it splits the macros that
form the names of tested functions, so that separate names can be used for
the tested functions and for the functions used in the test suite
infrastructure.
2015-05-14 Andrew Senkevich <andrew.senkevich@intel.com>
* math/test-double.c (FUNC_TEST): New macro.
* math/test-float.c: Likewise.
* math/test-idouble.c: Likewise.
* math/test-ifloat.c: Likewise.
* math/test-ildoubl.c: Likewise.
* math/test-ldouble.c: Likewise.
* math/libm-test.inc: Use FUNC_TEST for name of tested functions.
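The effect of the split can be sketched with token pasting. Everything here
except the name FUNC_TEST is an assumption made for illustration: the
companion macro FUNC, the `_under_test' suffix, and the two `half' functions
are hypothetical and do not reproduce the definitions in math/test-float.c.

```c
/* Hypothetical float routines: a reference version used by the test
   infrastructure and a separately named variant that is being tested.  */
static float halff (float x) { return x / 2.0f; }
static float halff_under_test (float x) { return x * 0.5f; }

/* Name used by the infrastructure.  */
#define FUNC(function) function ## f
/* Name of the function actually under test.  */
#define FUNC_TEST(function) function ## f_under_test

/* Compare the tested variant against the reference for one input.  */
static int
check_half (float input)
{
  float expected = FUNC (half) (input);
  float actual = FUNC_TEST (half) (input);
  return expected == actual;
}
```

With a single macro, both calls would necessarily resolve to the same
function name.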
|
|
|
|
|
|
|
|
|
|
|
|
[BZ #18409]
* sysdeps/unix/make-syscalls.sh: Remove a trailing `\'.
|
|
|
|
|
|
|
|
This patch prepares for the strcoll benchmark by moving the makefile
code for generating the locale files into a standalone snippet that
can be used elsewhere.
|
|
To make strtok faster and improve performance in general, we need to make
one additional change.
A comment:
/* It doesn't make sense to send libc-internal strcspn calls through a PLT.
The speedup we get from using SSE4.2 instruction is likely eaten away
by the indirect call in the PLT. */
does not make sense at all, because nobody bothered to check it. The gap
between these implementations is quite big: when the haystack is empty, the
sse2 version is around 40 cycles slower because it needs to populate a lookup
table, and the difference only increases with size. That is much bigger than
the plt slowdown, which is a few cycles.
Even the benchtests show a gap, which may also be reversed by branch
misprediction, but my internal benchmark showed:
                                   simple_strspn  stupid_strspn  __strspn_sse42  __strspn_sse2
Length 0, alignment 0, acc len 6:  18.6562        35.2344        17.0469         61.6719
Length 6, alignment 0, acc len 6:  59.5469        72.5781        16.4219         73.625
This patch also handles strpbrk, which is implemented by including the
x86_64/multiarch/strcspn.S file.
* sysdeps/x86_64/multiarch/strspn.S: Remove plt indirection.
* sysdeps/x86_64/multiarch/strcspn.S: Likewise.
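To see where a fixed setup cost of the kind mentioned above can come from,
here is a minimal table-driven strspn in plain C. It is an illustrative
sketch under the assumption that the sse2 path works along these lines, not
the glibc implementation; the function name is made up.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table-based strspn.  The 256-entry table is cleared and
   filled on every call, even when S is empty, which is the fixed cost.  */
static size_t
table_strspn (const char *s, const char *accept)
{
  unsigned char table[256];
  const unsigned char *p;
  size_t n = 0;

  memset (table, 0, sizeof table);
  for (p = (const unsigned char *) accept; *p != '\0'; ++p)
    table[*p] = 1;

  /* table[0] stays 0, so the scan stops at the terminating NUL.  */
  for (p = (const unsigned char *) s; table[*p]; ++p)
    ++n;
  return n;
}
```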
|
|
|