path: root/sysdeps
Age | Commit message | Author | Files | Lines
2015-01-30Add memcpy-rte-ssse3.c (hjl/memcpy/dpdk/master)H.J. Lu3-1/+10
2015-01-30Add memcpy-rte-avx.cH.J. Lu4-4/+16
Don't inline rte_memcpy.
2015-01-30Import rte_memcpy.hH.J. Lu1-0/+635
rte_memcpy.h is a memcpy implementation from DPDK (http://dpdk.org/), optimized for Sandy Bridge and Haswell.
See http://dpdk.org/ml/archives/dev/2014-November/008158.html
The original code is at https://gist.github.com/lukego/efc82a15bde5ec83cb1b
2015-01-30Use AVX unaligned memcpy only if AVX2 is availableH.J. Lu8-8/+17
memcpy with unaligned 256-bit AVX register loads/stores is slow on older processors like Sandy Bridge. This patch adds bit_AVX_Fast_Unaligned_Load and sets it only when AVX2 is available.
[BZ #17801]
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features): Set the bit_AVX_Fast_Unaligned_Load bit for AVX2.
* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX_Fast_Unaligned_Load): New.
(index_AVX_Fast_Unaligned_Load): Likewise.
(HAS_AVX_FAST_UNALIGNED_LOAD): Likewise.
* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Check the bit_AVX_Fast_Unaligned_Load bit instead of the bit_AVX_Usable bit.
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/memmove.c (__libc_memmove): Replace HAS_AVX with HAS_AVX_FAST_UNALIGNED_LOAD.
* sysdeps/x86_64/multiarch/memmove_chk.c (__memmove_chk): Likewise.
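The selection logic described above can be sketched in plain C. This is only an illustration: the `SKETCH_*` names and both functions are hypothetical stand-ins, not glibc's internal `cpu_features` layout or its assembly selectors.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; glibc's real bits live in init-arch.h. */
enum {
  SKETCH_AVX_USABLE = 1 << 0,
  SKETCH_AVX2_USABLE = 1 << 1,
  SKETCH_AVX_FAST_UNALIGNED_LOAD = 1 << 2
};

/* Set the fast-unaligned-load bit only when AVX2 is available,
   so AVX-only parts such as Sandy Bridge do not get it.  */
static uint32_t
sketch_init_features (bool has_avx, bool has_avx2)
{
  uint32_t features = 0;
  if (has_avx)
    features |= SKETCH_AVX_USABLE;
  if (has_avx2)
    features |= SKETCH_AVX2_USABLE | SKETCH_AVX_FAST_UNALIGNED_LOAD;
  return features;
}

/* The selector tests the new bit rather than plain AVX usability.  */
static bool
sketch_use_avx_unaligned (uint32_t features)
{
  return (features & SKETCH_AVX_FAST_UNALIGNED_LOAD) != 0;
}
```

With this shape, a CPU that reports AVX but not AVX2 falls through to whatever non-AVX variant the selector would otherwise pick.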
2015-01-29Include <signal.h> in sysdeps/nptl/allocrtsig.cAndreas Schwab1-0/+1
Architectures which don't use hp-timing-common.h don't include <signal.h> via <sys/param.h>.
2015-01-28tilegx32: set __HAVE_64B_ATOMICS to 0Chris Metcalf1-1/+9
This is because of alignment issues in the sem_t support. tilegx32 does in fact support 64-bit atomics and we will need to revisit this after the 2.21 freeze.
2015-01-28Disable 64-bit atomics for MIPS n32.Joseph Myers1-1/+1
This patch disables use of 64-bit atomics for MIPS n32 to fix the problems with unaligned semaphores. Before 64-bit atomics are used for anything for which such alignment issues do not arise, and before the addition of any new ILP32 ports with 64-bit semaphores for which the ABI can be set to have the greater alignment (AARCH64?), a better approach will need to be established that allows architectures to declare their 64-bit atomics availability accurately, without doing so causing inappropriate use of such atomics on unaligned semaphores.
Tested for MIPS n32 that this fixes the nptl/tst-sem3 failure.
* sysdeps/mips/bits/atomic.h [_MIPS_SIM == _ABIN32] (__HAVE_64B_ATOMICS): Define to 0.
2015-01-28powerpc: Fix fesetexceptflag [BZ#17885]Adhemerval Zanella1-1/+1
This patch fixes a bug introduced by 18f2945ae9216cfc, which optimizes the FPSCR set by issuing a mtfs instruction only if the new flag is different from the old one. The issue is a typo, where the new flag should be the new value, instead of the old one. It fixes BZ#17885.
2015-01-28powerpc: Fix fsqrt build in libm [BZ#16576]Adhemerval Zanella5-102/+24
Some powerpc64 processors (the e5500 core, for instance) do not provide the fsqrt instruction; however, the current check in math_private.h uses __WORDSIZE and _ARCH_PWR4 (ISA 2.02). This patch changes it to use the compiler flag _ARCH_PPCSQ (which is the same condition GCC uses to decide whether to generate the fsqrt instruction). It fixes BZ#16576.
2015-01-25ia64: avoid set-but-not-used warningAndreas Schwab1-0/+3
2015-01-25m68k/coldfire: avoid warning about volatile register variablesAndreas Schwab1-10/+9
2015-01-25m68k: fix missing definition of __feraiseexceptAndreas Schwab1-0/+1
2015-01-25m68k: force inlining bswap functionsAndreas Schwab1-4/+4
2015-01-24powerpc: Fix powerpc64 build failure with binutils 2.22Adhemerval Zanella1-1/+4
The GLIBC memset optimization for POWER8 uses the '.machine power8' directive, which is only officially supported on binutils 2.24+. This causes a build failure on older binutils. Since .machine power8 is required only to correctly assemble the 'mtvsrd' instruction, and that is already handled by the MTVSRD_V1_R4 macro, there is no real need to use it. The patch replaces power8 with power7 in the .machine directive. It fixes BZ#17869.
2015-01-24powerpc: Fix ifuncmain6pie failure with GCC 4.9Adhemerval Zanella1-1/+3
This patch fixes the elf/ifuncmain6pie failure when building with GCC 4.9+. For some reason, the compiler removes the branch-taken code in resolve_ifunc (sysdeps/powerpc/powerpc64/dl-machine.h) as dead code, and thus the testcase fails because the ifunc resolves branches to an invalid memory location. The patch fixes this by explicitly adding a dependency of value on the odp variable to avoid the compiler optimization. It fixes BZ#17868.
2015-01-23Also treat model numbers 0x5a/0x5d as SilvermontH.J. Lu1-0/+2
2015-01-23Treat model numbers 0x4a/0x4d as SilvermontH.J. Lu1-0/+2
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features): Treat model numbers 0x4a/0x4d as Intel Silvermont architecture.
2015-01-23Use uint64_t and (uint64_t) 1 for 64-bit intH.J. Lu1-1/+1
This patch replaces unsigned long int and 1UL with uint64_t and (uint64_t) 1 to support ILP32 targets like x32.
[BZ #17870]
* nptl/sem_post.c (__new_sem_post): Replace unsigned long int with uint64_t.
* nptl/sem_waitcommon.c (__sem_wait_cleanup): Replace 1UL with (uint64_t) 1.
(__new_sem_wait_slow): Replace unsigned long int with uint64_t. Replace 1UL with (uint64_t) 1.
* sysdeps/nptl/internaltypes.h (new_sem): Replace unsigned long int with uint64_t.
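A small sketch of why the type matters: on an ILP32 target, unsigned long int is 32 bits wide, so an expression such as 1UL << 32 is undefined, while (uint64_t) 1 << 32 is well defined everywhere. The shift constant and helper names below are hypothetical and chosen only for illustration; they are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical layout: a waiter count stored in the upper 32 bits
   of a 64-bit word, with a value in the lower 32 bits.  */
#define SKETCH_NWAITERS_SHIFT 32

/* Well defined on both LP64 and ILP32, because the shifted operand
   is 64 bits wide regardless of the width of unsigned long.  */
static uint64_t
sketch_one_waiter (void)
{
  return (uint64_t) 1 << SKETCH_NWAITERS_SHIFT;
}

/* Register one more waiter in the packed word.  */
static uint64_t
sketch_add_waiter (uint64_t data)
{
  return data + sketch_one_waiter ();
}

/* Extract the waiter count from the packed word.  */
static uint64_t
sketch_waiters (uint64_t data)
{
  return data >> SKETCH_NWAITERS_SHIFT;
}
```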
2015-01-21BZ #16418: Fix powerpc get_clockfreq racinessAdhemerval Zanella1-69/+59
This patch fixes the powerpc __get_clockfreq race and cancellation-safety issues by dropping the internal static cache and by using nocancel file operations. The vDSO failure check is also removed, since the kernel code does not return an error (it clears the cr0.so bit on function return) and the static code (which reads the value from /proc) now uses non-cancellable calls.
2015-01-21Update copyright year to 2015 for new files.Carlos O'Donell1-1/+1
2015-01-21Fix recursive dlopen.Carlos O'Donell1-2/+2
The ability to recursively call dlopen is useful for malloc implementations that wish to load other dynamic modules that implement reentrant/AS-safe functions to use in their own implementation. Given that a user malloc implementation may be called by an ongoing dlopen to allocate memory, the user malloc implementation interrupts dlopen, and if it calls dlopen again, that's a reentrant call. This patch fixes the issues with the ld.so.cache mapping and the _r_debug assertion which prevent this from working as expected.
See: https://sourceware.org/ml/libc-alpha/2014-12/msg00446.html
2015-01-21Fix semaphore destruction (bug 12674).Carlos O'Donell19-1794/+23
This commit fixes semaphore destruction by either using 64b atomic operations (where available), or by using two separate fields when only 32b atomic operations are available. In the latter case, we keep a conservative estimate of whether there are any waiting threads in one bit of the field that counts the number of available tokens, thus allowing sem_post to atomically both add a token and determine whether it needs to call futex_wake. See: https://sourceware.org/ml/libc-alpha/2014-12/msg00155.html
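The 32-bit case can be sketched with C11 atomics. This is a simplified illustration of the idea only (one bit of the token-count word acts as a conservative "there may be waiters" flag); the type, names, and bit layout are assumptions, and the sketch omits the futex calls, overflow checks, and the second field that the real implementation uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_WAITERS_BIT 1u   /* hypothetical: low bit flags possible waiters */
#define SKETCH_VALUE_SHIFT 1    /* hypothetical: token count sits above that bit */

typedef struct
{
  _Atomic unsigned int value;
} sketch_sem;

/* Add one token and learn, in the same atomic step, whether a wake-up
   (futex_wake in the real code) would be needed.  */
static bool
sketch_post (sketch_sem *s)
{
  unsigned int old = atomic_fetch_add_explicit (&s->value,
                                                1u << SKETCH_VALUE_SHIFT,
                                                memory_order_release);
  return (old & SKETCH_WAITERS_BIT) != 0;
}

/* A thread about to block sets the conservative waiters flag.  */
static void
sketch_mark_waiting (sketch_sem *s)
{
  atomic_fetch_or_explicit (&s->value, SKETCH_WAITERS_BIT,
                            memory_order_relaxed);
}

/* Take a token if one is available; never blocks.  */
static bool
sketch_trywait (sketch_sem *s)
{
  unsigned int v = atomic_load_explicit (&s->value, memory_order_relaxed);
  while ((v >> SKETCH_VALUE_SHIFT) != 0)
    if (atomic_compare_exchange_weak_explicit (&s->value, &v,
                                               v - (1u << SKETCH_VALUE_SHIFT),
                                               memory_order_acquire,
                                               memory_order_relaxed))
      return true;
  return false;
}

/* Current number of available tokens.  */
static unsigned int
sketch_tokens (sketch_sem *s)
{
  return atomic_load (&s->value) >> SKETCH_VALUE_SHIFT;
}
```

Because the flag and the count share one word, a single fetch-and-add both publishes the token and reports whether anyone might be waiting, which is the property the commit message describes.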
2015-01-17Commit nios2 port to master.Chung-Lin Tang96-0/+7575
2015-01-16S390: Get rid of linknamespace failures for utmp functions.Stefan Liebler7-14/+42
2015-01-16S390: Get rid of linknamespace failures for string functions.Stefan Liebler14-79/+79
2015-01-14Fix powerpc-nofpu fesetenv namespace (bug 17748).Joseph Myers1-1/+1
When fixing namespace issues for <fenv.h> functions I missed one call to fesetenv for powerpc-nofpu. This patch changes this to a call to __fesetenv.
Tested for powerpc-nofpu; it fixes the previously observed math.h linknamespace test failures.
[BZ #17748]
* sysdeps/powerpc/nofpu/feholdexcpt.c (__feholdexcept): Call __fesetenv instead of fesetenv.
2015-01-14[s390] Define a __tls_get_addr macro to avoid declaring it againSiddhesh Poyarekar1-0/+7
commit 050f7298e1ecc39887c329037575ccd972071255 added an extern declaration for __tls_get_addr that conflicts with the one in s390 dl-tls.h, based on whether __tls_get_addr is defined as a macro. The rationale seems to be based on the assumption that __tls_get_addr is exported for every architecture and hence an internal non-plt alias is needed. This is not true for s390 though, since it exports __tls_get_offset and not __tls_get_addr. This results in tst-audit9 being stuck in an infinite loop. This patch fixes this by defining a __tls_get_addr macro to itself so as to not use the conflicting declaration.
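The "macro defined to itself" technique can be illustrated with a generic sketch. All names here are hypothetical; the real declarations in glibc differ, and the two header sections are collapsed into one file purely for demonstration.

```c
/* Architecture header (hypothetical): define the name as a macro that
   expands to itself.  Uses of the name are unchanged, but the
   preprocessor can now detect that the architecture claimed it.  */
#define sketch_get_addr sketch_get_addr

/* Generic header (hypothetical): only declare the symbol when no
   macro of that name exists, so the architecture's own declaration
   is never contradicted.  */
#ifndef sketch_get_addr
extern void *sketch_get_addr (void *);
# define SKETCH_GENERIC_DECL_USED 1
#else
# define SKETCH_GENERIC_DECL_USED 0
#endif

/* The architecture's own definition, with a signature that would
   conflict with the generic declaration above if both were visible.  */
static long
sketch_get_addr (long offset)
{
  return offset + 1;
}
```

A self-referential macro is not expanded recursively, so the definition and every call still refer to the ordinary identifier.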
2015-01-13powerpc: Fix POWER7/PPC64 performance regression on LEAdhemerval Zanella1-588/+282
This patch fixes a performance regression in the POWER7/PPC64 memcmp port for Little Endian. The LE code uses the 'ldbrx' instruction to read memory in byte-reversed form; however, ISA 2.06 provides only the indexed form, which uses a register value as an additional index instead of a fixed value encoded in the instruction. The LE port strategy uses an r0 index value and updates the address value on each compare loop iteration. For large compare sizes, this adds 8 more instructions, plus some more depending on the trailing size. This patch fixes it by adding precalculated indexes to remove the address update in loops and trailing sizes. For large sizes it shows a considerable gain, doubling performance and reaching parity with BE.
2015-01-13powerpc: Optimized strncmp for POWER8/PPC64Adhemerval Zanella5-5/+374
This patch adds an optimized POWER8 strncmp. The implementation focuses on speeding up unaligned cases, following the ideas of the POWER8 strcmp. The algorithm first checks the initial 16 bytes, then aligns the first function source and uses unaligned loads on the second argument only. Additional checks for page boundaries are done for unaligned cases (where the source alignments differ).
2015-01-13powerpc: Optimize POWER7 strcmp trailing checksRajalakshmi Srinivasaraghavan1-114/+83
This patch optimizes the POWER7 trailing check by avoiding byte read operations and instead applying bitwise operations to the doubleword already read.
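As a rough illustration of working on a whole doubleword with bitwise operations instead of rereading bytes, the following sketch uses the classic portable zero-byte test in C. This is a swapped-in technique for illustration: the POWER7 assembly relies on its own instructions, and the helper names here are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the eight bytes in V is zero.  Subtracting 1 from
   each byte borrows into the high bit only for bytes that were zero
   (or had the high bit set, which the ~V term filters out).  */
static int
sketch_has_zero_byte (uint64_t v)
{
  return ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) != 0;
}

/* Load eight bytes without alignment or aliasing problems.  */
static uint64_t
sketch_load8 (const char *p)
{
  uint64_t v;
  memcpy (&v, p, sizeof v);
  return v;
}
```

The test answers "does this doubleword contain a terminator?" with a handful of register operations, with no further memory reads.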
2015-01-13powerpc: Optimized strcmp for POWER8/PPC64Adhemerval Zanella5-3/+306
This patch adds an optimized POWER8 strcmp using unaligned accesses. The algorithm first checks the initial 16 bytes, then aligns the first function source and uses unaligned loads on the second argument only. Additional checks for page boundaries are done for unaligned cases.
2015-01-13powerpc: Optimized st{r,p}ncpy for POWER8/PPC64Adhemerval Zanella8-6/+542
This patch adds an optimized POWER8 st{r,p}ncpy using unaligned accesses. It shows a 10%-80% improvement over the optimized POWER7 one, which uses only aligned accesses, especially on unaligned inputs. The algorithm first reads and checks 16 bytes (if the inputs do not cross a 4K page boundary). Then it realigns the source to 16 bytes and issues a 16-byte read-and-compare loop to speed up null byte checks for large strings. Also, unlike the POWER7 optimization, the null padding is done inline in the implementation using possibly unaligned accesses, instead of relying on a memset call. A special case is added for page-crossing reads.
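The page-boundary condition that guards the initial 16-byte read can be sketched as follows. The function and macro names are hypothetical, the 4K page size is taken from the message above, and the real check is written in POWER8 assembly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* 4K page, as stated in the commit message */

/* True if reading LEN bytes starting at P would touch the next page.
   Only the address is inspected; nothing is dereferenced.  */
static bool
sketch_read_crosses_page (const void *p, size_t len)
{
  uintptr_t offset = (uintptr_t) p & (SKETCH_PAGE_SIZE - 1);
  return offset + len > SKETCH_PAGE_SIZE;
}
```

When the check is false, a full 16-byte unaligned load is safe even if the string ends earlier, because the extra bytes still lie in a mapped page; when it is true, the special page-crossing path is taken instead.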
2015-01-13powerpc: Optimized strncat for POWER7/PPC64Adhemerval Zanella3-270/+31
With 3eb38795dbbbd816 (Simplify strncat), the generic algorithm uses strlen, strnlen, and memcpy. This is faster than the current POWER7 implementation, especially for unaligned strings (where the POWER7 code uses byte-by-byte operations). This patch removes the assembly implementation and uses a multiarch specialization based on the default algorithm calling optimized POWER7 symbols.
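A minimal sketch of a strncat built from those three primitives is shown below. The function names are hypothetical, and `sketch_strnlen` is a portable stand-in for strnlen (written with memchr) so the example does not depend on POSIX feature macros.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for strnlen: length of S, but at most MAXLEN.  */
static size_t
sketch_strnlen (const char *s, size_t maxlen)
{
  const char *end = memchr (s, '\0', maxlen);
  return end != NULL ? (size_t) (end - s) : maxlen;
}

/* Append at most N bytes of SRC to DST and always terminate DST.
   The caller must provide room for strlen (DST) + N + 1 bytes.  */
static char *
sketch_strncat (char *dst, const char *src, size_t n)
{
  char *tail = dst + strlen (dst);      /* find the end of DST */
  size_t len = sketch_strnlen (src, n); /* bounded length of SRC */
  tail[len] = '\0';
  memcpy (tail, src, len);              /* bulk copy, no byte loop */
  return dst;
}
```

Each step maps onto a routine that an architecture can supply in optimized form, which is why the composed version can beat a hand-written byte-by-byte loop.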
2015-01-13powerpc: Optimized strcat for POWER8/PPC64Adhemerval Zanella4-4/+40
With the new optimized strcpy for POWER8, this patch adds an optimized strcat which uses it along with the default implementation in strings/.
2015-01-13powerpc: Optimized st{r,p}cpy for POWER8/PPC64Adhemerval Zanella7-3/+377
This patch adds an optimized POWER8 strcpy using unaligned accesses. For strings up to 16 bytes, the implementation first calculates the string size, like strlen, and issues a memcpy. For larger strings, the source is first aligned to 16 bytes and then tested in a loop that reads 16 bytes and combines the cmpb results for speedup. A special case is added for page-crossing reads. It shows a 30%-60% improvement over the optimized POWER7 one, which uses only aligned accesses.
2015-01-13Fix wake-up in sysdeps/nptl/fork.c.Torvald Riegel1-1/+1
2015-01-12Fix ldbl-96 scalblnl underflowing results (bug 17803).Joseph Myers1-4/+4
The ldbl-96 implementation of scalblnl (used for x86_64 and ia64) uses a condition k <= -63 to determine when a standard underflowing result tiny*__copysignl(tiny,x) should be returned. However, that condition corresponds to values with exponent -16446 or less, and in the case of -16446, the correct result for round-to-nearest depends on whether the value is exactly 0x1p-16446 (half the least subnormal) or more than that. This patch fixes the bug by changing the condition to k <= -64 and accordingly adjusting the exponent by 64 not 63 when converting to a normal value.
Tested for x86_64.
[BZ #17803]
* sysdeps/ieee754/ldbl-96/s_scalblnl.c (twom63): Rename to twom64. Adjust value to 0x1p-64L.
(__scalblnl): Only return standard underflowing result for K <= -64 not K <= -63; adjust exponent for underflowing result by 64 not 63.
* math/libm-test.inc (scalbn_test_data): Add more tests.
(scalbln_test_data): Likewise.
2015-01-12Fix ldbl-96 scalblnl for subnormal arguments (bug 17834).Joseph Myers1-2/+2
The ldbl-96 implementation of scalblnl (used for x86_64 and ia64) is incorrect for subnormal arguments (this is a separate bug from bug 17803, which is about underflowing results). There are two problems with the adjustments of subnormal arguments: the "two63" variable multiplied by is actually 0x1p52L not 0x1p63L, so is insufficient to make values normal, and then GET_LDOUBLE_EXP(es,x), used to extract the new exponent, extracts it into a variable that isn't used, while the value taken to be the new exponent is wrongly taken from the high part of the mantissa before the adjustment (hx). This patch fixes both those problems and adds appropriate tests.
Tested for x86_64.
[BZ #17834]
* sysdeps/ieee754/ldbl-96/s_scalblnl.c (two63): Change value to 0x1p63L.
(__scalblnl): Get new exponent of adjusted subnormal value from ES not HX.
* math/libm-test.inc (scalbn_test_data): Add more tests.
(scalbln_test_data): Likewise.
2015-01-12Add x86 32 bit vDSO time function supportAdhemerval Zanella16-118/+313
Linux 3.15 adds support for the clock_gettime, gettimeofday, and time vDSO symbols (commit id 37c975545ec63320789962bf307f000f08fabd48). This patch adds GLIBC support for using these symbols when they are available. Along with x86 vDSO support, this patch cleans up the x86_64 code by moving all common code to the x86 common folder. Only init-first.c is different between implementations.
2015-01-12powerpc: Fix Copyright dates and CL entryAdhemerval Zanella14-14/+14
This patch fixes the copyright dates from files created by commit 8d2c0a5, 4b45943, and 56cf276.
2015-01-12powerpc: abort transaction in syscallsAdhemerval Zanella7-2/+74
The Linux kernel powerpc documentation states that issuing a syscall inside a transaction is not recommended and may lead to undefined behavior. It also states that syscalls do not abort transactions, nor do they run in transactional state. To avoid side effects being visible outside transactions, GLIBC with lock elision enabled will issue a transaction abort instruction just before all syscalls if the hardware supports hardware transactions.
2015-01-12powerpc: Add adaptive elision to rwlocksAdhemerval Zanella3-3/+119
This patch adds support for lock elision using ISA 2.07 hardware transactional memory for rwlocks. The logic is similar to the one presented in pthread_mutex lock elision.
2015-01-12powerpc: Add the lock elision using HTMAdhemerval Zanella15-5/+672
This patch adds support for lock elision using ISA 2.07 hardware transactional memory instructions for pthread_mutex primitives. Similar to the s390 version, the force-elision logic defined in 'force-elision.h' is only enabled if ENABLE_LOCK_ELISION is defined. Also, the lock elision code should be buildable even with a compiler that does not provide HTM support with builtins. However, I have noted that the performance is sub-optimal due to scheduling pressure.
2015-01-09Fix shm-directory.h #include.Roland McGrath1-1/+1
2015-01-09MicroBlaze: Fix BZ17791 - Remove fixed page size macros and othersMatthew Fortune1-8/+0
Microblaze apparently has a variable page size (see thread below) and should not hard-code any page-size related macros. Also remove macros that are only used for BFD's trad-core support, which is also not relevant for microblaze according to the thread starting here: https://sourceware.org/ml/libc-ports/2013-11/msg00028.html
This patch is neither built nor tested but mirrors a MIPS patch that fixes the same issue.
Thanks, Matthew
* sysdeps/unix/sysv/linux/microblaze/sys/user.h (PAGE_SHIFT, PAGE_SIZE, PAGE_MASK, NBPG, UPAGES): Remove.
(HOST_TEXT_START_ADDR, HOST_STACK_END_ADDR): Remove.
Signed-off-by: David Holsgrove <david.holsgrove@xilinx.com>
2015-01-09MicroBlaze: Remove custom lowlevellock.h.Torvald Riegel1-303/+0
2015-01-06 Torvald Riegel <triegel@redhat.com>
* sysdeps/unix/sysv/linux/microblaze/lowlevellock.h: Delete file.
Signed-off-by: Torvald Riegel <triegel@redhat.com>
Signed-off-by: David Holsgrove <david.holsgrove@xilinx.com>
2015-01-09MicroBlaze: Remove custom pthread_once implementation on microblaze.Torvald Riegel1-89/+0
2015-01-06 Torvald Riegel <triegel@redhat.com>
* sysdeps/unix/sysv/linux/microblaze/pthread_once.c: Delete file.
Signed-off-by: Torvald Riegel <triegel@redhat.com>
Signed-off-by: David Holsgrove <david.holsgrove@xilinx.com>
2015-01-09MicroBlaze: Avoid pointer to integer conversion warningDavid Holsgrove1-2/+2
2015-01-06 David Holsgrove <david.holsgrove@xilinx.com>
* sysdeps/microblaze/jmpbuf-unwind.h (_jmpbuf_sp): Declare SP as void pointer and cast to uintptr_t.
Signed-off-by: David Holsgrove <david.holsgrove@xilinx.com>
2015-01-09MicroBlaze: Fix volatile-register-var warning in READ_THREAD_POINTERDavid Holsgrove1-8/+3
Resolves warning: 'optimization may eliminate reads and/or writes to register variables'
2015-01-06 David Holsgrove <david.holsgrove@xilinx.com>
* sysdeps/microblaze/nptl/tls.h: Remove inline __microblaze_get_thread_area and update READ_THREAD_POINTER.
Signed-off-by: David Holsgrove <david.holsgrove@xilinx.com>
2015-01-09MicroBlaze: Fix integer-pointer conversion warningDavid Holsgrove1-1/+1
2015-01-06 David Holsgrove <david.holsgrove@xilinx.com>
* sysdeps/microblaze/nptl/tls.h (TLS_INIT_TP): Use NULL instead of 0.
Signed-off-by: David Holsgrove <david.holsgrove@xilinx.com>