Age | Commit message | Author | Files | Lines
2020-02-19 | PPC64: Attach SIMD attribute to cosf, sin, sinf function declarations. (tuliom/libmvec) | Bert Tenjy | 1 | -0/+6
These changes were mistakenly left out of the patches that added SIMD versions of these functions to libmvec. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | PPC64: Add libmvec SIMD double-precision power function [BZ #24210] | Shawn Landden | 10 | -9/+688
Based on the ./sysdeps/ieee754/dbl-64/pow.c implementation, and provides identical results. Unlike other libmvec functions, this sets the underflow and overflow bits. The caller can check these flags and, if needed, re-run the calculations with scalar pow to figure out what is causing the overflow or underflow. I may not have normalized the benchmark data properly, but operating only on integers between 0 and 2^32 and floats between 0.5 and 1 I get the following, running 20 times over 32MiB:
vector: mean 535.824919 (sd 0.246088)
scalar: mean 286.384220 (sd 0.027630)
This is a speedup of roughly 1.9x. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | PPC64: Add libmvec SIMD single-precision power function [BZ #24210] | Shawn Landden | 8 | -2/+357
Based on the ./sysdeps/ieee754/flt-32/powf.c implementation, and thus provides identical results. Unlike other libmvec functions, this sets the underflow and overflow bits. The caller can check these flags and, if needed, re-run the calculations with scalar powf to figure out what is causing the overflow or underflow. I may not have normalized the benchmark data properly, but operating only on floats between 0.5 and 1 I get the following, running 20 times over 32MiB:
vector: mean 307.659767 (sd 0.203217)
scalar: mean 221.837088 (sd 0.032256)
And with random data there is a decrease in performance:
vector: mean 265.366371 (sd 0.000626)
scalar: mean 279.598078 (sd 0.025592)
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | powerpc64: Add support for vec_cmpne for older compilers | Tulio Magno Quites Machado Filho | 2 | -0/+22
vec_cmpne was added to GCC 7, requiring an alternative implementation when building glibc with GCC 6.
2020-02-19 | PPC64: Add libmvec SIMD double-precision natural exponent function [BZ #24209] | Shawn Landden | 12 | -12/+471
Passes all tests. Unlike other libmvec functions, this sets the underflow and overflow bits. The caller can check these flags and, if needed, re-run the calculations with scalar exp to figure out what is causing the overflow or underflow. The special-case path is not vectorized, and performs much worse than the scalar code. Running 20 times over 32MiB.
Normalized data (1 to 2^32 converted to double):
vector: mean 563.807107 MiB/s (sd 0.390922)
scalar: mean 226.527824 MiB/s (sd 0.077406)
Random data:
vector: mean 80.175986 MiB/s (sd 1.110948)
scalar: mean 244.738130 MiB/s (sd 0.029561)
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | PPC64: Add libmvec SIMD single-precision natural exponent function [BZ #24209] | Shawn Landden | 10 | -11/+321
Passes all tests. Based on the ./sysdeps/ieee754/dbl-64/e_exp.c implementation, and thus provides identical results. Unlike other libmvec functions, this sets the underflow and overflow bits. The caller can check these flags and, if needed, re-run the calculations with scalar expf to figure out what is causing the overflow or underflow. Surprisingly, the special-case path performs as well as the normal path (both of which are vectorized). Running 20 times over 32MiB:
vector: mean 432.263032 MiB/s (sd 0.486733)
scalar: mean 178.646197 MiB/s (sd 0.050013)
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | powerpc64: Fix libmvec's logf4 build on GCC < 8 | Tulio Magno Quites Machado Filho | 1 | -0/+11
The built-in vec_float was added to GCC 8.0, requiring an alternative implementation when using older GCC versions.
2020-02-19 | PPC64: Add libmvec SIMD single-precision logarithm function [BZ #24208] | Bert Tenjy | 10 | -3/+351
Implements single-precision vector logarithm function. The algorithm is an adaptation of the one in sysdeps/ieee754/flt-32/e_logf.c, modified for PPC64 VSX hardware. The version of e_logf.c referenced here is from commit #bf27d3973d. The patch has been tested on both Little-Endian and Big-Endian. It passes all the tests for single-precision logarithm run by make check with max ULP of 1. Integration into the make check infrastructure is adapted from similar x86_64 changes in commit #774488f88a. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | PPC64: Add libmvec SIMD double-precision logarithm function [BZ #24208] | Bert Tenjy | 9 | -3/+812
Implements double-precision vector logarithm function. The algorithm is an adaptation of the one in sysdeps/ieee754/dbl-64, modified to exploit PPC64 VSX hardware. The version of ieee754/dbl-64 is commit #f41b0a43e4. The patch has been tested on both Little-Endian and Big-Endian. It passes all the tests for double-precision logarithm run by make check. Integration into the make check infrastructure closely follows corresponding changes done for x86_64 in commit #6af25acc7b. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | powerpc64: Fix mathvec build and tests on POWER < 8 | Tulio Magno Quites Machado Filho | 2 | -4/+4
vec_d_cos2_vsx.c, vec_d_sin2_vsx.c and vec_d_sincos2_vsx.c use vec_sl(), which is only available on POWER8 processors.
2020-02-19 | PPC64: Add libmvec SIMD single-precision sincosf function [BZ #24207] | Bert Tenjy | 9 | -3/+249
Implements single-precision vector sincosf function. The polynomial approximating algorithm is adapted for PPC64 from x86_64 [commit #a6336cc446]. The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. Testing uses the framework created for libmvec on x86_64 which runs tests on issuing 'make check'. Tests of the new vector sincosf function all pass. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | PPC64: Add libmvec SIMD double-precision sincos function [BZ #24207] | Bert Tenjy | 9 | -2/+216
Implements double-precision vector sincos function. The polynomial approximating algorithm is adapted for PPC64 from x86_64 [commit #c9a8c526ac]. The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. Testing uses the framework created for libmvec on x86_64 which runs tests on issuing 'make check'. Tests of the new vector sincos function all pass. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | PPC64: Add libmvec SIMD single-precision sine function [BZ #24206] | Bert Tenjy | 8 | -15/+139
Implements single-precision vector sine function. The polynomial sine-approximating algorithm is adapted for PPC64 from x86_64 [commit #2a8c2c7b33]. The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. Testing uses the framework created for libmvec on x86_64 which runs tests on issuing 'make check'. Tests of the new vector single-precision sine function all pass. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | PPC64: Add libmvec SIMD double-precision sine function [BZ #24206] | Bert Tenjy | 8 | -19/+140
Implements double-precision vector sine function. The polynomial sine-approximating algorithm is adapted for PPC64 from x86_64 [commit #4b9c2b707b]. The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. Testing uses the framework created for libmvec on x86_64 which runs tests on issuing 'make check'. Tests of the new vector sine function all pass. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | PPC64: Add libmvec SIMD single-precision cosine function [BZ #24205] | Bert Tenjy | 8 | -5/+226
Implements single-precision cosine using VSX vector capability. The polynomial cosine-approximating algorithm is adapted for PPC64 from x86_64 [commit #04f496d602]. The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. It is tested using the framework created for libmvec on x86_64 which runs tests on issuing 'make check'. Tests of the new vector cosine function all pass. Details on the ABI are found at this link: <https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt> Except for the adjusted operand width, the details described for the double-precision cosine implemented earlier apply here. See git commit #7956c29f07 for that information. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | PPC64: Add libmvec SIMD double-precision cosine function [BZ #24205] | Bert Tenjy | 12 | -0/+295
This is the 1st of 12 patches that will implement libmvec for PPC64 using VSX hardware capabilities. Implements double-precision cosine using VSX vector capability. Algorithm for cosine is from x86_64 [commit #2193311288] adapted to PPC64. Name-mangling exactly duplicates SSE ISA of the x86_64 ABI. The details are at <https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt> The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. It is tested using the framework created for libmvec on x86_64 which runs tests on issuing 'make check'. Tests of the new vector cosine function all pass. Library libmvec is built by default. To disable building it, pass flag --disable-mathvec to the configure script. A runtime check prevents vector tests running on systems lacking VSX hardware. Glibc built with this patch was installed using the procedure outlined at <https://sourceware.org/glibc/wiki/Testing/Builds>. Compiling against the new library created a test executable which computes cosines using the vector version of the function. The results are at most 2 ulps away from the scalar cosine. That is expected and indicated in the comments describing the algorithm, as obtained from x86_64 commit #2193311288. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-19 | Fix tst-pkey expectations on pkey_get [BZ #23202] | Lucas A. M. Magalhaes | 1 | -4/+6
According to the GNU C Library manual, pkey_set can receive a combination of PKEY_DISABLE_WRITE and PKEY_DISABLE_ACCESS. However, PKEY_DISABLE_ACCESS is more restrictive than PKEY_DISABLE_WRITE and includes its behavior. The test expects that after setting (PKEY_DISABLE_WRITE|PKEY_DISABLE_ACCESS), pkey_get returns the same value. This may not be true, as PKEY_DISABLE_ACCESS alone is sufficient to describe the state of the key in this case. The pkey behavior during signal handling is different between x86 and POWER. This change makes the test compatible with both architectures. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
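The relaxed expectation described above can be sketched as a small comparison helper. This is not the code from tst-pkey: `demo_rights_match` and the `DEMO_*` constants are hypothetical names used for illustration, and a real program would take `PKEY_DISABLE_ACCESS` and `PKEY_DISABLE_WRITE` from `<sys/mman.h>`.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the PKEY_DISABLE_* bits.  */
enum { DEMO_DISABLE_ACCESS = 0x1, DEMO_DISABLE_WRITE = 0x2 };

/* Return true if the rights reported back are an acceptable
   description of the rights that were requested.  Because disabling
   access already implies disabling writes, a reported value may omit
   the write bit whenever the access bit was requested.  */
bool demo_rights_match(unsigned int requested, unsigned int reported)
{
    if (requested & DEMO_DISABLE_ACCESS)
        return (reported & DEMO_DISABLE_ACCESS) != 0;
    return reported == requested;
}
```

With this rule, a request for both bits accepts either "access" alone or "access and write" as the reported state, which is the kind of tolerance the commit message argues for.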
2020-02-18 | y2038: linux: Provide __gettimeofday64 implementation | Lukasz Majewski | 4 | -3/+49
In glibc, gettimeofday can use the vDSO (on power and x86, USE_IFUNC_GETTIMEOFDAY is defined), the gettimeofday syscall, or the 'default' ___gettimeofday() from ./time/gettime.c (as a fallback). In this patch the last function (___gettimeofday) has been refactored and moved to ./sysdeps/unix/sysv/linux/gettimeofday.c to be Linux specific. A new explicit 64-bit function, __gettimeofday64, has been introduced for getting 64-bit time from the kernel (by internally calling __clock_gettime64). Moreover, the 32-bit version, __gettimeofday, has been refactored to internally use __gettimeofday64. __gettimeofday is now supposed to be used on systems still supporting 32-bit time (__TIMESIZE != 64), hence the necessary check for potential time_t overflow and the conversion of struct __timeval64 to 32 bit struct timespec. The IFUNC vDSO direct call optimization has been removed from both i686 and powerpc32 (USE_IFUNC_GETTIMEOFDAY is no longer defined for those architectures). The Linux kernel does not provide a y2038-safe implementation of gettimeofday, nor does it plan to provide one in the future; clock_gettime64 should be used instead. Keeping support for this optimization would require handling another build permutation (!__ASSUME_TIME64_SYSCALLS && USE_IFUNC_GETTIMEOFDAY), which adds more complexity and has limited use (since the idea is to eventually have a y2038-safe glibc build).
Build tests: ./src/scripts/build-many-glibcs.py glibcs
Run-time tests: run specific tests on ARM/x86 32-bit systems (qemu): https://github.com/lmajewski/meta-y2038 and run tests: https://github.com/lmajewski/y2038-tests/commits/master
The above tests were performed with Y2038 redirection applied as well as without it, to test proper usage of both __gettimeofday64 and __gettimeofday. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> [Including some commit message improvement]
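The overflow check mentioned above, where a 32-bit wrapper narrows the result of a 64-bit call, might look like this sketch. The struct and function names (`demo_timeval64`, `demo_timeval32`, `demo_narrow_timeval`) are assumptions made for illustration and are not glibc's internal types. The sketch returns an error code, whereas the real wrapper may report failure differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit and 32-bit time representations.  */
struct demo_timeval64 { int64_t tv_sec; int64_t tv_usec; };
struct demo_timeval32 { int32_t tv_sec; int32_t tv_usec; };

/* Narrow a 64-bit time value to 32 bits.  Returns 0 on success or
   EOVERFLOW when the seconds field does not fit in 32 bits, in which
   case *out is left untouched.  */
int demo_narrow_timeval(const struct demo_timeval64 *in,
                        struct demo_timeval32 *out)
{
    assert(in != NULL && out != NULL);

    if (in->tv_sec > INT32_MAX || in->tv_sec < INT32_MIN)
        return EOVERFLOW;

    out->tv_sec = (int32_t) in->tv_sec;
    out->tv_usec = (int32_t) in->tv_usec;   /* Microseconds stay below 1000000.  */
    return 0;
}
```

The first second that fails this check is 2147483648, which corresponds to the year-2038 rollover of a signed 32-bit time_t.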
2020-02-18 | Linux: Work around kernel bugs in chmod on /proc/self/fd paths [BZ #14578] | Florian Weimer | 2 | -63/+45
It appears that the ability to change symbolic link modes through such paths is unintended. On several file systems, the operation fails with EOPNOTSUPP, even though the symbolic link permissions are updated. The expected behavior is a failure to update the permissions, without file system changes. Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
2020-02-18 | Introduce <elf-initfini.h> and ELF_INITFINI for all architectures | Florian Weimer | 26 | -105/+330
This supersedes the init_array sysdeps directory. It allows us to check for ELF_INITFINI in both C and assembler code, and skip DT_INIT and DT_FINI processing completely on newer architectures. A new header file is needed because <dl-machine.h> is incompatible with assembler code. <sysdep.h> is compatible with assembler code, but it cannot be included in all assembler files because on some architectures, it redefines register names, and some assembler files conflict with that. <elf-initfini.h> is replicated for legacy architectures which need DT_INIT/DT_FINI support. New architectures follow the generic default and disable it.
2020-02-18 | mips: Fix backtrace result for signal frames | Adhemerval Zanella | 3 | -0/+102
The MIPS fallback code handles a frame whose FDE cannot be obtained (for instance a signal frame) by reading the kernel-allocated signal frame and adding '2' to the value of 'sc_pc' [1]. The added value is used to recognize the end of an EH region on mips16 [2]. The fix adjusts the obtained signal frame value and removes the value added by libgcc by checking whether the previous frame is a signal frame. Checked with backtrace and tst-sigcontext-get_pc tests on mips-linux-gnu and mips64-linux-gnu. [1] libgcc/config/mips/linux-unwind.h from gcc code. [2] gcc/config/mips/mips.h from gcc code.
2020-02-18 | Move implementation of <file_change_detection.h> into a C file | Florian Weimer | 7 | -125/+174
file_change_detection_for_stat partially initializes struct file_change_detection in some cases, when the size member alone determines the outcome of all comparisons. This results in maybe-uninitialized compiler warnings in case of sufficiently aggressive inlining. Once the implementation is moved into a separate C file, this kind of inlining is no longer possible, so the compiler warnings are gone.
2020-02-18 | <fd_to_filename.h>: Add type safety and port to Hurd | Florian Weimer | 9 | -33/+206
The new type struct fd_to_filename makes the allocation of the backing storage explicit. Hurd uses /dev/fd, not /proc/self/fd. Co-Authored-By: Paul Eggert <eggert@cs.ucla.edu>
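The idea of a dedicated struct that owns the backing storage can be sketched as follows. This is a simplified illustration rather than the glibc header: `demo_fd_name` and `demo_fd_to_name` are hypothetical names, the sketch uses `snprintf` for brevity, and it hard-codes the Linux-style `/proc/self/fd` prefix.

```c
#include <assert.h>
#include <stdio.h>

/* Caller-provided storage, sized for the prefix, the terminating NUL
   and a generous bound on the decimal digits of an int.  */
struct demo_fd_name {
    char buffer[sizeof("/proc/self/fd/") + 3 * sizeof(int)];
};

/* Format the path for FD into *STORAGE and return a pointer to it.
   The returned pointer is valid for as long as *STORAGE is.  */
const char *demo_fd_to_name(int fd, struct demo_fd_name *storage)
{
    assert(fd >= 0);
    snprintf(storage->buffer, sizeof storage->buffer,
             "/proc/self/fd/%d", fd);
    return storage->buffer;
}
```

Because the buffer lives in a struct the caller declares, the function signature itself documents where the result is stored and how large it must be, instead of relying on a bare `char *` of unstated size.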
2020-02-17 | Prepare redirections for IEEE long double on powerpc64le | Gabriel F. T. Gomes | 14 | -43/+159
All functions that have a format string, which can consume a long double argument, must have one version for each long double format supported on a platform. On powerpc64le, these functions currently have two versions (i.e.: long double with the same format as double, and long double with IBM Extended Precision format). Support for a third long double format option (i.e. long double with IEEE long double format) is being prepared and all the aforementioned functions now have a third version (not yet exported on the master branch, but the code is in). For these functions to get selected (during build time), references to them in user programs (or dependent libraries) must get redirected to the aforementioned new versions of the functions. This patch installs the header magic required to perform such redirections. Notice, however, that since the redirections only happen when __LONG_DOUBLE_USES_FLOAT128 is set to 1, and no platform (including powerpc64le) currently does it, no redirections actually happen. Redirections and the exporting of the new functions will happen at the same time (when powerpc64le adds ldbl-128ibm-compat to its Implies). Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> Reviewed-by: Paul E. Murphy <murphyp@linux.vnet.ibm.com>
2020-02-17 | conform/conformtest.py: Extend tokenizer to cover character constants | Florian Weimer | 1 | -6/+5
Such constants are used in __USE_EXTERN_INLINES blocks.
2020-02-17 | stdlib: Reduce namespace pollution in <inttypes.h> | Florian Weimer | 1 | -24/+24
The namespace pollution results in conform test failures if the tests are run with __USE_EXTERN_INLINES defined (e.g., when configuring with CC="gcc -O3" CXX="g++ -O3").
2020-02-17 | x86: Avoid single-argument _Static_assert in <tls.h> | Florian Weimer | 3 | -24/+36
Older GCC versions do not support this extension. Fixes commit f1bdee61797 ("x86 tls: Use _Static_assert for TLS access size assertion").
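For illustration, the portable alternative is the two-argument form of `_Static_assert` from C11, which always carries a message string. The sketch below is not the code from `<tls.h>`: `struct demo_tcb`, `DEMO_CHECK_SIZE` and the message text are hypothetical.

```c
#include <stddef.h>

/* Hypothetical thread control block with members of different sizes.  */
struct demo_tcb {
    char flag;
    int counter;
    void *self;
};

/* Compile-time check that a member has a size the access code can
   handle.  The second argument (the message) is what older compilers
   require; omitting it is only accepted by newer ones.  */
#define DEMO_CHECK_SIZE(member)                                       \
    _Static_assert(sizeof(((struct demo_tcb *) 0)->member) == 1       \
                   || sizeof(((struct demo_tcb *) 0)->member) == 4    \
                   || sizeof(((struct demo_tcb *) 0)->member) == 8,   \
                   "size of per-thread data")

DEMO_CHECK_SIZE(flag);
DEMO_CHECK_SIZE(counter);
DEMO_CHECK_SIZE(self);
```

If a member of an unsupported size is passed, compilation fails with the given message instead of producing a wrong access at run time.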
2020-02-17 | x86 tls: Use _Static_assert for TLS access size assertion | Samuel Thibault | 3 | -78/+60
2020-02-17 | htl: Link internal htl tests against libpthread | Samuel Thibault | 1 | -1/+1
2020-02-16 | pthread: Fix building tst-robust8 with nptl | Samuel Thibault | 2 | -3/+5
NPTL's pthreadP.h needs internal definitions
2020-02-16 | pthread: Move robust mutex tests from nptl to sysdeps/pthread | Samuel Thibault | 15 | -8/+21
tst-robust8.c prints some mutex internals for nptl debugging; this needed to be made conditional on being built with nptl.
2020-02-16 | htl: Remove stub warning for pthread_mutexattr_setpshared | Samuel Thibault | 1 | -1/+0
It actually is implemented.
2020-02-16 | htl: Add missing functions and defines for robust mutexes | Samuel Thibault | 3 | -0/+12
2020-02-15 | htl: Only check pthread_self coherency when DEBUG is set | Samuel Thibault | 1 | -0/+4
htl has been widely tested for a long time now with this coherency checked successfully.
2020-02-15 | hurd: Add THREAD_GET/SETMEM/_NC | Samuel Thibault | 2 | -5/+114
Store them in the TCB, and use them for accessing _hurd_sigstate.
2020-02-15 | hurd tls: update comment about fields at the end of tcbhead | Samuel Thibault | 1 | -2/+2
2020-02-15 | ld.so: Do not export free/calloc/malloc/realloc functions [BZ #25486] | Florian Weimer | 60 | -266/+202
Exporting functions and relying on symbol interposition from libc.so makes the choice of implementation dependent on DT_NEEDED order, which is not what some compiler drivers expect. This commit replaces one magic mechanism (symbol interposition) with another one (preprocessor-/compiler-based redirection). This makes the hand-over from the minimal malloc to the full malloc more explicit. Removing the ABI symbols is backwards-compatible because libc.so is always in scope, and the dynamic loader will find the malloc-related symbols there since commit f0b2132b35248c1f4a80f62a2c38cddcc802aa8c ("ld.so: Support moving versioned symbols between sonames [BZ #24741]"). Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-02-15 | Remove weak declaration of free from <inline-hashtab.h> | Florian Weimer | 1 | -8/+3
elf/dl-minimal.c provides a definition of free, so the function pointer is always non-null, even before the final relocation of the loader. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-02-15 | elf: Extract _dl_sym_post, _dl_sym_find_caller_map from elf/dl-sym.c | Florian Weimer | 2 | -82/+110
The definitions are moved into a new file, elf/dl-sym-post.h, so that this code can be used by the dynamic loader as well. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-02-15 | elf: Introduce the rtld-stubbed-symbols makefile variable | Florian Weimer | 1 | -9/+13
This generalizes a mechanism used for stack-protector support, so that it can be applied to other symbols if required. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-02-15 | arm: fix use of INTERNAL_SYSCALL_CALL | Andreas Schwab | 1 | -1/+1
Remove extra argument from INTERNAL_SYSCALL_CALL macro call. Fixes commit bc2eb9321e ("linux: Remove INTERNAL_SYSCALL_DECL").
2020-02-14 | linux: Remove INTERNAL_SYSCALL_DECL | Adhemerval Zanella | 107 | -546/+425
With all Linux ABIs using the expected Linux kABI to indicate syscall errors, INTERNAL_SYSCALL_DECL is an empty declaration on all ports. This patch removes the 'err' argument from the INTERNAL_SYSCALL* macros and removes the INTERNAL_SYSCALL_DECL usage. Checked with a build against all affected ABIs.
2020-02-14 | nptl: Remove unused pthread-errnos.h rule | Adhemerval Zanella | 2 | -15/+1
2020-02-14 | linux: Consolidate INLINE_SYSCALL | Adhemerval Zanella | 33 | -486/+60
With all Linux ABIs using the expected Linux kABI to indicate syscall errors, there is no need to replicate INLINE_SYSCALL. The generic Linux sysdep.h includes errno.h even for !__ASSEMBLER__, which is ok now and allows cleaning up some archaic code that assumes otherwise. Checked with a build against all affected ABIs.
2020-02-14 | s390: Consolidate Linux syscall definition | Adhemerval Zanella | 3 | -192/+103
The {INTERNAL,INLINE}_SYSCALL are defined only on s390 sysdep.h. Checked on s390x-linux-gnu and s390-linux-gnu.
2020-02-14 | riscv: Avoid clobbering register parameters in syscall | Adhemerval Zanella | 1 | -28/+56
The riscv INTERNAL_SYSCALL macro might clobber the register parameter if the argument itself might clobber any register (a function call for instance). This patch fixes it by using temporary variables for the expressions between the register assignments (as indicated by GCC documentation, 6.47.5.2 Specifying Registers for Local Variables). It is similar to the fix done for MIPS (bug 25523). Checked with riscv64-linux-gnu-rv64imafdc-lp64d build.
2020-02-14 | microblaze: Avoid clobbering register parameters in syscall | Adhemerval Zanella | 1 | -35/+56
The microblaze INTERNAL_SYSCALL macro might clobber the register parameter if the argument itself might clobber any register (a function call for instance). This patch fixes it by using temporary variables for the expressions between the register assignments (as indicated by GCC documentation, 6.47.5.2 Specifying Registers for Local Variables). It is similar to the fix done for MIPS (bug 25523). Checked with microblaze-linux-gnu and microblazeel-linux-gnu build.
2020-02-14 | nios2: Use Linux kABI for syscall return | Adhemerval Zanella | 1 | -5/+5
It changes the nios INTERNAL_SYSCALL_RAW macro to return a negative value instead of the 'r2' register value on the 'err' macro argument. The macro INTERNAL_SYSCALL_DECL is no longer required, and the INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs. Checked with a build against nios2-linux-gnu.
2020-02-14 | mips: Use Linux kABI for syscall return | Adhemerval Zanella | 3 | -53/+23
It changes the mips INTERNAL_SYSCALL* and internal_syscall* macros to return a negative value instead of the 'a3' register value in the 'err' macro argument. The macro INTERNAL_SYSCALL_DECL is no longer required, and the INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs. The redefinition of INTERNAL_VSYSCALL_CALL is also no longer required. Checked on mips64-linux-gnu, mips64n32-linux-gnu, and mips-linux-gnu.
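The return convention referred to above encodes failure as a small negative value in the raw syscall result, with the range -4095 to -1 holding the negated errno. A minimal sketch of how a caller can decode such a value follows; the helper names `demo_syscall_error_p` and `demo_syscall_errno` are invented for this example and are not the glibc macros.

```c
#include <stdbool.h>

/* True when RET lies in [-4095, -1], the range reserved for errors.
   Casting to unsigned turns that range into the top 4095 values, so a
   single comparison is enough.  */
bool demo_syscall_error_p(long ret)
{
    return (unsigned long) ret > -4096UL;
}

/* Recover the positive errno value from a failing raw result.  */
int demo_syscall_errno(long ret)
{
    return (int) -ret;
}
```

Since the error indication travels in the return value itself, no separate 'err' out-parameter is needed, which is why the per-port declaration could be dropped.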
2020-02-14 | mips64: Consolidate Linux sysdep.h | Adhemerval Zanella | 4 | -455/+70
The mips64 Linux syscall macros differ only in the argument type and in the requirement to sign-extend values on n32. The headers are consolidated by parameterizing the arguments with a new type, __syscall_arg_t, and by defining ARGIFY for n64. Also, the generic unix mips64 sysdep is essentially the same; only the load instruction needs to be adjusted depending on the ABI. Checked on mips64-linux-gnu and mips64n32-linux-gnu.
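The parameterization described above can be pictured with the sketch below, which is an assumption-laden illustration and not the contents of the consolidated header. `demo_syscall_arg_t`, `demo_argify_n32` and `demo_argify_n64` are hypothetical names: one conversion sign-extends a 32-bit value into a 64-bit register-sized argument, the other passes a 64-bit value through unchanged.

```c
#include <stdint.h>

/* Register-sized argument type shared by both variants.  */
typedef int64_t demo_syscall_arg_t;

/* n32-style conversion: treat the 32-bit value as signed and widen it,
   so that bit 31 is replicated into the upper half.  The cast from
   uint32_t to int32_t relies on the usual two's-complement wraparound,
   which GCC guarantees.  */
demo_syscall_arg_t demo_argify_n32(uint32_t value)
{
    return (demo_syscall_arg_t) (int32_t) value;
}

/* n64-style conversion: the value already fills the register.  */
demo_syscall_arg_t demo_argify_n64(int64_t value)
{
    return value;
}
```

With the conversion isolated in one small function or macro per ABI, the rest of the syscall plumbing can be written once against the shared argument type.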