Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
|
|
The symbols is not present in current POSIX specification and compiler
already generates memmove call.
|
|
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
|
|
This patch was initially based on the __memmove_power7 with some ideas
from strncpy implementation for Power 9.
Improvements from __memmove_power7:
1. Use lxvl/stxvl for alignment code.
The code for Power 7 uses branches when the input is not naturally
aligned to the width of a vector. The new implementation uses
lxvl/stxvl instead which reduces pressure on GPRs. It also allows
the removal of branch instructions, implicitly removing branch stalls
and mispredictions.
2. Use of lxv/stxv and lxvl/stxvl pair is safe to use on Cache Inhibited
memory.
On Power 10 vector load and stores are safe to use on CI memory for
addresses unaligned to 16B. This code takes advantage of this to
do unaligned loads.
The unaligned loads don't have a significant performance impact by
themselves. However doing so decreases register pressure on GPRs
and interdependence stalls on load/store pairs. This also improved
readability as there are now less code paths for different alignments.
Finally this reduces the overall code size.
3. Improved performance.
This version runs on average about 30% better than memmove_power7
for lengths larger than 8KB. For input lengths shorter than 8KB
the improvement is smaller, it has on average about 17% better
performance.
This version has a degradation of about 50% for input lengths
in the 0 to 31 bytes range when dest is unaligned.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
|
|
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
|
|
|
|
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:
sed -ri '
s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
$(find $(git ls-files) -prune -type f \
! -name '*.po' \
! -name 'ChangeLog*' \
! -path COPYING ! -path COPYING.LIB \
! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
! -path manual/texinfo.tex ! -path scripts/config.guess \
! -path scripts/config.sub ! -path scripts/install-sh \
! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
! -path INSTALL ! -path locale/programs/charmap-kw.h \
! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
! '(' -name configure \
-execdir test -f configure.ac -o -f configure.in ';' ')' \
! '(' -name preconfigure \
-execdir test -f preconfigure.ac ';' ')' \
-print)
and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:
chmod a+x sysdeps/unix/sysv/linux/riscv/configure
# Omit irrelevant whitespace and comment-only changes,
# perhaps from a slightly-different Autoconf version.
git checkout -f \
sysdeps/csky/configure \
sysdeps/hppa/configure \
sysdeps/riscv/configure \
sysdeps/unix/sysv/linux/csky/configure
# Omit changes that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
git checkout -f \
sysdeps/powerpc/powerpc64/ppc-mcount.S \
sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
# Omit change that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
|
|
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
When the .c/.S file neither uses nor modifies macros defined in
sysdep.h there is no point to #include it. The same goes for
math_ldbl_opt.h except that it includes shlib-compat.h, and if
compat_symbol is redefined we need to include shlib-compat.h first.
* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-power8.S: Don't
include sysdep.h.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memchr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-a2.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-cell.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/mempcpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memrchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memrchr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/rawmemchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpcpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcasecmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcasecmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcasecmp_l-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcasestr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchr-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchrnul-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchrnul-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power9.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcspn-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strlen-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strlen-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strlen-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncase-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power9.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strnlen-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strnlen-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strrchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strrchr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strspn-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-ppc64.S: Don't
include sysdep.h and math_ldbl_opt.h.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-power5+.S: Don't
include sysdep.h and math_ldbl_opt.h. Include shlib-compat.h.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power5.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6x.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power6x.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power6x.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_round-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_round-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc-ppc64.S: Likewise.
|
|
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power4.S: Define the
implementation-specific function name and remove unneeded
macros definition.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcmp.S: Set a default function
name if not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/power7/memcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memmove.S: Likewise.
|
|
|
|
|
|
|
|
This patch adds an optimized memmove optimization for POWER7/powerpc64.
Basically the idea is to use the memcpy for POWER7 on non-overlapped
memory regions and a optimized backward memcpy for memory regions
that overlap (similar to the idea of string/memmove.c).
The backward memcpy algorithm used is similar the one use for memcpy for
POWER7, with adjustments done for alignment. The difference is memory
is always aligned to 16 bytes before using VSX/altivec instructions.
|