diff options
author | H.J. Lu <hjl.tools@gmail.com> | 2015-01-30 06:50:20 -0800 |
---|---|---|
committer | H.J. Lu <hjl.tools@gmail.com> | 2015-01-30 15:37:58 -0800 |
commit | 5f3d0b78e011d2a72f9e88b0e9ef5bc081d18f97 (patch) | |
tree | 8eabf127206283d2421bc40b6bc44e123e346598 /NEWS | |
parent | b658fdd82b4524cf6a39881d092caa23f63d93ac (diff) | |
download | glibc-5f3d0b78e011d2a72f9e88b0e9ef5bc081d18f97.zip glibc-5f3d0b78e011d2a72f9e88b0e9ef5bc081d18f97.tar.gz glibc-5f3d0b78e011d2a72f9e88b0e9ef5bc081d18f97.tar.bz2 |
Use AVX unaligned memcpy only if AVX2 is available
memcpy with unaligned 256-bit AVX register loads/stores are slow on older
processorsl like Sandy Bridge. This patch adds bit_AVX_Fast_Unaligned_Load
and sets it only when AVX2 is available.
[BZ #17801]
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
Set the bit_AVX_Fast_Unaligned_Load bit for AVX2.
* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX_Fast_Unaligned_Load):
New.
(index_AVX_Fast_Unaligned_Load): Likewise.
(HAS_AVX_FAST_UNALIGNED_LOAD): Likewise.
* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Check the
bit_AVX_Fast_Unaligned_Load bit instead of the bit_AVX_Usable bit.
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/memmove.c (__libc_memmove): Replace
HAS_AVX with HAS_AVX_FAST_UNALIGNED_LOAD.
* sysdeps/x86_64/multiarch/memmove_chk.c (__memmove_chk): Likewise.
Diffstat (limited to 'NEWS')
-rw-r--r-- | NEWS | 4 |
1 files changed, 2 insertions, 2 deletions
@@ -17,8 +17,8 @@ Version 2.21 17601, 17608, 17616, 17625, 17630, 17633, 17634, 17635, 17647, 17653, 17657, 17658, 17664, 17665, 17668, 17682, 17702, 17717, 17719, 17722, 17723, 17724, 17725, 17732, 17733, 17744, 17745, 17746, 17747, 17748, - 17775, 17777, 17780, 17781, 17782, 17791, 17793, 17796, 17797, 17803, - 17806, 17834, 17844, 17848, 17868, 17869, 17870, 17885, 17892. + 17775, 17777, 17780, 17781, 17782, 17791, 17793, 17796, 17797, 17801, + 17803, 17806, 17834, 17844, 17848, 17868, 17869, 17870, 17885, 17892. * A new semaphore algorithm has been implemented in generic C code for all machines. Previous custom assembly implementations of semaphore were |