author: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> 2022-07-21 10:05:03 -0300
committer: Adhemerval Zanella <adhemerval.zanella@linaro.org> 2022-07-22 11:58:27 -0300
commit: e169aff0e9aacdcf466357247f1759f2c84b7fe4
tree: df2413bece130a87d8b1825aef6dd4e6eea14213
parent: 4c128c7823e5a19058589cfac42aa96de3e15430
x86: Add SSE2 optimized chacha20

It adds a vectorized ChaCha20 implementation based on libgcrypt
cipher/chacha20-amd64-ssse3.S.  It replaces ROTATE_SHUF_2 (which
uses pshufb) with ROTATE2, so the original implementation needs
only SSE2.

As with the generic implementation, the last step that XORs with the
input is omitted.  The final state register clearing is also
omitted.

On a Ryzen 9 5900X it shows the following improvements (using
formatted bench-arc4random data):

GENERIC                                    MB/s
-----------------------------------------------
arc4random [single-thread]               443.11
arc4random_buf(16) [single-thread]       552.27
arc4random_buf(32) [single-thread]       626.86
arc4random_buf(48) [single-thread]       649.81
arc4random_buf(64) [single-thread]       663.95
arc4random_buf(80) [single-thread]       674.78
arc4random_buf(96) [single-thread]       675.17
arc4random_buf(112) [single-thread]      680.69
arc4random_buf(128) [single-thread]      683.20
-----------------------------------------------

SSE                                        MB/s
-----------------------------------------------
arc4random [single-thread]               704.25
arc4random_buf(16) [single-thread]      1018.17
arc4random_buf(32) [single-thread]      1315.27
arc4random_buf(48) [single-thread]      1449.36
arc4random_buf(64) [single-thread]      1511.16
arc4random_buf(80) [single-thread]      1539.48
arc4random_buf(96) [single-thread]      1571.06
arc4random_buf(112) [single-thread]     1596.16
arc4random_buf(128) [single-thread]     1613.48
-----------------------------------------------

Checked on x86_64-linux-gnu.
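
The rotate substitution described in the message can be illustrated with a scalar sketch. This is not the code added by the commit (which is x86_64 assembly operating on four 32-bit lanes at once); it only shows the shift-and-OR form of a rotate, which presumably is what lets ROTATE2 avoid the SSSE3-only pshufb byte shuffle. The names `rotl32` and `quarter_round` are hypothetical helpers chosen for this example, and the round structure follows RFC 8439, section 2.1.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits using only shifts and OR.
   n must be in 1..31, because shifting a 32-bit value by 32 is
   undefined behaviour in C.  A vector version would apply the same
   two shifts and a combine to every lane. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    assert(n > 0 && n < 32);
    return (x << n) | (x >> (32 - n));
}

/* One ChaCha20 quarter round as specified in RFC 8439, section 2.1.
   The rotations by 16 and 8 are whole-byte moves, so they can also be
   done with a byte shuffle; the rotations by 12 and 7 cannot.  Here all
   four use the same shift-based helper. */
static inline void quarter_round(uint32_t *a, uint32_t *b,
                                 uint32_t *c, uint32_t *d)
{
    *a += *b; *d ^= *a; *d = rotl32(*d, 16);
    *c += *d; *b ^= *c; *b = rotl32(*b, 12);
    *a += *b; *d ^= *a; *d = rotl32(*d, 8);
    *c += *d; *b ^= *c; *b = rotl32(*b, 7);
}
```

Calling `quarter_round` on the RFC 8439 section 2.1.1 inputs (a = 0x11111111, b = 0x01020304, c = 0x9b8d6f43, d = 0x01234567) yields a = 0xea2a92f4, b = 0xcb1cf8ce, c = 0x4581472e, d = 0x5881c4bb. Because the commit omits the final XOR with the input, a block function built from such rounds would emit the keystream directly, which is all a random-number generator like arc4random needs.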
Diffstat (limited to 'sysdeps/x86_64/Makefile')
-rw-r--r-- sysdeps/x86_64/Makefile | 6
1 files changed, 6 insertions, 0 deletions
diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile
index c19bef2..3acd975 100644
--- a/sysdeps/x86_64/Makefile
+++ b/sysdeps/x86_64/Makefile
@@ -5,6 +5,12 @@ ifeq ($(subdir),csu)
 gen-as-const-headers += link-defines.sym
 endif
 
+ifeq ($(subdir),stdlib)
+sysdep_routines += \
+  chacha20-amd64-sse2 \
+  # sysdep_routines
+endif
+
 ifeq ($(subdir),gmon)
 sysdep_routines += _mcount
 # We cannot compile _mcount.S with -pg because that would create