author | Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> | 2022-07-21 10:05:03 -0300 |
---|---|---|
committer | Adhemerval Zanella <adhemerval.zanella@linaro.org> | 2022-07-22 11:58:27 -0300 |
commit | e169aff0e9aacdcf466357247f1759f2c84b7fe4 (patch) | |
tree | df2413bece130a87d8b1825aef6dd4e6eea14213 /sysdeps/x86_64/Makefile | |
parent | 4c128c7823e5a19058589cfac42aa96de3e15430 (diff) | |
x86: Add SSE2 optimized chacha20
It adds a vectorized ChaCha20 implementation based on libgcrypt's
cipher/chacha20-amd64-ssse3.S. It replaces ROTATE_SHUF_2 (which
uses pshufb, an SSSE3 instruction) with ROTATE2, so the resulting
implementation requires only SSE2.
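The rotate substitution can be sketched with SSE2 intrinsics. This is a minimal illustration, assuming ROTATE2 is the usual shift-and-combine rotate; the macro and helper names below (ROTL32X4, rotl16_x4, rotl8_x4) are hypothetical and are not taken from the patch. The ChaCha20 rotations by 16 and 8 bits are byte-aligned, which is why a pshufb byte shuffle can do them, but two shifts plus an OR give the same result without SSSE3.

```c
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics only; no SSSE3 header needed */

/* Rotate every 32-bit lane of V left by the constant C (0 < C < 32).
   The two shifted halves occupy disjoint bits, so an OR (or an add)
   recombines them.  Hypothetical macro, for illustration only.  */
#define ROTL32X4(v, c) \
  _mm_or_si128 (_mm_slli_epi32 ((v), (c)), _mm_srli_epi32 ((v), 32 - (c)))

/* Rotate four 32-bit words left by 16 bits.  */
static void
rotl16_x4 (const uint32_t in[4], uint32_t out[4])
{
  __m128i v = _mm_loadu_si128 ((const __m128i *) in);
  _mm_storeu_si128 ((__m128i *) out, ROTL32X4 (v, 16));
}

/* Rotate four 32-bit words left by 8 bits.  */
static void
rotl8_x4 (const uint32_t in[4], uint32_t out[4])
{
  __m128i v = _mm_loadu_si128 ((const __m128i *) in);
  _mm_storeu_si128 ((__m128i *) out, ROTL32X4 (v, 8));
}
```

The shift-based form costs a few more instructions per rotate than a single byte shuffle, which is the trade-off for running on any x86_64 CPU.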
As in the generic implementation, the last step, which XORs the
keystream with the input, is omitted. The final clearing of the
state registers is also omitted.
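What omitting the XOR step means can be shown with a scalar sketch of one ChaCha20 block, following the block function described in RFC 8439. This is not the code from the patch: the function name chacha20_keystream_block and the state layout as a plain array of 16 words are assumptions made for illustration. The function writes the keystream straight to the output buffer, where a cipher would instead XOR it with the input.

```c
#include <stdint.h>
#include <string.h>

#define ROTL32(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

/* ChaCha quarter round on four state words.  */
#define QR(a, b, c, d)                          \
  do {                                          \
    a += b; d ^= a; d = ROTL32 (d, 16);         \
    c += d; b ^= c; b = ROTL32 (b, 12);         \
    a += b; d ^= a; d = ROTL32 (d, 8);          \
    c += d; b ^= c; b = ROTL32 (b, 7);          \
  } while (0)

/* Hypothetical helper: produce one 64-byte keystream block from a
   16-word ChaCha20 state.  A cipher would finish with
   out[i] = in[i] ^ keystream[i]; a random number generator has no
   input, so the keystream itself is the output.  */
static void
chacha20_keystream_block (const uint32_t state[16], uint8_t out[64])
{
  uint32_t x[16];
  memcpy (x, state, sizeof x);

  /* 20 rounds: 10 iterations of column rounds plus diagonal rounds.  */
  for (int i = 0; i < 10; i++)
    {
      QR (x[0], x[4], x[8], x[12]);
      QR (x[1], x[5], x[9], x[13]);
      QR (x[2], x[6], x[10], x[14]);
      QR (x[3], x[7], x[11], x[15]);
      QR (x[0], x[5], x[10], x[15]);
      QR (x[1], x[6], x[11], x[12]);
      QR (x[2], x[7], x[8], x[13]);
      QR (x[3], x[4], x[9], x[14]);
    }

  /* Add the original state and serialize little-endian.  No XOR with
     an input buffer follows.  */
  for (int i = 0; i < 16; i++)
    {
      uint32_t w = x[i] + state[i];
      out[4 * i + 0] = (uint8_t) w;
      out[4 * i + 1] = (uint8_t) (w >> 8);
      out[4 * i + 2] = (uint8_t) (w >> 16);
      out[4 * i + 3] = (uint8_t) (w >> 24);
    }
}
```

Skipping the XOR saves one pass over an input buffer that would otherwise have to be all zeros to obtain the same bytes.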
On a Ryzen 9 5900X it shows the following improvements (using
formatted bench-arc4random data):
GENERIC                                    MB/s
-----------------------------------------------
arc4random [single-thread]               443.11
arc4random_buf(16) [single-thread]       552.27
arc4random_buf(32) [single-thread]       626.86
arc4random_buf(48) [single-thread]       649.81
arc4random_buf(64) [single-thread]       663.95
arc4random_buf(80) [single-thread]       674.78
arc4random_buf(96) [single-thread]       675.17
arc4random_buf(112) [single-thread]      680.69
arc4random_buf(128) [single-thread]      683.20
-----------------------------------------------
SSE                                        MB/s
-----------------------------------------------
arc4random [single-thread]               704.25
arc4random_buf(16) [single-thread]      1018.17
arc4random_buf(32) [single-thread]      1315.27
arc4random_buf(48) [single-thread]      1449.36
arc4random_buf(64) [single-thread]      1511.16
arc4random_buf(80) [single-thread]      1539.48
arc4random_buf(96) [single-thread]      1571.06
arc4random_buf(112) [single-thread]     1596.16
arc4random_buf(128) [single-thread]     1613.48
-----------------------------------------------
Checked on x86_64-linux-gnu.
Diffstat (limited to 'sysdeps/x86_64/Makefile')
-rw-r--r-- | sysdeps/x86_64/Makefile | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile
index c19bef2..3acd975 100644
--- a/sysdeps/x86_64/Makefile
+++ b/sysdeps/x86_64/Makefile
@@ -5,6 +5,12 @@ ifeq ($(subdir),csu)
 gen-as-const-headers += link-defines.sym
 endif
 
+ifeq ($(subdir),stdlib)
+sysdep_routines += \
+  chacha20-amd64-sse2 \
+  # sysdep_routines
+endif
+
 ifeq ($(subdir),gmon)
 sysdep_routines += _mcount
 # We cannot compile _mcount.S with -pg because that would create