From e169aff0e9aacdcf466357247f1759f2c84b7fe4 Mon Sep 17 00:00:00 2001 From: Adhemerval Zanella Netto Date: Thu, 21 Jul 2022 10:05:03 -0300 Subject: x86: Add SSE2 optimized chacha20 It adds vectorized ChaCha20 implementation based on libgcrypt cipher/chacha20-amd64-ssse3.S. It replaces the ROTATE_SHUF_2 (which uses pshufb) by ROTATE2 and thus making the original implementation SSE2. As for generic implementation, the last step that XOR with the input is omited. The final state register clearing is also omitted. On a Ryzen 9 5900X it shows the following improvements (using formatted bench-arc4random data): GENERIC MB/s ----------------------------------------------- arc4random [single-thread] 443.11 arc4random_buf(16) [single-thread] 552.27 arc4random_buf(32) [single-thread] 626.86 arc4random_buf(48) [single-thread] 649.81 arc4random_buf(64) [single-thread] 663.95 arc4random_buf(80) [single-thread] 674.78 arc4random_buf(96) [single-thread] 675.17 arc4random_buf(112) [single-thread] 680.69 arc4random_buf(128) [single-thread] 683.20 ----------------------------------------------- SSE MB/s ----------------------------------------------- arc4random [single-thread] 704.25 arc4random_buf(16) [single-thread] 1018.17 arc4random_buf(32) [single-thread] 1315.27 arc4random_buf(48) [single-thread] 1449.36 arc4random_buf(64) [single-thread] 1511.16 arc4random_buf(80) [single-thread] 1539.48 arc4random_buf(96) [single-thread] 1571.06 arc4random_buf(112) [single-thread] 1596.16 arc4random_buf(128) [single-thread] 1613.48 ----------------------------------------------- Checked on x86_64-linux-gnu. --- LICENSES | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'LICENSES') diff --git a/LICENSES b/LICENSES index b1fbfc6..f0117ef 100644 --- a/LICENSES +++ b/LICENSES @@ -390,8 +390,8 @@ Copyright 2001 by Stephen L. Moshier License along with this library; if not, see . */ -sysdeps/aarch64/chacha20-aarch64.S imports code from libgcrypt, with -the following notices: +sysdeps/aarch64/chacha20-aarch64.S and sysdeps/x86_64/chacha20-amd64-sse2.S +imports code from libgcrypt, with the following notices: Copyright (C) 2017-2019 Jussi Kivilinna -- cgit v1.1