author | Brian Smith <brian@briansmith.org> | 2016-03-01 21:17:37 -1000 |
---|---|---|
committer | Adam Langley <alangley@gmail.com> | 2017-01-16 16:54:13 +0000 |
commit | a26d4c3f439543e45df5d0f381f0a815504255a7 (patch) | |
tree | d8f0241ae5f4bc3d920f773efbe8027847a98823 | |
parent | abb32cc00dd4086f7b2213a5d3ecd223be937831 (diff) | |
Enable stitched x86-64 AES-NI AES-GCM implementation.
Measured on a Skylake processor:
Before:
Did 11373750 AES-128-GCM (16 bytes) seal operations in 1016000us (11194635.8 ops/sec): 179.1 MB/s
Did 2253000 AES-128-GCM (1350 bytes) seal operations in 1016000us (2217519.7 ops/sec): 2993.7 MB/s
Did 453750 AES-128-GCM (8192 bytes) seal operations in 1015000us (447044.3 ops/sec): 3662.2 MB/s
Did 10753500 AES-256-GCM (16 bytes) seal operations in 1016000us (10584153.5 ops/sec): 169.3 MB/s
Did 1898750 AES-256-GCM (1350 bytes) seal operations in 1015000us (1870689.7 ops/sec): 2525.4 MB/s
Did 374000 AES-256-GCM (8192 bytes) seal operations in 1016000us (368110.2 ops/sec): 3015.6 MB/s
After:
Did 11074000 AES-128-GCM (16 bytes) seal operations in 1015000us (10910344.8 ops/sec): 174.6 MB/s
Did 3178250 AES-128-GCM (1350 bytes) seal operations in 1016000us (3128198.8 ops/sec): 4223.1 MB/s
Did 734500 AES-128-GCM (8192 bytes) seal operations in 1016000us (722933.1 ops/sec): 5922.3 MB/s
Did 10394750 AES-256-GCM (16 bytes) seal operations in 1015000us (10241133.0 ops/sec): 163.9 MB/s
Did 2502250 AES-256-GCM (1350 bytes) seal operations in 1016000us (2462844.5 ops/sec): 3324.8 MB/s
Did 544500 AES-256-GCM (8192 bytes) seal operations in 1015000us (536453.2 ops/sec): 4394.6 MB/s
Change-Id: If058935796441ed4e577b9a72d3aa43422edba58
Reviewed-on: https://boringssl-review.googlesource.com/7273
Reviewed-by: Adam Langley <alangley@gmail.com>
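The Before/After figures in the commit message are easier to compare as ratios. The following is a minimal sketch, not part of the commit: the MB/s values are copied from the lists above, and the names `before`, `after`, and `speedups` are hypothetical, introduced only for this illustration.

```python
# Throughput figures (MB/s) copied from the Before/After lists in the commit message.
# Keys are (cipher, input size in bytes).
before = {
    ("AES-128-GCM", 16): 179.1,
    ("AES-128-GCM", 1350): 2993.7,
    ("AES-128-GCM", 8192): 3662.2,
    ("AES-256-GCM", 16): 169.3,
    ("AES-256-GCM", 1350): 2525.4,
    ("AES-256-GCM", 8192): 3015.6,
}
after = {
    ("AES-128-GCM", 16): 174.6,
    ("AES-128-GCM", 1350): 4223.1,
    ("AES-128-GCM", 8192): 5922.3,
    ("AES-256-GCM", 16): 163.9,
    ("AES-256-GCM", 1350): 3324.8,
    ("AES-256-GCM", 8192): 4394.6,
}


def speedups(before, after):
    """Return the after/before throughput ratio for every benchmark key."""
    return {key: after[key] / before[key] for key in before}


ratios = speedups(before, after)
for (cipher, size), ratio in ratios.items():
    print(f"{cipher} {size:>5} bytes: {ratio:.2f}x")
```

On these numbers, the 16-byte case changes by only a few percent (ratios just under 1.0), while the 1350-byte and 8192-byte cases come out roughly 1.3x to 1.6x faster.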
-rw-r--r-- | BUILDING.md | 2 |
-rw-r--r-- | crypto/modes/asm/aesni-gcm-x86_64.pl | 11 |
-rw-r--r-- | crypto/modes/asm/ghash-x86_64.pl | 6 |
3 files changed, 13 insertions, 6 deletions
diff --git a/BUILDING.md b/BUILDING.md
index 8e226c0..c87e8b5 100644
--- a/BUILDING.md
+++ b/BUILDING.md
@@ -33,7 +33,7 @@
     executable may be configured explicitly by setting `GO_EXECUTABLE`.

   * To build the x86 and x86\_64 assembly, your assembler must support AVX2
-    instructions. If using GNU binutils, you must have 2.22 or later.
+    instructions and MOVBE. If using GNU binutils, you must have 2.22 or later.

 ## Building

diff --git a/crypto/modes/asm/aesni-gcm-x86_64.pl b/crypto/modes/asm/aesni-gcm-x86_64.pl
index f777a6e..e329741 100644
--- a/crypto/modes/asm/aesni-gcm-x86_64.pl
+++ b/crypto/modes/asm/aesni-gcm-x86_64.pl
@@ -41,17 +41,24 @@ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
 ( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or
 die "can't locate x86_64-xlate.pl";

-# This must be kept in sync with |$avx| in ghash-x86_64.pl; otherwise tags will
+# |$avx| in ghash-x86_64.pl must be set to at least 1; otherwise tags will
 # be computed incorrectly.
 #
 # In upstream, this is controlled by shelling out to the compiler to check
 # versions, but BoringSSL is intended to be used with pre-generated perlasm
 # output, so this isn't useful anyway.
-$avx = 0;
+#
+# The upstream code uses the condition |$avx>1| even though no AVX2
+# instructions are used, because it assumes MOVBE is supported by the assembler
+# if and only if AVX2 is also supported by the assembler; see
+# https://marc.info/?l=openssl-dev&m=146567589526984&w=2.
+$avx = 2;

 open OUT,"| \"$^X\" \"$xlate\" $flavour \"$output\"";
 *STDOUT=*OUT;

+# See the comment above regarding why the condition is ($avx>1) when there are
+# no AVX2 instructions being used.
 if ($avx>1) {{{

 ($inp,$out,$len,$key,$ivp,$Xip)=("%rdi","%rsi","%rdx","%rcx","%r8","%r9");
diff --git a/crypto/modes/asm/ghash-x86_64.pl b/crypto/modes/asm/ghash-x86_64.pl
index df8546c..d7471e2 100644
--- a/crypto/modes/asm/ghash-x86_64.pl
+++ b/crypto/modes/asm/ghash-x86_64.pl
@@ -90,13 +90,13 @@ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
 ( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or
 die "can't locate x86_64-xlate.pl";

-# This must be kept in sync with |$avx| in aesni-gcm-x86_64.pl; otherwise tags
-# will be computed incorrectly.
+# See the notes about |$avx| in aesni-gcm-x86_64.pl; otherwise tags will be
+# computed incorrectly.
 #
 # In upstream, this is controlled by shelling out to the compiler to check
 # versions, but BoringSSL is intended to be used with pre-generated perlasm
 # output, so this isn't useful anyway.
-$avx = 0;
+$avx = 1;

 open OUT,"| \"$^X\" \"$xlate\" $flavour \"$output\"";
 *STDOUT=*OUT;