author Brian Smith <brian@briansmith.org> 2016-03-01 21:17:37 -1000
committer Adam Langley <alangley@gmail.com> 2017-01-16 16:54:13 +0000
commit a26d4c3f439543e45df5d0f381f0a815504255a7 (patch)
tree d8f0241ae5f4bc3d920f773efbe8027847a98823
parent abb32cc00dd4086f7b2213a5d3ecd223be937831 (diff)
Enable stitched x86-64 AES-NI AES-GCM implementation.
Measured on a SkyLake processor:

Before:
Did 11373750 AES-128-GCM (16 bytes) seal operations in 1016000us (11194635.8 ops/sec): 179.1 MB/s
Did 2253000 AES-128-GCM (1350 bytes) seal operations in 1016000us (2217519.7 ops/sec): 2993.7 MB/s
Did 453750 AES-128-GCM (8192 bytes) seal operations in 1015000us (447044.3 ops/sec): 3662.2 MB/s
Did 10753500 AES-256-GCM (16 bytes) seal operations in 1016000us (10584153.5 ops/sec): 169.3 MB/s
Did 1898750 AES-256-GCM (1350 bytes) seal operations in 1015000us (1870689.7 ops/sec): 2525.4 MB/s
Did 374000 AES-256-GCM (8192 bytes) seal operations in 1016000us (368110.2 ops/sec): 3015.6 MB/s

After:
Did 11074000 AES-128-GCM (16 bytes) seal operations in 1015000us (10910344.8 ops/sec): 174.6 MB/s
Did 3178250 AES-128-GCM (1350 bytes) seal operations in 1016000us (3128198.8 ops/sec): 4223.1 MB/s
Did 734500 AES-128-GCM (8192 bytes) seal operations in 1016000us (722933.1 ops/sec): 5922.3 MB/s
Did 10394750 AES-256-GCM (16 bytes) seal operations in 1015000us (10241133.0 ops/sec): 163.9 MB/s
Did 2502250 AES-256-GCM (1350 bytes) seal operations in 1016000us (2462844.5 ops/sec): 3324.8 MB/s
Did 544500 AES-256-GCM (8192 bytes) seal operations in 1015000us (536453.2 ops/sec): 4394.6 MB/s

Change-Id: If058935796441ed4e577b9a72d3aa43422edba58
Reviewed-on: https://boringssl-review.googlesource.com/7273
Reviewed-by: Adam Langley <alangley@gmail.com>
-rw-r--r--  BUILDING.md                            2
-rw-r--r--  crypto/modes/asm/aesni-gcm-x86_64.pl  11
-rw-r--r--  crypto/modes/asm/ghash-x86_64.pl       6
3 files changed, 13 insertions, 6 deletions
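The before/after figures in the commit message are easier to compare as ratios. The following is a minimal sketch, not part of the commit, that computes after/before throughput for each benchmark row. The MB/s values are copied from the commit message; the names `before`, `after`, and `speedup` are illustrative only.

```python
# Throughput in MB/s, copied from the benchmark output in the commit message.
# Keys are (cipher, input size in bytes).
before = {
    ("AES-128-GCM", 16): 179.1,
    ("AES-128-GCM", 1350): 2993.7,
    ("AES-128-GCM", 8192): 3662.2,
    ("AES-256-GCM", 16): 169.3,
    ("AES-256-GCM", 1350): 2525.4,
    ("AES-256-GCM", 8192): 3015.6,
}
after = {
    ("AES-128-GCM", 16): 174.6,
    ("AES-128-GCM", 1350): 4223.1,
    ("AES-128-GCM", 8192): 5922.3,
    ("AES-256-GCM", 16): 163.9,
    ("AES-256-GCM", 1350): 3324.8,
    ("AES-256-GCM", 8192): 4394.6,
}


def speedup(key):
    """Return the after/before throughput ratio for one benchmark row."""
    return after[key] / before[key]


# Hypothetical helper table: one ratio per (cipher, size) row.
speedups = {key: speedup(key) for key in before}

for (cipher, size), ratio in sorted(speedups.items()):
    print(f"{cipher} {size:>5} bytes: {ratio:.2f}x")
```

With these numbers the ratios come out at roughly 1.3x to 1.6x for the 1350-byte and 8192-byte inputs, and slightly below 1.0x (about 0.97x) for the 16-byte inputs.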
diff --git a/BUILDING.md b/BUILDING.md
index 8e226c0..c87e8b5 100644
--- a/BUILDING.md
+++ b/BUILDING.md
@@ -33,7 +33,7 @@
executable may be configured explicitly by setting `GO_EXECUTABLE`.
* To build the x86 and x86\_64 assembly, your assembler must support AVX2
- instructions. If using GNU binutils, you must have 2.22 or later.
+ instructions and MOVBE. If using GNU binutils, you must have 2.22 or later.
## Building
diff --git a/crypto/modes/asm/aesni-gcm-x86_64.pl b/crypto/modes/asm/aesni-gcm-x86_64.pl
index f777a6e..e329741 100644
--- a/crypto/modes/asm/aesni-gcm-x86_64.pl
+++ b/crypto/modes/asm/aesni-gcm-x86_64.pl
@@ -41,17 +41,24 @@ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or
die "can't locate x86_64-xlate.pl";
-# This must be kept in sync with |$avx| in ghash-x86_64.pl; otherwise tags will
+# |$avx| in ghash-x86_64.pl must be set to at least 1; otherwise tags will
# be computed incorrectly.
#
# In upstream, this is controlled by shelling out to the compiler to check
# versions, but BoringSSL is intended to be used with pre-generated perlasm
# output, so this isn't useful anyway.
-$avx = 0;
+#
+# The upstream code uses the condition |$avx>1| even though no AVX2
+# instructions are used, because it assumes MOVBE is supported by the assembler
+# if and only if AVX2 is also supported by the assembler; see
+# https://marc.info/?l=openssl-dev&m=146567589526984&w=2.
+$avx = 2;
open OUT,"| \"$^X\" \"$xlate\" $flavour \"$output\"";
*STDOUT=*OUT;
+# See the comment above regarding why the condition is ($avx>1) when there are
+# no AVX2 instructions being used.
if ($avx>1) {{{
($inp,$out,$len,$key,$ivp,$Xip)=("%rdi","%rsi","%rdx","%rcx","%r8","%r9");
diff --git a/crypto/modes/asm/ghash-x86_64.pl b/crypto/modes/asm/ghash-x86_64.pl
index df8546c..d7471e2 100644
--- a/crypto/modes/asm/ghash-x86_64.pl
+++ b/crypto/modes/asm/ghash-x86_64.pl
@@ -90,13 +90,13 @@ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or
die "can't locate x86_64-xlate.pl";
-# This must be kept in sync with |$avx| in aesni-gcm-x86_64.pl; otherwise tags
-# will be computed incorrectly.
+# See the notes about |$avx| in aesni-gcm-x86_64.pl; otherwise tags will be
+# computed incorrectly.
#
# In upstream, this is controlled by shelling out to the compiler to check
# versions, but BoringSSL is intended to be used with pre-generated perlasm
# output, so this isn't useful anyway.
-$avx = 0;
+$avx = 1;
open OUT,"| \"$^X\" \"$xlate\" $flavour \"$output\"";
*STDOUT=*OUT;