[SCM] GNU gnutls branch, master, updated. gnutls_2_99_2-27-g5f84e48
From: Nikos Mavrogiannopoulos
Subject: [SCM] GNU gnutls branch, master, updated. gnutls_2_99_2-27-g5f84e48
Date: Sun, 29 May 2011 22:17:46 +0000
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU gnutls".
http://git.savannah.gnu.org/cgit/gnutls.git/commit/?id=5f84e48a3f8ae92181f6031bf211989f6c54add2
The branch, master has been updated
via 5f84e48a3f8ae92181f6031bf211989f6c54add2 (commit)
via a6219a5918cff2431cc83cb06a2929a7853a2bed (commit)
via b983b8638fba58ad76f98423a23566442af72dc9 (commit)
via 8dc2a74cbdad286b6a97d55b2a47929f07e44aa7 (commit)
via 23df2cf3d4e719b51d6be784b0249b68139d1668 (commit)
via b50b4b052bb9cd455615c2ed784bc419cae6719c (commit)
from 14f27b4e2488f82eeaf05b78073daedb0712a76f (commit)
The revisions listed above that are new to this repository have not
appeared in any other notification email, so they are listed in full
below.
- Log -----------------------------------------------------------------
commit 5f84e48a3f8ae92181f6031bf211989f6c54add2
Author: Nikos Mavrogiannopoulos <address@hidden>
Date: Sun May 29 23:34:55 2011 +0200
Added new AES code by Andy.
commit a6219a5918cff2431cc83cb06a2929a7853a2bed
Author: Nikos Mavrogiannopoulos <address@hidden>
Date: Sun May 29 12:39:46 2011 +0200
Added missing file.
commit b983b8638fba58ad76f98423a23566442af72dc9
Author: Nikos Mavrogiannopoulos <address@hidden>
Date: Sun May 29 12:39:44 2011 +0200
more files to ignore
commit 8dc2a74cbdad286b6a97d55b2a47929f07e44aa7
Author: Nikos Mavrogiannopoulos <address@hidden>
Date: Sun May 29 12:35:57 2011 +0200
Added FSF copyright to public domain files.
commit 23df2cf3d4e719b51d6be784b0249b68139d1668
Author: Nikos Mavrogiannopoulos <address@hidden>
Date: Sun May 29 12:01:16 2011 +0200
Use cpuid.h if it exists, to use the x86 CPUID instruction.
commit b50b4b052bb9cd455615c2ed784bc419cae6719c
Author: Nikos Mavrogiannopoulos <address@hidden>
Date: Sun May 29 01:40:16 2011 +0200
Added Dash.
-----------------------------------------------------------------------
Summary of changes:
.gitignore | 5 +
THANKS | 1 +
configure.ac | 1 +
doc/credentials/x509/ca-key.pem | 145 ++
lib/accelerated/intel/asm/appro-aes-x86-64.s | 2416 ++++++++++++++++++++++----
lib/accelerated/intel/asm/appro-aes-x86.s | 2359 ++++++++++++++++++++------
lib/accelerated/x86.h | 9 +
lib/nettle/Makefile.am | 3 +-
lib/nettle/ecc_free.c | 30 +-
lib/nettle/ecc_make_key.c | 30 +-
lib/nettle/ecc_map.c | 30 +-
lib/nettle/ecc_mulmod.c | 30 +-
lib/nettle/ecc_points.c | 30 +-
lib/nettle/ecc_projective_add_point.c | 30 +-
lib/nettle/ecc_projective_dbl_point_3.c | 30 +-
lib/nettle/ecc_shared_secret.c | 30 +-
lib/nettle/ecc_sign_hash.c | 30 +-
lib/nettle/ecc_test.c | 142 --
lib/nettle/ecc_verify_hash.c | 30 +-
19 files changed, 4331 insertions(+), 1050 deletions(-)
create mode 100644 doc/credentials/x509/ca-key.pem
delete mode 100644 lib/nettle/ecc_test.c
diff --git a/.gitignore b/.gitignore
index e00238b..68a55b9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -455,3 +455,8 @@ tests/suite/x509paths/X509tests
tests/x509cert
src/benchmark-cipher
src/benchmark-tls
+doc/gnutls-guile.html
+doc/version-guile.texi
+build-aux/compile
+doc/stamp-1
+lib/algorithms/libgnutls_alg.la
diff --git a/THANKS b/THANKS
index ef6cb28..c01a0ed 100644
--- a/THANKS
+++ b/THANKS
@@ -113,6 +113,7 @@ Michael Rommel <rommel [at] layer-7.net>
Mark Brand <mabrand [at] mabrand.nl>
Vitaly Kruglikov <vitaly.kruglikov [at] palm.com>
Kalle Olavi Niemitalo <kon [at] iki.fi>
+Dash Shendy <admin [at] dash.za.net>
----------------------------------------------------------------------
Copying and distribution of this file, with or without modification,
diff --git a/configure.ac b/configure.ac
index 95eb972..00f4a7e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -81,6 +81,7 @@ case $host_cpu in
i?86 | x86_64 | amd64)
dnl GCC_FLAG_ADD([-maes -mpclmul],[X86])
dnl if test "x$X86" = "xyes";then
+ AC_CHECK_HEADERS(cpuid.h)
if test "$host_cpu" = "x86_64" -o "$host_cpu" = "amd64";then
hw_accel="x86-64"
else
diff --git a/doc/credentials/x509/ca-key.pem b/doc/credentials/x509/ca-key.pem
new file mode 100644
index 0000000..4efbe5a
--- /dev/null
+++ b/doc/credentials/x509/ca-key.pem
@@ -0,0 +1,145 @@
+Public Key Info:
+ Public Key Algorithm: RSA
+ Key Security Level: Normal
+
+modulus:
+ 00:9c:e4:42:b1:7d:6e:9e:5f:ff:7f:2d:9d:d7:4e:
+ 78:5d:db:88:83:fd:c2:a9:50:5a:4f:71:dc:6b:ae:
+ 52:12:80:f0:87:42:a2:3e:d4:28:3a:06:4b:74:a6:
+ 36:72:86:c6:b3:fa:23:62:d3:a3:72:cd:0a:9e:53:
+ d8:76:6b:63:12:1e:96:12:1b:89:53:de:6f:e1:34:
+ 1d:0b:83:8b:32:21:39:e9:e2:06:ab:6e:76:85:90:
+ 1b:1e:84:cb:f3:84:35:e0:3c:50:58:6b:b3:40:af:
+ 37:d2:29:a5:ed:f6:f0:d9:67:08:71:14:3c:bc:51:
+ ac:f1:2c:df:5f:0e:b7:f8:c2:3a:16:ae:a2:30:04:
+ 08:a8:fd:3c:5b:31:a6:45:1c:cb:e7:0b:c2:88:f8:
+ 42:56:4a:cf:9b:06:d7:a0:00:6e:6f:a0:00:b1:8c:
+ 16:3c:90:7d:d4:cf:7f:97:1e:60:14:7e:64:f7:f8:
+ 8f:7e:2d:ec:d8:a8:37:17:c3:0e:72:9a:6a:15:88:
+ f1:0d:29:ec:7e:2c:fa:78:c8:75:f9:b6:15:20:0a:
+ 37:eb:bb:c6:55:81:e2:81:73:04:64:2d:85:7b:39:
+ 70:20:76:99:ce:91:28:16:56:37:6b:b2:c5:27:4d:
+ 32:ae:34:3d:d7:4a:fc:50:4f:82:10:c4:d8:cc:4e:
+ 34:0f:4a:25:08:ca:3b:14:0f:51:0a:37:8e:dd:b5:
+ 08:a1:86:88:75:54:d4:19:61:06:1d:64:9e:a3:11:
+ 9e:8b:d1:a4:9b:ab:be:01:28:fc:7f:e8:b4:8f:17:
+ 43:da:a5:ec:7b:
+public exponent:
+ 01:00:01:
+private exponent:
+ 6a:cd:04:0d:99:0a:65:6b:8a:1c:c4:2b:cf:b6:8e:
+ 3f:ae:43:47:3e:c6:75:c5:ca:44:8c:88:f5:10:8c:
+ b4:25:ec:16:d7:a8:64:c6:bd:bf:8a:2b:71:73:f8:
+ 5a:8c:1e:d5:c3:b0:b5:04:c7:1e:4e:30:2d:49:7c:
+ 70:58:77:ef:8c:bc:b2:04:e6:be:1e:0c:e1:2c:3d:
+ 9d:69:e5:a6:b1:71:a0:22:0a:52:46:f7:0d:c2:e4:
+ 83:28:f9:41:83:3d:bd:b0:b1:2d:0f:db:cd:6b:b9:
+ bf:2a:34:d7:42:24:00:8a:9f:f7:82:44:3a:1a:0b:
+ 75:7e:0b:6c:c5:33:3d:76:d2:5e:40:71:0d:e8:a1:
+ 10:90:9a:b6:a5:9c:bf:2d:74:2c:8b:17:d9:6f:ce:
+ 90:b8:79:79:dd:14:4a:bc:87:96:24:81:5a:14:6b:
+ cf:16:b2:94:5e:b7:7b:cc:cc:4a:a9:8e:e3:a9:c3:
+ 70:51:1f:03:f6:f0:92:1f:1e:39:9a:58:05:e0:9c:
+ 0c:4e:06:4a:6a:31:23:e6:21:bf:0a:ec:8f:31:a0:
+ c9:24:e2:cd:ff:fa:25:fa:1c:bf:4f:22:c6:e5:0f:
+ 52:8d:95:ab:1f:58:30:20:f1:2b:ea:df:c4:af:b5:
+ 7e:10:c5:4f:16:72:3f:f5:2e:88:3c:51:23:37:20:
+ 7c:55:d4:bb:d7:23:6a:b0:14:81:a4:c1:6b:06:3b:
+ 28:17:e9:80:dd:1a:e5:d6:bb:0d:30:cb:6a:34:9b:
+ 23:ae:49:49:42:24:b8:7f:72:f6:e9:4a:c9:75:2b:
+ 7f:ac:40:b1:
+prime1:
+ 00:d0:9c:a7:0f:3a:c4:ec:84:3d:92:22:39:ef:3e:
+ 81:27:8a:5e:bf:01:7d:69:78:e8:ec:af:62:cf:c0:
+ ec:1d:f0:38:f4:f9:e5:ab:bc:aa:a2:5c:78:fa:23:
+ 0d:03:9c:7b:29:3c:6f:26:91:c9:a4:31:41:72:63:
+ 76:65:02:0d:f1:56:0f:b0:70:ef:be:6e:97:bb:f6:
+ ed:57:b6:02:16:eb:83:f6:c9:f6:ce:51:d2:91:b6:
+ a1:85:83:b9:da:da:29:b1:eb:23:6a:dd:3d:cc:1f:
+ 40:e2:f2:68:db:be:7f:2a:4f:2b:5b:ed:ad:ff:c8:
+ ef:16:9c:15:68:71:24:8c:44:bb:58:17:0d:f2:fa:
+ b7:ca:e6:f1:b3:5e:45:fc:3a:56:82:44:95:d5:15:
+ 90:c9:d3:
+prime2:
+ 00:c0:87:ef:09:79:4e:4a:ea:23:86:c7:10:3e:59:
+ 90:8e:f0:32:ff:8a:9d:8f:5c:dc:2c:5a:99:6a:46:
+ 04:dd:c2:0d:41:f0:3c:71:78:95:fc:10:da:90:9d:
+ 1a:f8:f5:27:eb:26:2b:44:c2:b1:64:27:2c:3f:f4:
+ 03:98:e9:b7:34:70:69:69:7b:bc:c9:85:b8:8b:e3:
+ 45:a0:44:90:b9:3f:bf:76:b8:a1:29:a6:05:63:cb:
+ 03:a2:8a:06:31:ce:b4:15:89:7f:ee:e5:ce:89:da:
+ 8c:e6:0f:38:43:1e:cc:dc:58:f3:73:19:1d:82:9c:
+ 0e:fa:f2:a8:ad:ab:91:09:06:fc:a6:10:cd:82:be:
+ 4a:fb:3c:b2:92:0b:24:cf:6d:02:2e:0d:4a:52:aa:
+ 34:c1:b9:
+coefficient:
+ 00:86:2e:30:76:ad:fd:d3:00:ab:06:e6:bf:aa:db:
+ 1f:49:8a:23:7c:b4:be:b3:fa:ff:5a:7a:d7:09:2c:
+ ad:ed:d2:0c:7d:a8:bc:e3:a4:a3:8d:10:0e:47:a3:
+ ad:5d:66:3b:58:35:55:95:53:3d:1f:5e:0a:db:10:
+ 32:b6:0a:8f:e0:0c:4b:8c:e6:94:ef:5e:ba:cb:b3:
+ d0:b2:88:a3:d6:ff:16:0e:60:59:fe:0b:43:03:6f:
+ ea:57:54:9b:cd:1c:2a:e6:57:3f:f2:d4:81:dd:07:
+ f3:dc:39:53:1c:09:f9:bf:0f:f6:5c:8e:2f:e0:aa:
+ f7:b8:58:4b:21:3f:5d:2f:08:24:e4:3a:3b:52:6f:
+ 28:3c:ee:29:f5:03:be:8b:93:9a:f1:ac:ce:12:ac:
+ fe:7f:32:
+exp1:
+ 00:a7:07:16:77:8a:2d:8b:d5:e1:da:74:8f:00:70:
+ 82:46:9f:72:76:ea:81:78:86:77:b0:b2:48:a2:61:
+ 2c:6c:58:1f:b2:7d:b7:97:86:ca:f4:8e:a7:ca:57:
+ 70:1f:19:16:3f:91:04:c9:d3:e6:a8:11:4b:fe:83:
+ 86:93:1f:4e:fc:91:54:a4:87:f8:5c:f7:fd:83:61:
+ 14:ed:aa:6c:07:df:f0:5c:13:9f:09:d8:d7:89:15:
+ ba:43:c5:91:74:9a:42:d2:12:9b:db:ff:62:70:62:
+ 01:b8:f4:30:62:e9:26:b6:40:87:4d:e6:82:ef:8e:
+ f9:67:97:f7:48:15:77:16:dc:1d:48:4d:c5:3c:6b:
+ e3:e6:90:7c:ab:89:ea:ed:25:e4:88:0e:d4:0c:b5:
+ 64:a5:43:
+exp2:
+ 7a:14:b7:c9:b6:15:a3:03:1c:4b:d5:e5:c2:e3:5f:
+ fa:82:ec:93:84:fd:ab:6e:22:5e:2d:84:a2:12:8b:
+ fb:61:94:ae:7e:fa:94:a8:f5:d1:c3:8e:13:ac:ca:
+ f1:99:e2:1a:05:35:e2:7f:e1:a3:b4:03:26:fa:3f:
+ 5d:b2:b4:ec:97:6a:ff:eb:ea:25:8e:99:1a:7a:9e:
+ 27:a5:d2:6e:e4:b1:2f:42:9b:4e:a1:6b:41:7f:f5:
+ 6a:17:43:1e:4a:07:7e:b0:95:62:92:6d:88:94:00:
+ 4b:d0:d2:c8:1c:bb:a1:ec:f5:51:c2:57:27:fe:74:
+ b1:43:35:1a:0a:74:08:d9:59:52:a3:cc:ec:5e:65:
+ 85:31:53:b9:af:3f:44:17:c7:0e:14:77:50:3b:85:
+ 00:61:
+
+Public Key ID: 4D:56:B7:6A:00:58:F1:67:92:F4:A6:75:55:1B:8E:53:01:03:EF:CF
+
+-----BEGIN RSA PRIVATE KEY-----
+MIIFfAIBAAKCATEAnORCsX1unl//fy2d1054XduIg/3CqVBaT3Hca65SEoDwh0Ki
+PtQoOgZLdKY2cobGs/ojYtOjcs0KnlPYdmtjEh6WEhuJU95v4TQdC4OLMiE56eIG
+q252hZAbHoTL84Q14DxQWGuzQK830iml7fbw2WcIcRQ8vFGs8SzfXw63+MI6Fq6i
+MAQIqP08WzGmRRzL5wvCiPhCVkrPmwbXoABub6AAsYwWPJB91M9/lx5gFH5k9/iP
+fi3s2Kg3F8MOcppqFYjxDSnsfiz6eMh1+bYVIAo367vGVYHigXMEZC2FezlwIHaZ
+zpEoFlY3a7LFJ00yrjQ910r8UE+CEMTYzE40D0olCMo7FA9RCjeO3bUIoYaIdVTU
+GWEGHWSeoxGei9Gkm6u+ASj8f+i0jxdD2qXsewIDAQABAoIBMGrNBA2ZCmVrihzE
+K8+2jj+uQ0c+xnXFykSMiPUQjLQl7BbXqGTGvb+KK3Fz+FqMHtXDsLUExx5OMC1J
+fHBYd++MvLIE5r4eDOEsPZ1p5aaxcaAiClJG9w3C5IMo+UGDPb2wsS0P281rub8q
+NNdCJACKn/eCRDoaC3V+C2zFMz120l5AcQ3ooRCQmralnL8tdCyLF9lvzpC4eXnd
+FEq8h5YkgVoUa88WspRet3vMzEqpjuOpw3BRHwP28JIfHjmaWAXgnAxOBkpqMSPm
+Ib8K7I8xoMkk4s3/+iX6HL9PIsblD1KNlasfWDAg8Svq38SvtX4QxU8Wcj/1Log8
+USM3IHxV1LvXI2qwFIGkwWsGOygX6YDdGuXWuw0wy2o0myOuSUlCJLh/cvbpSsl1
+K3+sQLECgZkA0JynDzrE7IQ9kiI57z6BJ4pevwF9aXjo7K9iz8DsHfA49Pnlq7yq
+olx4+iMNA5x7KTxvJpHJpDFBcmN2ZQIN8VYPsHDvvm6Xu/btV7YCFuuD9sn2zlHS
+kbahhYO52topsesjat09zB9A4vJo275/Kk8rW+2t/8jvFpwVaHEkjES7WBcN8vq3
+yubxs15F/DpWgkSV1RWQydMCgZkAwIfvCXlOSuojhscQPlmQjvAy/4qdj1zcLFqZ
+akYE3cINQfA8cXiV/BDakJ0a+PUn6yYrRMKxZCcsP/QDmOm3NHBpaXu8yYW4i+NF
+oESQuT+/drihKaYFY8sDoooGMc60FYl/7uXOidqM5g84Qx7M3FjzcxkdgpwO+vKo
+rauRCQb8phDNgr5K+zyykgskz20CLg1KUqo0wbkCgZkApwcWd4oti9Xh2nSPAHCC
+Rp9yduqBeIZ3sLJIomEsbFgfsn23l4bK9I6nyldwHxkWP5EEydPmqBFL/oOGkx9O
+/JFUpIf4XPf9g2EU7apsB9/wXBOfCdjXiRW6Q8WRdJpC0hKb2/9icGIBuPQwYukm
+tkCHTeaC7475Z5f3SBV3FtwdSE3FPGvj5pB8q4nq7SXkiA7UDLVkpUMCgZh6FLfJ
+thWjAxxL1eXC41/6guyThP2rbiJeLYSiEov7YZSufvqUqPXRw44TrMrxmeIaBTXi
+f+GjtAMm+j9dsrTsl2r/6+oljpkaep4npdJu5LEvQptOoWtBf/VqF0MeSgd+sJVi
+km2IlABL0NLIHLuh7PVRwlcn/nSxQzUaCnQI2VlSo8zsXmWFMVO5rz9EF8cOFHdQ
+O4UAYQKBmQCGLjB2rf3TAKsG5r+q2x9JiiN8tL6z+v9aetcJLK3t0gx9qLzjpKON
+EA5Ho61dZjtYNVWVUz0fXgrbEDK2Co/gDEuM5pTvXrrLs9CyiKPW/xYOYFn+C0MD
+b+pXVJvNHCrmVz/y1IHdB/PcOVMcCfm/D/Zcji/gqve4WEshP10vCCTkOjtSbyg8
+7in1A76Lk5rxrM4SrP5/Mg==
+-----END RSA PRIVATE KEY-----
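The dump above lists the standard RSA-CRT private-key components: `prime1`/`prime2` are p and q, `exp1` = d mod (p-1), `exp2` = d mod (q-1), and `coefficient` = q^-1 mod p. As a sketch of how those fields fit together, here is a toy Python example with textbook-sized numbers (not derived from the key above) showing Garner's CRT recombination reproducing a plain RSA decryption:

```python
# Toy RSA-CRT demo with textbook-sized parameters (NOT the key above).
p, q = 61, 53
n = p * q             # modulus
e, d = 17, 2753       # public / private exponent
dp = d % (p - 1)      # "exp1" in the PEM dump
dq = d % (q - 1)      # "exp2"
qinv = pow(q, -1, p)  # "coefficient" (q^-1 mod p), Python 3.8+

m = 65
c = pow(m, e, n)      # encrypt

# Garner's recombination: two half-size exponentiations
# instead of one full-size pow(c, d, n).
m1 = pow(c, dp, p)
m2 = pow(c, dq, q)
h = (qinv * (m1 - m2)) % p
recovered = m2 + h * q
print(recovered)      # 65
```

Implementations store the precomputed CRT values precisely so decryption can use the two small exponentiations above, which is roughly four times faster than working modulo n directly.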
diff --git a/lib/accelerated/intel/asm/appro-aes-x86-64.s b/lib/accelerated/intel/asm/appro-aes-x86-64.s
index 96b7b6e..e6db040 100644
--- a/lib/accelerated/intel/asm/appro-aes-x86-64.s
+++ b/lib/accelerated/intel/asm/appro-aes-x86-64.s
@@ -5,18 +5,19 @@
# modification, are permitted provided that the following conditions
# are met:
#
-# * Redistributions of source code must retain copyright notices,
-# this list of conditions and the following disclaimer.
+# * Redistributions of source code must retain copyright
+# * notices,
+# this list of conditions and the following disclaimer.
#
-# * Redistributions in binary form must reproduce the above
-# copyright notice, this list of conditions and the following
-# disclaimer in the documentation and/or other materials
-# provided with the distribution.
+# * Redistributions in binary form must reproduce the above
+# copyright notice, this list of conditions and the following
+# disclaimer in the documentation and/or other materials
+# provided with the distribution.
#
-# * Neither the name of the Andy Polyakov nor the names of its
-# copyright holder and contributors may be used to endorse or
-# promote products derived from this software without specific
-# prior written permission.
+# * Neither the name of the Andy Polyakov nor the names of its
+# copyright holder and contributors may be used to endorse or
+# promote products derived from this software without specific
+# prior written permission.
#
# ALTERNATIVELY, provided that this notice is retained in full, this
# product may be distributed under the terms of the GNU General Public
@@ -40,20 +41,20 @@
.type aesni_encrypt,@function
.align 16
aesni_encrypt:
- movups (%rdi),%xmm0
+ movups (%rdi),%xmm2
movl 240(%rdx),%eax
- movaps (%rdx),%xmm4
- movaps 16(%rdx),%xmm5
+ movaps (%rdx),%xmm0
+ movaps 16(%rdx),%xmm1
leaq 32(%rdx),%rdx
- pxor %xmm4,%xmm0
+ xorps %xmm0,%xmm2
.Loop_enc1_1:
-.byte 102,15,56,220,197
+.byte 102,15,56,220,209
decl %eax
- movaps (%rdx),%xmm5
+ movaps (%rdx),%xmm1
leaq 16(%rdx),%rdx
jnz .Loop_enc1_1
-.byte 102,15,56,221,197
- movups %xmm0,(%rsi)
+.byte 102,15,56,221,209
+ movups %xmm2,(%rsi)
.byte 0xf3,0xc3
.size aesni_encrypt,.-aesni_encrypt
@@ -61,318 +62,1941 @@ aesni_encrypt:
.type aesni_decrypt,@function
.align 16
aesni_decrypt:
- movups (%rdi),%xmm0
+ movups (%rdi),%xmm2
movl 240(%rdx),%eax
- movaps (%rdx),%xmm4
- movaps 16(%rdx),%xmm5
+ movaps (%rdx),%xmm0
+ movaps 16(%rdx),%xmm1
leaq 32(%rdx),%rdx
- pxor %xmm4,%xmm0
+ xorps %xmm0,%xmm2
.Loop_dec1_2:
-.byte 102,15,56,222,197
+.byte 102,15,56,222,209
decl %eax
- movaps (%rdx),%xmm5
+ movaps (%rdx),%xmm1
leaq 16(%rdx),%rdx
jnz .Loop_dec1_2
-.byte 102,15,56,223,197
- movups %xmm0,(%rsi)
+.byte 102,15,56,223,209
+ movups %xmm2,(%rsi)
.byte 0xf3,0xc3
.size aesni_decrypt, .-aesni_decrypt
.type _aesni_encrypt3,@function
.align 16
_aesni_encrypt3:
- movaps (%rcx),%xmm4
+ movaps (%rcx),%xmm0
shrl $1,%eax
- movaps 16(%rcx),%xmm5
+ movaps 16(%rcx),%xmm1
leaq 32(%rcx),%rcx
- pxor %xmm4,%xmm0
- pxor %xmm4,%xmm1
- pxor %xmm4,%xmm2
+ xorps %xmm0,%xmm2
+ xorps %xmm0,%xmm3
+ xorps %xmm0,%xmm4
+ movaps (%rcx),%xmm0
.Lenc_loop3:
-.byte 102,15,56,220,197
- movaps (%rcx),%xmm4
-.byte 102,15,56,220,205
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
decl %eax
-.byte 102,15,56,220,213
-.byte 102,15,56,220,196
- movaps 16(%rcx),%xmm5
-.byte 102,15,56,220,204
+.byte 102,15,56,220,225
+ movaps 16(%rcx),%xmm1
+.byte 102,15,56,220,208
+.byte 102,15,56,220,216
leaq 32(%rcx),%rcx
-.byte 102,15,56,220,212
+.byte 102,15,56,220,224
+ movaps (%rcx),%xmm0
jnz .Lenc_loop3
-.byte 102,15,56,220,197
- movaps (%rcx),%xmm4
-.byte 102,15,56,220,205
-.byte 102,15,56,220,213
-.byte 102,15,56,221,196
-.byte 102,15,56,221,204
-.byte 102,15,56,221,212
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+.byte 102,15,56,220,225
+.byte 102,15,56,221,208
+.byte 102,15,56,221,216
+.byte 102,15,56,221,224
.byte 0xf3,0xc3
.size _aesni_encrypt3,.-_aesni_encrypt3
.type _aesni_decrypt3,@function
.align 16
_aesni_decrypt3:
- movaps (%rcx),%xmm4
+ movaps (%rcx),%xmm0
shrl $1,%eax
- movaps 16(%rcx),%xmm5
+ movaps 16(%rcx),%xmm1
leaq 32(%rcx),%rcx
- pxor %xmm4,%xmm0
- pxor %xmm4,%xmm1
- pxor %xmm4,%xmm2
+ xorps %xmm0,%xmm2
+ xorps %xmm0,%xmm3
+ xorps %xmm0,%xmm4
+ movaps (%rcx),%xmm0
.Ldec_loop3:
-.byte 102,15,56,222,197
- movaps (%rcx),%xmm4
-.byte 102,15,56,222,205
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
decl %eax
-.byte 102,15,56,222,213
-.byte 102,15,56,222,196
- movaps 16(%rcx),%xmm5
-.byte 102,15,56,222,204
+.byte 102,15,56,222,225
+ movaps 16(%rcx),%xmm1
+.byte 102,15,56,222,208
+.byte 102,15,56,222,216
leaq 32(%rcx),%rcx
-.byte 102,15,56,222,212
+.byte 102,15,56,222,224
+ movaps (%rcx),%xmm0
jnz .Ldec_loop3
-.byte 102,15,56,222,197
- movaps (%rcx),%xmm4
-.byte 102,15,56,222,205
-.byte 102,15,56,222,213
-.byte 102,15,56,223,196
-.byte 102,15,56,223,204
-.byte 102,15,56,223,212
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+.byte 102,15,56,222,225
+.byte 102,15,56,223,208
+.byte 102,15,56,223,216
+.byte 102,15,56,223,224
.byte 0xf3,0xc3
.size _aesni_decrypt3,.-_aesni_decrypt3
.type _aesni_encrypt4,@function
.align 16
_aesni_encrypt4:
- movaps (%rcx),%xmm4
+ movaps (%rcx),%xmm0
shrl $1,%eax
- movaps 16(%rcx),%xmm5
+ movaps 16(%rcx),%xmm1
leaq 32(%rcx),%rcx
- pxor %xmm4,%xmm0
- pxor %xmm4,%xmm1
- pxor %xmm4,%xmm2
- pxor %xmm4,%xmm3
+ xorps %xmm0,%xmm2
+ xorps %xmm0,%xmm3
+ xorps %xmm0,%xmm4
+ xorps %xmm0,%xmm5
+ movaps (%rcx),%xmm0
.Lenc_loop4:
-.byte 102,15,56,220,197
- movaps (%rcx),%xmm4
-.byte 102,15,56,220,205
- decl %eax
-.byte 102,15,56,220,213
-.byte 102,15,56,220,221
-.byte 102,15,56,220,196
- movaps 16(%rcx),%xmm5
-.byte 102,15,56,220,204
- leaq 32(%rcx),%rcx
-.byte 102,15,56,220,212
-.byte 102,15,56,220,220
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+ decl %eax
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+ movaps 16(%rcx),%xmm1
+.byte 102,15,56,220,208
+.byte 102,15,56,220,216
+ leaq 32(%rcx),%rcx
+.byte 102,15,56,220,224
+.byte 102,15,56,220,232
+ movaps (%rcx),%xmm0
jnz .Lenc_loop4
-.byte 102,15,56,220,197
- movaps (%rcx),%xmm4
-.byte 102,15,56,220,205
-.byte 102,15,56,220,213
-.byte 102,15,56,220,221
-.byte 102,15,56,221,196
-.byte 102,15,56,221,204
-.byte 102,15,56,221,212
-.byte 102,15,56,221,220
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+.byte 102,15,56,221,208
+.byte 102,15,56,221,216
+.byte 102,15,56,221,224
+.byte 102,15,56,221,232
.byte 0xf3,0xc3
.size _aesni_encrypt4,.-_aesni_encrypt4
.type _aesni_decrypt4,@function
.align 16
_aesni_decrypt4:
- movaps (%rcx),%xmm4
+ movaps (%rcx),%xmm0
shrl $1,%eax
- movaps 16(%rcx),%xmm5
+ movaps 16(%rcx),%xmm1
leaq 32(%rcx),%rcx
- pxor %xmm4,%xmm0
- pxor %xmm4,%xmm1
- pxor %xmm4,%xmm2
- pxor %xmm4,%xmm3
+ xorps %xmm0,%xmm2
+ xorps %xmm0,%xmm3
+ xorps %xmm0,%xmm4
+ xorps %xmm0,%xmm5
+ movaps (%rcx),%xmm0
.Ldec_loop4:
-.byte 102,15,56,222,197
- movaps (%rcx),%xmm4
-.byte 102,15,56,222,205
- decl %eax
-.byte 102,15,56,222,213
-.byte 102,15,56,222,221
-.byte 102,15,56,222,196
- movaps 16(%rcx),%xmm5
-.byte 102,15,56,222,204
- leaq 32(%rcx),%rcx
-.byte 102,15,56,222,212
-.byte 102,15,56,222,220
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+ decl %eax
+.byte 102,15,56,222,225
+.byte 102,15,56,222,233
+ movaps 16(%rcx),%xmm1
+.byte 102,15,56,222,208
+.byte 102,15,56,222,216
+ leaq 32(%rcx),%rcx
+.byte 102,15,56,222,224
+.byte 102,15,56,222,232
+ movaps (%rcx),%xmm0
jnz .Ldec_loop4
-.byte 102,15,56,222,197
- movaps (%rcx),%xmm4
-.byte 102,15,56,222,205
-.byte 102,15,56,222,213
-.byte 102,15,56,222,221
-.byte 102,15,56,223,196
-.byte 102,15,56,223,204
-.byte 102,15,56,223,212
-.byte 102,15,56,223,220
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+.byte 102,15,56,222,225
+.byte 102,15,56,222,233
+.byte 102,15,56,223,208
+.byte 102,15,56,223,216
+.byte 102,15,56,223,224
+.byte 102,15,56,223,232
.byte 0xf3,0xc3
.size _aesni_decrypt4,.-_aesni_decrypt4
+.type _aesni_encrypt6,@function
+.align 16
+_aesni_encrypt6:
+ movaps (%rcx),%xmm0
+ shrl $1,%eax
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+ pxor %xmm0,%xmm3
+.byte 102,15,56,220,209
+ pxor %xmm0,%xmm4
+.byte 102,15,56,220,217
+ pxor %xmm0,%xmm5
+.byte 102,15,56,220,225
+ pxor %xmm0,%xmm6
+.byte 102,15,56,220,233
+ pxor %xmm0,%xmm7
+ decl %eax
+.byte 102,15,56,220,241
+ movaps (%rcx),%xmm0
+.byte 102,15,56,220,249
+ jmp .Lenc_loop6_enter
+.align 16
+.Lenc_loop6:
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+ decl %eax
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+.byte 102,15,56,220,241
+.byte 102,15,56,220,249
+.Lenc_loop6_enter:
+ movaps 16(%rcx),%xmm1
+.byte 102,15,56,220,208
+.byte 102,15,56,220,216
+ leaq 32(%rcx),%rcx
+.byte 102,15,56,220,224
+.byte 102,15,56,220,232
+.byte 102,15,56,220,240
+.byte 102,15,56,220,248
+ movaps (%rcx),%xmm0
+ jnz .Lenc_loop6
+
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+.byte 102,15,56,220,241
+.byte 102,15,56,220,249
+.byte 102,15,56,221,208
+.byte 102,15,56,221,216
+.byte 102,15,56,221,224
+.byte 102,15,56,221,232
+.byte 102,15,56,221,240
+.byte 102,15,56,221,248
+ .byte 0xf3,0xc3
+.size _aesni_encrypt6,.-_aesni_encrypt6
+.type _aesni_decrypt6,@function
+.align 16
+_aesni_decrypt6:
+ movaps (%rcx),%xmm0
+ shrl $1,%eax
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+ pxor %xmm0,%xmm3
+.byte 102,15,56,222,209
+ pxor %xmm0,%xmm4
+.byte 102,15,56,222,217
+ pxor %xmm0,%xmm5
+.byte 102,15,56,222,225
+ pxor %xmm0,%xmm6
+.byte 102,15,56,222,233
+ pxor %xmm0,%xmm7
+ decl %eax
+.byte 102,15,56,222,241
+ movaps (%rcx),%xmm0
+.byte 102,15,56,222,249
+ jmp .Ldec_loop6_enter
+.align 16
+.Ldec_loop6:
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+ decl %eax
+.byte 102,15,56,222,225
+.byte 102,15,56,222,233
+.byte 102,15,56,222,241
+.byte 102,15,56,222,249
+.Ldec_loop6_enter:
+ movaps 16(%rcx),%xmm1
+.byte 102,15,56,222,208
+.byte 102,15,56,222,216
+ leaq 32(%rcx),%rcx
+.byte 102,15,56,222,224
+.byte 102,15,56,222,232
+.byte 102,15,56,222,240
+.byte 102,15,56,222,248
+ movaps (%rcx),%xmm0
+ jnz .Ldec_loop6
+
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+.byte 102,15,56,222,225
+.byte 102,15,56,222,233
+.byte 102,15,56,222,241
+.byte 102,15,56,222,249
+.byte 102,15,56,223,208
+.byte 102,15,56,223,216
+.byte 102,15,56,223,224
+.byte 102,15,56,223,232
+.byte 102,15,56,223,240
+.byte 102,15,56,223,248
+ .byte 0xf3,0xc3
+.size _aesni_decrypt6,.-_aesni_decrypt6
+.type _aesni_encrypt8,@function
+.align 16
+_aesni_encrypt8:
+ movaps (%rcx),%xmm0
+ shrl $1,%eax
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+ xorps %xmm0,%xmm3
+.byte 102,15,56,220,209
+ pxor %xmm0,%xmm4
+.byte 102,15,56,220,217
+ pxor %xmm0,%xmm5
+.byte 102,15,56,220,225
+ pxor %xmm0,%xmm6
+.byte 102,15,56,220,233
+ pxor %xmm0,%xmm7
+ decl %eax
+.byte 102,15,56,220,241
+ pxor %xmm0,%xmm8
+.byte 102,15,56,220,249
+ pxor %xmm0,%xmm9
+ movaps (%rcx),%xmm0
+.byte 102,68,15,56,220,193
+.byte 102,68,15,56,220,201
+ movaps 16(%rcx),%xmm1
+ jmp .Lenc_loop8_enter
+.align 16
+.Lenc_loop8:
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+ decl %eax
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+.byte 102,15,56,220,241
+.byte 102,15,56,220,249
+.byte 102,68,15,56,220,193
+.byte 102,68,15,56,220,201
+ movaps 16(%rcx),%xmm1
+.Lenc_loop8_enter:
+.byte 102,15,56,220,208
+.byte 102,15,56,220,216
+ leaq 32(%rcx),%rcx
+.byte 102,15,56,220,224
+.byte 102,15,56,220,232
+.byte 102,15,56,220,240
+.byte 102,15,56,220,248
+.byte 102,68,15,56,220,192
+.byte 102,68,15,56,220,200
+ movaps (%rcx),%xmm0
+ jnz .Lenc_loop8
+
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+.byte 102,15,56,220,241
+.byte 102,15,56,220,249
+.byte 102,68,15,56,220,193
+.byte 102,68,15,56,220,201
+.byte 102,15,56,221,208
+.byte 102,15,56,221,216
+.byte 102,15,56,221,224
+.byte 102,15,56,221,232
+.byte 102,15,56,221,240
+.byte 102,15,56,221,248
+.byte 102,68,15,56,221,192
+.byte 102,68,15,56,221,200
+ .byte 0xf3,0xc3
+.size _aesni_encrypt8,.-_aesni_encrypt8
+.type _aesni_decrypt8,@function
+.align 16
+_aesni_decrypt8:
+ movaps (%rcx),%xmm0
+ shrl $1,%eax
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+ xorps %xmm0,%xmm3
+.byte 102,15,56,222,209
+ pxor %xmm0,%xmm4
+.byte 102,15,56,222,217
+ pxor %xmm0,%xmm5
+.byte 102,15,56,222,225
+ pxor %xmm0,%xmm6
+.byte 102,15,56,222,233
+ pxor %xmm0,%xmm7
+ decl %eax
+.byte 102,15,56,222,241
+ pxor %xmm0,%xmm8
+.byte 102,15,56,222,249
+ pxor %xmm0,%xmm9
+ movaps (%rcx),%xmm0
+.byte 102,68,15,56,222,193
+.byte 102,68,15,56,222,201
+ movaps 16(%rcx),%xmm1
+ jmp .Ldec_loop8_enter
+.align 16
+.Ldec_loop8:
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+ decl %eax
+.byte 102,15,56,222,225
+.byte 102,15,56,222,233
+.byte 102,15,56,222,241
+.byte 102,15,56,222,249
+.byte 102,68,15,56,222,193
+.byte 102,68,15,56,222,201
+ movaps 16(%rcx),%xmm1
+.Ldec_loop8_enter:
+.byte 102,15,56,222,208
+.byte 102,15,56,222,216
+ leaq 32(%rcx),%rcx
+.byte 102,15,56,222,224
+.byte 102,15,56,222,232
+.byte 102,15,56,222,240
+.byte 102,15,56,222,248
+.byte 102,68,15,56,222,192
+.byte 102,68,15,56,222,200
+ movaps (%rcx),%xmm0
+ jnz .Ldec_loop8
+
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+.byte 102,15,56,222,225
+.byte 102,15,56,222,233
+.byte 102,15,56,222,241
+.byte 102,15,56,222,249
+.byte 102,68,15,56,222,193
+.byte 102,68,15,56,222,201
+.byte 102,15,56,223,208
+.byte 102,15,56,223,216
+.byte 102,15,56,223,224
+.byte 102,15,56,223,232
+.byte 102,15,56,223,240
+.byte 102,15,56,223,248
+.byte 102,68,15,56,223,192
+.byte 102,68,15,56,223,200
+ .byte 0xf3,0xc3
+.size _aesni_decrypt8,.-_aesni_decrypt8
.globl aesni_ecb_encrypt
.type aesni_ecb_encrypt,@function
.align 16
aesni_ecb_encrypt:
- cmpq $16,%rdx
- jb .Lecb_ret
+ andq $-16,%rdx
+ jz .Lecb_ret
movl 240(%rcx),%eax
- andq $-16,%rdx
+ movaps (%rcx),%xmm0
movq %rcx,%r11
- testl %r8d,%r8d
movl %eax,%r10d
+ testl %r8d,%r8d
jz .Lecb_decrypt
- subq $64,%rdx
- jbe .Lecb_enc_tail
- jmp .Lecb_enc_loop3
+ cmpq $128,%rdx
+ jb .Lecb_enc_tail
+
+ movdqu (%rdi),%xmm2
+ movdqu 16(%rdi),%xmm3
+ movdqu 32(%rdi),%xmm4
+ movdqu 48(%rdi),%xmm5
+ movdqu 64(%rdi),%xmm6
+ movdqu 80(%rdi),%xmm7
+ movdqu 96(%rdi),%xmm8
+ movdqu 112(%rdi),%xmm9
+ leaq 128(%rdi),%rdi
+ subq $128,%rdx
+ jmp .Lecb_enc_loop8_enter
.align 16
-.Lecb_enc_loop3:
- movups (%rdi),%xmm0
- movups 16(%rdi),%xmm1
- movups 32(%rdi),%xmm2
- call _aesni_encrypt3
- subq $48,%rdx
- leaq 48(%rdi),%rdi
- leaq 48(%rsi),%rsi
- movups %xmm0,-48(%rsi)
- movl %r10d,%eax
- movups %xmm1,-32(%rsi)
+.Lecb_enc_loop8:
+ movups %xmm2,(%rsi)
movq %r11,%rcx
- movups %xmm2,-16(%rsi)
- ja .Lecb_enc_loop3
+ movdqu (%rdi),%xmm2
+ movl %r10d,%eax
+ movups %xmm3,16(%rsi)
+ movdqu 16(%rdi),%xmm3
+ movups %xmm4,32(%rsi)
+ movdqu 32(%rdi),%xmm4
+ movups %xmm5,48(%rsi)
+ movdqu 48(%rdi),%xmm5
+ movups %xmm6,64(%rsi)
+ movdqu 64(%rdi),%xmm6
+ movups %xmm7,80(%rsi)
+ movdqu 80(%rdi),%xmm7
+ movups %xmm8,96(%rsi)
+ movdqu 96(%rdi),%xmm8
+ movups %xmm9,112(%rsi)
+ leaq 128(%rsi),%rsi
+ movdqu 112(%rdi),%xmm9
+ leaq 128(%rdi),%rdi
+.Lecb_enc_loop8_enter:
-.Lecb_enc_tail:
- addq $64,%rdx
+ call _aesni_encrypt8
+
+ subq $128,%rdx
+ jnc .Lecb_enc_loop8
+
+ movups %xmm2,(%rsi)
+ movq %r11,%rcx
+ movups %xmm3,16(%rsi)
+ movl %r10d,%eax
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ movups %xmm6,64(%rsi)
+ movups %xmm7,80(%rsi)
+ movups %xmm8,96(%rsi)
+ movups %xmm9,112(%rsi)
+ leaq 128(%rsi),%rsi
+ addq $128,%rdx
jz .Lecb_ret
- cmpq $16,%rdx
- movups (%rdi),%xmm0
- je .Lecb_enc_one
+.Lecb_enc_tail:
+ movups (%rdi),%xmm2
cmpq $32,%rdx
- movups 16(%rdi),%xmm1
+ jb .Lecb_enc_one
+ movups 16(%rdi),%xmm3
je .Lecb_enc_two
- cmpq $48,%rdx
- movups 32(%rdi),%xmm2
- je .Lecb_enc_three
- movups 48(%rdi),%xmm3
- call _aesni_encrypt4
- movups %xmm0,(%rsi)
- movups %xmm1,16(%rsi)
- movups %xmm2,32(%rsi)
- movups %xmm3,48(%rsi)
+ movups 32(%rdi),%xmm4
+ cmpq $64,%rdx
+ jb .Lecb_enc_three
+ movups 48(%rdi),%xmm5
+ je .Lecb_enc_four
+ movups 64(%rdi),%xmm6
+ cmpq $96,%rdx
+ jb .Lecb_enc_five
+ movups 80(%rdi),%xmm7
+ je .Lecb_enc_six
+ movdqu 96(%rdi),%xmm8
+ call _aesni_encrypt8
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ movups %xmm6,64(%rsi)
+ movups %xmm7,80(%rsi)
+ movups %xmm8,96(%rsi)
jmp .Lecb_ret
.align 16
.Lecb_enc_one:
- movaps (%rcx),%xmm4
- movaps 16(%rcx),%xmm5
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
leaq 32(%rcx),%rcx
- pxor %xmm4,%xmm0
+ xorps %xmm0,%xmm2
.Loop_enc1_3:
-.byte 102,15,56,220,197
+.byte 102,15,56,220,209
decl %eax
- movaps (%rcx),%xmm5
+ movaps (%rcx),%xmm1
leaq 16(%rcx),%rcx
jnz .Loop_enc1_3
-.byte 102,15,56,221,197
- movups %xmm0,(%rsi)
+.byte 102,15,56,221,209
+ movups %xmm2,(%rsi)
jmp .Lecb_ret
.align 16
.Lecb_enc_two:
+ xorps %xmm4,%xmm4
call _aesni_encrypt3
- movups %xmm0,(%rsi)
- movups %xmm1,16(%rsi)
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
jmp .Lecb_ret
.align 16
.Lecb_enc_three:
call _aesni_encrypt3
- movups %xmm0,(%rsi)
- movups %xmm1,16(%rsi)
- movups %xmm2,32(%rsi)
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ jmp .Lecb_ret
+.align 16
+.Lecb_enc_four:
+ call _aesni_encrypt4
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ jmp .Lecb_ret
+.align 16
+.Lecb_enc_five:
+ xorps %xmm7,%xmm7
+ call _aesni_encrypt6
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ movups %xmm6,64(%rsi)
+ jmp .Lecb_ret
+.align 16
+.Lecb_enc_six:
+ call _aesni_encrypt6
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ movups %xmm6,64(%rsi)
+ movups %xmm7,80(%rsi)
jmp .Lecb_ret
.align 16
.Lecb_decrypt:
- subq $64,%rdx
- jbe .Lecb_dec_tail
- jmp .Lecb_dec_loop3
+ cmpq $128,%rdx
+ jb .Lecb_dec_tail
+
+ movdqu (%rdi),%xmm2
+ movdqu 16(%rdi),%xmm3
+ movdqu 32(%rdi),%xmm4
+ movdqu 48(%rdi),%xmm5
+ movdqu 64(%rdi),%xmm6
+ movdqu 80(%rdi),%xmm7
+ movdqu 96(%rdi),%xmm8
+ movdqu 112(%rdi),%xmm9
+ leaq 128(%rdi),%rdi
+ subq $128,%rdx
+ jmp .Lecb_dec_loop8_enter
.align 16
-.Lecb_dec_loop3:
- movups (%rdi),%xmm0
- movups 16(%rdi),%xmm1
- movups 32(%rdi),%xmm2
- call _aesni_decrypt3
- subq $48,%rdx
- leaq 48(%rdi),%rdi
- leaq 48(%rsi),%rsi
- movups %xmm0,-48(%rsi)
- movl %r10d,%eax
- movups %xmm1,-32(%rsi)
+.Lecb_dec_loop8:
+ movups %xmm2,(%rsi)
movq %r11,%rcx
- movups %xmm2,-16(%rsi)
- ja .Lecb_dec_loop3
+ movdqu (%rdi),%xmm2
+ movl %r10d,%eax
+ movups %xmm3,16(%rsi)
+ movdqu 16(%rdi),%xmm3
+ movups %xmm4,32(%rsi)
+ movdqu 32(%rdi),%xmm4
+ movups %xmm5,48(%rsi)
+ movdqu 48(%rdi),%xmm5
+ movups %xmm6,64(%rsi)
+ movdqu 64(%rdi),%xmm6
+ movups %xmm7,80(%rsi)
+ movdqu 80(%rdi),%xmm7
+ movups %xmm8,96(%rsi)
+ movdqu 96(%rdi),%xmm8
+ movups %xmm9,112(%rsi)
+ leaq 128(%rsi),%rsi
+ movdqu 112(%rdi),%xmm9
+ leaq 128(%rdi),%rdi
+.Lecb_dec_loop8_enter:
-.Lecb_dec_tail:
- addq $64,%rdx
+ call _aesni_decrypt8
+
+ movaps (%r11),%xmm0
+ subq $128,%rdx
+ jnc .Lecb_dec_loop8
+
+ movups %xmm2,(%rsi)
+ movq %r11,%rcx
+ movups %xmm3,16(%rsi)
+ movl %r10d,%eax
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ movups %xmm6,64(%rsi)
+ movups %xmm7,80(%rsi)
+ movups %xmm8,96(%rsi)
+ movups %xmm9,112(%rsi)
+ leaq 128(%rsi),%rsi
+ addq $128,%rdx
jz .Lecb_ret
- cmpq $16,%rdx
- movups (%rdi),%xmm0
- je .Lecb_dec_one
+.Lecb_dec_tail:
+ movups (%rdi),%xmm2
cmpq $32,%rdx
- movups 16(%rdi),%xmm1
+ jb .Lecb_dec_one
+ movups 16(%rdi),%xmm3
je .Lecb_dec_two
- cmpq $48,%rdx
- movups 32(%rdi),%xmm2
- je .Lecb_dec_three
- movups 48(%rdi),%xmm3
- call _aesni_decrypt4
- movups %xmm0,(%rsi)
- movups %xmm1,16(%rsi)
- movups %xmm2,32(%rsi)
- movups %xmm3,48(%rsi)
+ movups 32(%rdi),%xmm4
+ cmpq $64,%rdx
+ jb .Lecb_dec_three
+ movups 48(%rdi),%xmm5
+ je .Lecb_dec_four
+ movups 64(%rdi),%xmm6
+ cmpq $96,%rdx
+ jb .Lecb_dec_five
+ movups 80(%rdi),%xmm7
+ je .Lecb_dec_six
+ movups 96(%rdi),%xmm8
+ movaps (%rcx),%xmm0
+ call _aesni_decrypt8
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ movups %xmm6,64(%rsi)
+ movups %xmm7,80(%rsi)
+ movups %xmm8,96(%rsi)
jmp .Lecb_ret
.align 16
.Lecb_dec_one:
- movaps (%rcx),%xmm4
- movaps 16(%rcx),%xmm5
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
leaq 32(%rcx),%rcx
- pxor %xmm4,%xmm0
+ xorps %xmm0,%xmm2
.Loop_dec1_4:
-.byte 102,15,56,222,197
+.byte 102,15,56,222,209
decl %eax
- movaps (%rcx),%xmm5
+ movaps (%rcx),%xmm1
leaq 16(%rcx),%rcx
jnz .Loop_dec1_4
-.byte 102,15,56,223,197
- movups %xmm0,(%rsi)
+.byte 102,15,56,223,209
+ movups %xmm2,(%rsi)
jmp .Lecb_ret
.align 16
.Lecb_dec_two:
+ xorps %xmm4,%xmm4
call _aesni_decrypt3
- movups %xmm0,(%rsi)
- movups %xmm1,16(%rsi)
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
jmp .Lecb_ret
.align 16
.Lecb_dec_three:
call _aesni_decrypt3
- movups %xmm0,(%rsi)
- movups %xmm1,16(%rsi)
- movups %xmm2,32(%rsi)
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ jmp .Lecb_ret
+.align 16
+.Lecb_dec_four:
+ call _aesni_decrypt4
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ jmp .Lecb_ret
+.align 16
+.Lecb_dec_five:
+ xorps %xmm7,%xmm7
+ call _aesni_decrypt6
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ movups %xmm6,64(%rsi)
+ jmp .Lecb_ret
+.align 16
+.Lecb_dec_six:
+ call _aesni_decrypt6
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ movups %xmm6,64(%rsi)
+ movups %xmm7,80(%rsi)
.Lecb_ret:
.byte 0xf3,0xc3
.size aesni_ecb_encrypt,.-aesni_ecb_encrypt
+.globl aesni_ccm64_encrypt_blocks
+.type aesni_ccm64_encrypt_blocks,@function
+.align 16
+aesni_ccm64_encrypt_blocks:
+ movdqu (%r8),%xmm9
+ movdqu (%r9),%xmm3
+ movdqa .Lincrement64(%rip),%xmm8
+ movdqa .Lbswap_mask(%rip),%xmm9
+.byte 102,69,15,56,0,201
+
+ movl 240(%rcx),%eax
+ movq %rcx,%r11
+ movl %eax,%r10d
+ movdqa %xmm9,%xmm2
+
+.Lccm64_enc_outer:
+ movups (%rdi),%xmm8
+.byte 102,65,15,56,0,209
+ movq %r11,%rcx
+ movl %r10d,%eax
+
+ movaps (%rcx),%xmm0
+ shrl $1,%eax
+ movaps 16(%rcx),%xmm1
+ xorps %xmm0,%xmm8
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+ xorps %xmm3,%xmm8
+ movaps (%rcx),%xmm0
+
+.Lccm64_enc2_loop:
+.byte 102,15,56,220,209
+ decl %eax
+.byte 102,15,56,220,217
+ movaps 16(%rcx),%xmm1
+.byte 102,15,56,220,208
+ leaq 32(%rcx),%rcx
+.byte 102,15,56,220,216
+ movaps 0(%rcx),%xmm0
+ jnz .Lccm64_enc2_loop
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+.byte 102,15,56,221,208
+.byte 102,15,56,221,216
+
+ paddq %xmm8,%xmm9
+ decq %rdx
+ leaq 16(%rdi),%rdi
+ xorps %xmm2,%xmm8
+ movdqa %xmm9,%xmm2
+ movups %xmm8,(%rsi)
+ leaq 16(%rsi),%rsi
+ jnz .Lccm64_enc_outer
+
+ movups %xmm3,(%r9)
+ .byte 0xf3,0xc3
+.size aesni_ccm64_encrypt_blocks,.-aesni_ccm64_encrypt_blocks
+.globl aesni_ccm64_decrypt_blocks
+.type aesni_ccm64_decrypt_blocks,@function
+.align 16
+aesni_ccm64_decrypt_blocks:
+ movdqu (%r8),%xmm9
+ movdqu (%r9),%xmm3
+ movdqa .Lincrement64(%rip),%xmm8
+ movdqa .Lbswap_mask(%rip),%xmm9
+
+ movl 240(%rcx),%eax
+ movdqa %xmm9,%xmm2
+.byte 102,69,15,56,0,201
+ movl %eax,%r10d
+ movq %rcx,%r11
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+.Loop_enc1_5:
+.byte 102,15,56,220,209
+ decl %eax
+ movaps (%rcx),%xmm1
+ leaq 16(%rcx),%rcx
+ jnz .Loop_enc1_5
+.byte 102,15,56,221,209
+.Lccm64_dec_outer:
+ paddq %xmm8,%xmm9
+ movups (%rdi),%xmm8
+ xorps %xmm2,%xmm8
+ movdqa %xmm9,%xmm2
+ leaq 16(%rdi),%rdi
+.byte 102,65,15,56,0,209
+ movq %r11,%rcx
+ movl %r10d,%eax
+ movups %xmm8,(%rsi)
+ leaq 16(%rsi),%rsi
+
+ subq $1,%rdx
+ jz .Lccm64_dec_break
+
+ movaps (%rcx),%xmm0
+ shrl $1,%eax
+ movaps 16(%rcx),%xmm1
+ xorps %xmm0,%xmm8
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+ xorps %xmm8,%xmm3
+ movaps (%rcx),%xmm0
+
+.Lccm64_dec2_loop:
+.byte 102,15,56,220,209
+ decl %eax
+.byte 102,15,56,220,217
+ movaps 16(%rcx),%xmm1
+.byte 102,15,56,220,208
+ leaq 32(%rcx),%rcx
+.byte 102,15,56,220,216
+ movaps 0(%rcx),%xmm0
+ jnz .Lccm64_dec2_loop
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+.byte 102,15,56,221,208
+ jmp .Lccm64_dec_outer
+
+.align 16
+.Lccm64_dec_break:
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm3
+.Loop_enc1_6:
+.byte 102,15,56,220,217
+ decl %eax
+ movaps (%rcx),%xmm1
+ leaq 16(%rcx),%rcx
+ jnz .Loop_enc1_6
+.byte 102,15,56,221,217
+ movups %xmm3,(%r9)
+ .byte 0xf3,0xc3
+.size aesni_ccm64_decrypt_blocks,.-aesni_ccm64_decrypt_blocks
+.globl aesni_ctr32_encrypt_blocks
+.type aesni_ctr32_encrypt_blocks,@function
+.align 16
+aesni_ctr32_encrypt_blocks:
+ cmpq $1,%rdx
+ je .Lctr32_one_shortcut
+
+ movdqu (%r8),%xmm14
+ movdqa .Lbswap_mask(%rip),%xmm15
+ xorl %eax,%eax
+.byte 102,69,15,58,22,242,3
+.byte 102,68,15,58,34,240,3
+
+ movl 240(%rcx),%eax
+ bswapl %r10d
+ pxor %xmm12,%xmm12
+ pxor %xmm13,%xmm13
+.byte 102,69,15,58,34,226,0
+ leaq 3(%r10),%r11
+.byte 102,69,15,58,34,235,0
+ incl %r10d
+.byte 102,69,15,58,34,226,1
+ incq %r11
+.byte 102,69,15,58,34,235,1
+ incl %r10d
+.byte 102,69,15,58,34,226,2
+ incq %r11
+.byte 102,69,15,58,34,235,2
+ movdqa %xmm12,-40(%rsp)
+.byte 102,69,15,56,0,231
+ movdqa %xmm13,-24(%rsp)
+.byte 102,69,15,56,0,239
+
+ pshufd $192,%xmm12,%xmm2
+ pshufd $128,%xmm12,%xmm3
+ pshufd $64,%xmm12,%xmm4
+ cmpq $6,%rdx
+ jb .Lctr32_tail
+ shrl $1,%eax
+ movq %rcx,%r11
+ movl %eax,%r10d
+ subq $6,%rdx
+ jmp .Lctr32_loop6
+
+.align 16
+.Lctr32_loop6:
+ pshufd $192,%xmm13,%xmm5
+ por %xmm14,%xmm2
+ movaps (%r11),%xmm0
+ pshufd $128,%xmm13,%xmm6
+ por %xmm14,%xmm3
+ movaps 16(%r11),%xmm1
+ pshufd $64,%xmm13,%xmm7
+ por %xmm14,%xmm4
+ por %xmm14,%xmm5
+ xorps %xmm0,%xmm2
+ por %xmm14,%xmm6
+ por %xmm14,%xmm7
+
+
+
+
+ pxor %xmm0,%xmm3
+.byte 102,15,56,220,209
+ leaq 32(%r11),%rcx
+ pxor %xmm0,%xmm4
+.byte 102,15,56,220,217
+ movdqa .Lincrement32(%rip),%xmm13
+ pxor %xmm0,%xmm5
+.byte 102,15,56,220,225
+ movdqa -40(%rsp),%xmm12
+ pxor %xmm0,%xmm6
+.byte 102,15,56,220,233
+ pxor %xmm0,%xmm7
+ movaps (%rcx),%xmm0
+ decl %eax
+.byte 102,15,56,220,241
+.byte 102,15,56,220,249
+ jmp .Lctr32_enc_loop6_enter
+.align 16
+.Lctr32_enc_loop6:
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+ decl %eax
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+.byte 102,15,56,220,241
+.byte 102,15,56,220,249
+.Lctr32_enc_loop6_enter:
+ movaps 16(%rcx),%xmm1
+.byte 102,15,56,220,208
+.byte 102,15,56,220,216
+ leaq 32(%rcx),%rcx
+.byte 102,15,56,220,224
+.byte 102,15,56,220,232
+.byte 102,15,56,220,240
+.byte 102,15,56,220,248
+ movaps (%rcx),%xmm0
+ jnz .Lctr32_enc_loop6
+
+.byte 102,15,56,220,209
+ paddd %xmm13,%xmm12
+.byte 102,15,56,220,217
+ paddd -24(%rsp),%xmm13
+.byte 102,15,56,220,225
+ movdqa %xmm12,-40(%rsp)
+.byte 102,15,56,220,233
+ movdqa %xmm13,-24(%rsp)
+.byte 102,15,56,220,241
+.byte 102,69,15,56,0,231
+.byte 102,15,56,220,249
+.byte 102,69,15,56,0,239
+
+.byte 102,15,56,221,208
+ movups (%rdi),%xmm8
+.byte 102,15,56,221,216
+ movups 16(%rdi),%xmm9
+.byte 102,15,56,221,224
+ movups 32(%rdi),%xmm10
+.byte 102,15,56,221,232
+ movups 48(%rdi),%xmm11
+.byte 102,15,56,221,240
+ movups 64(%rdi),%xmm1
+.byte 102,15,56,221,248
+ movups 80(%rdi),%xmm0
+ leaq 96(%rdi),%rdi
+
+ xorps %xmm2,%xmm8
+ pshufd $192,%xmm12,%xmm2
+ xorps %xmm3,%xmm9
+ pshufd $128,%xmm12,%xmm3
+ movups %xmm8,(%rsi)
+ xorps %xmm4,%xmm10
+ pshufd $64,%xmm12,%xmm4
+ movups %xmm9,16(%rsi)
+ xorps %xmm5,%xmm11
+ movups %xmm10,32(%rsi)
+ xorps %xmm6,%xmm1
+ movups %xmm11,48(%rsi)
+ xorps %xmm7,%xmm0
+ movups %xmm1,64(%rsi)
+ movups %xmm0,80(%rsi)
+ leaq 96(%rsi),%rsi
+ movl %r10d,%eax
+ subq $6,%rdx
+ jnc .Lctr32_loop6
+
+ addq $6,%rdx
+ jz .Lctr32_done
+ movq %r11,%rcx
+ leal 1(%rax,%rax,1),%eax
+
+.Lctr32_tail:
+ por %xmm14,%xmm2
+ movups (%rdi),%xmm8
+ cmpq $2,%rdx
+ jb .Lctr32_one
+
+ por %xmm14,%xmm3
+ movups 16(%rdi),%xmm9
+ je .Lctr32_two
+
+ pshufd $192,%xmm13,%xmm5
+ por %xmm14,%xmm4
+ movups 32(%rdi),%xmm10
+ cmpq $4,%rdx
+ jb .Lctr32_three
+
+ pshufd $128,%xmm13,%xmm6
+ por %xmm14,%xmm5
+ movups 48(%rdi),%xmm11
+ je .Lctr32_four
+
+ por %xmm14,%xmm6
+ xorps %xmm7,%xmm7
+
+ call _aesni_encrypt6
+
+ movups 64(%rdi),%xmm1
+ xorps %xmm2,%xmm8
+ xorps %xmm3,%xmm9
+ movups %xmm8,(%rsi)
+ xorps %xmm4,%xmm10
+ movups %xmm9,16(%rsi)
+ xorps %xmm5,%xmm11
+ movups %xmm10,32(%rsi)
+ xorps %xmm6,%xmm1
+ movups %xmm11,48(%rsi)
+ movups %xmm1,64(%rsi)
+ jmp .Lctr32_done
+
+.align 16
+.Lctr32_one_shortcut:
+ movups (%r8),%xmm2
+ movups (%rdi),%xmm8
+ movl 240(%rcx),%eax
+.Lctr32_one:
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+.Loop_enc1_7:
+.byte 102,15,56,220,209
+ decl %eax
+ movaps (%rcx),%xmm1
+ leaq 16(%rcx),%rcx
+ jnz .Loop_enc1_7
+.byte 102,15,56,221,209
+ xorps %xmm2,%xmm8
+ movups %xmm8,(%rsi)
+ jmp .Lctr32_done
+
+.align 16
+.Lctr32_two:
+ xorps %xmm4,%xmm4
+ call _aesni_encrypt3
+ xorps %xmm2,%xmm8
+ xorps %xmm3,%xmm9
+ movups %xmm8,(%rsi)
+ movups %xmm9,16(%rsi)
+ jmp .Lctr32_done
+
+.align 16
+.Lctr32_three:
+ call _aesni_encrypt3
+ xorps %xmm2,%xmm8
+ xorps %xmm3,%xmm9
+ movups %xmm8,(%rsi)
+ xorps %xmm4,%xmm10
+ movups %xmm9,16(%rsi)
+ movups %xmm10,32(%rsi)
+ jmp .Lctr32_done
+
+.align 16
+.Lctr32_four:
+ call _aesni_encrypt4
+ xorps %xmm2,%xmm8
+ xorps %xmm3,%xmm9
+ movups %xmm8,(%rsi)
+ xorps %xmm4,%xmm10
+ movups %xmm9,16(%rsi)
+ xorps %xmm5,%xmm11
+ movups %xmm10,32(%rsi)
+ movups %xmm11,48(%rsi)
+
+.Lctr32_done:
+ .byte 0xf3,0xc3
+.size aesni_ctr32_encrypt_blocks,.-aesni_ctr32_encrypt_blocks
+.globl aesni_xts_encrypt
+.type aesni_xts_encrypt,@function
+.align 16
+aesni_xts_encrypt:
+ leaq -104(%rsp),%rsp
+ movups (%r9),%xmm15
+ movl 240(%r8),%eax
+ movl 240(%rcx),%r10d
+ movaps (%r8),%xmm0
+ movaps 16(%r8),%xmm1
+ leaq 32(%r8),%r8
+ xorps %xmm0,%xmm15
+.Loop_enc1_8:
+.byte 102,68,15,56,220,249
+ decl %eax
+ movaps (%r8),%xmm1
+ leaq 16(%r8),%r8
+ jnz .Loop_enc1_8
+.byte 102,68,15,56,221,249
+ movq %rcx,%r11
+ movl %r10d,%eax
+ movq %rdx,%r9
+ andq $-16,%rdx
+
+ movdqa .Lxts_magic(%rip),%xmm8
+ pxor %xmm14,%xmm14
+ pcmpgtd %xmm15,%xmm14
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm10
+ paddq %xmm15,%xmm15
+ pand %xmm8,%xmm9
+ pcmpgtd %xmm15,%xmm14
+ pxor %xmm9,%xmm15
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm11
+ paddq %xmm15,%xmm15
+ pand %xmm8,%xmm9
+ pcmpgtd %xmm15,%xmm14
+ pxor %xmm9,%xmm15
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm12
+ paddq %xmm15,%xmm15
+ pand %xmm8,%xmm9
+ pcmpgtd %xmm15,%xmm14
+ pxor %xmm9,%xmm15
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm13
+ paddq %xmm15,%xmm15
+ pand %xmm8,%xmm9
+ pcmpgtd %xmm15,%xmm14
+ pxor %xmm9,%xmm15
+ subq $96,%rdx
+ jc .Lxts_enc_short
+
+ shrl $1,%eax
+ subl $1,%eax
+ movl %eax,%r10d
+ jmp .Lxts_enc_grandloop
+
+.align 16
+.Lxts_enc_grandloop:
+ pshufd $19,%xmm14,%xmm9
+ movdqa %xmm15,%xmm14
+ paddq %xmm15,%xmm15
+ movdqu 0(%rdi),%xmm2
+ pand %xmm8,%xmm9
+ movdqu 16(%rdi),%xmm3
+ pxor %xmm9,%xmm15
+
+ movdqu 32(%rdi),%xmm4
+ pxor %xmm10,%xmm2
+ movdqu 48(%rdi),%xmm5
+ pxor %xmm11,%xmm3
+ movdqu 64(%rdi),%xmm6
+ pxor %xmm12,%xmm4
+ movdqu 80(%rdi),%xmm7
+ leaq 96(%rdi),%rdi
+ pxor %xmm13,%xmm5
+ movaps (%r11),%xmm0
+ pxor %xmm14,%xmm6
+ pxor %xmm15,%xmm7
+
+
+
+ movaps 16(%r11),%xmm1
+ pxor %xmm0,%xmm2
+ pxor %xmm0,%xmm3
+ movdqa %xmm10,0(%rsp)
+.byte 102,15,56,220,209
+ leaq 32(%r11),%rcx
+ pxor %xmm0,%xmm4
+ movdqa %xmm11,16(%rsp)
+.byte 102,15,56,220,217
+ pxor %xmm0,%xmm5
+ movdqa %xmm12,32(%rsp)
+.byte 102,15,56,220,225
+ pxor %xmm0,%xmm6
+ movdqa %xmm13,48(%rsp)
+.byte 102,15,56,220,233
+ pxor %xmm0,%xmm7
+ movaps (%rcx),%xmm0
+ decl %eax
+ movdqa %xmm14,64(%rsp)
+.byte 102,15,56,220,241
+ movdqa %xmm15,80(%rsp)
+.byte 102,15,56,220,249
+ pxor %xmm14,%xmm14
+ pcmpgtd %xmm15,%xmm14
+ jmp .Lxts_enc_loop6_enter
+
+.align 16
+.Lxts_enc_loop6:
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+ decl %eax
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+.byte 102,15,56,220,241
+.byte 102,15,56,220,249
+.Lxts_enc_loop6_enter:
+ movaps 16(%rcx),%xmm1
+.byte 102,15,56,220,208
+.byte 102,15,56,220,216
+ leaq 32(%rcx),%rcx
+.byte 102,15,56,220,224
+.byte 102,15,56,220,232
+.byte 102,15,56,220,240
+.byte 102,15,56,220,248
+ movaps (%rcx),%xmm0
+ jnz .Lxts_enc_loop6
+
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ paddq %xmm15,%xmm15
+.byte 102,15,56,220,209
+ pand %xmm8,%xmm9
+.byte 102,15,56,220,217
+ pcmpgtd %xmm15,%xmm14
+.byte 102,15,56,220,225
+ pxor %xmm9,%xmm15
+.byte 102,15,56,220,233
+.byte 102,15,56,220,241
+.byte 102,15,56,220,249
+ movaps 16(%rcx),%xmm1
+
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm10
+ paddq %xmm15,%xmm15
+.byte 102,15,56,220,208
+ pand %xmm8,%xmm9
+.byte 102,15,56,220,216
+ pcmpgtd %xmm15,%xmm14
+.byte 102,15,56,220,224
+ pxor %xmm9,%xmm15
+.byte 102,15,56,220,232
+.byte 102,15,56,220,240
+.byte 102,15,56,220,248
+ movaps 32(%rcx),%xmm0
+
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm11
+ paddq %xmm15,%xmm15
+.byte 102,15,56,220,209
+ pand %xmm8,%xmm9
+.byte 102,15,56,220,217
+ pcmpgtd %xmm15,%xmm14
+.byte 102,15,56,220,225
+ pxor %xmm9,%xmm15
+.byte 102,15,56,220,233
+.byte 102,15,56,220,241
+.byte 102,15,56,220,249
+
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm12
+ paddq %xmm15,%xmm15
+.byte 102,15,56,221,208
+ pand %xmm8,%xmm9
+.byte 102,15,56,221,216
+ pcmpgtd %xmm15,%xmm14
+.byte 102,15,56,221,224
+ pxor %xmm9,%xmm15
+.byte 102,15,56,221,232
+.byte 102,15,56,221,240
+.byte 102,15,56,221,248
+
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm13
+ paddq %xmm15,%xmm15
+ xorps 0(%rsp),%xmm2
+ pand %xmm8,%xmm9
+ xorps 16(%rsp),%xmm3
+ pcmpgtd %xmm15,%xmm14
+ pxor %xmm9,%xmm15
+
+ xorps 32(%rsp),%xmm4
+ movups %xmm2,0(%rsi)
+ xorps 48(%rsp),%xmm5
+ movups %xmm3,16(%rsi)
+ xorps 64(%rsp),%xmm6
+ movups %xmm4,32(%rsi)
+ xorps 80(%rsp),%xmm7
+ movups %xmm5,48(%rsi)
+ movl %r10d,%eax
+ movups %xmm6,64(%rsi)
+ movups %xmm7,80(%rsi)
+ leaq 96(%rsi),%rsi
+ subq $96,%rdx
+ jnc .Lxts_enc_grandloop
+
+ leal 3(%rax,%rax,1),%eax
+ movq %r11,%rcx
+ movl %eax,%r10d
+
+.Lxts_enc_short:
+ addq $96,%rdx
+ jz .Lxts_enc_done
+
+ cmpq $32,%rdx
+ jb .Lxts_enc_one
+ je .Lxts_enc_two
+
+ cmpq $64,%rdx
+ jb .Lxts_enc_three
+ je .Lxts_enc_four
+
+ pshufd $19,%xmm14,%xmm9
+ movdqa %xmm15,%xmm14
+ paddq %xmm15,%xmm15
+ movdqu (%rdi),%xmm2
+ pand %xmm8,%xmm9
+ movdqu 16(%rdi),%xmm3
+ pxor %xmm9,%xmm15
+
+ movdqu 32(%rdi),%xmm4
+ pxor %xmm10,%xmm2
+ movdqu 48(%rdi),%xmm5
+ pxor %xmm11,%xmm3
+ movdqu 64(%rdi),%xmm6
+ leaq 80(%rdi),%rdi
+ pxor %xmm12,%xmm4
+ pxor %xmm13,%xmm5
+ pxor %xmm14,%xmm6
+
+ call _aesni_encrypt6
+
+ xorps %xmm10,%xmm2
+ movdqa %xmm15,%xmm10
+ xorps %xmm11,%xmm3
+ xorps %xmm12,%xmm4
+ movdqu %xmm2,(%rsi)
+ xorps %xmm13,%xmm5
+ movdqu %xmm3,16(%rsi)
+ xorps %xmm14,%xmm6
+ movdqu %xmm4,32(%rsi)
+ movdqu %xmm5,48(%rsi)
+ movdqu %xmm6,64(%rsi)
+ leaq 80(%rsi),%rsi
+ jmp .Lxts_enc_done
+
+.align 16
+.Lxts_enc_one:
+ movups (%rdi),%xmm2
+ leaq 16(%rdi),%rdi
+ xorps %xmm10,%xmm2
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+.Loop_enc1_9:
+.byte 102,15,56,220,209
+ decl %eax
+ movaps (%rcx),%xmm1
+ leaq 16(%rcx),%rcx
+ jnz .Loop_enc1_9
+.byte 102,15,56,221,209
+ xorps %xmm10,%xmm2
+ movdqa %xmm11,%xmm10
+ movups %xmm2,(%rsi)
+ leaq 16(%rsi),%rsi
+ jmp .Lxts_enc_done
+
+.align 16
+.Lxts_enc_two:
+ movups (%rdi),%xmm2
+ movups 16(%rdi),%xmm3
+ leaq 32(%rdi),%rdi
+ xorps %xmm10,%xmm2
+ xorps %xmm11,%xmm3
+
+ call _aesni_encrypt3
+
+ xorps %xmm10,%xmm2
+ movdqa %xmm12,%xmm10
+ xorps %xmm11,%xmm3
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ leaq 32(%rsi),%rsi
+ jmp .Lxts_enc_done
+
+.align 16
+.Lxts_enc_three:
+ movups (%rdi),%xmm2
+ movups 16(%rdi),%xmm3
+ movups 32(%rdi),%xmm4
+ leaq 48(%rdi),%rdi
+ xorps %xmm10,%xmm2
+ xorps %xmm11,%xmm3
+ xorps %xmm12,%xmm4
+
+ call _aesni_encrypt3
+
+ xorps %xmm10,%xmm2
+ movdqa %xmm13,%xmm10
+ xorps %xmm11,%xmm3
+ xorps %xmm12,%xmm4
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ leaq 48(%rsi),%rsi
+ jmp .Lxts_enc_done
+
+.align 16
+.Lxts_enc_four:
+ movups (%rdi),%xmm2
+ movups 16(%rdi),%xmm3
+ movups 32(%rdi),%xmm4
+ xorps %xmm10,%xmm2
+ movups 48(%rdi),%xmm5
+ leaq 64(%rdi),%rdi
+ xorps %xmm11,%xmm3
+ xorps %xmm12,%xmm4
+ xorps %xmm13,%xmm5
+
+ call _aesni_encrypt4
+
+ xorps %xmm10,%xmm2
+ movdqa %xmm15,%xmm10
+ xorps %xmm11,%xmm3
+ xorps %xmm12,%xmm4
+ movups %xmm2,(%rsi)
+ xorps %xmm13,%xmm5
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ leaq 64(%rsi),%rsi
+ jmp .Lxts_enc_done
+
+.align 16
+.Lxts_enc_done:
+ andq $15,%r9
+ jz .Lxts_enc_ret
+ movq %r9,%rdx
+
+.Lxts_enc_steal:
+ movzbl (%rdi),%eax
+ movzbl -16(%rsi),%ecx
+ leaq 1(%rdi),%rdi
+ movb %al,-16(%rsi)
+ movb %cl,0(%rsi)
+ leaq 1(%rsi),%rsi
+ subq $1,%rdx
+ jnz .Lxts_enc_steal
+
+ subq %r9,%rsi
+ movq %r11,%rcx
+ movl %r10d,%eax
+
+ movups -16(%rsi),%xmm2
+ xorps %xmm10,%xmm2
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+.Loop_enc1_10:
+.byte 102,15,56,220,209
+ decl %eax
+ movaps (%rcx),%xmm1
+ leaq 16(%rcx),%rcx
+ jnz .Loop_enc1_10
+.byte 102,15,56,221,209
+ xorps %xmm10,%xmm2
+ movups %xmm2,-16(%rsi)
+
+.Lxts_enc_ret:
+ leaq 104(%rsp),%rsp
+.Lxts_enc_epilogue:
+ .byte 0xf3,0xc3
+.size aesni_xts_encrypt,.-aesni_xts_encrypt
+.globl aesni_xts_decrypt
+.type aesni_xts_decrypt,@function
+.align 16
+aesni_xts_decrypt:
+ leaq -104(%rsp),%rsp
+ movups (%r9),%xmm15
+ movl 240(%r8),%eax
+ movl 240(%rcx),%r10d
+ movaps (%r8),%xmm0
+ movaps 16(%r8),%xmm1
+ leaq 32(%r8),%r8
+ xorps %xmm0,%xmm15
+.Loop_enc1_11:
+.byte 102,68,15,56,220,249
+ decl %eax
+ movaps (%r8),%xmm1
+ leaq 16(%r8),%r8
+ jnz .Loop_enc1_11
+.byte 102,68,15,56,221,249
+ xorl %eax,%eax
+ testq $15,%rdx
+ setnz %al
+ shlq $4,%rax
+ subq %rax,%rdx
+
+ movq %rcx,%r11
+ movl %r10d,%eax
+ movq %rdx,%r9
+ andq $-16,%rdx
+
+ movdqa .Lxts_magic(%rip),%xmm8
+ pxor %xmm14,%xmm14
+ pcmpgtd %xmm15,%xmm14
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm10
+ paddq %xmm15,%xmm15
+ pand %xmm8,%xmm9
+ pcmpgtd %xmm15,%xmm14
+ pxor %xmm9,%xmm15
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm11
+ paddq %xmm15,%xmm15
+ pand %xmm8,%xmm9
+ pcmpgtd %xmm15,%xmm14
+ pxor %xmm9,%xmm15
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm12
+ paddq %xmm15,%xmm15
+ pand %xmm8,%xmm9
+ pcmpgtd %xmm15,%xmm14
+ pxor %xmm9,%xmm15
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm13
+ paddq %xmm15,%xmm15
+ pand %xmm8,%xmm9
+ pcmpgtd %xmm15,%xmm14
+ pxor %xmm9,%xmm15
+ subq $96,%rdx
+ jc .Lxts_dec_short
+
+ shrl $1,%eax
+ subl $1,%eax
+ movl %eax,%r10d
+ jmp .Lxts_dec_grandloop
+
+.align 16
+.Lxts_dec_grandloop:
+ pshufd $19,%xmm14,%xmm9
+ movdqa %xmm15,%xmm14
+ paddq %xmm15,%xmm15
+ movdqu 0(%rdi),%xmm2
+ pand %xmm8,%xmm9
+ movdqu 16(%rdi),%xmm3
+ pxor %xmm9,%xmm15
+
+ movdqu 32(%rdi),%xmm4
+ pxor %xmm10,%xmm2
+ movdqu 48(%rdi),%xmm5
+ pxor %xmm11,%xmm3
+ movdqu 64(%rdi),%xmm6
+ pxor %xmm12,%xmm4
+ movdqu 80(%rdi),%xmm7
+ leaq 96(%rdi),%rdi
+ pxor %xmm13,%xmm5
+ movaps (%r11),%xmm0
+ pxor %xmm14,%xmm6
+ pxor %xmm15,%xmm7
+
+
+
+ movaps 16(%r11),%xmm1
+ pxor %xmm0,%xmm2
+ pxor %xmm0,%xmm3
+ movdqa %xmm10,0(%rsp)
+.byte 102,15,56,222,209
+ leaq 32(%r11),%rcx
+ pxor %xmm0,%xmm4
+ movdqa %xmm11,16(%rsp)
+.byte 102,15,56,222,217
+ pxor %xmm0,%xmm5
+ movdqa %xmm12,32(%rsp)
+.byte 102,15,56,222,225
+ pxor %xmm0,%xmm6
+ movdqa %xmm13,48(%rsp)
+.byte 102,15,56,222,233
+ pxor %xmm0,%xmm7
+ movaps (%rcx),%xmm0
+ decl %eax
+ movdqa %xmm14,64(%rsp)
+.byte 102,15,56,222,241
+ movdqa %xmm15,80(%rsp)
+.byte 102,15,56,222,249
+ pxor %xmm14,%xmm14
+ pcmpgtd %xmm15,%xmm14
+ jmp .Lxts_dec_loop6_enter
+
+.align 16
+.Lxts_dec_loop6:
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+ decl %eax
+.byte 102,15,56,222,225
+.byte 102,15,56,222,233
+.byte 102,15,56,222,241
+.byte 102,15,56,222,249
+.Lxts_dec_loop6_enter:
+ movaps 16(%rcx),%xmm1
+.byte 102,15,56,222,208
+.byte 102,15,56,222,216
+ leaq 32(%rcx),%rcx
+.byte 102,15,56,222,224
+.byte 102,15,56,222,232
+.byte 102,15,56,222,240
+.byte 102,15,56,222,248
+ movaps (%rcx),%xmm0
+ jnz .Lxts_dec_loop6
+
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ paddq %xmm15,%xmm15
+.byte 102,15,56,222,209
+ pand %xmm8,%xmm9
+.byte 102,15,56,222,217
+ pcmpgtd %xmm15,%xmm14
+.byte 102,15,56,222,225
+ pxor %xmm9,%xmm15
+.byte 102,15,56,222,233
+.byte 102,15,56,222,241
+.byte 102,15,56,222,249
+ movaps 16(%rcx),%xmm1
+
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm10
+ paddq %xmm15,%xmm15
+.byte 102,15,56,222,208
+ pand %xmm8,%xmm9
+.byte 102,15,56,222,216
+ pcmpgtd %xmm15,%xmm14
+.byte 102,15,56,222,224
+ pxor %xmm9,%xmm15
+.byte 102,15,56,222,232
+.byte 102,15,56,222,240
+.byte 102,15,56,222,248
+ movaps 32(%rcx),%xmm0
+
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm11
+ paddq %xmm15,%xmm15
+.byte 102,15,56,222,209
+ pand %xmm8,%xmm9
+.byte 102,15,56,222,217
+ pcmpgtd %xmm15,%xmm14
+.byte 102,15,56,222,225
+ pxor %xmm9,%xmm15
+.byte 102,15,56,222,233
+.byte 102,15,56,222,241
+.byte 102,15,56,222,249
+
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm12
+ paddq %xmm15,%xmm15
+.byte 102,15,56,223,208
+ pand %xmm8,%xmm9
+.byte 102,15,56,223,216
+ pcmpgtd %xmm15,%xmm14
+.byte 102,15,56,223,224
+ pxor %xmm9,%xmm15
+.byte 102,15,56,223,232
+.byte 102,15,56,223,240
+.byte 102,15,56,223,248
+
+ pshufd $19,%xmm14,%xmm9
+ pxor %xmm14,%xmm14
+ movdqa %xmm15,%xmm13
+ paddq %xmm15,%xmm15
+ xorps 0(%rsp),%xmm2
+ pand %xmm8,%xmm9
+ xorps 16(%rsp),%xmm3
+ pcmpgtd %xmm15,%xmm14
+ pxor %xmm9,%xmm15
+
+ xorps 32(%rsp),%xmm4
+ movups %xmm2,0(%rsi)
+ xorps 48(%rsp),%xmm5
+ movups %xmm3,16(%rsi)
+ xorps 64(%rsp),%xmm6
+ movups %xmm4,32(%rsi)
+ xorps 80(%rsp),%xmm7
+ movups %xmm5,48(%rsi)
+ movl %r10d,%eax
+ movups %xmm6,64(%rsi)
+ movups %xmm7,80(%rsi)
+ leaq 96(%rsi),%rsi
+ subq $96,%rdx
+ jnc .Lxts_dec_grandloop
+
+ leal 3(%rax,%rax,1),%eax
+ movq %r11,%rcx
+ movl %eax,%r10d
+
+.Lxts_dec_short:
+ addq $96,%rdx
+ jz .Lxts_dec_done
+
+ cmpq $32,%rdx
+ jb .Lxts_dec_one
+ je .Lxts_dec_two
+
+ cmpq $64,%rdx
+ jb .Lxts_dec_three
+ je .Lxts_dec_four
+
+ pshufd $19,%xmm14,%xmm9
+ movdqa %xmm15,%xmm14
+ paddq %xmm15,%xmm15
+ movdqu (%rdi),%xmm2
+ pand %xmm8,%xmm9
+ movdqu 16(%rdi),%xmm3
+ pxor %xmm9,%xmm15
+
+ movdqu 32(%rdi),%xmm4
+ pxor %xmm10,%xmm2
+ movdqu 48(%rdi),%xmm5
+ pxor %xmm11,%xmm3
+ movdqu 64(%rdi),%xmm6
+ leaq 80(%rdi),%rdi
+ pxor %xmm12,%xmm4
+ pxor %xmm13,%xmm5
+ pxor %xmm14,%xmm6
+
+ call _aesni_decrypt6
+
+ xorps %xmm10,%xmm2
+ xorps %xmm11,%xmm3
+ xorps %xmm12,%xmm4
+ movdqu %xmm2,(%rsi)
+ xorps %xmm13,%xmm5
+ movdqu %xmm3,16(%rsi)
+ xorps %xmm14,%xmm6
+ movdqu %xmm4,32(%rsi)
+ pxor %xmm14,%xmm14
+ movdqu %xmm5,48(%rsi)
+ pcmpgtd %xmm15,%xmm14
+ movdqu %xmm6,64(%rsi)
+ leaq 80(%rsi),%rsi
+ pshufd $19,%xmm14,%xmm11
+ andq $15,%r9
+ jz .Lxts_dec_ret
+
+ movdqa %xmm15,%xmm10
+ paddq %xmm15,%xmm15
+ pand %xmm8,%xmm11
+ pxor %xmm15,%xmm11
+ jmp .Lxts_dec_done2
+
+.align 16
+.Lxts_dec_one:
+ movups (%rdi),%xmm2
+ leaq 16(%rdi),%rdi
+ xorps %xmm10,%xmm2
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+.Loop_dec1_12:
+.byte 102,15,56,222,209
+ decl %eax
+ movaps (%rcx),%xmm1
+ leaq 16(%rcx),%rcx
+ jnz .Loop_dec1_12
+.byte 102,15,56,223,209
+ xorps %xmm10,%xmm2
+ movdqa %xmm11,%xmm10
+ movups %xmm2,(%rsi)
+ movdqa %xmm12,%xmm11
+ leaq 16(%rsi),%rsi
+ jmp .Lxts_dec_done
+
+.align 16
+.Lxts_dec_two:
+ movups (%rdi),%xmm2
+ movups 16(%rdi),%xmm3
+ leaq 32(%rdi),%rdi
+ xorps %xmm10,%xmm2
+ xorps %xmm11,%xmm3
+
+ call _aesni_decrypt3
+
+ xorps %xmm10,%xmm2
+ movdqa %xmm12,%xmm10
+ xorps %xmm11,%xmm3
+ movdqa %xmm13,%xmm11
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ leaq 32(%rsi),%rsi
+ jmp .Lxts_dec_done
+
+.align 16
+.Lxts_dec_three:
+ movups (%rdi),%xmm2
+ movups 16(%rdi),%xmm3
+ movups 32(%rdi),%xmm4
+ leaq 48(%rdi),%rdi
+ xorps %xmm10,%xmm2
+ xorps %xmm11,%xmm3
+ xorps %xmm12,%xmm4
+
+ call _aesni_decrypt3
+
+ xorps %xmm10,%xmm2
+ movdqa %xmm13,%xmm10
+ xorps %xmm11,%xmm3
+ movdqa %xmm15,%xmm11
+ xorps %xmm12,%xmm4
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ leaq 48(%rsi),%rsi
+ jmp .Lxts_dec_done
+
+.align 16
+.Lxts_dec_four:
+ pshufd $19,%xmm14,%xmm9
+ movdqa %xmm15,%xmm14
+ paddq %xmm15,%xmm15
+ movups (%rdi),%xmm2
+ pand %xmm8,%xmm9
+ movups 16(%rdi),%xmm3
+ pxor %xmm9,%xmm15
+
+ movups 32(%rdi),%xmm4
+ xorps %xmm10,%xmm2
+ movups 48(%rdi),%xmm5
+ leaq 64(%rdi),%rdi
+ xorps %xmm11,%xmm3
+ xorps %xmm12,%xmm4
+ xorps %xmm13,%xmm5
+
+ call _aesni_decrypt4
+
+ xorps %xmm10,%xmm2
+ movdqa %xmm14,%xmm10
+ xorps %xmm11,%xmm3
+ movdqa %xmm15,%xmm11
+ xorps %xmm12,%xmm4
+ movups %xmm2,(%rsi)
+ xorps %xmm13,%xmm5
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ leaq 64(%rsi),%rsi
+ jmp .Lxts_dec_done
+
+.align 16
+.Lxts_dec_done:
+ andq $15,%r9
+ jz .Lxts_dec_ret
+.Lxts_dec_done2:
+ movq %r9,%rdx
+ movq %r11,%rcx
+ movl %r10d,%eax
+
+ movups (%rdi),%xmm2
+ xorps %xmm11,%xmm2
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+.Loop_dec1_13:
+.byte 102,15,56,222,209
+ decl %eax
+ movaps (%rcx),%xmm1
+ leaq 16(%rcx),%rcx
+ jnz .Loop_dec1_13
+.byte 102,15,56,223,209
+ xorps %xmm11,%xmm2
+ movups %xmm2,(%rsi)
+
+.Lxts_dec_steal:
+ movzbl 16(%rdi),%eax
+ movzbl (%rsi),%ecx
+ leaq 1(%rdi),%rdi
+ movb %al,(%rsi)
+ movb %cl,16(%rsi)
+ leaq 1(%rsi),%rsi
+ subq $1,%rdx
+ jnz .Lxts_dec_steal
+
+ subq %r9,%rsi
+ movq %r11,%rcx
+ movl %r10d,%eax
+
+ movups (%rsi),%xmm2
+ xorps %xmm10,%xmm2
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
+ leaq 32(%rcx),%rcx
+ xorps %xmm0,%xmm2
+.Loop_dec1_14:
+.byte 102,15,56,222,209
+ decl %eax
+ movaps (%rcx),%xmm1
+ leaq 16(%rcx),%rcx
+ jnz .Loop_dec1_14
+.byte 102,15,56,223,209
+ xorps %xmm10,%xmm2
+ movups %xmm2,(%rsi)
+
+.Lxts_dec_ret:
+ leaq 104(%rsp),%rsp
+.Lxts_dec_epilogue:
+ .byte 0xf3,0xc3
+.size aesni_xts_decrypt,.-aesni_xts_decrypt
.globl aesni_cbc_encrypt
.type aesni_cbc_encrypt,@function
.align 16
@@ -385,37 +2009,38 @@ aesni_cbc_encrypt:
testl %r9d,%r9d
jz .Lcbc_decrypt
- movups (%r8),%xmm0
- cmpq $16,%rdx
+ movups (%r8),%xmm2
movl %r10d,%eax
+ cmpq $16,%rdx
jb .Lcbc_enc_tail
subq $16,%rdx
jmp .Lcbc_enc_loop
.align 16
.Lcbc_enc_loop:
- movups (%rdi),%xmm1
+ movups (%rdi),%xmm3
leaq 16(%rdi),%rdi
- pxor %xmm1,%xmm0
- movaps (%rcx),%xmm4
- movaps 16(%rcx),%xmm5
+
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
+ xorps %xmm0,%xmm3
leaq 32(%rcx),%rcx
- pxor %xmm4,%xmm0
-.Loop_enc1_5:
-.byte 102,15,56,220,197
+ xorps %xmm3,%xmm2
+.Loop_enc1_15:
+.byte 102,15,56,220,209
decl %eax
- movaps (%rcx),%xmm5
+ movaps (%rcx),%xmm1
leaq 16(%rcx),%rcx
- jnz .Loop_enc1_5
-.byte 102,15,56,221,197
- subq $16,%rdx
- leaq 16(%rsi),%rsi
+ jnz .Loop_enc1_15
+.byte 102,15,56,221,209
movl %r10d,%eax
movq %r11,%rcx
- movups %xmm0,-16(%rsi)
+ movups %xmm2,0(%rsi)
+ leaq 16(%rsi),%rsi
+ subq $16,%rdx
jnc .Lcbc_enc_loop
addq $16,%rdx
jnz .Lcbc_enc_tail
- movups %xmm0,(%r8)
+ movups %xmm2,(%r8)
jmp .Lcbc_ret
.Lcbc_enc_tail:
@@ -435,113 +2060,261 @@ aesni_cbc_encrypt:
.align 16
.Lcbc_decrypt:
- movups (%r8),%xmm6
- subq $64,%rdx
+ movups (%r8),%xmm9
movl %r10d,%eax
+ cmpq $112,%rdx
jbe .Lcbc_dec_tail
- jmp .Lcbc_dec_loop3
+ shrl $1,%r10d
+ subq $112,%rdx
+ movl %r10d,%eax
+ movaps %xmm9,-24(%rsp)
+ jmp .Lcbc_dec_loop8_enter
.align 16
-.Lcbc_dec_loop3:
- movups (%rdi),%xmm0
- movups 16(%rdi),%xmm1
- movups 32(%rdi),%xmm2
- movaps %xmm0,%xmm7
- movaps %xmm1,%xmm8
- movaps %xmm2,%xmm9
- call _aesni_decrypt3
- subq $48,%rdx
- leaq 48(%rdi),%rdi
- leaq 48(%rsi),%rsi
- pxor %xmm6,%xmm0
- pxor %xmm7,%xmm1
- movaps %xmm9,%xmm6
- pxor %xmm8,%xmm2
- movups %xmm0,-48(%rsi)
+.Lcbc_dec_loop8:
+ movaps %xmm0,-24(%rsp)
+ movups %xmm9,(%rsi)
+ leaq 16(%rsi),%rsi
+.Lcbc_dec_loop8_enter:
+ movaps (%rcx),%xmm0
+ movups (%rdi),%xmm2
+ movups 16(%rdi),%xmm3
+ movaps 16(%rcx),%xmm1
+
+ leaq 32(%rcx),%rcx
+ movdqu 32(%rdi),%xmm4
+ xorps %xmm0,%xmm2
+ movdqu 48(%rdi),%xmm5
+ xorps %xmm0,%xmm3
+ movdqu 64(%rdi),%xmm6
+.byte 102,15,56,222,209
+ pxor %xmm0,%xmm4
+ movdqu 80(%rdi),%xmm7
+.byte 102,15,56,222,217
+ pxor %xmm0,%xmm5
+ movdqu 96(%rdi),%xmm8
+.byte 102,15,56,222,225
+ pxor %xmm0,%xmm6
+ movdqu 112(%rdi),%xmm9
+.byte 102,15,56,222,233
+ pxor %xmm0,%xmm7
+ decl %eax
+.byte 102,15,56,222,241
+ pxor %xmm0,%xmm8
+.byte 102,15,56,222,249
+ pxor %xmm0,%xmm9
+ movaps (%rcx),%xmm0
+.byte 102,68,15,56,222,193
+.byte 102,68,15,56,222,201
+ movaps 16(%rcx),%xmm1
+
+ call .Ldec_loop8_enter
+
+ movups (%rdi),%xmm1
+ movups 16(%rdi),%xmm0
+ xorps -24(%rsp),%xmm2
+ xorps %xmm1,%xmm3
+ movups 32(%rdi),%xmm1
+ xorps %xmm0,%xmm4
+ movups 48(%rdi),%xmm0
+ xorps %xmm1,%xmm5
+ movups 64(%rdi),%xmm1
+ xorps %xmm0,%xmm6
+ movups 80(%rdi),%xmm0
+ xorps %xmm1,%xmm7
+ movups 96(%rdi),%xmm1
+ xorps %xmm0,%xmm8
+ movups 112(%rdi),%xmm0
+ xorps %xmm1,%xmm9
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
movl %r10d,%eax
- movups %xmm1,-32(%rsi)
+ movups %xmm6,64(%rsi)
movq %r11,%rcx
- movups %xmm2,-16(%rsi)
- ja .Lcbc_dec_loop3
+ movups %xmm7,80(%rsi)
+ leaq 128(%rdi),%rdi
+ movups %xmm8,96(%rsi)
+ leaq 112(%rsi),%rsi
+ subq $128,%rdx
+ ja .Lcbc_dec_loop8
+ movaps %xmm9,%xmm2
+ movaps %xmm0,%xmm9
+ addq $112,%rdx
+ jle .Lcbc_dec_tail_collected
+ movups %xmm2,(%rsi)
+ leal 1(%r10,%r10,1),%eax
+ leaq 16(%rsi),%rsi
.Lcbc_dec_tail:
- addq $64,%rdx
- movups %xmm6,(%r8)
- jz .Lcbc_dec_ret
-
- movups (%rdi),%xmm0
+ movups (%rdi),%xmm2
+ movaps %xmm2,%xmm8
cmpq $16,%rdx
- movaps %xmm0,%xmm7
jbe .Lcbc_dec_one
- movups 16(%rdi),%xmm1
+
+ movups 16(%rdi),%xmm3
+ movaps %xmm3,%xmm7
cmpq $32,%rdx
- movaps %xmm1,%xmm8
jbe .Lcbc_dec_two
- movups 32(%rdi),%xmm2
+
+ movups 32(%rdi),%xmm4
+ movaps %xmm4,%xmm6
cmpq $48,%rdx
- movaps %xmm2,%xmm9
jbe .Lcbc_dec_three
- movups 48(%rdi),%xmm3
- call _aesni_decrypt4
- pxor %xmm6,%xmm0
- movups 48(%rdi),%xmm6
- pxor %xmm7,%xmm1
- movups %xmm0,(%rsi)
- pxor %xmm8,%xmm2
- movups %xmm1,16(%rsi)
- pxor %xmm9,%xmm3
- movups %xmm2,32(%rsi)
- movaps %xmm3,%xmm0
- leaq 48(%rsi),%rsi
+
+ movups 48(%rdi),%xmm5
+ cmpq $64,%rdx
+ jbe .Lcbc_dec_four
+
+ movups 64(%rdi),%xmm6
+ cmpq $80,%rdx
+ jbe .Lcbc_dec_five
+
+ movups 80(%rdi),%xmm7
+ cmpq $96,%rdx
+ jbe .Lcbc_dec_six
+
+ movups 96(%rdi),%xmm8
+ movaps %xmm9,-24(%rsp)
+ call _aesni_decrypt8
+ movups (%rdi),%xmm1
+ movups 16(%rdi),%xmm0
+ xorps -24(%rsp),%xmm2
+ xorps %xmm1,%xmm3
+ movups 32(%rdi),%xmm1
+ xorps %xmm0,%xmm4
+ movups 48(%rdi),%xmm0
+ xorps %xmm1,%xmm5
+ movups 64(%rdi),%xmm1
+ xorps %xmm0,%xmm6
+ movups 80(%rdi),%xmm0
+ xorps %xmm1,%xmm7
+ movups 96(%rdi),%xmm9
+ xorps %xmm0,%xmm8
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ movups %xmm6,64(%rsi)
+ movups %xmm7,80(%rsi)
+ leaq 96(%rsi),%rsi
+ movaps %xmm8,%xmm2
+ subq $112,%rdx
jmp .Lcbc_dec_tail_collected
.align 16
.Lcbc_dec_one:
- movaps (%rcx),%xmm4
- movaps 16(%rcx),%xmm5
+ movaps (%rcx),%xmm0
+ movaps 16(%rcx),%xmm1
leaq 32(%rcx),%rcx
- pxor %xmm4,%xmm0
-.Loop_dec1_6:
-.byte 102,15,56,222,197
+ xorps %xmm0,%xmm2
+.Loop_dec1_16:
+.byte 102,15,56,222,209
decl %eax
- movaps (%rcx),%xmm5
+ movaps (%rcx),%xmm1
leaq 16(%rcx),%rcx
- jnz .Loop_dec1_6
-.byte 102,15,56,223,197
- pxor %xmm6,%xmm0
- movaps %xmm7,%xmm6
+ jnz .Loop_dec1_16
+.byte 102,15,56,223,209
+ xorps %xmm9,%xmm2
+ movaps %xmm8,%xmm9
+ subq $16,%rdx
jmp .Lcbc_dec_tail_collected
.align 16
.Lcbc_dec_two:
+ xorps %xmm4,%xmm4
call _aesni_decrypt3
- pxor %xmm6,%xmm0
- pxor %xmm7,%xmm1
- movups %xmm0,(%rsi)
- movaps %xmm8,%xmm6
- movaps %xmm1,%xmm0
+ xorps %xmm9,%xmm2
+ xorps %xmm8,%xmm3
+ movups %xmm2,(%rsi)
+ movaps %xmm7,%xmm9
+ movaps %xmm3,%xmm2
leaq 16(%rsi),%rsi
+ subq $32,%rdx
jmp .Lcbc_dec_tail_collected
.align 16
.Lcbc_dec_three:
call _aesni_decrypt3
- pxor %xmm6,%xmm0
- pxor %xmm7,%xmm1
- movups %xmm0,(%rsi)
- pxor %xmm8,%xmm2
- movups %xmm1,16(%rsi)
- movaps %xmm9,%xmm6
- movaps %xmm2,%xmm0
+ xorps %xmm9,%xmm2
+ xorps %xmm8,%xmm3
+ movups %xmm2,(%rsi)
+ xorps %xmm7,%xmm4
+ movups %xmm3,16(%rsi)
+ movaps %xmm6,%xmm9
+ movaps %xmm4,%xmm2
leaq 32(%rsi),%rsi
+ subq $48,%rdx
+ jmp .Lcbc_dec_tail_collected
+.align 16
+.Lcbc_dec_four:
+ call _aesni_decrypt4
+ xorps %xmm9,%xmm2
+ movups 48(%rdi),%xmm9
+ xorps %xmm8,%xmm3
+ movups %xmm2,(%rsi)
+ xorps %xmm7,%xmm4
+ movups %xmm3,16(%rsi)
+ xorps %xmm6,%xmm5
+ movups %xmm4,32(%rsi)
+ movaps %xmm5,%xmm2
+ leaq 48(%rsi),%rsi
+ subq $64,%rdx
+ jmp .Lcbc_dec_tail_collected
+.align 16
+.Lcbc_dec_five:
+ xorps %xmm7,%xmm7
+ call _aesni_decrypt6
+ movups 16(%rdi),%xmm1
+ movups 32(%rdi),%xmm0
+ xorps %xmm9,%xmm2
+ xorps %xmm8,%xmm3
+ xorps %xmm1,%xmm4
+ movups 48(%rdi),%xmm1
+ xorps %xmm0,%xmm5
+ movups 64(%rdi),%xmm9
+ xorps %xmm1,%xmm6
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ leaq 64(%rsi),%rsi
+ movaps %xmm6,%xmm2
+ subq $80,%rdx
+ jmp .Lcbc_dec_tail_collected
+.align 16
+.Lcbc_dec_six:
+ call _aesni_decrypt6
+ movups 16(%rdi),%xmm1
+ movups 32(%rdi),%xmm0
+ xorps %xmm9,%xmm2
+ xorps %xmm8,%xmm3
+ xorps %xmm1,%xmm4
+ movups 48(%rdi),%xmm1
+ xorps %xmm0,%xmm5
+ movups 64(%rdi),%xmm0
+ xorps %xmm1,%xmm6
+ movups 80(%rdi),%xmm9
+ xorps %xmm0,%xmm7
+ movups %xmm2,(%rsi)
+ movups %xmm3,16(%rsi)
+ movups %xmm4,32(%rsi)
+ movups %xmm5,48(%rsi)
+ movups %xmm6,64(%rsi)
+ leaq 80(%rsi),%rsi
+ movaps %xmm7,%xmm2
+ subq $96,%rdx
jmp .Lcbc_dec_tail_collected
.align 16
.Lcbc_dec_tail_collected:
andq $15,%rdx
- movups %xmm6,(%r8)
+ movups %xmm9,(%r8)
jnz .Lcbc_dec_tail_partial
- movups %xmm0,(%rsi)
+ movups %xmm2,(%rsi)
jmp .Lcbc_dec_ret
+.align 16
.Lcbc_dec_tail_partial:
- movaps %xmm0,-24(%rsp)
+ movaps %xmm2,-24(%rsp)
+ movq $16,%rcx
movq %rsi,%rdi
- movq %rdx,%rcx
+ subq %rdx,%rcx
leaq -24(%rsp),%rsi
.long 0x9066A4F3
@@ -554,7 +2327,7 @@ aesni_cbc_encrypt:
.align 16
aesni_set_decrypt_key:
.byte 0x48,0x83,0xEC,0x08
- call _aesni_set_encrypt_key
+ call __aesni_set_encrypt_key
shll $4,%esi
testl %eax,%eax
jnz .Ldec_key_ret
@@ -574,9 +2347,9 @@ aesni_set_decrypt_key:
.byte 102,15,56,219,201
leaq 16(%rdx),%rdx
leaq -16(%rdi),%rdi
- cmpq %rdx,%rdi
movaps %xmm0,16(%rdi)
movaps %xmm1,-16(%rdx)
+ cmpq %rdx,%rdi
ja .Ldec_key_inverse
movaps (%rdx),%xmm0
@@ -591,16 +2364,16 @@ aesni_set_decrypt_key:
.type aesni_set_encrypt_key,@function
.align 16
aesni_set_encrypt_key:
-_aesni_set_encrypt_key:
+__aesni_set_encrypt_key:
.byte 0x48,0x83,0xEC,0x08
- testq %rdi,%rdi
movq $-1,%rax
+ testq %rdi,%rdi
jz .Lenc_key_ret
testq %rdx,%rdx
jz .Lenc_key_ret
movups (%rdi),%xmm0
- pxor %xmm4,%xmm4
+ xorps %xmm4,%xmm4
leaq 16(%rdx),%rax
cmpl $256,%esi
je .L14rounds
@@ -715,11 +2488,11 @@ _aesni_set_encrypt_key:
leaq 16(%rax),%rax
.Lkey_expansion_128_cold:
shufps $16,%xmm0,%xmm4
- pxor %xmm4,%xmm0
+ xorps %xmm4,%xmm0
shufps $140,%xmm0,%xmm4
- pxor %xmm4,%xmm0
- pshufd $255,%xmm1,%xmm1
- pxor %xmm1,%xmm0
+ xorps %xmm4,%xmm0
+ shufps $255,%xmm1,%xmm1
+ xorps %xmm1,%xmm0
.byte 0xf3,0xc3
.align 16
@@ -730,11 +2503,11 @@ _aesni_set_encrypt_key:
movaps %xmm2,%xmm5
.Lkey_expansion_192b_warm:
shufps $16,%xmm0,%xmm4
- movaps %xmm2,%xmm3
- pxor %xmm4,%xmm0
+ movdqa %xmm2,%xmm3
+ xorps %xmm4,%xmm0
shufps $140,%xmm0,%xmm4
pslldq $4,%xmm3
- pxor %xmm4,%xmm0
+ xorps %xmm4,%xmm0
pshufd $85,%xmm1,%xmm1
pxor %xmm3,%xmm2
pxor %xmm1,%xmm0
@@ -758,11 +2531,11 @@ _aesni_set_encrypt_key:
leaq 16(%rax),%rax
.Lkey_expansion_256a_cold:
shufps $16,%xmm0,%xmm4
- pxor %xmm4,%xmm0
+ xorps %xmm4,%xmm0
shufps $140,%xmm0,%xmm4
- pxor %xmm4,%xmm0
- pshufd $255,%xmm1,%xmm1
- pxor %xmm1,%xmm0
+ xorps %xmm4,%xmm0
+ shufps $255,%xmm1,%xmm1
+ xorps %xmm1,%xmm0
.byte 0xf3,0xc3
.align 16
@@ -771,12 +2544,23 @@ _aesni_set_encrypt_key:
leaq 16(%rax),%rax
shufps $16,%xmm2,%xmm4
- pxor %xmm4,%xmm2
+ xorps %xmm4,%xmm2
shufps $140,%xmm2,%xmm4
- pxor %xmm4,%xmm2
- pshufd $170,%xmm1,%xmm1
- pxor %xmm1,%xmm2
+ xorps %xmm4,%xmm2
+ shufps $170,%xmm1,%xmm1
+ xorps %xmm1,%xmm2
.byte 0xf3,0xc3
.size aesni_set_encrypt_key,.-aesni_set_encrypt_key
+.size __aesni_set_encrypt_key,.-__aesni_set_encrypt_key
+.align 64
+.Lbswap_mask:
+.byte 15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0
+.Lincrement32:
+.long 6,6,6,0
+.Lincrement64:
+.long 1,0,0,0
+.Lxts_magic:
+.long 0x87,0,1,0
+
.byte 65,69,83,32,102,111,114,32,73,110,116,101,108,32,65,69,83,45,78,73,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
.align 64
diff --git a/lib/accelerated/intel/asm/appro-aes-x86.s b/lib/accelerated/intel/asm/appro-aes-x86.s
index 981e356..88e76ae 100644
--- a/lib/accelerated/intel/asm/appro-aes-x86.s
+++ b/lib/accelerated/intel/asm/appro-aes-x86.s
@@ -5,18 +5,19 @@
# modification, are permitted provided that the following conditions
# are met:
#
-# * Redistributions of source code must retain copyright notices,
-# this list of conditions and the following disclaimer.
+# * Redistributions of source code must retain copyright
+# * notices,
+# this list of conditions and the following disclaimer.
#
-# * Redistributions in binary form must reproduce the above
-# copyright notice, this list of conditions and the following
-# disclaimer in the documentation and/or other materials
-# provided with the distribution.
+# * Redistributions in binary form must reproduce the above
+# copyright notice, this list of conditions and the following
+# disclaimer in the documentation and/or other materials
+# provided with the distribution.
#
-# * Neither the name of the Andy Polyakov nor the names of its
-# copyright holder and contributors may be used to endorse or
-# promote products derived from this software without specific
-# prior written permission.
+# * Neither the name of the Andy Polyakov nor the names of its
+# copyright holder and contributors may be used to endorse or
+# promote products derived from this software without specific
+# prior written permission.
#
# ALTERNATIVELY, provided that this notice is retained in full, this
# product may be distributed under the terms of the GNU General Public
@@ -44,21 +45,21 @@ aesni_encrypt:
.L_aesni_encrypt_begin:
movl 4(%esp),%eax
movl 12(%esp),%edx
- movups (%eax),%xmm0
+ movups (%eax),%xmm2
movl 240(%edx),%ecx
movl 8(%esp),%eax
- movups (%edx),%xmm3
- movups 16(%edx),%xmm4
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
leal 32(%edx),%edx
- pxor %xmm3,%xmm0
-.L000enc1_loop:
- aesenc %xmm4,%xmm0
+ xorps %xmm0,%xmm2
+.L000enc1_loop_1:
+.byte 102,15,56,220,209
decl %ecx
- movups (%edx),%xmm4
+ movaps (%edx),%xmm1
leal 16(%edx),%edx
- jnz .L000enc1_loop
- aesenclast %xmm4,%xmm0
- movups %xmm0,(%eax)
+ jnz .L000enc1_loop_1
+.byte 102,15,56,221,209
+ movups %xmm2,(%eax)
ret
.size aesni_encrypt,.-.L_aesni_encrypt_begin
.globl aesni_decrypt
@@ -68,165 +69,271 @@ aesni_decrypt:
.L_aesni_decrypt_begin:
movl 4(%esp),%eax
movl 12(%esp),%edx
- movups (%eax),%xmm0
+ movups (%eax),%xmm2
movl 240(%edx),%ecx
movl 8(%esp),%eax
- movups (%edx),%xmm3
- movups 16(%edx),%xmm4
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
leal 32(%edx),%edx
- pxor %xmm3,%xmm0
-.L001dec1_loop:
- aesdec %xmm4,%xmm0
+ xorps %xmm0,%xmm2
+.L001dec1_loop_2:
+.byte 102,15,56,222,209
decl %ecx
- movups (%edx),%xmm4
+ movaps (%edx),%xmm1
leal 16(%edx),%edx
- jnz .L001dec1_loop
- aesdeclast %xmm4,%xmm0
- movups %xmm0,(%eax)
+ jnz .L001dec1_loop_2
+.byte 102,15,56,223,209
+ movups %xmm2,(%eax)
ret
.size aesni_decrypt,.-.L_aesni_decrypt_begin
.type _aesni_encrypt3,@function
.align 16
_aesni_encrypt3:
- movups (%edx),%xmm3
+ movaps (%edx),%xmm0
shrl $1,%ecx
- movups 16(%edx),%xmm4
+ movaps 16(%edx),%xmm1
leal 32(%edx),%edx
- pxor %xmm3,%xmm0
- pxor %xmm3,%xmm1
- pxor %xmm3,%xmm2
- jmp .L002enc3_loop
-.align 16
+ xorps %xmm0,%xmm2
+ pxor %xmm0,%xmm3
+ pxor %xmm0,%xmm4
+ movaps (%edx),%xmm0
.L002enc3_loop:
- aesenc %xmm4,%xmm0
- movups (%edx),%xmm3
- aesenc %xmm4,%xmm1
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
decl %ecx
- aesenc %xmm4,%xmm2
- movups 16(%edx),%xmm4
- aesenc %xmm3,%xmm0
+.byte 102,15,56,220,225
+ movaps 16(%edx),%xmm1
+.byte 102,15,56,220,208
+.byte 102,15,56,220,216
leal 32(%edx),%edx
- aesenc %xmm3,%xmm1
- aesenc %xmm3,%xmm2
+.byte 102,15,56,220,224
+ movaps (%edx),%xmm0
jnz .L002enc3_loop
- aesenc %xmm4,%xmm0
- movups (%edx),%xmm3
- aesenc %xmm4,%xmm1
- aesenc %xmm4,%xmm2
- aesenclast %xmm3,%xmm0
- aesenclast %xmm3,%xmm1
- aesenclast %xmm3,%xmm2
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+.byte 102,15,56,220,225
+.byte 102,15,56,221,208
+.byte 102,15,56,221,216
+.byte 102,15,56,221,224
ret
.size _aesni_encrypt3,.-_aesni_encrypt3
.type _aesni_decrypt3,@function
.align 16
_aesni_decrypt3:
- movups (%edx),%xmm3
+ movaps (%edx),%xmm0
shrl $1,%ecx
- movups 16(%edx),%xmm4
+ movaps 16(%edx),%xmm1
leal 32(%edx),%edx
- pxor %xmm3,%xmm0
- pxor %xmm3,%xmm1
- pxor %xmm3,%xmm2
- jmp .L003dec3_loop
-.align 16
+ xorps %xmm0,%xmm2
+ pxor %xmm0,%xmm3
+ pxor %xmm0,%xmm4
+ movaps (%edx),%xmm0
.L003dec3_loop:
- aesdec %xmm4,%xmm0
- movups (%edx),%xmm3
- aesdec %xmm4,%xmm1
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
decl %ecx
- aesdec %xmm4,%xmm2
- movups 16(%edx),%xmm4
- aesdec %xmm3,%xmm0
+.byte 102,15,56,222,225
+ movaps 16(%edx),%xmm1
+.byte 102,15,56,222,208
+.byte 102,15,56,222,216
leal 32(%edx),%edx
- aesdec %xmm3,%xmm1
- aesdec %xmm3,%xmm2
+.byte 102,15,56,222,224
+ movaps (%edx),%xmm0
jnz .L003dec3_loop
- aesdec %xmm4,%xmm0
- movups (%edx),%xmm3
- aesdec %xmm4,%xmm1
- aesdec %xmm4,%xmm2
- aesdeclast %xmm3,%xmm0
- aesdeclast %xmm3,%xmm1
- aesdeclast %xmm3,%xmm2
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+.byte 102,15,56,222,225
+.byte 102,15,56,223,208
+.byte 102,15,56,223,216
+.byte 102,15,56,223,224
ret
.size _aesni_decrypt3,.-_aesni_decrypt3
.type _aesni_encrypt4,@function
.align 16
_aesni_encrypt4:
- movups (%edx),%xmm3
- movups 16(%edx),%xmm4
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
shrl $1,%ecx
leal 32(%edx),%edx
- pxor %xmm3,%xmm0
- pxor %xmm3,%xmm1
- pxor %xmm3,%xmm2
- pxor %xmm3,%xmm7
- jmp .L004enc3_loop
-.align 16
-.L004enc3_loop:
- aesenc %xmm4,%xmm0
- movups (%edx),%xmm3
- aesenc %xmm4,%xmm1
+ xorps %xmm0,%xmm2
+ pxor %xmm0,%xmm3
+ pxor %xmm0,%xmm4
+ pxor %xmm0,%xmm5
+ movaps (%edx),%xmm0
+.L004enc4_loop:
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
decl %ecx
- aesenc %xmm4,%xmm2
- aesenc %xmm4,%xmm7
- movups 16(%edx),%xmm4
- aesenc %xmm3,%xmm0
- leal 32(%edx),%edx
- aesenc %xmm3,%xmm1
- aesenc %xmm3,%xmm2
- aesenc %xmm3,%xmm7
- jnz .L004enc3_loop
- aesenc %xmm4,%xmm0
- movups (%edx),%xmm3
- aesenc %xmm4,%xmm1
- aesenc %xmm4,%xmm2
- aesenc %xmm4,%xmm7
- aesenclast %xmm3,%xmm0
- aesenclast %xmm3,%xmm1
- aesenclast %xmm3,%xmm2
- aesenclast %xmm3,%xmm7
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+ movaps 16(%edx),%xmm1
+.byte 102,15,56,220,208
+.byte 102,15,56,220,216
+ leal 32(%edx),%edx
+.byte 102,15,56,220,224
+.byte 102,15,56,220,232
+ movaps (%edx),%xmm0
+ jnz .L004enc4_loop
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+.byte 102,15,56,221,208
+.byte 102,15,56,221,216
+.byte 102,15,56,221,224
+.byte 102,15,56,221,232
ret
.size _aesni_encrypt4,.-_aesni_encrypt4
.type _aesni_decrypt4,@function
.align 16
_aesni_decrypt4:
- movups (%edx),%xmm3
- movups 16(%edx),%xmm4
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
shrl $1,%ecx
leal 32(%edx),%edx
- pxor %xmm3,%xmm0
- pxor %xmm3,%xmm1
- pxor %xmm3,%xmm2
- pxor %xmm3,%xmm7
- jmp .L005dec3_loop
-.align 16
-.L005dec3_loop:
- aesdec %xmm4,%xmm0
- movups (%edx),%xmm3
- aesdec %xmm4,%xmm1
+ xorps %xmm0,%xmm2
+ pxor %xmm0,%xmm3
+ pxor %xmm0,%xmm4
+ pxor %xmm0,%xmm5
+ movaps (%edx),%xmm0
+.L005dec4_loop:
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
decl %ecx
- aesdec %xmm4,%xmm2
- aesdec %xmm4,%xmm7
- movups 16(%edx),%xmm4
- aesdec %xmm3,%xmm0
- leal 32(%edx),%edx
- aesdec %xmm3,%xmm1
- aesdec %xmm3,%xmm2
- aesdec %xmm3,%xmm7
- jnz .L005dec3_loop
- aesdec %xmm4,%xmm0
- movups (%edx),%xmm3
- aesdec %xmm4,%xmm1
- aesdec %xmm4,%xmm2
- aesdec %xmm4,%xmm7
- aesdeclast %xmm3,%xmm0
- aesdeclast %xmm3,%xmm1
- aesdeclast %xmm3,%xmm2
- aesdeclast %xmm3,%xmm7
+.byte 102,15,56,222,225
+.byte 102,15,56,222,233
+ movaps 16(%edx),%xmm1
+.byte 102,15,56,222,208
+.byte 102,15,56,222,216
+ leal 32(%edx),%edx
+.byte 102,15,56,222,224
+.byte 102,15,56,222,232
+ movaps (%edx),%xmm0
+ jnz .L005dec4_loop
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+.byte 102,15,56,222,225
+.byte 102,15,56,222,233
+.byte 102,15,56,223,208
+.byte 102,15,56,223,216
+.byte 102,15,56,223,224
+.byte 102,15,56,223,232
ret
.size _aesni_decrypt4,.-_aesni_decrypt4
+.type _aesni_encrypt6,@function
+.align 16
+_aesni_encrypt6:
+ movaps (%edx),%xmm0
+ shrl $1,%ecx
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+ pxor %xmm0,%xmm3
+.byte 102,15,56,220,209
+ pxor %xmm0,%xmm4
+.byte 102,15,56,220,217
+ pxor %xmm0,%xmm5
+ decl %ecx
+.byte 102,15,56,220,225
+ pxor %xmm0,%xmm6
+.byte 102,15,56,220,233
+ pxor %xmm0,%xmm7
+.byte 102,15,56,220,241
+ movaps (%edx),%xmm0
+.byte 102,15,56,220,249
+ jmp .L_aesni_encrypt6_enter
+.align 16
+.L006enc6_loop:
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+ decl %ecx
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+.byte 102,15,56,220,241
+.byte 102,15,56,220,249
+.align 16
+.L_aesni_encrypt6_enter:
+ movaps 16(%edx),%xmm1
+.byte 102,15,56,220,208
+.byte 102,15,56,220,216
+ leal 32(%edx),%edx
+.byte 102,15,56,220,224
+.byte 102,15,56,220,232
+.byte 102,15,56,220,240
+.byte 102,15,56,220,248
+ movaps (%edx),%xmm0
+ jnz .L006enc6_loop
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+.byte 102,15,56,220,225
+.byte 102,15,56,220,233
+.byte 102,15,56,220,241
+.byte 102,15,56,220,249
+.byte 102,15,56,221,208
+.byte 102,15,56,221,216
+.byte 102,15,56,221,224
+.byte 102,15,56,221,232
+.byte 102,15,56,221,240
+.byte 102,15,56,221,248
+ ret
+.size _aesni_encrypt6,.-_aesni_encrypt6
+.type _aesni_decrypt6,@function
+.align 16
+_aesni_decrypt6:
+ movaps (%edx),%xmm0
+ shrl $1,%ecx
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+ pxor %xmm0,%xmm3
+.byte 102,15,56,222,209
+ pxor %xmm0,%xmm4
+.byte 102,15,56,222,217
+ pxor %xmm0,%xmm5
+ decl %ecx
+.byte 102,15,56,222,225
+ pxor %xmm0,%xmm6
+.byte 102,15,56,222,233
+ pxor %xmm0,%xmm7
+.byte 102,15,56,222,241
+ movaps (%edx),%xmm0
+.byte 102,15,56,222,249
+ jmp .L_aesni_decrypt6_enter
+.align 16
+.L007dec6_loop:
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+ decl %ecx
+.byte 102,15,56,222,225
+.byte 102,15,56,222,233
+.byte 102,15,56,222,241
+.byte 102,15,56,222,249
+.align 16
+.L_aesni_decrypt6_enter:
+ movaps 16(%edx),%xmm1
+.byte 102,15,56,222,208
+.byte 102,15,56,222,216
+ leal 32(%edx),%edx
+.byte 102,15,56,222,224
+.byte 102,15,56,222,232
+.byte 102,15,56,222,240
+.byte 102,15,56,222,248
+ movaps (%edx),%xmm0
+ jnz .L007dec6_loop
+.byte 102,15,56,222,209
+.byte 102,15,56,222,217
+.byte 102,15,56,222,225
+.byte 102,15,56,222,233
+.byte 102,15,56,222,241
+.byte 102,15,56,222,249
+.byte 102,15,56,223,208
+.byte 102,15,56,223,216
+.byte 102,15,56,223,224
+.byte 102,15,56,223,232
+.byte 102,15,56,223,240
+.byte 102,15,56,223,248
+ ret
+.size _aesni_decrypt6,.-_aesni_decrypt6
.globl aesni_ecb_encrypt
.type aesni_ecb_encrypt,@function
.align 16
@@ -240,153 +347,1364 @@ aesni_ecb_encrypt:
movl 24(%esp),%edi
movl 28(%esp),%eax
movl 32(%esp),%edx
- movl 36(%esp),%ecx
- cmpl $16,%eax
- jb .L006ecb_ret
+ movl 36(%esp),%ebx
andl $-16,%eax
- testl %ecx,%ecx
+ jz .L008ecb_ret
movl 240(%edx),%ecx
+ testl %ebx,%ebx
+ jz .L009ecb_decrypt
movl %edx,%ebp
movl %ecx,%ebx
- jz .L007ecb_decrypt
- subl $64,%eax
- jbe .L008ecb_enc_tail
- jmp .L009ecb_enc_loop3
+ cmpl $96,%eax
+ jb .L010ecb_enc_tail
+ movdqu (%esi),%xmm2
+ movdqu 16(%esi),%xmm3
+ movdqu 32(%esi),%xmm4
+ movdqu 48(%esi),%xmm5
+ movdqu 64(%esi),%xmm6
+ movdqu 80(%esi),%xmm7
+ leal 96(%esi),%esi
+ subl $96,%eax
+ jmp .L011ecb_enc_loop6_enter
.align 16
-.L009ecb_enc_loop3:
- movups (%esi),%xmm0
- movups 16(%esi),%xmm1
- movups 32(%esi),%xmm2
- call _aesni_encrypt3
- subl $48,%eax
- leal 48(%esi),%esi
- leal 48(%edi),%edi
- movups %xmm0,-48(%edi)
+.L012ecb_enc_loop6:
+ movups %xmm2,(%edi)
+ movdqu (%esi),%xmm2
+ movups %xmm3,16(%edi)
+ movdqu 16(%esi),%xmm3
+ movups %xmm4,32(%edi)
+ movdqu 32(%esi),%xmm4
+ movups %xmm5,48(%edi)
+ movdqu 48(%esi),%xmm5
+ movups %xmm6,64(%edi)
+ movdqu 64(%esi),%xmm6
+ movups %xmm7,80(%edi)
+ leal 96(%edi),%edi
+ movdqu 80(%esi),%xmm7
+ leal 96(%esi),%esi
+.L011ecb_enc_loop6_enter:
+ call _aesni_encrypt6
movl %ebp,%edx
- movups %xmm1,-32(%edi)
movl %ebx,%ecx
- movups %xmm2,-16(%edi)
- ja .L009ecb_enc_loop3
-.L008ecb_enc_tail:
- addl $64,%eax
- jz .L006ecb_ret
- cmpl $16,%eax
- movups (%esi),%xmm0
- je .L010ecb_enc_one
+ subl $96,%eax
+ jnc .L012ecb_enc_loop6
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+ movups %xmm6,64(%edi)
+ movups %xmm7,80(%edi)
+ leal 96(%edi),%edi
+ addl $96,%eax
+ jz .L008ecb_ret
+.L010ecb_enc_tail:
+ movups (%esi),%xmm2
cmpl $32,%eax
- movups 16(%esi),%xmm1
- je .L011ecb_enc_two
- cmpl $48,%eax
- movups 32(%esi),%xmm2
- je .L012ecb_enc_three
- movups 48(%esi),%xmm7
- call _aesni_encrypt4
- movups %xmm0,(%edi)
- movups %xmm1,16(%edi)
- movups %xmm2,32(%edi)
- movups %xmm7,48(%edi)
- jmp .L006ecb_ret
-.align 16
-.L010ecb_enc_one:
- movups (%edx),%xmm3
- movups 16(%edx),%xmm4
- leal 32(%edx),%edx
- pxor %xmm3,%xmm0
-.L013enc1_loop:
- aesenc %xmm4,%xmm0
+ jb .L013ecb_enc_one
+ movups 16(%esi),%xmm3
+ je .L014ecb_enc_two
+ movups 32(%esi),%xmm4
+ cmpl $64,%eax
+ jb .L015ecb_enc_three
+ movups 48(%esi),%xmm5
+ je .L016ecb_enc_four
+ movups 64(%esi),%xmm6
+ xorps %xmm7,%xmm7
+ call _aesni_encrypt6
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+ movups %xmm6,64(%edi)
+ jmp .L008ecb_ret
+.align 16
+.L013ecb_enc_one:
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L017enc1_loop_3:
+.byte 102,15,56,220,209
decl %ecx
- movups (%edx),%xmm4
+ movaps (%edx),%xmm1
leal 16(%edx),%edx
- jnz .L013enc1_loop
- aesenclast %xmm4,%xmm0
- movups %xmm0,(%edi)
- jmp .L006ecb_ret
+ jnz .L017enc1_loop_3
+.byte 102,15,56,221,209
+ movups %xmm2,(%edi)
+ jmp .L008ecb_ret
.align 16
-.L011ecb_enc_two:
+.L014ecb_enc_two:
+ xorps %xmm4,%xmm4
call _aesni_encrypt3
- movups %xmm0,(%edi)
- movups %xmm1,16(%edi)
- jmp .L006ecb_ret
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ jmp .L008ecb_ret
.align 16
-.L012ecb_enc_three:
+.L015ecb_enc_three:
call _aesni_encrypt3
- movups %xmm0,(%edi)
- movups %xmm1,16(%edi)
- movups %xmm2,32(%edi)
- jmp .L006ecb_ret
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ jmp .L008ecb_ret
.align 16
-.L007ecb_decrypt:
- subl $64,%eax
- jbe .L014ecb_dec_tail
- jmp .L015ecb_dec_loop3
+.L016ecb_enc_four:
+ call _aesni_encrypt4
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+ jmp .L008ecb_ret
.align 16
-.L015ecb_dec_loop3:
- movups (%esi),%xmm0
- movups 16(%esi),%xmm1
- movups 32(%esi),%xmm2
+.L009ecb_decrypt:
+ movl %edx,%ebp
+ movl %ecx,%ebx
+ cmpl $96,%eax
+ jb .L018ecb_dec_tail
+ movdqu (%esi),%xmm2
+ movdqu 16(%esi),%xmm3
+ movdqu 32(%esi),%xmm4
+ movdqu 48(%esi),%xmm5
+ movdqu 64(%esi),%xmm6
+ movdqu 80(%esi),%xmm7
+ leal 96(%esi),%esi
+ subl $96,%eax
+ jmp .L019ecb_dec_loop6_enter
+.align 16
+.L020ecb_dec_loop6:
+ movups %xmm2,(%edi)
+ movdqu (%esi),%xmm2
+ movups %xmm3,16(%edi)
+ movdqu 16(%esi),%xmm3
+ movups %xmm4,32(%edi)
+ movdqu 32(%esi),%xmm4
+ movups %xmm5,48(%edi)
+ movdqu 48(%esi),%xmm5
+ movups %xmm6,64(%edi)
+ movdqu 64(%esi),%xmm6
+ movups %xmm7,80(%edi)
+ leal 96(%edi),%edi
+ movdqu 80(%esi),%xmm7
+ leal 96(%esi),%esi
+.L019ecb_dec_loop6_enter:
+ call _aesni_decrypt6
+ movl %ebp,%edx
+ movl %ebx,%ecx
+ subl $96,%eax
+ jnc .L020ecb_dec_loop6
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+ movups %xmm6,64(%edi)
+ movups %xmm7,80(%edi)
+ leal 96(%edi),%edi
+ addl $96,%eax
+ jz .L008ecb_ret
+.L018ecb_dec_tail:
+ movups (%esi),%xmm2
+ cmpl $32,%eax
+ jb .L021ecb_dec_one
+ movups 16(%esi),%xmm3
+ je .L022ecb_dec_two
+ movups 32(%esi),%xmm4
+ cmpl $64,%eax
+ jb .L023ecb_dec_three
+ movups 48(%esi),%xmm5
+ je .L024ecb_dec_four
+ movups 64(%esi),%xmm6
+ xorps %xmm7,%xmm7
+ call _aesni_decrypt6
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+ movups %xmm6,64(%edi)
+ jmp .L008ecb_ret
+.align 16
+.L021ecb_dec_one:
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L025dec1_loop_4:
+.byte 102,15,56,222,209
+ decl %ecx
+ movaps (%edx),%xmm1
+ leal 16(%edx),%edx
+ jnz .L025dec1_loop_4
+.byte 102,15,56,223,209
+ movups %xmm2,(%edi)
+ jmp .L008ecb_ret
+.align 16
+.L022ecb_dec_two:
+ xorps %xmm4,%xmm4
call _aesni_decrypt3
- subl $48,%eax
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ jmp .L008ecb_ret
+.align 16
+.L023ecb_dec_three:
+ call _aesni_decrypt3
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ jmp .L008ecb_ret
+.align 16
+.L024ecb_dec_four:
+ call _aesni_decrypt4
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+.L008ecb_ret:
+ popl %edi
+ popl %esi
+ popl %ebx
+ popl %ebp
+ ret
+.size aesni_ecb_encrypt,.-.L_aesni_ecb_encrypt_begin
+.globl aesni_ccm64_encrypt_blocks
+.type aesni_ccm64_encrypt_blocks,@function
+.align 16
+aesni_ccm64_encrypt_blocks:
+.L_aesni_ccm64_encrypt_blocks_begin:
+ pushl %ebp
+ pushl %ebx
+ pushl %esi
+ pushl %edi
+ movl 20(%esp),%esi
+ movl 24(%esp),%edi
+ movl 28(%esp),%eax
+ movl 32(%esp),%edx
+ movl 36(%esp),%ebx
+ movl 40(%esp),%ecx
+ movl %esp,%ebp
+ subl $60,%esp
+ andl $-16,%esp
+ movl %ebp,48(%esp)
+ movdqu (%ebx),%xmm7
+ movdqu (%ecx),%xmm3
+ movl $202182159,(%esp)
+ movl $134810123,4(%esp)
+ movl $67438087,8(%esp)
+ movl $66051,12(%esp)
+ movl $1,%ecx
+ xorl %ebp,%ebp
+ movl %ecx,16(%esp)
+ movl %ebp,20(%esp)
+ movl %ebp,24(%esp)
+ movl %ebp,28(%esp)
+ movdqa (%esp),%xmm5
+.byte 102,15,56,0,253
+ movl 240(%edx),%ecx
+ movl %edx,%ebp
+ movl %ecx,%ebx
+ movdqa %xmm7,%xmm2
+.L026ccm64_enc_outer:
+ movups (%esi),%xmm6
+.byte 102,15,56,0,213
+ movl %ebp,%edx
+ movl %ebx,%ecx
+ movaps (%edx),%xmm0
+ shrl $1,%ecx
+ movaps 16(%edx),%xmm1
+ xorps %xmm0,%xmm6
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+ xorps %xmm6,%xmm3
+ movaps (%edx),%xmm0
+.L027ccm64_enc2_loop:
+.byte 102,15,56,220,209
+ decl %ecx
+.byte 102,15,56,220,217
+ movaps 16(%edx),%xmm1
+.byte 102,15,56,220,208
+ leal 32(%edx),%edx
+.byte 102,15,56,220,216
+ movaps (%edx),%xmm0
+ jnz .L027ccm64_enc2_loop
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+.byte 102,15,56,221,208
+.byte 102,15,56,221,216
+ paddq 16(%esp),%xmm7
+ decl %eax
+ leal 16(%esi),%esi
+ xorps %xmm2,%xmm6
+ movdqa %xmm7,%xmm2
+ movups %xmm6,(%edi)
+ leal 16(%edi),%edi
+ jnz .L026ccm64_enc_outer
+ movl 48(%esp),%esp
+ movl 40(%esp),%edi
+ movups %xmm3,(%edi)
+ popl %edi
+ popl %esi
+ popl %ebx
+ popl %ebp
+ ret
+.size aesni_ccm64_encrypt_blocks,.-.L_aesni_ccm64_encrypt_blocks_begin
+.globl aesni_ccm64_decrypt_blocks
+.type aesni_ccm64_decrypt_blocks,@function
+.align 16
+aesni_ccm64_decrypt_blocks:
+.L_aesni_ccm64_decrypt_blocks_begin:
+ pushl %ebp
+ pushl %ebx
+ pushl %esi
+ pushl %edi
+ movl 20(%esp),%esi
+ movl 24(%esp),%edi
+ movl 28(%esp),%eax
+ movl 32(%esp),%edx
+ movl 36(%esp),%ebx
+ movl 40(%esp),%ecx
+ movl %esp,%ebp
+ subl $60,%esp
+ andl $-16,%esp
+ movl %ebp,48(%esp)
+ movdqu (%ebx),%xmm7
+ movdqu (%ecx),%xmm3
+ movl $202182159,(%esp)
+ movl $134810123,4(%esp)
+ movl $67438087,8(%esp)
+ movl $66051,12(%esp)
+ movl $1,%ecx
+ xorl %ebp,%ebp
+ movl %ecx,16(%esp)
+ movl %ebp,20(%esp)
+ movl %ebp,24(%esp)
+ movl %ebp,28(%esp)
+ movdqa (%esp),%xmm5
+ movdqa %xmm7,%xmm2
+.byte 102,15,56,0,253
+ movl 240(%edx),%ecx
+ movl %edx,%ebp
+ movl %ecx,%ebx
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L028enc1_loop_5:
+.byte 102,15,56,220,209
+ decl %ecx
+ movaps (%edx),%xmm1
+ leal 16(%edx),%edx
+ jnz .L028enc1_loop_5
+.byte 102,15,56,221,209
+.L029ccm64_dec_outer:
+ paddq 16(%esp),%xmm7
+ movups (%esi),%xmm6
+ xorps %xmm2,%xmm6
+ movdqa %xmm7,%xmm2
+ leal 16(%esi),%esi
+.byte 102,15,56,0,213
+ movl %ebp,%edx
+ movl %ebx,%ecx
+ movups %xmm6,(%edi)
+ leal 16(%edi),%edi
+ subl $1,%eax
+ jz .L030ccm64_dec_break
+ movaps (%edx),%xmm0
+ shrl $1,%ecx
+ movaps 16(%edx),%xmm1
+ xorps %xmm0,%xmm6
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+ xorps %xmm6,%xmm3
+ movaps (%edx),%xmm0
+.L031ccm64_dec2_loop:
+.byte 102,15,56,220,209
+ decl %ecx
+.byte 102,15,56,220,217
+ movaps 16(%edx),%xmm1
+.byte 102,15,56,220,208
+ leal 32(%edx),%edx
+.byte 102,15,56,220,216
+ movaps (%edx),%xmm0
+ jnz .L031ccm64_dec2_loop
+.byte 102,15,56,220,209
+.byte 102,15,56,220,217
+.byte 102,15,56,221,208
+.byte 102,15,56,221,216
+ jmp .L029ccm64_dec_outer
+.align 16
+.L030ccm64_dec_break:
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ xorps %xmm0,%xmm6
+ leal 32(%edx),%edx
+ xorps %xmm6,%xmm3
+.L032enc1_loop_6:
+.byte 102,15,56,220,217
+ decl %ecx
+ movaps (%edx),%xmm1
+ leal 16(%edx),%edx
+ jnz .L032enc1_loop_6
+.byte 102,15,56,221,217
+ movl 48(%esp),%esp
+ movl 40(%esp),%edi
+ movups %xmm3,(%edi)
+ popl %edi
+ popl %esi
+ popl %ebx
+ popl %ebp
+ ret
+.size aesni_ccm64_decrypt_blocks,.-.L_aesni_ccm64_decrypt_blocks_begin
+.globl aesni_ctr32_encrypt_blocks
+.type aesni_ctr32_encrypt_blocks,@function
+.align 16
+aesni_ctr32_encrypt_blocks:
+.L_aesni_ctr32_encrypt_blocks_begin:
+ pushl %ebp
+ pushl %ebx
+ pushl %esi
+ pushl %edi
+ movl 20(%esp),%esi
+ movl 24(%esp),%edi
+ movl 28(%esp),%eax
+ movl 32(%esp),%edx
+ movl 36(%esp),%ebx
+ movl %esp,%ebp
+ subl $88,%esp
+ andl $-16,%esp
+ movl %ebp,80(%esp)
+ cmpl $1,%eax
+ je .L033ctr32_one_shortcut
+ movdqu (%ebx),%xmm7
+ movl $202182159,(%esp)
+ movl $134810123,4(%esp)
+ movl $67438087,8(%esp)
+ movl $66051,12(%esp)
+ movl $6,%ecx
+ xorl %ebp,%ebp
+ movl %ecx,16(%esp)
+ movl %ecx,20(%esp)
+ movl %ecx,24(%esp)
+ movl %ebp,28(%esp)
+.byte 102,15,58,22,251,3
+.byte 102,15,58,34,253,3
+ movl 240(%edx),%ecx
+ bswap %ebx
+ pxor %xmm1,%xmm1
+ pxor %xmm0,%xmm0
+ movdqa (%esp),%xmm2
+.byte 102,15,58,34,203,0
+ leal 3(%ebx),%ebp
+.byte 102,15,58,34,197,0
+ incl %ebx
+.byte 102,15,58,34,203,1
+ incl %ebp
+.byte 102,15,58,34,197,1
+ incl %ebx
+.byte 102,15,58,34,203,2
+ incl %ebp
+.byte 102,15,58,34,197,2
+ movdqa %xmm1,48(%esp)
+.byte 102,15,56,0,202
+ movdqa %xmm0,64(%esp)
+.byte 102,15,56,0,194
+ pshufd $192,%xmm1,%xmm2
+ pshufd $128,%xmm1,%xmm3
+ cmpl $6,%eax
+ jb .L034ctr32_tail
+ movdqa %xmm7,32(%esp)
+ shrl $1,%ecx
+ movl %edx,%ebp
+ movl %ecx,%ebx
+ subl $6,%eax
+ jmp .L035ctr32_loop6
+.align 16
+.L035ctr32_loop6:
+ pshufd $64,%xmm1,%xmm4
+ movdqa 32(%esp),%xmm1
+ pshufd $192,%xmm0,%xmm5
+ por %xmm1,%xmm2
+ pshufd $128,%xmm0,%xmm6
+ por %xmm1,%xmm3
+ pshufd $64,%xmm0,%xmm7
+ por %xmm1,%xmm4
+ por %xmm1,%xmm5
+ por %xmm1,%xmm6
+ por %xmm1,%xmm7
+ movaps (%ebp),%xmm0
+ movaps 16(%ebp),%xmm1
+ leal 32(%ebp),%edx
+ decl %ecx
+ pxor %xmm0,%xmm2
+ pxor %xmm0,%xmm3
+.byte 102,15,56,220,209
+ pxor %xmm0,%xmm4
+.byte 102,15,56,220,217
+ pxor %xmm0,%xmm5
+.byte 102,15,56,220,225
+ pxor %xmm0,%xmm6
+.byte 102,15,56,220,233
+ pxor %xmm0,%xmm7
+.byte 102,15,56,220,241
+ movaps (%edx),%xmm0
+.byte 102,15,56,220,249
+ call .L_aesni_encrypt6_enter
+ movups (%esi),%xmm1
+ movups 16(%esi),%xmm0
+ xorps %xmm1,%xmm2
+ movups 32(%esi),%xmm1
+ xorps %xmm0,%xmm3
+ movups %xmm2,(%edi)
+ movdqa 16(%esp),%xmm0
+ xorps %xmm1,%xmm4
+ movdqa 48(%esp),%xmm1
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ paddd %xmm0,%xmm1
+ paddd 64(%esp),%xmm0
+ movdqa (%esp),%xmm2
+ movups 48(%esi),%xmm3
+ movups 64(%esi),%xmm4
+ xorps %xmm3,%xmm5
+ movups 80(%esi),%xmm3
+ leal 96(%esi),%esi
+ movdqa %xmm1,48(%esp)
+.byte 102,15,56,0,202
+ xorps %xmm4,%xmm6
+ movups %xmm5,48(%edi)
+ xorps %xmm3,%xmm7
+ movdqa %xmm0,64(%esp)
+.byte 102,15,56,0,194
+ movups %xmm6,64(%edi)
+ pshufd $192,%xmm1,%xmm2
+ movups %xmm7,80(%edi)
+ leal 96(%edi),%edi
+ movl %ebx,%ecx
+ pshufd $128,%xmm1,%xmm3
+ subl $6,%eax
+ jnc .L035ctr32_loop6
+ addl $6,%eax
+ jz .L036ctr32_ret
+ movl %ebp,%edx
+ leal 1(,%ecx,2),%ecx
+ movdqa 32(%esp),%xmm7
+.L034ctr32_tail:
+ por %xmm7,%xmm2
+ cmpl $2,%eax
+ jb .L037ctr32_one
+ pshufd $64,%xmm1,%xmm4
+ por %xmm7,%xmm3
+ je .L038ctr32_two
+ pshufd $192,%xmm0,%xmm5
+ por %xmm7,%xmm4
+ cmpl $4,%eax
+ jb .L039ctr32_three
+ pshufd $128,%xmm0,%xmm6
+ por %xmm7,%xmm5
+ je .L040ctr32_four
+ por %xmm7,%xmm6
+ call _aesni_encrypt6
+ movups (%esi),%xmm1
+ movups 16(%esi),%xmm0
+ xorps %xmm1,%xmm2
+ movups 32(%esi),%xmm1
+ xorps %xmm0,%xmm3
+ movups 48(%esi),%xmm0
+ xorps %xmm1,%xmm4
+ movups 64(%esi),%xmm1
+ xorps %xmm0,%xmm5
+ movups %xmm2,(%edi)
+ xorps %xmm1,%xmm6
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+ movups %xmm6,64(%edi)
+ jmp .L036ctr32_ret
+.align 16
+.L033ctr32_one_shortcut:
+ movups (%ebx),%xmm2
+ movl 240(%edx),%ecx
+.L037ctr32_one:
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L041enc1_loop_7:
+.byte 102,15,56,220,209
+ decl %ecx
+ movaps (%edx),%xmm1
+ leal 16(%edx),%edx
+ jnz .L041enc1_loop_7
+.byte 102,15,56,221,209
+ movups (%esi),%xmm6
+ xorps %xmm2,%xmm6
+ movups %xmm6,(%edi)
+ jmp .L036ctr32_ret
+.align 16
+.L038ctr32_two:
+ call _aesni_encrypt3
+ movups (%esi),%xmm5
+ movups 16(%esi),%xmm6
+ xorps %xmm5,%xmm2
+ xorps %xmm6,%xmm3
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ jmp .L036ctr32_ret
+.align 16
+.L039ctr32_three:
+ call _aesni_encrypt3
+ movups (%esi),%xmm5
+ movups 16(%esi),%xmm6
+ xorps %xmm5,%xmm2
+ movups 32(%esi),%xmm7
+ xorps %xmm6,%xmm3
+ movups %xmm2,(%edi)
+ xorps %xmm7,%xmm4
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ jmp .L036ctr32_ret
+.align 16
+.L040ctr32_four:
+ call _aesni_encrypt4
+ movups (%esi),%xmm6
+ movups 16(%esi),%xmm7
+ movups 32(%esi),%xmm1
+ xorps %xmm6,%xmm2
+ movups 48(%esi),%xmm0
+ xorps %xmm7,%xmm3
+ movups %xmm2,(%edi)
+ xorps %xmm1,%xmm4
+ movups %xmm3,16(%edi)
+ xorps %xmm0,%xmm5
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+.L036ctr32_ret:
+ movl 80(%esp),%esp
+ popl %edi
+ popl %esi
+ popl %ebx
+ popl %ebp
+ ret
+.size aesni_ctr32_encrypt_blocks,.-.L_aesni_ctr32_encrypt_blocks_begin
+.globl aesni_xts_encrypt
+.type aesni_xts_encrypt,@function
+.align 16
+aesni_xts_encrypt:
+.L_aesni_xts_encrypt_begin:
+ pushl %ebp
+ pushl %ebx
+ pushl %esi
+ pushl %edi
+ movl 36(%esp),%edx
+ movl 40(%esp),%esi
+ movl 240(%edx),%ecx
+ movups (%esi),%xmm2
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L042enc1_loop_8:
+.byte 102,15,56,220,209
+ decl %ecx
+ movaps (%edx),%xmm1
+ leal 16(%edx),%edx
+ jnz .L042enc1_loop_8
+.byte 102,15,56,221,209
+ movl 20(%esp),%esi
+ movl 24(%esp),%edi
+ movl 28(%esp),%eax
+ movl 32(%esp),%edx
+ movl %esp,%ebp
+ subl $120,%esp
+ movl 240(%edx),%ecx
+ andl $-16,%esp
+ movl $135,96(%esp)
+ movl $0,100(%esp)
+ movl $1,104(%esp)
+ movl $0,108(%esp)
+ movl %eax,112(%esp)
+ movl %ebp,116(%esp)
+ movdqa %xmm2,%xmm1
+ pxor %xmm0,%xmm0
+ movdqa 96(%esp),%xmm3
+ pcmpgtd %xmm1,%xmm0
+ andl $-16,%eax
+ movl %edx,%ebp
+ movl %ecx,%ebx
+ subl $96,%eax
+ jc .L043xts_enc_short
+ shrl $1,%ecx
+ movl %ecx,%ebx
+ jmp .L044xts_enc_loop6
+.align 16
+.L044xts_enc_loop6:
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,(%esp)
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,16(%esp)
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,32(%esp)
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,48(%esp)
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ pshufd $19,%xmm0,%xmm7
+ movdqa %xmm1,64(%esp)
+ paddq %xmm1,%xmm1
+ movaps (%ebp),%xmm0
+ pand %xmm3,%xmm7
+ movups (%esi),%xmm2
+ pxor %xmm1,%xmm7
+ movdqu 16(%esi),%xmm3
+ xorps %xmm0,%xmm2
+ movdqu 32(%esi),%xmm4
+ pxor %xmm0,%xmm3
+ movdqu 48(%esi),%xmm5
+ pxor %xmm0,%xmm4
+ movdqu 64(%esi),%xmm6
+ pxor %xmm0,%xmm5
+ movdqu 80(%esi),%xmm1
+ pxor %xmm0,%xmm6
+ leal 96(%esi),%esi
+ pxor (%esp),%xmm2
+ movdqa %xmm7,80(%esp)
+ pxor %xmm1,%xmm7
+ movaps 16(%ebp),%xmm1
+ leal 32(%ebp),%edx
+ pxor 16(%esp),%xmm3
+.byte 102,15,56,220,209
+ pxor 32(%esp),%xmm4
+.byte 102,15,56,220,217
+ pxor 48(%esp),%xmm5
+ decl %ecx
+.byte 102,15,56,220,225
+ pxor 64(%esp),%xmm6
+.byte 102,15,56,220,233
+ pxor %xmm0,%xmm7
+.byte 102,15,56,220,241
+ movaps (%edx),%xmm0
+.byte 102,15,56,220,249
+ call .L_aesni_encrypt6_enter
+ movdqa 80(%esp),%xmm1
+ pxor %xmm0,%xmm0
+ xorps (%esp),%xmm2
+ pcmpgtd %xmm1,%xmm0
+ xorps 16(%esp),%xmm3
+ movups %xmm2,(%edi)
+ xorps 32(%esp),%xmm4
+ movups %xmm3,16(%edi)
+ xorps 48(%esp),%xmm5
+ movups %xmm4,32(%edi)
+ xorps 64(%esp),%xmm6
+ movups %xmm5,48(%edi)
+ xorps %xmm1,%xmm7
+ movups %xmm6,64(%edi)
+ pshufd $19,%xmm0,%xmm2
+ movups %xmm7,80(%edi)
+ leal 96(%edi),%edi
+ movdqa 96(%esp),%xmm3
+ pxor %xmm0,%xmm0
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ movl %ebx,%ecx
+ pxor %xmm2,%xmm1
+ subl $96,%eax
+ jnc .L044xts_enc_loop6
+ leal 1(,%ecx,2),%ecx
+ movl %ebp,%edx
+ movl %ecx,%ebx
+.L043xts_enc_short:
+ addl $96,%eax
+ jz .L045xts_enc_done6x
+ movdqa %xmm1,%xmm5
+ cmpl $32,%eax
+ jb .L046xts_enc_one
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ je .L047xts_enc_two
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,%xmm6
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ cmpl $64,%eax
+ jb .L048xts_enc_three
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,%xmm7
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ movdqa %xmm5,(%esp)
+ movdqa %xmm6,16(%esp)
+ je .L049xts_enc_four
+ movdqa %xmm7,32(%esp)
+ pshufd $19,%xmm0,%xmm7
+ movdqa %xmm1,48(%esp)
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm7
+ pxor %xmm1,%xmm7
+ movdqu (%esi),%xmm2
+ movdqu 16(%esi),%xmm3
+ movdqu 32(%esi),%xmm4
+ pxor (%esp),%xmm2
+ movdqu 48(%esi),%xmm5
+ pxor 16(%esp),%xmm3
+ movdqu 64(%esi),%xmm6
+ pxor 32(%esp),%xmm4
+ leal 80(%esi),%esi
+ pxor 48(%esp),%xmm5
+ movdqa %xmm7,64(%esp)
+ pxor %xmm7,%xmm6
+ call _aesni_encrypt6
+ movaps 64(%esp),%xmm1
+ xorps (%esp),%xmm2
+ xorps 16(%esp),%xmm3
+ xorps 32(%esp),%xmm4
+ movups %xmm2,(%edi)
+ xorps 48(%esp),%xmm5
+ movups %xmm3,16(%edi)
+ xorps %xmm1,%xmm6
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+ movups %xmm6,64(%edi)
+ leal 80(%edi),%edi
+ jmp .L050xts_enc_done
+.align 16
+.L046xts_enc_one:
+ movups (%esi),%xmm2
+ leal 16(%esi),%esi
+ xorps %xmm5,%xmm2
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L051enc1_loop_9:
+.byte 102,15,56,220,209
+ decl %ecx
+ movaps (%edx),%xmm1
+ leal 16(%edx),%edx
+ jnz .L051enc1_loop_9
+.byte 102,15,56,221,209
+ xorps %xmm5,%xmm2
+ movups %xmm2,(%edi)
+ leal 16(%edi),%edi
+ movdqa %xmm5,%xmm1
+ jmp .L050xts_enc_done
+.align 16
+.L047xts_enc_two:
+ movaps %xmm1,%xmm6
+ movups (%esi),%xmm2
+ movups 16(%esi),%xmm3
+ leal 32(%esi),%esi
+ xorps %xmm5,%xmm2
+ xorps %xmm6,%xmm3
+ xorps %xmm4,%xmm4
+ call _aesni_encrypt3
+ xorps %xmm5,%xmm2
+ xorps %xmm6,%xmm3
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ leal 32(%edi),%edi
+ movdqa %xmm6,%xmm1
+ jmp .L050xts_enc_done
+.align 16
+.L048xts_enc_three:
+ movaps %xmm1,%xmm7
+ movups (%esi),%xmm2
+ movups 16(%esi),%xmm3
+ movups 32(%esi),%xmm4
leal 48(%esi),%esi
+ xorps %xmm5,%xmm2
+ xorps %xmm6,%xmm3
+ xorps %xmm7,%xmm4
+ call _aesni_encrypt3
+ xorps %xmm5,%xmm2
+ xorps %xmm6,%xmm3
+ xorps %xmm7,%xmm4
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
leal 48(%edi),%edi
- movups %xmm0,-48(%edi)
+ movdqa %xmm7,%xmm1
+ jmp .L050xts_enc_done
+.align 16
+.L049xts_enc_four:
+ movaps %xmm1,%xmm6
+ movups (%esi),%xmm2
+ movups 16(%esi),%xmm3
+ movups 32(%esi),%xmm4
+ xorps (%esp),%xmm2
+ movups 48(%esi),%xmm5
+ leal 64(%esi),%esi
+ xorps 16(%esp),%xmm3
+ xorps %xmm7,%xmm4
+ xorps %xmm6,%xmm5
+ call _aesni_encrypt4
+ xorps (%esp),%xmm2
+ xorps 16(%esp),%xmm3
+ xorps %xmm7,%xmm4
+ movups %xmm2,(%edi)
+ xorps %xmm6,%xmm5
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+ leal 64(%edi),%edi
+ movdqa %xmm6,%xmm1
+ jmp .L050xts_enc_done
+.align 16
+.L045xts_enc_done6x:
+ movl 112(%esp),%eax
+ andl $15,%eax
+ jz .L052xts_enc_ret
+ movdqa %xmm1,%xmm5
+ movl %eax,112(%esp)
+ jmp .L053xts_enc_steal
+.align 16
+.L050xts_enc_done:
+ movl 112(%esp),%eax
+ pxor %xmm0,%xmm0
+ andl $15,%eax
+ jz .L052xts_enc_ret
+ pcmpgtd %xmm1,%xmm0
+ movl %eax,112(%esp)
+ pshufd $19,%xmm0,%xmm5
+ paddq %xmm1,%xmm1
+ pand 96(%esp),%xmm5
+ pxor %xmm1,%xmm5
+.L053xts_enc_steal:
+ movzbl (%esi),%ecx
+ movzbl -16(%edi),%edx
+ leal 1(%esi),%esi
+ movb %cl,-16(%edi)
+ movb %dl,(%edi)
+ leal 1(%edi),%edi
+ subl $1,%eax
+ jnz .L053xts_enc_steal
+ subl 112(%esp),%edi
movl %ebp,%edx
- movups %xmm1,-32(%edi)
movl %ebx,%ecx
+ movups -16(%edi),%xmm2
+ xorps %xmm5,%xmm2
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L054enc1_loop_10:
+.byte 102,15,56,220,209
+ decl %ecx
+ movaps (%edx),%xmm1
+ leal 16(%edx),%edx
+ jnz .L054enc1_loop_10
+.byte 102,15,56,221,209
+ xorps %xmm5,%xmm2
movups %xmm2,-16(%edi)
- ja .L015ecb_dec_loop3
-.L014ecb_dec_tail:
- addl $64,%eax
- jz .L006ecb_ret
- cmpl $16,%eax
- movups (%esi),%xmm0
- je .L016ecb_dec_one
+.L052xts_enc_ret:
+ movl 116(%esp),%esp
+ popl %edi
+ popl %esi
+ popl %ebx
+ popl %ebp
+ ret
+.size aesni_xts_encrypt,.-.L_aesni_xts_encrypt_begin
+.globl aesni_xts_decrypt
+.type aesni_xts_decrypt,@function
+.align 16
+aesni_xts_decrypt:
+.L_aesni_xts_decrypt_begin:
+ pushl %ebp
+ pushl %ebx
+ pushl %esi
+ pushl %edi
+ movl 36(%esp),%edx
+ movl 40(%esp),%esi
+ movl 240(%edx),%ecx
+ movups (%esi),%xmm2
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L055enc1_loop_11:
+.byte 102,15,56,220,209
+ decl %ecx
+ movaps (%edx),%xmm1
+ leal 16(%edx),%edx
+ jnz .L055enc1_loop_11
+.byte 102,15,56,221,209
+ movl 20(%esp),%esi
+ movl 24(%esp),%edi
+ movl 28(%esp),%eax
+ movl 32(%esp),%edx
+ movl %esp,%ebp
+ subl $120,%esp
+ andl $-16,%esp
+ xorl %ebx,%ebx
+ testl $15,%eax
+ setnz %bl
+ shll $4,%ebx
+ subl %ebx,%eax
+ movl $135,96(%esp)
+ movl $0,100(%esp)
+ movl $1,104(%esp)
+ movl $0,108(%esp)
+ movl %eax,112(%esp)
+ movl %ebp,116(%esp)
+ movl 240(%edx),%ecx
+ movl %edx,%ebp
+ movl %ecx,%ebx
+ movdqa %xmm2,%xmm1
+ pxor %xmm0,%xmm0
+ movdqa 96(%esp),%xmm3
+ pcmpgtd %xmm1,%xmm0
+ andl $-16,%eax
+ subl $96,%eax
+ jc .L056xts_dec_short
+ shrl $1,%ecx
+ movl %ecx,%ebx
+ jmp .L057xts_dec_loop6
+.align 16
+.L057xts_dec_loop6:
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,(%esp)
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,16(%esp)
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,32(%esp)
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,48(%esp)
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ pshufd $19,%xmm0,%xmm7
+ movdqa %xmm1,64(%esp)
+ paddq %xmm1,%xmm1
+ movaps (%ebp),%xmm0
+ pand %xmm3,%xmm7
+ movups (%esi),%xmm2
+ pxor %xmm1,%xmm7
+ movdqu 16(%esi),%xmm3
+ xorps %xmm0,%xmm2
+ movdqu 32(%esi),%xmm4
+ pxor %xmm0,%xmm3
+ movdqu 48(%esi),%xmm5
+ pxor %xmm0,%xmm4
+ movdqu 64(%esi),%xmm6
+ pxor %xmm0,%xmm5
+ movdqu 80(%esi),%xmm1
+ pxor %xmm0,%xmm6
+ leal 96(%esi),%esi
+ pxor (%esp),%xmm2
+ movdqa %xmm7,80(%esp)
+ pxor %xmm1,%xmm7
+ movaps 16(%ebp),%xmm1
+ leal 32(%ebp),%edx
+ pxor 16(%esp),%xmm3
+.byte 102,15,56,222,209
+ pxor 32(%esp),%xmm4
+.byte 102,15,56,222,217
+ pxor 48(%esp),%xmm5
+ decl %ecx
+.byte 102,15,56,222,225
+ pxor 64(%esp),%xmm6
+.byte 102,15,56,222,233
+ pxor %xmm0,%xmm7
+.byte 102,15,56,222,241
+ movaps (%edx),%xmm0
+.byte 102,15,56,222,249
+ call .L_aesni_decrypt6_enter
+ movdqa 80(%esp),%xmm1
+ pxor %xmm0,%xmm0
+ xorps (%esp),%xmm2
+ pcmpgtd %xmm1,%xmm0
+ xorps 16(%esp),%xmm3
+ movups %xmm2,(%edi)
+ xorps 32(%esp),%xmm4
+ movups %xmm3,16(%edi)
+ xorps 48(%esp),%xmm5
+ movups %xmm4,32(%edi)
+ xorps 64(%esp),%xmm6
+ movups %xmm5,48(%edi)
+ xorps %xmm1,%xmm7
+ movups %xmm6,64(%edi)
+ pshufd $19,%xmm0,%xmm2
+ movups %xmm7,80(%edi)
+ leal 96(%edi),%edi
+ movdqa 96(%esp),%xmm3
+ pxor %xmm0,%xmm0
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ movl %ebx,%ecx
+ pxor %xmm2,%xmm1
+ subl $96,%eax
+ jnc .L057xts_dec_loop6
+ leal 1(,%ecx,2),%ecx
+ movl %ebp,%edx
+ movl %ecx,%ebx
+.L056xts_dec_short:
+ addl $96,%eax
+ jz .L058xts_dec_done6x
+ movdqa %xmm1,%xmm5
cmpl $32,%eax
- movups 16(%esi),%xmm1
- je .L017ecb_dec_two
- cmpl $48,%eax
- movups 32(%esi),%xmm2
- je .L018ecb_dec_three
- movups 48(%esi),%xmm7
- call _aesni_decrypt4
- movups %xmm0,(%edi)
- movups %xmm1,16(%edi)
- movups %xmm2,32(%edi)
- movups %xmm7,48(%edi)
- jmp .L006ecb_ret
-.align 16
-.L016ecb_dec_one:
- movups (%edx),%xmm3
- movups 16(%edx),%xmm4
- leal 32(%edx),%edx
- pxor %xmm3,%xmm0
-.L019dec1_loop:
- aesdec %xmm4,%xmm0
+ jb .L059xts_dec_one
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ je .L060xts_dec_two
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,%xmm6
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ cmpl $64,%eax
+ jb .L061xts_dec_three
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa %xmm1,%xmm7
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+ movdqa %xmm5,(%esp)
+ movdqa %xmm6,16(%esp)
+ je .L062xts_dec_four
+ movdqa %xmm7,32(%esp)
+ pshufd $19,%xmm0,%xmm7
+ movdqa %xmm1,48(%esp)
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm7
+ pxor %xmm1,%xmm7
+ movdqu (%esi),%xmm2
+ movdqu 16(%esi),%xmm3
+ movdqu 32(%esi),%xmm4
+ pxor (%esp),%xmm2
+ movdqu 48(%esi),%xmm5
+ pxor 16(%esp),%xmm3
+ movdqu 64(%esi),%xmm6
+ pxor 32(%esp),%xmm4
+ leal 80(%esi),%esi
+ pxor 48(%esp),%xmm5
+ movdqa %xmm7,64(%esp)
+ pxor %xmm7,%xmm6
+ call _aesni_decrypt6
+ movaps 64(%esp),%xmm1
+ xorps (%esp),%xmm2
+ xorps 16(%esp),%xmm3
+ xorps 32(%esp),%xmm4
+ movups %xmm2,(%edi)
+ xorps 48(%esp),%xmm5
+ movups %xmm3,16(%edi)
+ xorps %xmm1,%xmm6
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+ movups %xmm6,64(%edi)
+ leal 80(%edi),%edi
+ jmp .L063xts_dec_done
+.align 16
+.L059xts_dec_one:
+ movups (%esi),%xmm2
+ leal 16(%esi),%esi
+ xorps %xmm5,%xmm2
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L064dec1_loop_12:
+.byte 102,15,56,222,209
decl %ecx
- movups (%edx),%xmm4
+ movaps (%edx),%xmm1
leal 16(%edx),%edx
- jnz .L019dec1_loop
- aesdeclast %xmm4,%xmm0
- movups %xmm0,(%edi)
- jmp .L006ecb_ret
+ jnz .L064dec1_loop_12
+.byte 102,15,56,223,209
+ xorps %xmm5,%xmm2
+ movups %xmm2,(%edi)
+ leal 16(%edi),%edi
+ movdqa %xmm5,%xmm1
+ jmp .L063xts_dec_done
.align 16
-.L017ecb_dec_two:
+.L060xts_dec_two:
+ movaps %xmm1,%xmm6
+ movups (%esi),%xmm2
+ movups 16(%esi),%xmm3
+ leal 32(%esi),%esi
+ xorps %xmm5,%xmm2
+ xorps %xmm6,%xmm3
call _aesni_decrypt3
- movups %xmm0,(%edi)
- movups %xmm1,16(%edi)
- jmp .L006ecb_ret
+ xorps %xmm5,%xmm2
+ xorps %xmm6,%xmm3
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ leal 32(%edi),%edi
+ movdqa %xmm6,%xmm1
+ jmp .L063xts_dec_done
.align 16
-.L018ecb_dec_three:
+.L061xts_dec_three:
+ movaps %xmm1,%xmm7
+ movups (%esi),%xmm2
+ movups 16(%esi),%xmm3
+ movups 32(%esi),%xmm4
+ leal 48(%esi),%esi
+ xorps %xmm5,%xmm2
+ xorps %xmm6,%xmm3
+ xorps %xmm7,%xmm4
call _aesni_decrypt3
- movups %xmm0,(%edi)
- movups %xmm1,16(%edi)
- movups %xmm2,32(%edi)
-.L006ecb_ret:
+ xorps %xmm5,%xmm2
+ xorps %xmm6,%xmm3
+ xorps %xmm7,%xmm4
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ leal 48(%edi),%edi
+ movdqa %xmm7,%xmm1
+ jmp .L063xts_dec_done
+.align 16
+.L062xts_dec_four:
+ movaps %xmm1,%xmm6
+ movups (%esi),%xmm2
+ movups 16(%esi),%xmm3
+ movups 32(%esi),%xmm4
+ xorps (%esp),%xmm2
+ movups 48(%esi),%xmm5
+ leal 64(%esi),%esi
+ xorps 16(%esp),%xmm3
+ xorps %xmm7,%xmm4
+ xorps %xmm6,%xmm5
+ call _aesni_decrypt4
+ xorps (%esp),%xmm2
+ xorps 16(%esp),%xmm3
+ xorps %xmm7,%xmm4
+ movups %xmm2,(%edi)
+ xorps %xmm6,%xmm5
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+ leal 64(%edi),%edi
+ movdqa %xmm6,%xmm1
+ jmp .L063xts_dec_done
+.align 16
+.L058xts_dec_done6x:
+ movl 112(%esp),%eax
+ andl $15,%eax
+ jz .L065xts_dec_ret
+ movl %eax,112(%esp)
+ jmp .L066xts_dec_only_one_more
+.align 16
+.L063xts_dec_done:
+ movl 112(%esp),%eax
+ pxor %xmm0,%xmm0
+ andl $15,%eax
+ jz .L065xts_dec_ret
+ pcmpgtd %xmm1,%xmm0
+ movl %eax,112(%esp)
+ pshufd $19,%xmm0,%xmm2
+ pxor %xmm0,%xmm0
+ movdqa 96(%esp),%xmm3
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm2
+ pcmpgtd %xmm1,%xmm0
+ pxor %xmm2,%xmm1
+.L066xts_dec_only_one_more:
+ pshufd $19,%xmm0,%xmm5
+ movdqa %xmm1,%xmm6
+ paddq %xmm1,%xmm1
+ pand %xmm3,%xmm5
+ pxor %xmm1,%xmm5
+ movl %ebp,%edx
+ movl %ebx,%ecx
+ movups (%esi),%xmm2
+ xorps %xmm5,%xmm2
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L067dec1_loop_13:
+.byte 102,15,56,222,209
+ decl %ecx
+ movaps (%edx),%xmm1
+ leal 16(%edx),%edx
+ jnz .L067dec1_loop_13
+.byte 102,15,56,223,209
+ xorps %xmm5,%xmm2
+ movups %xmm2,(%edi)
+.L068xts_dec_steal:
+ movzbl 16(%esi),%ecx
+ movzbl (%edi),%edx
+ leal 1(%esi),%esi
+ movb %cl,(%edi)
+ movb %dl,16(%edi)
+ leal 1(%edi),%edi
+ subl $1,%eax
+ jnz .L068xts_dec_steal
+ subl 112(%esp),%edi
+ movl %ebp,%edx
+ movl %ebx,%ecx
+ movups (%edi),%xmm2
+ xorps %xmm6,%xmm2
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L069dec1_loop_14:
+.byte 102,15,56,222,209
+ decl %ecx
+ movaps (%edx),%xmm1
+ leal 16(%edx),%edx
+ jnz .L069dec1_loop_14
+.byte 102,15,56,223,209
+ xorps %xmm6,%xmm2
+ movups %xmm2,(%edi)
+.L065xts_dec_ret:
+ movl 116(%esp),%esp
popl %edi
popl %esi
popl %ebx
popl %ebp
ret
-.size aesni_ecb_encrypt,.-.L_aesni_ecb_encrypt_begin
+.size aesni_xts_decrypt,.-.L_aesni_xts_decrypt_begin
.globl aesni_cbc_encrypt
.type aesni_cbc_encrypt,@function
.align 16
@@ -397,50 +1715,55 @@ aesni_cbc_encrypt:
pushl %esi
pushl %edi
movl 20(%esp),%esi
+ movl %esp,%ebx
movl 24(%esp),%edi
+ subl $24,%ebx
movl 28(%esp),%eax
+ andl $-16,%ebx
movl 32(%esp),%edx
- testl %eax,%eax
movl 36(%esp),%ebp
- jz .L020cbc_ret
+ testl %eax,%eax
+ jz .L070cbc_abort
cmpl $0,40(%esp)
- movups (%ebp),%xmm5
+ xchgl %esp,%ebx
+ movups (%ebp),%xmm7
movl 240(%edx),%ecx
movl %edx,%ebp
+ movl %ebx,16(%esp)
movl %ecx,%ebx
- je .L021cbc_decrypt
- movaps %xmm5,%xmm0
+ je .L071cbc_decrypt
+ movaps %xmm7,%xmm2
cmpl $16,%eax
- jb .L022cbc_enc_tail
+ jb .L072cbc_enc_tail
subl $16,%eax
- jmp .L023cbc_enc_loop
+ jmp .L073cbc_enc_loop
.align 16
-.L023cbc_enc_loop:
- movups (%esi),%xmm5
+.L073cbc_enc_loop:
+ movups (%esi),%xmm7
leal 16(%esi),%esi
- pxor %xmm5,%xmm0
- movups (%edx),%xmm3
- movups 16(%edx),%xmm4
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ xorps %xmm0,%xmm7
leal 32(%edx),%edx
- pxor %xmm3,%xmm0
-.L024enc1_loop:
- aesenc %xmm4,%xmm0
+ xorps %xmm7,%xmm2
+.L074enc1_loop_15:
+.byte 102,15,56,220,209
decl %ecx
- movups (%edx),%xmm4
+ movaps (%edx),%xmm1
leal 16(%edx),%edx
- jnz .L024enc1_loop
- aesenclast %xmm4,%xmm0
- subl $16,%eax
- leal 16(%edi),%edi
+ jnz .L074enc1_loop_15
+.byte 102,15,56,221,209
movl %ebx,%ecx
movl %ebp,%edx
- movups %xmm0,-16(%edi)
- jnc .L023cbc_enc_loop
+ movups %xmm2,(%edi)
+ leal 16(%edi),%edi
+ subl $16,%eax
+ jnc .L073cbc_enc_loop
addl $16,%eax
- jnz .L022cbc_enc_tail
- movaps %xmm0,%xmm5
- jmp .L020cbc_ret
-.L022cbc_enc_tail:
+ jnz .L072cbc_enc_tail
+ movaps %xmm2,%xmm7
+ jmp .L075cbc_ret
+.L072cbc_enc_tail:
movl %eax,%ecx
.long 2767451785
movl $16,%ecx
@@ -451,113 +1774,169 @@ aesni_cbc_encrypt:
movl %ebx,%ecx
movl %edi,%esi
movl %ebp,%edx
- jmp .L023cbc_enc_loop
+ jmp .L073cbc_enc_loop
.align 16
-.L021cbc_decrypt:
- subl $64,%eax
- jbe .L025cbc_dec_tail
- jmp .L026cbc_dec_loop3
+.L071cbc_decrypt:
+ cmpl $80,%eax
+ jbe .L076cbc_dec_tail
+ movaps %xmm7,(%esp)
+ subl $80,%eax
+ jmp .L077cbc_dec_loop6_enter
.align 16
-.L026cbc_dec_loop3:
- movups (%esi),%xmm0
- movups 16(%esi),%xmm1
- movups 32(%esi),%xmm2
- movaps %xmm0,%xmm6
- movaps %xmm1,%xmm7
- call _aesni_decrypt3
- subl $48,%eax
- leal 48(%esi),%esi
- leal 48(%edi),%edi
- pxor %xmm5,%xmm0
- pxor %xmm6,%xmm1
- movups -16(%esi),%xmm5
- pxor %xmm7,%xmm2
- movups %xmm0,-48(%edi)
+.L078cbc_dec_loop6:
+ movaps %xmm0,(%esp)
+ movups %xmm7,(%edi)
+ leal 16(%edi),%edi
+.L077cbc_dec_loop6_enter:
+ movdqu (%esi),%xmm2
+ movdqu 16(%esi),%xmm3
+ movdqu 32(%esi),%xmm4
+ movdqu 48(%esi),%xmm5
+ movdqu 64(%esi),%xmm6
+ movdqu 80(%esi),%xmm7
+ call _aesni_decrypt6
+ movups (%esi),%xmm1
+ movups 16(%esi),%xmm0
+ xorps (%esp),%xmm2
+ xorps %xmm1,%xmm3
+ movups 32(%esi),%xmm1
+ xorps %xmm0,%xmm4
+ movups 48(%esi),%xmm0
+ xorps %xmm1,%xmm5
+ movups 64(%esi),%xmm1
+ xorps %xmm0,%xmm6
+ movups 80(%esi),%xmm0
+ xorps %xmm1,%xmm7
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ leal 96(%esi),%esi
+ movups %xmm4,32(%edi)
movl %ebx,%ecx
- movups %xmm1,-32(%edi)
+ movups %xmm5,48(%edi)
movl %ebp,%edx
- movups %xmm2,-16(%edi)
- ja .L026cbc_dec_loop3
-.L025cbc_dec_tail:
- addl $64,%eax
- jz .L020cbc_ret
- movups (%esi),%xmm0
+ movups %xmm6,64(%edi)
+ leal 80(%edi),%edi
+ subl $96,%eax
+ ja .L078cbc_dec_loop6
+ movaps %xmm7,%xmm2
+ movaps %xmm0,%xmm7
+ addl $80,%eax
+ jle .L079cbc_dec_tail_collected
+ movups %xmm2,(%edi)
+ leal 16(%edi),%edi
+.L076cbc_dec_tail:
+ movups (%esi),%xmm2
+ movaps %xmm2,%xmm6
cmpl $16,%eax
- movaps %xmm0,%xmm6
- jbe .L027cbc_dec_one
- movups 16(%esi),%xmm1
- cmpl $32,%eax
- movaps %xmm1,%xmm7
- jbe .L028cbc_dec_two
- movups 32(%esi),%xmm2
- cmpl $48,%eax
- jbe .L029cbc_dec_three
- movups 48(%esi),%xmm7
- call _aesni_decrypt4
+ jbe .L080cbc_dec_one
movups 16(%esi),%xmm3
+ movaps %xmm3,%xmm5
+ cmpl $32,%eax
+ jbe .L081cbc_dec_two
movups 32(%esi),%xmm4
- pxor %xmm5,%xmm0
- pxor %xmm6,%xmm1
+ cmpl $48,%eax
+ jbe .L082cbc_dec_three
movups 48(%esi),%xmm5
- movups %xmm0,(%edi)
- pxor %xmm3,%xmm2
- pxor %xmm4,%xmm7
- movups %xmm1,16(%edi)
- movups %xmm2,32(%edi)
- movaps %xmm7,%xmm0
- leal 48(%edi),%edi
- jmp .L030cbc_dec_tail_collected
-.L027cbc_dec_one:
- movups (%edx),%xmm3
- movups 16(%edx),%xmm4
- leal 32(%edx),%edx
- pxor %xmm3,%xmm0
-.L031dec1_loop:
- aesdec %xmm4,%xmm0
+ cmpl $64,%eax
+ jbe .L083cbc_dec_four
+ movups 64(%esi),%xmm6
+ movaps %xmm7,(%esp)
+ movups (%esi),%xmm2
+ xorps %xmm7,%xmm7
+ call _aesni_decrypt6
+ movups (%esi),%xmm1
+ movups 16(%esi),%xmm0
+ xorps (%esp),%xmm2
+ xorps %xmm1,%xmm3
+ movups 32(%esi),%xmm1
+ xorps %xmm0,%xmm4
+ movups 48(%esi),%xmm0
+ xorps %xmm1,%xmm5
+ movups 64(%esi),%xmm7
+ xorps %xmm0,%xmm6
+ movups %xmm2,(%edi)
+ movups %xmm3,16(%edi)
+ movups %xmm4,32(%edi)
+ movups %xmm5,48(%edi)
+ leal 64(%edi),%edi
+ movaps %xmm6,%xmm2
+ subl $80,%eax
+ jmp .L079cbc_dec_tail_collected
+.align 16
+.L080cbc_dec_one:
+ movaps (%edx),%xmm0
+ movaps 16(%edx),%xmm1
+ leal 32(%edx),%edx
+ xorps %xmm0,%xmm2
+.L084dec1_loop_16:
+.byte 102,15,56,222,209
decl %ecx
- movups (%edx),%xmm4
+ movaps (%edx),%xmm1
leal 16(%edx),%edx
- jnz .L031dec1_loop
- aesdeclast %xmm4,%xmm0
- pxor %xmm5,%xmm0
- movaps %xmm6,%xmm5
- jmp .L030cbc_dec_tail_collected
-.L028cbc_dec_two:
+ jnz .L084dec1_loop_16
+.byte 102,15,56,223,209
+ xorps %xmm7,%xmm2
+ movaps %xmm6,%xmm7
+ subl $16,%eax
+ jmp .L079cbc_dec_tail_collected
+.align 16
+.L081cbc_dec_two:
+ xorps %xmm4,%xmm4
call _aesni_decrypt3
- pxor %xmm5,%xmm0
- pxor %xmm6,%xmm1
- movups %xmm0,(%edi)
- movaps %xmm1,%xmm0
- movaps %xmm7,%xmm5
+ xorps %xmm7,%xmm2
+ xorps %xmm6,%xmm3
+ movups %xmm2,(%edi)
+ movaps %xmm3,%xmm2
leal 16(%edi),%edi
- jmp .L030cbc_dec_tail_collected
-.L029cbc_dec_three:
+ movaps %xmm5,%xmm7
+ subl $32,%eax
+ jmp .L079cbc_dec_tail_collected
+.align 16
+.L082cbc_dec_three:
call _aesni_decrypt3
- pxor %xmm5,%xmm0
- pxor %xmm6,%xmm1
- pxor %xmm7,%xmm2
- movups %xmm0,(%edi)
- movups %xmm1,16(%edi)
- movaps %xmm2,%xmm0
- movups 32(%esi),%xmm5
+ xorps %xmm7,%xmm2
+ xorps %xmm6,%xmm3
+ xorps %xmm5,%xmm4
+ movups %xmm2,(%edi)
+ movaps %xmm4,%xmm2
+ movups %xmm3,16(%edi)
leal 32(%edi),%edi
-.L030cbc_dec_tail_collected:
+ movups 32(%esi),%xmm7
+ subl $48,%eax
+ jmp .L079cbc_dec_tail_collected
+.align 16
+.L083cbc_dec_four:
+ call _aesni_decrypt4
+ movups 16(%esi),%xmm1
+ movups 32(%esi),%xmm0
+ xorps %xmm7,%xmm2
+ movups 48(%esi),%xmm7
+ xorps %xmm6,%xmm3
+ movups %xmm2,(%edi)
+ xorps %xmm1,%xmm4
+ movups %xmm3,16(%edi)
+ xorps %xmm0,%xmm5
+ movups %xmm4,32(%edi)
+ leal 48(%edi),%edi
+ movaps %xmm5,%xmm2
+ subl $64,%eax
+.L079cbc_dec_tail_collected:
andl $15,%eax
- jnz .L032cbc_dec_tail_partial
- movups %xmm0,(%edi)
- jmp .L020cbc_ret
-.L032cbc_dec_tail_partial:
- movl %esp,%ebp
- subl $16,%esp
- andl $-16,%esp
- movaps %xmm0,(%esp)
+ jnz .L085cbc_dec_tail_partial
+ movups %xmm2,(%edi)
+ jmp .L075cbc_ret
+.align 16
+.L085cbc_dec_tail_partial:
+ movaps %xmm2,(%esp)
+ movl $16,%ecx
movl %esp,%esi
- movl %eax,%ecx
+ subl %eax,%ecx
.long 2767451785
- movl %ebp,%esp
-.L020cbc_ret:
+.L075cbc_ret:
+ movl 16(%esp),%esp
movl 36(%esp),%ebp
- movups %xmm5,(%ebp)
+ movups %xmm7,(%ebp)
+.L070cbc_abort:
popl %edi
popl %esi
popl %ebx
@@ -568,97 +1947,97 @@ aesni_cbc_encrypt:
.align 16
_aesni_set_encrypt_key:
testl %eax,%eax
- jz .L033bad_pointer
+ jz .L086bad_pointer
testl %edx,%edx
- jz .L033bad_pointer
+ jz .L086bad_pointer
movups (%eax),%xmm0
- pxor %xmm4,%xmm4
+ xorps %xmm4,%xmm4
leal 16(%edx),%edx
cmpl $256,%ecx
- je .L03414rounds
+ je .L08714rounds
cmpl $192,%ecx
- je .L03512rounds
+ je .L08812rounds
cmpl $128,%ecx
- jne .L036bad_keybits
+ jne .L089bad_keybits
.align 16
-.L03710rounds:
+.L09010rounds:
movl $9,%ecx
- movups %xmm0,-16(%edx)
- aeskeygenassist $1,%xmm0,%xmm1
- call .L038key_128_cold
- aeskeygenassist $2,%xmm0,%xmm1
- call .L039key_128
- aeskeygenassist $4,%xmm0,%xmm1
- call .L039key_128
- aeskeygenassist $8,%xmm0,%xmm1
- call .L039key_128
- aeskeygenassist $16,%xmm0,%xmm1
- call .L039key_128
- aeskeygenassist $32,%xmm0,%xmm1
- call .L039key_128
- aeskeygenassist $64,%xmm0,%xmm1
- call .L039key_128
- aeskeygenassist $128,%xmm0,%xmm1
- call .L039key_128
- aeskeygenassist $27,%xmm0,%xmm1
- call .L039key_128
- aeskeygenassist $54,%xmm0,%xmm1
- call .L039key_128
- movups %xmm0,(%edx)
+ movaps %xmm0,-16(%edx)
+.byte 102,15,58,223,200,1
+ call .L091key_128_cold
+.byte 102,15,58,223,200,2
+ call .L092key_128
+.byte 102,15,58,223,200,4
+ call .L092key_128
+.byte 102,15,58,223,200,8
+ call .L092key_128
+.byte 102,15,58,223,200,16
+ call .L092key_128
+.byte 102,15,58,223,200,32
+ call .L092key_128
+.byte 102,15,58,223,200,64
+ call .L092key_128
+.byte 102,15,58,223,200,128
+ call .L092key_128
+.byte 102,15,58,223,200,27
+ call .L092key_128
+.byte 102,15,58,223,200,54
+ call .L092key_128
+ movaps %xmm0,(%edx)
movl %ecx,80(%edx)
xorl %eax,%eax
ret
.align 16
-.L039key_128:
- movups %xmm0,(%edx)
+.L092key_128:
+ movaps %xmm0,(%edx)
leal 16(%edx),%edx
-.L038key_128_cold:
+.L091key_128_cold:
shufps $16,%xmm0,%xmm4
- pxor %xmm4,%xmm0
+ xorps %xmm4,%xmm0
shufps $140,%xmm0,%xmm4
- pxor %xmm4,%xmm0
- pshufd $255,%xmm1,%xmm1
- pxor %xmm1,%xmm0
+ xorps %xmm4,%xmm0
+ shufps $255,%xmm1,%xmm1
+ xorps %xmm1,%xmm0
ret
.align 16
-.L03512rounds:
+.L08812rounds:
movq 16(%eax),%xmm2
movl $11,%ecx
- movups %xmm0,-16(%edx)
- aeskeygenassist $1,%xmm2,%xmm1
- call .L040key_192a_cold
- aeskeygenassist $2,%xmm2,%xmm1
- call .L041key_192b
- aeskeygenassist $4,%xmm2,%xmm1
- call .L042key_192a
- aeskeygenassist $8,%xmm2,%xmm1
- call .L041key_192b
- aeskeygenassist $16,%xmm2,%xmm1
- call .L042key_192a
- aeskeygenassist $32,%xmm2,%xmm1
- call .L041key_192b
- aeskeygenassist $64,%xmm2,%xmm1
- call .L042key_192a
- aeskeygenassist $128,%xmm2,%xmm1
- call .L041key_192b
- movups %xmm0,(%edx)
+ movaps %xmm0,-16(%edx)
+.byte 102,15,58,223,202,1
+ call .L093key_192a_cold
+.byte 102,15,58,223,202,2
+ call .L094key_192b
+.byte 102,15,58,223,202,4
+ call .L095key_192a
+.byte 102,15,58,223,202,8
+ call .L094key_192b
+.byte 102,15,58,223,202,16
+ call .L095key_192a
+.byte 102,15,58,223,202,32
+ call .L094key_192b
+.byte 102,15,58,223,202,64
+ call .L095key_192a
+.byte 102,15,58,223,202,128
+ call .L094key_192b
+ movaps %xmm0,(%edx)
movl %ecx,48(%edx)
xorl %eax,%eax
ret
.align 16
-.L042key_192a:
- movups %xmm0,(%edx)
+.L095key_192a:
+ movaps %xmm0,(%edx)
leal 16(%edx),%edx
.align 16
-.L040key_192a_cold:
+.L093key_192a_cold:
movaps %xmm2,%xmm5
-.L043key_192b_warm:
+.L096key_192b_warm:
shufps $16,%xmm0,%xmm4
- movaps %xmm2,%xmm3
- pxor %xmm4,%xmm0
+ movdqa %xmm2,%xmm3
+ xorps %xmm4,%xmm0
shufps $140,%xmm0,%xmm4
pslldq $4,%xmm3
- pxor %xmm4,%xmm0
+ xorps %xmm4,%xmm0
pshufd $85,%xmm1,%xmm1
pxor %xmm3,%xmm2
pxor %xmm1,%xmm0
@@ -666,80 +2045,80 @@ _aesni_set_encrypt_key:
pxor %xmm3,%xmm2
ret
.align 16
-.L041key_192b:
+.L094key_192b:
movaps %xmm0,%xmm3
shufps $68,%xmm0,%xmm5
- movups %xmm5,(%edx)
+ movaps %xmm5,(%edx)
shufps $78,%xmm2,%xmm3
- movups %xmm3,16(%edx)
+ movaps %xmm3,16(%edx)
leal 32(%edx),%edx
- jmp .L043key_192b_warm
+ jmp .L096key_192b_warm
.align 16
-.L03414rounds:
+.L08714rounds:
movups 16(%eax),%xmm2
movl $13,%ecx
leal 16(%edx),%edx
- movups %xmm0,-32(%edx)
- movups %xmm2,-16(%edx)
- aeskeygenassist $1,%xmm2,%xmm1
- call .L044key_256a_cold
- aeskeygenassist $1,%xmm0,%xmm1
- call .L045key_256b
- aeskeygenassist $2,%xmm2,%xmm1
- call .L046key_256a
- aeskeygenassist $2,%xmm0,%xmm1
- call .L045key_256b
- aeskeygenassist $4,%xmm2,%xmm1
- call .L046key_256a
- aeskeygenassist $4,%xmm0,%xmm1
- call .L045key_256b
- aeskeygenassist $8,%xmm2,%xmm1
- call .L046key_256a
- aeskeygenassist $8,%xmm0,%xmm1
- call .L045key_256b
- aeskeygenassist $16,%xmm2,%xmm1
- call .L046key_256a
- aeskeygenassist $16,%xmm0,%xmm1
- call .L045key_256b
- aeskeygenassist $32,%xmm2,%xmm1
- call .L046key_256a
- aeskeygenassist $32,%xmm0,%xmm1
- call .L045key_256b
- aeskeygenassist $64,%xmm2,%xmm1
- call .L046key_256a
- movups %xmm0,(%edx)
+ movaps %xmm0,-32(%edx)
+ movaps %xmm2,-16(%edx)
+.byte 102,15,58,223,202,1
+ call .L097key_256a_cold
+.byte 102,15,58,223,200,1
+ call .L098key_256b
+.byte 102,15,58,223,202,2
+ call .L099key_256a
+.byte 102,15,58,223,200,2
+ call .L098key_256b
+.byte 102,15,58,223,202,4
+ call .L099key_256a
+.byte 102,15,58,223,200,4
+ call .L098key_256b
+.byte 102,15,58,223,202,8
+ call .L099key_256a
+.byte 102,15,58,223,200,8
+ call .L098key_256b
+.byte 102,15,58,223,202,16
+ call .L099key_256a
+.byte 102,15,58,223,200,16
+ call .L098key_256b
+.byte 102,15,58,223,202,32
+ call .L099key_256a
+.byte 102,15,58,223,200,32
+ call .L098key_256b
+.byte 102,15,58,223,202,64
+ call .L099key_256a
+ movaps %xmm0,(%edx)
movl %ecx,16(%edx)
xorl %eax,%eax
ret
.align 16
-.L046key_256a:
- movups %xmm2,(%edx)
+.L099key_256a:
+ movaps %xmm2,(%edx)
leal 16(%edx),%edx
-.L044key_256a_cold:
+.L097key_256a_cold:
shufps $16,%xmm0,%xmm4
- pxor %xmm4,%xmm0
+ xorps %xmm4,%xmm0
shufps $140,%xmm0,%xmm4
- pxor %xmm4,%xmm0
- pshufd $255,%xmm1,%xmm1
- pxor %xmm1,%xmm0
+ xorps %xmm4,%xmm0
+ shufps $255,%xmm1,%xmm1
+ xorps %xmm1,%xmm0
ret
.align 16
-.L045key_256b:
- movups %xmm0,(%edx)
+.L098key_256b:
+ movaps %xmm0,(%edx)
leal 16(%edx),%edx
shufps $16,%xmm2,%xmm4
- pxor %xmm4,%xmm2
+ xorps %xmm4,%xmm2
shufps $140,%xmm2,%xmm4
- pxor %xmm4,%xmm2
- pshufd $170,%xmm1,%xmm1
- pxor %xmm1,%xmm2
+ xorps %xmm4,%xmm2
+ shufps $170,%xmm1,%xmm1
+ xorps %xmm1,%xmm2
ret
.align 4
-.L033bad_pointer:
+.L086bad_pointer:
movl $-1,%eax
ret
.align 4
-.L036bad_keybits:
+.L089bad_keybits:
movl $-2,%eax
ret
.size _aesni_set_encrypt_key,.-_aesni_set_encrypt_key
@@ -766,30 +2145,30 @@ aesni_set_decrypt_key:
movl 12(%esp),%edx
shll $4,%ecx
testl %eax,%eax
- jnz .L047dec_key_ret
+ jnz .L100dec_key_ret
leal 16(%edx,%ecx,1),%eax
- movups (%edx),%xmm0
- movups (%eax),%xmm1
- movups %xmm0,(%eax)
- movups %xmm1,(%edx)
+ movaps (%edx),%xmm0
+ movaps (%eax),%xmm1
+ movaps %xmm0,(%eax)
+ movaps %xmm1,(%edx)
leal 16(%edx),%edx
leal -16(%eax),%eax
-.L048dec_key_inverse:
- movups (%edx),%xmm0
- movups (%eax),%xmm1
- aesimc %xmm0,%xmm0
- aesimc %xmm1,%xmm1
+.L101dec_key_inverse:
+ movaps (%edx),%xmm0
+ movaps (%eax),%xmm1
+.byte 102,15,56,219,192
+.byte 102,15,56,219,201
leal 16(%edx),%edx
leal -16(%eax),%eax
+ movaps %xmm0,16(%eax)
+ movaps %xmm1,-16(%edx)
cmpl %edx,%eax
- movups %xmm0,16(%eax)
- movups %xmm1,-16(%edx)
- ja .L048dec_key_inverse
- movups (%edx),%xmm0
- aesimc %xmm0,%xmm0
- movups %xmm0,(%edx)
+ ja .L101dec_key_inverse
+ movaps (%edx),%xmm0
+.byte 102,15,56,219,192
+ movaps %xmm0,(%edx)
xorl %eax,%eax
-.L047dec_key_ret:
+.L100dec_key_ret:
ret
.size aesni_set_decrypt_key,.-.L_aesni_set_decrypt_key_begin
.byte 65,69,83,32,102,111,114,32,73,110,116,101,108,32,65,69
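The repeating pcmpgtd / pshufd $19 / pand / paddq / pxor sequence in the XTS loops above is a branch-free multiplication of the 128-bit tweak by x in GF(2^128), reduced by the polynomial whose low byte is the constant 135 (0x87) that the prologue stores at 96(%esp). A minimal sketch of the same update, treating the tweak as a 128-bit integer (function name is illustrative, not from the source):

```python
# Tweak update: shift left one bit; if bit 127 carried out, fold the
# carry back in by XORing 0x87 (x^128 = x^7 + x^2 + x + 1).  The SSE
# version above does this without a branch: pcmpgtd builds sign masks
# of the halves, pshufd $19 routes them, pand against the constant
# block {135,0,1,0} selects the corrections, paddq doubles each qword,
# and pxor applies them.
def xts_double(tweak: int) -> int:
    """Multiply a 128-bit XTS tweak (as an integer) by x in GF(2^128)."""
    tweak <<= 1
    if tweak >> 128:                      # carry out of bit 127
        tweak = (tweak & ((1 << 128) - 1)) ^ 0x87
    return tweak

t = 1 << 127
print(hex(xts_double(t)))  # -> 0x87
```

Each iteration of .L044xts_enc_loop6 performs six of these doublings, spilling the consecutive tweaks to the stack before the six blocks are encrypted in parallel.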
diff --git a/lib/accelerated/x86.h b/lib/accelerated/x86.h
index c344283..8886516 100644
--- a/lib/accelerated/x86.h
+++ b/lib/accelerated/x86.h
@@ -1,3 +1,12 @@
+#include <config.h>
+
+#ifdef HAVE_CPUID_H
+# include <cpuid.h>
+# define cpuid __cpuid
+
+#else
#define cpuid(func,ax,bx,cx,dx)\
__asm__ __volatile__ ("cpuid":\
"=a" (ax), "=b" (bx), "=c" (cx), "=d" (dx) : "a" (func));
+
+#endif
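In the key-setup code above, the ".byte 102,15,58,223,…,RC" sequences encode "aeskeygenassist $RC" with round constants 1, 2, 4, 8, 16, 32, 64, 128, 27, 54 — exactly the rcon schedule of AES-128 key expansion, where rcon doubles in GF(2^8) so 0x80 wraps to 0x1b and then 0x36. A plain-Python sketch of that key schedule for reference (helper names are illustrative):

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11b)."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        hi = a & 0x80
        a = (a << 1) & 0xFF
        if hi:
            a ^= 0x1B
        b >>= 1
    return r

def sub_byte(x: int) -> int:
    """AES S-box: GF(2^8) inverse followed by the affine transform."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):              # x^254 == x^-1 in GF(2^8)*
            inv = gf_mul(inv, x)
    s = y = inv
    for _ in range(4):                    # s = inv ^ rotl1..4(inv)
        y = ((y << 1) | (y >> 7)) & 0xFF
        s ^= y
    return s ^ 0x63

def expand_key_128(key: bytes) -> list:
    """Return the 44 32-bit words of the AES-128 round-key schedule."""
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        t = w[i - 1]
        if i % 4 == 0:
            t = ((t << 8) | (t >> 24)) & 0xFFFFFFFF        # RotWord
            t = sum(sub_byte((t >> s) & 0xFF) << s         # SubWord
                    for s in (24, 16, 8, 0))
            t ^= rcon << 24
            rcon = gf_mul(rcon, 2)        # 1,2,...,0x80,0x1b,0x36
        w.append(w[i - 4] ^ t)
    return w

# FIPS-197 Appendix A.1 test key
w = expand_key_128(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
print(hex(w[4]))  # -> 0xa0fafe17
```

The shufps/xorps dance in .L091key_128_cold folds the aeskeygenassist output into the previous round key, which corresponds to the `w[i - 4] ^ t` step here applied to four words at once.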
diff --git a/lib/nettle/Makefile.am b/lib/nettle/Makefile.am
index 500117b..89622c4 100644
--- a/lib/nettle/Makefile.am
+++ b/lib/nettle/Makefile.am
@@ -36,7 +36,6 @@ noinst_LTLIBRARIES = libcrypto.la
libcrypto_la_SOURCES = pk.c mpi.c mac.c cipher.c rnd.c init.c egd.c egd.h \
multi.c ecc_free.c ecc.h ecc_make_key.c ecc_shared_secret.c \
- ecc_test.c ecc_map.c \
- ecc_mulmod.c ecc_points.c ecc_projective_dbl_point_3.c \
+ ecc_map.c ecc_mulmod.c ecc_points.c ecc_projective_dbl_point_3.c \
ecc_projective_add_point.c ecc_projective_dbl_point.c \
ecc_sign_hash.c ecc_verify_hash.c gnettle.h
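Back in the assembly diff, the .L053xts_enc_steal and .L068xts_dec_steal byte loops implement XTS ciphertext stealing: when the input length is not a multiple of 16, the trailing partial block swaps bytes with the last full ciphertext block and that block is encrypted once more, so the output is exactly as long as the input. A sketch of the encrypt-side byte shuffle under those assumptions (helper name is hypothetical, not from the source):

```python
# XTS ciphertext stealing (IEEE P1619): the partial output position
# receives the head of the previous ciphertext block, and the block
# that gets re-encrypted is the plaintext tail padded with the stolen
# remainder of that ciphertext block -- the same byte swap the
# .L053xts_enc_steal loop performs in place in the output buffer.
def steal(last_full_ct: bytes, tail_pt: bytes):
    """Return (block_to_reencrypt, partial_ciphertext)."""
    n = len(tail_pt)                      # 1..15 leftover bytes
    partial_ct = last_full_ct[:n]         # emitted as the short block
    to_encrypt = tail_pt + last_full_ct[n:]
    return to_encrypt, partial_ct

blk, part = steal(bytes(range(16)), b"\xaa\xbb\xcc")
print(len(blk), part.hex())  # -> 16 000102
```

After the loop, the assembly rewinds %edi by the stolen length (subl 112(%esp),%edi) and runs one more single-block encryption over the rebuilt block, matching `to_encrypt` here.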
diff --git a/lib/nettle/ecc_free.c b/lib/nettle/ecc_free.c
index bbf087d..b5e23f9 100644
--- a/lib/nettle/ecc_free.c
+++ b/lib/nettle/ecc_free.c
@@ -1,19 +1,29 @@
-/* LibTomCrypt, modular cryptographic library -- Tom St Denis
+/*
+ * Copyright (C) 2011 Free Software Foundation, Inc.
*
- * LibTomCrypt is a library that provides various cryptographic
- * algorithms in a highly modular and flexible manner.
+ * This file is part of GNUTLS.
*
- * The library is free for all purposes without any express
- * guarantee it works.
+ * The GNUTLS library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ * USA
*
- * Tom St Denis, address@hidden, http://libtom.org
*/
-/* Implements ECC over Z/pZ for curve y^2 = x^3 + ax + b
- *
- * All curves taken from NIST recommendation paper of July 1999
- * Available at http://csrc.nist.gov/cryptval/dss.htm
+/* Based on public domain code of LibTomCrypt by Tom St Denis.
+ * Adapted to gmp and nettle by Nikos Mavrogiannopoulos.
*/
+
#include "ecc.h"
/**
diff --git a/lib/nettle/ecc_make_key.c b/lib/nettle/ecc_make_key.c
index 3667a5b..ade9e5f 100644
--- a/lib/nettle/ecc_make_key.c
+++ b/lib/nettle/ecc_make_key.c
@@ -1,19 +1,29 @@
-/* LibTomCrypt, modular cryptographic library -- Tom St Denis
+/*
+ * Copyright (C) 2011 Free Software Foundation, Inc.
*
- * LibTomCrypt is a library that provides various cryptographic
- * algorithms in a highly modular and flexible manner.
+ * This file is part of GNUTLS.
*
- * The library is free for all purposes without any express
- * guarantee it works.
+ * The GNUTLS library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ * USA
*
- * Tom St Denis, address@hidden, http://libtom.org
*/
-/* Implements ECC over Z/pZ for curve y^2 = x^3 + ax + b
- *
- * All curves taken from NIST recommendation paper of July 1999
- * Available at http://csrc.nist.gov/cryptval/dss.htm
+/* Based on public domain code of LibTomCrypt by Tom St Denis.
+ * Adapted to gmp and nettle by Nikos Mavrogiannopoulos.
*/
+
#include "ecc.h"
/**
diff --git a/lib/nettle/ecc_map.c b/lib/nettle/ecc_map.c
index 2ad60bb..a68feb0 100644
--- a/lib/nettle/ecc_map.c
+++ b/lib/nettle/ecc_map.c
@@ -1,19 +1,29 @@
-/* LibTomCrypt, modular cryptographic library -- Tom St Denis
+/*
+ * Copyright (C) 2011 Free Software Foundation, Inc.
*
- * LibTomCrypt is a library that provides various cryptographic
- * algorithms in a highly modular and flexible manner.
+ * This file is part of GNUTLS.
*
- * The library is free for all purposes without any express
- * guarantee it works.
+ * The GNUTLS library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ * USA
*
- * Tom St Denis, address@hidden, http://libtom.org
*/
-/* Implements ECC over Z/pZ for curve y^2 = x^3 + ax + b
- *
- * All curves taken from NIST recommendation paper of July 1999
- * Available at http://csrc.nist.gov/cryptval/dss.htm
+/* Based on public domain code of LibTomCrypt by Tom St Denis.
+ * Adapted to gmp and nettle by Nikos Mavrogiannopoulos.
*/
+
#include "ecc.h"
/**
diff --git a/lib/nettle/ecc_mulmod.c b/lib/nettle/ecc_mulmod.c
index c8e91a4..6781b03 100644
--- a/lib/nettle/ecc_mulmod.c
+++ b/lib/nettle/ecc_mulmod.c
@@ -1,19 +1,29 @@
-/* LibTomCrypt, modular cryptographic library -- Tom St Denis
+/*
+ * Copyright (C) 2011 Free Software Foundation, Inc.
*
- * LibTomCrypt is a library that provides various cryptographic
- * algorithms in a highly modular and flexible manner.
+ * This file is part of GNUTLS.
*
- * The library is free for all purposes without any express
- * guarantee it works.
+ * The GNUTLS library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ * USA
*
- * Tom St Denis, address@hidden, http://libtom.org
*/
-/* Implements ECC over Z/pZ for curve y^2 = x^3 + ax + b
- *
- * All curves taken from NIST recommendation paper of July 1999
- * Available at http://csrc.nist.gov/cryptval/dss.htm
+/* Based on public domain code of LibTomCrypt by Tom St Denis.
+ * Adapted to gmp and nettle by Nikos Mavrogiannopoulos.
*/
+
#include "ecc.h"
/**
diff --git a/lib/nettle/ecc_points.c b/lib/nettle/ecc_points.c
index 7a29cb1..ff13755 100644
--- a/lib/nettle/ecc_points.c
+++ b/lib/nettle/ecc_points.c
@@ -1,19 +1,29 @@
-/* LibTomCrypt, modular cryptographic library -- Tom St Denis
+/*
+ * Copyright (C) 2011 Free Software Foundation, Inc.
*
- * LibTomCrypt is a library that provides various cryptographic
- * algorithms in a highly modular and flexible manner.
+ * This file is part of GNUTLS.
*
- * The library is free for all purposes without any express
- * guarantee it works.
+ * The GNUTLS library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ * USA
*
- * Tom St Denis, address@hidden, http://libtom.org
*/
-/* Implements ECC over Z/pZ for curve y^2 = x^3 + ax + b
- *
- * All curves taken from NIST recommendation paper of July 1999
- * Available at http://csrc.nist.gov/cryptval/dss.htm
+/* Based on public domain code of LibTomCrypt by Tom St Denis.
+ * Adapted to gmp and nettle by Nikos Mavrogiannopoulos.
*/
+
#include "ecc.h"
/**
diff --git a/lib/nettle/ecc_projective_add_point.c b/lib/nettle/ecc_projective_add_point.c
index 35d12bc..b692289 100644
--- a/lib/nettle/ecc_projective_add_point.c
+++ b/lib/nettle/ecc_projective_add_point.c
@@ -1,19 +1,29 @@
-/* LibTomCrypt, modular cryptographic library -- Tom St Denis
+/*
+ * Copyright (C) 2011 Free Software Foundation, Inc.
*
- * LibTomCrypt is a library that provides various cryptographic
- * algorithms in a highly modular and flexible manner.
+ * This file is part of GNUTLS.
*
- * The library is free for all purposes without any express
- * guarantee it works.
+ * The GNUTLS library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ * USA
*
- * Tom St Denis, address@hidden, http://libtom.org
*/
-/* Implements ECC over Z/pZ for curve y^2 = x^3 + ax + b
- *
- * All curves taken from NIST recommendation paper of July 1999
- * Available at http://csrc.nist.gov/cryptval/dss.htm
+/* Based on public domain code of LibTomCrypt by Tom St Denis.
+ * Adapted to gmp and nettle by Nikos Mavrogiannopoulos.
*/
+
#include "ecc.h"
/**
diff --git a/lib/nettle/ecc_projective_dbl_point_3.c b/lib/nettle/ecc_projective_dbl_point_3.c
index 1b85f68..28f08b3 100644
--- a/lib/nettle/ecc_projective_dbl_point_3.c
+++ b/lib/nettle/ecc_projective_dbl_point_3.c
@@ -1,19 +1,29 @@
-/* LibTomCrypt, modular cryptographic library -- Tom St Denis
+/*
+ * Copyright (C) 2011 Free Software Foundation, Inc.
*
- * LibTomCrypt is a library that provides various cryptographic
- * algorithms in a highly modular and flexible manner.
+ * This file is part of GNUTLS.
*
- * The library is free for all purposes without any express
- * guarantee it works.
+ * The GNUTLS library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ * USA
*
- * Tom St Denis, address@hidden, http://libtom.org
*/
-/* Implements ECC over Z/pZ for curve y^2 = x^3 - 3x + b
- *
- * All curves taken from NIST recommendation paper of July 1999
- * Available at http://csrc.nist.gov/cryptval/dss.htm
+/* Based on public domain code of LibTomCrypt by Tom St Denis.
+ * Adapted to gmp and nettle by Nikos Mavrogiannopoulos.
*/
+
#include "ecc.h"
/**
diff --git a/lib/nettle/ecc_shared_secret.c b/lib/nettle/ecc_shared_secret.c
index c229870..8e41a60 100644
--- a/lib/nettle/ecc_shared_secret.c
+++ b/lib/nettle/ecc_shared_secret.c
@@ -1,19 +1,29 @@
-/* LibTomCrypt, modular cryptographic library -- Tom St Denis
+/*
+ * Copyright (C) 2011 Free Software Foundation, Inc.
*
- * LibTomCrypt is a library that provides various cryptographic
- * algorithms in a highly modular and flexible manner.
+ * This file is part of GNUTLS.
*
- * The library is free for all purposes without any express
- * guarantee it works.
+ * The GNUTLS library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ * USA
*
- * Tom St Denis, address@hidden, http://libtom.org
*/
-/* Implements ECC over Z/pZ for curve y^2 = x^3 + ax + b
- *
- * All curves taken from NIST recommendation paper of July 1999
- * Available at http://csrc.nist.gov/cryptval/dss.htm
+/* Based on public domain code of LibTomCrypt by Tom St Denis.
+ * Adapted to gmp and nettle by Nikos Mavrogiannopoulos.
*/
+
#include "ecc.h"
#include <string.h>
diff --git a/lib/nettle/ecc_sign_hash.c b/lib/nettle/ecc_sign_hash.c
index 12be36d..be0d8d7 100644
--- a/lib/nettle/ecc_sign_hash.c
+++ b/lib/nettle/ecc_sign_hash.c
@@ -1,19 +1,29 @@
-/* LibTomCrypt, modular cryptographic library -- Tom St Denis
+/*
+ * Copyright (C) 2011 Free Software Foundation, Inc.
*
- * LibTomCrypt is a library that provides various cryptographic
- * algorithms in a highly modular and flexible manner.
+ * This file is part of GNUTLS.
*
- * The library is free for all purposes without any express
- * guarantee it works.
+ * The GNUTLS library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ * USA
*
- * Tom St Denis, address@hidden, http://libtom.org
*/
-/* Implements ECC over Z/pZ for curve y^2 = x^3 + ax + b
- *
- * All curves taken from NIST recommendation paper of July 1999
- * Available at http://csrc.nist.gov/cryptval/dss.htm
+/* Based on public domain code of LibTomCrypt by Tom St Denis.
+ * Adapted to gmp and nettle by Nikos Mavrogiannopoulos.
*/
+
#include "ecc.h"
#include <nettle/dsa.h>
diff --git a/lib/nettle/ecc_test.c b/lib/nettle/ecc_test.c
deleted file mode 100644
index 30250fa..0000000
--- a/lib/nettle/ecc_test.c
+++ /dev/null
@@ -1,142 +0,0 @@
-/* LibTomCrypt, modular cryptographic library -- Tom St Denis
- *
- * LibTomCrypt is a library that provides various cryptographic
- * algorithms in a highly modular and flexible manner.
- *
- * The library is free for all purposes without any express
- * guarantee it works.
- *
- * Tom St Denis, address@hidden, http://libtom.org
- */
-
-/* Implements ECC over Z/pZ for curve y^2 = x^3 + ax + b
- *
- * All curves taken from NIST recommendation paper of July 1999
- * Available at http://csrc.nist.gov/cryptval/dss.htm
- */
-#include "ecc.h"
-#include "gnettle.h"
-#include <gnutls_int.h>
-#include <algorithms.h>
-
-/**
- @file ecc_test.c
- ECC Crypto, Tom St Denis
-*/
-
-/**
- Perform on the ECC system
- @return 0 if successful
-*/
-int
-ecc_test (void)
-{
- mpz_t modulus, order, A;
- ecc_point *G, *GG;
- int i, err;
-
- if ((err = mp_init_multi (&modulus, &A, &order, NULL)) != 0)
- {
- return err;
- }
-
- G = ecc_new_point ();
- GG = ecc_new_point ();
- if (G == NULL || GG == NULL)
- {
- mp_clear_multi (&modulus, &order, NULL);
- ecc_del_point (G);
- ecc_del_point (GG);
- return -1;
- }
-
- for (i = 1; i <= 3; i++)
- {
- const gnutls_ecc_curve_entry_st *st = _gnutls_ecc_curve_get_params (i);
-
- printf ("Testing %s (%d)\n", gnutls_ecc_curve_get_name (i), i);
-
- if (mpz_set_str (A, (char *) st->A, 16) != 0)
- {
- fprintf (stderr, "XXX %d\n", __LINE__);
- err = -1;
- goto done;
- }
-
- if (mpz_set_str (modulus, (char *) st->prime, 16) != 0)
- {
- fprintf (stderr, "XXX %d\n", __LINE__);
- err = -1;
- goto done;
- }
-
- if (mpz_set_str (order, (char *) st->order, 16) != 0)
- {
- fprintf (stderr, "XXX %d\n", __LINE__);
- err = -1;
- goto done;
- }
-
- /* is prime actually prime? */
- if ((err = mpz_probab_prime_p (modulus, PRIME_CHECK_PARAM)) <= 0)
- {
- fprintf (stderr, "XXX %d\n", __LINE__);
- err = -1;
- goto done;
- }
-
- if ((err = mpz_probab_prime_p (order, PRIME_CHECK_PARAM)) <= 0)
- {
- fprintf (stderr, "XXX %d\n", __LINE__);
- err = -1;
- goto done;
- }
-
- if (mpz_set_str (G->x, (char *) st->Gx, 16) != 0)
- {
- fprintf (stderr, "XXX %d\n", __LINE__);
- err = -1;
- goto done;
- }
-
- if (mpz_set_str (G->y, (char *) st->Gy, 16) != 0)
- {
- fprintf (stderr, "XXX %d\n", __LINE__);
- err = -1;
- goto done;
- }
- mpz_set_ui (G->z, 1);
-
- /* then we should have G == (order + 1)G */
- mpz_add_ui (order, order, 1);
- if ((err = ecc_mulmod (order, G, GG, A, modulus, 1)) != 0)
- {
- goto done;
- }
-
- if (mpz_cmp (G->y, GG->y) != 0)
- {
- fprintf (stderr, "XXX %d\n", __LINE__);
- err = -1;
- goto done;
- }
-
- if (mpz_cmp (G->x, GG->x) != 0)
- {
- fprintf (stderr, "XXX %d\n", __LINE__);
- err = -1;
- goto done;
- }
-
- }
- err = 0;
-done:
- ecc_del_point (GG);
- ecc_del_point (G);
- mp_clear_multi (&order, &modulus, &A, NULL);
- return err;
-}
-
-/* $Source: /cvs/libtom/libtomcrypt/src/pk/ecc/ecc_test.c,v $ */
-/* $Revision: 1.12 $ */
-/* $Date: 2007/05/12 14:32:35 $ */
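The removed ecc_test.c verified, for each supported NIST curve, that the prime and order pass a probable-prime test and that the base point G satisfies (n + 1)·G = G, where n is the group order. The same core check can be sketched on a toy curve; the parameters below are a small textbook example chosen for illustration, not one of the gnutls curves:

```python
# Sanity check mirroring the logic of the removed ecc_test.c: for a base
# point G of order n, scalar multiplication must give (n + 1) * G == G.
# Toy curve (an illustrative assumption, not a NIST/gnutls curve):
#   y^2 = x^3 + 2x + 2 over GF(17), generator G = (5, 1) of order 19.

P, A = 17, 2                # small prime field and curve coefficient a
G = (5, 1)                  # generator point
N = 19                      # order of G

def ec_add(p1, p2):
    """Affine point addition; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None         # p2 is the inverse of p1
    if p1 == p2:            # tangent slope for doubling
        s = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P
    else:                   # chord slope for addition
        s = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (s * s - x1 - x2) % P
    return (x3, (s * (x1 - x3) - y1) % P)

def ec_mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = ec_add(acc, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return acc

assert ec_mul(N, G) is None     # n * G is the point at infinity
assert ec_mul(N + 1, G) == G    # the per-curve check ecc_test.c performed
print("self-test passed")
```

The real test operated on the actual curve parameters via gmp (`mpz_set_str`, `ecc_mulmod`) and additionally confirmed primality of the modulus and order with `mpz_probab_prime_p`; this sketch keeps only the group-order identity.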
diff --git a/lib/nettle/ecc_verify_hash.c b/lib/nettle/ecc_verify_hash.c
index 62efae0..3c5a1e5 100644
--- a/lib/nettle/ecc_verify_hash.c
+++ b/lib/nettle/ecc_verify_hash.c
@@ -1,19 +1,29 @@
-/* LibTomCrypt, modular cryptographic library -- Tom St Denis
+/*
+ * Copyright (C) 2011 Free Software Foundation, Inc.
*
- * LibTomCrypt is a library that provides various cryptographic
- * algorithms in a highly modular and flexible manner.
+ * This file is part of GNUTLS.
*
- * The library is free for all purposes without any express
- * guarantee it works.
+ * The GNUTLS library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ * USA
*
- * Tom St Denis, address@hidden, http://libtom.org
*/
-/* Implements ECC over Z/pZ for curve y^2 = x^3 + ax + b
- *
- * All curves taken from NIST recommendation paper of July 1999
- * Available at http://csrc.nist.gov/cryptval/dss.htm
+/* Based on public domain code of LibTomCrypt by Tom St Denis.
+ * Adapted to gmp and nettle by Nikos Mavrogiannopoulos.
*/
+
#include "ecc.h"
/**
hooks/post-receive
--
GNU gnutls