A Cryptographic Hardware Accelerator can be
The purpose is to load off the very computing intensive tasks of encryption/decryption and compression/decompression.
As can be seen in this AES instruction set article, the acceleration is usually achieved by doing certain arithmetic calculation in hardware.
When the acceleration is not in the instruction set of the CPU, it is supported via a kernel driver (/dev/crypto or AF_ALG socket). There are two drivers offering /dev/crypto in OpenWRT:
Both ways result to a /dev/crypto device which can be used by userspace crypto applications (e.g., the ones that utilize openssl or gnutls).
Depending on which arithmetic calculations exactly are being done in the specific hardware, the results differ widely. You should not concern yourself with theoretical bla,bla but find out how a certain implementation performs in the task you want to do with it! You could want to
sshfs. You would set up a sshfs.server on your device and a sshfs.client on the other end. Now how fast can you read/write to this with and without Cryptographic Hardware Accelerators. If the other end, the client, is a “fully grown PC” with a 2GHz CPU, it will probably perform fast enough to use the entire bandwidth of your Internet connection. If the server side is some embedded device, with let's say some 400MHz MIPS CPU, it could benefit highly from some integrated (and supported!) acceleration. You probably want enough performance, that you can use your entire bandwidth. Well, now go and find some benchmark showing you precisely the difference with enabled/disabled acceleration. Because you will not be able to extrapolate this information from specifications you find on this page or on the web.
make menuconfig and select
This must not be combined with cryptodev-linux.
Kernel modules → Cryptographic API modules
Libraries → SSL
Note that there are some known issues with openssl's /dev/crypto support.
Both AF_ALG and /dev/crypto interfaces allow userspace access to any crypto driver offering symmetric-key ciphers, and digest algorithms. That means hardware acceleration, but also software-only drivers. The use of software drivers is almost always slower than implementing it in userspace, as the context switches slow things down considerably. To see all of the available crypto drivers, take a look at /proc/crypto.
$ cat /proc/crypto name : cbc(aes) driver : cbc(aes-aesni) module : kernel priority : 300 refcnt : 1 selftest : passed internal : no type : skcipher async : no blocksize : 16 min keysize : 16 max keysize : 32 ivsize : 16 chunksize : 16 walksize : 16 name : cbc(aes) driver : cbc-aes-aesni module : kernel priority : 400 refcnt : 1 selftest : passed internal : no type : skcipher async : yes blocksize : 16 min keysize : 16 max keysize : 32 ivsize : 16 chunksize : 16 walksize : 16 name : sha256 driver : sha256-generic module : kernel priority : 0 refcnt : 1 selftest : passed internal : no type : shash blocksize : 64 digestsize : 32 name : sha256 driver : sha256-avx module : kernel priority : 160 refcnt : 7 selftest : passed internal : no type : shash blocksize : 64 digestsize : 32 name : sha256 driver : sha256-ssse3 module : kernel priority : 150 refcnt : 1 selftest : passed internal : no type : shash blocksize : 64 digestsize : 32
Notice in this case, that are two drivers offering
cbc-aes-aesni, and three offering
sha256-ssse3. The kernel will export the one with the highest priority for each algorithm. In this case, it would be: cbc-aes-aesni, and sha256-avx. The first step in finding out if you're using hardware acceleration is ensuring that the driver is listed there, and that it's the highest priority among them. For openssl support, the
types supported are
For IPsec, which is done by the Kernel, this will tell you if you are able to use the crypto accelerator. Just make sure you're using the same algorithm made available by your crypto driver. For other uses, openssl needs to be checked.
Openssl supports hardware crypto acceleration through an engine. You may find out what engines are available, along with the enabled algorithms, and configuration commands by running
openssl engine -t -c:
(devcrypto) /dev/crypto engine [DES-CBC, DES-EDE3-CBC, BF-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CTR, AES-192-CTR, AES-256-CTR, AES-128-ECB, AES-192-ECB, AES-256-ECB, CAMELLIA-128-CBC, CAMELLIA-192-CBC, CAMELLIA-256-CBC,MD5, SHA1, RIPEMD160, SHA224, SHA256, SHA384, SHA512] [ available ] (rdrand) Intel RDRAND engine [RAND] [ available ] (dynamic) Dynamic engine loading support [ unavailable ]
For openssl-1.0.2 and earlier, the engine was called
cryptodev. It was renamed to
devcrypto in openssl 1.1.0. In this example, engine 'devcrypto' is available, showing the list of algorithms available.
make menuconfig and select
Kernel modules → Cryptographic API modules
Cryptographic Engine and Security Acceleration
Note: If you want to learn about the current situation, you should search the Internet or maybe ask in the forum. This is outdated. Especially if you want to know, how fast a copy from a mounted filesystem (say ext3 over USB) over the scp is, you should specifically search for such benchmarks. Some models of the BCM47xx/53xx family support hardware accelerated encryption for IPSec (AES, DES, 3DES), simple hash calculations (MD5, SHA1) and TLS/SSL+HMAC processing. Not all devices have a hw crypto supporting chip. At least Asus WL500GD/X, Netgear WGT634U and Asus WL700gE do have hw crypto. However, testing of a WGT634U indicates that a pin under the BCM5365 was not pulled low to enable strong bulk cryptography, limiting the functionality to single DES.
The specification states the hardware is able to support 75Mbps (9,4MB/s) of encrypted throughput. Without hardware acceleration using the blowfish encryption throughput is only ~0,4MB/s. Benchmark results that show the difference between software and hardware accelerated encryption/decryption can be found here. Due to the overhead of hardware/DMA transfers and buffer copies between kernel/user space it gives only a good return for packet sizes greater than 256 bytes. This size can be reduced for IPSec, because network hardware uses DMA and there is no need to copy the (encrypted) data between kernel and user space. The hardware specification needed for programming the crypto API of the bcm5365P (Broadcom 5365P) can be found here.
| SoC / CPU | Accelerated Methods | Datasheet |
| BCM94704AGR | WEP 128, AES OCB AES CCM | 94704AGR-PB00-R.pdf |
| BCM?4704P | WEP 128, AES OCB, AES CCM, VPN | 94704AGR-PB00-R.pdf |
| BCM5365 | AES (up to 256-bit CTR and CBC modes), DES, 3DES (CBC), HMAC-SHA1, HMAC-MD5, SHA1 and MD5. IPSec encryption and single pass authentication. | 5365_5365P-PB01-R.pdf |
| BCM5365P | AES (up to 256-bit CTR and CBC modes), DES, 3DES (CBC), HMAC-SHA1, HMAC-MD5, SHA1 and MD5. IPSec encryption and single pass authentication. | 5365_5365P-PB01-R.pdf |