The standard pure Java version (master
branch) is highly optimizied but the conversion of user data block with little-endian
semantics to the long[]
m
registers of the algorithm in compress()
is a performance bottleneck. That said the master
branch version performs
substantially better than other pure Java libraries.
The unsafe
version (unsafe
branch) addresses this bottleneck and achieves ~ 4.1 byte/cycle
on a fairly dated i5 1.3 MHz
MacBook Air. This compares rather favorably with the 3.08 bytes/cycle
noted on the official Blake2b
site. To close that gap would likely require use of SIMD
operations which can not (afaik) be done in pure Java. (unsafe
is
only used to efficiently convert the compress'd block byte[]
to long[]
. No non-JVM-managed memory operations are performed, so
this should be your choice if you are OK with use of the use unsafe
package.)
This library includes a benchmark utility which you can run using the provided jars
in lib/
(or directly in the master
or
unsafe
branch).
java -cp <your choice of the 2 jars> ove.crypto.digest.Bench -d <digest-name>
The comparative results (per machine spec above) are Bench
runs for both variants of this library's Blake2b
, MD5
, sha1
,
sha-256
, sha-512
, and Bounch Castle
version of the Blake2b algorithm.