In the following speed table, smaller numbers are better.
The numbers are median single-core cycle counts on various microarchitectures.
Overclocking is disabled.
OpenSSL 3.1.1 cycle counts are reported as a baseline for comparison.
For comparability to OpenSSL's speed-testing utility,
the OpenSSL cycle counts omit various OpenSSL overheads; see below for details.
The lib25519 cycle counts include all overheads.
Each library is assigned one color in the table.

| uarch | software | X key | X dh | X batch | Ed key | Ed sign | Ed verif | Ed MSM |
| :---- | :------- | ----: | ---: | ------: | -----: | ------: | -------: | -----: |
| Zen 3 | <span class=openssl>OpenSSL</span> | <span class=openssl>119875</span> | <span class=openssl>114972</span> | | <span class=openssl>124406</span> | <span class=openssl>110711</span> | <span class=openssl>370539</span> | |
| | <span class=lib25519>lib25519</span> | <span class=lib25519>26519</span> | <span class=lib25519>73086</span> | <span class=lib25519>47795</span> | <span class=lib25519>27287</span> | <span class=lib25519>30659</span> | <span class=lib25519>112326</span> | <span class=lib25519>41542</span> |
| Tiger Lake | <span class=openssl>OpenSSL</span> | <span class=openssl>115612</span> | <span class=openssl>118735</span> | | <span class=openssl>118894</span> | <span class=openssl>110714</span> | <span class=openssl>370523</span> | |
| | <span class=lib25519>lib25519</span> | <span class=lib25519>26494</span> | <span class=lib25519>64627</span> | <span class=lib25519>21658</span> | <span class=lib25519>27278</span> | <span class=lib25519>31373</span> | <span class=lib25519>116180</span> | <span class=lib25519>39693</span> |
| Goldmont | <span class=openssl>OpenSSL</span> | <span class=openssl>248978</span> | <span class=openssl>273332</span> | | <span class=openssl>263920</span> | <span class=openssl>226717</span> | <span class=openssl>740570</span> | |
| | <span class=lib25519>lib25519</span> | <span class=lib25519>88613</span> | <span class=lib25519>286276</span> | <span class=lib25519>280821</span> | <span class=lib25519>91012</span> | <span class=lib25519>100814</span> | <span class=lib25519>346731</span> | <span class=lib25519>95274</span> |
| Skylake | <span class=openssl>OpenSSL</span> | <span class=openssl>134236</span> | <span class=openssl>118455</span> | | <span class=openssl>139969</span> | <span class=openssl>125875</span> | <span class=openssl>410016</span> | |
| | <span class=lib25519>lib25519</span> | <span class=lib25519>28293</span> | <span class=lib25519>88082</span> | <span class=lib25519>62417</span> | <span class=lib25519>28928</span> | <span class=lib25519>32588</span> | <span class=lib25519>113410</span> | <span class=lib25519>41775</span> |
| Airmont | <span class=openssl>OpenSSL</span> | <span class=openssl>310990</span> | <span class=openssl>618831</span> | | <span class=openssl>329070</span> | <span class=openssl>276825</span> | <span class=openssl>853552</span> | |
| | <span class=lib25519>lib25519</span> | <span class=lib25519>143599</span> | <span class=lib25519>449168</span> | <span class=lib25519>449232</span> | <span class=lib25519>147183</span> | <span class=lib25519>162634</span> | <span class=lib25519>543339</span> | <span class=lib25519>155019</span> |
| Broadwell | <span class=openssl>OpenSSL</span> | <span class=openssl>128083</span> | <span class=openssl>121267</span> | | <span class=openssl>133816</span> | <span class=openssl>120153</span> | <span class=openssl>392282</span> | |
| | <span class=lib25519>lib25519</span> | <span class=lib25519>29669</span> | <span class=lib25519>117858</span> | <span class=lib25519>72444</span> | <span class=lib25519>30654</span> | <span class=lib25519>34379</span> | <span class=lib25519>122012</span> | <span class=lib25519>41516</span> |
| Haswell | <span class=openssl>OpenSSL</span> | <span class=openssl>174947</span> | <span class=openssl>165650</span> | | <span class=openssl>180558</span> | <span class=openssl>165981</span> | <span class=openssl>428332</span> | |
| | <span class=lib25519>lib25519</span> | <span class=lib25519>41449</span> | <span class=lib25519>115378</span> | <span class=lib25519>76555</span> | <span class=lib25519>42339</span> | <span class=lib25519>46362</span> | <span class=lib25519>160084</span> | <span class=lib25519>57325</span> |
| Core 2 | <span class=openssl>OpenSSL</span> | <span class=openssl>302407</span> | <span class=openssl>341190</span> | | <span class=openssl>311866</span> | <span class=openssl>267854</span> | <span class=openssl>755291</span> | |
| | <span class=lib25519>lib25519</span> | <span class=lib25519>95106</span> | <span class=lib25519>306362</span> | <span class=lib25519>306431</span> | <span class=lib25519>98459</span> | <span class=lib25519>107278</span> | <span class=lib25519>363231</span> | <span class=lib25519>105382</span> |

In the lib25519 distribution,
`command/lib25519-speed.c` measures lib25519;
`speedcomparison/openssl/openssl25519speed.c` measures OpenSSL;
`benchmarks/*-*` is the output of `lib25519-speed` on various machines;
`speedcomparison/openssl/*-*` is the output of `openssl25519speed` on various machines;
and `autogen/md-speed` extracts the table from those outputs.
Microarchitectures in the table are listed in reverse chronological order of their introduction.
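
To read the table as ratios rather than raw counts, one can divide each OpenSSL median by the corresponding lib25519 median. The following Python sketch does this for the "X key" and "X dh" columns. The cycle counts are copied from the table; the names `MEDIANS` and `speedup` are illustrative and are not part of the lib25519 distribution.

```python
# Median cycle counts copied from the speed table: (OpenSSL, lib25519).
# The dictionary layout and the function name are illustrative only.
MEDIANS = {
    "X key": {
        "Zen 3": (119875, 26519),
        "Tiger Lake": (115612, 26494),
        "Goldmont": (248978, 88613),
        "Skylake": (134236, 28293),
        "Airmont": (310990, 143599),
        "Broadwell": (128083, 29669),
        "Haswell": (174947, 41449),
        "Core 2": (302407, 95106),
    },
    "X dh": {
        "Zen 3": (114972, 73086),
        "Tiger Lake": (118735, 64627),
        "Goldmont": (273332, 286276),
        "Skylake": (118455, 88082),
        "Airmont": (618831, 449168),
        "Broadwell": (121267, 117858),
        "Haswell": (165650, 115378),
        "Core 2": (341190, 306362),
    },
}


def speedup(openssl_cycles, lib25519_cycles):
    """Return OpenSSL cycles divided by lib25519 cycles.

    A value above 1 means lib25519 needed fewer cycles; below 1, more.
    """
    return openssl_cycles / lib25519_cycles


if __name__ == "__main__":
    for column, rows in MEDIANS.items():
        for uarch, (openssl_cycles, lib25519_cycles) in rows.items():
            ratio = speedup(openssl_cycles, lib25519_cycles)
            print(f"{column:6s} {uarch:11s} {ratio:5.2f}")
```

A ratio below 1 marks a case where the lib25519 median is the higher one, as for "X dh" on Goldmont in the table. Note that the OpenSSL counts omit some overheads while the lib25519 counts do not, so these ratios are a lower bound on the gap only under that caveat.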
The table reports only median cycle counts;
see the full output files
for differences between multiple measurements and the median.
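
The relationship between a reported median and the individual measurements can be sketched as follows. The sample values are hypothetical placeholders, not taken from any output file, and the layout of the real `lib25519-speed` output is not reproduced here; `summarize` is an illustrative name.

```python
import statistics


def summarize(samples):
    """Return the median of the samples and each sample's offset from it."""
    median = statistics.median(samples)
    offsets = [sample - median for sample in samples]
    return median, offsets


if __name__ == "__main__":
    # Hypothetical cycle counts for repeated runs of one operation.
    samples = [26519, 26480, 26601, 26519, 27950]
    median, offsets = summarize(samples)
    print("median:", median)
    print("offsets:", offsets)
```

Unlike a mean, the median is not pulled upward by a single slow run such as the last hypothetical sample here.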
The table reports the following major operations:
* "X key": Generating an X25519 public key and secret key.
This is `dh_x25519_keypair selected 32` in the `lib25519-speed` output
(`lib25519_dh_keypair` in the stable API).
For OpenSSL,
this is `x25519-keygen-main` in the `openssl25519speed` output,
measuring the cost of `EVP_PKEY_Q_keygen(0,0,"X25519")`.
This does not include small OpenSSL overheads for converting the public key and secret key to storage format.
* "X dh":
Generating an X25519 shared secret.
This is `dh_x25519 selected 32` in the `lib25519-speed` output
(`lib25519_dh` in the stable API).
For OpenSSL,
this is `x25519-dh-main` in the `openssl25519speed` output,
measuring the cost of `EVP_PKEY_derive`
(as in OpenSSL's speed-testing utility).
This does not include the cost of `EVP_PKEY_new_raw_public_key`
to decode the public key (8376 cycles on Tiger Lake),
`EVP_PKEY_CTX_new` and `EVP_PKEY_derive_init` and `EVP_PKEY_derive_set_peer` for initialization
(together 7660 cycles on Tiger Lake),
and
`EVP_PKEY_new_raw_private_key` to decode the secret key if it is not decoded already
(113498 cycles on Tiger Lake).
* "X batch":
Cost _per secret_ of generating 16 separate shared secrets.
This is `nPbatch_montgomery25519 selected 16` in the `lib25519-speed` output _divided by 16_.
* "Ed key": Generating an Ed25519 public key and secret key.
This is `sign_ed25519_keypair selected 32` in the `lib25519-speed` output
(`lib25519_sign_keypair` in the stable API).
For OpenSSL,
this is `ed25519-keygen-main` in the `openssl25519speed` output,
measuring the cost of `EVP_PKEY_Q_keygen(0,0,"ED25519")`.
This does not include small OpenSSL overheads for converting the public key and secret key to storage format.
* "Ed sign": Generating an Ed25519 signature of a 59-byte message.
This is `sign_ed25519 selected 59` in the `lib25519-speed` output
(`lib25519_sign` in the stable API).
For OpenSSL,
this is `ed25519-sign-main` in the `openssl25519speed` output,
measuring the cost of `EVP_DigestSign`
(as in OpenSSL's speed-testing utility).
This does not include the cost of
`EVP_MD_CTX_new` and
`EVP_DigestSignInit`
(together 6258 cycles on Tiger Lake),
and `EVP_PKEY_new_raw_private_key` to decode the secret key if it is not decoded already
(116808 cycles on Tiger Lake).
* "Ed verif": Verifying an Ed25519 signature and recovering a 59-byte message.
This is `sign_ed25519_open selected 59` in the `lib25519-speed` output
(`lib25519_sign_open` in the stable API).
For OpenSSL,
this is `ed25519-verify-main` in the `openssl25519speed` output,
measuring the cost of `EVP_DigestVerify`
(as in OpenSSL's speed-testing utility).
This does not include the cost of
`EVP_MD_CTX_new` and
`EVP_DigestVerifyInit`
(together 6054 cycles on Tiger Lake),
and `EVP_PKEY_new_raw_public_key`
to decode the public key being used for verification
(9560 cycles on Tiger Lake).
* "Ed MSM": Cost _per point_ of multi-scalar multiplication with 16 points and 16 full-size scalars.
This is `multiscalar_ed25519 selected 16` in the `lib25519-speed` output _divided by 16_.
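
The OpenSSL overheads quoted above for Tiger Lake can be added back to the table values to get a rough estimate of a full call sequence. The sketch below does this arithmetic in Python. It assumes the costs simply add and that each listed call runs once per operation, which is an approximation; the names `TIGER_LAKE_OPENSSL` and `estimated_total` are illustrative.

```python
# Tiger Lake cycle counts quoted in the list above. "main" is the value
# reported in the table; the other entries are costs the table omits.
TIGER_LAKE_OPENSSL = {
    "X dh": {
        "main": 118735,               # EVP_PKEY_derive
        "decode_public_key": 8376,    # EVP_PKEY_new_raw_public_key
        "init": 7660,                 # EVP_PKEY_CTX_new and derive setup calls
        "decode_secret_key": 113498,  # EVP_PKEY_new_raw_private_key
    },
    "Ed sign": {
        "main": 110714,               # EVP_DigestSign
        "init": 6258,                 # EVP_MD_CTX_new, EVP_DigestSignInit
        "decode_secret_key": 116808,  # EVP_PKEY_new_raw_private_key
    },
    "Ed verif": {
        "main": 370523,               # EVP_DigestVerify
        "init": 6054,                 # EVP_MD_CTX_new, EVP_DigestVerifyInit
        "decode_public_key": 9560,    # EVP_PKEY_new_raw_public_key
    },
}


def estimated_total(costs, secret_key_already_decoded=True):
    """Sum the main cost and the omitted overheads for one operation.

    Treats the costs as additive, which is only an approximation.
    The secret-key decoding cost is skipped when the key is already decoded.
    """
    total = 0
    for name, cycles in costs.items():
        if name == "decode_secret_key" and secret_key_already_decoded:
            continue
        total += cycles
    return total


if __name__ == "__main__":
    for operation, costs in TIGER_LAKE_OPENSSL.items():
        warm = estimated_total(costs, secret_key_already_decoded=True)
        cold = estimated_total(costs, secret_key_already_decoded=False)
        print(f"{operation:8s} decoded key: {warm:6d}  undecoded key: {cold:6d}")
```

These estimates can be set against the lib25519 Tiger Lake medians in the table, which include all overheads.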