Optimizing SHA-1 Performance on OS X

At work we need to do some SHA-1 verification on application startup, naturally we want it to run as fast as possible — no user likes their browser bounce too many times on the dock before showing up.

The initial implementation is extremely naïve one, yet quite portable. It’s base on Chromium’s (which our code base is built upon) portable SHA-1 implementation: we first read the entire file into a std::string, then pass this string to SHA1HashString(), job done. Simple, right?

Unfortunately, for the files we have it will take at least 280 milliseconds (61 MB/s) on a rather quick (2.6 GHz Core i7) desktop machine, among which about 30 ms was spent on file reading and the rest is on SHA-1. Needless to say, the memory footprint is quite big given the file sizes we have.

First step of using any other SHA-1 function would be decouple the ReadFileToString() function into normal stdio file reading routine:

FILE* file = fopen(path, "rb");
char buf[1 << 16];
size_t len;
// SHA-1 context initialization.
while ((len = fread(buf, 1, sizeof(buf), file)) > 0) {
// SHA-1 context update.
// SHA-1 context done and get the value.

By doing this we can already save quite a lot of time in std::string concatenation, now the time spent on file reading is down to 5 ms. There is not much room for further improvement, let’s see how we can improve the SHA-1 performance.

A well known fast SHA-1 implementation is the one written by Linus Torvalds for GIT, it’s called block-sha1. The code is derived from the Mozilla NSS library but Linus claimed that he has rewritten it entirely. This performed indeed quite well, the actual calculation time is now down to 60 ~ 65 ms (246 MB/s).

However block-sha1 license is not entirely clear for our closed source use, Linus said that he wouldn’t mind to license it as MPL but so far we still consider it’s licensed as rest of GIT since it resides in its repository. We obviously can’t link to GPLv2 licensed code in closed source software.

Another alternative is Steve Reid’s SHA-1 implementation in C, which is completely public domain code. It performances quite fast as well, around 88 ms for us, which equals to 181 MB/s.

Now that we have a good backup plan, I tried the SHA-1 implementation from Mozilla as well, it takes almost as long as Steve Reid’s implementation, no real improvement here but there is not much burden in license for us either.

I would like to try OpenSSL’s implementation as well, since according to Improving the Performance of the Secure Hash Algorithm (SHA-1), it has more optimized implementation for Intel SIMD instructions (SSE3). However I didn’t managed to get the project build with OpenSSL due to some other complications. Also because I managed to find a better alternative than fiddling with OpenSSL’s fragile API: the CommonDigest library from Apple. It performed much better than block-sha1: only takes 40ms (400 MB/s). From the source released, Apple seemed to be using the cross-platform OpenSSL implementation here as well, but it still would be nice to see how does it compare with OpenSSL library shipped with the system. I will try to post some results in upcoming days.

Author: Jiang Jiang

A software engineer from China, working on some OS for a fruit company. Interested in typography and science fiction.

2 thoughts on “Optimizing SHA-1 Performance on OS X”

Leave a Reply

Your email address will not be published. Required fields are marked *