The MD5 certificate collision attack, and what it means for Tor
Today, a team of security researchers and cryptographers gave a talk at the 25th Chaos Communication Congress (25C3) about a nifty attack against X.509 certificates generated using the MD5 digest algorithm. We figured that people would ask us how this attack affects Tor, so I'm writing an answer in advance.
The short version: This attack doesn't affect Tor.
The medium version: This attack doesn't affect Tor, since Tor doesn't ever use MD5 certificates, and since Tor doesn't care what certificate authorities say. On the other hand, this attack probably does affect your browser. Check your browser vendor for updates over the next few days and weeks, and make sure you install them.
The long version: To understand the attack, first you've got to understand certificates. When your browser makes a connection to a "secure" website, it uses a protocol called SSL (or sometimes TLS) to see who it's talking to and encrypt the connection with them. In SSL, parties are identified using X.509 certificates, which are issued to them by certificate authorities, or "CA"s. Your browser comes with a big list of certificate authorities. When your browser sees a certificate that was signed by a certificate authority it recognizes, it knows it's talking to the right website.
Certificates, like nearly anything of interest, are too big to sign as-is, so the CA uses a cryptographic "digest" algorithm to derive a short "hash" of the certificate that it can sign. The digest algorithm is supposed to be "collision resistant", so that nobody can find two different certificates that produce the same hash. Such a collision would be bad, since somebody who could produce two such certificates could get a CA to sign one of them, and then use that signature on the other one. Since the hash values would be the same, nobody would be able to tell that the CA had not really signed the second certificate.
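To make the "sign the hash, not the certificate" problem concrete, here is a minimal Python sketch. It is a toy under loud assumptions: the digest is MD5 truncated to 16 bits so that a collision can be found by brute force in well under a second, the "CA signature" is an HMAC standing in for a real RSA signature, and the certificate strings, key, and function names are all made up for illustration. The real attack uses a far more sophisticated collision technique against full MD5; only the final step (reusing the signature) is the same.

```python
import hashlib
import hmac
import itertools

DIGEST_BYTES = 2  # ASSUMPTION: toy 16-bit digest, so collisions are cheap to find
CA_KEY = b"hypothetical CA signing key"  # stand-in for a CA's private key


def weak_digest(data: bytes) -> bytes:
    """MD5 truncated to DIGEST_BYTES bytes (deliberately weak, for the demo)."""
    return hashlib.md5(data).digest()[:DIGEST_BYTES]


def ca_sign(cert: bytes) -> bytes:
    """Toy 'signature': the CA signs only the digest, never the full certificate."""
    return hmac.new(CA_KEY, weak_digest(cert), hashlib.sha256).digest()


def verify(cert: bytes, signature: bytes) -> bool:
    """Accept the certificate if the signature matches its digest."""
    return hmac.compare_digest(ca_sign(cert), signature)


def find_collision():
    """Brute-force one benign and one rogue certificate with the same weak digest."""
    benign_seen, rogue_seen = {}, {}
    for serial in itertools.count():
        benign = b"CN=www.example.com;CA=false;serial=%d" % serial
        rogue = b"CN=Rogue CA;CA=true;serial=%d" % serial
        benign_digest, rogue_digest = weak_digest(benign), weak_digest(rogue)
        benign_seen[benign_digest] = benign
        rogue_seen[rogue_digest] = rogue
        if benign_digest in rogue_seen:
            return benign, rogue_seen[benign_digest]
        if rogue_digest in benign_seen:
            return benign_seen[rogue_digest], rogue


if __name__ == "__main__":
    benign, rogue = find_collision()
    signature = ca_sign(benign)  # the CA only ever sees the benign certificate
    print(benign, rogue)
    print("rogue certificate accepted:", verify(rogue, signature))
```

The verifier has no way to tell which of the two certificates the CA actually looked at, which is exactly why collision resistance matters for signatures.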
With me so far? Good. Let's talk about MD5. There is an old broken hash algorithm called MD5. How old and broken? Cryptographers have considered it weak since 1996 or so, and there have been known real collisions in it since 2004. In 2007, researchers published a method for generating MD5 collisions that could be used against X.509 certificates. In other words, the writing has been on the wall since at least 1996 (arguably 1993), and the writing has been getting bigger year after year. You'd have to be pretty oblivious to still use MD5 for signing certificates in 2008!
Unfortunately, some brave CAs still use MD5 for signing certificates in 2008. And so we come to the attack.
Using a method derived from the 2007 paper, and a cluster of 200 PlayStation 3s, the researchers generated two certificates that would produce an MD5 collision: one innocuous one, and one CA certificate. They got a CA to sign the first one, and then transferred this signature to the second. Since the second certificate was for a CA, they now had a certificate that let them generate their own certificates, and make any phony claims they wanted about the identity of any website. If your browser saw one of these phony certificates, it would believe it, since it was ultimately signed by a CA it recognized.
For more information on the attack, with a lot of complicated tricky bits I didn't mention, see the authors' writeup.
The good news is that Tor itself is not affected. Tor doesn't use MD5 for anything[*]. Tor doesn't use commercial CAs. Tor doesn't sign certificates for others, and everything in Tor that is signed is signed using SHA-1, not MD5.[**]
The bad news is that your browser probably does have some of the affected CAs listed. As how-to guides for securing your browser become available, we'll post links to them here.
Finally: Happy new year! And best wishes to any programmers who get stuck working all night on New Year's Eve to remove MD5 from their system.
[*] The fine print: Tor uses the TLS protocol, which uses MD5 in a couple of places. But TLS uses it in tandem with SHA-1, so an attacker will need to break SHA-1 and MD5 at the same time to harm TLS's security. Read RFC 2246 for the ugly ugly details.
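For the curious, one of the places where TLS 1.0 pairs MD5 with SHA-1 is its pseudorandom function, which XORs an HMAC-MD5 stream with an HMAC-SHA-1 stream. The following Python sketch follows my reading of RFC 2246; the function names are my own, it is meant only to illustrate the "in tandem" idea, and it has not been checked against official test vectors, so please don't treat it as a reference implementation.

```python
import hashlib
import hmac


def p_hash(hash_name: str, secret: bytes, seed: bytes, length: int) -> bytes:
    """Data expansion: HMAC(secret, A(1) + seed) + HMAC(secret, A(2) + seed) + ..."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hash_name).digest()  # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hash_name).digest()
    return out[:length]


def tls10_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """TLS 1.0 style PRF: P_MD5(S1, label + seed) XOR P_SHA-1(S2, label + seed)."""
    half = (len(secret) + 1) // 2        # halves share the middle byte if length is odd
    s1 = secret[:half]                   # first half keys the MD5 stream
    s2 = secret[len(secret) - half:]     # second half keys the SHA-1 stream
    md5_stream = p_hash("md5", s1, label + seed, length)
    sha1_stream = p_hash("sha1", s2, label + seed, length)
    return bytes(x ^ y for x, y in zip(md5_stream, sha1_stream))


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the function.
    print(tls10_prf(b"premaster", b"master secret", b"client+server randoms", 48).hex())
```

Because the two streams are XORed together, predicting the output requires predicting both of them, so a weakness in MD5 alone is not enough.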
[**] Yes, I know that SHA-1 is showing its age too. Unfortunately, the SHA-2 algorithms aren't that much better, and nothing else has seen the same amount of analysis. Once the NIST hash function competition has picked a SHA-3 candidate, we'll switch to that. In the meantime, I'll be launching some design work on or-dev to make it easier for Tor to switch to a better hash algorithm, once we've got one, or in case we need to jump off SHA-1 in a hurry.
I've heard good things about Whirlpool, but I'm no cryptographer. If I understand correctly, SHA-1 (and the SHA-2 family) have gotten more analysis than any other not-totally-broken-in-practice digest functions. By the time the SHA-3 competition is done, the SHA-3 candidates will also be heavily analyzed by the best cryptographers in the field. I don't think that Tiger or Whirlpool has seen quite enough analysis to make me comfortable.
Still, you're right that it would be really bad if Tor were still using SHA-1 when a practical chosen-prefix attack against it is found. I'm hoping we can get the tools ready to migrate to SHA-256 in the meantime, since (a) the SHA-2 functions seem likely to last a while longer than SHA-1, and (b) doing one migration will make the Tor software more hash-agnostic, so that we can move to SHA-3 quickly once it's chosen. Alternatively, if SHA-256 is broken before SHA-3 is out (unlikely, it seems), we could then think about switching to whatever SHA-3 candidate(s) seem best.
Unfortunately, this isn't trivial. We need to maintain backward compatibility, since people would get mad if we made every Tor user and server upgrade their software all at once.
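To give a rough idea of what "hash-agnostic" with backward compatibility could look like, here is a small Python sketch. It is purely illustrative and not Tor's actual design: the `algorithm:hexdigest` format, the function names, and the accepted list are all assumptions of mine. The idea is that each digest names the algorithm that produced it, new software emits the stronger hash, and verifiers keep accepting the older one until everybody has upgraded.

```python
import hashlib
import hmac

# ASSUMPTION: hypothetical policy, most preferred algorithm first.
ACCEPTED_ALGORITHMS = ("sha256", "sha1")


def tagged_digest(data: bytes, algorithm: str = "sha256") -> str:
    """Return a digest that names its own algorithm, e.g. 'sha256:ab12...'."""
    return f"{algorithm}:{hashlib.new(algorithm, data).hexdigest()}"


def verify_tagged(data: bytes, tagged: str, accepted=ACCEPTED_ALGORITHMS) -> bool:
    """Check data against a tagged digest, refusing algorithms not in `accepted`."""
    algorithm, sep, hexdigest = tagged.partition(":")
    if not sep or algorithm not in accepted:
        return False
    expected = hashlib.new(algorithm, data).hexdigest()
    return hmac.compare_digest(expected.encode(), hexdigest.lower().encode())


if __name__ == "__main__":
    descriptor = b"example descriptor body"  # hypothetical signed object
    old_style = tagged_digest(descriptor, "sha1")  # what older software might emit
    new_style = tagged_digest(descriptor)          # what upgraded software emits
    print(verify_tagged(descriptor, old_style), verify_tagged(descriptor, new_style))
    # Later, once everyone has upgraded, drop SHA-1 from the accepted list:
    print(verify_tagged(descriptor, old_style, accepted=("sha256",)))
```

The hard part in practice is not this bookkeeping but deciding, for each place a hash is used, what a safe transition period looks like.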
If you want to help, we could use some design proposals here. I've checked in a document to Tor svn at /tor/trunk/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt . It lists everywhere that Tor uses SHA-1 today. If anybody wants to help think about how to design the migration safely, that will help lots whenever we wind up switching to SHA-256, Skein, MD6, or whatever.