People are starting to ask us about a recent tech report from Sambuddho's group about how an attacker with access to many routers around the Internet could gather the netflow logs from these routers and match up Tor flows. It's great to see more research on traffic correlation attacks, especially on attacks that don't need to see the whole flow on each side. But it's also important to realize that traffic correlation attacks are not a new area.
This blog post aims to give you some background to get you up to speed on the topic.
First, you should read the first few paragraphs of the One cell is enough to break Tor's anonymity analysis:
First, remember the basics of how Tor provides anonymity. Tor clients route their traffic over several (usually three) relays, with the goal that no single relay gets to learn both where the user is (call her Alice) and what site she's reaching (call it Bob).
The Tor design doesn't try to protect against an attacker who can see or measure both traffic going into the Tor network and also traffic coming out of the Tor network. That's because if you can see both flows, some simple statistics let you decide whether they match up.
Because we aim to let people browse the web, we can't afford the extra overhead and hours of additional delay that are used in high-latency mix networks like Mixmaster or Mixminion to slow this attack. That's why Tor's security is all about trying to decrease the chances that an adversary will end up in the right positions to see the traffic flows.
The way we generally explain it is that Tor tries to protect against traffic analysis, where an attacker tries to learn whom to investigate, but Tor can't protect against traffic confirmation (also known as end-to-end correlation), where an attacker tries to confirm a hypothesis by monitoring the right locations in the network and then doing the math.
And the math is really effective. There are simple packet counting attacks (Passive Attack Analysis for Connection-Based Anonymity Systems) and moving window averages (Timing Attacks in Low-Latency Mix-Based Systems), but the more recent stuff is downright scary, like Steven Murdoch's PET 2007 paper about achieving high confidence in a correlation attack despite seeing only 1 in 2000 packets on each side (Sampled Traffic Analysis by Internet-Exchange-Level Adversaries).
Second, there's some further discussion about the efficacy of traffic correlation attacks at scale in the Improving Tor's anonymity by changing guard parameters analysis:
Tariq's paper makes two simplifying assumptions when calling an attack successful [...] 2) He assumes that the end-to-end correlation attack (matching up the incoming flow to the outgoing flow) is instantaneous and perfect. [...] The second one ("how successful is the correlation attack at scale?" or maybe better, "how do the false positives in the correlation attack compare to the false negatives?") remains an open research question.
Researchers generally agree that given a handful of traffic flows, it's easy to match them up. But what about the millions of traffic flows we have now? What levels of false positives (algorithm says "match!" when it's wrong) are acceptable to this attacker? Are there some simple, not too burdensome, tricks we can do to drive up the false positives rates, even if we all agree that those tricks wouldn't work in the "just looking at a handful of flows" case?
More precisely, it's possible that correlation attacks don't scale well because as the number of Tor clients grows, the chance that the exit stream actually came from a different Tor client (not the one you're watching) grows. So the confidence in your match needs to grow along with that or your false positive rate will explode. The people who say that correlation attacks don't scale use phrases like "say your correlation attack is 99.9% accurate" when arguing it. The folks who think it does scale use phrases like "I can easily make my correlation attack arbitrarily accurate." My hope is that the reality is somewhere in between — correlation attacks in the current Tor network can probably be made plenty accurate, but perhaps with some simple design changes we can improve the situation.
The discussion of false positives is key to this new paper too: Sambuddho's paper mentions a false positive rate of 6%. That sounds like it means if you see a traffic flow at one side of the Tor network, and you have a set of 100000 flows on the other side and you're trying to find the match, then 6000 of those flows will look like a match. It's easy to see how at scale, this "base rate fallacy" problem could make the attack effectively useless.
And that high false positive rate is not at all surprising, since he is trying to capture only a summary of the flows at each side and then do the correlation using only those summaries. It would be neat (in a theoretical sense) to learn that it works, but it seems to me that there's a lot of work left here in showing that it would work in practice. It also seems likely that his definition of false positive rate and my use of it above don't line up completely: it would be great if somebody here could work on reconciling them.
For a possibly related case where a series of academic research papers misunderstood the base rate fallacy and came to bad conclusions, see Mike's critique of website fingerprinting attacks plus the follow-up paper from CCS this year confirming that he's right.
I should also emphasize that whether this attack can be performed at all has to do with how much of the Internet the adversary is able to measure or control. This diversity question is a large and important one, with lots of attention already. See more discussion here.
In summary, it's great to see more research on traffic confirmation attacks, but a) traffic confirmation attacks are not a new area so don't freak out without actually reading the papers, and b) this particular one, while kind of neat, doesn't supercede all the previous papers.
(I should put in an addendum here for the people who are wondering if everything they read on the Internet in a given week is surely all tied together: we don't have any reason to think that this attack, or one like it, is related to the recent arrests of a few dozen people around the world. So far, all indications are that those arrests are best explained by bad opsec for a few of them, and then those few pointed to the others when they were questioned.)
[Edit: be sure to read Sambuddho's comment below, too. -RD]
Welcome to the forty-fifth issue in 2014 of Tor Weekly News, the weekly newsletter that covers what’s happening in the Tor community.
Mozilla announces Polaris Privacy Initiative
Mozilla, makers of the Firefox browser upon which Tor Browser is based, announced a series of projects to “accelerate pragmatic and user-focused advances in privacy technology for the Web, giving users more control, awareness and protection in their Web experiences”. The Tor Project is one of Mozilla’s two partners in this Polaris Privacy Initiative, and the collaboration will involve looking at the Firefox codebase to see if its relationship to Tor Browser and the Tor development process can be made more efficient, giving Tor engineers more time to focus on other important issues. Mozilla also stated their intention to run several high-capacity Tor middle relays, contributing to a faster and more stable Tor network.
As Andrew Lewman wrote on the Tor blog, “the Tor Browser is one of the best ways to protect privacy on the web and this partnership is a huge step in advancing people’s right to freedom of expression online”. Watch for more announcements as work on these two fronts continues.
Tor and Operation Onymous
An international coalition of law enforcement authorities announced the seizure of over 400 Tor hidden services allegedly engaging in illegal activity. Once the desired headlines had been written, something approaching the facts began to emerge, with the claimed number of seized services dropping sharply to 27; more troublingly, several high-capacity Tor relays with no apparent connection to the hidden services were also seized. However, in contrast to the last major takedown of hidden services, which involved one shared hidden service hosting platform, there was no obvious single feature linking all of the seized sites, leading to concern in the Tor community that an exploit against the Tor network may have been responsible for their discovery.
It could be that these services were deanonymized individually over a period of months using a variety of means, then all seized at once for maximum effect: as Andrew Lewman and others wrote in a response posted to the Tor blog, these methods could include operational security mistakes by service operators, exploitation of flaws in poorly-written website code, or attacks on the Bitcoin cryptocurrency that is widely used on hidden service marketplaces. On the other hand, if an attack on the Tor network itself is at play, it may be a variant of the class of attack known as “traffic confirmation”, like the one observed earlier this year. “Unfortunately,” as the blog post notes, ”the authorities did not specify how they managed to locate the hidden services”; even if they had, recent disclosures concerning “parallel construction” in law enforcement mean that the public would not necessarily be able to trust their explanation.
“Hidden services need some love” has become a familiar refrain in recent months, and even though the story behind these seizures may remain unknown, they have reinvigorated some long-running threads on improvements to the security of this important technology. George Kadianakis coded a patch that allows hidden service operators to “specify a set of nodes that will be pinned as middle nodes in hidden service rendezvous circuits”, while the theory behind this continues to be discussed, as does the hidden service authorization feature and how widely it is used in practice.
“The attention hidden services have received is minimal compared to their social value and compared to the size and determination of their adversaries.” If you are a hidden service operator concerned by these seizures, or you want to help ensure the possibility of free and uncensorable publishing online, see the group blog post for more details, and feel free to join in with the discussions on the tor-dev mailing list.
More monthly status reports for October 2014
Roger Dingledine sent out the report for SponsorF.
Arturo Filastò reported on OONI’s ongoing study of Tor bridge reachability in different countries, and the recent hackfest on the same topic.
Karsten Loesing offered an update on developments in the world of Onionoo, including new mirrors and search improvements.
Help desk round up
The help desk has been asked how to run Tor Browser on a Chromebook. ChromeOS does not allow any programs to be executed except Google Chrome, including other browsers like Tor Browser. The workaround for this is to install a Debian or Ubuntu environment within ChromeOS using crouton. Once crouton is ready, Tor Browser for Linux can be downloaded and installed in the Debian or Ubuntu environment. Crouton users should seek support from the crouton team and not from the Tor help desk.
This issue of Tor Weekly News has been assembled by Harmony, Matt Pagan, Karsten Loesing, and Lunar.
Want to continue reading TWN? Please help us create this newsletter. We still need more volunteers to watch the Tor community and report important news. Please see the project page, write down your name and subscribe to the team mailing list if you want to get involved!
Mozilla announced that the Tor Project and the Center for Democracy & Technology will be part of their new privacy initiative called Polaris, a collaboration to bring even more privacy features into Mozilla’s products. We are honored to be working alongside Mozilla as well as the Center for Democracy & Technology to give Firefox users more options to protect their privacy.
Mozilla is an industry leader in developing features to support the user’s desire for increased privacy online and shares the Tor Project's mission of helping people protect their security online. At the core of Mozilla's values is the belief that individuals’ privacy cannot be treated as optional. We share this belief. Millions of people around the world rely on the protection of the Tor software and network to safeguard their anonymity. We appreciate companies like Mozilla that see the importance of safeguarding privacy. The Tor volunteer network has grown to the point that large companies can usefully contribute without hurting network diversity. The Tor network will get even better with Mozilla's help, and we hope that their participation will encourage even more organizations to join us.
The initial projects with Mozilla will focus on two areas:
The Tor Browser is built on the Firefox platform and we are excited to have the resources of Mozilla’s engineers to help us merge the many Firefox privacy fixes into the Mozilla codebase. The increased attention from Mozilla will give us time to focus on finding and fixing new issues rather than maintaining our fork.
Tor's network size constrains the number of users that can use Tor concurrently. In the short term, Mozilla will help address this by hosting high-capacity Tor middle relays to make Tor’s network more responsive and allow Tor to serve more users.
We believe that the Tor Browser is one of the best ways to protect privacy on the web and this partnership is a huge step in advancing people’s right to freedom of expression online.
Has a Tor bridge already been blocked in a given country? Being able to answer that question would allow Tor to provide more efficient circumvention methods to those who need them. OONI, the Open Observatory of Network Interference is now actively collecting data on bridge reachability. We are also interested in having a better understanding of how reactive censors are in blocking new bridges distributed via Tor Browser and how effective they are at inhibiting usage of particular pluggable transport.
The countries we are focusing on in this survey are China, Iran, Russia and Ukraine. We call these our test vantage points.
From every test vantage point we perform two types of measurements:
- A Bridge reachability measurement that attempts to build a Tor circuit using the bridge in question.
- A TCP connect measurement that simply does a TCP connect to the bridge IP and port.
To establish a baseline to eliminate the cases in which the bridge is marked as blocked, while it is in fact just offline, we measure also from a vantage point located in the Netherlands.
So far we have collected about a month worth of data and it is as always publicly available for download by anybody interested in looking at it.
To advance this study at the end of October we did a OONI hackfest in Berlin. Helped by the ubiquitous sticky notes we were able to come up with a plan for those days of work and for continuing the project.
The first visualisation we produced is that of the reachability of bridges categorised by country and pluggable transport over time. This simple visualisation already conveys a lot of information and has proven itself a useful tool also in debugging issues with ooniprobe and the tools we use.
You can visit the actual page by clicking on the picture above.
Please note that because the tests are new and experimental you might find inaccuracies or bugs, so don't seriously rely on it for research just yet.
We also developed a data pipeline that places all of the collected OONI reports into a database. This makes it much easier to search/aggregate and visualise the data of the reports.
To read more about this project check out the ooni-dev mailing list thread on this topic.
This project is still in it's very early stages of development, but we would love to hear feedback on it or your cool visualization ideas, as well as any questions regarding Tor bridge reachability (or more in general on Internet censorship) that you would like us to answer!
Recently it was announced that a coalition of government agencies took control of many Tor hidden services. We were as surprised as most of you. Unfortunately, we have very little information about how this was accomplished, but we do have some thoughts which we want to share.
Over the last few days, we received and read reports saying that several Tor relays were seized by government officials. We do not know why the systems were seized, nor do we know anything about the methods of investigation which were used. Specifically, there are reports that three systems of Torservers.net disappeared and there is another report by an independent relay operator. If anyone has more details, please get in contact with us. If your relay was seized, please also tell us its identity so that we can request that the directory authorities reject it from the network.
But, more to the point, the recent publications call the targeted hidden services seizures "Operation Onymous" and they say it was coordinated by Europol and other government entities. Early reports say 17 people were arrested, and 400 hidden services were seized. Later reports have clarified that it was hundreds of URLs hosted on roughly 27 web sites offering hidden services. We have not been contacted directly or indirectly by Europol nor any other agency involved.
Tor is most interested in understanding how these services were located, and if this indicates a security weakness in Tor hidden services that could be exploited by criminals or secret police repressing dissents. We are also interested in learning why the authorities seized Tor relays even though their operation was targetting hidden services. Were these two events related?
How did they locate the hidden services?
So we are left asking "How did they locate the hidden services?". We don't know. In liberal democracies, we should expect that when the time comes to prosecute some of the seventeen people who have been arrested, the police would have to explain to the judge how the suspects came to be suspects, and that as a side benefit of the operation of justice, Tor could learn if there are security flaws in hidden services or other critical internet-facing services. We know through recent leaks that the US DEA and others have constructed a system of organized and sanctioned perjury which they refer to as "parallel construction."
Unfortunately, the authorities did not specify how they managed to locate the hidden services. Here are some plausible scenarios:
The first and most obvious explanation is that the operators of these hidden services failed to use adequate operational security. For example, there are reports of one of the websites being infiltrated by undercover agents and the affidavit states various operational security errors.
Another explanation is exploitation of common web bugs like SQL injections or RFIs (remote file inclusions). Many of those websites were likely quickly-coded e-shops with a big attack surface. Exploitable bugs in web applications are a common problem.
Apparently, there are ways to link transactions and deanonymize Bitcoin clients even if they use Tor. Maybe the seized hidden services were running Bitcoin clients themselves and were victims of similar attacks.
Attacks on the Tor network
The number of takedowns and the fact that Tor relays were seized could also mean that the Tor network was attacked to reveal the location of those hidden services. We received some interesting information from an operator of a now-seized hidden service which may indicate this, as well. Over the past few years, researchers have discovered various attacks on the Tor network. We've implemented some defenses against these attacks, but these defenses do not solve all known issues and there may even be attacks unknown to us.
For example, some months ago, someone was launching non-targetted deanonymization attacks on the live Tor network. People suspect that those attacks were carried out by CERT researchers. While the bug was fixed and the fix quickly deployed in the network, it's possible that as part of their attack, they managed to deanonymize some of those hidden services.
Another possible Tor attack vector could be the Guard Discovery attack. This attack doesn't reveal the identity of the hidden service, but allows an attacker to discover the guard node of a specific hidden service. The guard node is the only node in the whole network that knows the actual IP address of the hidden service. Hence, if the attacker then manages to compromise the guard node or somehow obtain access to it, she can launch a traffic confirmation attack to learn the identity of the hidden service. We've been
discussing various solutions to the guard discovery attack for the past many months but it's not an easy problem to fix properly. Help and feedback on the proposed designs is appreciated.
*Similarly, there exists the attack where the hidden service selects the attacker's relay as its guard node. This may happen randomly or this could occur if the hidden service selects another relay as its guard and the attacker renders that node unusable, by a denial of service attack or similar. The hidden service will then be forced to select a new guard. Eventually, the hidden service will select the attacker.
Furthermore, denial of service attacks on relays or clients in the Tor network can often be leveraged into full de-anonymization attacks. These techniques go back many years, in research such as "From a Trickle to a Flood", "Denial of Service or Denial of Security?", "Why I'm not an Entropist", and even the more recent Bitcoin attacks above. In the Hidden Service protocol there are more vectors for DoS attacks, such as the set of HSDirs and the Introduction Points of a Hidden Service.
Finally, remote code execution exploits against Tor software are also always a possibility, but we have zero evidence that such exploits exist. Although the Tor source code gets continuously reviewed by our security-minded developers and community members, we would like more focused auditing by experienced bug hunters. Public-interest initiatives like Project Zero could help out a lot here. Funding to launch a bug bounty program of our own could also bring real benefit to our codebase. If you can help, please get in touch.
Advice to concerned hidden service operators
As you can see, we still don't know what happened, and it's hard to give concrete suggestions blindly.
If you are a concerned hidden service operator, we suggest you read the cited resources to get a better understanding of the security that hidden services can offer and of the limitations of the current system. When it comes to anonymity, it's clear that the tighter your threat model is, the more informed you need to be about the technologies you use.
If your hidden service lacks sufficient processor, memory, or network resources the DoS based de-anonymization attacks may be easy to leverage against your service. Be sure to review the Tor performance tuning guide to optimize your relay or client.
*Another possible suggestion we can provide is manually selecting the guard node of a hidden service. By configuring the EntryNodes option in Tor's configuration file you can select a relay in the Tor network you trust. Keep in mind, however, that a determined attacker will still be able to determine this relay is your guard and all other attacks still apply.
The task of hiding the location of low-latency web services is a very hard problem and we still don't know how to do it correctly. It seems that there are various issues that none of the current anonymous publishing designs have really solved.
In a way, it's even surprising that hidden services have survived so far. The attention they have received is minimal compared to their social value and compared to the size and determination of their adversaries.
It would be great if there were more people reviewing our designs and code. For example, we would really appreciate feedback on the upcoming hidden service revamp or help with the research on guard discovery attacks (see links above).
Also, it's important to note that Tor currently doesn't have funding for improving the security of hidden services. If you are interested in funding hidden services research and development, please get in touch with us. We hope to find time to organize a crowdfunding campaign to acquire independent and focused hidden service funding.
Thanks to Griffin, Matt, Adam, Roger, David, George, Karen, and Jake for contributions to this post.
* Added information about guard node DoS and EntryNodes option - 2014/11/09 18:16 UTC
Welcome to the forty-fourth issue in 2014 of Tor Weekly News, the weekly newsletter that covers what’s happening in the Tor community.
Tor 0.2.6.1-alpha is out
Following last week’s stabilization of Tor 0.2.5.x, Nick Mathewson announced the first alpha release in the Tor 0.2.6.x series. Quoting the changelog, this version “includes numerous code cleanups and new tests, and fixes a large number of annoying bugs. Out-of-memory conditions are handled better than in 0.2.5, pluggable transports have improved proxy support, and clients now use optimistic data for contacting hidden services.” Support for some very old compilers that do not understand the C99 programming standard, systems without threading support, and the Windows CE operating system has also been dropped.
“This is the first alpha release in a new series, so expect there to be bugs.” If you want to test it out, you can find the source code in the distribution directory.
Tor Browser 4.0.1 is out
Mike Perry announced a bugfix release by the Tor Browser team. This version disables DirectShow, which was causing the Windows build of Tor Browser to crash when visiting many websites. This is not a security release, but Windows users who have experienced this issue should upgrade.
Please see Mike’s post for the changelog, and download your copy from the project page.
Facebook, hidden services, and HTTPS certificates
Facebook, one of the world’s most popular websites, surprised the Internet by becoming the most prominent group so far to set up a Tor hidden service. Rather than connecting through an exit relay, Facebook users can now interact with the social network without their traffic leaving the Tor network at all until it reaches its destination.
Soon after the service was announced, some in the Tor community expressed concern over the implications of its unusually memorable .onion address. Had Facebook somehow mustered the computing power to brute-force hidden service keys at will? Alec Muffett, one of the lead engineers behind the project, clarified that in fact “we just did the same thing as everyone else: generated a bunch of keys with a fixed lead prefix (‘facebook’) and then went fishing looking for good ones”, getting “tremendous lucky” in the process. Those concerned by how easy this seems, added Nick Mathewson, “might want to jump in on reviewing and improving proposal 224, which includes a brand-new, even less usable, but far more secure, name format”.
“Why would you want to use Facebook over Tor?” remains a frequently-asked (and -misunderstood) question, so Roger Dingledine took to the Tor blog to address this and related issues. “The key point here is that anonymity isn’t just about hiding from your destination. There’s no reason to let your ISP know when or whether you’re visiting Facebook. There’s no reason for Facebook’s upstream ISP, or some agency that surveils the Internet, to learn when and whether you use Facebook. And if you do choose to tell Facebook something about you, there’s still no reason to let them automatically discover what city you’re in today while you do it.” Not only that, but Facebook is now taking advantage of the special security properties that hidden services provide, including strong authentication (letting users be confident that they are talking to the right server, and not to an impostor) and end-to-end encryption of their data.
This last point generated some confusion, since Facebook have also acquired an HTTPS certificate for their hidden service, which might seem like an unnecessary belt-and-suspenders approach to security. This has been the subject of “feisty discussions” in the Internet security community, with many points for and against: on the one hand, users have been taught that “https is necessary and http is scary, so it makes sense that users want to see the string “https” in front of” URLs, while on the other, “by encouraging people to pay Digicert we’re reinforcing the certificate authority business model when maybe we should be continuing to demonstrate an alternative.”
Please see Roger’s post for a fuller discussion of all these points and more, and feel free to contribute your own thoughts on the tor-talk mailing list. If you experience problems with the service, please contact Facebook support rather than the Tor help desk; as Alec wrote in the announcement, “we expect the service to be of an evolutionary and slightly flaky nature”, as it is an “experiment” — hopefully an experiment that will, as Roger suggested, “help to continue opening people’s minds about why they might want to offer a hidden service, and help other people think of further novel uses for hidden services.”
Monthly status reports for October 2014
The wave of regular monthly reports from Tor project members for the month of October has begun. Juha Nurmi released his report first, followed by reports from Georg Koppen, Sherief Alaa, Pearl Crescent, Lunar, Harmony, Sukhbir Singh, Colin C., Leiah Jansen, Nick Mathewson, Arlo Breault, Noel Torres, and George Kadianakis.
Israel Leiva sent out an update on the progress of the GetTor redevelopment project.
David also sent out a summary of the costs incurred by the meek pluggable transport, which have increased significantly following its incorporation into the latest stable Tor Browser and the consequent “explosion” in use.
Esfandiar Mohammadi announced the MATor project and accompanying paper. MATor is a tool that “assesses the influence of Tor’s path selection on a user’s anonymity”; “since MATor is an ongoing project, we would appreciate your opinion about the approach in general.”
Tor help desk roundup
The help desk has been asked if Tor Browser acts as a relay by default. Tor Browser’s Tor by default acts only as a client, and not as a bridge relay, exit relay, or relay. Additionally, this is unlikely to change in the future.
This issue of Tor Weekly News has been assembled by Lunar, Matt Pagan, Karsten Loesing, and Harmony.
Want to continue reading TWN? Please help us create this newsletter. We still need more volunteers to watch the Tor community and report important news. Please see the project page, write down your name and subscribe to the team mailing list if you want to get involved!
Most notably, Tor Browser 4.0.1 fixes a crash bug affecting many users on Windows (see: bug 13443 for the details). Furthermore, the latest stable version of Tor (0.2.5.10) is included and a bug in our updater code got fixed.
This is not a security update, and we will not be deploying update notification or automatic upgrades for all platforms for this release. We may provide automatic updates just for Windows users later in the week, but we are hesitant to do this immediately due to Bug 13594.
Here is the changelog since 4.0:
- All Platforms
- Update Tor to 0.2.5.10
- Update NoScript to 22.214.171.124
- Bug 13301: Prevent extensions incompatibility error after upgrades
- Bug 13460: Fix MSVC compilation issue
- Bug 13443: Disable DirectShow to prevent crashes on many sites
- Bug 13091: Make app name "Tor Browser" instead of "Tor"
Today Facebook unveiled its hidden service that lets users access their website more safely. Users and journalists have been asking for our response; here are some points to help you understand our thinking.
Part one: yes, visiting Facebook over Tor is not a contradiction
I didn't even realize I should include this section, until I heard from a journalist today who hoped to get a quote from me about why Tor users wouldn't ever use Facebook. Putting aside the (still very important) questions of Facebook's privacy habits, their harmful real-name policies, and whether you should or shouldn't tell them anything about you, the key point here is that anonymity isn't just about hiding from your destination.
There's no reason to let your ISP know when or whether you're visiting Facebook. There's no reason for Facebook's upstream ISP, or some agency that surveils the Internet, to learn when and whether you use Facebook. And if you do choose to tell Facebook something about you, there's still no reason to let them automatically discover what city you're in today while you do it.
Also, we should remember that there are some places in the world that can't reach Facebook. Long ago I talked to a Facebook security person who told me a fun story. When he first learned about Tor, he hated and feared it because it "clearly" intended to undermine their business model of learning everything about all their users. Then suddenly Iran blocked Facebook, a good chunk of the Persian Facebook population switched over to reaching Facebook via Tor, and he became a huge Tor fan because otherwise those users would have been cut off. Other countries like China followed a similar pattern after that. This switch in his mind between "Tor as a privacy tool to let users control their own data" to "Tor as a communications tool to give users freedom to choose what sites they visit" is a great example of the diversity of uses for Tor: whatever it is you think Tor is for, I guarantee there's a person out there who uses it for something you haven't considered.
Part two: we're happy to see broader adoption of hidden services
I think it is great for Tor that Facebook has added a .onion address. There are some compelling use cases for hidden services: see for example the ones described at using Tor hidden services for good, as well as upcoming decentralized chat tools like Ricochet where every user is a hidden service, so there's no central point to tap or lean on to retain data. But we haven't really publicized these examples much, especially compared to the publicity that the "I have a website that the man wants to shut down" examples have gotten in recent years.
Hidden services provide a variety of useful security properties. First — and the one that most people think of — because the design uses Tor circuits, it's hard to discover where the service is located in the world. But second, because the address of the service is the hash of its key, they are self-authenticating: if you type in a given .onion address, your Tor client guarantees that it really is talking to the service that knows the private key that corresponds to the address. A third nice feature is that the rendezvous process provides end-to-end encryption, even when the application-level traffic is unencrypted.
So I am excited that this move by Facebook will help to continue opening people's minds about why they might want to offer a hidden service, and help other people think of further novel uses for hidden services.
Another really nice implication here is that Facebook is committing to taking its Tor users seriously. Hundreds of thousands of people have been successfully using Facebook over Tor for years, but in today's era of services like Wikipedia choosing not to accept contributions from users who care about privacy, it is refreshing and heartening to see a large website decide that it's ok for their users to want more safety.
As an addendum to that optimism, I would be really sad if Facebook added a hidden service, had a few problems with trolls, and decided that they should prevent Tor users from using their old https://www.facebook.com/ address. So we should be vigilant in helping Facebook continue to allow Tor users to reach them through either address.
Part three: their vanity address doesn't mean the world has ended
Their hidden service name is "facebookcorewwwi.onion". For a hash of a public key, that sure doesn't look random. Many people have been wondering how they brute forced the entire name.
The short answer is that for the first half of it ("facebook"), which is only 40 bits, they generated keys over and over until they got some keys whose first 40 bits of the hash matched the string they wanted.
Then they had some keys whose name started with "facebook", and they looked at the second half of each of them to pick out the ones with pronouncable and thus memorable syllables. The "corewwwi" one looked best to them — meaning they could come up with a story about why that's a reasonable name for Facebook to use — so they went with it.
So to be clear, they would not be able to produce exactly this name again if they wanted to. They could produce other hashes that start with "facebook" and end with pronouncable syllables, but that's not brute forcing all of the hidden service name (all 80 bits).
For those who want to explore the math more, read about the "birthday attack". And for those who want to learn more (please help!) about the improvements we'd like to make for hidden services, including stronger keys and stronger names, see hidden services need some love and Tor proposal 224.
Part four: what do we think about an https cert for a .onion address?
Facebook didn't just set up a hidden service. They also got an https certificate for their hidden service, and it's signed by Digicert so your browser will accept it. This choice has produced some feisty discussions in the CA/Browser community, which decides what kinds of names can get official certificates. That discussion is still ongoing, but here are my early thoughts on it.
In favor: we, the Internet security community, have taught people that https is necessary and http is scary. So it makes sense that users want to see the string "https" in front of them.
Against: Tor's .onion handshake basically gives you all of that for free, so by encouraging people to pay Digicert we're reinforcing the CA business model when maybe we should be continuing to demonstrate an alternative.
In favor: Actually https does give you a little bit more, in the case where the service (Facebook's webserver farm) isn't in the same location as the Tor program. Remember that there's no requirement for the webserver and the Tor process to be on the same machine, and in a complicated set-up like Facebook's they probably shouldn't be. One could argue that this last mile is inside their corporate network, so who cares if it's unencrypted, but I think the simple phrase "ssl added and removed here" will kill that argument.
Against: if one site gets a cert, it will further reinforce to users that it's "needed", and then the users will start asking other sites why they don't have one. I worry about starting a trend where you need to pay Digicert money to have a hidden service or your users think it's sketchy — especially since hidden services that value their anonymity could have a hard time getting a certificate.
One alternative would be to teach Tor Browser that https .onion addresses don't deserve a scary pop-up warning. A more thorough approach in that direction is to have a way for a hidden service to generate its own signed https cert using its onion private key, and teach Tor Browser how to verify them — basically a decentralized CA for .onion addresses, since they are self-authenticating anyway. Then you don't have to go through the nonsense of pretending to see if they could read email at the domain, and generally furthering the current CA model.
We could also imagine a pet name model where the user can tell her Tor Browser that this .onion address "is" Facebook. Or the more direct approach would be to ship a bookmark list of "known" hidden services in Tor Browser — like being our own CA, using the old-fashioned /etc/hosts model. That approach would raise the political question though of which sites we should endorse in this way.
So I haven't made up my mind yet about which direction I think this discussion should go. I'm sympathetic to "we've taught the users to check for https, so let's not confuse them", but I also worry about the slippery slope where getting a cert becomes a required step to having a reputable service. Let us know if you have other compelling arguments for or against.
Part five: what remains to be done?
In terms of both design and security, hidden services still need some love. We have plans for improved designs (see Tor proposal 224) but we don't have enough funding and developers to make it happen. We've been talking to some Facebook engineers this week about hidden service reliability and scalability, and we're excited that Facebook is thinking of putting development effort into helping improve hidden services.
And finally, speaking of teaching people about the security features of .onion sites, I wonder if "hidden services" is no longer the best phrase here. Originally we called them "location-hidden services", which was quickly shortened in practice to just "hidden services". But protecting the location of the service is just one of the security features you get. Maybe we should hold a contest to come up with a new name for these protected services? Even something like "onion services" might be better if it forces people to learn what it is.