A technical summary of the Usenix fingerprinting paper

Albert Kwon, Mashael AlSabah, and others have a paper entitled Circuit Fingerprinting Attacks: Passive Deanonymization of Tor Hidden Services at the upcoming Usenix Security symposium in a few weeks. Articles describing the paper are making the rounds currently, so I'm posting a technical summary here, along with explanations of the next research questions that would be good to answer. (I originally wrote this summary for Dan Goodin for his article at Ars Technica.) Also for context, remember that this is another research paper in the great set of literature around anonymous communication systems—you can read many more at http://freehaven.net/anonbib/.

"This is a well-written paper. I enjoyed reading it, and I'm glad the researchers are continuing to work in this space.

First, for background, run (don't walk) to Mike Perry's blog post explaining why website fingerprinting papers have historically overestimated the risks for users:
https://blog.torproject.org/blog/critique-website-traffic-fingerprinting...
and then check out Marc Juarez et al's followup paper from last year's ACM CCS that backs up many of Mike's concerns:
http://freehaven.net/anonbib/#ccs2014-critical

To recap, this new paper describes three phases. In the first phase, they hope to get lucky and end up operating the entry guard for the Tor user they're trying to target. In the second phase, the target user loads some web page using Tor, and they use a classifier to guess whether the web page was in onion-space or not. Lastly, if the first classifier said "yes it was", they use a separate classifier to guess which onion site it was.

The first big question comes in phase three: is their website fingerprinting classifier actually accurate in practice? They consider a world of 1000 front pages, but ahmia.fi and other onion-space crawlers have found millions of pages by looking beyond front pages. Their 2.9% false positive rate becomes enormous in the face of this many pages—and the result is that the vast majority of the classification guesses will be mistakes.
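
To make that concrete, here is a quick back-of-the-envelope calculation (a sketch: the 1000-page closed world and the 2.9% false positive rate come from the paper, while the assumed recall and the size of the open world are invented for illustration):

```python
# Back-of-the-envelope base-rate calculation for the phase-three classifier.
# The 2.9% false positive rate and 1000-page closed world are from the paper;
# the assumed recall and the open-world size are illustrative guesses.
false_positive_rate = 0.029    # classifier says "monitored page" when it isn't
assumed_recall = 0.88          # guessed true positive rate on monitored pages
monitored_pages = 1_000        # front pages considered in the paper
other_pages = 5_000_000        # rough guess at the rest of onion-space and the web

# Pretend page loads are spread uniformly across all pages, one load per page.
true_alarms = assumed_recall * monitored_pages
false_alarms = false_positive_rate * other_pages

precision = true_alarms / (true_alarms + false_alarms)
print(f"true alarms:  {true_alarms:,.0f}")     # ~880
print(f"false alarms: {false_alarms:,.0f}")    # ~145,000
print(f"precision:    {precision:.1%}")        # ~0.6%: most guesses are mistakes
```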

For example, if the user loads ten pages, and the classifier outputs a guess for each web page she loads, will it output a stream of "She went to Facebook!" "She went to Riseup!" "She went to Wildleaks!" while actually she was just reading posts in a Bitcoin forum the whole time? Maybe they can design a classifier that works well when faced with many more web pages, but the paper doesn't show one, and Marc Juarez's paper argues convincingly that it's hard to do.

The second big question is whether adding a few padding cells would fool their "is this a connection to an onion service" classifier. We haven't tried to hide that in the current Tor protocol, and the paper presents what looks like a great classifier. It's not surprising that their classifier basically stops working in the face of more padding though: classifiers are notoriously brittle when you change the situation on them. So the next research step is to find out if it's easy or hard to design a classifier that isn't fooled by padding.

I look forward to continued attention by the research community to work toward answers to these two questions. I think it would be especially fruitful to look also at true positive rates and false positives of both classifiers together, which might show more clearly (or not) that a small change in the first classifier has a big impact on foiling the second classifier. That is, if we can make it even a little bit more likely that the "is it an onion site" classifier guesses wrong, we could make the job of the website fingerprinting classifier much harder because it has to consider the billions of pages on the rest of the web too."

Study: What is the value of anonymous communication?

Drexel University researchers in Philadelphia, Pennsylvania are recruiting Tor users for an interview study to see how they use Tor while creating things online—how they write blog posts, edit Wikipedia articles, contribute to open source projects on GitHub, post on discussion forums, comment on news articles, Tweet, write reviews, and many other things.

The researchers want to investigate the ways in which various limits, like CAPTCHAs, or even blocking access to sites entirely, inhibit or don’t inhibit Tor users’ ability to create things online. They hope to identify times when people are forced to modify their behavior to achieve the privacy they want. They want to measure the value of anonymous participation and then begin to talk to service providers and others to optimize the participation of Tor users.

“By understanding the contributions that Tor users make, we can help make a case for the value of anonymity online,” said Associate Professor Rachel Greenstadt, an investigator on the study.

The researchers are also interested in hearing from Tor users about other impediments to their anonymous participation that they have encountered while online.

“It’s critical for online projects to support contributions from anyone eager to participate,” said Assistant Professor Andrea Forte, principal investigator.

For more information about joining the study, see: The Tor Study (http://andreaforte.net/tor.html)

Traffic correlation using netflows

People are starting to ask us about a recent tech report from Sambuddho's group about how an attacker with access to many routers around the Internet could gather the netflow logs from these routers and match up Tor flows. It's great to see more research on traffic correlation attacks, especially on attacks that don't need to see the whole flow on each side. But it's also important to realize that traffic correlation attacks are not a new area.

This blog post aims to give you some background to get you up to speed on the topic.

First, you should read the first few paragraphs of the One cell is enough to break Tor's anonymity analysis:

First, remember the basics of how Tor provides anonymity. Tor clients route their traffic over several (usually three) relays, with the goal that no single relay gets to learn both where the user is (call her Alice) and what site she's reaching (call it Bob).

The Tor design doesn't try to protect against an attacker who can see or measure both traffic going into the Tor network and also traffic coming out of the Tor network. That's because if you can see both flows, some simple statistics let you decide whether they match up.

Because we aim to let people browse the web, we can't afford the extra overhead and hours of additional delay that are used in high-latency mix networks like Mixmaster or Mixminion to slow this attack. That's why Tor's security is all about trying to decrease the chances that an adversary will end up in the right positions to see the traffic flows.

The way we generally explain it is that Tor tries to protect against traffic analysis, where an attacker tries to learn whom to investigate, but Tor can't protect against traffic confirmation (also known as end-to-end correlation), where an attacker tries to confirm a hypothesis by monitoring the right locations in the network and then doing the math.

And the math is really effective. There are simple packet counting attacks (Passive Attack Analysis for Connection-Based Anonymity Systems) and moving window averages (Timing Attacks in Low-Latency Mix-Based Systems), but the more recent stuff is downright scary, like Steven Murdoch's PET 2007 paper about achieving high confidence in a correlation attack despite seeing only 1 in 2000 packets on each side (Sampled Traffic Analysis by Internet-Exchange-Level Adversaries).
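
To give a feel for how simple the math can be, here is a toy sketch of a windowed packet-counting correlator in the spirit of those early papers (the window size and the decision threshold are arbitrary choices for illustration, not values from any of the cited work):

```python
# Toy end-to-end correlator: bucket each flow's packets into time windows and
# compare the resulting count vectors. Window size and threshold are arbitrary
# illustrative choices, not parameters from the cited papers.
from math import sqrt

def count_vector(timestamps, window=1.0):
    """Turn a list of packet timestamps (seconds) into per-window packet counts."""
    if not timestamps:
        return []
    start = min(timestamps)
    buckets = [0] * (int((max(timestamps) - start) / window) + 1)
    for t in timestamps:
        buckets[int((t - start) / window)] += 1
    return buckets

def correlation(xs, ys):
    """Pearson correlation of two count vectors, truncated to the same length."""
    n = min(len(xs), len(ys))
    if n == 0:
        return 0.0
    xs, ys = xs[:n], ys[:n]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sqrt(sum((x - mx) ** 2 for x in xs))
    vy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy) if vx and vy else 0.0

def looks_like_a_match(entry_times, exit_times, threshold=0.9):
    """Declare a match when the two sides' traffic patterns line up closely."""
    return correlation(count_vector(entry_times), count_vector(exit_times)) > threshold
```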

Second, there's some further discussion about the efficacy of traffic correlation attacks at scale in the Improving Tor's anonymity by changing guard parameters analysis:

Tariq's paper makes two simplifying assumptions when calling an attack successful [...] 2) He assumes that the end-to-end correlation attack (matching up the incoming flow to the outgoing flow) is instantaneous and perfect. [...] The second one ("how successful is the correlation attack at scale?" or maybe better, "how do the false positives in the correlation attack compare to the false negatives?") remains an open research question.

Researchers generally agree that given a handful of traffic flows, it's easy to match them up. But what about the millions of traffic flows we have now? What levels of false positives (algorithm says "match!" when it's wrong) are acceptable to this attacker? Are there some simple, not too burdensome, tricks we can do to drive up the false positives rates, even if we all agree that those tricks wouldn't work in the "just looking at a handful of flows" case?

More precisely, it's possible that correlation attacks don't scale well because as the number of Tor clients grows, the chance that the exit stream actually came from a different Tor client (not the one you're watching) grows. So the confidence in your match needs to grow along with that or your false positive rate will explode. The people who say that correlation attacks don't scale use phrases like "say your correlation attack is 99.9% accurate" when arguing it. The folks who think it does scale use phrases like "I can easily make my correlation attack arbitrarily accurate." My hope is that the reality is somewhere in between — correlation attacks in the current Tor network can probably be made plenty accurate, but perhaps with some simple design changes we can improve the situation.

The discussion of false positives is key to this new paper too: Sambuddho's paper mentions a false positive rate of 6%. That sounds like it means if you see a traffic flow at one side of the Tor network, and you have a set of 100000 flows on the other side and you're trying to find the match, then 6000 of those flows will look like a match. It's easy to see how at scale, this "base rate fallacy" problem could make the attack effectively useless.
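
Spelling that arithmetic out (a sketch: the 6% figure is the one mentioned above, and the flow count is the same illustrative 100000):

```python
# Base-rate arithmetic for the netflow correlation example above.
false_positive_rate = 0.06    # figure mentioned for the netflow attack
candidate_flows = 100_000     # flows on the far side of the network
true_matches = 1              # the single flow we are actually looking for

expected_false_matches = false_positive_rate * candidate_flows
print(f"expected false matches: {expected_false_matches:,.0f}")   # 6,000

# For a reported match to be right even half the time, the false positive
# rate would have to be on the order of 1 / candidate_flows:
print(f"needed false positive rate: {true_matches / candidate_flows:.6f}")  # 0.000010
```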

And that high false positive rate is not at all surprising, since he is trying to capture only a summary of the flows at each side and then do the correlation using only those summaries. It would be neat (in a theoretical sense) to learn that it works, but it seems to me that there's a lot of work left here in showing that it would work in practice. It also seems likely that his definition of false positive rate and my use of it above don't line up completely: it would be great if somebody here could work on reconciling them.

For a possibly related case where a series of academic research papers misunderstood the base rate fallacy and came to bad conclusions, see Mike's critique of website fingerprinting attacks plus the follow-up paper from CCS this year confirming that he's right.

I should also emphasize that whether this attack can be performed at all has to do with how much of the Internet the adversary is able to measure or control. This diversity question is a large and important one, with lots of attention already. See more discussion here.

In summary, it's great to see more research on traffic confirmation attacks, but a) traffic confirmation attacks are not a new area so don't freak out without actually reading the papers, and b) this particular one, while kind of neat, doesn't supersede all the previous papers.

(I should put in an addendum here for the people who are wondering if everything they read on the Internet in a given week is surely all tied together: we don't have any reason to think that this attack, or one like it, is related to the recent arrests of a few dozen people around the world. So far, all indications are that those arrests are best explained by bad opsec for a few of them, and then those few pointed to the others when they were questioned.)

[Edit: be sure to read Sambuddho's comment below, too. -RD]

Facebook, hidden services, and https certs

Today Facebook unveiled its hidden service that lets users access their website more safely. Users and journalists have been asking for our response; here are some points to help you understand our thinking.

Part one: yes, visiting Facebook over Tor is not a contradiction

I didn't even realize I should include this section, until I heard from a journalist today who hoped to get a quote from me about why Tor users wouldn't ever use Facebook. Putting aside the (still very important) questions of Facebook's privacy habits, their harmful real-name policies, and whether you should or shouldn't tell them anything about you, the key point here is that anonymity isn't just about hiding from your destination.

There's no reason to let your ISP know when or whether you're visiting Facebook. There's no reason for Facebook's upstream ISP, or some agency that surveils the Internet, to learn when and whether you use Facebook. And if you do choose to tell Facebook something about you, there's still no reason to let them automatically discover what city you're in today while you do it.

Also, we should remember that there are some places in the world that can't reach Facebook. Long ago I talked to a Facebook security person who told me a fun story. When he first learned about Tor, he hated and feared it because it "clearly" intended to undermine their business model of learning everything about all their users. Then suddenly Iran blocked Facebook, a good chunk of the Persian Facebook population switched over to reaching Facebook via Tor, and he became a huge Tor fan because otherwise those users would have been cut off. Other countries like China followed a similar pattern after that. This switch in his mind between "Tor as a privacy tool to let users control their own data" to "Tor as a communications tool to give users freedom to choose what sites they visit" is a great example of the diversity of uses for Tor: whatever it is you think Tor is for, I guarantee there's a person out there who uses it for something you haven't considered.

Part two: we're happy to see broader adoption of hidden services

I think it is great for Tor that Facebook has added a .onion address. There are some compelling use cases for hidden services: see for example the ones described at using Tor hidden services for good, as well as upcoming decentralized chat tools like Ricochet where every user is a hidden service, so there's no central point to tap or lean on to retain data. But we haven't really publicized these examples much, especially compared to the publicity that the "I have a website that the man wants to shut down" examples have gotten in recent years.

Hidden services provide a variety of useful security properties. First — and the one that most people think of — because the design uses Tor circuits, it's hard to discover where the service is located in the world. But second, because the address of the service is the hash of its key, they are self-authenticating: if you type in a given .onion address, your Tor client guarantees that it really is talking to the service that knows the private key that corresponds to the address. A third nice feature is that the rendezvous process provides end-to-end encryption, even when the application-level traffic is unencrypted.

So I am excited that this move by Facebook will help to continue opening people's minds about why they might want to offer a hidden service, and help other people think of further novel uses for hidden services.

Another really nice implication here is that Facebook is committing to taking its Tor users seriously. Hundreds of thousands of people have been successfully using Facebook over Tor for years, but in today's era of services like Wikipedia choosing not to accept contributions from users who care about privacy, it is refreshing and heartening to see a large website decide that it's ok for their users to want more safety.

As an addendum to that optimism, I would be really sad if Facebook added a hidden service, had a few problems with trolls, and decided that they should prevent Tor users from using their old https://www.facebook.com/ address. So we should be vigilant in helping Facebook continue to allow Tor users to reach them through either address.

Part three: their vanity address doesn't mean the world has ended

Their hidden service name is "facebookcorewwwi.onion". For a hash of a public key, that sure doesn't look random. Many people have been wondering how they brute forced the entire name.

The short answer is that for the first half of it ("facebook"), which is only 40 bits, they generated keys over and over until they got some keys whose first 40 bits of the hash matched the string they wanted.

Then they had some keys whose name started with "facebook", and they looked at the second half of each of them to pick out the ones with pronounceable and thus memorable syllables. The "corewwwi" one looked best to them — meaning they could come up with a story about why that's a reasonable name for Facebook to use — so they went with it.

So to be clear, they would not be able to produce exactly this name again if they wanted to. They could produce other hashes that start with "facebook" and end with pronounceable syllables, but that's not brute forcing all of the hidden service name (all 80 bits).
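
For the curious, here is a sketch of that key-grinding loop for the old RSA-based (v2) onion addresses, using the Python `cryptography` library. It regenerates a whole key per attempt, so it is far slower than purpose-built vanity-address tools, and an 8-character prefix like "facebook" (about 2^40 work) is not practical this way; it is only meant to show the idea.

```python
# Sketch of brute-forcing a v2 .onion vanity prefix. A v2 address is the
# base32 encoding of the first 80 bits of the SHA-1 hash of the service's
# DER-encoded RSA public key, so a prefix of n base32 characters costs about
# 32**n attempts on average.
import base64
import hashlib
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

def onion_address(public_key) -> str:
    der = public_key.public_bytes(
        encoding=serialization.Encoding.DER,
        format=serialization.PublicFormat.PKCS1,   # the raw RSAPublicKey structure
    )
    digest = hashlib.sha1(der).digest()
    return base64.b32encode(digest[:10]).decode().lower()   # first 80 bits

def grind(prefix: str):
    attempts = 0
    while True:
        attempts += 1
        key = rsa.generate_private_key(public_exponent=65537, key_size=1024)
        addr = onion_address(key.public_key())
        if addr.startswith(prefix):
            return key, addr + ".onion", attempts

# grind("fa") takes roughly 32**2 = 1024 attempts on average; each extra
# character multiplies the work by 32, so "facebook" is about 2**40 keys.
```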

For those who want to explore the math more, read about the "birthday attack". And for those who want to learn more (please help!) about the improvements we'd like to make for hidden services, including stronger keys and stronger names, see hidden services need some love and Tor proposal 224.

Part four: what do we think about an https cert for a .onion address?

Facebook didn't just set up a hidden service. They also got an https certificate for their hidden service, and it's signed by Digicert so your browser will accept it. This choice has produced some feisty discussions in the CA/Browser community, which decides what kinds of names can get official certificates. That discussion is still ongoing, but here are my early thoughts on it.

In favor: we, the Internet security community, have taught people that https is necessary and http is scary. So it makes sense that users want to see the string "https" in front of them.

Against: Tor's .onion handshake basically gives you all of that for free, so by encouraging people to pay Digicert we're reinforcing the CA business model when maybe we should be continuing to demonstrate an alternative.

In favor: Actually https does give you a little bit more, in the case where the service (Facebook's webserver farm) isn't in the same location as the Tor program. Remember that there's no requirement for the webserver and the Tor process to be on the same machine, and in a complicated set-up like Facebook's they probably shouldn't be. One could argue that this last mile is inside their corporate network, so who cares if it's unencrypted, but I think the simple phrase "ssl added and removed here" will kill that argument.

Against: if one site gets a cert, it will further reinforce to users that it's "needed", and then the users will start asking other sites why they don't have one. I worry about starting a trend where you need to pay Digicert money to have a hidden service or your users think it's sketchy — especially since hidden services that value their anonymity could have a hard time getting a certificate.

One alternative would be to teach Tor Browser that https .onion addresses don't deserve a scary pop-up warning. A more thorough approach in that direction is to have a way for a hidden service to generate its own signed https cert using its onion private key, and teach Tor Browser how to verify them — basically a decentralized CA for .onion addresses, since they are self-authenticating anyway. Then you don't have to go through the nonsense of pretending to verify that they can read email at the domain, and you avoid further entrenching the current CA model.

We could also imagine a pet name model where the user can tell her Tor Browser that this .onion address "is" Facebook. Or the more direct approach would be to ship a bookmark list of "known" hidden services in Tor Browser — like being our own CA, using the old-fashioned /etc/hosts model. That approach would raise the political question though of which sites we should endorse in this way.

So I haven't made up my mind yet about which direction I think this discussion should go. I'm sympathetic to "we've taught the users to check for https, so let's not confuse them", but I also worry about the slippery slope where getting a cert becomes a required step to having a reputable service. Let us know if you have other compelling arguments for or against.

Part five: what remains to be done?

In terms of both design and security, hidden services still need some love. We have plans for improved designs (see Tor proposal 224) but we don't have enough funding and developers to make it happen. We've been talking to some Facebook engineers this week about hidden service reliability and scalability, and we're excited that Facebook is thinking of putting development effort into helping improve hidden services.

And finally, speaking of teaching people about the security features of .onion sites, I wonder if "hidden services" is no longer the best phrase here. Originally we called them "location-hidden services", which was quickly shortened in practice to just "hidden services". But protecting the location of the service is just one of the security features you get. Maybe we should hold a contest to come up with a new name for these protected services? Even something like "onion services" might be better if it forces people to learn what it is.

A call to arms: Helping Internet services accept anonymous users

Looking for a way to help the Internet stay open and free? This topic needs some dedicated people to give it more attention — it could easily grow to as large a project as Tor itself. In the short term, OTF's Information Controls Fellowship Program has expressed interest in funding somebody to get this project going, and EFF's Eva Galperin has said she'd be happy to manage the person as an OTF fellow at EFF, with mentorship from Tor people. The first round of those proposals has a deadline in a few days, but if that timeframe doesn't work for you, this problem isn't going away: let us know and we can work with you to help you coordinate other funding.

The problem

We used to think there are two main ways that the Tor network can fail. First, legal or policy pressure can make it so nobody is willing to run a relay. Second, pressure on or from Internet Service Providers can reduce the number of places willing to host exit relays, which in turn squeezes down the anonymity that the network can provide. Both of these threats are hard to solve, but they are challenges that we've known about for a decade, and due in large part to strong ongoing collaborations we have a pretty good handle on them.

We missed a third threat to Tor's success: a growing number of websites treat users from anonymity services differently. Slashdot doesn't let you post comments over Tor, Wikipedia won't let you edit over Tor, and Google sometimes gives you a captcha when you try to search (depending on what other activity they've seen from that exit relay lately). Some sites like Yelp go further and refuse to even serve pages to Tor users.

The result is that the Internet as we know it is siloing. Each website operator works by itself to figure out how to handle anonymous users, and generally neither side is happy with the solution. The problem isn't limited to just Tor users, since these websites face basically the same issue with users from open proxies, users from AOL, users from Africa, etc.

Two recent trends make the problem more urgent. First, sites like Cloudflare, Akamai, and Disqus create bottlenecks where their components are used by many websites. This centralization impacts many websites at once when e.g. Cloudflare changes its strategy for how to handle Tor users. Second, services increasingly outsource their blacklisting, such that e.g. Skype refuses connections from IP addresses that run Tor exit relays, not because they worry about abuse via Tor (it's hard to use Skype over Tor), but because their blacklist provider has an incentive to be overbroad in its blocking. (Blacklist providers compete in part by having "the most complete" list, and in many cases it's hard for services to notice that they're losing contributions from now-missing users.)

Technical mechanisms do exist to let anonymous users interact with websites in ways that control abuse better. Simple technical approaches include "you can read but you can't post" or "you have to log in to post". More complex approaches track reputation of users and give them access to site features based on past behavior of the user rather than on past behavior of their network address. Several research designs suggest using anonymous credentials, where users anonymously receive a cryptographic credential and then prove to the website that they possess a credential that hasn't been blacklisted — without revealing their credential, so the website can't link them to their past behavior.
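
As a sketch of the "reputation follows the account, not the network address" idea, here is what such a gate might look like; the action names and thresholds are made up for illustration.

```python
# Minimal sketch of gating site features on per-account reputation instead of
# on the reputation of the connecting IP address. Thresholds are invented.
from dataclasses import dataclass

@dataclass
class Account:
    name: str
    accepted_posts: int = 0
    rejected_posts: int = 0

    @property
    def reputation(self) -> float:
        total = self.accepted_posts + self.rejected_posts
        return self.accepted_posts / total if total else 0.0

def allowed_actions(account: Account, via_anonymity_network: bool) -> set:
    actions = {"read"}                       # reading is always allowed
    if account.accepted_posts >= 1 or not via_anonymity_network:
        actions.add("post")                  # established accounts post freely
    else:
        actions.add("post_with_review")      # e.g. a moderation queue for newcomers
    if account.reputation > 0.9 and account.accepted_posts >= 20:
        actions.add("edit")                  # earned by behavior, not by address
    return actions
```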

Social mechanisms have also proven effective in some cases, ranging from community moderation (I hear Wikipedia Germany manually approves edits from users who don't have sufficiently reputable accounts), to flagging behavior from Tor users (even though you don't know *which* Tor user it is) so other participants can choose how to interact with them.

But applying these approaches to real-world websites has not gone well overall. Part of the challenge is that the success stories are not well-publicized, so each website feels like it's dealing with the question in isolation. Some sites do in fact face quite different problems, which require different solutions: Wikipedia doesn't want jerks to change the content of pages, whereas Yelp doesn't want competitors to scrape all its pages. We can also imagine that some companies, like ones that get their revenue from targeted advertising, are fundamentally uninterested in allowing anonymous users at all.

A way forward

The solution I envision is to get a person who is both technical and good at activism to focus on this topic. Step one is to enumerate the set of websites and other Internet services that handle Tor connections differently from normal connections, and look for patterns that help us identify the common (centralized) services that impact many sites. At the same time, we should make a list of solutions — technical and social — that are in use today. There are a few community-led starts on the Tor wiki already, like the DontBlockMe page and a List of Services Blocking Tor.

Step two is to sort the problem websites based on how amenable they would be to our help. Armed with the toolkit of options we found in step one, we should go to the first (most promising) site on the list and work with them to understand their problem. Ideally we can adapt one of the ideas from the toolkit; otherwise we'll need to invent and develop a new approach tailored to their situation and needs. Then we should go to the second site on the list with our (now bigger) toolkit, and so on down the list. Once we have some success stories, we can consider how to scale better, such as holding a conference where we invite the five best success cases plus the next five unsolved sites on our list.

A lot of the work will be building and maintaining social connections with engineers at the services, to help them understand what's possible and to track regressions (for example, every year or so Google gets a new engineer in charge of deciding when to give out Captchas, and they seem to have no institutional memory of how the previous one decided to handle Tor users). It might be that the centralization of Cloudflare et al can be turned around into an advantage, where making sure they have good practices will scale to help many websites at once.

EFF is the perfect organization to lead this charge, given its community connections, its campaigns like Who has your back?, and its more (at least more than Tor ;) neutral perspective on the topic. And now, when everybody is sympathetic about the topic of surveillance, is a great time to try to take back some ground. We have a wide variety of people who want to help, from scientists and research groups who would help with technical solutions if only they understood the real problems these sites face, to users and activists who can help publicize both the successful cases and the not-yet-successful cases.

Looking ahead to the future, I'm also part of an upcoming research collaboration with Dan Boneh, Andrea Forte, Rachel Greenstadt, Ryan Henry, Benjamin Mako Hill, and Dan Wallach who will look both at the technical side of the problem (building more useful ideas for the toolkit) and also the social side of the problem: how can we quantify the loss to Wikipedia, and to society at large, from turning away anonymous contributors? Wikipedians say "we have to blacklist all these IP addresses because of trolls" and "Wikipedia is rotting because nobody wants to edit it anymore" in the same breath, and we believe these points are related. The group is at the "applying for an NSF grant" stage currently, so it will be a year or more before funding appears, but I mention it because we should get somebody to get the ball rolling now, and hopefully we can expect reinforcements to appear as momentum builds.

In summary, if this call to arms catches your eye, your next steps are to think about what you most want to work on to get started, and how you would go about doing it. You can apply for an OTF fellowship, or we can probably help you find other funding sources as needed too.

Tor security advisory: "relay early" traffic confirmation attack

This advisory was posted on the tor-announce mailing list.

SUMMARY:

On July 4 2014 we found a group of relays that we assume were trying to deanonymize users. They appear to have been targeting people who operate or access Tor hidden services. The attack involved modifying Tor protocol headers to do traffic confirmation attacks.

The attacking relays joined the network on January 30 2014, and we removed them from the network on July 4. While we don't know when they started doing the attack, users who operated or accessed hidden services from early February through July 4 should assume they were affected.

Unfortunately, it's still unclear what "affected" includes. We know the attack looked for users who fetched hidden service descriptors, but the attackers likely were not able to see any application-level traffic (e.g. what pages were loaded or even whether users visited the hidden service they looked up). The attack probably also tried to learn who published hidden service descriptors, which would allow the attackers to learn the location of that hidden service. In theory the attack could also be used to link users to their destinations on normal Tor circuits too, but we found no evidence that the attackers operated any exit relays, making this attack less likely. And finally, we don't know how much data the attackers kept, and due to the way the attack was deployed (more details below), their protocol header modifications might have aided other attackers in deanonymizing users too.

Relays should upgrade to a recent Tor release (0.2.4.23 or 0.2.5.6-alpha), to close the particular protocol vulnerability the attackers used — but remember that preventing traffic confirmation in general remains an open research problem. Clients that upgrade (once new Tor Browser releases are ready) will take another step towards limiting the number of entry guards that are in a position to see their traffic, thus reducing the damage from future attacks like this one. Hidden service operators should consider changing the location of their hidden service.

THE TECHNICAL DETAILS:

We believe they used a combination of two classes of attacks: a traffic confirmation attack and a Sybil attack.

A traffic confirmation attack is possible when the attacker controls or observes the relays on both ends of a Tor circuit and then compares traffic timing, volume, or other characteristics to conclude that the two relays are indeed on the same circuit. If the first relay in the circuit (called the "entry guard") knows the IP address of the user, and the last relay in the circuit knows the resource or destination she is accessing, then together they can deanonymize her. You can read more about traffic confirmation attacks, including pointers to many research papers, at this blog post from 2009:
https://blog.torproject.org/blog/one-cell-enough

The particular confirmation attack they used was an active attack where the relay on one end injects a signal into the Tor protocol headers, and then the relay on the other end reads the signal. These attacking relays were stable enough to get the HSDir ("suitable for hidden service directory") and Guard ("suitable for being an entry guard") consensus flags. Then they injected the signal whenever they were used as a hidden service directory, and looked for an injected signal whenever they were used as an entry guard.

The way they injected the signal was by sending sequences of "relay" vs "relay early" commands down the circuit, to encode the message they want to send. For background, Tor has two types of cells: link cells, which are intended for the adjacent relay in the circuit, and relay cells, which are passed to the other end of the circuit. In 2008 we added a new kind of relay cell, called a "relay early" cell, which is used to prevent people from building very long paths in the Tor network. (Very long paths can be used to induce congestion and aid in breaking anonymity). But the fix for infinite-length paths introduced a problem with accessing hidden services, and one of the side effects of our fix for bug 1038 was that while we limit the number of outbound (away from the client) "relay early" cells on a circuit, we don't limit the number of inbound (towards the client) relay early cells.

So in summary, when Tor clients contacted an attacking relay in its role as a Hidden Service Directory to publish or retrieve a hidden service descriptor (steps 2 and 3 on the hidden service protocol diagrams), that relay would send the hidden service name (encoded as a pattern of relay and relay-early cells) back down the circuit. Other attacking relays, when they get chosen for the first hop of a circuit, would look for inbound relay-early cells (since nobody else sends them) and would thus learn which clients requested information about a hidden service.
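
To make the idea of "encoding a message as a pattern of cells" concrete, here is a sketch; the bit-per-cell framing is invented for illustration and is not the attackers' exact encoding.

```python
# Illustrative encoding of a short string as a sequence of RELAY vs RELAY_EARLY
# cell commands sent down a circuit, plus the decoder a colluding entry guard
# could run. The real attack's framing is not reproduced here; this is the idea.
RELAY, RELAY_EARLY = "relay", "relay_early"

def encode(message: str) -> list:
    bits = "".join(f"{byte:08b}" for byte in message.encode())
    return [RELAY_EARLY if bit == "1" else RELAY for bit in bits]

def decode(cells: list) -> str:
    bits = "".join("1" if cell == RELAY_EARLY else "0" for cell in cells)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(errors="replace")

signal = encode("exampleonionname")          # a hypothetical 16-character onion name
assert decode(signal) == "exampleonionname"
# The cell command is visible to every relay on the circuit (it is not part of
# the onion-encrypted relay payload), which is what makes this tag readable at
# the entry guard.
```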

There are three important points about this attack:

A) The attacker encoded the name of the hidden service in the injected signal (as opposed to, say, sending a random number and keeping a local list mapping random number to hidden service name). The encoded signal is encrypted as it is sent over the TLS channel between relays. However, this signal would be easy to read and interpret by anybody who runs a relay and receives the encoded traffic. And we might also worry about a global adversary (e.g. a large intelligence agency) that records Internet traffic at the entry guards and then tries to break Tor's link encryption. The way this attack was performed weakens Tor's anonymity against these other potential attackers too — either while it was happening or after the fact if they have traffic logs. So if the attack was a research project (i.e. not intentionally malicious), it was deployed in an irresponsible way because it puts users at risk indefinitely into the future.

(This concern is in addition to the general issue that it's probably unwise from a legal perspective for researchers to attack real users by modifying their traffic on one end and wiretapping it on the other. Tools like Shadow are great for testing Tor research ideas out in the lab.)

B) This protocol header signal injection attack is actually pretty neat from a research perspective, in that it's a bit different from previous tagging attacks which targeted the application-level payload. Previous tagging attacks modified the payload at the entry guard, and then looked for a modified payload at the exit relay (which can see the decrypted payload). Those attacks don't work in the other direction (from the exit relay back towards the client), because the payload is still encrypted at the entry guard. But because this new approach modifies ("tags") the cell headers rather than the payload, every relay in the path can see the tag.

C) We should remind readers that while this particular variant of the traffic confirmation attack allows high-confidence and efficient correlation, the general class of passive (statistical) traffic confirmation attacks remains unsolved and would likely have worked just fine here. So the good news is traffic confirmation attacks aren't new or surprising, but the bad news is that they still work. See https://blog.torproject.org/blog/one-cell-enough for more discussion.

Then the second class of attack they used, in conjunction with their traffic confirmation attack, was a standard Sybil attack — they signed up around 115 fast non-exit relays, all running on 50.7.0.0/16 or 204.45.0.0/16. Together these relays summed to about 6.4% of the Guard capacity in the network. Then, in part because of our current guard rotation parameters, these relays became entry guards for a significant chunk of users over their five months of operation.
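
A rough way to see why 6.4% of guard capacity over five months matters (a sketch: guard selection is bandwidth-weighted and rotation is more subtle than independent draws, so the numbers are only meant to show the shape of the risk):

```python
# Rough estimate of the chance that at least one of a client's entry guards was
# malicious, treating each guard pick as an independent draw that lands on the
# attacker with probability 0.064. Real guard selection and rotation are more
# complicated; this only illustrates how the risk accumulates.
malicious_guard_fraction = 0.064

def p_compromised(guard_picks: int) -> float:
    return 1 - (1 - malicious_guard_fraction) ** guard_picks

# With three guards rotated every month or two, a client might make on the
# order of 6-12 guard picks over five months:
for picks in (3, 6, 12):
    print(f"{picks:>2} picks -> {p_compromised(picks):.0%} chance of a bad guard")
# 3 -> ~18%, 6 -> ~33%, 12 -> ~55%
```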

We actually noticed these relays when they joined the network, since the DocTor scanner reported them. We considered the set of new relays at the time, and made a decision that it wasn't that large a fraction of the network. It's clear there's room for improvement in terms of how to let the Tor network grow while also ensuring we maintain social connections with the operators of all large groups of relays. (In general having a widely diverse set of relay locations and relay operators, yet not allowing any bad relays in, seems like a hard problem; on the other hand our detection scripts did notice them in this case, so there's hope for a better solution here.)

In response, we've taken the following short-term steps:

1) Removed the attacking relays from the network.

2) Put out a software update for relays to prevent "relay early" cells from being used this way.

3) Put out a software update that will (once enough clients have upgraded) let us tell clients to move to using one entry guard rather than three, to reduce exposure to relays over time.

4) Clients can tell whether they've received a relay or relay-early cell. For expert users, the new Tor version warns you in your logs if a relay on your path injects any relay-early cells: look for the phrase "Received an inbound RELAY_EARLY cell".

The following longer-term research areas remain:

5) Further growing the Tor network and diversity of relay operators, which will reduce the impact from an adversary of a given size.

6) Exploring better mechanisms, e.g. social connections, to limit the impact from a malicious set of relays. We've also formed a group to pay more attention to suspicious relays in the network:
https://blog.torproject.org/blog/how-report-bad-relays

7) Further reducing exposure to guards over time, perhaps by extending the guard rotation lifetime:
https://blog.torproject.org/blog/lifecycle-of-a-new-relay
https://blog.torproject.org/blog/improving-tors-anonymity-changing-guard...

8) Better understanding statistical traffic correlation attacks and whether padding or other approaches can mitigate them.

9) Improving the hidden service design, including making it harder for relays serving as hidden service directory points to learn what hidden service address they're handling:
https://blog.torproject.org/blog/hidden-services-need-some-love

OPEN QUESTIONS:

Q1) Was this the Black Hat 2014 talk that got canceled recently?
Q2) Did we find all the malicious relays?
Q3) Did the malicious relays inject the signal at any points besides the HSDir position?
Q4) What data did the attackers keep, and are they going to destroy it? How have they protected the data (if any) while storing it?

Great questions. We spent several months trying to extract information from the researchers who were going to give the Black Hat talk, and eventually we did get some hints from them about how "relay early" cells could be used for traffic confirmation attacks, which is how we started looking for the attacks in the wild. They haven't answered our emails lately, so we don't know for sure, but it seems likely that the answer to Q1 is "yes". In fact, we hope they *were* the ones doing the attacks, since otherwise it means somebody else was. We don't yet know the answers to Q2, Q3, or Q4.

Tor incentives research roundup: GoldStar, PAR, BRAIDS, LIRA, TEARS, and TorCoin

There has been a considerable amount of work in the area of Tor incentives since the last post on the topic in 2009 (over 5 years ago!). This post will give an overview of some of the major new approaches, including two new designs that I co-authored that will appear at the Workshop on Hot Topics in Privacy Enhancing Technologies later this week. Before getting to those, I'll give background on Tor and discuss some of the social issues to help form a foundation of understanding.

Tor’s volunteer resource model

The Tor network consists of a set of relays that provide bandwidth and other resources to forward traffic for Tor users. Anyone in any part of the world can contribute to Tor by downloading the Tor software and configuring it to operate in relay mode. In fact, the Tor network is currently composed exclusively of such voluntarily-operated relays.

There are many reasons existing relay operators might contribute to Tor. They may contribute because they really want a privacy-enhancing communication tool like Tor to exist, they believe in Tor’s philosophy of open source software and operational transparency, and they want Tor to succeed at its goals. They may wish to be part of the Tor community and the social movement to provide privacy-enhanced communication that connects people in all parts of the world. Tor may be practically useful to them for communication with others or to retrieve content that may otherwise be unavailable to them due to censorship or other network interference. Or they may be technically interested in the Tor software or network and the associated research problems raised by large, distributed systems. Finally, some may be interested for adversarial reasons, e.g., collecting information on the uses of the network or the content being transferred. All of these reasons provide intrinsic motivation for operators to contribute - they don’t expect any direct compensation for their contributions.

The volunteer approach has succeeded so far: Tor now consists of over 5000 relays transferring between 4 and 5 GiB/s in aggregate. For the most part, this is "organic" growth obtained through community outreach, where volunteers first have personal contact with existing community members and become inspired to help.

Whatever their reason for volunteering, relay operators not only contribute resources from the physical machines upon which their relays are executed, but also contribute the time involved with configuring, updating, and adjusting their relays as both the software and network mature. In many cases, operators also contribute monetarily through the direct purchase of dedicated machines or special fast connections to Internet Service Providers (ISPs), and exit relay operators spend social energy on maintaining relationships with their ISP (to make sure the ISP understands that requests coming from their machines are from Tor).

However, because many people that are otherwise excited about or interested in Tor are incapable or unwilling to incur these expenses, there are far fewer Tor relays than Tor clients (users that use Tor to access content, e.g. via the Tor Browser Bundle, without directly contributing resources to the network). Expanding the set of relays could have many benefits: Tor could become faster, more reliable, and more secure while at the same time distributing trust among a larger and more diverse set of distributed peers. A larger network that is transferring a larger quantity and more diverse types of data will, in general, make it more difficult for an adversary to determine who is talking to whom. This means a safer Tor for everyone.

Incentives and motivation... say whuuuht?

There are many social and technical challenges involved with expanding the set of Tor relays. We focus here on the social issues surrounding how to encourage more people to run a relay in the first place by providing incentives (i.e., rewards) for relay operators, and later focus on how to do this without leaking information or otherwise hurting Tor’s anonymity.

An important social issue to consider is that of maintaining the existing community of operators and how rewards could harm it. The presumption here is that existing relay operators are motivated to run relays for their own reasons - the act of contributing to Tor itself has intrinsic value to them. Then the question is: what would happen if we start rewarding operators through the use of an incentive system?

The answer isn’t clear, and there are at least a couple of forks in the cognitive road with many more questions along the way. First, what would happen to the existing operators who used to feel good about volunteering but are now compensated with a reward that may have extrinsic value? Although the new reward may provide some extrinsic value, will it lower their original intrinsic value enough to cause them to lose their motivation to contribute? Will they continue their contributions but begin to expect the reward, such that if the reward is removed later they would no longer be motivated even though originally (before the reward) they were happy to contribute? Second, is the new group of operators attracted to Tor by the reward somehow less desirable than the existing group, because they don’t care as much about Tor or are more likely to stop contributions and leave the system more quickly than the existing volunteers? Does the fact that they have higher extrinsic motivation than intrinsic motivation make them better or worse operators? Will they be less likely to incur the costs of saying "no" when people ask them to log or turn over traffic? Will they in fact be willing to sell logs if it increases their rewards? Third, how will the new group of operators affect the existing group? If their intrinsic motivation was low enough before the reward that they didn’t contribute, but high enough with the reward that they do contribute, are these new contributors going to somehow shift the "community spirit" into something that no longer has the value that it once had? Will this "crowd out" the intrinsically motivated individuals and cause existing operators to leave (a widely acknowledged theory)?

The answers to these questions are not clear, though speculations abound. One response is that we will never know what will happen until we try it, but if we try it we may not like the result. Another is that if we try it and it succeeds, great; if it starts going badly, we can adapt and adjust.

Researchers and other Tor community members have been interested in the Tor incentive problem because a well-designed incentive scheme has the potential to greatly improve the network (we will discuss several technical designs below). However, because of the many open social questions, it has been challenging for the Tor community to come to a consensus on the path forward. On one hand and as discussed above, many people already run Tor relays and provide valuable service, and the network and bandwidth graphs imply some amount of success in network growth. It would be tragic if the introduction of an experimental reward scheme drove away the existing participants. On the other hand, the number of Tor clients still vastly exceeds the number of relays, and Tor’s bandwidth capacity remains the limiting factor for Tor’s growth.

A simplified model of reasoning

If we consider that relay operators with a higher intrinsic motivation to contribute are somehow more desirable to the network and the community because they care more deeply about Tor and the social movement, then we can consider there to be a trade-off between reward value and the desirability of the operator. The lower the extrinsic reward value is, the higher the intrinsic value a potential new operator must possess in order for the total value to be high enough to motivate her to contribute. The higher the extrinsic reward value is, the lower the intrinsic value may be to still attract the new operator. Under this model, as the value of the reward increases, the number of individuals willing to contribute also increases while their "quality" decreases.
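
One way to picture this trade-off is a toy simulation (the distribution of intrinsic motivation and the contribution threshold are entirely made up; the point is only the shape of the curve, not the numbers):

```python
# Toy version of the simplified model: a person contributes when
# intrinsic_value + reward_value >= threshold. The exponential distribution of
# intrinsic motivation and the threshold are invented purely for illustration.
import random

random.seed(0)
population = [random.expovariate(1.0) for _ in range(100_000)]  # intrinsic values
THRESHOLD = 3.0   # "cost" of running a relay, in the same made-up units

def contributors(reward: float):
    return [v for v in population if v + reward >= THRESHOLD]

for reward in (0.0, 1.0, 2.0):
    group = contributors(reward)
    avg_intrinsic = sum(group) / len(group)
    print(f"reward={reward}: {len(group):>6} contributors, "
          f"average intrinsic motivation {avg_intrinsic:.2f}")
# As the reward grows, the contributor pool gets bigger while its average
# intrinsic motivation falls.
```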

Please note that this is a very simplified model of reasoning; it does not necessarily reflect reality, and not everyone will agree with it. There are many other values in play here as well; for example, motivations change over time as the world changes, and incentive schemes that require significant protocol modifications are more difficult to implement, test, and deploy. Nonetheless, this simplified model may help sort out the issues, and it seems to be consistent with the fact that the Tor community has thus far preferred to grow the network through intrinsically motivated relay operators.

Social rewards

Tor’s volunteer approach has generally been on the conservative side of our simplified model, where individuals with higher intrinsic motivations to contribute to Tor are preferred. New operators have been recruited through social and community-oriented efforts such as explaining how great Tor is and why they should care. This works well for people who are already intrinsically motivated about Tor, such as this guy who tattooed a Tor onion on his arm.

Other relay recruitment approaches for intrinsically motivated individuals include Torservers.net and the Noisebridge Tor Exit Node Project: both operate as independent nonprofit organizations for those people who would like to contribute to the Tor network but for one reason or another are not in a position to operate relays themselves. As discussed in a 2011 post, it is preferable that people who can run their own fast exit relays do so. But the approach of turning donations into fast exit capacity allows those who cannot run their own relays to still contribute to improving the performance and exit capacity of the Tor network.

The more recent EFF Tor Challenge is somewhere on the same side of the spectrum, except it actually offers small rewards as incentives. EFF explains Tor’s importance and users are encouraged to contribute to the network by running a relay. The challenge offers prizes to relay operators as an incentive: name and twitter handle published in a list of contributors, a limited edition sticker for running a relay for 12 months, and a t-shirt if that relay is fast (bandwidth of 1 MB/s or larger after 12 months).

Other social rewards are possible here as well. A post on the tor-dev mailing list proposes that Tor host a simple profile page for each relay that will list and celebrate the relay’s total bandwidth contribution over time, and recognize the operator for running special Tor bridge or exit relays. Relatively simple ideas like this may have the potential to attract intrinsically-motivated individuals without requiring extrinsic rewards or a lot of changes to Tor itself.

The approaches in this category are relatively low-risk/low-reward schemes: even though the new operators are receiving recognition or a reward for running a relay, the reward is of such little value that there is little risk that existing contributors who may not have been rewarded will leave. At the same time, a new operator loses little if they decide to shut down the new relay.

On the other side of the spectrum, researchers have explored more "radical" approaches that do not require individuals with high intrinsic motivation (though they are welcome, too!) but do require changes to Tor’s design. Researchers explore these designs because they are generally more technically interesting and have the potential to produce a much larger set of relays. We now shift to discussing these latest research proposals.

Review of early research on Tor incentive system designs - Gold Star and PAR

The 2009 post discussed two incentive papers that were new at the time: Payment for Anonymous Routing from PETS 2008 (PAR) and Building Incentives into Tor from FC 2010 (the "Gold Star" scheme).

In the Gold Star scheme, the Tor directory authorities measure the bandwidth of Tor relays and assign a special flag (a "gold star") to the fastest 7/8 of relays. Then whenever those relays send traffic through Tor, they choose the other fast gold star relays to form a fast gold star path. Relays then prioritize traffic on these gold star paths. The incentive here is that if you run a relay, you will get faster service when using Tor as a client. The main drawback to this approach is that only relays are able to get gold stars and priority service. This means that all relays that are part of a gold star path know for certain that the initiator of that traffic is someone from the small list of gold star relays. Because the set of gold star relays would be smaller than the set of all Tor users by several orders of magnitude, the anonymity for gold star relays would be significantly harmed.

In PAR, all users (not just relays) are able to be part of the incentive system. PAR has an honest-but-curious centralized entity called a "bank" that manages digital tokens called "coins". Users can purchase coins from the bank and then pay the relays while using Tor; the relays then deposit the coins back into the bank and the bank checks to make sure the coin is valid and that it has never been spent before (i.e., it has not been "double spent"). The main novel idea explored in this work is how to include digital payments into each Tor circuit in a way that prevents the relays from learning the client’s identity from the payment itself.

Adding real money into Tor opens a host of new legal and technical questions, which Roger already briefly discussed. For example, real money might shift Tor into a different legal category (see e.g. the EU discussions of what is a "service provider" and which ones are obliged to do data retention), or change the liability situation for relay operators.

The main design challenge we learned from the PAR design is that the timing of when the client withdraws the coins and the relays deposit them creates a trade-off between the ability to detect double spending and link a client to its coins. If the client withdraws some coins and a few seconds later a relay starts depositing coins, how much information does the bank gain? If the relay waits for some time interval to deposit the coins to hide this timing information, then it becomes possible for the client to double spend those coins during that interval. The longer the relay waits before depositing, the harder it is for the bank to link the coins but the easier it is for the client to cheat. A rational relay will deposit immediately to secure its payment, which leads to the worst situation for anonymity. If we assume the relays will deposit immediately, then it is up to the client to hold coins for some random amount of time after purchasing them from the bank, to disrupt potential attempts to link withdrawals to deposits at the cost of flexibility in usage.
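
A sketch of the linkage the bank's vantage point creates (the timestamps and window sizes are invented; this is only the timing intuition, not PAR's actual protocol):

```python
# Toy illustration of the PAR trade-off: the bank sees when each client withdrew
# coins and when each relay deposits them. If relays deposit immediately, a
# deposit shortly after a withdrawal strongly suggests which client spent the
# coin. All numbers are invented for illustration.
withdrawals = {            # client -> time (seconds) the bank issued their coins
    "client_a": 100.0,
    "client_b": 40.0,
    "client_c": 85.0,
}

def suspects(deposit_time: float, max_delay: float):
    """Clients whose withdrawal falls within max_delay before this deposit."""
    return [c for c, t in withdrawals.items() if 0 <= deposit_time - t <= max_delay]

print(suspects(deposit_time=103.0, max_delay=10.0))    # ['client_a']: linkable
print(suspects(deposit_time=103.0, max_delay=100.0))   # all three: unlinkable,
# but a relay that waits up to 100 seconds to deposit also gives the client
# up to 100 seconds in which a double-spent coin goes undetected.
```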

In review, both of these proposals have anonymity problems: the Gold Star scheme because the assignment of rewards to relays is public and identifies traffic as originating from the relatively small set of clients of fast relay operators; and PAR because the timing of the withdrawal by the client and the deposit by the relay may leak information about the client’s traffic. In PAR, coins may be held by the client and the relay longer to disrupt this timing information, but this trades off flexibility in usage and the speed at which double spending can be detected. So what have these papers taught us? We have learned about some requirements that a Tor incentive scheme should probably fulfill: both clients and relays should be able to receive the incentive so that neither group stands out; and the timing of payments should not allow cheating and should not enable linkability.

Preventing double-spending without leaking timing information - BRAIDS

Recruiting New Tor Relays with BRAIDS was the followup research proposal presented at CCS in 2010. One of our goals in BRAIDS was to eliminate the trade-off between double-spending and linkability. To achieve this, we designed a new type of digital token which we called a "relay-specific ticket". Tickets were still issued by a trusted, centralized entity - the ticketmaster (we called it a "bank" in the paper, but BRAIDS was not designed to handle real money). Clients choose which relays they want to use, and receive tickets from the ticketmaster that are valid only at the chosen relays while hiding the chosen relay information from the ticketmaster (using partially-blind signatures). Clients then form a Tor circuit with the chosen relays and send the tickets to receive traffic priority (using a differentiated services scheduler). Each ticket will then provide traffic priority through the chosen relay for a specific number of bytes. Because each ticket a relay receives is only valid at that relay and no other, the relay could prevent double spending locally without contacting the ticketmaster.
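
Here is a sketch of the local double-spend check that relay-specific tickets make possible; the Ticket fields and the signature check are placeholders, not BRAIDS' actual partially-blind-signature construction.

```python
# Sketch of why relay-specific tickets let a relay catch double spending with
# purely local state. verify_signature() stands in for checking the
# ticketmaster's partially-blind signature; it is a placeholder, not the scheme.
from dataclasses import dataclass

@dataclass(frozen=True)
class Ticket:
    serial: str          # unique per ticket
    relay_id: str        # the only relay this ticket is valid at
    priority_bytes: int  # how many bytes of prioritized traffic it buys

def verify_signature(ticket: Ticket) -> bool:
    return True          # placeholder for the real cryptographic check

class Relay:
    def __init__(self, relay_id: str):
        self.relay_id = relay_id
        self.seen_serials = set()     # local state only; no round-trip to a bank

    def accept(self, ticket: Ticket) -> bool:
        if ticket.relay_id != self.relay_id:
            return False              # ticket was issued for some other relay
        if not verify_signature(ticket):
            return False              # not issued by the ticketmaster
        if ticket.serial in self.seen_serials:
            return False              # double spend, caught immediately
        self.seen_serials.add(ticket.serial)
        return True                   # schedule this circuit's bytes with priority
```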

Another feature of BRAIDS is that tickets are distributed freely in small amounts to any client that asks, but only to one client per IP address, and only once per distribution round. If a client changes its mind about which relays it wants to use for a circuit, it may contact the ticketmaster to exchange its old tickets for new ones, following a time schedule. The exchange process is also used by relays to turn the tickets they received from clients into new, usable tickets.

One main drawback of BRAIDS is that even though all users can get a small number of tickets for free, relays can accumulate a much larger stash, because they receive the free tickets AND the tickets sent to them by other clients. This means that relays stand out when using tickets to download large flows, because it is unlikely that a normal client would have been able to afford it. Another major drawback is that the exchange process is somewhat inefficient. Relays must exchange received tickets for new ones which they can use themselves, and clients that didn't spend their tickets (e.g., because their Tor usage was low or their chosen relays became unavailable) must exchange them or lose them. This leads to the ticketmaster exchanging all system tickets over every exchange interval. (The ticket validity interval is split into [spend, relay exchange, client exchange], so clients that don't spend in the "spend" time-frame must wait until the "client exchange" time-frame to get new tickets. Increasing the interval lengths makes the system slightly less flexible to rapid changes, and the longer the intervals, the more tickets "pile up" for later processing by the ticketmaster.)

BRAIDS showed us the power of relay-specific tickets but unveiled the scalability problems associated with a trusted, centralized entity.

Improving scalability - LIRA

LIRA: Lightweight Incentivized Routing for Anonymity was published at NDSS 2013. LIRA still uses a centralized entity to manage the incentives, but it only requires incentive management for the relays (thousands) instead of for all system users (millions) as in BRAIDS. We achieve this through a lottery system: clients simply guess a random number for every circuit and receive priority on that circuit with tunable probability (traffic is prioritized using a differentiated services scheduler, as in BRAIDS). The lottery is set up with special cryptographic magic such that relays are rewarded with guaranteed winning guesses, allotted according to the amount of bandwidth they contribute.

LIRA is more efficient and scalable than any earlier scheme. However, LIRA’s main drawback is that probabilistic guessing reduces flexibility for clients wanting to receive continuous priority over time, and creates a potential for cheating the system because it enables clients to continuously create new circuits and new guesses until a correct guess is found. Another problem is that a secure bandwidth measurement scheme is required to ensure that relays don’t receive rewards without actually contributing to Tor (this wasn't necessary in BRAIDS because the clients sent rewards (tickets) directly to the relays); secure bandwidth measurement is still an open research problem. Finally, LIRA still relies on a trusted central entity to manage the lottery.
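A quick back-of-envelope (my own, with an arbitrary example probability) shows why per-circuit guessing invites the retry behavior described above: if each guess wins with probability p, a cheating client needs only about 1/p circuit attempts before its first win.

    # Back-of-envelope for the "keep guessing" problem in a per-circuit lottery.
    # If each circuit's guess wins independently with probability p, the number
    # of circuits a cheating client builds before its first win is geometric
    # with mean 1/p.

    def p_win_within(p, k):
        """Probability of at least one winning guess in k circuit attempts."""
        return 1 - (1 - p) ** k

    p = 0.05                       # example winning probability (tunable in LIRA)
    print("expected circuits until a win:", 1 / p)
    for k in (10, 50, 100):
        print("P(win within %3d circuits) = %.2f" % (k, p_win_within(p, k)))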

Reducing reliance on trusted parties - TEARS

From Onions to Shallots: Rewarding Tor Relays with TEARS will be presented at HotPETs later this week. The main goal in TEARS is to remove the reliance on a central entity to manage the incentives. The central entities in the above schemes (let’s generalize them as "banks") are referred to as "semi-trusted", because there are several ways a malicious bank could misbehave - for example, the bank could refuse service, could extort users by demanding extra fees in order to process their requests, or could "inflate" the digital currency (coins, tickets, guesses, etc.) by printing its own tokens for a profit.

TEARS draws upon the decentralized Bitcoin design to address these concerns by using a *publicly auditable* e-cash protocol that prevents the bank from misbehaving and getting away with it. The bank in TEARS consists of a group of semi-trusted servers, such as the Tor directory servers (as opposed to Bitcoin’s distributed proof-of-work lottery), only a quorum of which need to function correctly for the overall system to keep working. The e-cash cryptography used here is publicly auditable and every interaction with the bank is conducted over a public communication channel (such as the Bitcoin blockchain itself). The security guarantee is that any form of misbehavior on the part of the bank servers leaves a trail of evidence that can be discovered by anyone watching. This approach is reminiscent of systems like Certificate Transparency and OpenTransactions.

In addition to a decentralized bank, TEARS offers a new two-level token architecture to facilitate rewarding relays for their bandwidth contributions without hurting anonymity. First, a decentralized process audits relays' bandwidth and informs the bank of the results. The bank mints new "shallots" (anonymous, auditable e-cash) for each relay based on its contribution and deposits them into the relay's account on its behalf. Separately, the bank's monetary policy may allow it to mint new shallots and distribute them to users, e.g. using mechanisms suggested in BRAIDS but more commonly known to Bitcoin users as faucets. Second, shallots are transferable among users and may be redeemed for relay-specific "PriorityPasses", which are then used to request traffic priority from Tor relays (again, as in BRAIDS). PriorityPasses are relay-specific so that relays can immediately and locally prevent double spending without leaking information to any other entity, a feature also present in BRAIDS' tickets. However, a novel feature of PriorityPasses is that they are non-transferable and become useless after being spent at a relay -- this reduces the overhead associated with exchanges, and ensures that the process of requesting traffic priority does not harm anonymity, because the act of redeeming a shallot for a PriorityPass is unlinkable to any later transaction. There is a question of how many PriorityPasses can be spent on one circuit before it becomes suspicious that a client has so many, so the size of the faucets and how they distribute shallots will play a key role in anonymity. Anonymity is also tied to how relays decide to distribute their shallots to clients, either via a faucet or through a third-party market.
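Here is a rough sketch (mine, with the auditable e-cash, the bank quorum, and the bandwidth audit all stubbed out, and with hypothetical names) of the two-level flow: a transferable shallot is redeemed for a relay-bound PriorityPass, and the relay tracks spent passes locally.

    # Rough sketch of the TEARS two-level token flow; all cryptography and the
    # distributed bank are stubbed out. A shallot is transferable e-cash;
    # redeeming it yields a PriorityPass bound to one relay, which that relay
    # marks spent on its own.

    import uuid

    class Bank:
        def __init__(self):
            self.valid_shallots = set()

        def mint_shallot(self):                    # e.g. a bandwidth reward or faucet payout
            s = uuid.uuid4().hex
            self.valid_shallots.add(s)
            return s

        def redeem_for_pass(self, shallot, relay_id):
            if shallot not in self.valid_shallots:
                return None
            self.valid_shallots.remove(shallot)    # the shallot is consumed
            return {"relay": relay_id, "serial": uuid.uuid4().hex}

    class Relay:
        def __init__(self, relay_id):
            self.relay_id = relay_id
            self.spent = set()

        def grant_priority(self, pp):
            ok = (pp is not None and pp["relay"] == self.relay_id
                  and pp["serial"] not in self.spent)
            if ok:
                self.spent.add(pp["serial"])       # non-transferable, single use
            return ok

    bank, relay = Bank(), Relay("relayB")
    shallot = bank.mint_shallot()
    priority_pass = bank.redeem_for_pass(shallot, "relayB")
    print(relay.grant_priority(priority_pass))     # True
    print(relay.grant_priority(priority_pass))     # False: already spent at this relay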

TEARS was designed to operate inside the existing Tor network and thus does not significantly change the fundamentals of Tor’s design. The decentralized bank and bandwidth measurement components do not alter the way clients choose circuits. Clients and relays that want to use or support TEARS, however, would need to support new protocols for interacting with the bank and logic for handling shallots, PriorityPasses, and traffic priority.

TEARS still relies on a "bandwidth measuring" component that can accurately and robustly determine when a relay has indeed contributed useful bandwidth. While the e-cash system in TEARS is designed to be publicly auditable, the existing mechanisms for measuring bandwidth still require trusted authorities to probe the relays.

Bandwidth measurement - TorCoin

A TorPath to TorCoin - Proof-of-Bandwidth Altcoins for Compensating Relays is the other paper to be presented at HotPETs this week. TorCoin addresses the bandwidth measurement problem with a different approach -- an altcoin (a Bitcoin alternative) based on a novel "proof-of-bandwidth" (rather than proof-of-work) mechanism called TorPath, in which the relays (and endpoints) of a circuit effectively mine for new coins whenever they successfully transfer a batch of packets. In TorCoin, a group of "assignment" authorities are responsible for generating a list of circuits (using a shuffle protocol) and assigning them to clients. Bandwidth proofs are then constructed as the circuit is used such that the client mines the TorCoin and then transfers part of it to each of the circuit’s participants.

Like TEARS, TorCoin distributes the process of rewarding relays for their bandwidth contributions. Bandwidth measurement is done directly as part of the distributed mining process and provides strong guarantees. Also, by utilizing a group of assignment authorities that may have more information about the system or underlying network, there is a lot of potential for generating more secure paths for clients than clients are able to generate for themselves.

TorCoin still has some issues to work out; it may be possible to fix some of the smaller issues with protocol modifications, but some of the larger issues don’t have obvious solutions.

One drawback of TorCoin is that it requires the group of collectively-trusted assignment authorities (you have to trust only that some threshold/quorum of them are correct) to generate and assign circuits to clients. This is a similar trust model to the current Tor directory authorities. In practice, the assignment authorities create availability issues: if a majority of them are unreachable, e.g. due to DoS or censorship, then the system is unusable for clients, because they won't be able to generate circuits. This is somewhat true of the directory authorities too; however, directory information can be signed and then mirrored and distributed by other relays in the system, whereas assignment authorities must always be online and available to new clients. TorCoin clients contact the assignment authorities in order to build new circuits, whereas in Tor they can build as many circuits as they need once the directory information is retrieved (from the directory authorities or from directory mirrors).

TorCoin as written has significant security issues. Because relay assignment is not based on bandwidth, it is easier for an adversary to get into the first and last position on a circuit and break anonymity. This can be done through a sybil attack by adding an arbitrary number of bad relays and joining them to the network without actually providing bandwidth. Because the protocol reveals which ephemeral keys are attached to the assigned circuits, an adversary can confirm when it has compromised a circuit (has malicious nodes in the correct positions) without needing to do any statistical correlation attack (it can match up the ephemeral keys assigned to its malicious relays to the ones posted in the circuit assignment list).

The formation of relay/client groups is not discussed and is similarly vulnerable to sybil attacks, where the adversary can completely subvert the coin mining process. This can be done by registering an arbitrary number of relays and clients with the assignment servers, such that a large majority of the circuits created by the assignment process will contain malicious relays in all positions and be assigned to a malicious client. This malicious collective of nodes can then "pretend" to send bytes through the circuit without actually doing so, and gain an advantage when mining coins. The paper suggests using a persistent guard, which means the adversary only needs malicious relays in the middle and exit positions of its sybil clients' circuits, greatly increasing the probability of a full compromise (or requiring far fewer nodes to achieve the same probability of compromise as without persistent guards): the sybil clients only have to get 2 of the adversary's relays into the circuit instead of 3, raising the probability from f^3 to f^2 for malicious fraction f. Further, even if some relays on a circuit are honest, it is not rational for them to refuse to sign proofs of bandwidth that have been exaggerated (set too high) by other relays. It will only benefit a relay to ignore proof-of-bandwidth checks, giving it an advantage over completely honest nodes in the TorCoin mining process.
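To put numbers on the f^3 versus f^2 point, here is a tiny calculation (the fractions are illustrative assumptions, not figures from the paper):

    # With persistent guards, the adversary's sybil-client circuits only need
    # malicious relays in two of the three positions, so the per-circuit
    # probability of a fully malicious circuit rises from f^3 to f^2.

    for f in (0.05, 0.10, 0.20):   # assumed fraction of assignment weight that is malicious
        print("f=%.2f: full compromise per circuit without persistent guard ~%.4f, "
              "with ~%.4f (a factor of %.0fx)" % (f, f**3, f**2, 1.0 / f))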

There are a variety of unaddressed practical deployment issues as well. It is not clear how to do load balancing with TorCoin alone - no one should be able to determine how many bytes were sent by any of the circuit members or where the payments for mined coins are being sent (anonymous TorCoin transactions are necessary for anonymity). Exit policies are not discussed, and there is no clear way to support them or to let a client request a specific exit port. It's not clear how to ensure that a relay is not chosen twice for the same circuit, or that two relays from the same relay family don't end up on the same circuit. Finally, it is not clear how to link ephemeral circuit keys to TorCoin addresses so that payments may be sent from clients to relays without revealing client identity.

Don't misunderstand my ranting here - I think that the TorCoin idea is great (I am a co-author, after all). It has the potential to motivate new researchers and developers to start thinking about solutions to problems that Tor is interested in, particularly bandwidth measurement. However, the limitations need to be clear so that we don't start seeing production systems using it and claiming to provide security without working through *at least* the issues mentioned above. The current paper is an initial concept in its early stages, and I expect the system to improve significantly as it is further developed.

(For completeness, we should also point out that TorCoin will need a new name if anybody decides to build it. The Tor trademark faq says it's fine for research paper designs to use the Tor mark, but if it becomes actual software then it sure will be confusing to users and the rest of the Tor community about whether it's written by or endorsed by Tor.)

Note that TorCoin and TEARS are at least somewhat complementary, since TEARS *needs* a proof-of-bandwidth scheme, and TorCoin *provides* one. However, they’re also not directly compatible. TorCoin requires a substantial change to both Bitcoin and to Tor (or to put it another way, it would be a system that only partially resembles Bitcoin and only partially resembles Tor). On the other hand, TEARS leaves Tor's circuit-finding mechanism intact, and the token protocol in TEARS is closer to traditional anonymous e-cash systems than to Bitcoin.

For better or for worse?

As outlined above, recent research has made great improvements in the area of Tor incentives. More work is needed to show the feasibility and efficiency of the decentralized approaches, and a secure bandwidth measurement scheme would help fill a critical piece missing from TEARS. Recent improvements to the way Tor schedules sockets and circuits would be necessary to correctly provide traffic priority (see #12541), and then a differentiated services scheduler that is able to prioritize traffic by classes (already described in and prototyped for the BRAIDS, LIRA, and TEARS papers) would also be needed.

Unfortunately, it is unclear how to make headway on the social issues. A small scale rollout of an "experimental build" to those relays who want to support new incentive features could be one way to test a new approach without committing to anything long term.

One question that is often raised is: if an incentive scheme rewards relays for providing bandwidth, then won't everyone just pick the same cheapest hosting provider, and Tor will lose location diversity? This question is largely addressed in the discussion of what constitutes a useful service in the TEARS paper (the TEARS paper and appendices also give useful commentary on many of the common problems and design decisions that come up when designing an incentive scheme). Basically, it can be addressed in the monetary policy: e.g., in addition to rewarding relays for bandwidth, the bank could also assign weights that adjust the rewards based on certain flags that relays possess, or on the geographic location in which they operate. This could be adjusted over time so that there would be a higher incentive to run relays in parts of the world where none exist, or to prefer exit relays over other positions, etc. Note, though, that it is unclear exactly what the "correct" utility function should be and when or how it should be adjusted. Note that Torservers.net similarly rewards relays for location diversity (see here).

Another point to make here is that most of these approaches have nothing to do with giving out or transferring real dollars. The tokens in most of these schemes are useful only for receiving traffic priority in Tor. Will third-party markets form around the exchange of the tokens? Sure. And the tokens may be speculated on. But at the end of the day, they would still only provide prioritized traffic. Depending on the configuration of the priority scheduler, the difference between priority traffic and normal traffic may not be that extreme. It is conceivable that the tokens would not be worth nearly enough to compensate an operator for the ISP connection, much less the overhead involved with updating the software, maintaining the machine, and talking with the ISP -- and in that case we are still on the more conservative side of the social incentive discussions above.

Tor has some choices to make in terms of how to grow the network and how to position the community during that growth process. I hope that this post, and the research presented herein, will at least help the community understand some of the options that are available.

All the best, ~Rob

[Thanks to Roger Dingledine, Bryan Ford, Aaron Johnson, Andrew Miller, and Paul Syverson for input and feedback on this post.]

Call for papers: FOCI'14 workshop

The 4th USENIX Workshop on Free and Open Communications on the Internet is calling for papers. The workshop will be held on August 18, 2014 and seeks to bring together researchers and practitioners working on means to study, detect, or circumvent Internet censorship.

Past iterations of the workshop featured a number of great Tor research papers including a proposal for cloud-based onion routing, our OONI design paper, an analysis of how the GFW blocks Tor, a censorship analyzer for Tor, and a proposal for latency reduction in Tor circuits.

Given the past success of the FOCI series, we are looking forward to another round of great Tor papers.

Paper submissions are due on Tuesday, May 13, 2014, 11:59 p.m. PDT.

How to read our China usage graphs

Recently somebody asked me why our usage numbers in China are so low. More precisely, his question was "How do I read this graph in any way other than 'Tor is effectively blocked in China'?" After writing up an answer for him, I realized I should share it with the rest of the Tor community too.

The correct interpretation of the graph is "obfs3 bridges have not been deployed enough to keep up with the demand in China". So it isn't that Tor is blocked — it's that we haven't done much of a deployment for obfs3 bridges or ScrambleSuit bridges, which are the latest steps in the arms race.

The short explanation is that the old vanilla SSL Tor transport doesn't work in China anymore due to their active probing infrastructure. The obfs2 transport doesn't work anymore either, for the same reason. The obfs3 transport works great for now, and thousands of people are happily using it — and some of those people aren't reflected in the graphs you see (I'll explain that more below).

The medium-length explanation is that we've been leading and coordinating the international research effort at understanding how to design and analyze transports that resist both DPI and active probing, and approximately none of these approaches have been deployed at a large scale yet. So it doesn't make sense to say that Tor is blocked in China, because it mischaracterizes Tor as a static protocol. "Tor" when it comes to censorship circumvention is a toolbox of options — some of them work in China, some don't. The ones that work (i.e. that should resist both DPI and active probing) haven't been rolled out very widely, in large part because we have funders who care about the research side but we have nobody who funds the operations, deployment, or scale-up side.

The long explanation is that it comes down to three issues:

First, there are some technical steps we haven't finished deploying in terms of collecting statistics about users of bridges + pluggable transports. The reason is that the server side of the pluggable transport needs to inform the Tor bridge what country the user was from, so the Tor bridge can include that in its (aggregated, anonymized) stats that it publishes to the metrics portal. We've now built most of the pieces, but most of the deployed bridges aren't running the new code yet. So the older bridges that are reporting their user statistics aren't seeing very many users from China, while the bridges that *aren't* reporting their user statistics, which are the ones that offer the newer pluggable transports, aren't well-represented in the graph. We have some nice volunteers looking into what fraction of deployed obfs3 bridges don't have this new 'extended ORPort' feature. But (and you might notice the trend here) we don't have any funders currently who care about counting bridge users in China.

Second, we need to get more addresses. One approach is to get them from volunteers who sign up their computer as a bridge. That provides great sustainability in terms of community involvement (we did a similar push for obfs2 bridges back when Iran was messing with SSL, and got enough to make a real difference at the time), but one address per volunteer doesn't scale very well. The intuition is that the primary resource that relays volunteer is bandwidth, whereas the primary resource that bridges volunteer is their address — and while bandwidth is an ongoing contribution, once your IP address gets blocked then your contribution has ended, at least for the country that blocked it, or until you get another address via DHCP, etc. The more scalable approaches to getting bridge addresses involve coordinating with ISPs and network operators, and/or designs like Flashproxy to make it really easy for users to sign up their address. I describe these ideas more in "approach four" and "approach five" of the Strategies for getting more bridge addresses blog post. But broad deployment of those approaches is again an operational thing, and we don't have any funded projects currently for doing it.

Third, we need ways of letting people learn about bridges and use them without getting them noticed. We used to think the arms race here was "how do you give out addresses such that the good guys can learn a few while the bad guys can't learn all of them", a la the bridges.torproject.org question. But it's increasingly clear that scanning resistance will be the name of the game in China: your transport has to not only blend in with many other flows (to drive up the number of scans they have to launch), but also when they connect to that endpoint and speak your protocol, your service needs to look unobjectionable there as well. Some combination of ScrambleSuit and FTE are great starts here, but it sure is a good thing that the research world has been working on so many prototype designs lately.

So where does that leave us? It would be neat to think about a broad deployment and operations plan here. I would want to do it in conjunction with some other groups, like Team Cymru on the technical platform side and some on-the-ground partner groups for distributing bridge addresses more effectively among social networks. We've made some good progress on the underlying technologies that would increase the success chances of such a deployment — though we've mostly been doing it using volunteers in our spare time on the side, so it's slower going than it could be. And several other groups (e.g. torservers.net) have recently gotten funding for deploying Tor bridges, so maybe we could combine well with them.

In any case it won't be a quick and simple job, since all these pieces have to come together. It's increasingly clear that just getting addresses should be the easy part of this. It's how you give them out, and what you run on the server side to defeat China's scanning, that still look like the two tough challenges for somebody trying to scale up their circumvention tool design.

New Tor Denial of Service Attacks and Defenses

New work on denial of service in Tor will be presented at NDSS '14 on Tuesday, February 25th, 2014:


The Sniper Attack: Anonymously Deanonymizing and Disabling the Tor Network
by Rob Jansen, Florian Tschorsch, Aaron Johnson, and Björn Scheuermann
To appear at the 21st Symposium on Network and Distributed System Security

A copy of the published paper is now available, as are my slides explaining the attacks in both PPTX and PDF formats.

We found a new vulnerability in the design of Tor's flow control algorithm that can be exploited to remotely crash Tor relays. The attack is an extremely low-resource attack in which an adversary's bandwidth may be traded for a target relay's memory (RAM) at an amplification rate of one to two orders of magnitude. Ironically, the adversary can use Tor to protect its identity while attacking Tor, without significantly reducing the effectiveness of the attack.

We studied relay availability under the attack using Shadow, a discrete-event network simulator that runs the real Tor software in a safe, private testing environment, and found that we could disable the fastest guard relay and the fastest exit relay in 1 to 18 minutes each (depending on relay RAM capacity). We also found that the entire group of the top 20 exit relays, representing roughly 35% of Tor bandwidth capacity at the time of the analysis, could be disabled in 29 minutes to 3 hours and 50 minutes. We also analyzed how the attack could potentially be used to deanonymize hidden services, and found that it would take between 4 and 278 hours for the attack to succeed (again depending on relay RAM capacity, as well as the bandwidth resources used to launch the attack).

In light of these findings, we also designed three defenses that mitigate our attacks, one of which provably renders the attack ineffective. Defenses have been implemented and deployed into the Tor software to ensure that the Tor network is no longer vulnerable as of Tor version 0.2.4.18-rc. Some of that work can be found in Trac tickets #9063, #9072, #9093, and #10169.

In the remainder of this post I will detail the attacks and defenses we analyzed, noting again that this information is presented more completely (and more elegantly) in our paper.


The Tor Network Infrastructure

The Tor network is a distributed system made up of thousands of computers running the Tor software that contribute their bandwidth, memory, and computational resources for the greater good. These machines are called Tor relays, because their main task is to forward or relay network traffic to another entity after performing some cryptographic operations. When a Tor user wants to download some data using Tor, the user's Tor client software will choose three relays from those available (an entry, middle, and exit), form a path or circuit between these relays, and then instruct the third relay (the exit) to fetch the data and send it back through the circuit. The data will get transferred from its source to the exit, from the exit to the middle, and from the middle to the entry before finally making its way to the client.

Flow Control

The client may request the exit to fetch large amounts of data, and so Tor uses a window-based flow control scheme in order to limit the amount of data each relay needs to buffer in memory at once. When a circuit is created, the exit will initialize its circuit package counter to 1000 cells, indicating that it is willing to send 1000 cells into the circuit. The exit decrements the package counter by one for every data cell it sends into the circuit (to the middle relay), and stops sending data when the package counter reaches 0. The client at the other end of the circuit keeps a delivery counter, and initializes it to 0 upon circuit creation. The client increments the delivery counter by 1 for every data cell it receives on that circuit. When the client's delivery counter reaches 100, it sends a special Tor control cell, called a SENDME cell, to the exit to signal that it received 100 cells. Upon receiving the SENDME, the exit adds 100 to its package counter and continues sending data into the circuit.

This flow control scheme limits the amount of outstanding data that may be in flight at any time (between the exit and the client) to 1000 cells, or about 500 KiB, per circuit. The same mechanism is used when data is flowing in the opposite direction (up from the client, through the entry and middle, and to the exit).
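Here is a toy model (mine, not Tor's code) of the counters just described; the constants mirror the 1000-cell circuit window and the 100-cell SENDME increment.

    # Toy model of Tor's circuit-level flow control: the exit packages up to
    # 1000 cells, the client counts deliveries and returns a SENDME every 100
    # cells, which lets the exit package 100 more.

    CIRCWINDOW_START = 1000
    CIRCWINDOW_INCREMENT = 100

    class Exit:
        def __init__(self):
            self.package_window = CIRCWINDOW_START

        def try_send_cell(self):
            if self.package_window == 0:
                return False                  # must wait for a SENDME
            self.package_window -= 1
            return True

        def receive_sendme(self):
            self.package_window += CIRCWINDOW_INCREMENT

    class Client:
        def __init__(self, exit_relay):
            self.exit = exit_relay
            self.delivered = 0

        def receive_cell(self):
            self.delivered += 1
            if self.delivered % CIRCWINDOW_INCREMENT == 0:
                self.exit.receive_sendme()    # acknowledge the last 100 cells

    exit_relay = Exit()
    client = Client(exit_relay)
    sent = 0
    while sent < 5000 and exit_relay.try_send_cell():
        client.receive_cell()                 # in the Sniper Attack, this step stops happening
        sent += 1
    print("cells transferred:", sent)         # data keeps flowing while SENDMEs come back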

The Sniper Attack

The new Denial of Service (DoS) attack, which we call "The Sniper Attack", exploits the flow control algorithm to remotely crash a victim Tor relay by depleting its memory resources. The paper presents three attacks that rely on the following two techniques:

  1. the attacker stops reading from the TCP connection containing the attack circuit, which causes the TCP window on the victim's outgoing connection to close and the victim to buffer up to 1000 cells; and
  2. the attacker causes cells to be continuously sent to the victim (exceeding the 1000 cell limit and consuming the victim's memory resources), either by ignoring the package window at the packaging end of the circuit, or by continuously sending SENDMEs from the delivery end to the packaging end even though no cells have been read by the delivery end.



I'll outline the attacks here, but remember that the paper and slides provide more details and useful illustrations of each of the three versions of the attack.

Basic Version 1 (attacking an entry relay)

In basic version 1, the adversary controls the client and the exit relay, and chooses a victim for the entry relay position. The adversary builds a circuit through the victim to her own exit, and then the exit continuously generates and sends arbitrary data through the circuit toward the client while ignoring the package window limit. The client stops reading from the TCP connection to the entry relay, and the entry relay buffers all data being sent by the exit relay until it is killed by its OS out-of-memory killer.

Basic Version 2 (attacking an exit relay)

In basic version 2, the adversary controls the client and an Internet destination server (e.g. website), and chooses a victim for the exit relay position. The adversary builds a circuit through the victim exit relay, and then the client continuously generates and sends arbitrary data through the circuit toward the exit relay while ignoring the package window limit. The destination server stops reading from the TCP connection to the exit relay, and the exit relay buffers all data being sent by the client until it is killed by its OS out-of-memory killer.

Efficient Version

Both of the basic versions of the attack above require the adversary to generate and send data, consuming roughly the same amount of upstream bandwidth as the victim's available memory. The efficient version reduces this cost by one to two orders of magnitude.

In the efficient version, the adversary controls only a client. She creates a circuit, choosing the victim for the entry position, and then instructs the exit relay to download a large file from some external Internet server. The client stops reading on the TCP connection to the entry relay, causing it to buffer 1000 cells.

At this point, the adversary may "trick" the exit relay into sending more cells by sending it a SENDME cell, even though the client has not actually received any cells from the entry. As long as this SENDME does not increase the exit relay's package counter to more than 1000 cells, the exit relay will continue to package data from the server and send it into the circuit toward the victim. If the SENDME does cause the exit relay's package window to exceed the 1000 cell limit, it will stop responding on that circuit. However, the entry and middle nodes will hold the circuit open until the client issues another command, meaning their resources will not be freed.

The bandwidth cost of the attack after circuit creation is simply the bandwidth cost of occasionally sending a SENDME to the exit. The memory consumption speed depends on the bandwidth and congestion of non-victim circuit relays. We describe how to parallelize the attack using multiple circuits and multiple paths with diverse relays in order to draw upon Tor's inherent resources. We found that with roughly 50 KiB/s of upstream bandwidth, an attacker could consume the victim's memory at roughly 1 MiB/s. This is highly dependent on the victim's bandwidth capabilities: relays that use token buckets to restrict bandwidth usage will of course bound the attack's consumption rate.
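For intuition on the amplification, here is a back-of-envelope sketch (mine; the measured numbers above already account for real-world overhead, so treat the "ideal" figure as an upper bound):

    # Rough amplification arithmetic for the efficient version: one ~512-byte
    # SENDME cell lets the exit package 100 more ~512-byte cells, all of which
    # pile up in the victim's queues because the attacker never reads them.

    CELL_BYTES = 512
    SENDME_INTERVAL = 100

    cost_per_sendme = CELL_BYTES                    # attacker's upstream bytes
    gain_per_sendme = SENDME_INTERVAL * CELL_BYTES  # bytes buffered at the victim
    print("idealized amplification: %dx" % (gain_per_sendme // cost_per_sendme))

    # The measured figures above (~50 KiB/s upstream driving ~1 MiB/s of memory
    # growth at the victim) work out to roughly 20x, since TLS/TCP overhead,
    # circuit setup, and congestion on the non-victim relays eat into the ideal.
    print("measured example: ~%.0fx" % (1024 / 50.0))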

Rather than connecting directly to the victim, the adversary may instead launch the attack through a separate Tor circuit using a second client instance and the "Socks4Proxy" or "Socks5Proxy" option. In this case, she may benefit from the anonymity that Tor itself provides in order to evade detection. We found that there is not a significant increase in bandwidth usage when anonymizing the attack in this way.

Defenses

A simple but naive defense against the Sniper Attack is to have the guard node watch its queue length, and if it ever fills to over 1000 cells, kill the circuit. This defense does not prevent the adversary from parallelizing the attack by using multiple circuits (and then consuming 1000 cells on each), which we have shown to be extremely effective.

Another defense, called "authenticated SENDMEs", tries to protect against receiving a SENDME from a node that didn't actually receive 100 cells. In this approach, a 1 byte nonce is placed in every 100th cell by the packaging end, and that nonce must be included by the delivery end in the SENDME (otherwise the packaging end rejects the SENDME as inauthentic). As above, this does not protect against the parallel attack. It also doesn't defend against either of the basic attacks where the adversary controls the packaging end and ignores the SENDMEs anyway.

The best defense, as we suggested to the Tor developers, is to implement a custom, adaptive out-of-memory circuit killer in application space (i.e. inside Tor). The circuit killer is only activated when memory becomes scarce, and then it chooses the circuit with the oldest front-most cell in its circuit queue. This will prevent the Sniper Attack by killing off all of the attack circuits.
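Here is a sketch of that heuristic (mine, not Tor's actual implementation, which also has to account for connection buffers, variable cell sizes, and freeing memory safely):

    # Sketch of the adaptive circuit killer: when queued-cell memory crosses a
    # threshold, repeatedly kill the circuit whose front-most (oldest) queued
    # cell has the earliest timestamp. Attack circuits fill up and stop
    # draining, so their front cell keeps getting older; honest circuits drain
    # and stay "young".

    CELL_BYTES = 512

    def total_queued_bytes(circuits):
        return sum(len(c["queue"]) * CELL_BYTES for c in circuits)

    def oom_kill(circuits, max_bytes):
        while circuits and total_queued_bytes(circuits) > max_bytes:
            victim = min(circuits,
                         key=lambda c: c["queue"][0] if c["queue"] else float("inf"))
            print("killing circuit", victim["id"])
            circuits.remove(victim)

    circuits = [
        {"id": "honest", "queue": [100.0, 100.1]},   # front cell queued at t=100.0s
        {"id": "attack", "queue": [10.0 + i * 0.01 for i in range(2000)]},  # stuck since t=10.0s
    ]
    oom_kill(circuits, max_bytes=500 * 1024)         # kills "attack", leaves "honest" alone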

With this new defense in place, the next game is for the adversary to try to cause Tor to kill an honest circuit. In order for an adversary to cause an honest circuit to get killed, it must ensure that the front-most cell on its malicious circuit queue is at least slightly "younger" than the oldest cell on any honest queue. We show that the Sniper Attack is impractical with this defense: due to fairness mechanisms in Tor, the adversary must spend an extraordinary amount of bandwidth keeping its cells young — bandwidth that would likely be better served in a more traditional brute-force DoS attack.

Tor has implemented a version of the out-of-memory killer for circuits, and is currently working on expanding this to channel and connection buffers as well.

Hidden Service Attack and Countermeasures

The paper also shows how the Sniper Attack can be used to deanonymize hidden services:

  1. run a malicious entry guard relay;
  2. run the attack from Oakland 2013 to learn the current guard relay of the target hidden service;
  3. run the Sniper Attack on the guard from step 2, knocking it offline and causing the hidden service to choose a new guard;
  4. repeat, until the hidden service chooses the relay from step 1 as its new entry guard.



The technique to verify that the hidden service is using a malicious guard in step 4 is the same technique used in step 2.

In the paper, we compute the expected time to succeed in this attack while running malicious relays of various capacities. It takes longer to succeed against relays that have more RAM, since the attack relies on the Sniper Attack consuming enough RAM to kill the relay (which itself depends on the bandwidth capacity of the victim relay). For the malicious relay bandwidth capacities and honest relay RAM amounts used in our estimates, we found that deanonymization would involve between 18 and 132 Sniper Attacks and take between ~4 and ~278 hours.
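As a simplified sanity check of the shape of those numbers (mine; the bandwidth figures below are hypothetical and not from the paper), the expected number of guard-killing rounds is roughly geometric in the probability that the hidden service picks the malicious guard:

    # Simplified expectation for the "knock out guards until we're chosen" loop:
    # if the malicious guard is selected with probability p each time the hidden
    # service picks a new guard, the expected number of rounds (each requiring a
    # Sniper Attack on the current guard) is roughly 1/p.

    def expected_rounds(malicious_guard_bw, total_guard_bw):
        p = malicious_guard_bw / total_guard_bw   # crude bandwidth-weighted selection
        return 1 / p

    # Hypothetical numbers: a 5 MiB/s malicious guard in a guard pool with
    # ~500 MiB/s of total selection weight.
    print(expected_rounds(5.0, 500.0))            # ~100 rounds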

This attack becomes much more difficult if the relay is rebooted soon after it crashes, and the attack is ineffective when Tor relays are properly defending against the Sniper Attack (see the "Defenses" section above).

Strategies to defend hidden services in particular go beyond those suggested here. They include entry-guard rate-limiting, where the hidden service stops building circuits if it notices that its new guards keep going down (failing closed), and middle guards: guard nodes for your guard nodes. Both of these strategies attempt to make it harder to coerce the hidden service into building new circuits or exposing itself to new relays, since that is precisely what is needed for deanonymization.

Open Problems

The main defense implemented in Tor will start killing circuits when memory gets low. Currently, Tor uses a configuration option (MaxMemInCellQueues) that allows a relay operator to configure when the circuit-killer should be activated. There is likely not one single value that makes sense here: if it is too high, then relays with lower memory will not be protected; if it is too low, then there may be more false positives resulting in honest circuits being killed. Can Tor determine this setting in an OS-independent way that allows relays to automatically find the right value for MaxMemInCellQueues?
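For relay operators who want to tune this by hand today, the option takes a size value in torrc; for example (an illustrative value, not a recommendation, and assuming the usual torrc memory-size syntax):

    MaxMemInCellQueues 2 GB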

The defenses against the Sniper Attack prevent the adversary from crashing the victim relay, but the adversary may still consume a relay's bandwidth (and memory resources, up to a critical level) at relatively low cost. In other words, even though the Sniper Attack can no longer kill a relay, it can still tie up a large amount of the relay's bandwidth cheaply (similar to more traditional bandwidth amplification attacks). More analysis of general bandwidth consumption attacks and defenses remains a useful research problem.

Finally, hidden services also need some love. More work is needed to redesign them in a way that does not allow a client to cause the hidden service to choose new relays on demand.
