Tor security advisory: "relay early" traffic confirmation attack

This advisory was posted on the tor-announce mailing list.

SUMMARY:

On July 4 2014 we found a group of relays that we assume were trying to deanonymize users. They appear to have been targeting people who operate or access Tor hidden services. The attack involved modifying Tor protocol headers to do traffic confirmation attacks.

The attacking relays joined the network on January 30 2014, and we removed them from the network on July 4. While we don't know when they started doing the attack, users who operated or accessed hidden services from early February through July 4 should assume they were affected.

Unfortunately, it's still unclear what "affected" includes. We know the attack looked for users who fetched hidden service descriptors, but the attackers likely were not able to see any application-level traffic (e.g. what pages were loaded or even whether users visited the hidden service they looked up). The attack probably also tried to learn who published hidden service descriptors, which would allow the attackers to learn the location of that hidden service. In theory the attack could also be used to link users to their destinations on normal Tor circuits, but we found no evidence that the attackers operated any exit relays, making this attack less likely. And finally, we don't know how much data the attackers kept, and due to the way the attack was deployed (more details below), their protocol header modifications might have aided other attackers in deanonymizing users too.

Relays should upgrade to a recent Tor release (0.2.4.23 or 0.2.5.6-alpha), to close the particular protocol vulnerability the attackers used — but remember that preventing traffic confirmation in general remains an open research problem. Clients that upgrade (once new Tor Browser releases are ready) will take another step towards limiting the number of entry guards that are in a position to see their traffic, thus reducing the damage from future attacks like this one. Hidden service operators should consider changing the location of their hidden service.

THE TECHNICAL DETAILS:

We believe they used a combination of two classes of attacks: a traffic confirmation attack and a Sybil attack.

A traffic confirmation attack is possible when the attacker controls or observes the relays on both ends of a Tor circuit and then compares traffic timing, volume, or other characteristics to conclude that the two relays are indeed on the same circuit. If the first relay in the circuit (called the "entry guard") knows the IP address of the user, and the last relay in the circuit knows the resource or destination she is accessing, then together they can deanonymize her. You can read more about traffic confirmation attacks, including pointers to many research papers, at this blog post from 2009:
https://blog.torproject.org/blog/one-cell-enough
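
As a toy illustration of the passive form of this comparison, the sketch below computes the Pearson correlation between per-second cell counts seen at two relays. The counts, the one-second window, and the function name are invented for illustration; confirmation attacks in the research literature use more careful statistics.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical cells-per-second counts over ten one-second windows.
    entry_side = [3, 0, 12, 7, 0, 0, 9, 15, 2, 0]
    same_circuit = [2, 1, 11, 8, 0, 0, 10, 14, 2, 1]
    other_circuit = [5, 5, 4, 6, 5, 5, 6, 4, 5, 5]
    print(f"same circuit:  {pearson(entry_side, same_circuit):.2f}")
    print(f"other circuit: {pearson(entry_side, other_circuit):.2f}")
```

With these made-up numbers the two ends of the same circuit correlate strongly, while the unrelated circuit does not. An adversary watching both ends needs nothing more exotic than this kind of comparison.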

The particular confirmation attack they used was an active attack where the relay on one end injects a signal into the Tor protocol headers, and then the relay on the other end reads the signal. These attacking relays were stable enough to get the HSDir ("suitable for hidden service directory") and Guard ("suitable for being an entry guard") consensus flags. Then they injected the signal whenever they were used as a hidden service directory, and looked for an injected signal whenever they were used as an entry guard.

The way they injected the signal was by sending sequences of "relay" vs "relay early" commands down the circuit, to encode the message they wanted to send. For background, Tor has two types of cells: link cells, which are intended for the adjacent relay in the circuit, and relay cells, which are passed to the other end of the circuit. In 2008 we added a new kind of relay cell, called a "relay early" cell, which is used to prevent people from building very long paths in the Tor network. (Very long paths can be used to induce congestion and aid in breaking anonymity.) But the fix for infinite-length paths introduced a problem with accessing hidden services, and one of the side effects of our fix for bug 1038 was that while we limit the number of outbound (away from the client) "relay early" cells on a circuit, we don't limit the number of inbound (towards the client) "relay early" cells.

So in summary, when Tor clients contacted an attacking relay in its role as a Hidden Service Directory to publish or retrieve a hidden service descriptor (steps 2 and 3 on the hidden service protocol diagrams), that relay would send the hidden service name (encoded as a pattern of relay and relay-early cells) back down the circuit. Other attacking relays, when they were chosen for the first hop of a circuit, would look for inbound relay-early cells (since nobody else sends them) and would thus learn which clients requested information about a hidden service.
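
To make the signalling idea concrete, here is a minimal sketch in Python. It assumes the simplest possible encoding: one bit per cell, with a "relay early" cell standing for 1 and a plain "relay" cell standing for 0. This advisory does not say which encoding the attackers actually used, so the bit mapping, the function names, and the example name are all hypothetical.

```python
# Illustrative only: the real encoding used by the attacking relays is not
# described in this advisory. Assumption: one bit per cell,
# RELAY_EARLY = 1, RELAY = 0, most significant bit first.

RELAY = "RELAY"
RELAY_EARLY = "RELAY_EARLY"


def encode_name(name: str) -> list[str]:
    """Turn an ASCII name into a sequence of cell commands (HSDir side)."""
    cells = []
    for byte in name.encode("ascii"):
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            cells.append(RELAY_EARLY if bit else RELAY)
    return cells


def decode_cells(cells: list[str]) -> str:
    """Recover the name from the observed cell commands (guard side)."""
    if len(cells) % 8 != 0:
        raise ValueError("expected a whole number of bytes")
    out = bytearray()
    for i in range(0, len(cells), 8):
        byte = 0
        for cell in cells[i:i + 8]:
            byte = (byte << 1) | (1 if cell == RELAY_EARLY else 0)
        out.append(byte)
    return out.decode("ascii")


if __name__ == "__main__":
    signal = encode_name("examplehiddenser")  # hypothetical 16-character name
    print(len(signal), "cells")
    print(decode_cells(signal))
```

The decoder never looks at a cell payload. It only needs the command field in each cell header, which every relay on the path can read.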

There are three important points about this attack:

A) The attacker encoded the name of the hidden service in the injected signal (as opposed to, say, sending a random number and keeping a local list mapping random number to hidden service name). The encoded signal is encrypted as it is sent over the TLS channel between relays. However, this signal would be easy to read and interpret by anybody who runs a relay and receives the encoded traffic. And we might also worry about a global adversary (e.g. a large intelligence agency) that records Internet traffic at the entry guards and then tries to break Tor's link encryption. The way this attack was performed weakens Tor's anonymity against these other potential attackers too — either while it was happening or after the fact if they have traffic logs. So if the attack was a research project (i.e. not intentionally malicious), it was deployed in an irresponsible way because it puts users at risk indefinitely into the future.

(This concern is in addition to the general issue that it's probably unwise from a legal perspective for researchers to attack real users by modifying their traffic on one end and wiretapping it on the other. Tools like Shadow are great for testing Tor research ideas out in the lab.)

B) This protocol header signal injection attack is actually pretty neat from a research perspective, in that it's a bit different from previous tagging attacks which targeted the application-level payload. Previous tagging attacks modified the payload at the entry guard, and then looked for a modified payload at the exit relay (which can see the decrypted payload). Those attacks don't work in the other direction (from the exit relay back towards the client), because the payload is still encrypted at the entry guard. But because this new approach modifies ("tags") the cell headers rather than the payload, every relay in the path can see the tag.

C) We should remind readers that while this particular variant of the traffic confirmation attack allows high-confidence and efficient correlation, the general class of passive (statistical) traffic confirmation attacks remains unsolved and would likely have worked just fine here. So the good news is traffic confirmation attacks aren't new or surprising, but the bad news is that they still work. See https://blog.torproject.org/blog/one-cell-enough for more discussion.

Then the second class of attack they used, in conjunction with their traffic confirmation attack, was a standard Sybil attack — they signed up around 115 fast non-exit relays, all running on 50.7.0.0/16 or 204.45.0.0/16. Together these relays summed to about 6.4% of the Guard capacity in the network. Then, in part because of our current guard rotation parameters, these relays became entry guards for a significant chunk of users over their five months of operation.
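
A rough back-of-the-envelope calculation shows why 6.4% of Guard capacity matters. The sketch below assumes each guard is picked independently, with probability equal to the attackers' share of Guard capacity. It ignores guard rotation entirely, so it understates exposure over five months. The function name and the independence assumption are illustrative and are not taken from Tor's guard selection code.

```python
def chance_of_bad_guard(attacker_share: float, num_guards: int) -> float:
    """Probability that at least one of num_guards picks is an attacker.

    Assumes independent picks weighted by capacity (a simplification).
    """
    return 1.0 - (1.0 - attacker_share) ** num_guards


if __name__ == "__main__":
    share = 0.064  # attackers' fraction of Guard capacity, from the advisory
    for n in (1, 3):
        print(f"{n} guard(s): {chance_of_bad_guard(share, n):.1%}")
```

Under these assumptions a client with three guards has roughly an 18% chance of having picked at least one attacking relay at any given time, against 6.4% for a client with a single guard.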

We actually noticed these relays when they joined the network, since the DocTor scanner reported them. We considered the set of new relays at the time, and made a decision that it wasn't that large a fraction of the network. It's clear there's room for improvement in terms of how to let the Tor network grow while also ensuring we maintain social connections with the operators of all large groups of relays. (In general having a widely diverse set of relay locations and relay operators, yet not allowing any bad relays in, seems like a hard problem; on the other hand our detection scripts did notice them in this case, so there's hope for a better solution here.)
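
One simple heuristic for spotting a group like this is to count newly seen relays per /16 network. The sketch below is not how DocTor works (this advisory does not describe its internals). It is a hypothetical illustration using Python's standard ipaddress module, with made-up relay addresses drawn from the two ranges named above and from a documentation range.

```python
import ipaddress
from collections import Counter


def count_by_slash16(addresses: list[str]) -> Counter:
    """Count IPv4 addresses per enclosing /16 network."""
    counts: Counter = Counter()
    for addr in addresses:
        # strict=False lets us pass a host address and get its network back.
        net = ipaddress.ip_network(f"{addr}/16", strict=False)
        counts[str(net)] += 1
    return counts


def large_groups(addresses: list[str], threshold: int) -> dict[str, int]:
    """Return the /16 networks holding at least `threshold` addresses."""
    return {net: n for net, n in count_by_slash16(addresses).items()
            if n >= threshold}


if __name__ == "__main__":
    # Hypothetical sample of newly joined relays.
    new_relays = ["50.7.1.10", "50.7.2.20", "50.7.200.3",
                  "204.45.15.1", "204.45.16.2", "192.0.2.7"]
    print(large_groups(new_relays, threshold=2))
```

A count like this only flags candidates. Deciding whether a large group is benign still requires knowing who operates it, which is the social problem described above.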

In response, we've taken the following short-term steps:

1) Removed the attacking relays from the network.

2) Put out a software update for relays to prevent "relay early" cells from being used this way.

3) Put out a software update that will (once enough clients have upgraded) let us tell clients to move to using one entry guard rather than three, to reduce exposure to relays over time.

4) Clients can tell whether they've received a relay cell or a relay-early cell. For expert users, the new Tor version warns you in your logs if a relay on your path injects any relay-early cells: look for the phrase "Received an inbound RELAY_EARLY cell".
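
For anyone who wants to check their logs for this warning, a small script is enough. The sketch below only searches for the phrase quoted above. The script name in the usage message is hypothetical, and where your Tor log lives depends on your platform and torrc, so the path must be supplied as an argument.

```python
import sys

PHRASE = "Received an inbound RELAY_EARLY cell"  # phrase quoted in the advisory


def find_warnings(lines):
    """Return (line_number, text) pairs for lines containing the phrase."""
    return [(n, line.rstrip("\n"))
            for n, line in enumerate(lines, start=1)
            if PHRASE in line]


if __name__ == "__main__":
    if len(sys.argv) != 2:
        # Hypothetical script name; pass the path to your own Tor log file.
        print("usage: python check_relay_early.py /path/to/tor/log")
    else:
        path = sys.argv[1]
        with open(path, encoding="utf-8", errors="replace") as log:
            for number, text in find_warnings(log):
                print(f"{path}:{number}: {text}")
```

No output means the phrase was not found in that file. It does not prove that no attack took place, since the warning only exists in versions of Tor that include the fix.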

The following longer-term research areas remain:

5) Further growing the Tor network and diversity of relay operators, which will reduce the impact from an adversary of a given size.

6) Exploring better mechanisms, e.g. social connections, to limit the impact from a malicious set of relays. We've also formed a group to pay more attention to suspicious relays in the network:
https://blog.torproject.org/blog/how-report-bad-relays

7) Further reducing exposure to guards over time, perhaps by extending the guard rotation lifetime:
https://blog.torproject.org/blog/lifecycle-of-a-new-relay
https://blog.torproject.org/blog/improving-tors-anonymity-changing-guar…

8) Better understanding statistical traffic correlation attacks and whether padding or other approaches can mitigate them.

9) Improving the hidden service design, including making it harder for relays serving as hidden service directory points to learn what hidden service address they're handling:
https://blog.torproject.org/blog/hidden-services-need-some-love

OPEN QUESTIONS:

Q1) Was this the Black Hat 2014 talk that got canceled recently?
Q2) Did we find all the malicious relays?
Q3) Did the malicious relays inject the signal at any points besides the HSDir position?
Q4) What data did the attackers keep, and are they going to destroy it? How have they protected the data (if any) while storing it?

Great questions. We spent several months trying to extract information from the researchers who were going to give the Black Hat talk, and eventually we did get some hints from them about how "relay early" cells could be used for traffic confirmation attacks, which is how we started looking for the attacks in the wild. They haven't answered our emails lately, so we don't know for sure, but it seems likely that the answer to Q1 is "yes". In fact, we hope they *were* the ones doing the attacks, since otherwise it means somebody else was. We don't yet know the answers to Q2, Q3, or Q4.

The attack was independent of the way Tor is packaged, but it's not independent of the way Tor is configured.

So yes, there isn't any difference in theory whether you're on Windows, Linux, Tails, etc.

But some of those provide different configurations for Tor, which impact the attack. For example, Tails configures its Tor to disable entry guards, so I think the attack would have worked much more quickly on Tails users.

hunter2

August 04, 2014

Please update the MacPorts configuration to use version 0.2.4.23. Can't update my relay.

hunter2

August 04, 2014

Have any more relays been detected using this attack since the latest Tor update?

No.

We did find the node-Tor (javascript) relay implementations sending relay_early cells, but that was due to a bug in their implementation, which they fixed today. And those relays are experimental and tiny so not really a big deal either way.

hunter2

August 04, 2014

I began getting some warnings in the Tor log (tor-bundle ver. 3.6.3). Would you take a look and tell me if there is something suspicious? http://pastebin.com/mrVch5Aa I can't reach some public sites through Tor. It looks like my provider can see public addresses. Moreover, I have already been given an official warning right inside the Tor Browser. And how big is the threat? Could they see my traffic, or only the address I was looking for?

hunter2

August 04, 2014

This all seems to be beating around the bush. What is the real impact here? What % of users can expect to be deanonymized in terms of having the hidden service they were connecting to known? If you've connected to a hidden service that's suspicious during the affected time period, should you be shredding your hard drive or what? I'd like a practical advisory.

I believe the problem is it's unknown. Also, shredding your hard drive won't do anything but make you look suspicious unless you did more than just connect to a hidden service.

1. Nobody can tell what % of users can expect to be deanonymized, given some hidden service address. There are too many factors involved.

2. The definition of "suspicious" depends on who and what.

3. Sure, shred your hard drive. Couldn't hurt.

Hard drive shredding is not easy, and I'd imagine it actually could hurt you quite a lot if appropriate safety precautions are not taken. http://www.ssiworld.com/watch/hard_drives.htm

probably better to fill the disk with zeroes and then install something innocuous like Windows XP. Maybe set the clock back a few years first for added plausibility.

hunter2

August 05, 2014

Were all servers that were affected located in the US?
I am routing my traffic through a handful of European nodes only, would that have limited the chances of Tor picking an affected entry guard?

hunter2

August 06, 2014

Yes, it will help. But the signal sent by the rogue HSDir will be recorded by any logging entry guard that was used at the time, forever.

I'm thinking the advice to switch to Tails after last year's FH bust was bad advice.

Why was it bad advice? It is a great improvement against last year's attack vectors.

I think the big downside with Tails currently is the lack of persistent entry guards. If you (the user) have linkable activities, then sometimes an adversary (whether he runs relays or just watches points on the Internet) will be in a position to do traffic confirmation attacks. The exact definition of 'sometimes' is up for grabs, but I think it's pretty clearly more than with entry guards.

hunter2

August 06, 2014

Seems that the biggest security hole of tor is the tor browser bundle.

http://www.wired.com/2014/08/operation_torpedo/

"For the last two years, the FBI has been quietly experimenting with drive-by hacks as a solution to one of law enforcement’s knottiest Internet problems: how to identify and prosecute users of criminal websites hiding behind the powerful Tor anonymity system.

The approach has borne fruit—over a dozen alleged users of Tor-based child porn sites are now headed for trial as a result."

The simple solution is:
Just restrict the communication to people that you know, and do not communicate with random webservers.

Solution:
1) run tor
2) run some program over tor that lets you communicate with strong encryption enabled and only with persons that you know, instead of relatively arbitrary servers.
For this, applications like torchat, or retroshare are the way to go.

Web browsing will probably never be secure. It is unlikely that a browser with no security holes will ever be built. But on top of that, the process of browsing essentially means communicating with servers that you do not personally know. Your friend is much less likely to deliver malware to your computer than some webserver that you visit. In order to reduce this risk, you have to switch from browsing to targeted communication.

hunter2

August 06, 2014

If those relays were removed, why are 6189 relays online again? There were 59xx at the time of the attack ....

hunter2

August 06, 2014

FBI was the only agency not to comment, so that pretty much confirms they have the data

Wow, yeah. Two points for some defense spokeswoman for telling us that.

But the fact that FBI and CMU didn't answer the question the 15th time somebody asked it doesn't really tell us much. CMU seems to be sticking to their "I don't answer questions" advice from whichever lawyers decided that was the best way to handle things. And the FBI just never answers these sorts of questions in the first place.

Edit: though to be fair, the DoD just never answers these sorts of questions in the first place either. How odd!

"This particular project was focused on identifying vulnerabilities in Tor, not to collect data that would reveal personal identities of users,"

So I assume these vulnerabilities identified by CMU will be disclosed to Tor at some stage; otherwise what's the point of DoD contracting this?

Does the DoD statement "not to collect data that would reveal personal identities of users" now give Carnegie Mellon University the green light to legally destroy the data?

hunter2

August 07, 2014

DoD doesn't want people to stop using Tor. Hopefully they tell the FBI not to bust us.

Yeah, uh, please don't do this. We like researchers. That's how we understand privacy and security these days. That's how the papers on http://freehaven.net/anonbib/ come to exist. Many of us are active in the research community.

There is a lot of quite reasonable talk these days about "the real criminals", but it sure isn't those two researchers at CERT.

hunter2

August 07, 2014

Hi arma, I'd like an answer to a question that I don't really 100% get... If an agency takes control of a bunch of Tor relays (and with their funds they could all do it), could this represent a serious problem for the whole network? What are the countermeasures?

hunter2

August 07, 2014

Security experts call it a “drive-by download”: a hacker infiltrates a high-traffic website and then subverts it to deliver malware to every single visitor. It’s one of the most powerful tools in the black hat arsenal, capable of delivering thousands of fresh victims into a hackers’ clutches within minutes.

Now the technique is being adopted by a different kind of a hacker—the kind with a badge. For the last two years, the FBI has been quietly experimenting with drive-by hacks as a solution to one of law enforcement’s knottiest Internet problems: how to identify and prosecute users of criminal websites hiding behind the powerful Tor anonymity system.

Watch out tor team because the FBI has you in a chokehold!

hunter2

August 08, 2014

In my view, this was not a law enforcement operation per se, though the data could easily have been given over to the FBI upon the shutting down of the suspect relays (end of the research).

I think that this was just as it is being portrayed: a DoD initiative to look for weaknesses in Tor. Or that is/was the focus. I also think that if the University did not destroy or otherwise get rid of the IPs they collected, they are in violation of the U.S. wiretapping laws, and they are at risk of being sued in civil court under a tort action. That is why they are not talking. But would DOJ prosecute re criminal? Doubtful.

I'd like to think that this is a warning shot across the bow as far as LE is concerned, and they won't necessarily go after too many users. I also think that CMU probably turned over the IPs to the FBI. My hope is that there are so many IPs to sort through, the FBI will only go for the low hanging fruit, however they may define it, and most will be left alone to go on with their lives.

Let us hope.

If your assessment is correct (and I think it is), the FBI will need to be very careful using this illegally-obtained evidence that they do not reveal the existence of it. Parallel construction all the way.

All it will take is one screwup in front of the right judge and some FBI and DOJ people could actually go to prison for this travesty of justice. They know they're criminals and they're scared.

hunter2

August 08, 2014

The FBI only got the low hanging fruit. The smart people chained VPNs to Tor, or modified Tor to use a single entry guard that doesn't rotate, or used WiFi in addition to Tor. You know, all the things the delusional Tor fanbois said not to do because Tor is totally more secure than RSA-1,024, and you should totally let the super smart Tor developers who relied on low latency anonymization instead of cryptographically secure PIR make all the choices for you.