Tor security advisory: "relay early" traffic confirmation attack

This advisory was posted on the tor-announce mailing list.

SUMMARY:

On July 4 2014 we found a group of relays that we assume were trying to deanonymize users. They appear to have been targeting people who operate or access Tor hidden services. The attack involved modifying Tor protocol headers to do traffic confirmation attacks.

The attacking relays joined the network on January 30 2014, and we removed them from the network on July 4. While we don't know when they started doing the attack, users who operated or accessed hidden services from early February through July 4 should assume they were affected.

Unfortunately, it's still unclear what "affected" includes. We know the attack looked for users who fetched hidden service descriptors, but the attackers likely were not able to see any application-level traffic (e.g. what pages were loaded or even whether users visited the hidden service they looked up). The attack probably also tried to learn who published hidden service descriptors, which would allow the attackers to learn the location of that hidden service. In theory the attack could also be used to link users to their destinations on normal Tor circuits too, but we found no evidence that the attackers operated any exit relays, making this attack less likely. And finally, we don't know how much data the attackers kept, and due to the way the attack was deployed (more details below), their protocol header modifications might have aided other attackers in deanonymizing users too.

Relays should upgrade to a recent Tor release (0.2.4.23 or 0.2.5.6-alpha), to close the particular protocol vulnerability the attackers used — but remember that preventing traffic confirmation in general remains an open research problem. Clients that upgrade (once new Tor Browser releases are ready) will take another step towards limiting the number of entry guards that are in a position to see their traffic, thus reducing the damage from future attacks like this one. Hidden service operators should consider changing the location of their hidden service.

THE TECHNICAL DETAILS:

We believe they used a combination of two classes of attacks: a traffic confirmation attack and a Sybil attack.

A traffic confirmation attack is possible when the attacker controls or observes the relays on both ends of a Tor circuit and then compares traffic timing, volume, or other characteristics to conclude that the two relays are indeed on the same circuit. If the first relay in the circuit (called the "entry guard") knows the IP address of the user, and the last relay in the circuit knows the resource or destination she is accessing, then together they can deanonymize her. You can read more about traffic confirmation attacks, including pointers to many research papers, at this blog post from 2009:
https://blog.torproject.org/blog/one-cell-enough
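
To make the timing-comparison idea concrete, here is a minimal sketch (illustrative only, not any real attack code) of how an observer at both ends of a circuit could correlate inter-cell timings; every name and number in it is invented for the example:

```python
import random

def pearson(xs, ys):
    # Plain Pearson correlation coefficient, computed by hand.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

random.seed(7)

# Inter-cell gaps (seconds) observed at the entry guard for one circuit.
guard_gaps = [random.expovariate(10) for _ in range(200)]

# The same traffic seen at the far end: identical pattern plus small network jitter.
far_end_gaps = [g + random.gauss(0, 0.001) for g in guard_gaps]

# An unrelated circuit's traffic, for comparison.
other_gaps = [random.expovariate(10) for _ in range(200)]

print(pearson(guard_gaps, far_end_gaps))  # close to 1.0: same circuit
print(pearson(guard_gaps, other_gaps))    # near 0: different circuits
```

Even with jitter, traffic from the same circuit correlates almost perfectly while an unrelated circuit does not, which is part of why padding and other statistical defenses remain such a hard open problem.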

The particular confirmation attack they used was an active attack where the relay on one end injects a signal into the Tor protocol headers, and then the relay on the other end reads the signal. These attacking relays were stable enough to get the HSDir ("suitable for hidden service directory") and Guard ("suitable for being an entry guard") consensus flags. Then they injected the signal whenever they were used as a hidden service directory, and looked for an injected signal whenever they were used as an entry guard.

The way they injected the signal was by sending sequences of "relay" vs "relay early" commands down the circuit, to encode the message they want to send. For background, Tor has two types of cells: link cells, which are intended for the adjacent relay in the circuit, and relay cells, which are passed to the other end of the circuit. In 2008 we added a new kind of relay cell, called a "relay early" cell, which is used to prevent people from building very long paths in the Tor network. (Very long paths can be used to induce congestion and aid in breaking anonymity). But the fix for infinite-length paths introduced a problem with accessing hidden services, and one of the side effects of our fix for bug 1038 was that while we limit the number of outbound (away from the client) "relay early" cells on a circuit, we don't limit the number of inbound (towards the client) relay early cells.
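
A minimal sketch of the asymmetry described above, and of the obvious client-side check (the names, the cap value, and the exact behavior here are illustrative, not Tor's actual source):

```python
RELAY = "relay"
RELAY_EARLY = "relay_early"
MAX_OUTBOUND_RELAY_EARLY = 8  # illustrative cap on outbound relay_early cells

class Circuit:
    def __init__(self):
        self.outbound_early = 0
        self.closed = False
        self.warnings = []

    def send_cell(self, command):
        # Outbound (away from the client): relay_early is allowed, but only a
        # limited number of times, to stop infinite-length paths.
        if command == RELAY_EARLY:
            self.outbound_early += 1
            if self.outbound_early > MAX_OUTBOUND_RELAY_EARLY:
                self.closed = True

    def receive_cell(self, command):
        # Inbound (towards the client): relay_early should never appear, so
        # any occurrence is treated as a tagging attempt and the circuit
        # is rejected. The log phrase matches the one quoted in this advisory.
        if command == RELAY_EARLY:
            self.warnings.append("Received an inbound RELAY_EARLY cell")
            self.closed = True

circ = Circuit()
circ.receive_cell(RELAY)
print(circ.closed)  # normal relay cell: circuit stays open
circ.receive_cell(RELAY_EARLY)
print(circ.closed)  # tagged cell: circuit is rejected
```

The fix in 0.2.4.23 and 0.2.5.6-alpha closes the gap by treating inbound relay-early cells as the protocol violation they are, rather than silently passing them along.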

So in summary, when Tor clients contacted an attacking relay in its role as a Hidden Service Directory to publish or retrieve a hidden service descriptor (steps 2 and 3 on the hidden service protocol diagrams), that relay would send the hidden service name (encoded as a pattern of relay and relay-early cells) back down the circuit. Other attacking relays, when they get chosen for the first hop of a circuit, would look for inbound relay-early cells (since nobody else sends them) and would thus learn which clients requested information about a hidden service.
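
As an illustration of the encoding idea (not the attackers' actual code), here is how a hidden service name could be turned into a pattern of relay vs relay-early cells by the HSDir and read back by a colluding guard; the onion address is just an example:

```python
RELAY, RELAY_EARLY = "relay", "relay_early"

def encode_signal(onion_name):
    """Turn each bit of the name into a cell command: 1 -> relay_early, 0 -> relay."""
    cells = []
    for byte in onion_name.encode("ascii"):
        for i in range(8):
            bit = (byte >> (7 - i)) & 1
            cells.append(RELAY_EARLY if bit else RELAY)
    return cells

def decode_signal(cells):
    """Reassemble the observed cell-type pattern back into the name."""
    bits = [1 if c == RELAY_EARLY else 0 for c in cells]
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return out.decode("ascii")

cells = encode_signal("duskgytldkxiuqc6.onion")
print(decode_signal(cells))  # the guard recovers the name the HSDir injected
```

Note that the decoder needs nothing but the sequence of cell commands, which is exactly what any relay on the path can see.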

There are three important points about this attack:

A) The attacker encoded the name of the hidden service in the injected signal (as opposed to, say, sending a random number and keeping a local list mapping random number to hidden service name). The encoded signal is encrypted as it is sent over the TLS channel between relays. However, this signal would be easy to read and interpret by anybody who runs a relay and receives the encoded traffic. And we might also worry about a global adversary (e.g. a large intelligence agency) that records Internet traffic at the entry guards and then tries to break Tor's link encryption. The way this attack was performed weakens Tor's anonymity against these other potential attackers too — either while it was happening or after the fact if they have traffic logs. So if the attack was a research project (i.e. not intentionally malicious), it was deployed in an irresponsible way because it puts users at risk indefinitely into the future.

(This concern is in addition to the general issue that it's probably unwise from a legal perspective for researchers to attack real users by modifying their traffic on one end and wiretapping it on the other. Tools like Shadow are great for testing Tor research ideas out in the lab.)

B) This protocol header signal injection attack is actually pretty neat from a research perspective, in that it's a bit different from previous tagging attacks which targeted the application-level payload. Previous tagging attacks modified the payload at the entry guard, and then looked for a modified payload at the exit relay (which can see the decrypted payload). Those attacks don't work in the other direction (from the exit relay back towards the client), because the payload is still encrypted at the entry guard. But because this new approach modifies ("tags") the cell headers rather than the payload, every relay in the path can see the tag.

C) We should remind readers that while this particular variant of the traffic confirmation attack allows high-confidence and efficient correlation, the general class of passive (statistical) traffic confirmation attacks remains unsolved and would likely have worked just fine here. So the good news is traffic confirmation attacks aren't new or surprising, but the bad news is that they still work. See https://blog.torproject.org/blog/one-cell-enough for more discussion.

Then the second class of attack they used, in conjunction with their traffic confirmation attack, was a standard Sybil attack — they signed up around 115 fast non-exit relays, all running on 50.7.0.0/16 or 204.45.0.0/16. Together these relays summed to about 6.4% of the Guard capacity in the network. Then, in part because of our current guard rotation parameters, these relays became entry guards for a significant chunk of users over their five months of operation.
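
For a sense of scale, a quick back-of-envelope calculation (assuming, for illustration, that clients pick each of their three guards independently and in proportion to capacity):

```python
# Estimate how often a client with three entry guards would pick at least
# one attacker-controlled guard, given the ~6.4% figure from the advisory.
malicious_guard_fraction = 0.064
num_guards = 3

p_all_honest = (1 - malicious_guard_fraction) ** num_guards
p_at_least_one_bad = 1 - p_all_honest
print(round(p_at_least_one_bad, 3))  # roughly 0.18
```

Under these assumptions, roughly one client in five would have ended up with at least one attacker-controlled entry guard.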

We actually noticed these relays when they joined the network, since the DocTor scanner reported them. We considered the set of new relays at the time, and made a decision that it wasn't that large a fraction of the network. It's clear there's room for improvement in terms of how to let the Tor network grow while also ensuring we maintain social connections with the operators of all large groups of relays. (In general having a widely diverse set of relay locations and relay operators, yet not allowing any bad relays in, seems like a hard problem; on the other hand our detection scripts did notice them in this case, so there's hope for a better solution here.)

In response, we've taken the following short-term steps:

1) Removed the attacking relays from the network.

2) Put out a software update for relays to prevent "relay early" cells from being used this way.

3) Put out a software update that will (once enough clients have upgraded) let us tell clients to move to using one entry guard rather than three, to reduce exposure to relays over time.

4) Clients can tell whether they've received a relay or relay-early cell. For expert users, the new Tor version warns you in your logs if a relay on your path injects any relay-early cells: look for the phrase "Received an inbound RELAY_EARLY cell".
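
For those who want to check, here is a small illustrative script for scanning a Tor log for that phrase (the sample log line below is made up for the example; real log lines will differ in detail):

```python
import io

WARNING = "Received an inbound RELAY_EARLY cell"

def find_tag_warnings(log_lines):
    # Return every log line containing the relay-early warning phrase.
    return [line.rstrip("\n") for line in log_lines if WARNING in line]

# Stand-in for open("/var/log/tor/notices.log") with fabricated example lines.
sample_log = io.StringIO(
    "Aug 09 21:30:17.000 [notice] Bootstrapped 100%: Done\n"
    "Aug 09 21:31:02.000 [warn] Received an inbound RELAY_EARLY cell "
    "on circuit 12345. Closing circuit.\n"
)
hits = find_tag_warnings(sample_log)
print(len(hits))  # 1: one circuit saw an injected relay-early cell
```

Seeing this warning means some relay on your path injected a relay-early cell toward you; it does not by itself tell you which relay did it.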

The following longer-term research areas remain:

5) Further growing the Tor network and diversity of relay operators, which will reduce the impact from an adversary of a given size.

6) Exploring better mechanisms, e.g. social connections, to limit the impact from a malicious set of relays. We've also formed a group to pay more attention to suspicious relays in the network:
https://blog.torproject.org/blog/how-report-bad-relays

7) Further reducing exposure to guards over time, perhaps by extending the guard rotation lifetime:
https://blog.torproject.org/blog/lifecycle-of-a-new-relay
https://blog.torproject.org/blog/improving-tors-anonymity-changing-guar…

8) Better understanding statistical traffic correlation attacks and whether padding or other approaches can mitigate them.

9) Improving the hidden service design, including making it harder for relays serving as hidden service directory points to learn what hidden service address they're handling:
https://blog.torproject.org/blog/hidden-services-need-some-love

OPEN QUESTIONS:

Q1) Was this the Black Hat 2014 talk that got canceled recently?
Q2) Did we find all the malicious relays?
Q3) Did the malicious relays inject the signal at any points besides the HSDir position?
Q4) What data did the attackers keep, and are they going to destroy it? How have they protected the data (if any) while storing it?

Great questions. We spent several months trying to extract information from the researchers who were going to give the Black Hat talk, and eventually we did get some hints from them about how "relay early" cells could be used for traffic confirmation attacks, which is how we started looking for the attacks in the wild. They haven't answered our emails lately, so we don't know for sure, but it seems likely that the answer to Q1 is "yes". In fact, we hope they *were* the ones doing the attacks, since otherwise it means somebody else was. We don't yet know the answers to Q2, Q3, or Q4.

"The smart people chained VPNs to Tor,"

1.) What makes you sure that obtaining a given user's real IP from even multiple VPNs is beyond the ability of a TLA-grade adversary?

2.) Assuming 1.) is entirely plausible, one might think that using VPNs in addition to Tor would offer no realistic /benefit/ for anonymity, but at the same time would not be likely to add any considerable /risk/ either.

This is not necessarily the case, however. For, as has been pointed out already, using a VPN or any other additional proxy with Tor increases one's attack surface and creates an effective exit node that sees all of one's traffic.

"or modified Tor to use a single entry guard that doesn't rotate,"

Wait... Wouldn't using a single, static entry guard only help if said entry guard could somehow be known to be benign? If the entry guard is randomly chosen, however, doesn't this mean that one risks getting stuck with a malicious entry guard?

In contrast, if a different entry guard is chosen each time, one is at least unlikely to get a malicious one every time.

"or used [public] WiFi in addition to Tor."

I am not aware of anyone claiming that this cannot, at times, offer advantages.

But using public WiFi can also pose serious risks that should not be dismissed. These include having one's presence logged or noted (whether by camera or live person) and various risks that the network itself can present.


hunter2

August 08, 2014

Is there any information on:

Q4) What data did the attackers keep, and are they going to destroy it? How have they protected the data (if any) while storing it?

This could potentially destroy many people's lives.

hunter2

August 09, 2014

Lately, when browsing WWW pages using Tor I have been getting strange download attempts that Tor browser warns about. This has happened about 3-6 times.

The last one was when I loaded up a page and I guess the page had some youtube content.

Youtube.com tried to download a file with a random ASCII filename to my computer when loading that other page.

It seems to me that if I had a different OS and maybe a different browser, this could have installed malware on my computer.

This sounds like a 'drive-by download' attack to me?

It's possible it's an attack, but I think it's much more likely that it's a partial page that Firefox misinterpreted. Firefox (Tor Browser) didn't know the right Content-Type of the page (e.g. a jpg of size 0), so Firefox (Tor Browser) decided you probably wanted to download it rather than render it.

hunter2

August 09, 2014

Well, it's not solved; it happened to me today. You have to get rid of all the Swedish nodes if you're going to solve this... All Tor users should know by now... GB, Sweden, USA. Don't forget that Sweden supported Tor with nodes throughout the Arab Spring.

hunter2

August 10, 2014

Given that we really don't know what's going to happen with the collected data at the moment, are there any measures affected users could take at this stage to limit their vulnerability?

If they do receive the data, how likely are agencies to act on it? Is there some obscure part of US law that would make it difficult for agencies to use this data (eg. difficult to use data collected in an academic study in court)? I realize that these are the questions at hand but it would be nice to get a few practical damage control suggestions in one place.

Well, we don't even know that there is any collected data. I think there's a good chance that the researchers were planning to win fame and admiration at black hat, rather than planning to be an arm of the feds.

If you're running a hidden service, relocating it might be a good idea. But mostly, if there is any data, there isn't much to be done about it now.

And as for whether the feds can use this data (if any) in court, it's hard to say without seeing the data, but ultimately it doesn't matter -- if they can't, they would use it to go gather data that they *can* use.

(Oh, and remember, there are many countries out there. It's easy to take a US-centric perspective here.)

No, they can't. At least in some countries where privacy protection is taken seriously. But we don't know what data they even have. Only addresses and timestamps? That would not be enough to do any harm (see arma's original post). And other data would also not be of any concern, because proving you really visited a particular hidden service would be hard. A malicious attacker could write anything into an HSDir record; looking at the database alone proves nothing about what actually happened.

And in some countries of the EU, traffic data must be deleted after only a few days.

I am not writing this out of any hope. Actually, I gave up looking at hidden services by the end of January because there were too many "strange" sites around.

EU privacy protection does not apply to police and judicial cooperation in criminal matters. It does not matter whether the US researchers may have violated some US law with their active surveillance; as far as I know, their data is admissible evidence in the criminal courts of EU member states.

Which countries do you mean? Germany bought stolen bank data to find tax evaders. I would like to hear of any EU countries that are an exception. Maybe Denmark?

hunter2

August 10, 2014

i keep seeing this in the log on normal sites; is this an attack?
[NOTICE] Pluggable transport proxy (flashproxy exec Tor\PluggableTransports\flashproxy-client --register :0 :9000) does not provide any needed transports and will not be launched.
8/9/2014 21:30:17 PM.879 [NOTICE] Pluggable transport proxy (fte exec Tor\PluggableTransports\fteproxy --managed) does not provide any needed transports and will not be launched.
8/9/2014 21:30:17 PM.879 [NOTICE] Pluggable transport proxy (obfs2,obfs3 exec Tor\PluggableTransports\obfsproxy managed) does not provide any needed transports and will not be launched.
8/9/2014 21:30:17 PM.879 [NOTICE] Pluggable transport proxy (flashproxy exec Tor\PluggableTransports\flashproxy-client --register :0 :9000) does not provide any needed transports and will not be launched.
[WARN] Rejecting SOCKS request for anonymous connection to private address [scrubbed].

hunter2

August 11, 2014

When do we expect CERT, SEI and CMU to give a formal response?

The DoD was quick to give its response on Aug 5, stating that it did not receive personal data on users of the Internet privacy service Tor through a government-funded project to detect vulnerabilities.

How did the DoD know that the NSA did not receive the data but not the FBI?

Why would we expect them ever to? It seems like if they wanted to say something, they'd have done it by now. I assume they're trying to lie low because they know they screwed up, and they figure the best way to not make it worse is to stay quiet.

As for why the DoD spokesperson could talk about the NSA -- the NSA is part of the DoD. Whereas the FBI is part of the DoJ.

hunter2

August 12, 2014

Wasn't one of the leaks suggesting that the NSA had seeded PRISM-prone Tor via MIT and a few other places by around 12/2010? That gave them (SOD) 13 months of testing; by 2012 the federal dragnet grabbed carders, Silk Road, Lolita City, a few small drug and porn sites, and Anonymous.

hunter2

August 15, 2014

i found tor... by an article... my curiosity... made me look... to read what tor is....
i have to say... that i don't know shit about computers... but i found the language really attractive (all chinese to me)
give me a good reason... why tor? do i need it?..

...i found truly wonderful the post about free learning in computing........

hunter2

August 18, 2014

Are future versions of Tor Browser Bundle and Tails going to move to less than 3 guards? ...and keep them longer?

Or will we need to hack the configuration to make the changes ourselves?

"Are future versions of Tor Browser Bundle and Tails going to move to less than 3 guards? ...and keep them longer?"

Wouldn't that only help if there were a way to authenticate said guards as being non-malicious?

Absent that, wouldn't using fewer guards and rotating them less often only make things /worse/?

Think about it: at least if you keep rotating between many guards, your odds of being stuck with a rogue one for any given length of time are lower.
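
One way to test this intuition with rough, purely illustrative numbers: what matters is whether your traffic is *ever* exposed to a malicious guard, since a single bad guard period can be enough to deanonymize a circuit's endpoints:

```python
# Toy comparison (all values invented for illustration): one static guard
# versus rotating to a fresh guard every month for a year.
malicious_fraction = 0.05  # assume 5% of guard capacity is hostile

# One static guard, never rotated: exposed only if that single pick is bad.
p_static = malicious_fraction

# Monthly rotation for a year: each rotation is a fresh independent draw.
rotations = 12
p_rotating = 1 - (1 - malicious_fraction) ** rotations

print(round(p_static, 2))    # 0.05
print(round(p_rotating, 2))  # about 0.46
```

Under these toy assumptions, rotating makes it far more likely that *some* period of your traffic passes through a bad guard, which is the reasoning behind the advisory's move toward fewer, longer-lived entry guards.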

hunter2

August 22, 2014

It appears that much of the trouble has to do with corrupt relays. Perhaps there is a method for reliably interrogating all relays to determine whether they are trustworthy. Perhaps this could be carried out via the Tor Browser package, thus distributing the burden of this process.

Also, a very restrictive communication protocol could be established such that any alteration or tampering with the data is detected and the rogue relay responsible automatically removed from the Tor network.