sjmurdoch's blog

Communications Data Bill Committee publishes report

The UK parliamentary committee considering the Draft Communications Data Bill, to which I gave evidence on behalf of The Tor Project, has now published their report (PDF version). The committee is highly critical of the draft bill, and calls for the government to consult technical experts, industry, law enforcement bodies, public authorities and civil liberties groups before re-drafting the proposed legislation. The committee's recommendations for this revised bill are summarised in section 8 of the report.

Tor is not explicitly dealt with by the report, other than a note that systems which use encryption, including Tor, will pose a problem (paragraph 99) for proposals to ask communications service providers to record third parties' data traversing their networks. The report does however address numerous points which were raised in The Tor Project's written and oral evidence (and in other organisations' submissions), including the over-broad powers the bill would grant, the sensitivity of “communications data”, the limited oversight, the challenge of storing sensitive data securely, and the rather dubious cost/benefit justification.

The committee has now completed its duties, and has consequently disbanded. The committee's report will now be considered by the government and we will be very interested in their response.

The Tor Project's position on the draft Communications Data Bill

The UK government has proposed a new bill which would allow UK law enforcement agencies to require that "telecommunication operators" (e.g. ISPs and website operators) intercept and record their users' traffic data (i.e. details of who is communicating with whom, when, from where, and how much, but not the content of communications). The draft of this bill, the Communications Data Bill (dubbed the "Snoopers' Charter" by some), has been published and has met widespread criticism for the unprecedented intrusion into privacy it would permit.

The impact on Tor is less than some have feared, because it is likely that The Tor Project is not a telecommunication operator for the purposes of the bill (because the nodes which carry data are not run by The Tor Project) and Tor's distributed architecture reduces the harm which may be caused by the compromise of traffic data. However, the proposed bill is still bad for privacy, especially for users of systems which don't offer the same protections as Tor, so I submitted written evidence to the parliamentary committee investigating the bill, on behalf of the Tor Project.

Our submission gave an introduction to Tor, how it works, and how it is used, and in particular how important Tor is for maintaining the safety of human rights activists working under repressive regimes. The submission also discussed the risks of the proposed bill, especially the harm which would result if the traffic data collected were compromised (as happened to Google) or if interception equipment installed to comply with the bill were enabled without authorization (as happened to Vodafone in Greece).

Our submission has been published along with the others, and I was also invited to give evidence to the committee in person. The transcript of this session has been published with some minor redactions requested by other companies presenting evidence, but none of my answers have been redacted. Further information about the activities of this committee, including transcripts of other sessions, can be found on the committee's page.

Based on the discussions which have taken place, it appears that the committee has serious reservations about the bill but we will not know for sure until the committee publishes their report, expected within a few weeks. Efforts to campaign against the bill continue, particularly by the Open Rights Group.

Top changes in Tor since the 2004 design paper (Part 3)

In this third and final installment of Nick Mathewson and Steven Murdoch's blog series (previously part 1 and part 2) we discuss how Tor has made its traffic harder to fingerprint, as well as usability and security improvements to how users interact with Tor.

Protecting bridge operators from probing attacks

Although Tor was originally designed to enhance privacy, the network is increasingly being used to provide access to websites from countries where Internet connectivity is filtered. To better support this goal, Tor introduced the feature of bridge nodes – Tor nodes which route traffic but are not published in the list of available Tor nodes. Consequently, bridges are commonly not blocked in countries where access to the public Tor nodes is blocked (with the exception of China and Iran).

For bridge nodes to work well, we need lots of them. This is both to make it hard to block them all, and also to provide enough bandwidth capacity for all their users. To help achieve this goal, the Tor Browser Bundle now makes it as easy as clicking a button to turn a normal Tor client into a bridge. The upcoming version of Tor will even set up port forwarding on home routers (through UPnP and NAT-PMP) to make it easier still.
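
To give a flavour of what "becoming a bridge" amounts to in configuration terms, here is a minimal sketch using the stem control-port library. It assumes a local Tor with ControlPort 9051 enabled and stem installed; the option names are standard torrc options, but check the manual for your Tor version before relying on them.

    # Sketch: turning a running Tor client into a (non-exit) bridge via the control
    # port, using the stem library. Assumes ControlPort 9051 with cookie authentication.
    from stem.control import Controller

    with Controller.from_port(port=9051) as controller:
        controller.authenticate()  # reads the control auth cookie by default
        controller.set_options({
            'BridgeRelay': '1',                    # act as a bridge, not a public relay
            'ORPort': 'auto',                      # let Tor pick a port for incoming connections
            'ExitPolicy': 'reject *:*',            # only carry traffic within the Tor network
            'PublishServerDescriptor': 'bridge',   # report to the bridge authority, not the public directory
        })
        controller.save_conf()                     # persist the change to the torrc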

There are, however, potential risks to Tor users if they turn their client into a bridge. These were summarized in a paper by Jon McLachlan and Nicholas Hopper: "On the risks of serving whenever you surf". I discussed these risks in a previous blog post, along with ways to mitigate them. In this blog post I'll cover some initiatives the Tor project is working on (or considering), which could further reduce the risks.

The attack that McLachlan and Hopper propose involves finding a list of bridges, and then probing them over time. If the bridge responds at all, then we know that the bridge operator could be surfing the web through Tor. So, if the attacker suspects that a blogger is posting through Tor and the blogger's Tor client is also a bridge, then only the bridges which are running at the time a blog post was written could belong to the blogger. There are likely to be quite a few such bridges, but if the attack is repeated, the set of potential bridge IP addresses will be narrowed down.
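
To make the narrowing-down concrete, here is a toy illustration (not taken from the paper, and with made-up addresses): each time the blogger posts, only the bridges reachable at that moment remain plausible candidates, and intersecting the observations quickly shrinks the set.

    # Hypothetical scan results: bridges that answered probes at each posting time.
    post_time_scans = [
        {'10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.4'},  # bridges up when post 1 appeared
        {'10.0.0.2', '10.0.0.3', '10.0.0.5'},              # bridges up when post 2 appeared
        {'10.0.0.2', '10.0.0.6'},                          # bridges up when post 3 appeared
    ]

    candidates = set.intersection(*post_time_scans)
    print(candidates)  # {'10.0.0.2'} -- repeated probing narrows the candidate set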

What is more, the paper also proposes measuring how quickly each bridge responds, which might tell the attacker something about how heavily the bridge operator is using their Tor client. If the attacker probes the rest of the Tor network, and sees a node with performance patterns similar to the bridge being probed, then it is likely that there is a connection going through that node and that bridge. In some circumstances this attack could be used to track connections all the way through the Tor network, and back to a client who is also acting as a bridge.
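
The kind of correlation the attacker looks for can be illustrated with a toy example (the latency figures below are invented): if probe round-trip times to the bridge and to some public relay rise and fall together, that is evidence that traffic through the bridge also flows through that relay. The sketch uses statistics.correlation, which needs Python 3.10 or later.

    from statistics import correlation  # available in Python 3.10 and later

    # Invented probe round-trip times, sampled at the same instants (milliseconds).
    bridge_latency_ms = [40, 42, 95, 90, 41, 88, 43]     # probes of the suspected bridge
    relay_latency_ms  = [55, 54, 120, 115, 56, 110, 57]  # probes of a public Tor relay

    # A coefficient near 1.0 suggests the two are loaded at the same times.
    print(correlation(bridge_latency_ms, relay_latency_ms))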

One way of reducing the risk of these attacks succeeding is to run the bridge on a device which is always on. In this case, the mere fact that the bridge is running doesn't mean the operator is using Tor, so less information is leaked to a potential attacker. As laptops become more prevalent, keeping a bridge on 24/7 can be inconvenient, so the Tor Project is working on two pieces of hardware (known as Torouters) which can be left running a bridge even when the user's computer is off. Someone probing such a bridge can't learn much about whether the operator is using Tor as a client, and so learns less about what the user is doing.

Having an always-on bridge helps resist profiling based on when a client is online. However, evaluating the risk of profiling based on bridge performance is more complex. The reason this attack works is that when a node is congested, increasing traffic on one connection (say, due to the bridge operator using Tor) decreases the traffic on another (say, the attacker probing the bridge). If you run a Tor client on your PC and a Tor bridge on a separate device, there is less opportunity for congestion on one to affect the other. However, the bridge device and the PC may still share a network connection which can become congested, and maybe you do want to use the bridge device as a client too. So the Torouter helps, but doesn't fix the problem.

To fully decouple bridges from clients, they need to run on different hardware and networks. Here the Tor Cloud project is ideal – bridges run on Amazon's (or another cloud provider's) infrastructure, so congestion on the bridge can't affect the Tor client running on your PC, or its network.

But maybe you do want to use your bridge as a client for whatever reason. Recall that the first step of the attack proposed by McLachlan and Hopper is to find the bridges' IP addresses. This is becoming increasingly difficult as more bridges are being set up. There is also work in progress to make it harder to scan networks for bridges. Originally this was intended to make it harder to find and block bridges, but it applies equally well to preventing probing. These schemes (e.g. BridgeSPA by Smits et al.) mean that having a bridge's IP address is not enough to route traffic through it (to measure performance) or even to check whether the bridge is running. To do either, you need a secret which is distributed only to authorized users of the bridge.
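
To give a flavour of the idea, here is a generic sketch of secret-based probe authorization (this is not the actual BridgeSPA protocol, and the secret and window length are invented): the bridge only answers probes that carry a code derived from a shared secret and the current time, so a scanner who knows only the IP address learns nothing.

    import hashlib
    import hmac
    import time

    SECRET = b'handed-out-with-the-bridge-address'  # hypothetical shared secret

    def make_probe_token(now=None):
        # Authorized clients compute a MAC over a 30-second time window.
        window = int(now if now is not None else time.time()) // 30
        return hmac.new(SECRET, str(window).encode(), hashlib.sha256).digest()

    def bridge_accepts(token, now=None):
        # The bridge accepts tokens for the current or previous window,
        # using a constant-time comparison.
        window = int(now if now is not None else time.time()) // 30
        return any(
            hmac.compare_digest(
                token, hmac.new(SECRET, str(w).encode(), hashlib.sha256).digest())
            for w in (window, window - 1))

    print(bridge_accepts(make_probe_token()))  # True for an authorized user
    print(bridge_accepts(b'\x00' * 32))        # False for a blind scanner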

There is still more work to be done in protecting bridge operators, but the projects discussed above (and in the previous blog post) certainly improve the situation. Work continues both in academia and at the Tor Project to better understand the problem and develop further fixes.

Moving Tor to a datagram transport

Tor currently transports data over encrypted TLS tunnels between nodes, in turn carried by TCP. This has worked reasonably well, but recent research has shown that it might not be the best option, particularly for performance.

For example, when a packet gets dropped or corrupted on a link between two Tor nodes, TCP will cause the packet to be retransmitted eventually. However, in the meantime, all circuits passing through this pair of nodes will be stalled, not only the circuit corresponding to the packet which was dropped.

Also, Tor uses two levels of congestion control: TCP for hop-by-hop links, and a custom scheme for circuits. This might not be the right approach – maybe these schemes aren't the right ones to use, or maybe there should only be one level of congestion control.

There have been a variety of solutions proposed to fix one or both of these problems. Most end up sending data as datagrams in UDP packets, making Tor responsible for managing congestion and recovering from packet loss, so it can (hopefully) do so in a more intelligent way. However, there are many options for what architecture to use, what building blocks to build these schemes from, and how to tweak the many parameters that result.
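
As a toy sketch of why a datagram transport helps (this illustrates the general idea, not any specific proposal): if each cell carries its own circuit identifier and per-circuit sequence number, a lost packet only needs retransmission on its own circuit, instead of stalling every circuit multiplexed over the same TCP connection.

    import struct

    HEADER = struct.Struct('!IH')  # circuit id (32 bits) + per-circuit sequence number (16 bits)

    def pack_cell(circ_id, seq, payload):
        return HEADER.pack(circ_id, seq) + payload

    def unpack_cell(datagram):
        circ_id, seq = HEADER.unpack_from(datagram)
        return circ_id, seq, datagram[HEADER.size:]

    # The receiver keeps independent reassembly state per circuit, so a gap in
    # circuit 7's sequence numbers does not delay delivery on circuit 8.
    received = {7: {}, 8: {}}
    for dg in [pack_cell(7, 0, b'a'), pack_cell(8, 0, b'x'), pack_cell(7, 2, b'c')]:
        circ, seq, data = unpack_cell(dg)
        received[circ][seq] = data

    print(sorted(received[8]))  # circuit 8 is complete even though circuit 7 is missing cell 1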

To help clarify the various options, I've written a summary of both Tor's current network stack architecture and various proposals for how it can be improved. The document discusses various tradeoffs between the approaches, including performance, security, and engineering difficulties in deploying the solution. It ends with some provisional conclusions about which options are most promising.

The document, “Comparison of Tor Datagram Designs”, is now available (with source in git), and I would welcome comments. Over the next few months I'll be working on building and evaluating prototypes for these schemes, and I will be incorporating feedback from the Tor community into the process.

On the risks of serving whenever you surf

Bridge nodes are one of Tor's key architectural components for allowing wide access to the network. These act like normal Tor nodes, except there is no centralized list available to download, so it's harder to block access to all of them. Users who cannot access the Tor network in the normal way can find the IP addresses of a few bridges, and connect to the rest of the Tor network via these nodes. The bridge node IP addresses are distributed in a way such that anyone should be able to find a few, but it should be difficult for someone to find (and block access to) them all. Currently they are available by email or the web, but more strategies are being considered, such as instant messaging or MMORPGs.

Media coverage of "Covert channel vulnerabilities in anonymity systems"

Over the past few days there has been some coverage of my PhD thesis, and its relationship to Tor, on blogs and online news sites. It seems this wave started with a column by Russ Cooper, which triggered articles in PC World and Dark Reading. The media attention came as a bit of a surprise to me, since nobody asked to interview me about this. I'd encourage other journalists writing about Tor to contact someone from the project, as we're happy to help provide some context.

My thesis is a fairly diverse collection of work, but the articles emphasize the impact of the attacks I discuss on users of anonymity networks like Tor. Actually, my thesis doesn't aim to show that Tor is insecure; the reason I selected Tor as a test case was that it's one of the few (and by far the largest) low-latency systems that aim to stand up to observation. Other, simpler, systems have comparatively well understood weaknesses, and so there is less value in researching them.

Quantifying the security of anonymity systems is a difficult problem, and one which is still being actively worked on. Comparing different systems is even harder, since they make different assumptions about the capabilities of attackers (the “threat model”). The mere existence of attacks doesn't show that a system is insecure, since the attacks might make assumptions about the environment that are not met, or might be insufficiently reliable for the scenario being considered.

The actual goal of my thesis was to try to better understand the strengths and weaknesses of systems like Tor, but more importantly also to suggest a more general methodology for discovering and resolving flaws. I proposed that work from the well-established field of covert channels could be usefully applied, and used examples, including Tor, to justify this.

There remains much work to be done before it's possible to be sure how secure anonymity systems are, but hopefully this framework will be a useful one in moving forward. Since I joined the Tor Project in September 2007, I hope I'll also be able to help in other ways.
