Moving Tor to a datagram transport

by sjmurdoch | November 7, 2011

Tor currently transports data over encrypted TLS tunnels between nodes, in turn carried by TCP. This has worked reasonably well, but recent research has shown that it might not be the best option, particularly for performance.

For example, when a packet gets dropped or corrupted on a link between two Tor nodes, TCP will cause the packet to be retransmitted eventually. However, in the meantime, all circuits passing through this pair of nodes will be stalled, not only the circuit corresponding to the packet which was dropped.
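As a rough illustration of this stall, here is a toy model (not Tor code) of cells from two circuits sharing one link. The function name `delivery_times`, the cell timings, and the retransmission delay are all made up for the example. With a single in-order stream, as with TCP, every cell behind the lost one waits for the retransmission. If ordering is enforced per circuit only, as a datagram design could do, the other circuit keeps flowing.

```python
def delivery_times(cells, lost_index, rtx_delay, shared_stream):
    """Return the time at which each cell is handed to its circuit.

    cells         -- list of (send_time, circuit_id), in sending order
    lost_index    -- index of the one cell that is lost and retransmitted
    rtx_delay     -- extra delay (hypothetical) before the lost cell arrives
    shared_stream -- True: one in-order stream for the whole link (TCP-like)
                     False: ordering is enforced per circuit only
    """
    latest = {}      # latest delivery time seen in each ordering domain
    delivered = []
    for i, (send_time, circuit) in enumerate(cells):
        arrival = send_time + (rtx_delay if i == lost_index else 0)
        domain = "link" if shared_stream else circuit
        # In-order delivery: a cell cannot overtake earlier cells in its domain.
        when = max(arrival, latest.get(domain, 0))
        latest[domain] = when
        delivered.append(when)
    return delivered


cells = [(0, "A"), (1, "B"), (2, "A"), (3, "B")]
print(delivery_times(cells, lost_index=0, rtx_delay=10, shared_stream=True))
print(delivery_times(cells, lost_index=0, rtx_delay=10, shared_stream=False))
```

In the shared-stream case circuit B's cells are held back even though none of them were lost; in the per-circuit case only circuit A is delayed.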

Also, Tor uses two levels of congestion control: TCP for hop-by-hop links, and a custom scheme for circuits. This might not be the right approach -- maybe these schemes aren't the right ones to use, or maybe there should be only one level of congestion control.

A variety of solutions have been proposed to fix one or both of these problems. Most end up sending data as datagrams in UDP packets, leaving Tor responsible for managing congestion and recovering from packet loss, so that it can (hopefully) do so in a more intelligent way. However, there are many options for what architecture to use, what building blocks to build these schemes from, and how to tweak the many parameters that result.

To help clarify the various options, I've written a summary of both Tor's current network stack architecture and various proposals for how it can be improved. The document discusses various tradeoffs between the approaches, including performance, security, and engineering difficulties in deploying the solution. It ends with some provisional conclusions about which options are most promising.

The document, “Comparison of Tor Datagram Designs”, is now available (with source in git), and I would welcome comments. Over the next few months I'll be working on building and evaluating prototypes for these schemes, and I will be incorporating feedback from the Tor community into the process.

Comments

Please note that the comment area below has been archived.

Please have a chat with Bram Cohen for his experiences designing robust high-throughput datagram transports.

-andy

Indeed, I have chatted with Bram and he had a number of good suggestions which I have yet to integrate into the report.

If Tor also supported UDP, it would be able to exploit UDP hole punching techniques, already used with VoIP ICE/RTP, thus exposing Tor relays that are otherwise not directly reachable on their IP address.

-naif

Yes, it should allow that, although needing a rendezvous service to coordinate would have some anonymity consequences, so this needs careful analysis.

Interesting paper, especially if changing the transport protocol will allow Tor to carry UDP traffic in the future. Personally, this is one of the features I miss most in Tor.

Some of the candidate architectures allow Tor to carry UDP, but these are the ones which are the most challenging to build and with the most uncertainty as to the performance impact (positive or negative). Right now, my inclination is to not aim to carry UDP traffic, but this is still to be finalised.

An absolute must, imho. But not simply as a choice of either TCP or UDP. Some things are best done with UDP, and the questions for Tor will be which things those are, and how they will be done.

- julie

Once that is implemented, will it be possible to use SIP and other VoIP clients through Tor?

If Tor can carry UDP, then yes, but not all proposed architectures support this. See my answer to a previous comment.

Isn't there another way to deal with the dropped packets (don't retransmits happen almost instantly anyway)?
UDP will make it easier to detect and block Tor.
Some ISPs and server datacenters also limit UDP speeds; I know OVH do this.
Most company and university firewalls block UDP.
Also, UDP has no congestion control, and doing it at both ends of the circuit doesn't work because the ISP routers on every hop between the two nodes can't do congestion control and so will need to drop packets.
This will make Tor slower by wasting the bandwidth of the sender/uploader on retransmitting packets.
I know from using freenet that 10% or so of my bandwidth is wasted retransmitting dropped packets.

Have you thought about splitting the circuits over multiple TCP connections? I don't mean one connection per circuit: keep combining circuits into shared TCP connections as now, but spread them over a few connections instead of one. This should also improve speed.
If a connection stalls, you can then retransmit over one of the other connections.

If Tor is going to start using UDP please at least make it optional.

For some time, Tor nodes will support both UDP and TCP so they can communicate with older versions of Tor, and I expect UDP will also be optional. For the first hop, we may permanently need to support TCP to resist blocking.

Whatever option we pick, emphasis will be put on choosing good congestion control, both for end-to-end and hop-by-hop. TCP isn't that great because it signals congestion through packet loss. Other alternatives being considered, such as uTP, detect congestion before packet loss by monitoring latency.
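To make the latency-based idea concrete, here is a simplified sketch of a delay-based window update, loosely modelled on the LEDBAT approach that uTP is based on. It is not uTP's actual implementation and not something Tor has adopted; the function name `update_cwnd` and the constants (target delay, gain, segment size, minimum window) are illustrative assumptions.

```python
TARGET_DELAY = 0.100   # seconds of queuing delay we aim for (assumed value)
GAIN = 1.0             # how aggressively to react (assumed value)
MSS = 1500             # bytes per segment (assumed value)
MIN_CWND = 2 * MSS     # never shrink the window below this (assumed value)


def update_cwnd(cwnd, base_delay, current_delay, bytes_acked):
    """Return a new congestion window (bytes) after an acknowledgement.

    base_delay    -- lowest one-way delay seen so far (estimate of an empty queue)
    current_delay -- most recent one-way delay measurement
    """
    queuing_delay = current_delay - base_delay
    # Positive when the queue is shorter than the target, negative when longer.
    off_target = (TARGET_DELAY - queuing_delay) / TARGET_DELAY
    cwnd += GAIN * off_target * bytes_acked * MSS / cwnd
    return max(cwnd, MIN_CWND)


# Delay at its baseline: the window grows.
print(update_cwnd(15000, base_delay=0.05, current_delay=0.05, bytes_acked=1500))
# Delay well above target: the window shrinks, without any packet being lost.
print(update_cwnd(15000, base_delay=0.05, current_delay=0.25, bytes_acked=1500))
```

The point of the sketch is that the sender backs off when measured delay rises, rather than waiting for a drop as loss-based TCP does.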

Yes, there is a continuum between 1 TCP session between pairs of hosts and 1 TCP session per circuit, so this option could be explored.
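A small sketch of that continuum, assuming circuits are spread over connections by a simple modulo on the circuit ID (an assumption made for illustration; `connection_for` and `stalled_circuits` are hypothetical names, not Tor functions). It shows how many circuits are held up when one connection stalls, for different numbers of connections between a pair of hosts.

```python
def connection_for(circuit_id, num_connections):
    """Pick which of the parallel connections carries a circuit (assumed policy)."""
    return circuit_id % num_connections


def stalled_circuits(circuit_ids, num_connections, stalled_connection):
    """List the circuits affected when one connection stalls."""
    return [c for c in circuit_ids
            if connection_for(c, num_connections) == stalled_connection]


circuits = list(range(8))
for k in (1, 4, 8):   # one shared connection ... one connection per circuit
    print(k, stalled_circuits(circuits, k, stalled_connection=0))
```

With one connection every circuit stalls; with one connection per circuit only a single circuit does, at the cost of many more TCP sessions between the same two hosts.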