Tor's Open Research Topics: 2018 Edition
- 07/23 23:00:00 UTC - Add additional Censorship topics.
- 07/30 20:00:00 UTC - Add more information about Browser Fingerprinting after conversations with Georg Koppen
- 08/02 21:00:00 UTC - Add links and commentary from David Fifield and Roger about Censorship topics
Tor has always depended upon research from the academic community for security and anonymity analysis, as a source of new ideas, and as a way of getting additional review for our own ideas. We have traditionally maintained a Tor research portal with information and assistance for conducting Tor research, as well as a specific list of open research problems.
This post is meant to update the list of open Tor research problems, to bring focus to specific areas of research that the Tor Project thinks are necessary/useful in our efforts to upgrade and improve the Tor network and associated components and software. It is organized by topic area: network performance, network security, censorship circumvention, and application research. Each topic area provides information about current and desired work and ideas. We conclude with information about doing ethical and useful research on Tor and with suggestions on how to best ensure that this work is useful and easy for us to adopt.
Tor's network performance is one of the largest barriers to its adoption. While adding more relays to the network will increase average-case Tor performance, it will not solve Tor's core performance problem, which is actually performance variance.
Even with lots of additional capacity, the variance in performance of Tor will still remain very high. Today, when you have a good path through the Tor network, Tor performs very well. But sometimes, if you choose an overloaded guard node, or a bad Tor circuit, performance can be poor.
Variance in performance is particularly bad for user experience. People can get used to a predictable level of performance if they expect it, but when network performance is sometimes bad, it gets extremely frustrating, because Tor always seems to be really slow just when you're trying to do that one really important thing (which is also always :).
The bulk of the variance in Tor's performance comes from two main classes of solvable problems: lack of congestion control (with related fairness issues while under congestion) and load balancing issues.
Network Performance: Congestion control and fairness
Tor is a TCP overlay network with full end-to-end reliability and fixed flow control window sizes. The fixed window sizes impose an upper bound on the throughput of client traffic through the network. Worse, Tor does not provide anything that signals congestion to endpoints, and it does not have any way to handle excessive queuing at routers (aside from tearing down circuits and closing connections). This means queues at Tor routers can grow to be very large, which means large amounts of (variable) latency. We have a scheduling system called EWMA that attempts to choose quieter circuits from this queue to improve fairness, but without a way to control congestion and queue size, it is of limited utility.
The wider internet handles congestion control through dropping. When routers develop large queues, they simply drop packets, and TCP and other transports use these drops as a congestion signal. This keeps queues bounded to fixed, small sizes, which keeps latency bounded and prevents out-of-memory issues from crashing routers. Queue management algorithms that decide which packets to drop help ensure fairness among flows, for various definitions of "fair".
Tor has considered various forms of datagram transports with end-to-end congestion control for many years. For background, see my post about using QUIC in Tor and the replies. Adopting something like QUIC requires care and some further study. First, prior work studying QUIC and other datagram transports has focused on performance but has ignored things like information leaks due to connection metadata, client fingerprinting, deliberately dropped packets, deliberately reordered packets, long-term endpoint state, timing information, and (ab)use of explicit congestion notification. Are there ways to use padding, error correction, or clever cryptographic tricks to reduce the utility of drop and reorder based side channels? How should we structure our circuit crypto to avoid tagging side channel attacks? Will we be able to use any of our existing tagging-resistance proposals, or will the these other side channel issues require a completely different approach anyway?
Other, lower hanging fruit options may exist for retrofitting some kind of congestion control and improved fairness onto Tor's existing reliable transport. The obvious choice here would be explicit congestion notification coupled with better circuit prioritization, but this work will need to ensure that explicit congestion notification cannot be abused to create side channels, DoS conditions, or other anonymity issues. We will also need to determine a way to autoscale our window sizes in response to these signals and other network conditions. The fixed window sizes in Tor create an upper bound on client throughput when there is spare capacity, and create queueing when there is a bottleneck anywhere on the internet along the entire path into and through the Tor network.
Network Performance: Load Balancing
Load balancing in Tor is the other pain point that causes variance in performance. We are in the process of replacing Tor's Bandwidth Authorities with a simpler (but very similar) system) of active measurement by central measurement servers. However, there are a handful of issues with both the current system and its replacement.
First, neither system fully accounts for the effects of geographic diversity. If we use centralized measurement, how should we distribute the measurement servers to prevent bias that favors relays in some locations? Should we have a single measurement value for each relay, or should there actually be a set of measurement values for various geographic locations, or based on internet topology? If we have multiple measurement values, what is the safest way for clients to use them?
Second, neither system has any way to handle rapid changes in relay capacity or load. Should we distribute measurements via some other mechanism that is easier to update? Or can we model the variance itself somehow, such that the parameters to this model are also measurement values that are taken into account during client path selection and load balancing decisions?
Finally, both systems are still centralized and vulnerable to DoS and manipulation. The most recent work on decentralized measurements, PeerFlow, showed very good results. If we were to use decentralized measurement, what is the best way to ensure accuracy comparable to what we can achieve with centralized measurement? PeerFlow relied on total bytes transmitted, and some notion of "spare" capacity. Are these the best metrics that could be used, or might active measurement (such as CapProbe, et al) do better? If we're going to do decentralized measurement, should we also explore ways to provide multiple measurements for relays based on geographic or internet topology? Can that be done any easier or more compactly than with centralized measurement? Can we also measure variance as per above, and is this any easier with decentralized measurements?
Independent of the measurement system we use, what should we do about slow relays on the Tor network? We have the ability to set a performance cutoff to be a Guard relay, and also set a different cutoff to be a relay at all, but we set them arbitrarily. How can we pick better cutoffs while still preserving diversity? Can we trust our measurement system for this, or are we measuring relays as fast that still have high degrees of variance in performance?
The security of the Tor network has many aspects and factors. For this post, there are two main areas that are of greatest concern to us at the moment: guard discovery vectors and traffic analysis attacks and defenses.
Network Security: Guard Discovery
Guard discovery is an attack that allows the adversary to determine the guard node(s) in use by a particular Tor client or onion service. The most basic form of this attack is to make many connections to a Tor onion service in order to force it to create circuits until one of the adversary's nodes is chosen for the middle hop next to the guard. At that point, a traffic analysis side channel is used to confirm that the node is in fact next to the actual onion service, leading to discovery of that onion service's guard node. From that point, the guard node can be compromised, coerced, or surveilled to determine the actual IP address of the onion service or client. To address this particular vector, we have developed an addon to use a second and third layer of guard nodes for onion services, as well as to check for various injected traffic side channels. How well do its features work? How much does it impact performance and scalability? Should we merge it into Tor?
There are also several other vectors that remain. In particular, can guard discovery still be accomplished by monitoring the relay statistics that we publish? In particular, read/write history? We have reduced the resolution of these statistics, but is it enough?
It is also obvious that DoS/uptime attacks are a risk here -- if you hold a circuit open to an onion service or client and can DoS Guard nodes (or just wait until they go offline), you will eventually notice your circuit closing at the time a Guard goes down. But what about congestion attacks? It seems reasonable that an adversary may be able to overload the Guard nodes of the network one by one and observe this effect on onion service performance and reliability. How accurate is this attack? Would switching to two (or more) guards make the attack harder? What about multipath circuits with traffic splitting? And can we re-use that multipath handshake to resume circuits (and their streams) that are closed prematurely, so that Guard downtime is no longer such a strong signal?
Network Security: Traffic Analysis
Traffic analysis attacks against Tor can be roughly categorized in terms of increasing amount of information available to the adversary. The adversary gets the least amount of information observing just the network path from the client to the guard node. They gain more information by controlling the guard itself, since they can then examine individual circuits. They then gain more information by observing the path to the guard and the path from the exit (or to the onion service). After that, they gain yet more information if the control both the guard and exit (or client guard and onion service guard). Finally, the most powerful adversary observes not only both ends of the communication, but is also able to inject and manipulate traffic patterns at will.
Tor has already deployed defenses against a limited adversary class that is able to gather traffic statistics from internet routers along the path to the guard node.
We are also in the process of deploying a defense against website traffic fingerprinting performed by an adversary at the guard node. This defense negotiates a state machine with the middle node that defines a traffic distribution in a pair of histograms, and the middle node and client then use it to add cover traffic to shape the actual traffic pattern towards the distribution defined by the state machine. We are favoring this defense because it can introduce padding extremely efficiently, without introducing any latency. Furthermore, it only injects cover traffic when the observed non-padded distribution deviates from the desired distribution that is defined in the programmable state machine and its histograms.
Because the state machines and histograms are programmable, this defense is very flexible. It will allow us to define cover traffic state machines that can add traffic in response to observed traffic patterns, and clients can choose the state machine they want to use depending on their activity and even switch between different state machines for different time periods of use of the same circuit. Can we use this defense to reduce the accuracy of passive end-to-end correlation? Can we use it to defend against other forms of attack at the guard node, such as Circuit Fingerprinting attacks that recognize onion service circuits, or attacks that attempt to determine the type of protocol used on a Tor circuit? Are there cases where we can use it to defend against some forms of the active end-to-end adversary, such as those that only get to inject a small amount of traffic, as in the DropMark attack? Can we use it to reduce the accuracy of the drop and reorder based side channels that are enabled by datagram transports above? How can we make our state machines as efficient as possible in terms of overhead? To what degree does it help in these cases, and at what overhead rates?
Are there any other similar defenses of this form that we should be considering? Is there a better way to perform this type of traffic addition that is more flexible and efficient with respect to overhead?
Orthogonally, how much does multipath circuits with traffic splitting also impede the adversary's ability to perform website traffic fingerprinting, end-to-end correlation, and other traffic analysis attacks? How can we combine traffic splitting with our histogram defense in an optimal sense? Are there clever ways to perform splitting to maximize security against traffic analysis, rather than just focusing on performance (but ideally while still improving both)? To what extent does this trade-off against the long-term observability of each additional guard that we use for traffic splitting?
Censorship circumvention has not received a lot of research attention lately. There are actually two ways Tor is censored: by blocking connections into the Tor network, and also by websites that block Tor or degrading access when Tor is used to access them.
Censorship circumvention: Entrance Side
Censorship circumvention on the entrance side has been considered a "solved" research problem, in part due to the success of domain fronting. Unfortunately, the major cloud infrastructure providers have recently taken steps to block this technique.
Unfortunately, every single one of our currently implemented pluggable transports relies on something like domain fronting to get bridge addresses. This even includes distributed transports like Snowflake. Nothing works well if you can't get bridge addresses easily.
So what options are available to us now? Can we find a way to connect to cloud providers without revealing our destination hostname (for example, through clever use of TLS session resumption or Zero-RTT, or do we have to wait for encrypted SNI to be deployed globally)?
If we go another way, are there any widely used networks or services that we might be able to use as a bidirectional covert channel to distribute bridge addresses? What about these new blockchain systems that have very fast confirmation times and low/no fees, such as EOS, STEEM, Stellar, or Ripple? Can we somehow use them to communicate with censored clients over a very low bandwidth bidirectional channel that is just fast enough to perform an encryption/obfuscation handshake, serve a captcha (or ZK auth challenge), accept the solution, and then hand out unrelated bridge addresses that are more suitable for high-capacity use?
Are there any other covert channels that we could use in this way, or any other radically new obfuscation designs? We have been looking into several options for use with the Snowflake Broker. Is there a way to generalize this so that it also works for BridgeDB? Or, a way to pick our favorite based on censorship analysis and user base size (aka "collateral freedom") of these systems in censored areas?
When BridgeDB and the Snowflake Broker actually hand out bridge addresses, can we do better than just making people solve captchas to get bridges? We've looked into social-based cryptographic distributors like rBridge and Hyphae that can give us feedback about blocking, but this work could use a review to help us pick the best one (or to compare those to a new system you make).
If you notice by now, BridgeDB and the Snowflake Broker share a lot of common failings and needs. Is there a way to unify them? The Snowflake Broker also helps perform coordination that BridgeDB cannot, like cryptographic handshakes, live parameter exhcange, nat traversal handshakes, and more up-to-date and fine-grained blocked/liveness information. Should we generalize the distribution system for all PTs? What properties should it have, especially if we also want fancy distributor crypto?
Related: how do we determine how many bridges to hand out, and when to get more if some of them stop working? There are probably different answers to this question depending on how long the bridges stay up, and stay unblocked, which also will depend on the distribution strategy.
Finally, some newer transports (especially SnowFlake) make use of ephemeral bridges that go up and down. Can we again use multipath circuits with traffic splitting or some other PT-specific mechanism to resume circuits if these bridges go down so that this is not as much of a usability issue?
Censorship Circumvention: Exit Side
Lots of websites still serve excessive captchas to Tor, and some block Tor entirely and even lock accounts that access them over Tor.
What does the academic community think about Privacy Pass? Are there better mechanisms we should use instead, like the blacklistable credential options from the Nymble literature? Wikipedia, Google, and other reputation-based systems would work better if we could give them a ZKP-thing that they could reputation score.
But even if we create the perfect cryptographic solution to this problem, not everyone will adopt it. So what if we changed the paradigm entirely. What if the Zero Knowledge Proof actually went to the Tor exit, so that in order to exit the Tor network at all, you had to solve a captcha and get a token? Would this mean less sites ban Tor?
Perhaps less radically, what if there were a pool of exit bridges that you could use via this mechanism? Would that pool have any less abuse or incidence of banning? What if this pool were pay-to-use? What if you posted a monetary amount via a ZK smart contract, and if there was abuse reported, that money became forfeit? With a pool of exit bridges, we can try these options and see what works.
The applications that use Tor are also in need of some research attention, both in terms of usability, and in terms of privacy.
Application Research: Usability
On the usability side, in 2014 a study was performed that showed that there were numerous "stop points" in the user flow to download, install, and run Tor Browser that confused users, gave them pause, and/or eventually caused them to give up. Has this gotten better? Are there other pain points that could be improved? Could various UI or performance improvements to the download and setup process make things better, particularly for censored users?
Related to this, with the new v3 onion service protocol, our HS names have grown incredibly long. How confusing is this for users? Is it easy to trick users with near-matches of these very long URLs? Can we deploy any sort of naming system that would improve this?
Previous attempts to solve this (such as via NameCoin) have proven too heavyweight. But can we also use one of the lighter, faster blockchains mentioned in the Censorship Circumvention section above (particularly EOS, Stellar, or STEEM) to also give us decentralized naming with low overhead? How should we handle name reservation in such a system such that we can deter squatting and phishing without enabling censorship/name takedown?
We are considering implementing a pluggable naming API similar to our pluggable transports design to allow us to iterate and compare various naming systems, but such systems could also be built via the Tor Control Port MAPADDRESS functionality as well.
Application Research: Privacy
Finally, application privacy could use some renewed attention from academia. Various browsers have been deploying privacy mechanisms to block trackers, resist third party tracking, and/or resist fingerprinting. How good are they compared to Tor Browser? How good are our mobile browsers (OrFox and Onion Browser) in comparison to the desktop version?
In particular, the browser fingerprinting research is in need of a refresh that examines things like the fingerprintability of large populations that all use the same browser, exploring new fingerprinting vectors, and measuring the joint-entropy of known fingerprinting vectors (to properly account for many vectors simply inferring the fact that you're using a particular OS version). Previous research has focused only on the global contribution of individual fingerprinting vectors among a heterogeneous population of browsers. This is not very useful for determining if defenses are actually making users of the same browser look more similar to each other. We are beginning to collect some of this data at FP Central, but it needs more analysis to determine how uniform our populations are, and which features might not be uniform. For those features, the question is why are they not uniform, and in what context? Do we need to examine other approaches for obscuring those features, such as virtualization or randomization? How do we measure success in that case?
And what about other applications? Can we improve on the chattering laptop problem when users send all of their device traffic through Tor? Are there any smart ways of isolating different applications such that these information leaks do not enable linkability? Are there any non-browser applications that are already Tor-ready that we should focus our efforts on? Are there any apps or devices that are particularly bad about leaking identifying data over the network (especially without encryption)?
Working with the Tor Project
If you are interested in working on any of these ideas, please contact us! Your first step is to check out the Research Safety Board page, which contains information about how to conduct experiments on Tor safely. The people listed on that page are also good to reach out to with research topics in general, as am I (Mike Perry). You can also find most of us on IRC in #tor-dev on irc.oftc.net.
In the blog post immediately following this one, we will discuss how to perform your research in a way that is most impactful, to give it the best chance of being adopted by Tor. The tips in that post are also useful for any sort of research you would like to see adopted by any large software project.
There will also be a forthcoming blog post on how to work with the Tor Project to get a very large patch set merged (such as those typically generated by research implementations). Please keep your eyes peeled for that!
Have fun, and good luck!
My understanding is that snowflake mainly needs a dedicated maintainer.
The main research-level problem related to snowflake that I am aware of is "find some way to replace domain fronting so that clients can get fresh snowflake bridges from the Snowflake Broker", which is the same research problem that all of our pluggable transports need solved right now (and the one that I mentioned in the censorship circumvention section).
I bet it could also benefit from Conflux-style session resumption that I mentioned in the Guard Discovery section. I bet those ephemeral webpage-based webrtc bridges go up and down a lot, and probably kill people's connections all the time. Being able to restore circuits via alternate paths would sure help smooth that usability issue over.
Ok I have updated the blog post to capture the session resumption issue with snowflake. I believe the other blocker to snowflake is already covered in the censorship circumvention section.
Doing this update also made me realize that we need to mention the Tor exit censorship problem. I have updated the post to capture this as well.
Nope you covered everything I believe :)