There has been a considerable amount of work in the area of Tor incentives since the last post on the topic in 2009 (over 5 years ago!). This post will give an overview of some of the major new approaches, including two new designs that I co-authored that will appear at the Workshop on Hot Topics in Privacy Enhancing Technologies later this week. Before getting to those, I'll give background on Tor and discuss some of the social issues to help form a foundation of understanding.
The Tor network consists of a set of relays that provide bandwidth and other resources to forward traffic for Tor users. Anyone in any part of the world can contribute to Tor by downloading the Tor software and configuring it to operate in relay mode. In fact, the Tor network is currently composed exclusively of such voluntarily-operated relays.
There are many reasons existing relay operators might contribute to Tor. They may contribute because they really want a privacy-enhancing communication tool like Tor to exist, they believe in Tor’s philosophy of open source software and operational transparency, and they want Tor to succeed at its goals. They may wish to be part of the Tor community and the social movement to provide privacy-enhanced communication that connects people in all parts of the world. Tor may be practically useful to them for communication with others or to retrieve content that may otherwise be unavailable to them due to censorship or other network interference. Or they may be technically interested in the Tor software or network and the associated research problems raised by large, distributed systems. Finally, some may be interested for adversarial reasons, e.g., collecting information on the uses of the network or the content being transferred. All of these reasons provide intrinsic motivation for operators to contribute - they don’t expect any direct compensation for their contributions.
The volunteer approach has succeeded so far: Tor now consists of over 5000 relays transferring between 4 and 5 GiB/s in aggregate. For the most part, this is "organic" growth obtained through community outreach, where volunteers first have personal contact with existing community members and become inspired to help.
Whatever their reason for volunteering, relay operators not only contribute resources from the physical machines upon which their relays are executed, but also contribute the time involved with configuring, updating, and adjusting their relays as both the software and network mature. In many cases, operators also contribute monetarily through the direct purchase of dedicated machines or special fast connections to Internet Service Providers (ISPs), and exit relay operators spend social energy on maintaining relationships with their ISP (to make sure the ISP understands that requests coming from their machines are from Tor).
However, because many people that are otherwise excited about or interested in Tor are incapable or unwilling to incur these expenses, there are far fewer Tor relays than Tor clients (users that use Tor to access content, e.g. via the Tor Browser Bundle, without directly contributing resources to the network). Expanding the set of relays could have many benefits: Tor could become faster, more reliable, and more secure while at the same time distributing trust among a larger and more diverse set of distributed peers. A larger network that is transferring a larger quantity and more diverse types of data will, in general, make it more difficult for an adversary to determine who is talking to whom. This means a safer Tor for everyone.
There are many social and technical challenges involved with expanding the set of Tor relays. We focus here on the social issues surrounding how to encourage more people to run a relay in the first place by providing incentives (i.e., rewards) for relay operators, and later focus on how to do this without leaking information or otherwise hurting Tor’s anonymity.
An important social issue to consider is that of maintaining the existing community of operators and how rewards could harm it. The presumption here is that existing relay operators are motivated to run relays for their own reasons - the act of contributing to Tor itself has intrinsic value to them. Then the question is: what would happen if we start rewarding operators through the use of an incentive system?
The answer isn’t clear, and there are at least a couple of forks in the cognitive road with many more questions along the way. First, what would happen to the existing operators who used to feel good about volunteering but are now compensated with a reward that may have extrinsic value? Although the new reward may provide some extrinsic value, will it lower their original intrinsic value enough to cause them to lose their motivation to contribute? Will they continue their contributions but begin to expect the reward, such that if the reward is removed later they would no longer be motivated even though originally (before the reward) they were happy to contribute? Second, is the new group of operators attracted to Tor by the reward somehow less desirable than the existing group, because they don’t care as much about Tor or are more likely to stop contributions and leave the system more quickly than the existing volunteers? Is the fact that they have higher extrinsic motivation than intrinsic make them better or worse operators? Will they be less likely to incur the costs of saying "no" when people ask them to log or turn over traffic? Will they in fact be willing to sell logs if it increases their rewards? Third, how will the new group of operators affect the existing group? If their intrinsic motivation was low enough before the reward that they didn’t contribute, but high enough with the reward that they do contribute, are these new contributors going to somehow shift the "community spirit" into something that no longer has the value that it once had? Will this "crowd out" the intrinsically motivated individuals and cause existing operators to leave (a widely acknowledged theory)?
The answers to these questions are not clear, though speculations abound. One response is that we will never know what will happen until we try it, but if we try it we may not like the result. Another is that if we try it and it succeeds, great; if it starts going badly, we can adapt and adjust.
Researchers and other Tor community members have been interested in the Tor incentive problem because a well-designed incentive scheme has the potential to greatly improve the network (we will discuss several technical designs below). However, because of the many open social questions, it has been challenging for the Tor community to come to a consensus on the path forward. On one hand and as discussed above, many people already run Tor relays and provide valuable service, and the network and bandwidth graphs imply some amount of success in network growth. It would be tragic if the introduction of an experimental reward scheme drove away the existing participants. On the other hand, the number of Tor clients still vastly exceeds the number of relays, and Tor’s bandwidth capacity remains the limiting factor for Tor’s growth.
If we consider that relay operators with a higher intrinsic motivation to contribute are somehow more desirable to the network and the community because they care more deeply about Tor and the social movement, then we can consider there to be a trade-off between reward value and the desirability of the operator. The lower the extrinsic reward value is, the higher the intrinsic value a potential new operator must possess in order for the total value to be high enough to motivate her to contribute. The higher the extrinsic reward value is, the lower the intrinsic value may be to still attract the new operator. Under this model, as the value of the reward increases, the number of individuals willing to contribute also increases while their "quality" decreases.
Please note that this is a very simplified model of reasoning and since it does not necessarily reflect reality, not everyone should agree with it. There are many other values in play here as well; for example, motivations change over time as the world changes, and incentive schemes that require significant protocol modifications are more difficult to implement, test, and deploy. Nonetheless, this simplified model may help sort out the issues, and it seems to be consistent with the fact that the Tor community has thus far preferred to grow the network through intrinsically motivated relay operators.
Tor’s volunteer approach has generally been on the conservative side of our simplified model, where individuals with higher intrinsic motivations to contribute to Tor are preferred. New operators have been recruited through social and community-oriented efforts such as explaining how great Tor is and why they should care. This works well for people who are already intrinsically motivated about Tor, such as this guy who tattooed a Tor onion on his arm.
Other relay recruitment approaches for intrinsically motivated individuals include Torservers.net and the Noisebridge Tor Exit Node Project: both operate as independent nonprofit organizations for those people who would like to contribute to the Tor network but for one reason or another are not in a position to operate relays themselves. As discussed in a 2011 post, it is preferable that people who can run their own fast exit relays do so. But the approach of turning donations into fast exit capacity allows those who cannot run their own relays to still contribute to improving the performance and exit capacity of the Tor network.
The more recent EFF Tor Challenge is somewhere on the same side of the spectrum, except it actually offers small rewards as incentives. EFF explains Tor’s importance and users are encouraged to contribute to the network by running a relay. The challenge offers prizes to relay operators as an incentive: name and twitter handle published in a list of contributors, a limited edition sticker for running a relay for 12 months, and a t-shirt if that relay is fast (bandwidth of 1 MB/s or larger after 12 months).
Other social rewards are possible here as well. A post on the tor-dev mailing list proposes that Tor host a simple profile page for each relay that will list and celebrate the relay’s total bandwidth contribution over time, and recognize the operator for running special Tor bridge or exit relays. Relatively simple ideas like this may have the potential to attract intrinsically-motivated individuals without requiring extrinsic rewards or a lot of changes to Tor itself.
The approaches in this category are relatively low-risk/low-reward schemes: even though the new operators are receiving recognition or a reward for running a relay, the reward is of such little value that there is little risk that existing contributors who may not have been rewarded will leave. At the same time, a new operator loses little if it decides to shut down the new relay.
On the other side of the spectrum are more "radical" approaches that do not require individuals with high intrinsic motivation (though they are welcome, too!) but change Tor’s design have been explored by researchers. Researchers explore these designs because they are generally more technically interesting and have the potential to produce a much larger set of relays. We now shift to discussing these latest research proposals.
In the Gold Star scheme, the Tor directory authorities measure the bandwidth of Tor relays and assign a special flag (a "gold star") to the fastest 7/8 of relays. Then whenever those relays send traffic through Tor, they choose the other fast gold star relays to form a fast gold star path. Relays then prioritize traffic on these gold star paths. The incentive here is that if you run a relay, you will get faster service when using Tor as a client. The main drawback to this approach is that only relays are able to get gold stars and priority service. This means that all relays that are part of a gold star path know for certain that the initiator of that traffic is someone from the small list of gold star relays. Because the set of gold star relays would be smaller than the set of all Tor users by several orders of magnitude, the anonymity for gold star relays would be significantly harmed.
In PAR, all users (not just relays) are able to be part of the incentive system. PAR has an honest-but-curious centralized entity called a "bank" that manages digital tokens called "coins". Users can purchase coins from the bank and then pay the relays while using Tor; the relays then deposit the coins back into the bank and the bank checks to make sure the coin is valid and that it has never been spent before (i.e., it has not been "double spent"). The main novel idea explored in this work is how to include digital payments into each Tor circuit in a way that prevents the relays from learning the client’s identity from the payment itself.
Adding real money into Tor opens a host of new legal and technical questions, which Roger already briefly discussed. For example, real money might shift Tor into a different legal category (see e.g. the EU discussions of what is a "service provider" and which ones are obliged to do data retention), or change the liability situation for relay operators.
The main design challenge we learned from the PAR design is that the timing of when the client withdraws the coins and the relays deposit them creates a trade-off between the ability to detect double spending and link a client to its coins. If the client withdraws some coins and a few seconds later a relay starts depositing coins, how much information does the bank gain? If the relay waits for some time interval to deposit the coins to hide this timing information, then it becomes possible for the client to double spend those coins during that interval. The longer the relay waits before depositing, the harder it is for the bank to link the coins but the easier it is for the client to cheat. A rational relay will deposit immediately to secure its payment, which leads to the worst situation for anonymity. If we assume the relays will deposit immediately, then it is up to the client to hold coins for some random amount of time after purchasing them from the bank, to disrupt potential attempts to link withdrawals to deposits at the cost of flexibility in usage.
In review, both of these proposals have anonymity problems: the Gold Star scheme because the assignment of rewards to relays is public and identifies traffic as originating from the relatively small set of clients of fast relay operators; and PAR because the timing of the withdrawal by the client and the deposit by the relay may leak information about the client’s traﬃc. In PAR, coins may be held by the client and the relay longer to disrupt this timing information, but this trades off flexibility in usage and the speed at which double spending can be detected. So what have these papers taught us? We have learned about some requirements that a Tor incentive scheme should probably fulfill: both clients and relays should be able to receive the incentive so that neither group stands out; and the timing of payments should not allow cheating and should not enable linkability.
Recruiting New Tor Relays with BRAIDS was the followup research proposal presented at CCS in 2010. One of our goals in BRAIDS was to eliminate the trade-off between double-spending and linkability. To achieve this, we designed a new type of digital token which we called a "relay-specific ticket". Tickets were still issued by a trusted, centralized entity - the ticketmaster (we called it a "bank" in the paper, but BRAIDS was not designed to handle real money). Clients choose which relays they want to use, and receive tickets from the ticketmaster that are valid only at the chosen relays while hiding the chosen relay information from the ticketmaster (using partially-blind signatures). Clients then form a Tor circuit with the chosen relays and send the tickets to receive traffic priority (using a differentiated services scheduler). Each ticket will then provide traffic priority through the chosen relay for a specific number of bytes. Because each ticket a relay receives is only valid at that relay and no other, the relay could prevent double spending locally without contacting the ticketmaster.
Another feature of BRAIDS is that tickets are distributed freely in small amount to any client that asks, but only to one client per IP address, and only once per distribution round. If the clients change their mind about which relays they want to use for a circuit, they may contact the ticketmaster to exchange their old tickets for new ones following a time schedule. The exchange process is also used by relays to turn the tickets they received from clients into new usable tickets.
One main drawback to BRAIDS is that even though all users are able to get a small amount of tickets for free, relays are able to accumulate a much larger stash because they receive the free tickets AND the tickets sent to them by other clients. This means that relays stand out when using tickets to download large flows because it is less likely that a normal client would have been able to afford it. Another major drawback is that the exchange process is somewhat inefficient. Relays will exchange received tickets for new ones which they can use themselves, and clients that didn't spend their tickets (e.g., because their Tor usage was low or their chosen relays became unavailable) must exchange them or lose them. This leads to the ticketmaster exchanging all system tickets over every exchange interval. (The ticket validity interval is split into [spend, relay exchange, client exchange], so clients that don't spend in the "spend" time-frame must wait until "client exchange" time-frame to get new tickets. Increasing the interval lengths make them slightly less flexible to rapid changes. The longer the intervals, the more tickets will "pile up" for later processing by the ticketmaster.)
BRAIDS showed us the power of relay-specific tickets but unveiled the scalability problems associated with a trusted, centralized entity.
LIRA: Lightweight Incentivized Routing for Anonymity was published at NDSS 2013. LIRA still uses a centralized entity to manage the incentives, but LIRA only requires incentive management for the relays (thousands) instead of for all system users (millions) like BRAIDS. The way we achieve this is through the use of a lottery system, where clients could simply guess a random number for every circuit to receive priority on that circuit (traffic is prioritized using a differentiated services scheduler as in BRAIDS) with tunable probability. The lottery is set up with special cryptography magic such that relays are rewarded with guaranteed winning guesses to the lottery; relays are allotted winners according to the amount of bandwidth they contributed.
LIRA is more efficient and scalable than any earlier scheme. However, LIRA’s main drawback is that probabilistic guessing reduces flexibility for clients wanting to receive continuous priority over time, and creates a potential for cheating the system because it enables clients to continuously create new circuits and new guesses until a correct guess is found. Another problem is that a secure bandwidth measurement scheme is required to ensure that relays don’t receive rewards without actually contributing to Tor (this wasn't necessary in BRAIDS because the clients sent rewards (tickets) directly to the relays); secure bandwidth measurement is still an open research problem. Finally, LIRA still relies on a trusted central entity to manage the lottery.
From Onions to Shallots: Rewarding Tor Relays with TEARS will be presented at HotPETs later this week. The main goal in TEARS is to remove the reliance on a central entity to manage the incentives. The central entities in the above schemes (let’s generalize them as "banks") are referred to as "semi-trusted", because there are several ways a malicious bank could misbehave - for example, the bank could refuse service, could extort users by demanding extra fees in order to process their requests, or could "inflate" the digital currency (coins, tickets, guesses, etc.) by printing its own tokens for a profit.
TEARS draws upon the decentralized Bitcoin design to address these concerns by using a *publicly auditable* e-cash protocol that prevents the bank from misbehaving and getting away with it. The bank in TEARS consists of a group of semi-trusted servers, such as the Tor directory servers (as opposed to Bitcoin’s distributed proof-of-work lottery), only a quorum of which need to function correctly for the overall system to keep working. The e-cash cryptography used here is publicly auditable and every interaction with the bank is conducted over a public communication channel (such as the Bitcoin blockchain itself). The security guarantee is that any form of misbehavior on the part of the bank servers leaves a trail of evidence that can be discovered by anyone watching. This approach is reminiscent of systems like Certificate Transparency and OpenTransactions.
In addition to a decentralized bank, TEARS offers a new two-level token architecture to facilitate rewarding relays for their bandwidth contributions without hurting anonymity. First, a decentralized process audits relays’ bandwidth and informs the bank of the results. The bank mints new "shallots" (anonymous, auditable e-cash) for each relay based on their contributions and deposits them into the relay accounts on their behalf. Separately, the bank’s monetary policy may allow it to mint new shallots and distribute them to users, e.g. using mechanisms suggested in BRAIDS but more commonly known to Bitcoin users as faucets. Second, shallots are transferable among users and may be redeemed for relay-specific "PriorityPasses", which are then used to request traffic priority from Tor relays (again, as in BRAIDS). PriorityPasses are relay-specific so that relays can immediately and locally prevent double spending without leaking information to any other entity. This is a similar feature present in BRAIDS’ tickets. However, a novel feature of PriorityPasses is that they are non-transferable and become useless after being spent at a relay -- this reduces overhead associated with exchanges, and ensures that the process of requesting traffic priority does not harm anonymity because the act of redeeming a shallot for a PriorityPass will be unlinkable to any later transaction. There is a question of how many PriorityPasses can be spent in one circuit before it is suspicious that a client has so many, so the size of the faucets and how they distribute Shallots will play a key role in anonymity. Anonymity is also tied to how relays decide to distribute their Shallots to clients, either via a faucet or a through a third party market.
TEARS was designed to operate inside the existing Tor network and thus does not significantly change the fundamentals of Tor’s design. The decentralized bank and bandwidth measurement components do not alter the way clients choose circuits. Clients and relays that want to use or support TEARS, however, would need to support new protocols for interacting with the bank and logic for handling shallots, PriorityPasses, and traffic priority.
TEARS still relies on a "bandwidth measuring" component that can accurately and robustly determine when a relay has indeed contributed useful bandwidth. While the e-cash system in TEARS is designed to be publicly auditable, the existing mechanisms for bandwidth still require trusted authorities to probe.
A TorPath to TorCoin - Proof-of-Bandwidth Altcoins for Compensating Relays is the other paper to be presented at HotPETs this week. TorCoin addresses the bandwidth measurement problem with a different approach -- an altcoin (a Bitcoin alternative) based on a novel "proof-of-bandwidth" (rather than proof-of-work) mechanism called TorPath, in which the relays (and endpoints) of a circuit effectively mine for new coins whenever they successfully transfer a batch of packets. In TorCoin, a group of "assignment" authorities are responsible for generating a list of circuits (using a shuffle protocol) and assigning them to clients. Bandwidth proofs are then constructed as the circuit is used such that the client mines the TorCoin and then transfers part of it to each of the circuit’s participants.
Like TEARS, TorCoin distributes the process of rewarding relays for their bandwidth contributions. Bandwidth measurement is done directly as part of the distributed mining process and provides strong guarantees. Also, by utilizing a group of assignment authorities that may have more information about the system or underlying network, there is a lot of potential for generating more secure paths for clients than clients are able to generate for themselves.
TorCoin still has some issues to work out; it may be possible to fix some of the smaller issues with protocol modifications, but some of the larger issues don’t have obvious solutions.
One drawback to TorCoin is that it requires the group of collectively-trusted assignment authorities (you have to trust only that some threshold/quorum number of them are correct) to generate and assign circuits to clients. This is a similar trust model to the current Tor directory authorities. In practice, the assignment authorities cause availability issues: if a majority of the assignment authorities are unreachable, e.g. due to DoS or censorship, then the system is unusable to the clients because they won’t be able to generate circuits. This is also somewhat true of the directory authorities, however, directory information can be signed and then mirrored and distributed by other relays in the system whereas assignment authorities are required to always be online and available to new clients. TorCoin clients contact the assignment authorities in order to build new circuits, whereas in Tor they can build as many circuits as they need once the directory information is retrieved (from the directory auths or from directory mirrors).
TorCoin as written has significant security issues. Because relay assignment is not based on bandwidth, it is easier for an adversary to get into the first and last position on a circuit and break anonymity. This can be done through a sybil attack by adding an arbitrary number of bad relays and joining them to the network without actually providing bandwidth. Because the protocol reveals which ephemeral keys are attached to the assigned circuits, an adversary can confirm when it has compromised a circuit (has malicious nodes in the correct positions) without needing to do any statistical correlation attack (it can match up the ephemeral keys assigned to its malicious relays to the ones posted in the circuit assignment list).
The formation of relay/client groups is not discussed and is similarly vulnerable to sybil attacks where the adversary can completely subvert the coin mining process. This can be done by registering an arbitrary number of relays and clients with the assignment servers, such that a large majority of circuits created by the assignment process will contain malicious relays in all positions and be assigned to a malicious client. This malicious collective of nodes can then “pretend” to send bytes through the circuit without actually doing so, and gain an advantage when mining coins. The paper suggests to use a persistent guard, which means the adversary only needs malicious relays in the middle and exit positions of its sybil client circuits, exponentially increasing the probability of a full compromise (or requiring far fewer nodes to achieve the same probability of compromise as without persistent guards). (The sybil clients only have to get 2 of its relays in the circuit instead of 3, reducing the probability from f^3 to f^2 for malicious fraction f.) Further, even if some relays on a circuit are honest, it is not rational for them to refuse to sign proofs of bandwidth that have been exaggerated (too high) by other relays. It will only benefit a relay to ignore proof-of-bandwidth checks, giving it an advantage over completely honest nodes in the TorCoin mining process.
There are a variety of unaddressed practical deployment issues as well. It is not clear how to do load balancing with TorCoin alone - no one should be able to determine how many bytes were sent by any of the circuit members or where the payments for mined coins are being sent (anonymous TorCoin transactions are necessary for anonymity). Exit policies are not discussed, and there is no clear way to support them and for a client that would like to request a specific exit port. Its not clear how to ensure that a relay is not chosen for the same circuit twice, or two relays from the same relay family are not on the same circuit. Finally, it is not clear how to link ephemeral circuit keys to TorCoin addresses so that payments may be sent from clients to relays without revealing client identity.
Don't misunderstand my ranting here - I think that the TorCoin idea is great (I am a co-author after all). It has the potential to motivate new researchers and developers to start thinking about solutions to problems that Tor is interested in, particularly bandwidth measurement. However, the limitations need to be clear so that we don't start seeing production systems using it and claiming to provide security without working through *at least* the issues mentioned above. The current paper was an initial concept in its early stages, and I expect the system to improve significantly as it is furthered developed.
(For completeness, we should also point out that TorCoin will need a new name if anybody decides to build it. The Tor trademark faq says it's fine for research paper designs to use the Tor mark, but if it becomes actual software then it sure will be confusing to users and the rest of the Tor community about whether it's written by or endorsed by Tor.)
Note that TorCoin and TEARS are at least somewhat complementary, since TEARS *needs* a proof-of-bandwidth scheme, and TorCoin *provides* one. However, they’re also not directly compatible. TorCoin requires a substantial change to both Bitcoin and to Tor (or to put it another way, it would be a system that only partially resembles Bitcoin and only partially resembles Tor). On the other hand, TEARS leaves Tor's circuit-finding mechanism intact, and the token protocol in TEARS is closer to traditional anonymous e-cash systems than to Bitcoin.
As outlined above, recent research has made great improvements in the area of Tor incentives. More work is needed to show the feasibility and efficiency of the decentralized approaches, and a secure bandwidth measurement scheme would help fill a critical piece missing from TEARS. Recent improvements to the way Tor schedules sockets and circuits would be necessary to correctly provide traffic priority (see #12541), and then a differentiated services scheduler that is able to prioritize traffic by classes (already described in and prototyped for the BRAIDS, LIRA, and TEARS papers) would also be needed.
Unfortunately, it is unclear how to make headway on the social issues. A small scale rollout of an "experimental build" to those relays who want to support new incentive features could be one way to test a new approach without committing to anything long term.
One question that is often raised is: if an incentive scheme rewards relays for providing bandwidth, then won’t everyone just pick the same cheapest hosting provider and Tor will lose location diversity? This question is largely addressed in the discussion of what constitutes a useful service in the TEARS paper (the TEARS paper and appendices also gives useful commentary on many of the common problems and design decisions to make when designing an incentive scheme). Basically, it can be addressed in the monetary policy, e.g., in addition to rewarding relays for bandwidth, the bank could also assign weights that could be used to adjust the rewards based on certain flags that relays possess, or the geographic location in which they operate. This could be adjusted over time so that there would be a higher incentive to run relays in parts of the world where none exist, or to prefer exit relays over other positions, etc. Although, note that it is unclear exactly what the "correct" utility function should be and when/how it should be adjusted. Note that Torservers.net similarly rewards relays for location diversity (see here).
Another point to make here is that most of these approaches have nothing to do with giving out or transferring real dollars. The tokens in most of these schemes are useful only to receive traffic priority in Tor. Will there be third party markets that form around the exchange of the tokens? Sure. And they may be speculated. But at the end of the day, the tokens would still only provide prioritized traffic. Depending on the configuration of the priority scheduler, the difference between priority traffic and normal traffic may not be that extreme. It is conceivable that the tokens would not be worth nearly enough to compensate an operator for the ISP connection, much less the overhead involved with updating the software, maintaining the machine, and talking with the ISP -- and in that case we are still on the more conservative side of the social incentive discussions above.
Tor has some choices to make in terms of how to grow the network and how to position the community during that growth process. I hope that this post, and the research presented herein, will at least help the community understand some of the options that are available.
All the best, ~Rob
[Thanks to Roger Dingledine, Bryan Ford, Aaron Johnson, Andrew Miller, and Paul Syverson for input and feedback on this post.]
Rob just tagged Shadow v1.6.1, and added a link on the download page. This is really more like a pre-1.7.0 release, but he wanted to get out some exciting new support for running multi-threaded simulations earlier!
This release includes:
- Support for running multi-threaded simulations! (use the “-w” flag to specify the number of worker threads)
- A few bugfixes
In preliminary testing, the biggest improvements have been seen when using between 4 and 8 worker threads. Happy simulating!
For a few years now, funders have been asking if they can pay Tor to run more relays. I kept telling them their money was better spent on code and design improvements:
since a) network load would just grow to fill whatever new capacity we have, especially if we don't deal with the tiny fraction of users who do bulk downloads, and b) reducing diversity of relay operator control can harm anonymity.
But lately the Tor network has become noticeably faster, and I think it has a lot to do with the growing amount of excess relay capacity relative to network load:
At the same time, much of our performance improvement comes from better load balancing -- that is, concentrating traffic on the relays that can handle it better. The result though is a direct tradeoff with relay diversity: on today's network, clients choose one of the fastest 5 exit relays around 25-30% of the time, and 80% of their choices come from a pool of 40-50 relays.
Since extra capacity is clearly good for performance, and since we're not doing particularly well at diversity with the current approach, we're going to try an experiment: we'll connect funding to exit relay operators so they can run bigger and/or better exit relays.
If we do it right (make more faster exit relays that aren't the current biggest ones, so there are more to choose from), we will improve the network's diversity as well as being able to handle more users.
We've lined up our first funder (BBG, aka http://www.voanews.com/), and they're excited to have us start as soon as we can. They want to sponsor 125+ fast exits.
I've started a discussion on the tor-relays list about open questions that we as a community will need to decide about:
1) What exactly would we pay for?
2) Should we fund existing relays or new ones?
4) What exactly do we mean by diversity?
5) How much "should" an exit relay cost?
6) How exactly should we choose which exit relay operators to reimburse?
7) How do we audit / track the sponsored relays?
8) Legal questions?
The first step is collecting facts about the current fast Tor exit relays. Please join the discussion on the tor-relays list if you want to contribute:
The Tor network is run by volunteers, and for the most part is entirely independent of the software development effort led by The Tor Project, Inc. While The Tor Project, Inc is a 501(c)3 non-profit that is happy to take donations to create more and better software, up until recently there was no way for you to fund deployment of more relays to improve network capacity and performance, aside from running those relays yourself.
Noisebridge is a US 501(c)3 non-profit, which means that for US citizens, donations are tax deductible. Torservers.net is a German non-profit organization whose donations are tax deductible for German citizens (and also potentially for citizens of other EU member states).
What are the pluses and minuses of donating as opposed to running your own relay? Glad you asked!
While it is relatively easy and risk-free to run a middle relay or a bridge, running an exit can be tough. You have to seek out a friendly ISP, explain Tor to them, and then navigate a laundry list of Internet bureaucracies to ensure that when abuse happens, the burden of answering complaints falls upon you and not your ISP.
These barriers are all made easier the larger your budget is. On top of this, like most things, bandwidth is cheaper in bulk. But not just Costco cheaper: exponential-growth cheaper, all the way up into the gigabit range (and perhaps beyond, but no one has run a Tor node on anything faster).
At these scales, large exit nodes can pay as little as $1/mo per dedicated megabit/second. Sometimes less. This means that adding $30/mo to the hosting budget of a large exit node can buy almost 40 times more full-duplex dedicated bandwidth than a similarly priced business upgrade to your home ADSL line would buy, and about 50 times more bandwidth than Amazon EC2 instances at the entry-level price of $0.08 per half-duplex gigabyte, not counting CPU costs. (Bridge economics in terms of IP address space availability might still favor Amazon EC2, but that is a different discussion).
The downside to donation is that network centralization can lead to a more fragile and a more observable network. If these nodes fail, the network will definitely feel the performance loss. In terms of observability, fewer nodes also means that fewer points of surveillance are required to deanonymize users (though some argue that more users will make such surveillance less reliable, no one has yet rigorously quantified that result against actual attacks).
Therefore, if you are able to run a high capacity relay or exit yourself (or have access to cheap/free/unused bandwidth at your work/school), running your own relay is definitely preferred. If you are part of the Tor community and want to accept donations, we'd love to add you to our recommended donor list. Please join us on the tor-relays mailing list to discuss node configuration and setup.
However, if configuring and maintaining a high capacity relay is not for you, donating a portion of the monthly hosting budgets of either of these organizations is an excellent way to support anonymity, privacy, and censorship circumvention for very large numbers of people.
Tor's entry guard design protects users in a variety of ways.
First, they protect against the "predecessor attack": if you choose new relays for each circuit, eventually an attacker who runs a few relays will be your first and last hop. With entry guards, the risk of end-to-end correlation for any given circuit is the same, but the cumulative risk for all your circuits over time is capped.
Second, they help to protect against the "denial of service as denial of anonymity" attack, where an attacker who runs quite a few relays fails any circuit that he's a part of and that he can't win against, forcing Alice (the user) to generate more circuits and thus increasing the overall chance that the attacker wins. Entry guards greatly reduce the risk, since Alice will never choose outside of a few nodes for her first hop.
Third, entry guards raise the startup cost to an adversary who runs relays in order to trace users. Without entry guards, the attacker can sign up some relays and immediately start having chances to observe Alice's circuits. With them, new adversarial relays won't have the Guard flag so won't be chosen as the first hop of any circuit; and even once they earn the Guard flag, users who have already chosen guards won't switch away from their current guards for quite a while.
But how long exactly? The first research question here examines vulnerability due to natural churn of entry guards. Consider an adversary who runs one entry guard with advertised capacity C. Alice maintains an ordered list of guards (chosen at random, weighted by advertised speed, from the whole list of possible guards). For each circuit, she chooses her first hop (uniformly at random) from the first G guards from her list that are currently running (appending a new guard to the list as needed, but going back to earlier guards if they reappear). How long until Alice adds the adversary's node to her guard list? You can use the uptime data from the directory archives, either to build a model for guard churn or just to use the churn data directly. Assuming G=3, how do the results vary by C? What's the variance?
Research question two: consider intentional churn due to load balancing too. Alice actually discards each guard between 30 and 60 days (uniformly chosen) after she first picks it. This intentional turnover prevents long-running guards from accumulating more and more users and getting overloaded. Another way of looking at it is that it shifts load to new guards so we can make better use of them. How much does this additional churn factor impact your answer from step one above? Or asked a different way, what fraction of Alice's vulnerability to the adversary's entry guard comes from natural churn, and what fraction from the proactive expiration? How does the answer change for different expiry intervals (e.g. between 10 and 30 days, or between 60 and 90)?
Research question three: how do these answers change as we vary G? Right now we choose from among the first three guards, to reduce the variance of expected user performance -- if we always picked the first guard on the list, and some users picked a low-capacity or highly-congested relay, none of that user's circuits would perform well. That said, if choosing G=1 is a huge win for security, we should work on other ways to reduce variance.
Research question four: how would these answers change if we make the cutoffs for getting the Guard flag more liberal, and/or change how we choose what nodes become guards? After all, Tor's anonymity is based on the diversity of entry and exit points, and while it may be tough to get around exit relay scarcity, my theory is that our artificial entry point scarcity (because our requirements are overly strict) is needlessly hurting the anonymity Tor can offer.
Our current algorithm for guard selection has three requirements:
1) The relay needs to have first appeared longer ago than 12.5% of the relays, or 8 days ago, whichever is shorter.
2) The relay needs to advertise at least the median bandwidth in the network, or 250KB/s, whichever is smaller.
3) The relay needs to have at least the median weighted-fractional-uptime of relays in the network, or 98% WFU, whichever is smaller. (For WFU, the clock starts ticking when we first hear about the relay; we track the percentage of that time the relay has been up, discounting values by 95% every 12 hours.)
Today's guard cutoffs in practice are "was first sighted at least 8 days ago, advertises 100KB/s of bandwidth, and has 98% WFU."
Consider two relays, A and B. Relay A first appeared 30 days ago, disappeared for a week, and has been consistently up since then. It has a WFU (after decay, give or take a fencepost) of 786460 seconds / 824195 = 95.4%, so it's not a guard. Relay B appeared two weeks ago and has been up since then. Its WFU is 658517 / 658517 = 100%, so it gets the Guard flag -- even though it has less uptime, and even less weighted uptime. Based on the answers to the first three research questions, which relay would serve Alice best as a guard?
The big-picture tradeoff to explore is: what algorithm should we use to assign Guard flags such that a) we assign the flag to as many relays as possible, yet b) we minimize the chance that Alice will use the adversary's node as a guard?
Updated 06/30/2010: Mention Reduced Exit Policy, ISP Shopping Tips, and Abuse Response Templates
Updated 08/30/2010: Update exit policy with svn, git, hg, Kerberos, remote admin panels, IRC, others
Updated 01/12/2011: Suggest creation of LLC for large exit nodes, provide links to ARIN forms and process.
Updated 02/25/2015: Torservers.net abuse templates URL has changed.
I have noticed that a lot of new exit nodes have recently appeared on the network. This is great news, since exit nodes are typically on the scarce side. Exits usually occupy 30-33% of network by capacity, but are currently at a whopping 38.5% (156 MBytes/sec out of 404 total).
However, I want to make sure that these nodes stay up and don't end up being shut down due to easily preventable abuse complaints. I've run a number of exit nodes on a few different ISPs and not only have I lived to tell about it, I've have not had one shut down yet. Moreover, I've only received about 4 abuse complaints in as many years of running exit nodes. This is in stark contrast to other node operators following a more reactive strategy. I'm convinced this is largely because I observe the following pro-active guidelines. This guide is primarily US centric. Operators in other countries may have slightly different best practices (such as registering with RIPE and not ARIN). read more »
New releases, new hires, new funding
Bruce Leidl joins to work on developing Tor in Java. Bruce will write a fully functional Tor in Java in order to provide a solid foundation for other java-based projects; such as Tor on mobile platforms like Maemo and Android.
On November 2nd we released Vidalia 0.2.6. https://blog.torproject.org/blog/vidalia-026-released
On November 20th, we released Tor Browser Bundle 1.2.10. https://blog.torproject.org/blog/tor-browser-bundle-1210-released
On November 19th, we released Tor 0.2.2.6-alpha. https://blog.torproject.org/blog/tor-0226-alpha-released
Design, develop, and implement enhancements that make
Tor a better tool for users in censored countries.
Roger met with his class at KAIST working on bridge deployment strategies. A few teams developed some creative strategies. Roger is continuing to work with the leading teams to further refine their ideas before publishing. read more »
Congratulations to Chrisd for assuming maintainership of the Polipo codebase from Juliusz. The full announcement is available at the mailing list archives.
Chrisd was a 2009 Google Summer of Code for Tor/EFF. His project was Polipo Portability Enhancements. Chris has proven himself to be a very competent coder and able to design and implement features as needed. He even wrote a SOCKS layer fix for Firefox bugs, https://bugzilla.mozilla.org/show_bug.cgi?id=280661.
What does this mean to Tor?
It means we have a fantastic new coder maintaining the polipo codebase. Bugfixes, features, and more frequent releases should help improve polipo beyond where it is today.
Is Tor going to control Polipo? read more »
A while ago there was a thread on OR-TALK that devolved into
"why does Tor still ship ancient privoxy?"
"why are you shipping polipo with the Tor Browser Bundle instead of current privoxy?"
For those interested, the thread is here, http://archives.seul.org/or/talk/Jul-2009/msg00063.html.
Scott had a good argument for why we should update the bundles to the latest privoxy, and I agree, we should. But then I started thinking about why we needed a proxy at all. Almost all browsers support socks5 direct, isn't that faster than a middleman proxy?
This got me thinking about why polipo is in the TBB, but not the other packages. The TBB "feels faster" when using Tor than using the installed Tor, Vidalia, and Privoxy. However, I couldn't find any actual testing of performance of polipo vs. privoxy vs. socks5 direct. read more »
Jake, Mike, Karsten, Sebastian, and I attended Hacking at Random last week in The Netherlands. I did a talk on Tor performance challenges — basically walking through the key pieces of the "Why Tor is Slow" document that we wrote in March.
As usual with European hacking cons, they produced a really well-done video just days after my talk. So if you want to get the highlights on what we're doing to speed up Tor and what roadblocks remain, take a look at the video and also the slides that come with it.