hidden services

Nine Questions about Hidden Services

This is an interview with a Tor developer who works on hidden services. Please note that Tor Browser and hidden services are two different things. Tor Browser (downloadable at TorProject.org) allows you to browse, or surf, the web, anonymously. A hidden service is a site you visit or a service you use that uses Tor technology to stay secure and, if the owner wishes, anonymous. The secure messaging app ricochet is an example of a hidden service. Tor developers use the terms "hidden services" and "onion services" interchangeably.
1. What are your priorities for onion services development?
Personally I think it’s very important to work on the security of hidden services; that’s a big priority.
The plan for the next generation of onion services includes enhanced security as well as improved performance. We’ve broken the development down into smaller modules and we’re already starting to build the foundation. The whole thing is a pretty insane engineering job.

2. What don't people know about onion Services?
Until earlier this year, hidden services were a labor of love that Tor developers did in their spare time. Now we have a very small group of developers, but in 2016 we want to move the engineering capacity a bit farther out. There is a lot of enthusiasm within Tor for hidden services but we need funding and more high level developers to build the next generation.
3. What are some of Tor's plans for mitigating attacks?
The CMU attack was fundamentally a "guard node" attack; guard nodes are the first hop of a Tor circuit and hence the only part of the network that can see the real IP address of a hidden service. Last July we fixed the attack vector that CMU was using (it was called the RELAY_EARLY confirmation attack) and since then we've been divising improved designs for guard node security.
For example, in the past, each onion service would have three guard nodes assigned to it. Since last September, each onion service only uses one guard node—-it exposes itself to fewer relays. This change alone makes an attack against an onion service much less likely.
Several of our developers are thinking about how to do better guard node selection. One of us is writing code on this right now.
We are modeling how onion services pick guard nodes currently, and we're simulating other ways to do it to see which one exposes itself to fewer relays—the fewer relays you are exposed to, the safer you are.
We’ve also been working on other security things as well. For instance, a series of papers and talks have abused the directory system of hidden services to try to estimate the activity of particular hidden services, or to launch denial-of-service attacks against hidden services.

We’re going to fix this by making it much harder for the attacker's nodes to become the responsible relay of a hidden service (say, catfacts) and be able to track uptime and usage information. We will use a "distributed random number generator"--many computers teaming up to generate a single, fresh unpredictable random number. 

Another important thing we're doing is to make it impossible for a directory service to harvest addresses in the new design. If you don't know a hidden service address, then under the new system, you won't find it out just by hosting its HSDir entry.

There are also interesting performance things: We want to make .onion services scalable in large infrastructures like Facebook--we want high availability and better load balancing; we want to make it serious.

[Load balancing distributes the traffic load of a website to multiple servers so that no one server gets overloaded with all the users. Overloaded servers stop responding and create other problems. An attack that purposely overloads a website to cause it to stop responding is called a Denial of Service (DoS) attack.  - Kate]
There are also onion services that don’t care to stay hidden, like Blockchain or Facebook; we can make those much faster, which is quite exciting.
Meanwhile Nick is working on a new encryption design--magic circuit crypto that will make it harder to do active confirmation attacks. [Nick Mathewson is the co-founder of the Tor Project and the chief architect of our software.] Active confirmation attacks are much more powerful than passive attacks, and we can do a better job at defending against them.
A particular type of confirmation attack that Nick's new crypto is going to solve is a "tagging attack"—Roger wrote a blog post about them years ago called, "One Cell Is Enough"—it was about how they work and how they are powerful.
4. Do you run an onion service yourself?  
Yes, I do run onion services; I run an onion services on every box I have.  I connect to the PC in my house from anywhere in the world through SSH—I connect to my onion service instead of my house IP. People can see my laptop accessing Tor but don’t know who I am or where I go. 
Also, onion services have a property called NAT-punching; (NAT=Network Address Translation). NAT blocks incoming connections;it builds walls around you. Onion services have NAT punching and can penetrate a firewall. In my university campus, the firewall does not allow incoming connections to my SSH server, but with an onion service the firewall is irrelevant.
5. What is your favorite onion service that a nontechnical person might use? 
I use ricochet for my peer to peer chatting--It has a very nice UI and works well.
6. Do you think it’s safe to run an onion service? 
It depends on your adversary. I think onion services provide adequate security against most real life adversaries.

However, if a serious and highly motivated adversary were after me, I would not rely solely on the security of onion services. If your adversary can wiretap the whole Western Internet, or has a million dollar budget, and you only depend on hidden services for your anonymity then you should probably up your game. You can add more layers of anonymity by buying the servers you host your hidden service on anonymously (e.g. with bitcoin) so that even if they deanonymize you, they can't get your identity from the server. Also studying and actually understanding the Tor protocol and its threat model is essential practice if you are defending against motivated adversaries.

7. What onion services don’t exist yet that you would like to see? 
Onion services right now are super-volatile; they may appear for three months and then they disappear. For example, there was a Twitter clone, Tor statusnet; it was quite fun--small but cozy. The guy or girl who was running it couldn’t do it any longer. So, goodbye! It would be very nice to have a Twitter clone in onion services. Everyone would be anonymous. Short messages by anonymous people would be an interesting thing.
I would like to see apps for mobile phones using onion services more—SnapChat over Tor, Tinder over Tor—using Orbot or whatever. 
A good search engine for onion services. This volatility comes down to not having a search engine—you could have a great service, but only 500 sketchoids on the Internet might know about it.
Right now, hidden services are misty and hard to see, with the fog of war all around. A sophisticated search engine could highlight the nice things and the nice communities; those would get far more traffic and users and would stay up longer.

The second question is how you make things. For many people, it’s not easy to set up an onion service. You have to open Tor, hack some configuration files, and there's more.

We need a system where you double click, and bam, you have an onion service serving your blog. Griffin Boyce is developing a tool for this named Stormy. If we have a good search engine and a way for people to start up onion services easily, we will have a much nicer and more normal Internet in the onion space. 
8. What is the biggest misconception about onion services?
People don't realize how many use cases there are for onion services or the inventive ways that people are using them already. Only a few onion services ever become well known and usually for the wrong reasons.
I think it ties back to the previous discussion--—the onion services we all enjoy have no way of getting to us. Right now, they are marooned on their island of hiddenness. 
9. What is the biggest misconception about onion services development?
It’s a big and complex project—it’s building a network inside a network; building a thing inside a thing. But we are a tiny team. We need the resources and person power to do it.

(Interview conducted by Kate Krauss)

Did the FBI Pay a University to Attack Tor Users?

The Tor Project has learned more about last year's attack by Carnegie Mellon researchers on the hidden service subsystem. Apparently these researchers were paid by the FBI to attack hidden services users in a broad sweep, and then sift through their data to find people whom they could accuse of crimes. We publicized the attack last year, along with the steps we took to slow down or stop such an attack in the future:

Here is the link to their (since withdrawn) submission to the Black Hat conference:
along with Ed Felten's analysis at the time:

We have been told that the payment to CMU was at least $1 million.

There is no indication yet that they had a warrant or any institutional oversight by Carnegie Mellon's Institutional Review Board. We think it's unlikely they could have gotten a valid warrant for CMU's attack as conducted, since it was not narrowly tailored to target criminals or criminal activity, but instead appears to have indiscriminately targeted many users at once.

Such action is a violation of our trust and basic guidelines for ethical research. We strongly support independent research on our software and network, but this attack crosses the crucial line between research and endangering innocent users.

This attack also sets a troubling precedent: Civil liberties are under attack if law enforcement believes it can circumvent the rules of evidence by outsourcing police work to universities. If academia uses "research" as a stalking horse for privacy invasion, the entire enterprise of security research will fall into disrepute. Legitimate privacy researchers study many online systems, including social networks — If this kind of FBI attack by university proxy is accepted, no one will have meaningful 4th Amendment protections online and everyone is at risk.

When we learned of this vulnerability last year, we patched it and published the information we had on our blog:

We teach law enforcement agents that they can use Tor to do their investigations ethically, and we support such use of Tor — but the mere veneer of a law enforcement investigation cannot justify wholesale invasion of people's privacy, and certainly cannot give it the color of "legitimate research".

Whatever academic security research should be in the 21st century, it certainly does not include "experiments" for pay that indiscriminately endanger strangers without their knowledge or consent.

A Hidden Service Hackfest: The Arlington Accords

At the beginning of July, a few of us gathered in Washington DC for the first hidden service hackfest. Our crew was comprised of core Tor developers and researchers who were in the area; mostly attendees of PETS. The aim was to push hidden service development forward and swiftly arrive at decisions that were too tiresome and complex to make over e-mail.

Since we were mostly technical folks, we composed technical proposals and prioritized development, and spent less time with organizational or funding tasks. Here is a snapshot of the work that we did during those 5 days:

  • The first day, we discussed current open topics on hidden services and tasks we should be doing in the short-to-medium-term future.

    Our list of tasks included marketing and fundraising ones like "Re-branding hidden services" and "Launch crowdfunding campaign", but we spent most of the first day discussing Proposal 224 aka the "Next Generation Hidden Services" project.

  • Proposal 224 is our master plan for improving hidden services in fundamental ways: The new system will be faster, use better cryptography, have more secure onion addresses, and offer advanced security properties like improved DoS resistance and keeping identity keys offline. It's heavy engineering work, and we are still fine-tuning the design, so implementation has not started yet.

    While discussing how we would implement the system, we decided that we would need to write most of the code for this new protocol from scratch, instead of hooking into the old and rusty hidden service code. To move this forward, we spent part of the following days splitting the proposal into individual modules and figuring out how to refactor the current data structures so that the new protocol can coexist with the old protocol.

  • One open design discussion on proposal 224 has been an earlier suggestion of merging the roles of "hidden service directory" and "introduction point" on the hidden service protocol. This change would improve the security and performance as well as simplify the relevant code, and reduce load on the network. Because it changes the protocol a bit, it would be good to have it specified precisely. For this reason, we spent the second and third days writing a proposal that defines how this change works.
  • Another core part of proposal 224 is the protocol for global randomness calculation. That's a system where the Tor network itself generates a fresh, unpredictable random value everyday; basically like the NIST Randomness Beacon but decentralized.

    Proposal 225 specifies a way that this can be achieved, but there are still various engineering details that need to be ironed out. We spent some time discussing the various ways we can implement the system and the engineering decisions we should take, and produced a draft Tor proposal that specifies the system.

  • We also discussed guard discovery attacks, and the various defenses that we could deploy. The fact that many core Tor people were present helped us decide rapidly which various parameters and trade-offs that we should pick. We sketched a proposal and posted it to the [tor-dev] mailing list and it has already received very helpful feedback.
  • We also took our old design for "Direct Onion Services" and revised it into a faster and far more elegant protocol. These types of services trade service-side location privacy for improved performance, reliability, and scalability. They will allow sites like reddit to offer their services faster on hidden services while respecting their clients anonymity. During the last days of the hackfest, we wrote a draft proposal for this new design.
  • We did more development on OnioNS, the Onion Name System, which allows a hidden service operator to register a human memorable name (e.g. example.tor) that can be used instead of the regular onion address. In the last days of the hackfest we prepared a proof-of-concept demo wherein a domain name was registered and then the Tor Browser successfully loaded a hidden service under that name. That was a significant step for the project.
  • We talked about rebranding the "Hidden Services" project to "Onion Services" to reduce "hidden"/"dark"/"evil" name connotations, and improve terminology. In fact, we've been on this for a while, but we are still not sure what the right name is. What do you think?

And that's only part of what we did. We also wrote code for various tickets, reviewed even more code and really learned how to use Ricochet.

All in all, we managed to fit more things than we hoped into those few days and we hope to do even more focused hackfests in the near future. Email us if you are interested in hosting a hackfest!

If you'd like to get involved with hidden service development, you can contact the hackfest team. Our nicks on IRC OFTC are armadev, asn, dgoulet, kernelcorn, mrphs, ohmygodel, robgjansen, saint, special, sysrqb, and syverson.

Until next time!

How We Work

The Tor Project is driven by ideas. We believe in the right to privacy for every person on the planet. Our community—paid and volunteer—brainstorms projects that embody those ideas, like decentralized hidden messaging systems or ingenious new ways to get uncensored Internet access to people in China.

On our public wikis, we make lists of what we need to build these projects—and then we approach potential sponsors with these lists. If we’re lucky, a sponsor will pay to do the project. If not, we may make it for free.

This is true whether the potential sponsor is a government agency or anyone else.

Because of this system, some projects, like hidden services, need more funding, and we are seeking individual contributions to make this technology stronger. One day we hope to build it into many more programs—for instance, phone apps--to make them private and secure by default.

Our diverse, international community includes thousands of men and women inspired by the ideals we share. They work to support Tor and create important tools based on Tor, like Tails and Orbot (there are at least a dozen of these). Our group includes visionaries who think and talk publicly about the Internet and the future of privacy; among them: @nickm_tor, @ioerror and @RogerDingledine. @aaronsw was one of us.

We will accept no back doors to our software, ever. You can watch @ioerror talk about this at last year’s 31c3 talk in Hamburg. We believe in and build free, open source software—free as in freedom. Tor’s source code is online for everyone to see.

We are proud of our people, our work, and our ideals. We are a human rights organization. We are inventors. Our community is a workshop for the future of privacy tools; maybe even for the future of privacy.

The Tor community is open to newcomers; we hope you will join us.

Crowdfunding the Future (of Hidden Services)

Hidden Services have received a lot of attention in 2015, and Tor is at the center of this conversation. Hidden Services are a Tor technology that allows users to connect to services (blogs, chats, and many other things) with neither the user nor the site giving up identifying information.

In fact, anything you can build on the internet, you can build on hidden services. But they're better--they give users things that normal networking doesn't authentication and confidentiality are built in; anonymity is built in. An internet based on hidden services would be an internet with Tor built in--a feature that users could take for granted. Think of what this might mean to millions of users in countries like China, Iran, or the UK. Yet currently, only about 4% of Tor's traffic comes from hidden services.

So we at Tor have been considering how we might meet the challenge of making them more widely available. In this post, we will briefly discuss the role of hidden services before we explore the idea of using crowdfunding to pay for bold, long-term tech initiatives that will begin to fulfill the promise of this technology.

Hidden Services are a critical part of Tor's ecosystem

Hidden Services provide a means for Tor users to create sites and services that are accessible exclusively within the Tor network, with privacy and security features that make them useful and appealing for a wide variety of applications.

For example, hidden services are currently used by activists and journalists to publish blogs--in anonymity and free from retaliation. They are used by NGOs to securely receive information on government corruption and injustice from concerned citizens. Newspapers such as the Washington Post, and human rights groups such as Amnesty International use them to receive leaked information. They are used by people looking for the latest cat facts, companies that want to secure the path of their clients or by people chatting securely and anonymously -- including at-risk journalists talking to sources.

In addition, developers use hidden services as a building block to incorporate Tor's security and anonymity features into totally separate products. The potential of hidden services is huge, and much of it is yet to be explored.

Next Steps for Hidden Services

We want to make this technology available to the wider public as these services will play a key role in the future of secure communications. This means that we must increase the uses for hidden services, bring them to mobile platforms for anonymous mobile apps, and vastly increase the number of people who use them.

Since our goal is wider use, it is imperative that we build them to be more secure, easier to set up, better performing, and more usable. Clearly, the questions that we answer in early deployment efforts will inform how we answer the deeper questions pertaining to massive worldwide deployment.

We must engage a large number of people to bring hidden services to the next level. Until now, hidden services development largely relied on the volunteer work of developers in their spare time. This will not be sufficient if we are to make the leap to transformative hidden services.

We are currently evaluating funding strategies that will support our Hidden Service initiatives in the short-, intermediate- and long-term. In order to fit the requirements more conservative large funders have, so we can fully sponsor the Next Generation Hidden Services, we must put preliminary pieces in place. And for that we will reach out to crowdfunding. To do this right, we need your feedback.

Why Crowdfunding?

Crowdfunding allows us to engage the broader community in grasping the opportunity that this new technology promises. We are confident that we can deliver significant advancements in the hidden services field in the short-term, and that many small donors who understand their context will be eager to contribute. We intend to begin by prioritizing the improvement of the security, usability, and performance of the current hidden services system.

Further, we want to make sure we support the efforts of community projects and that the community is participating in shaping the evolution of hidden services. For example, it would be important to assist and improve the Tor integration of projects such as SecureDrop, Pond, Ahmia and Ricochet. We are in the unique position to be able to shape the Tor protocol to make these projects easier to use and better performing, and we would like to identify ways to promote broader deployment of these projects.

Identifying, prioritizing and meeting future challenges will require engagement throughout the greater community. For instance, as changes and enhancements are introduced, we hope to speak with the best bug hunters, cryptographers and privacy experts and ask them to audit our code and designs. Non-technical users could help us evaluate the usability of our improvements.

For this crowdfunding campaign we have identified a few possible ideas-- but the point of this post is to ask you for yours. Here are three projects that we have come up with so far:

  • Information Panel for Hidden Service Operators

  • An application that Hidden Service operators could use to learn more about the activity of their Hidden Service. The operator would have access to information on user activity, security information, etc., and will receive important system-generated updates, including log messages

  • Fast-but-not-hidden services

  • A way to set up public hidden services with improved performance but reduced server-side anonymity. Basically, hidden services that don't care about anonymity but still want to protect their clients with Tor's cryptography and anonymity, will be able to run faster since they don't need to protect their own anonymity. This is an optional feature that suits the needs of large sites like Facebook and reddit, and will make their hidden services faster while also reducing the traffic they cause to the network. Also by optimizing for performance in this specialized feature, we can optimize for security even more in the default hidden services configuration.

  • Next Generation Hidden Services

  • Tor has been at the center of hidden services from the beginning. We have big lists of changes we need to do to the Tor protocol to increase the security of hidden services against cryptanalysis, DoS and deanonymization attacks. We also want to improve guard security, allow operators to store their cryptographic keys offline and enable scaling of hidden services to new levels. This is a big project but we hope to start crunching through it as part of this crowdfunding campaign.

    Your Idea for Hidden Services?

    Long story short, we are looking for feedback!

  • What hidden services projects would you like to see us crowdfund?

  • How do you use hidden services; what makes them important to you? How you want to see them evolve?

  • We'd love to hear your ideas on picking crowdfunding rewards and stretch goals.
    Also, we are curious about which crowdfunding platforms you prefer and why.

  • Feel free to use the comments of this blog, or contact us directly at tor-assistants@torproject.org. Also see our wiki page with more information!

    In the following weeks, we will update you on our progress, incorporating feedback we receive from the community. We hope to make this process as transparent and public as possible!


    EDIT: The "Unhidden Services" paragraph was expanded and changed to "Fast-but-not-hidden Services". The previous name was too scary and the description not sufficient to show the potential of the project. Please send us better names for this feature!

    Some statistics about onions

    Non-technical abstract:

    We are starting a project to study and quantify hidden services traffic. As part of this project, we are collecting data from just a few volunteer relays which only allow us to see a small portion of hidden service activity (between 2% and 5%). Extrapolating from such a small sample is difficult, and our data are preliminary.

    We've been working on methods to improve our calculations, but with our current methodology, we estimate that about 30,000 hidden services announce themselves to the Tor network every day, using about 5 terabytes of data daily. We also found that hidden service traffic is about 3.4% of total Tor traffic, which means that, at least according to our early calculations, 96.6% of Tor traffic is *not* hidden services. We invite people to join us in working to research methodologies and develop systems for better understanding Tor hidden services.


    Over the past months we've been working on hidden service statistics. Our goal has been to answer the following questions:

    • "Approximately how many hidden services are there?"
    • "Approximately how much traffic of the Tor network is going to hidden services?"

    We chose the above two questions because even though we want to understand hidden services, we really don't want to harm the privacy of Tor users. From a privacy perspective, the above two questions are relatively easy questions to answer because we don't need data from clients or the hidden services themselves; we just need data from hidden service directories and rendezvous points. Furthermore, the measurements reported by each relay cannot be linked back to specific hidden services or their clients.

    Our first move was to research various ways we could collect these statistics in a privacy-preserving manner. After days of discussions on obfuscating statistics, we began writing a Tor proposal with our design, as well as code that implements the proposal. The code has since been reviewed and merged to Tor! The statistics are currently disabled by default so we asked volunteer relay operators to explicitly turn them on. Currently there are about 70 relays publishing measurements to us every 24 hours:

    Number of relays reporting stats

    So as of now we've been receiving these measurements for over a month, and we have thought a lot about how to best use the reported measurements to derive interesting results. We finally have some preliminary results we would like to share with you:

    How many hidden services are there?

    All in all, it seems that every day about 30000 hidden services announce themselves to the hidden service directories. Graphically:

    Number of hidden services

    By counting the number of unique hidden service addresses seen by HSDirs, we can get the approximate number of hidden services. Keep in mind that we can only see between 2% and 5% of the total HSDir space, so the extrapolation is, naturally, messy.

    How much traffic do hidden services cause?

    Our preliminary results show that hidden services cause somewhere between 400 to 600 Mbit of traffic per second, or equivalently about 4.9 terabytes a day. Here is a graph:

    Hidden services traffic volume

    We learned this by getting rendezvous points to publish the total number of cells transferred over rendezvous circuits, which allows us to learn the approximate volume of hidden service traffic. Notice that our coverage here is not very good either, with a probability of about 5% that a hidden service circuit will use a relay that reports these statistics as a rendezvous point.

    A related statistic here is "How much of the Tor network is actually hidden service usage?". There are two different ways to answer this question, depending on whether we want to understand what clients are doing or what the network is doing. The fraction of hidden-service traffic at Tor clients differs from the fraction at Tor relays because connections to hidden services use 6-hop circuits while connections to the regular Internet use 3-hop circuits. As a result, the fraction of hidden-service traffic entering or leaving Tor is about half of the fraction of hidden-service traffic inside of Tor. Our conclusion is that about 3.4% of client traffic is hidden-service traffic, and 6.1% of traffic seen at a relay is hidden-service traffic.

    Conclusion and future work

    In this blog post we presented some preliminary results that could be extracted from these new hidden service statistics. We hope that this data can help us better gauge the future development and maturity of the onion space as well as detect potential incidents and bugs on the network. To better present our results and methods, we wrote a short technical report that outlines the exact process we followed. We invite you to read it if you are curious about the methodology or the results.

    Finally, this project is only a few months old, and there are various plans for the future. For example:

    • There are more interesting questions that we could examine in this area. For example: "How many people are using hidden services every day?" and "How many times does someone try to visit a hidden service that does not exist anymore?."

      Unfortunately, some of these questions are not easy to answer with the current statistics reporting infrastructure, mainly because collecting them in this way could reveal information about specific hidden services but also because the results of the current system contain too much obfuscating data (each reporting relay randomizes its numbers a little bit before publishing them, so we can learn about totals but not about specific events).

      For this reason, we've been analyzing various statistics aggregation protocols that could be used in place of the current system, allowing us to safely collect other kinds of statistics.

    • We need to incorporate these statistics in our Metrics portal so that they are updated regularly and so that everyone can follow them.

    • Currently, these hidden service statistics are not collected in relays by default. Unfortunately, that gives us very small coverage of the network, which in turn makes our extrapolations very noisy. The main reason that these statistics are disabled by default is that similar statistics are also disabled (e.g. CellStatistics). Also, this allows us more time to consider privacy consequences. As we analyze more of these statistics and think more about statistics privacy, we should decide whether to turn these statistics on by default.

      It's worth repeating that the current results are preliminary and should be digested with a grain of salt. We invite statistically-inclined people to review our code, methods, and results. If you are a researcher interested in digging into the measurements themselves, you can find them in the extra-info descriptors of Tor relays.

      Over the next months, we will also be thinking more about these problems to figure out proper ways to analyze and safely measure private ecosystems like the onion space.

    Till then, take care, and enjoy Tor!

    Some thoughts on Hidden Services

    Hi! Nick here.

    I ought to post my own responses to that Andy Greenberg article, too. (Especially since most everybody else around here is at 31c3 right now, or sick with the flu, or both.)

    When I saw the coverage of the hidden services study that was presented at CCC today, I was reminded of the media fallout from that old study from the 1990s that "proved" that a ridiculously high fraction of the internet was pornography...by looking at Usenet*, and by counting newsgroups and bytes. (You might remember it; it was the basis of the delightful TIME Magazine "Cyberporn" cover.)

    The 1990s researcher wasn't lying outright, but he and the press *were* conflating one question: "What fraction of Usenet groups are 'alt.sex' or 'alt.binaries' (file posting) groups" with two others: "What fraction of internet traffic is porn?" and "What fraction of internet-user hours are spent on porn?"

    These are quite different things.

    The presentation today focused on data about hidden service types and usage. Predictably, given the results from Biryukov, Pustogarov, Thill, and Weinmann, the researcher found that hidden services related to child abuse are only a small fraction of the total number of hidden service addresses on the network. And because of the way that hidden services work, traffic does not go through hidden service directories, but instead through rendezvous points (randomly chosen Tor nodes): so no relay that knows the hidden service's address will learn the actual amount of traffic transmitted. But, as previously documented, abusive services represent a disproportionate fraction of usage... if you're measuring usage with hidden service directory requests.

    Why might that be?

    First, some background. Basically, a Tor client makes a hidden service directory request the first time it visits a hidden service that it has not been to in a while. (If you spend hours at one hidden service, you make about 1 hidden service directory request. But if you spend 1 second each at 100 hidden services, you make about 100 requests.) Therefore, obsessive users who visit many sites in a session account for many more of the requests that this study measures than users who visit a smaller number of sites with equal frequency.

    There are other confounding factors as well. Due to bugs in older Tor implementations, a hidden service that is unreliable (or completely unavailable) will get many, many more hidden service directory requests than a reliable one. So if any abuse sites are unusually unreliable, we'd expect their users to create a disproportionately large number of hidden service directory requests.

    Also, a very large number of hidden service directory requests are probably not made by humans! See bug 13287: We don't know what's up with that. Could this be caused by some kind of anti-abuse organization running an automated scanning tool?

    In any case, a methodology that looks primarily at hidden service directory requests will over-rate services that are frequently accessed from a Tor client that hasn't been there recently, and under-rate services that are used via tor2web, and so on. It also depends a lot on how hidden services are configured, how frequently Tor hidden service directories go up and down, and what times of day they change introduction points in comparison to what time of day their users tend to be awake.

    The greater the number of distinct hidden services a person visits, and the less reliable those sites are, the more hidden service directory requests they will trigger.

    Suppose 10 people use hidden services to look at conspiracy theories, 100 people use hidden services to buy Cuban cigars, and 1000 people use it for online chat.

    But suppose that the average cigar purchaser visits only one or two sites to make purchases, and the average chat user joins one or two networks, whereas the average conspiracy theorist needs to visit several dozen forums and wikis.

    Suppose also that the average Cuban cigar purchaser makes about two purchases a month, the average chat user logs in once a day, and the average conspiracy theorist spends 3 hours a day crawling the hidden web.

    And suppose that conspiracy theory websites come and go frequently, whereas cigar sites and chat networks are more stable.

    In this analysis, even though there are far more people buying cigars, users who use it for obsessive behavior that spans multiple unreliable hidden services will be far overrepresented in the count of hidden service directory requests than users who use it for activities done less frequently and across fewer services. So any comparison of hidden service directory request counts will say more about the behavioral differences of different types of users than about their relative numbers, or the amount of traffic they generated.

    In conclusion, let's spend a minute talking about freedom and philosophy. Any system that provides security on the Internet will inevitably see some use by bad people that we'd rather not help at all. After all, cars are used for getaways, and window shades conceal all kinds of criminality. The only way to make a privacy tool that nobody abuses is to make it so weak that people aren't willing to touch it, or so unusable that nobody can figure it out.

    Up till now, many of the early adopters for Tor hidden services have been folks for whom the risk/effort calculations have been quite extreme, since--as I'd certainly acknowledge--the system isn't terribly usable for the average person as it stands. Roger noted earlier that hidden services amount to less than 2% of our total traffic today. Given their privacy potential, I think that's not even close to enough. We've got to work over the next year or more to develop hidden services to the point where their positive impact is felt by the average netizen, whether they're publishing a personal blog for their friends, using a novel communications protocol more secure than email, or reading a news article based on information that a journalist received through an anonymous submission system. Otherwise, they'll remain a target for every kind of speculation, and every misunderstanding about them will lead people to conclude the worst about privacy online. Come lend a hand?

    (Also, no offense to Andy on this: he is a fine tech reporter and apparently a fine person. And no offense to Dr. Owen, who explained his results a lot more carefully than they have been re-explained elsewhere. Now please forgive me, I'm off to write some more software and get some sleep. Please direct all media inquiries to the email of "press at torproject dot org".)

    * Usenet was sort of like Twitter, only you could write paragraphs on it. ;)

    Thoughts and Concerns about Operation Onymous

    What happened

    Recently it was announced that a coalition of government agencies took control of many Tor hidden services. We were as surprised as most of you. Unfortunately, we have very little information about how this was accomplished, but we do have some thoughts which we want to share.

    Over the last few days, we received and read reports saying that several Tor relays were seized by government officials. We do not know why the systems were seized, nor do we know anything about the methods of investigation which were used. Specifically, there are reports that three systems of Torservers.net disappeared and there is another report by an independent relay operator. If anyone has more details, please get in contact with us. If your relay was seized, please also tell us its identity so that we can request that the directory authorities reject it from the network.

    But, more to the point, the recent publications call the targeted hidden services seizures "Operation Onymous" and they say it was coordinated by Europol and other government entities. Early reports say 17 people were arrested, and 400 hidden services were seized. Later reports have clarified that it was hundreds of URLs hosted on roughly 27 web sites offering hidden services. We have not been contacted directly or indirectly by Europol nor any other agency involved.

    Tor is most interested in understanding how these services were located, and if this indicates a security weakness in Tor hidden services that could be exploited by criminals or secret police repressing dissents. We are also interested in learning why the authorities seized Tor relays even though their operation was targetting hidden services. Were these two events related?

    How did they locate the hidden services?

    So we are left asking "How did they locate the hidden services?". We don't know. In liberal democracies, we should expect that when the time comes to prosecute some of the seventeen people who have been arrested, the police would have to explain to the judge how the suspects came to be suspects, and that as a side benefit of the operation of justice, Tor could learn if there are security flaws in hidden services or other critical internet-facing services. We know through recent leaks that the US DEA and others have constructed a system of organized and sanctioned perjury which they refer to as "parallel construction."

    Unfortunately, the authorities did not specify how they managed to locate the hidden services. Here are some plausible scenarios:

    Operational Security

    The first and most obvious explanation is that the operators of these hidden services failed to use adequate operational security. For example, there are reports of one of the websites being infiltrated by undercover agents and the affidavit states various operational security errors.

    SQL injections

    Another explanation is exploitation of common web bugs like SQL injections or RFIs (remote file inclusions). Many of those websites were likely quickly-coded e-shops with a big attack surface. Exploitable bugs in web applications are a common problem.

    Bitcoin deanonymization

    Ivan Pustogarov et al. have recently been conducting interesting research on Bitcoin anonymity.

    Apparently, there are ways to link transactions and deanonymize Bitcoin clients even if they use Tor. Maybe the seized hidden services were running Bitcoin clients themselves and were victims of similar attacks.

    Attacks on the Tor network

    The number of takedowns and the fact that Tor relays were seized could also mean that the Tor network was attacked to reveal the location of those hidden services. We received some interesting information from an operator of a now-seized hidden service which may indicate this, as well. Over the past few years, researchers have discovered various attacks on the Tor network. We've implemented some defenses against these attacks, but these defenses do not solve all known issues and there may even be attacks unknown to us.

    For example, some months ago, someone was launching non-targetted deanonymization attacks on the live Tor network. People suspect that those attacks were carried out by CERT researchers. While the bug was fixed and the fix quickly deployed in the network, it's possible that as part of their attack, they managed to deanonymize some of those hidden services.

    Another possible Tor attack vector could be the Guard Discovery attack. This attack doesn't reveal the identity of the hidden service, but allows an attacker to discover the guard node of a specific hidden service. The guard node is the only node in the whole network that knows the actual IP address of the hidden service. Hence, if the attacker then manages to compromise the guard node or somehow obtain access to it, she can launch a traffic confirmation attack to learn the identity of the hidden service. We've been
    discussing various solutions to the guard discovery attack for the past many months but it's not an easy problem to fix properly. Help and feedback on the proposed designs is appreciated.

    *Similarly, there exists the attack where the hidden service selects the attacker's relay as its guard node. This may happen randomly or this could occur if the hidden service selects another relay as its guard and the attacker renders that node unusable, by a denial of service attack or similar. The hidden service will then be forced to select a new guard. Eventually, the hidden service will select the attacker.

    Furthermore, denial of service attacks on relays or clients in the Tor network can often be leveraged into full de-anonymization attacks. These techniques go back many years, in research such as "From a Trickle to a Flood", "Denial of Service or Denial of Security?", "Why I'm not an Entropist", and even the more recent Bitcoin attacks above. In the Hidden Service protocol there are more vectors for DoS attacks, such as the set of HSDirs and the Introduction Points of a Hidden Service.

    Finally, remote code execution exploits against Tor software are also always a possibility, but we have zero evidence that such exploits exist. Although the Tor source code gets continuously reviewed by our security-minded developers and community members, we would like more focused auditing by experienced bug hunters. Public-interest initiatives like Project Zero could help out a lot here. Funding to launch a bug bounty program of our own could also bring real benefit to our codebase. If you can help, please get in touch.

    Advice to concerned hidden service operators

    As you can see, we still don't know what happened, and it's hard to give concrete suggestions blindly.

    If you are a concerned hidden service operator, we suggest you read the cited resources to get a better understanding of the security that hidden services can offer and of the limitations of the current system. When it comes to anonymity, it's clear that the tighter your threat model is, the more informed you need to be about the technologies you use.

    If your hidden service lacks sufficient processor, memory, or network resources the DoS based de-anonymization attacks may be easy to leverage against your service. Be sure to review the Tor performance tuning guide to optimize your relay or client.

    *Another possible suggestion we can provide is manually selecting the guard node of a hidden service. By configuring the EntryNodes option in Tor's configuration file you can select a relay in the Tor network you trust. Keep in mind, however, that a determined attacker will still be able to determine this relay is your guard and all other attacks still apply.

    Final words

    The task of hiding the location of low-latency web services is a very hard problem and we still don't know how to do it correctly. It seems that there are various issues that none of the current anonymous publishing designs have really solved.

    In a way, it's even surprising that hidden services have survived so far. The attention they have received is minimal compared to their social value and compared to the size and determination of their adversaries.

    It would be great if there were more people reviewing our designs and code. For example, we would really appreciate feedback on the upcoming hidden service revamp or help with the research on guard discovery attacks (see links above).

    Also, it's important to note that Tor currently doesn't have funding for improving the security of hidden services. If you are interested in funding hidden services research and development, please get in touch with us. We hope to find time to organize a crowdfunding campaign to acquire independent and focused hidden service funding.

    Finally, if you are a relay operator and your server was recently compromised or you lost control of it, please let us know by sending an email to bad-relays@lists.torproject.org.

    Thanks to Griffin, Matt, Adam, Roger, David, George, Karen, and Jake for contributions to this post.

    * Added information about guard node DoS and EntryNodes option - 2014/11/09 18:16 UTC

    Facebook, hidden services, and https certs

    Today Facebook unveiled its hidden service that lets users access their website more safely. Users and journalists have been asking for our response; here are some points to help you understand our thinking.

    Part one: yes, visiting Facebook over Tor is not a contradiction

    I didn't even realize I should include this section, until I heard from a journalist today who hoped to get a quote from me about why Tor users wouldn't ever use Facebook. Putting aside the (still very important) questions of Facebook's privacy habits, their harmful real-name policies, and whether you should or shouldn't tell them anything about you, the key point here is that anonymity isn't just about hiding from your destination.

    There's no reason to let your ISP know when or whether you're visiting Facebook. There's no reason for Facebook's upstream ISP, or some agency that surveils the Internet, to learn when and whether you use Facebook. And if you do choose to tell Facebook something about you, there's still no reason to let them automatically discover what city you're in today while you do it.

    Also, we should remember that there are some places in the world that can't reach Facebook. Long ago I talked to a Facebook security person who told me a fun story. When he first learned about Tor, he hated and feared it because it "clearly" intended to undermine their business model of learning everything about all their users. Then suddenly Iran blocked Facebook, a good chunk of the Persian Facebook population switched over to reaching Facebook via Tor, and he became a huge Tor fan because otherwise those users would have been cut off. Other countries like China followed a similar pattern after that. This switch in his mind between "Tor as a privacy tool to let users control their own data" to "Tor as a communications tool to give users freedom to choose what sites they visit" is a great example of the diversity of uses for Tor: whatever it is you think Tor is for, I guarantee there's a person out there who uses it for something you haven't considered.

    Part two: we're happy to see broader adoption of hidden services

    I think it is great for Tor that Facebook has added a .onion address. There are some compelling use cases for hidden services: see for example the ones described at using Tor hidden services for good, as well as upcoming decentralized chat tools like Ricochet where every user is a hidden service, so there's no central point to tap or lean on to retain data. But we haven't really publicized these examples much, especially compared to the publicity that the "I have a website that the man wants to shut down" examples have gotten in recent years.

    Hidden services provide a variety of useful security properties. First — and the one that most people think of — because the design uses Tor circuits, it's hard to discover where the service is located in the world. But second, because the address of the service is the hash of its key, they are self-authenticating: if you type in a given .onion address, your Tor client guarantees that it really is talking to the service that knows the private key that corresponds to the address. A third nice feature is that the rendezvous process provides end-to-end encryption, even when the application-level traffic is unencrypted.

    So I am excited that this move by Facebook will help to continue opening people's minds about why they might want to offer a hidden service, and help other people think of further novel uses for hidden services.

    Another really nice implication here is that Facebook is committing to taking its Tor users seriously. Hundreds of thousands of people have been successfully using Facebook over Tor for years, but in today's era of services like Wikipedia choosing not to accept contributions from users who care about privacy, it is refreshing and heartening to see a large website decide that it's ok for their users to want more safety.

    As an addendum to that optimism, I would be really sad if Facebook added a hidden service, had a few problems with trolls, and decided that they should prevent Tor users from using their old https://www.facebook.com/ address. So we should be vigilant in helping Facebook continue to allow Tor users to reach them through either address.

    Part three: their vanity address doesn't mean the world has ended

    Their hidden service name is "facebookcorewwwi.onion". For a hash of a public key, that sure doesn't look random. Many people have been wondering how they brute forced the entire name.

    The short answer is that for the first half of it ("facebook"), which is only 40 bits, they generated keys over and over until they got some keys whose first 40 bits of the hash matched the string they wanted.

    Then they had some keys whose name started with "facebook", and they looked at the second half of each of them to pick out the ones with pronouncable and thus memorable syllables. The "corewwwi" one looked best to them — meaning they could come up with a story about why that's a reasonable name for Facebook to use — so they went with it.

    So to be clear, they would not be able to produce exactly this name again if they wanted to. They could produce other hashes that start with "facebook" and end with pronouncable syllables, but that's not brute forcing all of the hidden service name (all 80 bits).

    For those who want to explore the math more, read about the "birthday attack". And for those who want to learn more (please help!) about the improvements we'd like to make for hidden services, including stronger keys and stronger names, see hidden services need some love and Tor proposal 224.

    Part four: what do we think about an https cert for a .onion address?

    Facebook didn't just set up a hidden service. They also got an https certificate for their hidden service, and it's signed by Digicert so your browser will accept it. This choice has produced some feisty discussions in the CA/Browser community, which decides what kinds of names can get official certificates. That discussion is still ongoing, but here are my early thoughts on it.

    In favor: we, the Internet security community, have taught people that https is necessary and http is scary. So it makes sense that users want to see the string "https" in front of them.

    Against: Tor's .onion handshake basically gives you all of that for free, so by encouraging people to pay Digicert we're reinforcing the CA business model when maybe we should be continuing to demonstrate an alternative.

    In favor: Actually https does give you a little bit more, in the case where the service (Facebook's webserver farm) isn't in the same location as the Tor program. Remember that there's no requirement for the webserver and the Tor process to be on the same machine, and in a complicated set-up like Facebook's they probably shouldn't be. One could argue that this last mile is inside their corporate network, so who cares if it's unencrypted, but I think the simple phrase "ssl added and removed here" will kill that argument.

    Against: if one site gets a cert, it will further reinforce to users that it's "needed", and then the users will start asking other sites why they don't have one. I worry about starting a trend where you need to pay Digicert money to have a hidden service or your users think it's sketchy — especially since hidden services that value their anonymity could have a hard time getting a certificate.

    One alternative would be to teach Tor Browser that https .onion addresses don't deserve a scary pop-up warning. A more thorough approach in that direction is to have a way for a hidden service to generate its own signed https cert using its onion private key, and teach Tor Browser how to verify them — basically a decentralized CA for .onion addresses, since they are self-authenticating anyway. Then you don't have to go through the nonsense of pretending to see if they could read email at the domain, and generally furthering the current CA model.

    We could also imagine a pet name model where the user can tell her Tor Browser that this .onion address "is" Facebook. Or the more direct approach would be to ship a bookmark list of "known" hidden services in Tor Browser — like being our own CA, using the old-fashioned /etc/hosts model. That approach would raise the political question though of which sites we should endorse in this way.

    So I haven't made up my mind yet about which direction I think this discussion should go. I'm sympathetic to "we've taught the users to check for https, so let's not confuse them", but I also worry about the slippery slope where getting a cert becomes a required step to having a reputable service. Let us know if you have other compelling arguments for or against.

    Part five: what remains to be done?

    In terms of both design and security, hidden services still need some love. We have plans for improved designs (see Tor proposal 224) but we don't have enough funding and developers to make it happen. We've been talking to some Facebook engineers this week about hidden service reliability and scalability, and we're excited that Facebook is thinking of putting development effort into helping improve hidden services.

    And finally, speaking of teaching people about the security features of .onion sites, I wonder if "hidden services" is no longer the best phrase here. Originally we called them "location-hidden services", which was quickly shortened in practice to just "hidden services". But protecting the location of the service is just one of the security features you get. Maybe we should hold a contest to come up with a new name for these protected services? Even something like "onion services" might be better if it forces people to learn what it is.

    Ahmia search after GSoC development

    The Google Summer of Code (GSoC) was an excellent opportunity to improve on the Ahmia search engine. With Google's stipend and friendly mentoring from The Tor Project, I was able to concentrate on development of my search engine project. Thank you all!

    GSoC 2014 is over, but I am sticking around to continue developing and maintaining Ahmia.

    Here is the current status of ahmia after GSoC development:


    Ahmia is open-source search engine software for Tor hidden service websites. You can test the running search engine at ahmia.fi.

    Building a search engine for anonymous web sites running inside the Tor network is an interesting problem. Tor enables web servers to hide their location and Tor users can connect to these authenticated hidden services while the server and the user both stay anonymous. However, finding web content is hard without a good search engine and therefore a search engine is needed for the Tor network.

    Web search engines are needed to navigate and search the web. There were no search engines for searching hidden service web content, so I decided to build a search engine specially for Tor. I registered ahmia.fi and started development on it as a side project in 2010.

    This development involved programming and testing web crawlers, thinking of ways to find hidden service addresses (since the protocol does not allow enumeration), learning about the Tor community, and implementing a filtering policy. Moreover, I implemented an API that empowers other Tor services that publish content to integrate with Ahmia.

    As a result, Ahmia is a working search engine that indexes, searches and catalogs content published on Tor Hidden Services. Furthermore, it is an environment to share meaningful statistics, insights and news about the Tor network itself.

    Interesting Summer of Code

    One of my best memories from the summer is the Tor Project's Summer 2014 Developers meeting that was hosted by Mozilla in Paris, France. I have always admired the people who are working on the Tor Project.

    I also loved the coding itself. Finally I had time to improve the Ahmia search engine and its many features. I did a lot of work and liked it.

    Some journalist were very interested in my work: Carola Frediani asked if I could analyze the content of hidden services. I coded a script that fetches every front page's HTML, I gathered all the keywords, headers and description texts and made a simple word cloud visualization.

    Hidden website content visualization.

    It is a simple way to glance what is published on the hidden websites.

    Carola found this data useful and used it in her presentation at www.sotn.it on June 11th.

    Technical design of ahmia

    The Ahmia web service is written using the Django web framework. As a result, the server-side language is Python. On the client-side, most of the pages are plain HTML. There are some pages that require JavaScript, but the search itself works without client-side JavaScript.

    The components of Ahmia are:

    • Django front-end site
    • PostgreSQL database for the site
    • Custom scripts to download data about hidden services
    • Django-Haystack connection to Solr database
    • Apache Solr for the crawled data
    • OnionBot crawler that gathers data to Solr database

    Technical architecture.

    See installation and developing tutorial


    The full-text search is implemented using Django-Haystack. The search is using crawled website data that is saved to Apache Solr.


    OnionDir is a list of known online hidden service addresses. A separate script gathers this list and fetches information fields from the HTML (title, keywords, description etc.). Furthermore, users can freely edit these fields.

    We've also started a convention where hidden service admins can add a file to their website, called description.json, to offer an official description of their site in Ahmia.

    As a result, this information is shown in the OnionDir page and over 80 domains are already using this method.


    We are gathering statistics from hidden services. As a result, we can represent and share meaningful data about hidden services and visualize it.

    We are gathering three types of popularity data:

    1. Tor2web nodes share their visiting statistics to Ahmia
    2. Number of public WWW backlinks to hidden services
    3. Number of clicks in the search results

    The click counter tells the total number of clicks on a search result in ahmia.fi


    We have decided to filter any sites related to child porn from our search results. Ahmia is removing everything related to these websites. These websites may not be actual child porn sites. They are rather sites where users can post content (forums, file and image uploads etc.) and as the result there have been, momentarily at least, some suspicious content that has not been moderated in a reasonable period of time. Ahmia.fi does not have the time to monitor these sites carefully and we are banning sites from our public index if we see any evidence of child abuse. Of course, the ban is removed if the site itself contacts us and we review the website to be OK.

    In practice, Ahmia calculates the MD5 sums of the banned domains for use as a filtering policy. Moreover, we are sharing this list and Tor2web nodes can use the list to filter out pages.

    At the moment, there seems to be 1228 hidden website domains online and 7 of them has been filtered because they are possibly sharing child porn content.


    OnionBot is a crawler for hidden service websites based on the Scrapy framework. It crawls the Tor network and passes data to the search database. OnionBot requires the Tor software (using Tor2web mode) and Polipo. The results are saved to Apache Solr.

    Apache Solr

    Apache Solr is a popular, open source enterprise search platform. Its major features include powerful full-text search, hit highlighting, faceted search, and near real-time indexing.

    The schema.xml file contains all of the details about which fields your documents can contain, and how those fields should be dealt with when adding documents to the index, or when querying those fields.

    Security measures for privacy

    In the software

    • We do not log any IP addresses, see Apache configuration
    • We are gathering real-time clicks, however, this data is not shown accurately

    In the host ahmia.fi

    • Backend servers are run separately and they do not have any knowledge about the end-users
    • All servers are hosted in countries with strong privacy laws. For example, Finland and the Netherlands
    • Communication between servers is encrypted
    • Only a few trustworthy people know the locations of the back-end servers and are able to access them

    Future work

    GSoC 2014 was fun and productive!

    There is a lot more to do. However, I do not have time to do everything myself. Of course, I am coding when I have time and maintaining the search engine.

    In addition, I am going to write a scientific article about the implementation.

    Is there anyone who would be interested in developing Ahmia.fi?

    Is anyone familiar with Solr and would know how to tweak it for full text search?

    Furthermore, any kind of help would be most welcome. There are always Linux admin duties, HTML/CSS design, bug fixing, Django development, etc...

    For further information, please don't hesitate to contact me by e-mail: juha.nurmi@ahmia.fi

    Syndicate content Syndicate content