This blog post is the second issue of the Cooking with Onions series, which aims to highlight interesting aspects of the onion space. Check out our first issue as well!
Onion addresses are weird...
This post is about onion addresses being weird and the approaches that can be taken to improve onion service usability.
In particular, if you've cruised around the onionspace, you must have noticed that onion services typically have random-looking addresses that look like these:
To better understand why onion addresses are so strange, it helps to remember that onion services don't use the insecure Domain Name System (DNS), which means there is no organization like ICANN to oversee a single root registry of onion addresses or to handle ownership dispute resolution of onion addresses. Instead, onion services get strong authentication from using self-authenticating addresses: the address itself is a cryptographic proof of the identity of the onion service. When a client visits an onion service, Tor verifies its identity by using the address as ground truth.
In other words, onion services have such absurd names because of all the cryptography that's used to protect them. Cryptographic material is basically a set of huge numbers that look meaningless to most humans, and that's the reason onion addresses tend to look random as well.
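To make "the address itself is a cryptographic proof" a bit more concrete, here is a minimal Python sketch of how a 16-character onion address relates to a service's key: the address is the base32 encoding of the first 80 bits of the SHA-1 hash of the encoded public key. The function names and the placeholder key bytes are ours and purely illustrative; Tor itself does considerably more than this when it verifies a service.

```python
import base64
import hashlib


def onion_address_from_key(public_key_bytes: bytes) -> str:
    """Derive a 16-character onion address from encoded public key bytes."""
    digest = hashlib.sha1(public_key_bytes).digest()
    # Keep the first 80 bits (10 bytes); base32 turns them into 16 characters.
    label = base64.b32encode(digest[:10]).decode("ascii").lower()
    return label + ".onion"


def key_matches_address(address: str, public_key_bytes: bytes) -> bool:
    """Check a presented key against the address the client asked for."""
    return onion_address_from_key(public_key_bytes) == address.lower()


# Placeholder bytes standing in for a real encoded public key (assumption).
example_key = b"not a real key, just example bytes"
example_address = onion_address_from_key(example_key)
print(example_address, key_matches_address(example_address, example_key))
```

Because the address is derived from the key, nobody can present a different key for the same address without finding a hash collision, which is why no registry is needed to vouch for ownership.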
To motivate this subject further, Tor developers have medium-term future plans for upgrading the cryptography of onion services, which has the side-effect of increasing onion address length to 54 characters! This means that in the future onion addresses will look like this:
Over the years the Tor community has come up with various ways of handling these large and non-human-memorable onion addresses. Some people memorize them entirely or scribe them into secret notebooks, others use tattoos, third-party centralized directories or just google them every time. We've heard of people using decks of cards to remember their favorite onion sites, and others who memorize them using the position of stars and the moon.
We believe that the UX problem of onion addresses is not actually solved with the above ad-hoc solutions and remains a critical usability barrier that prevents onion services from being used by a wider audience.
The onion world never had a system like DNS. Even though we are well aware that DNS is far from the perfect solution, it's clear that human memorable domain names play a fundamental role in the user experience of the Internet.
In this blog post we present you a few techniques that we have devised to improve the usability of onion addresses. All of these ideas are experimental and come with various fun open questions, so we are still in exploration mode. We appreciate any help in prototyping, analyzing and finding flaws in these ideas.
Idea 1) A modular name system API for Tor onion services
During the past years, many research groups have experimented with and designed various secure name systems (e.g. GNS, Namecoin, Blockstack). Each of these systems has its own strengths and weaknesses, as well as different user models and total user experience. We are not sure which one works best for the onion space, so ideally we'd like to try them all and let the community and the sands of time decide for us. We believe that by integrating these experimental systems into Tor, we can greatly strengthen and improve the whole scientific field by exposing name systems to the real world and an active and demanding userbase.
For this reason and based on our experience with modular anti-censorship techniques, we designed a generic & modular scheme through which any name system can be integrated into Tor: Proposal 279 defines A Name System API for Tor Onion Services which can be used to integrate any complex name system (e.g. Namecoin) or even simple silly naming schemes (e.g. a local /etc/tor-hosts file).
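As an illustration of the "simple silly" end of that spectrum, here is a rough Python sketch of a resolver backed by a local hosts-style file. The file format (one `name address` pair per line, `#` for comments) and the function names are assumptions made up for this example; the sketch does not attempt to reproduce the interface that Proposal 279 actually defines. The sample entries reuse example names from this post.

```python
from typing import Dict, Optional


def parse_hosts(text: str) -> Dict[str, str]:
    """Parse lines of the form '<name> <onion address>'; '#' starts a comment."""
    table: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2 or not parts[1].endswith(".onion"):
            continue  # skip malformed entries rather than guessing
        table[parts[0].lower()] = parts[1].lower()
    return table


def resolve(name: str, table: Dict[str, str]) -> Optional[str]:
    """Return the onion address for a name, or None if it is unknown."""
    return table.get(name.lower())


# Hypothetical file contents, inlined so the sketch is self-contained.
SAMPLE_HOSTS = """
# name               onion address
watchtower.tor       fixurqfuekpsiqaf.onion
globaleconomy.tor    froqh6bdgoda6yiz.onion
"""
table = parse_hosts(SAMPLE_HOSTS)
print(resolve("watchtower.tor", table))
```

A real plugin would also need to decide what happens on lookup failure and how names are kept from leaking to the network, which this sketch ignores.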
It's worth pointing out that proposal 279 is in draft status and we still need to incorporate feedback received on the mailing list. Furthermore, people have pointed out simple ways through which we can fast-track the proposal and prototype it sooner. Help in implementing this proposal is greatly appreciated (find us on IRC!).
Idea 2) Using browser extensions to improve usability
Other approaches for improving the usability of onion addresses use the Tor Browser as a framework: think of browser extensions that map human memorable names to onion addresses.
There are many variants here so let's walk through them:
Idea 2.1) Browser Extension + New pseudo-tld + Local onion registry
A browser extension like HTTPS-everywhere uses an onion registry to map human-memorable addresses from a new pseudo-tld (e.g. ".tor") to onion addresses. For example, it maps "watchtower.tor" to "fixurqfuekpsiqaf.onion" and "globaleconomy.tor" to "froqh6bdgoda6yiz.onion". Such an onion registry could be local (like HTTPS-everywhere) or remote (e.g. a trusted append-only database).
Even an extension with a local onion registry would be a very effective improvement to the current situation since it would be pretty usable and its security model is easy to understand: an audited local database seems to work well for HTTPS-everywhere. However, there are social issues here: how would the onion registry be operated and how should name registrations be handled? I can see people fighting for who will get bitcoin.tor first. That said, this idea can be beneficial even with a small onion database (e.g. 50 popular domains).
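The core of such an extension is a small rewriting step. Here is a hedged Python sketch of it, using the two example mappings from above as the local registry; a real extension would be written against the browser's extension APIs, and the function name here is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Local registry, seeded with the example mappings from this post.
LOCAL_REGISTRY = {
    "watchtower.tor": "fixurqfuekpsiqaf.onion",
    "globaleconomy.tor": "froqh6bdgoda6yiz.onion",
}


def rewrite_url(url: str, registry: dict = LOCAL_REGISTRY) -> str:
    """Swap a registered .tor hostname for its onion address."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    onion = registry.get(host)
    if onion is None:
        return url  # unknown names are left untouched
    # Keep an explicit port if there was one (any user:password part is dropped).
    netloc = onion if parts.port is None else f"{onion}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(rewrite_url("http://watchtower.tor/news?id=1"))
```

Leaving unknown names untouched keeps the extension's behavior predictable: it only ever acts on names that are present in the audited database.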
Here is a graphical depiction of a browser extension with a local onion registry resolving the domain sailing.tor for a user:
Idea 2.2) Browser extension + New pseudo-tld + Remote onion registries
A more dynamic alternative here involves multiple trusted remote onion registries that the user can add to their torrc. Imagine a web-of-trust based system where you add your friend Alice's onion registry and then you can visit Facebook using facebook.alice.onion.
A similar more decentralized alternative could be a browser addon that uses multiple remote onion registries/notaries to resolve a name, employing a majority or supermajority rule to decide the resolution results. Such a system could involve notary nodes similar to SSL schemes like Convergence.
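The majority rule itself is easy to sketch. In the Python below, each notary's answer is either an onion address or `None` (no answer), and a name only resolves if strictly more than a chosen fraction of all queried notaries agree. The function name, the default threshold and the way non-answers are counted are our assumptions, not part of any existing system.

```python
from collections import Counter
from fractions import Fraction
from typing import Iterable, Optional


def resolve_by_quorum(answers: Iterable[Optional[str]],
                      threshold: Fraction = Fraction(1, 2)) -> Optional[str]:
    """Return the address backed by more than `threshold` of all notaries."""
    answers = list(answers)
    votes = Counter(a for a in answers if a is not None)
    if not votes:
        return None
    address, count = votes.most_common(1)[0]
    # Notaries that gave no answer still count toward the total, so a
    # handful of responsive notaries cannot decide on behalf of everyone.
    if count > threshold * len(answers):
        return address
    return None


print(resolve_by_quorum(["a.onion", "a.onion", "b.onion"]))                  # simple majority
print(resolve_by_quorum(["a.onion", "a.onion", "b.onion"], Fraction(2, 3)))  # supermajority
```

Requiring strictly more than the threshold means two conflicting answers can never both win, as long as the threshold is at least one half.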
Idea 2.3) Browser extension redirects existing DNS names
An easier but less effective approach would be for the browser extension to only map DNS domain names to onion names. So for example, it would map "duckduckgo.com" to "3g2upl4pq6kufc4m.onion". That makes the job of the name registrar easier, but it also heavily restricts users only to services with a registered DNS domain name. Some attempts have already been made in this area but unfortunately they never really took off.
Idea 2.4) Automatic Redirection using HTTP
The Alt-Svc HTTP header defines a way for a website to say "I'm facebook.com but you should talk to me using fbcdn.com." If we replace that fbcdn.com address with facebookcorewwi.onion - then when you typed in Facebook, the browser would, under the covers, use the .onion address. And this can be done without any browser extension whatsoever.
One problem is that the browser has to remember this mapping, and in Tor Browser that mapping could be used to track or correlate you. Preloading the mapping would solve this, but how to preload the mapping probably brings us back into the realm of a browser extension.
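For reference, an Alt-Svc header value names a protocol, an alternative host and port, and optionally a maximum age in seconds. The Python sketch below formats and parses the simplest form of such a value for an onion alternative. The helper names are ours, the parser only handles a single alternative with an `ma` parameter, and whether a given browser will actually honor an onion alternative is a separate question.

```python
def alt_svc_header(onion_host: str, port: int = 443,
                   protocol: str = "h2", max_age: int = 86400) -> str:
    """Format an Alt-Svc header value advertising an onion alternative."""
    return f'{protocol}="{onion_host}:{port}"; ma={max_age}'


def parse_single_alt_svc(value: str) -> dict:
    """Parse a value holding exactly one alternative (a deliberately naive parser)."""
    alternative, *params = [piece.strip() for piece in value.split(";")]
    protocol, _, authority = alternative.partition("=")
    host, _, port = authority.strip('"').rpartition(":")
    result = {"protocol": protocol, "host": host, "port": int(port)}
    for param in params:
        key, _, val = param.partition("=")
        if key == "ma":
            result["max_age"] = int(val)
    return result


header_value = alt_svc_header("facebookcorewwi.onion")
print("Alt-Svc:", header_value)
print(parse_single_alt_svc(header_value))
```

The `ma` value is what makes the browser remember the mapping, which is exactly the state that raises the tracking concern described above.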
Idea 2.5) Smart browser bookmarks for onion addresses
Talking about random addresses, it's funny how people seem to be pretty happy handling phone numbers (big meaningless random numbers) using a phone book and contacts on their devices.
On the same note, an easier but less usable approach would be to enhance Tor Browser with some sort of smart bookmark/petname system which allows users to register custom names for onion sites, and allows them to trust them or share them with friends. Unfortunately, it's unclear whether the user experience of this feature would make it useful to anyone but power users.
Of course it's important to realize that any approach that relies on a browser extension will only work for the web, and you wouldn't be able to use it for arbitrary TCP services (e.g. visiting an IRC server).
Idea 3) Embed onion addresses in SSL certificates
So let's shift back to non-browser approaches!
Let's Encrypt is an innovative project which issues free SSL certificates in an automated fashion. It has greatly improved Internet security since now anyone can freely acquire an SSL certificate for their service and provide link security to their users.
Now let's imagine that Let's Encrypt embedded onion address information into the certificates it issues, for clients with both a normal service and an onion service. For example, the onion address could be embedded into a custom certificate extension or in the C/ST/L/O fields. Then Tor Browser, when visiting such an SSL-enabled website, would parse and validate the certificate and if an onion address is included, the browser would automagically redirect the user. Take a look at this paper for some more neat ideas on this area.
Idea 4) Embed onion addresses in DNS/DNSSEC records
A similar approach could use the DNS system instead of the SSL CA system. For example, site owners could add their onion address into their TXT or SRV DNS records and Tor could learn to redirect users to the onion address. Of course this approach only applies to operators that can afford a DNS domain. Oh yeah DNS also has zero security...
As you can see there are many approaches that we should explore to improve usability in this area. Each of them comes with its own tradeoffs and applies to different users, so it's important that we allow users to experiment with various systems and let each community decide which approach works best for them.
It's also worth pointing out that some of these approaches are not that hard to implement technically, but they still require lots of effort and community building to really take off and become effective. Involving and pairing with other friendly Internet privacy organizations is essential to achieve our goals.
Furthermore, we should think carefully of unintended usability and security consequences that come with using these systems. For example, people are not used to their browser automagically redirecting them from one domain to another: this can seriously freak people out. It's also not clear how Tor Browser should handle these special names to avoid SSL certificate verification issues and hostname leaks.
One thing is for sure: even though onion services are used daily by thousands of people, the random addresses confuse casual users and prevent the ecosystem from maturing and achieving widespread adoption. We hope that this blog post inspires researchers and developers to toy around with naming systems and take the initiative in building and experimenting with the various approaches. Please join the [tor-dev] mailing list and share your thoughts and projects with us!
And this brings us to the end of this post. Hope you enjoyed this issue of Cooking With Onions! We will be back soon, always with the finest produce and the greatest cooking tips! What would you like us to cook next?
[Thanks to Philipp Winter and Tom Ritter for the feedback on this blog post, as well as to everyone who has discussed and helped develop these ideas.]
During the month of December, we're highlighting other organizations and projects that rely on Tor, build on Tor, or are accomplishing their missions better because Tor exists. Check out our blog each day to learn about our fellow travelers. And please support the Tor Project! We're at the heart of Internet freedom. Donate today!
The topic for today is electronic money. The blockchain is pretty hot right now! Bitcoin, dogecoin, ethereum, zcash, you name it... Cryptocurrencies have grown from e-toys to globally recognized systems by offering free and borderless trade, no bank fees and improved privacy.
You are reading the Tor blog, so let's focus on the privacy and anonymity part. Could cryptocurrencies claim that they provide privacy if Tor was not around to give strong transport-layer anonymity?
To visualize this, let's go through just a few ways Tor is used around the cryptocurrency ecosystem. We will mainly focus on Bitcoin, but the same applies to most blockchain-based cryptocurrencies:
Tor provides privacy to cryptocurrency transactions!
Let's imagine that Alice wants to buy a ticket for Torconf, the best (fictional) conference on computer anonymity. She wants to buy the ticket with Bitcoin so that she does not reveal her interests to her bank or her identity to the conference organizers. To buy the ticket with Bitcoin, she needs to perform a Bitcoin transaction.
Bitcoin transactions work by Alice broadcasting her transaction to a few Bitcoin supernodes. Those nodes then propagate the transaction further to the rest of the Bitcoin network until it becomes recognized. If Alice did not use Tor to conduct her transaction, those initial supernodes trivially learn the IP address of Alice. Furthermore, since the Bitcoin blockchain is a public log of transactions, analysts could match her newest transaction with her previous transactions and just follow the money trail. These are just some of the many well known privacy risks of Bitcoin, and companies have been collecting and selling social graph analytics of the Bitcoin blockchain for years now...
Given the above threats, it should be no surprise that most Bitcoin clients give the option to their users to perform transactions over the Tor network. By routing traffic over Tor, no one learns the origin IP address of Alice when she buys her Torconf ticket.
Furthermore, even the hottest and newest cryptocurrencies (like Zcash) that provide transaction anonymity as a fundamental security property still benefit from Tor's transport-layer anonymity to actually anonymize the networking part of the Zcash transaction.
We feel that Tor has tremendously helped the cryptocurrency community to grow just by providing transport-layer anonymity to transactions! Also, please remember that maintaining anonymity is not an easy task, so always be up-to-date on the latest security news depending on your threat model.
Tor secures cryptocurrency networks!
Apart from users performing anonymous Bitcoin transactions, the Bitcoin network itself uses Tor to increase its defenses. Since last year, the Bitcoin core project has integrated Tor onion services into its core network daemon. If Tor is installed on the system, Bitcoin will automatically create an onion service and act as a Bitcoin node over Tor to avoid leaking the real IP address of the node. This provides greater network resilience and protection against targeted attacks on Bitcoin nodes. You can see that there are hundreds of Tor bitcoin nodes. Zcash and other cryptocurrencies have followed the same path.
Furthermore, many mining pools advertise onion service support for their miners. Bitcoin infrastructure has been a target of hackers for a while, and virgin blocks are more and more valuable, so having anonymity as a miner is a desirable security property.
Tor protects the wider cryptocurrency ecosystem!
If you take a look around the Bitcoin world, you will notice that Tor support is advertised by all sorts of websites and services! Most bitcoin-related websites have onion sites that people can visit over Tor: for example, blockchain.info has been running a popular Tor onion service for its users. Most Bitcoin tumbler services also work over Tor onion services. Same goes for websites and forums offering help with Bitcoin. This is obviously done because the Bitcoin community has a great appreciation and need for privacy.
Tor is proud to have helped the cryptocurrency community grow over the years. We believe that electronic currencies can be a powerful tool for social change, but also a great scientific research area with results that can benefit other areas, like secure electronic voting, consensus algorithms, append-only data structures and secure name systems.
Have a good day :)
The Internet was made for humans to communicate with each other! Even though Internet calls over video and audio are totally possible nowadays, people still enjoy sending texts to each other due to their asynchronous, permanent and casual nature. To understand how important these instant messaging systems are, just check the user growth of systems like WeChat, WhatsApp, etc.
Unfortunately, all these major mainstream messaging systems belong to huge companies whose money comes from advertising and selling the data and metadata of their users.
The good news here is that in the past couple of years, there has been great progress in protecting users' data by employing end-to-end encryption using the Signal protocol. The bad news is that there has still been absolutely no progress in protecting the metadata and location information of users by these mainstream platforms.
Case in point: since most instant messaging systems are not anonymous, they get to learn the full location history of their users through the users' IP address history. Also, all major chat systems require a social media account or a phone number, which is simply impossible for some people and makes it hard for anyone to create anonymous or burner accounts. It also makes you searchable and targetable by people who happen to know your phone number.
In this blog post, we showcase a few open-source text messaging tools that provide location privacy and additional security to their users by using Tor as a default. All of them are free and open source, so feel free to experiment!
Ricochet is an anonymous instant messaging tool that hides metadata by using Tor. It's got a slick UI and works on Windows, Linux and Mac OS X.
In the Ricochet protocol, each user is a Tor onion service. By utilizing onion services, the protocol achieves strong anonymity for its users. And because of its decentralized nature, it's impossible for attackers to censor it by taking down a single server.
Ricochet is designed with UX in mind, so it's easily usable even by people who don't understand how Tor works.
If you happen to only use mobile platforms (like most of the world these days), Chatsecure is an app that you should check out! It works for both Android and iOS, and it allows you to connect to XMPP servers to communicate over encrypted OTR chat. This means that you can also use it to connect to other XMPP-enabled messaging systems like Facebook chat and Google Talk.
It's developed by the Guardian Project, and it's a part of their software suite for private communications that includes Orbot and Orfox. Stay tuned on our blog for more information about this software family later this December!
And now for further excitement, let's get into the more experimental sections of the secure messaging space!
Pond is an anonymous instant messaging tool with various sophisticated security properties that is capable of hiding even the metadata of its users.
The protocol is designed in such a way that even a nasty attacker who is constantly monitoring your Internet connection will have a very hard time figuring out when you actually send and receive Pond messages, even if she conducts statistical analysis of your traffic patterns. Smoke and mirrors you might say, but if you like protocols, we invite you to check out the Pond protocol specs.
Unfortunately, Pond is a side-project, and due to lack of free time, it is not currently under active development, even though there is still a community of users. It only works on Linux, and it has a graphical interface.
Briar is an experimental P2P messaging system that is currently in private beta. It targets mobile users and is closely integrated with Tor onion services.
The Briar protocol is fully decentralized, and all communication is end-to-end encrypted. It aims to be highly resilient against network failures, and so it can also function over Bluetooth or WiFi. Furthermore, it attempts to hide the social graph of its users by keeping the user contact list on the client side.
As you can see, there have been multiple efforts for private and metadata-hiding communication over the past years. Some of these projects are supposed to be used on top of already existing chat frameworks, whereas others aim to create their own ecosystems.
Of course, the research realm of secure messaging is far from complete; it's just getting started. From improving the UX to adding new security properties, this field needs further thinking all around.
For example, secure multiparty messaging is a very important upcoming field that studies how the protocols above that are designed for 1-to-1 communication can scale to hundreds of clients talking at the same time while maintaining their security properties.
Furthermore, as global surveillance is growing, we better understand the importance of hiding metadata from network attackers. Only now are we starting to grasp the importance of security properties like obfuscating communication patterns, hiding the users' social graph and letting users choose when to reveal their online presence.
Tor is extremely interested in the instant messaging space, and we are always on the lookout for innovative developments and interesting messaging projects. We have deep gratitude to all of the people who have helped to push the field of secure messaging forward, and we hope to enable them in the future to provide anonymous communication tools!
Donate and we will make it happen! :)
The Ahmia project
Onion services are used by thousands of people every day, yet they remain as elusive as ever. There is no central repository of onion sites, and there are no great ways to find the content you are looking for. We feel that this "foggy situation" severely impacts the user experience of onion services and hence also impedes their deployment and acceptance by the general public. It's easy to dismiss the onionspace as smelly if you only read media articles about the onion sites that stink the most.
How is one supposed to navigate in the onionspace if there is no map?
On the "normal Internet," people are used to using search engines to find the content they are looking for: blogs, shops, educational resources, cat pictures. Search engines act as streetlights on the dark alleyways of the Internet, allowing people to navigate and visit the places they want.
However in the onionspace, search engines are not well established, and finding the right content is much harder. For years people have resorted to various DIY solutions for listing and finding onion addresses, but none of those solutions is particularly pleasant or complete.
Imagine Alice wants to start a blog about her cats on the onionspace. There is no good place for Alice to list her onion address so that other people can find it. Without a good search engine, it's hard for other cat fans to find her website and start building a community.
How is one supposed to catch 'em all if we don't know how many there are?
Hence, there is no better time to introduce Ahmia! Ahmia is a search engine for onion sites. The Ahmia project has been around for years, and it's been collecting public onion addresses and indexing them so that users can search for the content they are looking for.
Ahmia's indexing technology is improving, and the quality of the search results has gotten much better over the past year. Ahmia also provides an easy way for onion service operators to register their own onion sites with the search engine. Ahmia's onion site is here.
Juha Nurmi, the lead Ahmia developer, is still actively involved with the project. However, writing a low-budget search engine is not an easy job! Crawling the Internet requires heavy infrastructure and is technically complicated. Discovering onion links means searching in the deepest corners of both the normal Internet and the onionspace. Ahmia is always looking for more volunteers and sources of funding! Two years ago, Tor supported Ahmia by working together in Google Summer of Code 2014.
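To give a flavor of the "discovering onion links" part, here is a small Python sketch that pulls 16-character onion addresses out of a blob of text, such as a crawled page. The regular expression and function name are our own illustration and say nothing about how Ahmia's crawler is actually implemented.

```python
import re
from typing import List

# 16 base32 characters (a-z, 2-7) followed by ".onion".
ONION_RE = re.compile(r"\b[a-z2-7]{16}\.onion\b")


def find_onion_addresses(text: str) -> List[str]:
    """Return the distinct onion addresses in `text`, in order of appearance."""
    found: List[str] = []
    for address in ONION_RE.findall(text.lower()):
        if address not in found:
            found.append(address)
    return found


page = "Search at http://3g2upl4pq6kufc4m.onion/ (mirror: 3G2UPL4PQ6KUFC4M.onion), not example.onion"
print(find_onion_addresses(page))
```

Finding candidate addresses is the easy part; checking that they are reachable and indexing their content is where the heavy infrastructure comes in.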
How is one supposed to walk around if the fog machine is on?
Finally and closing with a healthy dose of paranoia, we need to remember that centralized search engines might be a temporary solution for now, but they are never the end goal. Centralized services should be avoided in high-security systems like anonymity networks, and we should always strive to build decentralized systems and to research alternative ways to make anonymity systems more usable. There is lots of work to be done.
Donate and get involved!
Thank you for reading and enjoy Monday!
This blog post is the first part of the Cooking with Onions series which aims to highlight various interesting developments on the .onion space. This particular post presents a technique for efficiently scaling busy onion services.
The need for scaling
Onion services have been around for a while. During the past few years, they have been deployed by many serious websites like major media organizations (like the Washington Post), search engines (such as DuckDuckGo) and critical Internet infrastructure (e.g. PGP keyservers). This has been a great opportunity for us, the development team, since our code has been hardened and tested by the sheer volume of clients that use it every day.
This recent widespread usage also gave us greater insights on the various scalability issues that onion service operators face when they try to take their service to the next level. More users means more load to the onion service, and there is only so much that a single machine can handle. The scalability of the onion service protocol has been a topic of interest to us for a while, and recently we've made advancements in this area by releasing a tool called OnionBalance.
So what is OnionBalance?
OnionBalance is software designed and written by Donncha O'Cearbhaill as part of Tor's Summer of Privacy 2015. It allows onion service operators to achieve the property of high availability by allowing multiple machines to handle requests for a single onion service. You can think of it as the onion service equivalent of load balancing using round-robin DNS.
OnionBalance has recently started seeing more and more usage by onion service operators! For example, the Debian project recently started providing onion services for its entire infrastructure, and the whole project is kept in line by OnionBalance.
How OnionBalance works
Consider Alice, an onion operator, who wants to load balance her overloaded onion service using OnionBalance.
She starts by setting up multiple identical instances of that onion service in multiple machines, makes a list of their onion addresses, and passes the list to OnionBalance. OnionBalance then fetches their descriptors, extracts their introduction points, and publishes a "super-descriptor" containing all their introduction points. Alice now passes to her users the onion address that corresponds to the "super-descriptor". Multiple OnionBalance instances can be run with the same configuration to provide redundancy when publishing the super descriptor.
When Bob, a client, wants to visit Alice's onion service, his Tor client will pick a random introduction point out of the super-descriptor and use it to connect to the onion service. That introduction point can correspond to any of the onion service instances, and this way the client load gets spread out.
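The merge-and-pick flow described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: introduction points are plain strings, the instance names are made up, the cap of 10 entries per descriptor is our assumption, and the round-robin merge is just one reasonable choice rather than OnionBalance's actual selection logic.

```python
import random
from typing import Dict, List


def build_super_descriptor(instance_intro_points: Dict[str, List[str]],
                           max_intro_points: int = 10) -> List[str]:
    """Merge the introduction points of all backend instances into one list."""
    merged: List[str] = []
    queues = [list(points) for points in instance_intro_points.values()]
    # Take one introduction point per instance in turn, so that every
    # instance is represented even if the descriptor fills up.
    while any(queues) and len(merged) < max_intro_points:
        for queue in queues:
            if queue and len(merged) < max_intro_points:
                merged.append(queue.pop(0))
    return merged


def pick_intro_point(super_descriptor: List[str], rng: random.Random) -> str:
    """What Bob's client does: pick one introduction point at random."""
    return rng.choice(super_descriptor)


# Hypothetical backend instances and their introduction points.
instances = {
    "instance-one": ["ip-a", "ip-b"],
    "instance-two": ["ip-c", "ip-d"],
}
super_descriptor = build_super_descriptor(instances)
print(super_descriptor)
print(pick_intro_point(super_descriptor, random.Random()))
```

Since each introduction point leads to exactly one backend instance, a uniformly random choice by clients spreads connections across the instances roughly in proportion to how many introduction points each contributed.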
With OnionBalance, the "super-descriptor" can be published from a different machine to the one serving the onion service content. Your onion service private key can be kept in a more isolated location, reducing the risk of key compromise.
For information on how to set up OnionBalance, please see the following article:
OnionBalance is a handy tool that allows operators to spread the load of their onion service to multiple machines. It's easy to set up and configure and more people should give it a try.
In the meantime, we'll keep ourselves busy coming up with other ways to scale onion services in this brave new world of onions that is coming!
Take care until the next episode :)
At the beginning of July, a few of us gathered in Washington DC for the first hidden service hackfest. Our crew consisted of core Tor developers and researchers who were in the area, mostly attendees of PETS. The aim was to push hidden service development forward and swiftly arrive at decisions that were too tiresome and complex to make over e-mail.
Since we were mostly technical folks, we composed technical proposals and prioritized development, and spent less time with organizational or funding tasks. Here is a snapshot of the work that we did during those 5 days:
- The first day, we discussed current open topics on hidden services and tasks we should be doing in the short-to-medium-term future.
Our list of tasks included marketing and fundraising ones like "Re-branding hidden services" and "Launch crowdfunding campaign", but we spent most of the first day discussing Proposal 224 aka the "Next Generation Hidden Services" project.
- Proposal 224 is our master plan for improving hidden services in fundamental ways: The new system will be faster, use better cryptography, have more secure onion addresses, and offer advanced security properties like improved DoS resistance and keeping identity keys offline. It's heavy engineering work, and we are still fine-tuning the design, so implementation has not started yet.
While discussing how we would implement the system, we decided that we would need to write most of the code for this new protocol from scratch, instead of hooking into the old and rusty hidden service code. To move this forward, we spent part of the following days splitting the proposal into individual modules and figuring out how to refactor the current data structures so that the new protocol can coexist with the old protocol.
- One open design discussion on proposal 224 has been an earlier suggestion of merging the roles of "hidden service directory" and "introduction point" on the hidden service protocol. This change would improve the security and performance as well as simplify the relevant code, and reduce load on the network. Because it changes the protocol a bit, it would be good to have it specified precisely. For this reason, we spent the second and third days writing a proposal that defines how this change works.
- Another core part of proposal 224 is the protocol for global randomness calculation. That's a system where the Tor network itself generates a fresh, unpredictable random value every day; basically like the NIST Randomness Beacon but decentralized.
Proposal 225 specifies a way that this can be achieved, but there are still various engineering details that need to be ironed out. We spent some time discussing the various ways we can implement the system and the engineering decisions we should take, and produced a draft Tor proposal that specifies the system.
- We also discussed guard discovery attacks, and the various defenses that we could deploy. The fact that many core Tor people were present helped us rapidly decide which parameters and trade-offs we should pick. We sketched a proposal and posted it to the [tor-dev] mailing list and it has already received very helpful feedback.
- We also took our old design for "Direct Onion Services" and revised it into a faster and far more elegant protocol. These types of services trade service-side location privacy for improved performance, reliability, and scalability. They will allow sites like reddit to offer their services faster on hidden services while respecting their clients' anonymity. During the last days of the hackfest, we wrote a draft proposal for this new design.
- We did more development on OnioNS, the Onion Name System, which allows a hidden service operator to register a human memorable name (e.g. example.tor) that can be used instead of the regular onion address. In the last days of the hackfest we prepared a proof-of-concept demo wherein a domain name was registered and then the Tor Browser successfully loaded a hidden service under that name. That was a significant step for the project.
- We also discussed hidden service statistics and how the two statistics we implemented a few months ago have been very useful. To improve their reliability (since currently only about 3% of the network reports them), we decided to enable them by default in the future.
We also discussed systems for collecting additional statistics in a privacy-preserving manner, using Secure Multiparty Computation or other similar techniques.
- We talked about rebranding the "Hidden Services" project to "Onion Services" to reduce "hidden"/"dark"/"evil" name connotations, and improve terminology. In fact, we've been on this for a while, but we are still not sure what the right name is. What do you think?
- To improve user education, we explored various concepts for a graphical animation explaining hidden services similar in concept to the Tor animation from a few months ago.
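To make the global randomness item in the list above more concrete, here is a minimal commit-then-reveal sketch in Python. Commit-then-reveal is one well-known way for several parties to agree on a value that none of them can choose alone. This is a generic illustration, not the wire format or exact algorithm of any Tor proposal, and the participant names and helper functions are hypothetical.

```python
import hashlib
import secrets
from typing import Iterable


def commit(reveal: bytes) -> bytes:
    """Commitment published in the first phase: a hash of the secret value."""
    return hashlib.sha256(reveal).digest()


def verify(commitment: bytes, reveal: bytes) -> bool:
    """Check that a revealed value matches the commitment made earlier."""
    return secrets.compare_digest(commitment, commit(reveal))


def shared_random(reveals: Iterable[bytes]) -> bytes:
    """Hash all verified reveals together into one shared value.

    Sorting makes the result independent of the order in which the
    participants are listed. All reveals here have the same length,
    so plain concatenation is unambiguous.
    """
    h = hashlib.sha256()
    for r in sorted(reveals):
        h.update(r)
    return h.digest()


# Phase 1: each (hypothetical) participant picks a secret and publishes its hash.
secrets_by_node = {name: secrets.token_bytes(32) for name in ("a", "b", "c")}
commitments = {name: commit(s) for name, s in secrets_by_node.items()}

# Phase 2: secrets are revealed and checked against the earlier commitments.
valid = [s for name, s in secrets_by_node.items() if verify(commitments[name], s)]
value = shared_random(valid)
print(value.hex())
```

Because every participant commits before anyone reveals, no one can pick their contribution after seeing the others. A real deployment also has to handle participants who refuse to reveal, which this sketch ignores.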
And that's only part of what we did. We also wrote code for various tickets, reviewed even more code and really learned how to use Ricochet.
All in all, we managed to fit more things than we hoped into those few days and we hope to do even more focused hackfests in the near future. Email us if you are interested in hosting a hackfest!
If you'd like to get involved with hidden service development, you can contact the hackfest team. Our nicks on IRC OFTC are armadev, asn, dgoulet, kernelcorn, mrphs, ohmygodel, robgjansen, saint, special, sysrqb, and syverson.
Until next time!
Hidden Services have received a lot of attention in 2015, and Tor is at the center of this conversation. Hidden Services are a Tor technology that allows users to connect to services (blogs, chats, and many other things) with neither the user nor the site giving up identifying information.
In fact, anything you can build on the internet, you can build on hidden services. But they're better--they give users things that normal networking doesn't: authentication and confidentiality are built in; anonymity is built in. An internet based on hidden services would be an internet with Tor built in--a feature that users could take for granted. Think of what this might mean to millions of users in countries like China, Iran, or the UK. Yet currently, only about 4% of Tor's traffic comes from hidden services.
So we at Tor have been considering how we might meet the challenge of making them more widely available. In this post, we will briefly discuss the role of hidden services before we explore the idea of using crowdfunding to pay for bold, long-term tech initiatives that will begin to fulfill the promise of this technology.
Hidden Services are a critical part of Tor's ecosystem
Hidden Services provide a means for Tor users to create sites and services that are accessible exclusively within the Tor network, with privacy and security features that make them useful and appealing for a wide variety of applications.
For example, hidden services are currently used by activists and journalists to publish blogs--in anonymity and free from retaliation. They are used by NGOs to securely receive information on government corruption and injustice from concerned citizens. Newspapers such as the Washington Post, and human rights groups such as Amnesty International, use them to receive leaked information. They are used by people looking for the latest cat facts, by companies that want to secure the path of their clients, and by people chatting securely and anonymously -- including at-risk journalists talking to sources.
In addition, developers use hidden services as a building block to incorporate Tor's security and anonymity features into totally separate products. The potential of hidden services is huge, and much of it is yet to be explored.
Next Steps for Hidden Services
We want to make this technology available to the wider public as these services will play a key role in the future of secure communications. This means that we must increase the uses for hidden services, bring them to mobile platforms for anonymous mobile apps, and vastly increase the number of people who use them.
Since our goal is wider use, it is imperative that we build them to be more secure, easier to set up, better performing, and more usable. Clearly, the questions that we answer in early deployment efforts will inform how we answer the deeper questions pertaining to massive worldwide deployment.
We must engage a large number of people to bring hidden services to the next level. Until now, hidden services development largely relied on the volunteer work of developers in their spare time. This will not be sufficient if we are to make the leap to transformative hidden services.
We are currently evaluating funding strategies that will support our Hidden Service initiatives in the short, intermediate, and long term. To meet the requirements of the more conservative large funders, so that we can fully fund the Next Generation Hidden Services, we must first put preliminary pieces in place. For that, we will turn to crowdfunding. To do this right, we need your feedback.
Crowdfunding allows us to engage the broader community in grasping the opportunity that this new technology promises. We are confident that we can deliver significant advancements in the hidden services field in the short-term, and that many small donors who understand their context will be eager to contribute. We intend to begin by prioritizing the improvement of the security, usability, and performance of the current hidden services system.
Further, we want to make sure we support the efforts of community projects and that the community is participating in shaping the evolution of hidden services. For example, it would be important to assist and improve the Tor integration of projects such as SecureDrop, Pond, Ahmia and Ricochet. We are in the unique position to be able to shape the Tor protocol to make these projects easier to use and better performing, and we would like to identify ways to promote broader deployment of these projects.
Identifying, prioritizing and meeting future challenges will require engagement throughout the greater community. For instance, as changes and enhancements are introduced, we hope to speak with the best bug hunters, cryptographers and privacy experts and ask them to audit our code and designs. Non-technical users could help us evaluate the usability of our improvements.
For this crowdfunding campaign we have identified a few possible ideas--but the point of this post is to ask you for yours. Here are three projects that we have come up with so far:
- An application that Hidden Service operators could use to learn more about the activity of their Hidden Service. The operator would have access to information on user activity, security information, etc., and would receive important system-generated updates, including log messages.
- A way to set up public hidden services with improved performance but reduced server-side anonymity. Basically, hidden services that don't care about their own anonymity but still want to protect their clients with Tor's cryptography and anonymity will be able to run faster, since they don't need to protect their own location. This is an optional feature that suits the needs of large sites like Facebook and reddit, and will make their hidden services faster while also reducing the traffic they cause to the network. Also, by optimizing for performance in this specialized feature, we can optimize for security even more in the default hidden services configuration.
- Tor has been at the center of hidden services from the beginning. We have big lists of changes we need to make to the Tor protocol to increase the security of hidden services against cryptanalysis, DoS and deanonymization attacks. We also want to improve guard security, allow operators to store their cryptographic keys offline and enable scaling of hidden services to new levels. This is a big project but we hope to start crunching through it as part of this crowdfunding campaign.
Your Idea for Hidden Services?
Long story short, we are looking for feedback!
Also, we are curious about which crowdfunding platforms you prefer and why.
In the following weeks, we will update you on our progress, incorporating feedback we receive from the community. We hope to make this process as transparent and public as possible!
EDIT: The "Unhidden Services" paragraph was expanded and changed to "Fast-but-not-hidden Services". The previous name was too scary and the description not sufficient to show the potential of the project. Please send us better names for this feature!
We are starting a project to study and quantify hidden services traffic. As part of this project, we are collecting data from just a few volunteer relays which only allow us to see a small portion of hidden service activity (between 2% and 5%). Extrapolating from such a small sample is difficult, and our data are preliminary.
We've been working on methods to improve our calculations, but with our current methodology, we estimate that about 30,000 hidden services announce themselves to the Tor network every day, using about 5 terabytes of data daily. We also found that hidden service traffic is about 3.4% of total Tor traffic, which means that, at least according to our early calculations, 96.6% of Tor traffic is *not* hidden services. We invite people to join us in working to research methodologies and develop systems for better understanding Tor hidden services.
Over the past months we've been working on hidden service statistics. Our goal has been to answer the following questions:
- "Approximately how many hidden services are there?"
- "Approximately how much traffic of the Tor network is going to hidden services?"
We chose the above two questions because even though we want to understand hidden services, we really don't want to harm the privacy of Tor users. From a privacy perspective, the above two questions are relatively easy questions to answer because we don't need data from clients or the hidden services themselves; we just need data from hidden service directories and rendezvous points. Furthermore, the measurements reported by each relay cannot be linked back to specific hidden services or their clients.
Our first move was to research various ways we could collect these statistics in a privacy-preserving manner. After days of discussions on obfuscating statistics, we began writing a Tor proposal with our design, as well as code that implements the proposal. The code has since been reviewed and merged to Tor! The statistics are currently disabled by default so we asked volunteer relay operators to explicitly turn them on. Currently there are about 70 relays publishing measurements to us every 24 hours:
So as of now we've been receiving these measurements for over a month, and we have thought a lot about how to best use the reported measurements to derive interesting results. We finally have some preliminary results we would like to share with you:
How many hidden services are there?
All in all, it seems that every day about 30,000 hidden services announce themselves to the hidden service directories. Graphically:
By counting the number of unique hidden service addresses seen by HSDirs, we can get the approximate number of hidden services. Keep in mind that we can only see between 2% and 5% of the total HSDir space, so the extrapolation is, naturally, messy.
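To see why such low coverage makes the estimate messy, here is a deliberately naive scaling sketch in Python. It assumes the reporting relays see a share of services exactly proportional to their share of the HSDir space, and it ignores both the obfuscation applied to reports and the fact that each service uploads its descriptor to several directories. The observed count is a made-up number, and the technical report describes the method we actually used.

```python
def extrapolate(observed: float, coverage: float) -> float:
    """Scale a count seen by a fraction of the network up to the whole network."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    return observed / coverage


# Hypothetical: number of services attributable to the reporting relays.
observed_share = 1_000

# The same observation under the two ends of the 2%-5% coverage range.
estimate_at_5_percent = extrapolate(observed_share, 0.05)  # about 20,000
estimate_at_2_percent = extrapolate(observed_share, 0.02)  # about 50,000

print(round(estimate_at_5_percent), round(estimate_at_2_percent))
```

The same observation yields estimates that differ by a factor of 2.5 depending only on where the true coverage falls in that range.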
How much traffic do hidden services cause?
Our preliminary results show that hidden services cause somewhere between 400 and 600 Mbit of traffic per second, or equivalently about 4.9 terabytes a day. Here is a graph:
We learned this by getting rendezvous points to publish the total number of cells transferred over rendezvous circuits, which allows us to learn the approximate volume of hidden service traffic. Notice that our coverage here is not very good either, with a probability of about 5% that a hidden service circuit will use a relay that reports these statistics as a rendezvous point.
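For readers who want to check the unit conversion between the per-second rate and the daily volume, here is a small Python sketch. The function and constant names are illustrative. It computes the daily volume for both ends of the range and for the 500 Mbit/s midpoint, in decimal terabytes (10^12 bytes) and in binary tebibytes (2^40 bytes). Which unit the daily figure above was computed in is not stated, so treating it as binary is only a guess.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TB = 10**12   # decimal terabyte
TIB = 2**40   # binary tebibyte


def mbit_per_s_to_bytes_per_day(mbit_per_s: float) -> float:
    """Convert a rate in megabits per second (10^6 bits) to bytes per day."""
    bytes_per_s = mbit_per_s * 1e6 / 8
    return bytes_per_s * SECONDS_PER_DAY


low_tb = mbit_per_s_to_bytes_per_day(400) / TB    # 4.32 TB/day
high_tb = mbit_per_s_to_bytes_per_day(600) / TB   # 6.48 TB/day

midpoint_bytes = mbit_per_s_to_bytes_per_day(500)
midpoint_tb = midpoint_bytes / TB     # 5.4 TB/day
midpoint_tib = midpoint_bytes / TIB   # about 4.91 TiB/day

print(f"{low_tb:.2f} {high_tb:.2f} {midpoint_tb:.2f} {midpoint_tib:.2f}")
```

Under the binary-unit reading, the midpoint of the range lands close to the 4.9 figure quoted above.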
A related statistic here is "How much of the Tor network is actually hidden service usage?". There are two different ways to answer this question, depending on whether we want to understand what clients are doing or what the network is doing. The fraction of hidden-service traffic at Tor clients differs from the fraction at Tor relays because connections to hidden services use 6-hop circuits while connections to the regular Internet use 3-hop circuits. As a result, the fraction of hidden-service traffic entering or leaving Tor is about half of the fraction of hidden-service traffic inside of Tor. Our conclusion is that about 3.4% of client traffic is hidden-service traffic, and 6.1% of traffic seen at a relay is hidden-service traffic.
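The "about half" relationship can be checked with an idealized model, sketched here in Python. It assumes every hidden-service byte crosses exactly six relays and every other byte crosses exactly three, and nothing else. The function name and parameters are illustrative. The 6.1% figure above comes from the measured data and the method in the technical report, so it does not have to match this simplified calculation exactly.

```python
def relay_fraction(client_fraction: float, hs_hops: int = 6, other_hops: int = 3) -> float:
    """Share of relay-side traffic that is hidden-service traffic.

    Each byte is counted once per relay it crosses, so hidden-service
    bytes are weighted by hs_hops and all other bytes by other_hops.
    """
    hs = client_fraction * hs_hops
    other = (1 - client_fraction) * other_hops
    return hs / (hs + other)


# Client-side fraction from the post; the result is about 0.066.
idealized = relay_fraction(0.034)
print(f"{idealized:.3f}")
```

For small fractions the result is close to twice the client-side value, which is the rule of thumb stated above.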
Conclusion and future work
In this blog post we presented some preliminary results that could be extracted from these new hidden service statistics. We hope that this data can help us better gauge the future development and maturity of the onion space as well as detect potential incidents and bugs on the network. To better present our results and methods, we wrote a short technical report that outlines the exact process we followed. We invite you to read it if you are curious about the methodology or the results.
Finally, this project is only a few months old, and there are various plans for the future. For example:
- There are more interesting questions that we could examine in this area. For example: "How many people are using hidden services every day?" and "How many times does someone try to visit a hidden service that does not exist anymore?"
Unfortunately, some of these questions are not easy to answer with the current statistics reporting infrastructure, mainly because collecting them in this way could reveal information about specific hidden services but also because the results of the current system contain too much obfuscating data (each reporting relay randomizes its numbers a little bit before publishing them, so we can learn about totals but not about specific events).
For this reason, we've been analyzing various statistics aggregation protocols that could be used in place of the current system, allowing us to safely collect other kinds of statistics.
- We need to incorporate these statistics in our Metrics portal so that they are updated regularly and so that everyone can follow them.
- Currently, these hidden service statistics are not collected in relays by default. Unfortunately, that gives us very small coverage of the network, which in turn makes our extrapolations very noisy. The main reason that these statistics are disabled by default is that similar statistics are also disabled (e.g. CellStatistics). Also, this allows us more time to consider privacy consequences. As we analyze more of these statistics and think more about statistics privacy, we should decide whether to turn these statistics on by default.
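The point above about each relay randomizing its numbers, so that totals stay useful while individual reports reveal little, can be illustrated with a short Python sketch. It adds Laplace noise to every relay's count before summing. The noise distribution, the scale, the per-relay counts, and the function names are all assumptions made for illustration, not the parameters of the deployed statistics code.

```python
import random


def laplace_noise(rng: random.Random, scale: float) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def obfuscate(count: int, rng: random.Random, scale: float) -> float:
    """Return the count a relay would publish: the true value plus noise."""
    return count + laplace_noise(rng, scale)


rng = random.Random(2015)  # fixed seed so the example is repeatable

# Hypothetical: 70 reporting relays, each with a true count of 1,000.
true_counts = [1_000] * 70
noisy_reports = [obfuscate(c, rng, scale=50.0) for c in true_counts]

true_total = sum(true_counts)
noisy_total = sum(noisy_reports)
relative_error = abs(noisy_total - true_total) / true_total

print(f"total off by {relative_error:.2%}")
```

Each individual report can be off by tens of units, but the errors largely cancel in the sum. This is also why such a scheme is a poor fit for counting rare events: a small true count is drowned out by the noise.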
It's worth repeating that the current results are preliminary and should be taken with a grain of salt. We invite statistically-inclined people to review our code, methods, and results. If you are a researcher interested in digging into the measurements themselves, you can find them in the extra-info descriptors of Tor relays.
Over the next months, we will also be thinking more about these problems to figure out proper ways to analyze and safely measure private ecosystems like the onion space.
Till then, take care, and enjoy Tor!