
Tor at the Heart: PETS and the Privacy Research Community

During the month of December, we're highlighting other organizations and projects that rely on Tor, build on Tor, or are accomplishing their missions better because Tor exists. Check out our blog each day to learn about our fellow travelers. And please support the Tor Project! We're at the heart of Internet freedom. Donate today!

So far in this blog series we've highlighted mainly software and advocacy projects. Today is a little different: I'm going to explain more about Tor's role in the academic world of privacy and security research.

Part one: Tor matters to the research community

Just about every major security conference these days has a paper analyzing, attacking, or improving Tor. While ten years ago the field of anonymous communications was mostly theoretical, with researchers speculating that a given design should or shouldn't work, Tor now provides an actual deployed testbed. Tor has become the gold standard for anonymous communications research for three main reasons:

First, Tor's source code and specifications are open. Beyond its original design document, Tor provides a clear and published set of RFC-style specifications describing exactly how it is built, why we made each design decision, and what security properties it aims to offer. The Tor developers conduct design discussion in the open, on public development mailing lists, and the public development proposal process provides a clear path by which other researchers can participate.

Second, Tor provides open APIs and maintains a set of tools to help researchers and developers interact with the Tor software. The Tor software's "control port" lets controller programs view and change configuration and status information, as well as influence path selection. We provide easy instructions for setting up separate private Tor networks for testing. This modularity makes Tor more accessible to researchers because they can run their own experiments using Tor without needing to modify the Tor program itself.
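
As a small illustration (ours, not from the original post), here is what talking to the control port can look like using Stem, the Python controller library developed alongside Tor. The sketch assumes a local tor configured with "ControlPort 9051" and cookie authentication:

# Minimal Stem sketch: connect to a local tor's control port,
# authenticate, and read version and configuration information.
from stem.control import Controller

with Controller.from_port(port=9051) as controller:
    controller.authenticate()  # reads the control auth cookie by default
    print('Tor version: %s' % controller.get_version())
    print('SocksPort: %s' % controller.get_conf('SocksPort'))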

Third, real users rely on Tor. Every day hundreds of thousands of people connect to the Tor network and depend on it for a broad variety of security goals. In addition to its emphasis on research and design, The Tor Project has developed a reputation as a non-profit that fosters this community and puts its users first. This real-world relevance motivates researchers to help make sure Tor provides provably good security properties.

I wrote the above paragraphs in 2009 for our first National Science Foundation proposal, and they've become even more true over time. A fourth reason has also emerged: Tor attracts researchers precisely because it brings in so many problems that sit at the intersection of "hard to solve" and "deeply important to the world". How to protect communications metadata is one of the key open research questions of the century, and nobody has all the answers. Our best chance at solving it is for researchers and developers all around the world to team up and work in the open, building on each other's progress.

Since starting Tor, I've done probably 100 Tor talks to university research groups all around the world, teaching grad students about these open research problems in the areas of censorship circumvention (which led to the explosion of pluggable transport ideas), privacy-preserving measurement, traffic analysis resistance, scalability and performance, and more.

The result of that effort, and of Tor's success in general, is a flood of research papers, plus a dozen research labs whose students regularly write their theses on Tor. The original Tor design paper from 2004 now has over 3200 citations, and in 2014 Usenix picked that paper out of all the security papers from 2004 to win its Test of Time award.

Part two: University collaborations

This advocacy and education work has also led to a variety of ongoing collaborations funded by the National Science Foundation, including with Nick Feamster's group at Princeton on measuring censorship, with Nick Hopper's group at University of Minnesota on privacy-preserving measurement, with Micah Sherr's group at Georgetown University on scalability and security against denial of service attacks, and an upcoming one with Matt Wright's group at RIT on defense against website fingerprinting attacks.

All of these collaborations are great, but there are precious few people on the Tor side keeping up with them, and those people need to balance their research time with development, advocacy, management, etc. I'm really looking forward to the time when Tor can have an actual research department.

And lastly, I would be remiss in describing our academic collaborations without also including a shout-out to the many universities that are running exit relays to help the network grow. As Professor Leo Reyzin of Boston University once explained when asked why it is appropriate for his research lab to support the Tor network: "If biologists want to study elephants, they get an elephant. I want my elephant." So, special thanks to Boston University, University of Michigan, University of Waterloo, MIT, CMU (their computer science department, that is), University of North Carolina, University of Pennsylvania, Universidad Galileo, and Clarkson University. And if you run an exit relay at a university but you're not on this list, please reach out!

Part three: The Privacy Enhancing Technologies Symposium

Another critical part of the privacy research world is the Privacy Enhancing Technologies Symposium (PETS), the premier venue for technical privacy and anonymity research. This yearly gathering started as a workshop in 2000, graduated to being called a symposium in 2008, and in 2015 it became an open-access journal named Proceedings on Privacy Enhancing Technologies.

The editorial board and chairs for PETS over the years overlap greatly with the Tor community, with a lot of names you'll see at both PETS and the Tor twice-yearly meetings, including Nikita Borisov, George Danezis, Claudia Diaz, Roger Dingledine (me), Ian Goldberg, Rachel Greenstadt, Kat Hanna, Nick Hopper, Steven Murdoch, Paul Syverson, and Matt Wright.

But beyond community overlap, The Tor Project is actually the structure underneath PETS. The group of academics who run the PETS gatherings intentionally did not set up corporate governance and all those pieces of bureaucracy that drag things down — so they can focus on having a useful research meeting each year — and Tor stepped in to effectively be the fiscal sponsor, by keeping the bank accounts across years, and by being the "owner" for the journal since De Gruyter's paperwork assumes that some actual organization has to own it. We're proud that we can help provide stability and longevity for PETS.

Speaking of all these papers: we have tracked the most interesting privacy and anonymity papers over the years on the anonymity bibliography (anonbib). But at this point, anonbib is still mostly a two-man show where Nick Mathewson and I update it when we find some spare time, and it's showing its age since its 2003 launch, especially given the huge growth in the field and the rise of tools like Google Scholar. Probably the best answer is that we need to trim it down so it's more of a "recommended reading list" than a catalog of all relevant papers. If you want to help, let us know!

Part four: The Tor Research Safety Board

This post is running long, so I will close by pointing to the Tor Research Safety Board, a group of researchers who study Tor and who want to minimize privacy risks while fostering a better understanding of the Tor network and its users. That page lists a set of guidelines on what to consider when you're thinking about doing research on Tor users or the Tor network, and a process for getting feedback and suggestions on your plan. We did a soft launch of the safety board this past year in the rump session at PETS, and we've fielded four requests for advice so far. We've been taking it slow in terms of publicity, but if you're a researcher and you can help us refine our process, please take a look!

Tor at the Heart: Online Collaborative Projects

During the month of December, we're highlighting other organizations and projects that rely on Tor, build on Tor, or are accomplishing their missions better because Tor exists. Check out our blog each day to learn about our fellow travelers. And please support the Tor Project! We're at the heart of Internet freedom. Donate today!

Research by Andrea Forte, Nazanin Andalibi and Rachel Greenstadt

Wikipedia blocks edits from Tor — how does this affect the quality and coverage of the "encyclopedia that anyone can edit"? How do captchas and blocking of anonymity services affect the experiences of Tor users when they are trying to contribute content? What can projects do to better support contributions from people who value their privacy?

We are a group of researchers from Drexel University studying these questions. Our initial study of privacy in open collaboration projects, entitled Privacy, Anonymity, and Perceived Risk in Open Collaboration: A Study of Tor Users and Wikipedians, was recently published in advance of its presentation at the ACM conference on Computer-Supported Cooperative Work and Social Computing (CSCW) in February. Our findings offer a rare look at why people turn to privacy tools like Tor and how they experience the Internet as a result. This work was inspired by a previous Tor blog post, A call to arms: Helping Internet services accept anonymous users.

We interviewed 23 people from seven countries, ranging in age from 18 to 41: 12 Tor users who participate in online projects and 11 Wikipedia editors who use a variety of privacy tactics. The Tor Project and Wikimedia Foundation are organizations committed to similar ideals — a free global exchange of information in which everyone is able to participate. The study's central finding is that perceived threats from other individuals, groups, and governments are substantial enough to force users below the radar and curtail their participation in order to protect their reputation, themselves, and their families.

In nearly all interviews, participants described being wary about how aspects of their participation in open collaboration projects might compromise their privacy or safety. Many described crisis experiences, their own or those of someone they knew, as the origin of their threat models in online projects.

Their reasons for guarding their privacy online ranged from concerns about providers obtaining and using their browsing history for targeted advertising to actual verbal abuse, harassment and threats of violence. The most common concern voiced by participants was a fear that their online communication or activities may be accessed or logged by parties without their knowledge or consent.

This threat, which became very real for many Americans after Snowden revealed the extent of the National Security Agency's surveillance and monitoring practices, has been ever-present for users in other countries for some time. According to one non-U.S. respondent, "in my country there's basically unknown surveillance going on ... and I don't know what providers to use, so at some point I decided to use Tor for everything."

For a political activist, dissident, or anyone who has expressed strong political opinions, the threat is multiplied. One such participant who uses Tor said, "they busted [my friend's] door down and they beat the ever living crap out of him... and told him, 'If you and your family want to live, then you're going to stop causing trouble.'" This person's privacy strategies were quickly transformed after that experience.

Eleven of the study's participants were recruited from the ranks of Wikipedia editors who expressed concerns about maintaining their privacy. Compared to political dissent, helping to add information to Wikipedia might seem innocuous, but editors who work on controversial topics are threatened and harassed too. Wikipedia allows anonymous posting, but it does not permit users to mask their IP addresses, and it blocks Tor users except in special cases. So wading into controversial territory, even to present a fact-backed, neutral point of view, puts editors at risk. Some Wikipedians described threats of rape, physical assault, and death as reprisals for their contributions to the project.

Administrators of the site, who often spend their time on managerial tasks and enforcing policies, also reported being harassed or threatened with violence. "It's a lot of emotional work," said one study participant. "I remember being like 13 and getting a lot of rape threats and death threats and that was when I was doing administrative work."

Our analysis suggests that Wikipedia and other collaborative projects are losing valuable contributions to privacy concerns. If certain voices are systematically dampened by the threat of harassment, intimidation, violence, or opportunity and reputation loss, projects like Wikipedia cannot hope to attract the diversity of contributors required to produce "the sum of all human knowledge."

In response to this problem, our research agenda aims to support communities like Wikipedia in developing tools and norms that value and welcome anonymous contributions.

For more:

Andrea Forte will be speaking at the next WikiResearch showcase which will be live-streamed this Wednesday 12/21 at 11:30am PT / 7:30pm UTC.

Read the paper: "Privacy, Anonymity, and Perceived Risk in Open Collaboration: A Study of Tor Users and Wikipedians"

Watch the video from the 32c3 talk: What is the value of anonymous communication?

Tor at the Heart: Cryptocurrencies

During the month of December, we're highlighting other organizations and projects that rely on Tor, build on Tor, or are accomplishing their missions better because Tor exists. Check out our blog each day to learn about our fellow travelers. And please support the Tor Project! We're at the heart of Internet freedom. Donate today!

The topic for today is electronic money. The blockchain is pretty hot right now! Bitcoin, Dogecoin, Ethereum, Zcash, you name it... Cryptocurrencies have grown from e-toys to globally recognized systems by offering free and borderless trade, no bank fees, and improved privacy.

You are reading the Tor blog, so let's focus on the privacy and anonymity part. Could cryptocurrencies claim to provide privacy if Tor were not around to give them strong transport-layer anonymity?

To visualize this, let's go through just a few ways Tor is used around the cryptocurrency ecosystem. We will mainly focus on Bitcoin, but the same applies to most blockchain-based cryptocurrencies:

Tor provides privacy to cryptocurrency transactions!

Let's imagine that Alice wants to buy a ticket for Torconf, the best (fictional) conference on computer anonymity. She wants to buy the ticket with Bitcoin so that she does not reveal her interests to her bank or her identity to the conference organizers. To buy the ticket with Bitcoin, she needs to perform a Bitcoin transaction.

Bitcoin transactions work by Alice broadcasting her transaction to a few Bitcoin supernodes. Those nodes then propagate the transaction to the rest of the Bitcoin network until it becomes recognized. If Alice did not use Tor to conduct her transaction, those initial supernodes would trivially learn her IP address. Furthermore, since the Bitcoin blockchain is a public log of transactions, analysts could match her newest transaction with her previous transactions and just follow the money trail. These are just some of the many well-known privacy risks of Bitcoin, and companies have been collecting and selling social graph analytics of the Bitcoin blockchain for years now...

Given the above threats, it should be no surprise that most Bitcoin clients give their users the option to perform transactions over the Tor network. By routing traffic over Tor, no one learns Alice's originating IP address when she buys her Torconf ticket.
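
As a concrete sketch (our illustration; the option names are Bitcoin Core's, and the port assumes a default tor listening on 9050), pointing a node at a local Tor SOCKS proxy takes only a couple of bitcoin.conf lines:

# bitcoin.conf sketch: route all Bitcoin peer traffic through Tor,
# and only peer over onion addresses so nothing bypasses the proxy.
proxy=127.0.0.1:9050
onlynet=onion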

Furthermore, even the hottest and newest cryptocurrencies (like Zcash) that provide transaction anonymity as a fundamental security property still benefit from Tor's transport-layer anonymity to actually anonymize the networking part of the Zcash transaction.

We feel that Tor has tremendously helped the cryptocurrency community to grow just by providing transport-layer anonymity to transactions! Also, please remember that maintaining anonymity is not an easy task, so always be up-to-date on the latest security news depending on your threat model.

Tor secures cryptocurrency networks!

Apart from users performing anonymous Bitcoin transactions, the Bitcoin network itself uses Tor to increase its defenses. Since last year, the Bitcoin Core project has integrated Tor onion services into its core network daemon. If Tor is installed on the system, Bitcoin will automatically create an onion service and act as a Bitcoin node over Tor to avoid leaking the real IP address of the node. This provides greater network resilience and protection against targeted attacks on Bitcoin nodes. You can see that there are hundreds of Tor bitcoin nodes. Zcash and other cryptocurrencies have followed the same path.
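
For illustration, the automatic onion service setup can be enabled roughly like this in bitcoin.conf (a hedged sketch: it assumes tor exposes its control port via "ControlPort 9051" in torrc):

# bitcoin.conf sketch: accept incoming connections, and let bitcoind
# create its own onion service through Tor's control port.
listen=1
torcontrol=127.0.0.1:9051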

Furthermore, many mining pools advertise onion service support for their miners. Bitcoin infrastructure has been a target of hackers for a while, and freshly mined blocks are more and more valuable, so anonymity is a desirable security property for miners.

Tor protects the wider cryptocurrency ecosystem!

If you take a look around the Bitcoin world, you will notice that Tor support is advertised by all sorts of websites and services! Most bitcoin-related websites have onion sites that people can visit over Tor: for example, blockchain.info has been running a popular Tor onion service for its users. Most Bitcoin tumbler services also work over Tor onion services. Same goes for websites and forums offering help with Bitcoin. This is obviously done because the Bitcoin community has a great appreciation and need for privacy.

Tor is proud to have helped the cryptocurrency community grow over the years. We believe that electronic currencies can be a powerful tool for social change, but also a great scientific research area with results that can benefit other areas, like secure electronic voting, consensus algorithms, append-only data structures and secure name systems.

Help Tor grow the cypherpunk ecosystem by donating today!! We also accept Bitcoin!

Have a good day :)

What's new in Tor 0.2.9.8?

Today, we've released the first stable version of the 0.2.9.x series, bringing exciting new features to Tor. The series has seen 1406 commits from 32 different contributors. Please see the ChangeLog for more details about what has been done.

This post will outline three features (among many other things) that we are quite proud of and want to describe in more detail.

Single Onion Service

Over the past several years, we've collaborated with many large-scale service providers such as Facebook and Riseup, organizations that deployed onion services to improve their performance.

Onion services are great because they offer anonymity on both the service side and the client side. However, there are cases where the onion service does not require anonymity. The main example is when the service provider does not need to hide the location of its servers.

As a reminder to the reader, an onion service connection between a client and a service goes through 6 hops, while a regular connection with Tor is 3 hops. Onion services are much slower than regular Tor connections because of this.

Today, we are introducing Single Onion Services! With this new feature, a service can now specify in its configuration file that it does not need anonymity, thus cutting the 3 hops between the service and its Rendezvous Point and speeding up the connection.

For security reasons, if this option is enabled, only single onion services can be configured; they can't coexist with regular onion services. Because this removes the anonymity aspect of the service, we took extra precautions to make it very difficult to enable single onion mode by mistake. In your torrc file, here is how you do it:


HiddenServiceNonAnonymousMode 1
HiddenServiceSingleHopMode 1


Please read about these options in the manual page before you enable them.
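
For context, here is a complete hypothetical torrc sketch; the directory and ports are placeholders, not part of the release notes:

# Single onion service sketch; adjust the directory and ports to taste.
HiddenServiceNonAnonymousMode 1
HiddenServiceSingleHopMode 1
HiddenServiceDir /var/lib/tor/my_service/
HiddenServicePort 80 127.0.0.1:8080
# Non-anonymous mode cannot be combined with client functionality:
SocksPort 0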

Shared Randomness

We've talked about this before, but now it is a reality. At midnight UTC every day, the directory authorities collectively generate a global random value that cannot be predicted in advance. This daily fresh random value is the foundation of our next-generation onion service work, coming soon to a Tor near you.

In the consensus file, the values look like this (if all goes well, each day at 00:00 UTC the consensus will get a new one):


shared-rand-current-value Hq+hGlzwAVetJ2zkO70riH/SEMNri+c7Ps8xERZ3a0o=
shared-rand-previous-value CY5TncVAltDpkBKZUBYT1canvqmVoNuweiKVZIilHfs=


Thanks to atagar, version 1.5.0 of the Stem library supports parsing the shared random values from the consensus. See here for more information!
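
For instance, here is a minimal sketch (attribute names as documented by Stem 1.5.0; it downloads the full consensus document from the directory mirrors):

# Python sketch: read the shared random values with Stem >= 1.5.0.
from stem.descriptor import DocumentHandler
from stem.descriptor.remote import DescriptorDownloader

downloader = DescriptorDownloader()
consensus = downloader.get_consensus(document_handler=DocumentHandler.DOCUMENT).run()[0]

print('current:  %s' % consensus.shared_randomness_current_value)
print('previous: %s' % consensus.shared_randomness_previous_value)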

We have deliberately not exposed those values through the control port yet; we will wait for a full stable release cycle to make sure the feature is stable enough for third-party applications to rely on (https://trac.torproject.org/19925).

Mandatory ntor handshake

This is another important security feature introduced in the new release. Authorities, relays and clients now require ntor keys in all descriptors, for all hops, for all circuits, and for all other roles.

In other words, except for onion services (and this will be addressed with the next generation), only ntor is used; the TAP handshake is now finally dropped.

This results in better security for the overall network and users.


Enjoy this new release!

Tor 0.3.0.1-alpha: A new alpha series begins

Now that Tor 0.2.9.8 is stable, it's time to release a new alpha series for testing and bug-hunting!

Tor 0.3.0.1-alpha is the first alpha release in the 0.3.0 development series. It strengthens Tor's link and circuit handshakes by identifying relays by their Ed25519 keys, improves the algorithm that clients use to choose and maintain their list of guards, and includes additional backend support for the next-generation hidden service design. It also contains numerous other small features and improvements to security, correctness, and performance.

You can download the source from the usual place on the website. Packages should be available over the next few weeks, including an alpha Tor Browser release sometime in January.

Please note: This is an alpha release. Please expect more bugs than usual. If you want a stable experience, please stick to the stable releases.

Below are the changes since 0.2.9.8.

Changes in version 0.3.0.1-alpha - 2016-12-19

  • Major features (guard selection algorithm):
    • Tor's guard selection algorithm has been redesigned from the ground up, to better support unreliable networks and restrictive sets of entry nodes, and to better resist guard-capture attacks by hostile local networks. Implements proposal 271; closes ticket 19877.
  • Major features (next-generation hidden services):
    • Relays can now handle v3 ESTABLISH_INTRO cells as specified by prop224 aka "Next Generation Hidden Services". Services and clients don't use this functionality yet. Closes ticket 19043. Based on initial code by Alec Heifetz.
    • Relays now support the HSDir version 3 protocol, so that they can store and serve v3 descriptors. This is part of the next-generation onion service work detailed in proposal 224. Closes ticket 17238.
  • Major features (protocol, ed25519 identity keys):
    • Relays now use Ed25519 to prove their Ed25519 identities to one another, and to clients. This algorithm is faster and more secure than the RSA-based handshake we've been doing until now. Implements the second big part of proposal 220; closes ticket 15055.
    • Clients now support including Ed25519 identity keys in the EXTEND2 cells they generate. By default, this is controlled by a consensus parameter, currently disabled. You can turn this feature on for testing by setting ExtendByEd25519ID in your configuration. This might make your traffic appear different than the traffic generated by other users, however. Implements part of ticket 15056; part of proposal 220.
    • Relays now understand requests to extend to other relays by their Ed25519 identity keys. When an Ed25519 identity key is included in an EXTEND2 cell, the relay will only extend the circuit if the other relay can prove ownership of that identity. Implements part of ticket 15056; part of proposal 220.


Tor 0.2.9.8 is released: finally, a new stable series!

Tor 0.2.9.8 is the first stable release of the Tor 0.2.9 series.

The Tor 0.2.9 series makes mandatory a number of security features that were formerly optional. It includes support for a new shared-randomness protocol that will form the basis for next-generation hidden services, includes a single-hop hidden service mode for optimizing .onion services that don't actually want to be hidden, tries harder not to overload the directory authorities with excessive downloads, and supports a better protocol versioning scheme for improved compatibility with other implementations of the Tor protocol.

And of course, there are numerous other bugfixes and improvements.

This release also includes a fix for a medium-severity issue (bug 21018 below) where Tor clients could crash when attempting to visit a hostile hidden service. Clients are recommended to upgrade as packages become available for their systems.

You can download the source code from the usual place on the website. Packages should be up within the next few days, with a Tor Browser release planned for early January.

Below are listed the changes since Tor 0.2.8.11. For a list of changes since 0.2.9.7-rc, see the ChangeLog file.

Changes in version 0.2.9.8 - 2016-12-19

  • New system requirements:
    • When building with OpenSSL, Tor now requires version 1.0.1 or later. OpenSSL 1.0.0 and earlier are no longer supported by the OpenSSL team, and should not be used. Closes ticket 20303.
    • Tor now requires Libevent version 2.0.10-stable or later. Older versions of Libevent have less efficient backends for several platforms, and lack the DNS code that we use for our server-side DNS support. This implements ticket 19554.
    • Tor now requires zlib version 1.2 or later, for security, efficiency, and (eventually) gzip support. (Back when we started, zlib 1.1 and zlib 1.0 were still found in the wild. 1.2 was released in 2003. We recommend the latest version.)
  • Deprecated features:
    • A number of DNS-cache-related sub-options for client ports are now deprecated for security reasons, and may be removed in a future version of Tor. (We believe that client-side DNS caching is a bad idea for anonymity, and you should not turn it on.) The options are: CacheDNS, CacheIPv4DNS, CacheIPv6DNS, UseDNSCache, UseIPv4Cache, and UseIPv6Cache.
    • A number of options are deprecated for security reasons, and may be removed in a future version of Tor. The options are: AllowDotExit, AllowInvalidNodes, AllowSingleHopCircuits, AllowSingleHopExits, ClientDNSRejectInternalAddresses, CloseHSClientCircuitsImmediatelyOnTimeout, CloseHSServiceRendCircuitsImmediatelyOnTimeout, ExcludeSingleHopRelays, FastFirstHopPK, TLSECGroup, UseNTorHandshake, and WarnUnsafeSocks.
    • The *ListenAddress options are now deprecated as unnecessary: the corresponding *Port options should be used instead. These options may someday be removed. The affected options are: ControlListenAddress, DNSListenAddress, DirListenAddress, NATDListenAddress, ORListenAddress, SocksListenAddress, and TransListenAddress.


Tor 0.2.8.12 is released

There's a new "old stable" release of Tor! (But maybe you want the 0.2.9.8 release instead; that also comes out today.)

Tor 0.2.8.12 backports a fix for a medium-severity issue (bug 21018 below) where Tor clients could crash when attempting to visit a hostile hidden service. Clients are recommended to upgrade as packages become available for their systems.

It also includes an updated list of fallback directories, backported from 0.2.9.

Now that the Tor 0.2.9 series is stable, only major bugfixes will be backported to 0.2.8 in the future.

You can download Tor 0.2.8 -- and other older release series -- from dist.torproject.org.

Changes in version 0.2.8.12 - 2016-12-19

  • Major bugfixes (parsing, security, backported from 0.2.9.8):
    • Fix a bug in parsing that could cause clients to read a single byte past the end of an allocated region. This bug could be used to cause hardened clients (built with --enable-expensive-hardening) to crash if they tried to visit a hostile hidden service. Non-hardened clients are only affected depending on the details of their platform's memory allocator. Fixes bug 21018; bugfix on 0.2.0.8-alpha. Found by using libFuzzer. Also tracked as TROVE-2016-12-002 and as CVE-2016-1254.
  • Minor features (fallback directory list, backported from 0.2.9.8):
    • Replace the 81 remaining fallbacks of the 100 originally introduced in Tor 0.2.8.3-alpha in March 2016, with a list of 177 fallbacks (123 new, 54 existing, 27 removed) generated in December 2016. Resolves ticket 20170.
  • Minor features (geoip, backported from 0.2.9.7-rc):
    • Update geoip and geoip6 to the December 7, 2016 Maxmind GeoLite2 Country database.

Technology in Hostile States: Ten Principles for User Protection

This blog post is meant to generate a conversation about best practices for using cryptography and privacy by design to improve security and protect user data from well-resourced attackers and oppressive regimes.

The technology industry faces tremendous risks and challenges that it must defend itself against in the coming years. State-sponsored hacking and pressure for backdoors will both increase dramatically, even as soon as early 2017. Faltering diplomacy and faltering trade between the United States and other countries will also endanger the remaining deterrent against large-scale state-sponsored attacks.

Unfortunately, it is also likely that in the United States, current legal mechanisms, such as NSLs and secret FISA warrants, will continue to target the marginalized. This will include immigrants, Muslims, minorities, and even journalists who dare to report unfavorably about the status quo. History is full of examples of surveillance infrastructure being abused for political reasons.

Trust is the currency of the technology industry, and if it evaporates, so will the value of the industry itself. It is wise to get out ahead of this erosion of trust, which has already caused Americans to change online buying habits.

This trust comes from demonstrating the ability to properly handle user data in the face of extraordinary risk. The Tor Project has over a decade of experience managing risk from state and state-sized adversaries in many countries. We want to share this experience with the wider technology community, in the hopes that we can all build a better, safer world together. We believe that the future depends on transparency and openness about the strengths and weaknesses of the technology we build.

To that end, we decided to enumerate some general principles that we follow to design systems that are resistant to coercion, compromise, and single points of failure of all kinds, especially adversarial failure. We hope that these principles can be used to start a wider conversation about current best practices for data management and potential areas for improvement at major tech companies.

Ten Principles for User Protection

1. Do not rely on the law to protect systems or users.
2. Prepare policy commentary for quick response to crisis.
3. Only keep the user data that you currently need.
4. Give users full control over their data.
5. Allow pseudonymity and anonymity.
6. Encrypt data in transit and at rest.
7. Invest in cryptographic R&D to replace non-cryptographic systems.
8. Eliminate single points of security failure, even against coercion.
9. Favor open source and enable user freedom.
10. Practice transparency: share best practices, stand for ethics, and report abuse.

1. Do not rely on the law to protect systems or users.

This is the principle from which the others flow. Whether it is foreign hackers, extra-legal entities like organized crime, or the abuse of power in one of the jurisdictions in which you operate, there are plenty of threats outside and beyond the reach of law that can cause harm to your users. It is wise not to assume that the legal structure will keep your users and their data safe from these threats. Only sound engineering and data management practices can do that.

2. Prepare policy commentary for quick response to crisis.

It is common for technologists to take Principle 1 so far that they ignore the law, or at least ignore the political climate in which they operate. It is possible for the law and even for public opinion to turn against technology quickly, especially during a crisis where people do not have time to fully understand the effects of a particular policy on technology.

The technology industry should be prepared to counter bad policy recommendations with coherent arguments as soon as the crisis hits. This means spending time and devoting resources to testing the public's reaction to statements and arguments about policy in focus groups, with lobbyists, and in other demographic testing scenarios, so that we know what arguments will appeal to which audiences ahead of time. It also means having media outlets, talk show hosts, and other influential people ready to back up our position. It is critical to prepare early. When a situation becomes urgent, bad policy often gets implemented quickly, simply because "something must be done".

3. Only keep the user data that you currently need.

Excessive retention of personally identifiable data is dangerous to users, especially the marginalized and the oppressed. Data that is retained is data that is at risk of compromise or future misuse. As Maciej Ceglowski suggests in his talk Haunted By Data, "First: Don't collect it. But if you have to collect it, don't store it! If you have to store it, don't keep it!"

With enough thought and the right tools, it is possible to engineer your way out of your ability to provide data about specific users, while still retaining the information that is valuable or essential to conduct your business. Examples of applications of this idea are Differential Privacy, PrivEx, the EFF's CryptoLog, and how Tor collects its user metrics. We will discuss this idea further in Principle 7; the research community is exploring many additional methods that could be supported and deployed.
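
To make the idea concrete, here is a minimal sketch (ours, not code from any of the systems named above) of the Laplace mechanism that underlies differential privacy: a counting query is answered with calibrated noise, so the published number stays useful in aggregate while any single user's presence has only a bounded effect on it.

# Python sketch of the Laplace mechanism (illustrative parameters).
import numpy as np

def dp_count(true_count, epsilon, sensitivity=1.0):
    # One user changes a count by at most `sensitivity`, so Laplace
    # noise with scale sensitivity/epsilon yields epsilon-differential
    # privacy for this single query.
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# e.g. publish a daily user count without exposing any individual:
print(dp_count(182353, epsilon=0.1))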

4. Give users full control over their data.

For sensitive data that must be retained in a way that can be associated with an individual user, the ethical thing to do is to give users full control over that data. Users should have the ability to remove data that is collected about themselves, and this process should be easy. Users should be given interfaces that make it clear what type of data is collected about them and how, and they should be given easy ways to migrate, restrict, or remove this data if they wish.

5. Allow pseudonymity and anonymity.

Even with full control of your data, there are plenty of reasons to use a pseudonym. Real Name policies harm the marginalized, those vulnerable to abuse, and activists working for social change.

Beyond issues with pseudonymity, the ability to anonymously access information via Tor and VPNs must also be protected and preserved. There is a disturbing trend for automated abuse detection systems to harshly penalize shared IP address infrastructure of all kinds, leading to loss of access.

The Tor Project is working with Cloudflare on both cryptographic and engineering-based solutions to enable Tor users to more easily access websites. We invite interested representatives from other tech companies to help us refine and standardize these solutions, and ensure that these solutions will work for them, too.

6. Encrypt data in transit and at rest.

With recent policy changes in both the US and abroad, it is more important than ever to encrypt data in transit, so that it does not end up in the dragnet. This means more than just HTTPS. Even intra-datacenter communications should be protected by IPsec or VPN encryption.

As more of our data is encrypted in transit, requests for stored data will likely rise. Companies can still be compelled to decrypt data that is encrypted with keys that they control. The only way to keep user data truly safe is to provide ways for users to encrypt that data with keys that only those users control.
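
As a sketch of what user-controlled keys can look like (our illustration, using the PyNaCl bindings to libsodium), the data is encrypted on the user's device under a key that never leaves it, so the service only ever stores ciphertext:

# Python sketch with PyNaCl: client-side encryption under a user-held key.
import nacl.secret
import nacl.utils

key = nacl.utils.random(nacl.secret.SecretBox.KEY_SIZE)  # stays with the user
box = nacl.secret.SecretBox(key)

ciphertext = box.encrypt(b'private user data')  # safe to store server-side
plaintext = box.decrypt(ciphertext)             # only the key holder can do this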

7. Invest in cryptographic R&D to replace non-cryptographic systems.

A common argument against cryptographic solutions for privacy is that the loss of features, usability, ad targeting, or analytics is in opposition to the business case for the product in question. We believe that this is because the funding for cryptography has not been focused on these needs. In the United States, much of the current cryptographic R&D funding comes from the US military. As Phillip Rogaway pointed out in Part 4 of his landmark paper, The Moral Character of Cryptographic Work, this has created a misalignment between what gets funded and what is needed in the private sector to keep users' personal data safe in a usable way.

It would be a wise investment for companies that handle large amounts of user data to fund research into potential replacement systems that are cryptographically privacy preserving. It may be the case that a company can be both skillful and lucky enough to retain detailed records and avoid a data catastrophe for several years, but we do not believe it is possible to keep a perfect record forever.

The following are some areas that we think should be explored more thoroughly, in some cases with further research, and in other cases with engineering resources for actual implementations: Searchable encryption, Anonymous Credentials, Private Ad Delivery, Private Location Queries, Private Location Sharing, and PIR in general.

8. Eliminate single points of security failure, even against coercion.

Well-designed cryptographic systems are extremely hard to compromise. Typically, the adversary looks for a way around the cryptography, either by exploiting other code on the system or by coercing one of the parties to divulge key material or decrypted data. These attacks naturally target the weakest point of the system: the single point of security failure where the fewest systems need to be compromised, and where the fewest people will notice. The proper engineering response is to ensure that multiple layers of security need to be broken for security to fail, and to ensure that security failure is visible and apparent to the largest possible number of people.

Sandboxing, modularization, vulnerability surface reduction, and least privilege are already established as best practices for improving software security. They also eliminate single points of failure. In combination, they force the adversary to compromise multiple hardened components before the system fails. Compiler hardening is another way to eliminate single points of failure in code bases. Even with memory unsafe languages, it is still possible for the compiler to add additional security layers. We believe that compiler hardening could use more attention from companies who contribute to projects like GCC and clang/llvm, so that the entire industry can benefit. In today's world, we all rely on the security of each other's software, sometimes indirectly, in order to do our work.
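
As one hedged example of what compiler hardening looks like in practice, these widely supported GCC/clang flags each add an independent layer of protection (the selection is ours, not a canonical list):

# Illustrative hardening flags for GCC or clang builds.
# Stack canaries, checked libc calls, position-independent code:
CFLAGS="-O2 -fstack-protector-strong -D_FORTIFY_SOURCE=2 -fPIE"
# Full ASLR plus read-only relocations resolved at load time:
LDFLAGS="-pie -Wl,-z,relro -Wl,-z,now"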

When security does fail, we want incidents to be publicly visible. Distributed systems and multi-party/multi-key authentication mechanisms are common ways to ensure this visibility. The Tor consensus protocol is a good example of a system that was deliberately designed such that multiple people must be simultaneously compromised or coerced before security will fail. Reproducible builds are another example of this design pattern. While these types of practices are useful when used internally in an organization, this type of design is more effective when it crosses organizational boundaries - so that multiple organizations need to be compromised to break the security of a system - and most effective when it also crosses cultural boundaries and legal jurisdictions.

We are particularly troubled by the trend towards the use of App Stores to distribute security software and security updates. When each user is personally identifiable to the software update system, that system becomes a perfect vector for backdoors. Globally visible audit logs like Google's General Transparency are one possible solution to this problem. Additionally, the anonymous credentials mentioned in Principle 7 provide a way to authenticate the ability to download an app without revealing the identity of the user, which would make it harder to target specific users with malicious updates.

9. Favor open source and enable user freedom.

The Four Software Freedoms are the ability to use, study, share, and improve software.

Open source software that provides these freedoms has many advantages when operating in a hostile environment. It is easier for experts to certify and verify security properties of the software; subtle backdoors are easier to find; and users are free to modify the software to remove any undesired operation.

The most widely accepted argument against backdoors is that they are technically impossible to deploy, because they compromise the security of the system if they are found. A secondary argument is that backdoors can be avoided by the use of alternative systems, or by their removal. Both of these arguments are stronger for open source than for closed source, precisely because of the Four Freedoms.

10. Practice transparency: share best practices, stand for ethics, and report abuse.

Unfortunately, not all software is open source. Even for proprietary software, the mechanisms by which we design our systems in order to prevent harm and abuse should be shared publicly in as much detail as possible, so that best practices can be reviewed and adopted more widely. For example, Apple is doing great work adopting cryptography for many of its products, but without specifications for how they are using techniques like differential privacy or iMessage encryption, it is hard to know what protections they are actually providing, if any.

Still, even when the details of their work are not public, the best engineers deeply believe that protecting their users is an ethical obligation, to the point of being prepared to publicly resign from their jobs rather than cause harm.

But, before we get to the point of resignation, it is important that we do our best to design systems that make abuse either impossible or evident. We should then share those designs, and responsibly report any instances of abuse. When abuse happens, inform affected organizations, and protect the information of individual users who were at risk, but make sure that users and the general public will hear about the issue with little delay.

Please Join Us

Ideally, this post will spark a conversation about best practices for data management and the deployment of cryptography in companies around the world.

We hope to use this conversation to generate a list of specific best practices that the industry is already undertaking, as well as to provide a set of specific recommendations based on these principles for companies with which we're most familiar, and whose products will have the greatest impact on users.

If you have specific suggestions, or would like to highlight the work of companies who are already implementing these principles, please mention them in the comments. If your company is already taking actions that are consistent with these principles, either write about that publicly, or contact me directly. We're interested in highlighting positive examples of specific best practices as well as instances where we can all improve, so that we all can work towards user safety and autonomy.

We would like to thank everyone at the Tor Project and the many members of the surrounding privacy and Internet freedom communities who provided review, editorial guidance, and suggestions for this post.
