Ethical Tor Research: Guidelines

by ailanthus | November 11, 2015

[Edit: this blog post was a preliminary version of our ideas, and you should look at the Tor Research Safety Board page now.]

Draft 1.1

1. Goals of this document.

  • In general, to describe how to conduct responsible research on Tor and similar privacy tools.
  • To develop guidelines for research activity that researchers can use to evaluate their proposed plan.
  • To produce a (non-exhaustive) list of specific types of unacceptable activity.
  • To develop a “due diligence” process for research that falls within the scope of “potentially dangerous” activities. This process can require some notification of, and feedback from, the Tor network or other third parties.

2. General principles

Experimentation does not justify endangering people. Just as in medicine, there are experiments in privacy that can only be performed by creating an unacceptable degree of human harm. These experiments are not justified, any more than the gains to human knowledge would justify unethical medical research on human subjects.

Research on humans' data is human research. Over the last century, we have made enormous strides in what research we consider ethical to perform on people in other domains. For example, we have generally decided that it's ethically dubious to experiment on human subjects without their informed consent. We should make sure that privacy research is at least as ethical as research in other fields.

We should use our domain knowledge concerning privacy when assessing risks. Privacy researchers know that information which other fields consider non-invasive can be used to identify people, and we should take this knowledge into account when designing our research.

Finally, users and implementors must remember that "should not" does not imply "can not." Guidelines like these can serve to guide researchers who are genuinely concerned with doing the right thing and behaving ethically; they cannot restrain the unscrupulous or unethical. Against invasions like these, other mechanisms (like improved privacy software) are necessary.

3. Guidelines for research

  1. Only collect data that is acceptable to publish. If it would be inappropriate to share it with the world, it is invasive to collect it. In the case of encrypted or secret-shared data, it can be acceptable to assume that the keys or some shares are not published.
  2. Only collect as much data as is needed: practice data minimization.
    1. Whenever possible, use analysis techniques that do not require sensitive data but instead work on anonymized aggregates.
  3. Limit the granularity of the data. For example, "noise" (added data inaccuracies) should almost certainly be added. This requires a working statistical background, but it helps to avoid harm to users (see the sketch after this list).
  4. Make an explicit description of benefits and risks, and argue that the benefits outweigh the risks.
    1. In order to be sure that risks have been correctly identified, seek external review from domain experts. Frequently there are non-obvious risks.
    2. Consider auxiliary data when assessing the risk of your research. Data which is not damaging on its own can become dangerous when other data is also available. For example, data from exit traffic can be combined with entry traffic to deanonymize users.
    3. Respect people's own judgments concerning their privacy interests in their own data.
    4. It's a warning sign if you can't disclose details of your data collection in advance. If knowing about your study would cause your subjects to object to it, that's a good sign that you're doing something dubious.
  5. Use a test network when at all possible.
    1. If you can experiment either on a test network without real users, or on a live network, use the test network.
    2. If you can experiment either on your own traffic or on the traffic of strangers, use your own traffic.
    3. "It was easier that way" is not justification for using live user traffic over test network traffic.

4. Examples of unacceptable research activity

  • It is not acceptable to run an HSDir, harvest onion addresses, and publish or connect to those onion addresses.
  • Don't set up exit relays to sniff or tamper with exit traffic. Some broad measurements (relative frequency of ports; coarse-grained volume) may be acceptable depending on risk/benefit tradeoffs; fine-grained measurements are not (see the sketch after this list).
  • Don't set up relays that are deliberately dysfunctional (e.g., terminate connections to specific sites).
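
To make the second bullet's distinction a little more concrete, here is a rough sketch (not from the original post) of what a coarse, aggregate-only exit measurement might look like: it keeps only relative port frequencies and broad volume buckets, and never touches addresses or payloads. The bucket boundaries are arbitrary assumptions, and in practice the resulting counts should also be noised as in the earlier sketch.

```python
"""Rough sketch (illustrative only): coarse, aggregate-only exit measurement.
The volume buckets are arbitrary assumptions chosen for illustration."""
from collections import Counter

VOLUME_BUCKETS = [(10_000, "<10 KB"), (1_000_000, "<1 MB"), (float("inf"), ">=1 MB")]

def bucket_volume(num_bytes):
    """Map an exact byte count onto a broad bucket so that fine-grained
    volumes are never stored or published."""
    for limit, label in VOLUME_BUCKETS:
        if num_bytes < limit:
            return label

def summarize(connections):
    """connections: an iterable of (dest_port, num_bytes) pairs.
    Only aggregate counts are kept; no addresses or payloads are involved."""
    ports = Counter(port for port, _ in connections)
    volumes = Counter(bucket_volume(n) for _, n in connections)
    return {"port_frequency": dict(ports), "volume_buckets": dict(volumes)}

if __name__ == "__main__":
    sample = [(443, 52_000), (80, 3_100), (443, 2_400_000), (22, 900)]
    print(summarize(sample))
```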

Comments

Please note that the comment area below has been archived.

November 11, 2015

Permalink

"It was easier that way" is not justification for using live test network traffic over user traffic.

Don't you mean "It was easier that way" is not justification for using user traffic over live test network traffic?

November 11, 2015

Permalink

The good guys not doing something won't stop the bad guys from doing it.

And how can we disprove people who say Tor is only for illegal activity when we're unwilling to probe the network and find out what it's actually used for?
When you start using Tor or a VPN to browse the net there's a risk that your traffic will be observed or tampered with; people should be aware of this instead of burying their heads in the sand.

November 12, 2015

Permalink

> harvest onion addresses, and *publish or connect to those onion addresses.*

Which is being proscribed, harvesting .onion addresses from HSDirs, or publishing a list of .onion addresses? If it's the latter, there are at least four prominent Tor community members currently doing this.

1. https://ahmia.fi/address/
2. https://skunksworkedp2cg.tor2web.org/sites.txt
3. http://www.onion.link/sitemap.xml
4. https://www.google.com/webhp?ie=UTF-8&gws_rd=cr&q=site:onion.to#

Does this constitute a violation?

If it's the former, then it's unclear why harvesting .onion addresses from HSDirs is substantially different from harvesting addresses by web-crawling .onion domains.

The difference is that onion harvesting will get onion addresses intended for private use. One of the benefits of onion addresses having such large address spaces is that they are effectively private until the owner (or someone who knows of their existence) decides to share them with the public.

November 12, 2015

Permalink

> Experimentation does not justify endangering people. Just as in medicine, there are experiments in privacy that can only be performed by creating an unacceptable degree of human harm. These experiments are not justified, any more than the gains to human knowledge would justify unethical medical research on human subjects.
>
> Research on humans' data is human research. Over the last century, we have made enormous strides in what research we consider ethical to perform on people in other domains. For example, we have generally decided that it's ethically dubious to experiment on human subjects without their informed consent. We should make sure that privacy research is at least as ethical as research in other fields.

Exactly!

It is unfortunate that it has taken such naked abuses as the CMU payoff to expose why it is so important to push for the acceptance of the principle that academic network security researchers must adhere to a code of ethics, or be drummed out of the academy. Please make it a priority to liaise with like-minded organizations to push for the adoption of such standards by universities and legislatures worldwide, and by the UN.

November 16, 2015

Permalink

No surprise that the enemies of encryption are blaming the Paris attacks on Snowden (without even pretending to present any evidence supporting this claim):

http://thehill.com/policy/national-security/260242-cia-head-warns-isis-…
CIA director warns that ISIS has other plots 'in the pipeline'
Julian Hattem
16 Nov 2015

> However, he blamed the inability to fully uncover the plot on new privacy concerns following government whistleblower Edward Snowden’s leaks about global intelligence powers. “I do hope this is going to be a wake-up call, particularly in areas of Europe where there has been a misrepresentation of what the intelligence and security services are doing,” Brennan said, in a preview of a new stage in the fight over global spying powers.

(Techdirt points out that a "wakeup call" is something you request yourself. And the White House memo confirms that the enemies of encryption were simply biding their time, waiting for the next big attack. They had a plan to respond with a media blitz, and I hope Tor Project has a counter-plan.)

Or on people who own Playstations:

http://thehill.com/policy/cybersecurity/260265-isis-terrorists-may-have…
ISIS may have used PlayStations to plan Paris attacks
16 Nov 2015
Katie Bo Williams

Which is like blaming terrorism on communication generally:

> “...my immediate response was ‘big deal,’” wrote security researcher and blogger Graham Cluley. “The PlayStation 4 is the best-selling video game console in the world. If you're raiding the homes of young men in their twenties, don't be surprised if they have a Sony PS4 stashed beneath their TV. Anything which allows two people to exchange messages (whether it be by talking, typing, or waving semaphore flags at each other in a 3D virtual environment) could potentially be used by terrorists to communicate,” Cluley wrote.

No surprise that US governors and EU governments are falling over each other in their haste to close their borders to desperate Syrian refugees:

http://thehill.com/policy/national-security/260249-alabama-michigan-gov…
Six states refuse Syrian refugees
Jesse Byrnes
16 Nov 2015

No surprise that Peter King is exploiting the tragedy to call for even more intrusive suspicionless warrant-less dragnet surveillance of American Muslims:

http://thehill.com/policy/national-security/260220-republican-calls-for…
Republican calls for increased surveillance in Muslim neighborhoods
Bradford Richardson
15 Nov 2015

No surprise that The Donald proposes to "shut down" US mosques (SCOTUS, please sit up and pay attention here, the Constitution needs you right now):

http://thehill.com/policy/national-security/260241-trump-youre-going-to…
Trump: You're going to have to watch and study the mosques
Jesse Byrnes
16 Nov 2015

But here comes a truly disgusting example of a deeply misguided reaction to a terror event, which also illustrates why the scientific and tech community really really *really* need ethical guidelines written wisely, which appeal to reason rather than to adrenaline:

https://www.techdirt.com/
Scientist Bans Use Of His Software By 'Immigrant-Friendly' Countries, So Journal Retracts Paper About His Software
Glynn Moody
16 Nov 2015

> Recently, German scientist Gangolf Jobb declared that starting on October 1st scientists working in countries that are, in his opinion, too welcoming to immigrants -- including Great Britain, France and Germany -- could no longer use his Treefinder software, which creates trees showing potential evolutionary relationships between species. He'd already banned its use by U.S. scientists in February, citing the country’s "imperialism." Last week, BMC Evolutionary Biology pulled the paper describing the software, noting it now "breaches the journal’s editorial policy on software availability."

I can only guess that his simultaneous opposition to US imperialism and to German generosity in welcoming refugees is founded upon some kind of isolationism, which strikes me as perfectly absurd, given Germany's geographical position in the heart of Europe. Views like these are apt to induce a momentary nostalgia for Kaiser Wilhelm and Teddy Roosevelt.

November 24, 2015

Permalink

> Web traffic is very important for the popularity of your website. This can be done in a number of ways including regular press releases and blog posts.

Plus one.

Not sure paying an SEO person is a wise investment of scarce funds, but Tor Project should certainly consider issuing well-crafted press releases, which can include links to the blog.

Before the Snowden leaks, some of us experienced much difficulty in convincing developers working on privacy-enhancing tools to pay enough attention to politics and legal considerations (including trade treaties which will pose a huge threat to us in the coming decades). So thanks again to Snowden, Poitras, Greenwald and other intrepid persons for helping to change the world view of so many all around the world in a way which (contrary to the false claims of our enemies) will help keep everyone safer.

Because governments endanger way more people than "terrorism".

(Ironically, even our enemies now largely admit that IS is functioning as something resembling a nation state.)

It's all of them against all of us.

Do good deeds and try to stay safe, everyone.

November 25, 2015

Permalink

I will gladly give up convenience for privacy! For example, I have been having health problems (might be cancer), but since I am within a couple of years of being able to retire, I would like to read about it online without the company finding out what I am searching for. And yes, I have not thought about the insurance part yet. But the point is that I don't want anyone to find out what I am doing (not even family or friends). I said all that to say THANK YOU, Tor, for everything you do for the common man. More privacy and less convenience for me. Thank you again, Tor.

November 27, 2015

Permalink

Draft 1.2

3. Guidelines for research
You MUST conduct your research without letting anyone learn about its process or results before their PUBLIC disclosure, and you MUST take all possible measures to protect this kind of information. If the adversary does not know you are doing research, he cannot seize its results.

December 01, 2015

Permalink

> I will gladly give up convenience for privacy! For example, I have been having health problems (might be cancer), but since I am within a couple of years of being able to retire, I would like to read about it online without the company finding out what I am searching for.

If you can obtain a copy of Julia Angwin's book Dragnet Nation, you will certainly appreciate what she has to say.

Good luck with the big C (Cancer, not your employer).

December 01, 2015

Permalink

> If the adversary doesn't know you are making research, he is unable to seize its results.

So researchers would not be able to use email, poorly encrypted or non-anonymous texting, or unencrypted phone calls to discuss anything in advance of publication.

A very steep learning curve for medical researchers (and indeed for everyone dealing with sensitive personal information), but as a community we all need to start climbing toward that mountaintop where we hope to find some measure of security and privacy.

It's all of them (governments, data-scrounging mind-manipulating corporations, well-intentioned idiots) against all of us (the People).

December 02, 2015

Permalink

I completely agree that medical research IRBs provide an excellent ready-to-hand standard to build upon, but this happens at a time when our enemies are nixing IRBs too.

Americans might be interested to learn that the 21st Century Cures Act (which has passed the House and is expected to pass the Senate) appears to mandate free access by "researchers" to the complete medical records of all Americans. The Act specifically says that no-one need ask the permission of any IRB, much less seek a search warrant. Some privacy advocates feel the Act essentially repeals the HIPAA Privacy Rule, the last remaining protection under US law for the personal records of Americans.

> the 21st Century Cures Act (which has passed the House and is expected to pass the Senate) appears to mandate free access by "researchers" to the complete medical records of all Americans.

The "mental health reform" bills in the US House (HR-2646) and Senate (S-1945) appear to contain similar language encouraging intelligence agencies to go phishing (again) in the private lives of US citizens not reasonably suspected of having done anything wrong.

Americans should ask their doctor to consider using TM for communicating personal medical information.

December 04, 2015

Permalink

Here is another example of what sounds like irresponsible "research" indiscriminately targeting all Tor users:

https://tor.stackexchange.com/questions/9151/how-can-i-log-all-the-keys…
How can I log all the keys that I contact in Tor?
Shinra
2 Dec 2015

> Is there a command-line switch or set of patches that allow me to dump all of the public keys that pass through my node?

I can only guess what "Shinra" means by "dump all public keys", and I cannot think of any reason why anyone friendly to Tor would want to do that. I presume such an effort would fail with the current Tor architecture, but possibly some Tor dev should spend some time worrying about novel attacks with similar goals.

I suggest that Tor Project try to provide a clearinghouse where Tor users can anonymously (and optionally, non-publicly, since some reports could be de-anonymizing) report suspected attempts to (prepare to) perform ethically questionable or dangerous "observational research" on real people who happen to use Tor. I suggest that the Project be prepared to explain why "observational research" can be just as dangerous as the administration of experimental medications/treatments without obtaining proper informed consent.

August 29, 2017

Permalink

"Don't set up relays that are deliberately dysfunctional (e.g., terminate connections to specific sites)."

In other words, it is not allowed to set up a relay if you terminate connections to, for example, child porn websites? Or to command-and-control servers used by malware? I think the Tor Project team should reassess these rules. They should not protect criminals abusing Tor, but fight them.