Metrics Reloaded

by linda | January 9, 2017


If you haven’t noticed already, https://metrics.torproject.org/ has a new look. The underlying data, graphing engine, and graphs remain the same.

The goal for this project was to make Tor metrics easier to use and more useful. Our process involved usability inspections, feature brainstorming, rough wireframes, and iterative prototypes. This page documents our process in detail.

We restructured, redesigned, and added content to:

  • Alleviate pain points in using the interface for better workflow and navigation.
  • Aggregate resources for journalists, developers, relay operators, and researchers.
  • Increase compatibility with phones and tablets through responsive design.

It’s truly a place where you can learn interesting facts about the Tor network! We’re especially excited about the news page, which lists various world events with measured anomalies. We hope that the operation, development, and research pages help our many valued Tor community members to find the resources they need. Feel free to email metrics-team@lists.torproject.org with suggestions.

This work was sponsored by Mozilla's Open Source Support. The objectives were to 1) determine the usability of Tor Metrics and 2) address the most pressing usability issues identified (milestones 6.1 and 6.2 of this contract).

Comments

Please note that the comment area below has been archived.

We're very proud of it! The design was the result of the metrics and UX teams collaborating on the layout, previous volunteer designers who contributed to the style guide, and a very meticulous web developer (Rafe!) who added additional flair to the site.

January 09, 2017

Permalink

Very neat IMO. One question: When will the "Applications: How many Tor applications, like Tor Browser, have been downloaded or updated." section be available?

So you're going to plant a tracker like Tails does. Keeping logs under any conditions is just bad. So your servers are taking logs to silently count or collect data about visitors, right?

karsten

January 10, 2017

In reply to by Anonymous (not verified)

Permalink

No need to return your monitor, we indeed ran out of distinguishable colors a few versions ago ("Other", "0.1.0", "0.1.1", "0.1.2", "0.2.0", "0.2.1", "0.2.2", "0.2.3", "0.2.4", "0.2.5", "0.2.6", "0.2.7", "0.2.8", "0.2.9"). At some point we'll have to switch to a better visualization, or maybe we should have done that a few versions ago. If you have suggestions and/or some experience with R/ggplot2, please consider contributing!

January 09, 2017

Permalink

Those reCAPTCHAs are irritating when visiting sites with Tor.

Yep.
The problem is that Cloudflare is mostly used by small websites (blog, forum, shitty press), which are the targets of script kiddies hiding behind Tor. It is not because of DoS/DDoS that Cloudflare blocks Tor, imo.

Anyway, they do it for business and this is not ethical. Shame on them.

January 10, 2017

Permalink

No, I am one of only twenty, that's not good, so I am an elitist? lol

karsten

January 11, 2017

In reply to by Anonymous (not verified)

Permalink

Good point. We have a deliverable where we're going to analyze such issues: "2017-01: Perform an analysis on reducing the amount of sensitive, potentially personally identifying data stored in memory of Tor relays and bridges or reported to the directory authorities. (Sponsor X 4.1. Tor daemon)". I'll make sure that we're looking into this example, too.

January 11, 2017

In reply to karsten

Permalink

thank you

Using Tor will hide you within a particular anonymity set. For example, say you are from New Zealand and released some sensitive documents: you can hide who you are by using Tor, but you can't hide that you are using Tor. If we broadcast that only a small handful of people use Tor there, someone could easily conclude that the leaker is one of 20 people. To be clear, all that we leak is "20 people in New Zealand used Tor in a given amount of time"; we don't even store any sensitive information that could be hacked, that we could be coerced into giving up, etc. But assuming an adversary had the resources to find out who was using Tor, knowing that there is only a small number of leads may encourage them to pursue whoever was responsible for leaking those documents (and presumably do something not good to you afterwards).

We post what we consider to be non-sensitive and non-risky data onto metrics.torproject.org. Some data is obviously sensitive (unencrypted data passed through exit relays), some is probably safe (number of relays in the Tor network), and some is in between (number of Tor users in a country). Publishing per-country user numbers is usually safe for countries where hundreds of thousands of people use Tor, but can potentially be unsafe in others. Karsten mentioned above that we are working on how to decide when it is safe to post such information.

January 11, 2017

In reply to linda

Permalink

so you have sold your users... the same as "we have just one user from your city right now"

We do not provide metrics on users by city.

We do our best to protect users as much as we can. We have protocols that obfuscate traffic to look like random noise or other non-Tor protocols. We have bridges that we try to keep secret.

In research, it is clearly stated that Tor provides you with anonymity within a particular anonymity set--we will take your comment as a cue to better educate users.

arma

January 11, 2017

In reply to linda

Permalink

Also, remember that each relay adds some noise before it publishes its aggregated 24-hour summary. So that number 20 comes from an aggregate of intentionally noisy numbers. I think it's fair to say that the number 20 means that the actual number is somewhere between 5 and 32, and there isn't really any way for *us* to have a better idea than that, because the relays only publish the noisy number.

So yes, a small number might be bad news for a variety of reasons, but the "Tor said there were 20, and we know about 19, so you must be the 20th" scenario would not be a correct use of the data.

The even better idea here would be to use the newer fancier privacy-preserving counting mechanisms, where we can get a single aggregate number, with an appropriate amount of noise added, across all relays, without anybody (us or others) being able to know the numbers that each relay contributes. For an example of that approach, check out PrivCount:
http://www.robgjansen.com/blog/2016/10/23/introducing-privcount-for-saf…
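
To illustrate why an aggregate of separately obfuscated per-relay numbers only pins down a range, here is a toy sketch. It assumes, purely for illustration, that each relay rounds its count up to a multiple of a bin size; the bin size of 8 and both function names are hypothetical and are not a description of what Tor relays or PrivCount actually do.

```python
def round_up_to_bin(count: int, bin_size: int = 8) -> int:
    """Round a non-negative count up to the next multiple of bin_size."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # Ceiling division, then scale back up to a multiple of bin_size.
    return -(-count // bin_size) * bin_size


def true_total_bounds(reported: list[int], bin_size: int = 8) -> tuple[int, int]:
    """Return the (lowest, highest) true totals consistent with the reports.

    A report r > 0 could stand for any true count in [r - bin_size + 1, r];
    a report of 0 can only stand for 0.
    """
    low = sum(max(r - bin_size + 1, 0) for r in reported)
    high = sum(reported)
    return low, high


# Made-up per-relay counts, chosen only to show the effect.
true_counts = [3, 1, 9]
reported = [round_up_to_bin(c) for c in true_counts]
low, high = true_total_bounds(reported)
print(f"published sum: {sum(reported)}, true total is between {low} and {high}")
```

Under these assumptions an observer sees only the published sum and can recover a range for the true total, never the exact value.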

linda

January 11, 2017

In reply to arma

Permalink

This is an important clarification I missed in my comment above.

January 12, 2017

In reply to linda

Permalink

Interesting! Thanks to you and also the commenter that asked the question.

Not arguing any point of view, but I have a question: what kind of adversary would it take to obtain the same information on their own? Anyone running a Tor client (via directory authority requests, consensus, etc.)? Anyone running a directory authority? Anyone watching the upstreams of some number of Tor clients/relays? Anyone harvesting or having access to (e.g. Google) information about many bridges? In other words, what is privileged about metrics.torproject.org with regard to the information it obtains and publishes?

NZ is also a Five Eyes member, so when information comes out that only twenty people are using a bridge type, it's not great news if you are one in twenty.

January 11, 2017

Permalink

How do I see the average throughput of middle relays, exit relays, and bridges, individually? I run some relays and I'm not exactly sure what kind they should be, based on their bandwidth. For example, I don't want to waste a fast connection by running a bridge, but I don't want to waste a potential bridge IP address by running a slow middle relay. I think the recommendation for bridges is <10Mbps (1.25MB/s), but I don't know how old this number is or if it is based on any actual metrics.

You can't see this information on metrics.torproject.org, because it was not designed to display individual relay information nor to be used as a personal diagnostic tool. It was designed to show the aggregate statistics and give an idea of how the Tor network as a whole is doing.

Atlas.torproject.org will show you after-the-fact information about bandwidth use. If you type in your nickname or a fingerprint, Atlas should give you the information that you need for the relays that you run. It can also look up specific relays and bridges, even the ones you do not run (in fact, we use this to collect data for Tor metrics). Some information for bridges is scrubbed out on Atlas for security reasons, but the bandwidth use is not.

Arm (https://www.torproject.org/projects/arm.html.en) is a real-time status monitor that lets you visualize resource use, relaying, events, and more. However, it is several years old, only compatible with Linux, and due for a replacement (the replacement is under development but not ready yet).

There are also other, per-OS tools to see how much bandwidth a machine you are running is using. We don't have any particular recommendations here, but want to let you know that it's an option.

January 12, 2017

In reply to linda

Permalink

Thanks for the reply! I've used Atlas but I forgot about it somehow. Anyway, Atlas appears to show bandwidth for only one node at a time. I was wondering if there's a convenient way to see, e.g., "in the set of all nodes that are a bridge/middle/exit, the average/median bandwidth for an individual node from the set is x MB/s". Actually, if the data is downloadable in CSV or a similar format, I could calculate that trivially with a shell script (or even Excel, I guess), but screen scraping and parsing HTML is a little out of the question for this purpose.
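
If such a CSV export existed, the calculation could look roughly like the sketch below. The column names `type` and `bandwidth_mbs` and the sample rows are assumptions made up for illustration; they are not a real Tor Metrics or Atlas format, and the numbers are not real measurements.

```python
import csv
import io
import statistics
from collections import defaultdict


def bandwidth_summary(csv_text: str) -> dict[str, tuple[float, float]]:
    """Return {node_type: (mean, median)} of per-node bandwidth in MB/s.

    Expects a header row with the hypothetical columns
    'type' and 'bandwidth_mbs'.
    """
    by_type: dict[str, list[float]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_type[row["type"]].append(float(row["bandwidth_mbs"]))
    return {
        node_type: (statistics.mean(values), statistics.median(values))
        for node_type, values in by_type.items()
    }


# Made-up sample data, only to exercise the function.
SAMPLE = """type,bandwidth_mbs
bridge,0.5
bridge,1.5
middle,4.0
middle,6.0
middle,20.0
exit,10.0
"""

summary = bandwidth_summary(SAMPLE)
for node_type, (mean, median) in summary.items():
    print(f"{node_type}: mean {mean:.2f} MB/s, median {median:.2f} MB/s")
```

Reporting the median next to the mean helps here because a few very fast nodes can pull the mean well above what a typical node carries.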