Tor descriptors à la carte: Tor Metrics Library 2

by karsten | June 29, 2017

 

We're often asked by researchers, users, and journalists for Tor network data. How can you find out how many people use the Tor network daily? How many relays make up the network? How many times has Tor Browser been downloaded in your language? To answer these questions from archived data, we have to continuously fetch, parse, and evaluate Tor descriptors. We do this with the Tor Metrics Library.

Today, the Tor Metrics Team is proud to announce major improvements and launch Tor Metrics Library version 2.0.0. These improvements, supported by a Mozilla Open Source Support (MOSS) “Mission Partners” award, enhance our ability to monitor the performance and stability of the Tor network.

From internal tool to public resource 

Originally, the library was an internal tool. In all of our Java-based codebases, we used it to fetch the latest descriptors archived by CollecTor and to parse descriptors published by Tor relays, bridges, directory authorities, and other parts of the public Tor network.

Over the years, we've added more data sources and made it into a publicly available resource. Our data has been used in many ad-hoc analyses, as well as in Atlas, ExoneraTor, and Tor Metrics.

Better memory-efficiency, fewer bugs 

This launch brings numerous improvements: a simpler interface, better memory efficiency, support for newly added descriptor parts and, last but not least, bugfixes. You can check out the change log for a complete overview.

A few months ago, the library found a home on the recently reorganized Tor Metrics website. There you'll find tutorials for getting started with the library: downloading descriptors from CollecTor and performing two simple analyses that determine the current relay capacity by Tor version and the frequency of bridge transports. The project page also contains links to all releases, the full change log, and the latest JavaDocs.
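To give a flavor of the first analysis, here is a minimal sketch of the aggregation step behind "relay capacity by Tor version". It deliberately does not use the Tor Metrics Library API; it only relies on the Java standard library and on the "platform" and "bandwidth" lines found in relay server descriptors. The class name, the method names, and the sample descriptors are made up for illustration, and taking the minimum of rate, burst, and observed bandwidth as the advertised bandwidth is an assumption of this sketch. The tutorials on the Tor Metrics website show how to do this with the library itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: sums advertised bandwidth per Tor version prefix. */
public class RelayCapacityByVersion {

  /** Returns e.g. "0.3.0" for a "platform Tor 0.3.0.8 on Linux" line, else "unknown". */
  static String versionPrefix(String descriptor) {
    for (String line : descriptor.split("\n")) {
      String[] parts = line.trim().split("\\s+");
      if (parts.length >= 3 && parts[0].equals("platform")
          && parts[1].equals("Tor")) {
        String[] digits = parts[2].split("\\.");
        if (digits.length >= 3) {
          return digits[0] + "." + digits[1] + "." + digits[2];
        }
      }
    }
    return "unknown";
  }

  /**
   * Reads "bandwidth <rate> <burst> <observed>" and returns the smallest of
   * the three values (assumed here to be the advertised bandwidth), or 0 if
   * the line is missing or malformed.
   */
  static long advertisedBandwidth(String descriptor) {
    for (String line : descriptor.split("\n")) {
      String[] parts = line.trim().split("\\s+");
      if (parts.length == 4 && parts[0].equals("bandwidth")) {
        try {
          long rate = Long.parseLong(parts[1]);
          long burst = Long.parseLong(parts[2]);
          long observed = Long.parseLong(parts[3]);
          return Math.min(Math.min(rate, burst), observed);
        } catch (NumberFormatException e) {
          return 0L;
        }
      }
    }
    return 0L;
  }

  /** Sums advertised bandwidth per version prefix, sorted by version string. */
  static Map<String, Long> capacityByVersion(List<String> descriptors) {
    Map<String, Long> totals = new TreeMap<>();
    for (String descriptor : descriptors) {
      totals.merge(versionPrefix(descriptor),
          advertisedBandwidth(descriptor), Long::sum);
    }
    return totals;
  }

  public static void main(String[] args) {
    // Hypothetical, heavily shortened descriptors; real ones contain many more lines.
    List<String> sample = Arrays.asList(
        "platform Tor 0.3.0.8 on Linux\nbandwidth 1000 2000 500",
        "platform Tor 0.3.0.9 on FreeBSD\nbandwidth 300 400 900",
        "platform Tor 0.2.9.11 on Linux\nbandwidth 50 60 70");
    for (Map.Entry<String, Long> entry : capacityByVersion(sample).entrySet()) {
      System.out.println(entry.getKey() + " " + entry.getValue());
    }
  }
}
```

In a real analysis the descriptor strings would come from files downloaded from CollecTor rather than from a hard-coded list, and the library's parsed descriptor objects would replace the manual line splitting shown here.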

You can be a part of what's next 

“Tor metrics are the ammunition that lets Tor and other security advocates argue for a more private and secure Internet from a position of data, rather than just dogma or perspective.” 
—Bruce Schneier (June 1, 2016)

As always, if you're a developer doing something cool with Tor network data, please let us know which features you find valuable so we can continue to support them. And if we think other people could learn from your project, we could feature it on the Tor Metrics website.

Happy à la carte descriptor collecting, reading, and parsing with Tor Metrics Library 2. Bon appétit!

Comments

Please note that the comment area below has been archived.

June 30, 2017


How many times has Tor Browser been downloaded in your country?

Can one really fetch the number of Tor Browser downloads per country? If so, that would be really awesome!

Uhm, no, that's not possible. We do have numbers on Tor Browser downloads by locale, which is related but not the same. Looks like this sentence slipped in during one of the edits and nobody noticed. Oops. Changed to say "How many times has Tor Browser been downloaded in your language?" which is probably easier to understand than locale. Thanks for pointing this out!

July 02, 2017


"You're a data person and only trust the statistics that you doctored yourself? Here's all the data right from the source, doctor."

Hahahaha... How does the same data look in the eyes of "a data person" and "a non-data person"? Here's the illustration:
+) Data in the eyes of a non-data person: :[[... B@$$ @$$!... #$^^&&*%$^&...
+) For a data person: ^.^... $$... $$$$... $$$$$$... $$$$$$$$$$$$$$$$$$$... xD