How Bandwidth Scanners Monitor The Tor Network

by juga | April 11, 2019

The Tor network is composed of thousands of volunteer-run relays around the world, and millions of people rely on it for privacy and freedom online every day. To monitor the Tor network's performance, detect attacks on it, and better distribute load across the network, we employ what we call Tor bandwidth scanners. The bandwidth scanners are run by the directory authorities (dirauths).

Tor relays report their own bandwidth based on the traffic they have sent and received. But this reported bandwidth is not verified by other relays. Bandwidth scanners help verify relay bandwidths. They also provide some initial traffic to new relays, so those relays can report a useful amount of bandwidth.

Torflow was the first Tor bandwidth scanner, started in 2011. Over time, it has become more difficult to install and to maintain, because the libraries it was built with are no longer maintained. In 2018, we started to develop "Simple Bandwidth Scanner" (sbws) using more modern and maintained libraries. Right now, out of nine dirauths, six are bandwidth authorities, which means they run bandwidth scanners. (There is also one bridge authority. It doesn't do bandwidth scanning.)

Tor Bandwidth Authorities

sbws chooses two relays and builds a path between them. One relay is the target of the sbws measurement. The other is a randomly chosen relay that is faster than the target relay. The scanner downloads data from a web server through this path between the relays. It measures the bandwidth as the amount of data downloaded divided by the time the download took. Every hour, the scanner filters out invalid measurements, then aggregates and scales the valid ones. Finally, it writes a bandwidth file with all the relays' bandwidths. The directory authorities read this file and vote on the relays' bandwidths.
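The measurement steps described above can be pictured with a minimal Python sketch. This is not sbws code: the `Relay` class and the `pick_helper`, `measured_bandwidth`, and `aggregate` functions are hypothetical names chosen for illustration, and the use of a plain mean to aggregate measurements is an assumption, since the post does not say how sbws combines them.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Relay:
    """Hypothetical relay record, for illustration only."""
    fingerprint: str
    bandwidth: int  # bytes per second, as currently known to the scanner


def pick_helper(target, relays, rng=random):
    """Pick a random relay that is faster than the target, or None if there is none."""
    faster = [
        r for r in relays
        if r.fingerprint != target.fingerprint and r.bandwidth > target.bandwidth
    ]
    return rng.choice(faster) if faster else None


def measured_bandwidth(bytes_downloaded, seconds):
    """Bandwidth of one download: data downloaded divided by the time it took."""
    if seconds <= 0:
        raise ValueError("download time must be positive")
    return bytes_downloaded / seconds


def aggregate(measurements):
    """Drop invalid measurements (None or non-positive) and average the rest.

    Using the mean here is an assumption made for this sketch.
    """
    valid = [m for m in measurements if m is not None and m > 0]
    return sum(valid) / len(valid) if valid else None
```

For example, a target relay with a known bandwidth of 100 would only ever be paired with relays whose bandwidth is above 100, and a failed download (recorded here as `None`) would be ignored when the hourly aggregate is computed.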

Torflow divides the network into partitions based on relay bandwidth, so some relays can end up stuck in a low-bandwidth partition. Unlike Torflow, sbws does not divide relays into partitions, so relays can't get stuck in a slow partition.

Recent Updates

To reach a consensus about a relay's bandwidth, as reported by the scanners, tor uses the median of at least three of their votes. Right now, there is only one directory authority running sbws, and five run Torflow. We plan to have three authorities running sbws by the end of April, so we'll start to see the effects of the changes to sbws soon. If all goes well, we'll eventually want all dirauths to switch from Torflow to sbws.
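As a rough illustration of the voting rule above, here is a small Python sketch. The function name `consensus_bandwidth` and the constant `MIN_VOTES` are hypothetical. Two details are assumptions, because the post does not state them: returning `None` when fewer than three votes are available, and taking the lower middle value when the number of votes is even.

```python
from statistics import median_low

MIN_VOTES = 3  # the post says at least three votes are needed


def consensus_bandwidth(votes):
    """Return the median of the measured bandwidth votes for one relay.

    `votes` holds one entry per bandwidth authority; None means that
    authority did not report a measurement for the relay.
    """
    measured = [v for v in votes if v is not None]
    if len(measured) < MIN_VOTES:
        # What tor does in this case is not described in the post;
        # this sketch simply reports that no measured value is available.
        return None
    return median_low(measured)
```

A median means that a single scanner reporting an unusually high or low value cannot, on its own, move the consensus value for a relay.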

The latest version of sbws reports all relays that it has seen, including ones that it could not measure. This will help us to diagnose issues and anomalies in the relays, the network, and the software itself. It will also help to answer relay operators' questions about their relays' consensus weight and bandwidth.

In the next few months, we will start archiving the bandwidth files from sbws and Torflow using CollecTor. Once the directory authorities start running Tor version 0.4.0.4-alpha or later, CollecTor can ask them for their bandwidth files. This will increase transparency while preserving anonymity, since the reported bandwidth values are aggregated from multiple measurements.

We also changed tor so it includes the bandwidth file hash and bandwidth file headers in the directory authority votes. In this diagram you can see how the directory documents are related.

Before this change, it was possible to know the bandwidths that were reported by a scanner in a vote, but we could not know which bandwidth file corresponded to which vote. The bandwidth file headers can also help to debug bandwidth file and vote issues.
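To show why a hash in the vote makes this matching possible, here is a minimal Python sketch. It is not tor or CollecTor code: the function names are hypothetical, and the choice of SHA-256 with a hexadecimal encoding is an assumption made for illustration, since the post only says that the vote includes a bandwidth file hash.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Digest of a bandwidth file's contents (SHA-256 hex, assumed for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def find_matching_file(vote_digest, candidate_files):
    """Return the name of the archived bandwidth file whose digest matches the vote.

    `candidate_files` maps a file name to that file's raw bytes.
    Returns None when no archived file matches.
    """
    for name, data in candidate_files.items():
        if file_digest(data) == vote_digest:
            return name
    return None
```

Given a set of archived bandwidth files and the digest found in a vote, a lookup like this identifies the exact file the authority read when it produced that vote.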

We wrote a specification for the bandwidth file format. This way, others can develop parsers to obtain metrics, or develop compatible bandwidth scanners.

Future work

There are still several engineering and research improvements that can be made.

So far, sbws scales the raw bandwidth measurements in the same way as Torflow. Scaling is needed in order to balance the load in the network.
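One way to picture what scaling does is the following Python sketch. It is a simplified, hypothetical ratio-based scheme and should not be read as the formula that Torflow or sbws actually uses: each relay's self-reported bandwidth is multiplied by the ratio of its raw measurement to the network-wide mean measurement. The function name `scale` and both input dictionaries are assumptions made for this example.

```python
def scale(measured, reported):
    """Scale each relay's self-reported bandwidth by its relative measured speed.

    `measured` maps fingerprint -> raw scanner measurement.
    `reported` maps fingerprint -> bandwidth the relay reported about itself.
    Hypothetical scheme for illustration only.
    """
    if not measured:
        return {}
    mean = sum(measured.values()) / len(measured)
    return {
        fp: reported[fp] * (raw / mean)
        for fp, raw in measured.items()
    }
```

Under this scheme, a relay that measures faster than the network average gets more weight than it reported, and a relay that measures slower gets less, which shifts client load toward relays that have spare capacity.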

The measurement and scaling of bandwidth weights should achieve an equilibrium goal. For instance, the user should experience consistent performance, regardless of the relays that their tor client has randomly chosen.

sbws is decentralized in the sense that there will be several instances of it running, but each of these instances is a single point of failure. Any shared servers or DNS infrastructure are also single points of failure.

Tor needs a minimum of three bandwidth authorities, and we have six bandwidth authorities running right now. We hope that sbws will be easy for directory authority operators to deploy, so we might have seven or eight authorities running a bandwidth scanner in the future.

sbws is still vulnerable to denial-of-service attacks and traffic manipulation, as explained in the 2018 research post.

Get Involved

We would be grateful to anyone who could help to improve the scanner. We encourage you to open tickets, base your implementations on sbws, develop a compatible external application programming interface, or extend the existing bandwidth file format. If implementations use similar code and data formats, it will be easier for Tor to use them, maintain them, and generate metrics from them.

If you want to dive deeper into some of the current bandwidth data, take a look at:

Acknowledgements

sbws has been developed in conjunction with the Tor network team and the Tor community. It has been partly funded by the Prototype Fund under the name OnBaSca.

Comments

Please note that the comment area below has been archived.

April 11, 2019

Wait, there are only 9 dirauths and only 1 bridge auth out of about 6300 relays? Isn't it bad to have so few? Are these single points of failure related to why users sometimes can't connect to any onion sites?

Before version 3 of the Tor directory protocol, each of the directory authorities was indeed a single point of failure. Now, though, they vote on the state of the network, which means that adding directory authorities can increase trust rather than introduce more single points of failure. Even when all the directory authorities are offline, the network can still survive for a time using other relays as caches of the network status. We would like to increase the number of bridge authorities in the future, and work is ongoing to allow that. The bridge authority does not need to be online for users to use bridges; it is only needed for users to learn about new bridges if they do not already know of any, or if the ones they are using are blocked.

April 13, 2019

Which, if any, of the software listed here can be used by a normal Tor user and automatically reports the bandwidth results to the appropriate server? Because it sounds like only the bandwidth authorities are the ones reporting the results.

Thanks :)

Without the bandwidth authorities, we would have difficulty balancing load over the Tor network. Not all relays are equal, so we cannot distribute users uniformly over the relays. Because the bandwidth authorities inform load balancing, every user of the network benefits from them indirectly, even without running the code or interacting with it directly. We do not collect metrics from clients, including the tor client software or Tor Browser Bundle, as we do not yet have a means of performing this data collection in a way we consider to be safe. Each Tor relay aggregates the bandwidth used across all of its users, which allows us to collect bandwidth usage statistics in a safer way, but we never see individual user bandwidth statistics.