EFF's Panopticlick and Torbutton

by mikeperry | January 29, 2010

The EFF has recently released a browser fingerprinting test suite that they call Panopticlick. The idea is that in normal operation, your browser leaks a lot of information about its configuration, which can be used to uniquely fingerprint you independently of your cookies.

Because of how EFF's testing tool functions, it has created some confusion and concern among Tor users, so I wanted to make a few comments to try to clear things up.

First off, Torbutton has defended against these and other types of attacks since the 1.2.0 series began. We make the User Agent of all Torbutton users uniform; we block all plugins, both to prevent proxy bypass conditions and to block subtler forms of plugin tracking; we round screen resolution down to multiples of 50 pixels; we set the timezone to GMT; and we clear and disable DOM Storage.
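To make the resolution rounding concrete, here is a minimal sketch of rounding a dimension down to a multiple of 50 pixels. This is not Torbutton's actual code; the function names `round_down` and `reported_size` are made up for illustration.

```python
from typing import Tuple


def round_down(value: int, step: int = 50) -> int:
    """Round value down to the nearest multiple of step."""
    return (value // step) * step


def reported_size(width: int, height: int, step: int = 50) -> Tuple[int, int]:
    """Return the (width, height) a page would see after rounding.

    Hypothetical helper: it only illustrates the rounding rule
    described in the post, not how Torbutton applies it.
    """
    return round_down(width, step), round_down(height, step)


print(reported_size(1366, 768))  # (1350, 750)
```

Every size between 1350 and 1399 pixels wide reports the same width, so many slightly different windows collapse into one bucket.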

In fact, based on my display resolution calculations, we should be presenting only just over 7 bits of information with which to fingerprint Tor users, and this is only in the form of window size, which for most users either changes from day to day or is set to a common maximized display size.
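For readers wondering how a "bits" figure like this is computed in general, the usual measure is the Shannon entropy of the distribution of observed values. The sketch below does not reproduce my display resolution calculation; the sample window sizes are invented purely for illustration.

```python
import math
from collections import Counter


def entropy_bits(observations):
    """Shannon entropy, in bits, of the empirical distribution of observations."""
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical samples of rounded window sizes (not real measurements).
evenly_spread = [(1000, 700), (1050, 700), (1200, 900), (1350, 750)]
mostly_common = [(1000, 700)] * 7 + [(1350, 750)]

print(entropy_bits(evenly_spread))  # 2.0: four equally likely values
print(entropy_bits(mostly_common))  # below 1 bit: most users share one value
```

The second sample shows why common maximized sizes help: the more users land on the same value, the fewer bits that value reveals.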

Why then does EFF's page tend to tell Tor users that they are unique amongst the hundreds of thousands of users that have been fingerprinted so far? The answer has largely to do with selection bias. The majority of visitors to EFF's site are likely not Tor users. Thus Torbutton's protection mechanisms tend to make Tor users stand out as unique compared with the rest of the web. This is not as bad as it seems. Torbutton's protection mechanisms are only meant to make all Tor users look uniform amongst themselves, not to make them look like the rest of the web's users. After all, Tor users are already identifiable because they appear from easily recognizable Tor exit node IP addresses.
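The selection bias can be put in numbers with the self-information formula, where a fingerprint shared by a fraction p of a population carries -log2(p) bits. The populations and counts below are assumptions chosen only to show the effect, not Panopticlick data.

```python
import math


def surprisal_bits(matching: int, population: int) -> float:
    """Bits of identifying information for a fingerprint shared by
    `matching` members of a reference `population` (matching must be > 0)."""
    return -math.log2(matching / population)


# Hypothetical numbers, for illustration only.
# Measured against all visitors, a Torbutton fingerprint looks rare:
print(surprisal_bits(1, 500_000))   # roughly 18.9 bits
# Measured against other Tor users only, the same fingerprint is common:
print(surprisal_bits(1000, 8000))   # 3.0 bits
```

The fingerprint is the same in both calls; only the reference population changes, and that is the comparison that matters for Tor users.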

What's more, these protections are of course not enabled while Tor is disabled. In fact, one of Torbutton's design requirements is to not provide any evidence that you are a Tor user during normal operation.

I'd like to commend the EFF for bringing these web fingerprinting details to the public eye in a way that I unfortunately was unable to do when I first developed protections for them.

However, I wish that they had also included, or at least referenced, URL history disclosure information with their tool. After all, if you have history enabled (and you haven't set Torbutton to block history reads during Non-Tor usage), each URL you visit adds another bit to the set that can be used to fingerprint you. Often these bits are extremely sensitive, such as which diseases and genetic conditions you have, as revealed by your Wikipedia or Google Health URL history. I am convinced that it is only a matter of time before the ad networks begin mining this data to provide targeted ads for over-the-counter and prescription medications and to sell this data to other marketing and insurance firms, if they don't do it already.
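As a rough illustration of the "one bit per URL" point, a site that can test a list of URLs against your history ends up with a bit vector like the one below. The URLs and the `history_bits` helper are hypothetical, and the sketch covers only the data representation, not how the history is read.

```python
def history_bits(probe_urls, visited):
    """Return one bit per probed URL: 1 if it is in the visited set, else 0."""
    return [1 if url in visited else 0 for url in probe_urls]


# Hypothetical probe list and history, for illustration only.
probes = [
    "https://example.org/condition-a",
    "https://example.org/condition-b",
    "https://example.org/news",
]
visited = {"https://example.org/condition-b", "https://example.org/news"}

bits = history_bits(probes, visited)
print(bits)                 # [0, 1, 1]
print(2 ** len(probes))     # 8 distinguishable patterns from 3 probes
```

Each extra probed URL doubles the number of possible patterns, which is why a long probe list can separate users so well.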

Comments

Please note that the comment area below has been archived.

I noticed that Torbutton sets the window size to be equal to the screen size, but this is very unusual. Wouldn't it be better to set the screen size to be a standard screen size, as close to the window size as possible?

This is likely to leak information in the form of the amount of overhead you need. Different platforms and devices need different amounts of decoration overhead, so the difference between screen size and window size varies between them. Better to behave as if this overhead is always 0. My feeling was that webapps really only need to know the total size available to the render window, and should behave as if this is the maximal size available for them to work with anyway. I've always hated websites that try to increase the window size to utilize more of your available desktop, which seems to be the only use case for this information that I can think of. They should be working with the space you have given them.

I wonder whether it would be possible to plug holes opened up by plugins, e.g. font enumeration via Flash?

Yes, it is possible, but not really feasible, because you have to operate with these plugins on a binary level, not on a JavaScript or even programmable API level. I have written prototype code to instrument Flash on Windows to prevent it from making its own socket calls, for example. It is even possible to do this in a relatively cross-platform way, i.e. to abstract the hooking procedures for different binary formats. However, making this stable and keeping it from running afoul of antivirus software on various platforms is a daunting task, not to mention working with plugins that operate partially out of process, like Adobe's acroread.

Would you mind publishing this prototype code? I am interested in this subject.

Windows XP Pro and Firefox 3.6

I downloaded the latest installation bundle of Tor for Windows:
http://www.torproject.org/easy-download.html.en

Polipo.exe is making it difficult to reach the internet, though Tor and Vidalia seem to work OK and connect properly.
I unchecked Polipo in Torbutton and found that everything seemed to work OK.
If I do not use Polipo at all, is there any disadvantage?
How can I get Polipo working?

In an earlier version of Tor there was Privoxy, and it gave me no problems.

Sorry if this is not the correct site, but I could not find anything else.

IIRC Firefox used to have a DNS leaking problem, and an additional proxy took care of that. Polipo was chosen instead of Privoxy for speed; I also think later versions of Privoxy had some leaking problems of their own.

Panopticlick didn't include the CSS site history thing because Peter felt that visited URL history changes too frequently (e.g. when users visit new sites, or when old visited sites expire from their history) to be a reliable and stable identifier. This is different from other browser properties which don't change under normal day-to-day browsing behavior but only when the user makes a configuration change to their computer.

However, another researcher has been looking into how reliably site history can be used with fuzzy matching techniques, so there should be some preliminary experiments and data about this soon. There is obviously some kind of trade-off between the false positive rate and the false negative rate in recognizing users from their fingerprints, depending on which data sources you integrate into the fingerprint and how you match them. Using visited site history would probably decrease false positives at the cost of increasing false negatives, but it will be interesting to see data that shows whether it might be worth it.

The Panopticlick site also explains that Panopticlick is not aiming to catalogue every metric that might be used for device or browser fingerprinting, but just to gather some hard data on the distribution of some specific browser features to make their tracking potential more concrete.

Yes - interesting. I suppose the EFF was right in that they have plenty of bits to work with as it is.

I've always suspected that the best way to do this is to make a bit vector space and to classify users by their nearest neighbour in the space, but I suppose it is a difficult problem to decide when to create a new user entry versus classifying them as being close to an existing user.
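A rough sketch of what I mean, assuming fixed-length bit vectors, Hamming distance as the metric, and a hand-picked distance threshold for the "new user or existing user" decision. All names here are made up, and the threshold is exactly the hard part.

```python
def hamming(a, b):
    """Number of positions where two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(fingerprint, known, max_distance):
    """Return the id of the nearest known user if it is within max_distance;
    otherwise register the fingerprint as a new user and return the new id.

    `known` maps sequential integer ids (0, 1, 2, ...) to bit vectors and is
    updated in place. The threshold is an assumption, not a tuned value.
    """
    best_id, best_dist = None, None
    for user_id, vector in known.items():
        dist = hamming(fingerprint, vector)
        if best_dist is None or dist < best_dist:
            best_id, best_dist = user_id, dist
    if best_id is not None and best_dist <= max_distance:
        return best_id
    new_id = len(known)
    known[new_id] = list(fingerprint)
    return new_id


known_users = {}
first = classify([1, 0, 1, 1], known_users, max_distance=1)   # new user
second = classify([1, 0, 1, 0], known_users, max_distance=1)  # one bit off
third = classify([0, 1, 0, 0], known_users, max_distance=1)   # far away
print(first, second, third)  # 0 0 1
```

A larger threshold merges more visits into existing users (more false positives), while a smaller one creates more new entries (more false negatives).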

Between this and CCleaner (Piriform), which keeps everything clean and empty, I'm not concerned about the media part lol