Traffic correlation using netflows

by arma | November 15, 2014

People are starting to ask us about a recent tech report from Sambuddho's group about how an attacker with access to many routers around the Internet could gather the netflow logs from these routers and match up Tor flows. It's great to see more research on traffic correlation attacks, especially on attacks that don't need to see the whole flow on each side. But it's also important to realize that traffic correlation attacks are not a new area.

This blog post aims to give you some background to get you up to speed on the topic.

First, you should read the first few paragraphs of the "One cell is enough to break Tor's anonymity" analysis:

First, remember the basics of how Tor provides anonymity. Tor clients route their traffic over several (usually three) relays, with the goal that no single relay gets to learn both where the user is (call her Alice) and what site she's reaching (call it Bob).

The Tor design doesn't try to protect against an attacker who can see or measure both traffic going into the Tor network and also traffic coming out of the Tor network. That's because if you can see both flows, some simple statistics let you decide whether they match up.

Because we aim to let people browse the web, we can't afford the extra overhead and hours of additional delay that are used in high-latency mix networks like Mixmaster or Mixminion to slow this attack. That's why Tor's security is all about trying to decrease the chances that an adversary will end up in the right positions to see the traffic flows.

The way we generally explain it is that Tor tries to protect against traffic analysis, where an attacker tries to learn whom to investigate, but Tor can't protect against traffic confirmation (also known as end-to-end correlation), where an attacker tries to confirm a hypothesis by monitoring the right locations in the network and then doing the math.

And the math is really effective. There are simple packet counting attacks (Passive Attack Analysis for Connection-Based Anonymity Systems) and moving window averages (Timing Attacks in Low-Latency Mix-Based Systems), but the more recent stuff is downright scary, like Steven Murdoch's PET 2007 paper about achieving high confidence in a correlation attack despite seeing only 1 in 2000 packets on each side (Sampled Traffic Analysis by Internet-Exchange-Level Adversaries).
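To make the "simple statistics" concrete, here is a minimal sketch of a packet-counting correlation. It is not the method of any of the papers named above: the synthetic bursty flows, the 1-second window, the 0.1 s latency, and all function names are assumptions chosen only for illustration.

```python
import math
import random

def window_counts(timestamps, window, duration):
    """Count packets in each fixed-size time window of a flow."""
    n = math.ceil(duration / window)
    counts = [0] * n
    for t in timestamps:
        if 0 <= t < duration:
            counts[min(int(t // window), n - 1)] += 1
    return counts

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def bursty_flow(rng, duration):
    """Synthetic flow (assumption): packet bursts separated by idle gaps."""
    t, out = 0.0, []
    while t < duration:
        t += rng.expovariate(0.5)            # idle gap, mean 2 s
        for _ in range(rng.randint(5, 40)):
            t += rng.expovariate(200.0)      # packet spacing inside a burst
            if t < duration:
                out.append(t)
    return out

rng = random.Random(1)
duration, window = 60.0, 1.0
entry = bursty_flow(rng, duration)
# The same flow seen on the far side: shifted by latency plus a little jitter.
exit_same = [t + 0.1 + rng.uniform(0, 0.05) for t in entry]
exit_other = bursty_flow(rng, duration)      # an unrelated flow

entry_counts = window_counts(entry, window, duration)
r_same = pearson(entry_counts, window_counts(exit_same, window, duration))
r_other = pearson(entry_counts, window_counts(exit_other, window, duration))
print(f"matching flow r={r_same:.2f}, unrelated flow r={r_other:.2f}")
```

With only two candidate flows the matching one stands out easily; the harder question, discussed below, is what happens when there are very many candidates.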

Second, there's some further discussion about the efficacy of traffic correlation attacks at scale in the "Improving Tor's anonymity by changing guard parameters" analysis:

Tariq's paper makes two simplifying assumptions when calling an attack successful [...] 2) He assumes that the end-to-end correlation attack (matching up the incoming flow to the outgoing flow) is instantaneous and perfect. [...] The second one ("how successful is the correlation attack at scale?" or maybe better, "how do the false positives in the correlation attack compare to the false negatives?") remains an open research question.

Researchers generally agree that given a handful of traffic flows, it's easy to match them up. But what about the millions of traffic flows we have now? What levels of false positives (algorithm says "match!" when it's wrong) are acceptable to this attacker? Are there some simple, not too burdensome, tricks we can do to drive up the false positive rates, even if we all agree that those tricks wouldn't work in the "just looking at a handful of flows" case?

More precisely, it's possible that correlation attacks don't scale well because as the number of Tor clients grows, the chance that the exit stream actually came from a different Tor client (not the one you're watching) grows. So the confidence in your match needs to grow along with that or your false positive rate will explode. The people who say that correlation attacks don't scale use phrases like "say your correlation attack is 99.9% accurate" when arguing it. The folks who think it does scale use phrases like "I can easily make my correlation attack arbitrarily accurate." My hope is that the reality is somewhere in between — correlation attacks in the current Tor network can probably be made plenty accurate, but perhaps with some simple design changes we can improve the situation.

The discussion of false positives is key to this new paper too: Sambuddho's paper mentions a false positive rate of 6%. That sounds like it means if you see a traffic flow at one side of the Tor network, and you have a set of 100000 flows on the other side and you're trying to find the match, then 6000 of those flows will look like a match. It's easy to see how at scale, this "base rate fallacy" problem could make the attack effectively useless.

And that high false positive rate is not at all surprising, since he is trying to capture only a summary of the flows at each side and then do the correlation using only those summaries. It would be neat (in a theoretical sense) to learn that it works, but it seems to me that there's a lot of work left here in showing that it would work in practice. It also seems likely that his definition of false positive rate and my use of it above don't line up completely: it would be great if somebody here could work on reconciling them.
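The arithmetic behind the 6% example can be checked with a short sketch. It assumes exactly one true match among the candidates and treats the 81.4% figure quoted in the comments as a true positive rate; both are assumptions for illustration, and as noted above the paper's own definitions may differ. The function name is hypothetical.

```python
def expected_matches(candidates, fpr, tpr):
    """Expected flagged flows when exactly one candidate is the real match."""
    false_hits = fpr * (candidates - 1)   # unrelated flows wrongly flagged
    true_hits = tpr                       # chance the real flow is flagged
    precision = true_hits / (true_hits + false_hits)
    return false_hits, precision

summary = {}
for n in (100, 10_000, 100_000):
    false_hits, precision = expected_matches(n, fpr=0.06, tpr=0.814)
    summary[n] = (false_hits, precision)
    print(f"{n:>7} candidates: about {false_hits:,.0f} false matches, "
          f"chance a flagged flow is the real one about {precision:.5f}")
```

Under these assumptions the number of false matches grows linearly with the candidate set, so the chance that any given flagged flow is the right one shrinks accordingly.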

For a possibly related case where a series of academic research papers misunderstood the base rate fallacy and came to bad conclusions, see Mike's critique of website fingerprinting attacks plus the follow-up paper from CCS this year confirming that he's right.

I should also emphasize that whether this attack can be performed at all has to do with how much of the Internet the adversary is able to measure or control. This diversity question is a large and important one, with lots of attention already. See more discussion here.

In summary, it's great to see more research on traffic confirmation attacks, but a) traffic confirmation attacks are not a new area so don't freak out without actually reading the papers, and b) this particular one, while kind of neat, doesn't supersede all the previous papers.

(I should put in an addendum here for the people who are wondering if everything they read on the Internet in a given week is surely all tied together: we don't have any reason to think that this attack, or one like it, is related to the recent arrests of a few dozen people around the world. So far, all indications are that those arrests are best explained by bad opsec for a few of them, and then those few pointed to the others when they were questioned.)

[Edit: be sure to read Sambuddho's comment below, too. -RD]

Comments

Please note that the comment area below has been archived.

tor真不错！

???

Oh.....surprised.......Chinese tor user? C.C.P.??? China Gongan???? Are you investigating the Tor users?????

just wanted a translation into English...

Google translate tells me it means "Tor really good!"

I'm a normal Chinese Tor user; meek-amazon and meek-azure work in China.

Glad this article has been put up now. There's been a lot of sh*t-stirring and exaggerating done by the media/journalists regarding Tor these past few days.
There's also been a bit of worry over the Tor Project's silence on social media and blogs.

Hopefully everyone will calm down, and stop disrupting things.

I also recommend his PhD thesis:

Sambuddho Chakravarty (2014) Traffic Analysis Attacks and Defenses in Low Latency Anonymous Communication
http://www.cs.columbia.edu/~angelos/Papers/theses/sambuddho_thesis.pdf

Hi
I am here to clarify all misconceptions myself. Firstly, they have blown it a bit out of proportion by saying "81% of Tor traffic", which is not true. It was only 81.4% of our experiments, and we say this upfront in our paper. Secondly, the highlight of the paper is only the experimental validation and the challenges involved in it. In my thesis I have also tried to address how to solve this particular attack, in a way that might work for other attacks as well...

Regards
Sambuddho

TOR allows you to safely surf the web? This is total bullshit, and yet others still believe in it. This project has already collapsed; we keep hearing that it has been broken by governments and by common Internet crooks who have no special knowledge. It was good 10 years ago, not now. Its mechanisms do not hold up, and defects are often patched only after they have been published. And what do the developers do? Sleep? They wait until others discover something, because they do not want to look themselves, and they advertise it as wonderful when it is crap. It is good advertising for the naive, who think that by firing up TOR they will be anonymous. The concept of anonymity is no longer in the dictionary; it existed 30 years ago and then ended. Those who propagate the word anonymity need to move with the times, since they have got something mixed up, and that's good.

There are certainly many people who agree with you, especially recently (but mostly because more people know about Tor now than they did in the recent past). Tor's goals and mission have not changed, so if you, or anyone you know, can help improve the security and trust in Tor, please do. We're an open community, please join and propose improvements (and help implement them, if you can!).

Spreading discontent and fear hurts our users and the people who need this technology the most.

Hi Sambuddho,

Thanks for the note. Indeed it is unfortunate when some journalist grabs at a new paper, misinterprets it, and uses it to produce more ad revenue or whatever it is journalists prioritize these days.

At the same time, it also shows how far we have to go in terms of teaching the general public about how anonymity works -- or heck, how science works.

For any (other) authors of anonymity research papers who are reading this: please consider doing a guest blog post with us to explain your work. We've had good experiences doing that in the past, e.g.
https://blog.torproject.org/blog/new-tor-denial-service-attacks-and-def…
https://blog.torproject.org/blog/tor-incentives-research-roundup-goldst…
https://blog.torproject.org/blog/what-spoiled-onions-paper-means-tor-us…
but there are a lot more interesting research papers out there!

Thanks Roger, for being supportive. Please do understand that:
a) we already published this work at PAM 2014 (earlier this year) and it's nothing really new. It's an extension or experimental validation of Zielinski and Murdoch's 2007 paper, and
b) I do fully support Tor, both ethically as well as through my research. Most of my present research activities look at ways to improve anonymity and privacy.

If you see an attack paper at all, its intention is like that of several other attack papers: exploring possible vulnerabilities merely as an academic exercise. Truly, people misinterpret the results by giving the paper a cursory glance and start to spread fear, which is absolutely unacceptable.

Shouldn't the emphasis be on adding more nodes? More Nodes==Higher security yes?

' So far, all indications are that those arrests are best explained by bad opsec for a few of them, and then those few pointed to the others when they were questioned.'

I was arrested for 'Downloading indecent images of children'. I obviously haven't committed such crimes. But as of now the only evidence I got to see was 'A tor user has downloaded 30 images of children from the tor network'.

My OPSEC was relatively good, never revealed information about myself.

But just to note: unless they had grounds to believe I was using the Tor network, how would they have known I used it for such activity? (Even though I haven't.)

Obviously I won't be charged because I'm innocent but this somewhat causes privacy issues just because the 'tor devs' don't have any idea at all what has happened.

Sounds like the police in your area really misunderstand Tor.

It might be wise to get somebody to go talk to them about what Tor is and what it's for, who uses it and why, etc. Otherwise they will end up harassing somebody else next.

See e.g.
https://blog.torproject.org/blog/trip-report-tor-trainings-dutch-and-be…

Maybe you could contact us in a more private way, to let us know which jurisdiction and make some introductions?

The United Kingdom, so that's all you're getting out of me. I don't have time to email you and such. It's up to you whether you believe me.

Oh, one more thing to add on-top of my previous comment. The dates match the time of the one-cell relay confirmation attack.

They said from 'Jan 30th'.

Won't reveal any more details because even typing this is bad OPSEC.

Did they say 2009?

I think you have a lot more reading to do before jumping to conclusions.

Jan 30th 2014. Also the only thing which partly makes sense is the fact I was using Tor chat.

So in theory it's possible to get my IP from the guard of the one-cell relay correlation attack. But then it gets complicated...

Would like to know how they could prove I was 'downloading 30 images' between Feb 2014 and June 2014.

They can't, certainly not from that particular attack which was HS dirs. Did you ever use p2p? Did you ever use tor on clearnet sites or hidden services? Tor2web?

It's possible another version of the attack or another agency was using similar techniques for some time before the flaw was discovered.

Or perhaps they just picked random tor users in the hope they have something incriminating.

Also, you should be aware that 'tor chat' has no relation to Tor, nor have we or anybody else audited its design or code. Sorry about the confusing name. :(

TorChat uses Tor. Also, it doesn't use exit relays, because it is not a tor-to-clearnet service; it is a tor-to-tor-user connection.

also buggy and unmaintained since 2012.

Perhaps he had a similar user ID to someone else / someone added him to a buddy list / tried to send him files or maybe the feds were simply 'fishing' and exploited torchat in some way.

If that's really all he used tor for I'm at a loss.

That's your problem. Torchat is unmaintained, full of holes, and generally sucks big time.

Torchat’s security is unknown. It creates a hidden service on your computer leaving you vulnerable to deanonymization attacks that apply to all hidden services. It also seems to be a very basic protocol that looks like netcat over Tor. There is no way to decline a file transfer. It automatically starts the transfer, writing the file to /tmp which is a RAM-mounted tmpfs on Linux. Then you are supposed to save the file somewhere. Theoretically an attacker could transfer /dev/urandom while you are away from your computer until it fills up your RAM and crashes your computer. This would be great for inducing intersection attacks.

Another thing is that once someone learns your Torchat ID there is no way to prevent them from knowing you are online, even if you remove them from your buddy list. The reason is that your Torchat instance is a hidden service that publishes a normal hidden service descriptor which anyone can download. There’s no way to stop that. So you should be very conservative about handing out your Torchat ID and only give it to extremely trusted associates.

My error, I meant the RELAY_EARLY attack.
Might be related

https://blog.torproject.org/blog/tor-weekly-news-—-august-6th-2014

Relay early attack? That particular attack only proved you looked up a particular hidden service, not what you did on it or if you visited it.

From my understanding it also was able to find the location of hidden services. Since torchat uses hidden services it might have been possible.

But that doesn't prove that the adversaries would know what exactly is being sent over torchat.

The relay-early attack could only identify the location of a hidden service if you were running the guard that the hidden service picked.

And if you run the guard, many other correlation ("confirmation") attacks will likely work too.

https://blog.torproject.org/blog/tor-security-advisory-relay-early-traf…

Torchat runs its own hidden service on your machine. Normally a hidden service keeps its guards for some months, but you don't run Torchat all the time, so it will be changing guards very quickly. This is bad.

I observe that in the experiments they performed using multiple relays, the true positive rate dropped from 71/90 (=79%, not entirely sure of the origin of 81.4%) to 14/24 (=58%), while false positives grew from 6/90 (=6.7%) to 3/24 (=12.5%), so how well this attack actually scales is not clear.
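The fractions quoted above can be recomputed directly. The sketch below also attaches an approximate 95% Wilson score interval to each rate; that interval is an addition for illustration, not something from the paper or the comment, and it only serves to show how wide the uncertainty is with 90 and 24 trials. The labels are descriptive guesses at what each pair of numbers refers to.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials ** 2)) / denom
    return center - half, center + half

results = {
    "true positives, first experiments": (71, 90),
    "true positives, multiple relays": (14, 24),
    "false positives, first experiments": (6, 90),
    "false positives, multiple relays": (3, 24),
}

rates = {}
for label, (hits, trials) in results.items():
    low, high = wilson_interval(hits, trials)
    rates[label] = (hits / trials, low, high)
    print(f"{label}: {hits}/{trials} = {hits / trials:.1%} "
          f"(95% interval roughly {low:.1%} to {high:.1%})")
```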

Yes, I agree.

And to translate for the non-scientists here, "is not clear" means "sure doesn't look like it'll work the way people seem to think it will".

Please read our PAM 2014 paper for clarifications/details.

I had a guy running Tor unbeknownst to me. He was using a VPN also. He was a renter and a hacker. My little Linksys E2500's log picked him up when I enabled it. I think you are underestimating how truly identifiable you can be.

Since ScrambleSuit is capable of changing its network fingerprint (packet length distribution, inter-arrival times, etc.) why not make every relay and client ScrambleSuit by default?

The performance impact of doing such a thing would be extreme for several reasons.

The first is the extra CPU overhead that would be incurred by the protocol itself (HMAC-SHA256 and CTR-AES128 aren't exactly cheap).

The second is that even assuming CPU was unlimited, that would gut network throughput for the busy relays. The ScrambleSuit inter-arrival time obfuscation (which is disabled for precisely this reason), puts a hard limit on the per-TCP connection throughput of ~278 KiB/s (1427 bytes of payload per frame, 200 frames/sec with a 5 ms average delay). Since each TCP connection can carry any number of different circuits worth of traffic, that's not enough bandwidth. Additionally HOL blocking problems in the presence of congestion would be made considerably worse with the bandwidth required for the extra padding traffic.

Both of the network related performance problems could potentially be addressed by having one TCP connection per circuit, but that has both performance and anonymity implications as well so it isn't a no-brainer.

The ScrambleSuit (and also obfs4) stream length and delay obfuscation is designed vs a different set of adversaries as well, so I'm not sure how well it would hold up vs certain active attacks.

max input rate == max output rate; adding internal buffers, you get max time delay as buffer size / max rate. In short, with the right programming and 1 Gbps links, 128 MB of memory is enough for a max 1 sec delay. It is a common technique for obfuscating physical placement.

So, it's not a memory thing or a buffer size thing at all, and I'm not sure if I just did a bad job of explaining it, so I will make one attempt to further clarify. Examples will use ScrambleSuit but they apply to obfs4 as well.

The way the inter-arrival time obfuscation works is that after each write is done, a variable delay is added between 0 to 10 ms (5 ms on average over an infinite sample size, though this is a gross oversimplification since the way the delay is sampled is a bit more complicated).

Each write is 1 (variable length) ScrambleSuit frame, which can carry up to 1427 bytes of payload. So assuming bulk data is being transmitted, your data rate is limited to 1427 bytes per 5 ms. In some cases there is more delay, in some cases there is less, but the long term average will converge to around 278 KiB/s (1427 * (1000 / 5) / 1024).

This is not a matter of "the right programming", how much bandwidth the link actually has (as long as it's "sufficiently large"), or of buffer sizes, because the artificial delay added between writes to obfuscate the flow signature limits the maximum theoretical output rate, since you can only drain 1427 bytes from the buffer every 5 ms.
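The throughput ceiling described above can be reproduced with a few lines. This is a sketch of the arithmetic only, not ScrambleSuit code: the function names are hypothetical, and the simulation draws delays uniformly from 0 to 10 ms, which the comment itself calls an oversimplification of how the delay is really sampled.

```python
import random

PAYLOAD_BYTES = 1427   # payload per ScrambleSuit frame, from the comment above

def max_throughput_kib_per_s(payload_bytes, mean_delay_ms):
    """Ceiling on throughput when every write is followed by a delay."""
    frames_per_second = 1000 / mean_delay_ms
    return payload_bytes * frames_per_second / 1024

def simulated_throughput_kib_per_s(rng, frames, max_delay_ms=10.0):
    """Send `frames` full frames, sleeping uniform(0, max_delay_ms) after each."""
    total_ms = sum(rng.uniform(0, max_delay_ms) for _ in range(frames))
    return frames * PAYLOAD_BYTES / (total_ms / 1000) / 1024

analytic = max_throughput_kib_per_s(PAYLOAD_BYTES, 5)
simulated = simulated_throughput_kib_per_s(random.Random(0), 100_000)
print(f"analytic ceiling: {analytic:.1f} KiB/s")   # 278.7 KiB/s
print(f"simulated:        {simulated:.1f} KiB/s")
```

Neither function takes the link speed or a buffer size as input, which is the point being made: the delay between writes alone sets the cap.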

So choose the distribution according to the relay's configured bandwidth limit, e.g. a mean inter-frame delay of 50 microseconds would give a throughput of 20,000 frames per second.

Can't sleep for less than a millisecond on some platform or other? Then sleep for 0 ms with probability 0.95 or 1 ms with probability 0.05.
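The two numbers in this suggestion are consistent with each other: sleeping 1 ms with probability 0.05 and 0 ms otherwise gives a mean delay of 50 microseconds, which is 20,000 frames per second. A small sketch to check that, with hypothetical names and ignoring the time the write itself takes:

```python
import random

def sample_delay_ms(rng, p_sleep=0.05, tick_ms=1.0):
    """Sleep one coarse timer tick with probability p_sleep, else not at all."""
    return tick_ms if rng.random() < p_sleep else 0.0

rng = random.Random(0)
samples = 200_000
mean_delay_ms = sum(sample_delay_ms(rng) for _ in range(samples)) / samples
frames_per_second = 1000 / mean_delay_ms

print(f"mean delay about {mean_delay_ms * 1000:.1f} microseconds")
print(f"about {frames_per_second:,.0f} frames per second")
```

Note that this delay distribution has only two possible values, so it changes the average rate without adding much variety to the inter-frame timing.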

The lower you make the inter-frame delay the less obfuscation is provided since it's easier to filter out the changes made to the profile of the flow.

Anyway, making every relay PT aware isn't something I'd consider minimal effort by any stretch of the imagination, and trying to retrofit in defenses for this sort of thing in that manner seems questionable, especially using protocols that were designed with other things in mind ("obfuscation good enough to get past DPI boxes").

A better approach to this would be to figure out a clever way to use the PADDING/VPADDING cell types and do this at the Tor wire protocol level, but figuring out what "clever" things to do without massively gutting performance is an open research question.

For years the German police have been exploiting people's routers, and in another way via JavaScript, to deanonymize Tor users. When the exploit takes place, the deanonymization has a 100% success rate.

https://freenetproject.org/whatis.html

Replace the TOR for the miracle.

Freenet isn't a suitable replacement for Tor's darknet functions. The attacks outlined here, basically sybil attacks on the outlier of the network, are still applicable if an adversary gets two nodes connecting to yours where it can observe traffic going in and out and compare the difference.

When it comes to a global adversary who can monitor the traffic of all nodes in a network, there's no network currently that can withstand this sort of traffic analysis. Hopefully someone will develop something that will solve this problem and we can have true anonymity.

There was a project that was started recently, I don't recall the name, but it was a high-latency network that moved data once every hour in a uniform manner so that not even a global adversary could tell who requested or input what into the network. I don't think it has been launched; it was really more of a nuts-and-bolts GitHub project. Anyone know the name of it?

(It's not Freenet; sybil attacks are still possible since Freenet does not work in a uniform manner, even if it is high latency.)

Anyways this is what we really need, at least for a true anonymous darknet to exist. Tor will always have its place as a pseudo-anonymous whack-a-mole style way to access the clearnet, but when it comes to communications where you 100% have to know no one else is listening in, all traffic has to move uniformly.

Do you mean pond? http://pond.imperialviolet.org/

Nope, it was an actual anonymity network not just a specific function, it was called something weird like coineffine, but nothing shows up in the search engine.

NNTP? :)

Waiting for an hour for a page to call is definitely what you would call high latency; not even in the bad old days of Tor was it that bad :D

What if Tor did padding and sent something continuously? If you set the bandwidth limit to 100 KB/s then Tor could send random encrypted traffic at 100 KB/s continuously. That way it would not be possible to see when something is actually in the stream.

I would also like to see Tor wrap its traffic in something like encrypted BitTorrent so that anyone who just looks at the stream can not tell it apart from encrypted BitTorrent (which is a much more common thing on the Internet than https traffic).
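The constant-rate padding idea in this comment can be sketched as a scheduler that emits exactly one fixed-size cell per time slot, real data if any is waiting and dummy data otherwise. This is an illustration only and not how Tor behaves: the 512-byte cell size, the slot model, and the function name are assumptions, and the "data"/"padding" tags exist only so the sketch can be inspected (on the wire both would be encrypted and look alike).

```python
from collections import deque

CELL_SIZE = 512  # bytes; assumed fixed cell size for this sketch

def constant_rate_schedule(real_cells, slots):
    """Emit one cell per slot: queued real data if available, else padding.

    real_cells is a list of (arrival_slot, payload) sorted by arrival_slot.
    """
    queue = deque(real_cells)
    out = []
    for slot in range(slots):
        if queue and queue[0][0] <= slot:
            out.append(("data", queue.popleft()[1]))
        else:
            out.append(("padding", b"\x00" * CELL_SIZE))
    return out

real = [(2, b"a" * CELL_SIZE), (2, b"b" * CELL_SIZE), (7, b"c" * CELL_SIZE)]
schedule = constant_rate_schedule(real, slots=10)
kinds = [kind for kind, _ in schedule]
print(kinds)
```

An observer who sees only cell sizes and timing sees ten identical slots. The cost is visible too: seven of the ten cells carry nothing, and the second real cell waits one slot because the rate is fixed.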

Yes, Tor should add some "noise" (useless, random data, if you like) into the mix to confuse attackers and throw them off the trail... ;)

Does this attack depend on injecting javascript on HTML pages? If so why not just block all javascript like you should do when using tor?

No, variants of the attack can work fine without javascript. (I'm not sure about the attack presented in this paper in particular, but it's a moot question.)

Would it not be possible to artificially generate some sort of random microlatency between Tor entry and exit nodes, something that is imperceptible to users but increases the overall network noise and decreases the chances of traffic analysis matching?

I answer some of this one here:
https://blog.torproject.org/blog/traffic-correlation-using-netflows#com…
and in the many other answers here and on
https://blog.torproject.org/blog/quick-summary-recent-traffic-correlati…

hi .. unrelated to this blog... but WHY DOES TOR NEED CORRECT TIME ON MY COMPUTER to RUN????

Calm down. Tor needs correct time so it can see how old the consensus (list of tor relay IPs and info related to those relays) is. This allows it to keep up to date with the current number of Tor relays, the current Tor relays that have been marked as "bad" so clients don't connect to them, etc. It also allows Tor to check whether a relay's "certificate" has expired yet. It wouldn't be good if you connected to some relay pretending to be the same relay as one from 2005 whose certificate it might have stolen!
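A rough sketch of why a wrong clock breaks these checks. Nothing here is Tor's actual code: the function names, the dates, and the three-hour validity window are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone

def consensus_is_live(valid_after, valid_until, now):
    """Accept a consensus only inside its validity window."""
    return valid_after <= now <= valid_until

def cert_is_valid(expires, now):
    """Accept a certificate only before its expiry time."""
    return now < expires

valid_after = datetime(2014, 11, 15, 12, 0, tzinfo=timezone.utc)
valid_until = valid_after + timedelta(hours=3)   # illustrative lifetime

true_now = valid_after + timedelta(hours=1)
skewed_now = true_now - timedelta(days=1)        # clock set a day behind

ok_with_good_clock = consensus_is_live(valid_after, valid_until, true_now)
ok_with_bad_clock = consensus_is_live(valid_after, valid_until, skewed_now)

# A certificate that expired in 2005 still looks fine to a clock stuck in 2004.
old_cert_expiry = datetime(2005, 6, 1, tzinfo=timezone.utc)
stuck_clock = datetime(2004, 1, 1, tzinfo=timezone.utc)
accepted_by_stuck_clock = cert_is_valid(old_cert_expiry, stuck_clock)
accepted_by_good_clock = cert_is_valid(old_cert_expiry, true_now)

print(ok_with_good_clock, ok_with_bad_clock)
print(accepted_by_stuck_clock, accepted_by_good_clock)
```

With a skewed clock, a perfectly fresh consensus is rejected, and with a clock stuck in the past, a long-expired certificate is accepted.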

It's time to implement patterned/random packet timing. The impact on latency would be minimal and it would make Tor vastly more secure.

Far more articles have overstated the security of Tor than understated it.
If they hadn't done so as much, by now there would have been demand for better onion routing solutions and the tons of funds needed to build them.

I don't follow this one -- there *is* a demand for better anonymity designs, but nobody has one (or more precisely, nobody has one that is convincingly better), and actually the 'tons of funds' are not easy to come by no matter your design.

It's also certainly the case that assessing the security of a design by reading mainstream newspaper articles about it is never going to get you where you want to be.

And finally, check out
http://freehaven.net/anonbib/
for many useful papers on anonymity designs.

https://en.wikipedia.org/wiki/Traffic_analysis

Traffic flow security:

# causing the circuit to appear busy at all times or much of the time by sending dummy traffic

# sending a continuous encrypted signal, whether or not traffic is being transmitted. This is also called masking or link encryption.

# It is difficult to defeat traffic analysis without both encrypting messages and masking the channel. When no actual messages are being sent, the channel can be masked by sending dummy traffic, similar to the encrypted traffic, thereby keeping bandwidth usage constant. "It is very hard to hide information about the size or timing of messages. The known solutions require Alice to send a continuous stream of messages at the maximum bandwidth she will ever use...This might be acceptable for military applications, but it is not for most civilian applications."

Don't forget the word __random__ -> not max but random, not constant but random. This can be quite acceptable for any application. And don't buy 'military': it means standardized and approved in this context. BTW who told you that you should use standards in the inter-relay network!?

So the problem is that full obfuscation would require sending 100% of maximum traffic per user/connection all the time. That is very inefficient and expensive. But, with bandwidth cost continually decreasing, won't there come a point in the future when this is practicable? Isn't that the future of TOR?
https://gigaom2.files.wordpress.com/2012/08/news20120802-1.gif

One problem is that you, as a user, would have to let your TOR traffic run at this maximum speed 24/7; if you didn't, then large adversaries could still record when your connection begins and match it to when a certain exit node starts a certain connection to a certain website. There will be a certain delay, and the exit node will have many connections running at the same time, but it is still information; at least it may allow your adversary to increase its chances to guess right. Or am I mistaken?

I will see your graph and raise you one:
https://metrics.torproject.org/bandwidth.html?graph=bandwidth&start=201…

So yes, bandwidth is getting cheaper, but also the number of users and load on the network will grow to fill whatever the Tor network has to offer.

Also, you should do the math on millions of users all using their full bandwidth all the time -- it adds up to look very grim indeed. :( Perhaps what we need are approximations (sometimes known by the phrase 'traffic shaping') that get some of the benefits with only some of the costs?

See
https://blog.torproject.org/blog/traffic-correlation-using-netflows#com…
for more discussion.
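As a starting point for the "do the math" suggestion above, here is a back-of-the-envelope sketch. Every input is an assumption: two million simultaneous users is a made-up round number, 100 KB/s is borrowed from an earlier commenter's example, and three hops is the usual circuit length mentioned in the post. The function name is hypothetical.

```python
def padded_load_bytes_per_s(users, per_user_rate, hops=3):
    """Aggregate relay traffic if every user sends at a constant rate.

    Each byte is carried once by every relay on the circuit, so the
    total relay load is users * rate * hops.
    """
    return users * per_user_rate * hops

load = padded_load_bytes_per_s(users=2_000_000, per_user_rate=100_000)
print(f"{load / 1e9:.0f} GB/s across all relays")        # 600 GB/s
print(f"{load * 8 / 1e9:.0f} Gbit/s across all relays")  # 4800 Gbit/s
```

The result scales linearly with each input, so halving the per-user rate or the user count halves the load; comparing it against the bandwidth graph linked above is left to the reader.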

We have to be careful when discussing traffic correlation attacks as a family of attacks as opposed to a specific attack implementation. Someone implements an attack and generalizes the success or failure of that particular one to the general attack family. But implementations will get better and eventually reach their theoretical limit.

It's not a given that an attack will have false positives. There are algorithms with a zero false positive rate -- not near zero, exactly zero. And the obvious implementation is real-time and distributed. The dilemma is that you can't discuss these things without giving attackers something to go after your friends with, but without a proof of concept researchers aren't convinced...

It seems to me that as the set of relays and flows scales up, the concept of an algorithm with an exactly zero false positive rate gets harder and harder.

I mean, feel free to not discuss it, but I think our adversaries do a lot of their discussions in secret already, so our best bet is transparency and openness. Submit your analysis to the PETS symposium:
https://petsymposium.org/2015/
and get some experts to review it. The more we know the more prepared we are!

The NSA has a tiny dick.

How can I control which country my Tor browsing appears to come from? For example, I want others to see me as browsing from country X. How can I do that?

https://www.torproject.org/docs/faq#ChooseEntryExit

Does this affect Tor users who don't visit hidden sites? I don't have a VPN right now and use Tor to hide my IP. I check emails and log into the everyday sites I visit. Nothing illegal. Well known websites.

You should consult your state agencies. You never know whether the sites you visit are legal or illegal till you ask them. Preferably you should have a written list of the web pages you are going to visit in the next month approved by some agency.

I'm in the United States. The sites I visit are not illegal under U.S. law.

I think it is a bit disconcerting that the people behind the Tor Project are not willing to accept that traffic correlation is a very important issue right now. The majority of users come from democratic countries, and the majority of Tor nodes run in those same countries, whose governments implement Orwellian surveillance programs to control traffic at all levels (rogue exit nodes, ISP logs, Internet router surveillance, cable tapping, etc.). Ironically, that makes a person connecting to the Tor network from China more secure than someone connecting from the UK, because China cannot become a global adversary to a Tor user. In fact, the Tor network could be more compromised than governments are willing to recognize or reveal through detentions and police sting operations, particularly for people living in a country that belongs to the Five Eyes alliance. I agree that on Tor you need to launch selective attacks to reveal someone's identity, but it seems quite easy to acquire targets if you can profile network flows; in other words, if you know where to look.

I'm sorry if it looks like we didn't think traffic correlation is a big issue. It is.

But this paper that people are talking about doesn't move the field forward much.

And talking in generalities is not helpful either.

What we need are more good research papers to actually answer questions for us. In particular, I'd love to see a graph where the x axis is how much extra overhead (inefficiency, delay, padding) we add, and the y axis is how much protection we can get against traffic correlation attacks with that overhead. Currently we have zero (!) data points for that graph.

Much research remains before we know how to build a system that is safe against such attacks.

The correlation can be minimized in the future. If Tor establishes two three-node connections (instead of one), the correlation will be reduced.

Additionally, Tor should send some dummy background traffic that is dropped inside the Tor network (without reaching the destination).

Additionally, the Tor client (Tor Browser) could send random requests with random messages to other servers over the same three-node connection, in the background of the main communication. That should hinder attackers.

See e.g.
https://blog.torproject.org/blog/quick-summary-recent-traffic-correlati…
and
https://blog.torproject.org/blog/quick-summary-recent-traffic-correlati…
for further discussions of these ideas.

My question: as I understand it, relays change every 10 minutes or so. How can you correlate communication after those 10 minutes if you do not have the last-hop node?

You probably want to worry more about cookies and tracking pixels.

I'm from Australia and my ISP has blocked connections to the public TOR network.
It is legal to use TOR here in Australia, so why has my ISP blocked connections to the public TOR network?
I am just a normal average law abiding citizen who wants to have privacy on the internet.
Also can someone here give me a list of TOR bridges?

https://bridges.torproject.org

Which ISP is it? I don't know of any ISPs in Australia that are censoring Tor relays either by address or by DPI (and it would be good to learn if there are some).

It's more likely that you have some other problem going on, like your clock is wrong or you have some firewall or antivirus thing that's preventing Tor from reaching the network.

If you are really stuck, you could download a copy of the Tails OS
from https://tails.boum.org/ and boot from a live CD.

Why doesn't Tor send random data all the time to random nodes to make the connections harder to correlate? An observer sees a 200kbps burst at your Tor node, and five seconds later sees a 200kbps burst coming from the exit node. So why not send a baseline amount of data all of the time, and when real data needs to exit, then back off of sending the fake data by the same amount?
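The baseline idea in this question can be sketched in a few lines. This is only an illustration of the proposal as worded here, not how Tor behaves; `dummy_bytes_per_tick`, `observed_on_wire`, and the per-tick byte counts are hypothetical names and units chosen for the example.

```python
from typing import List


def dummy_bytes_per_tick(real: List[int], baseline: int) -> List[int]:
    """For each time tick, return how many dummy bytes to add so that
    real + dummy equals the baseline. If real traffic already exceeds
    the baseline, no dummy bytes are added for that tick."""
    return [max(baseline - r, 0) for r in real]


def observed_on_wire(real: List[int], baseline: int) -> List[int]:
    """What a passive observer would count per tick: real plus dummy bytes."""
    dummy = dummy_bytes_per_tick(real, baseline)
    return [r + d for r, d in zip(real, dummy)]


if __name__ == "__main__":
    real_traffic = [0, 200, 50, 0, 300]  # hypothetical bytes per tick
    print(dummy_bytes_per_tick(real_traffic, baseline=200))
    print(observed_on_wire(real_traffic, baseline=200))
```

Two costs show up even in this toy version: the link carries the baseline rate when the user is idle, and any tick where real traffic exceeds the baseline still stands out to an observer.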