Tor at the Heart: The Ahmia project

During the month of December, we're highlighting other organizations and projects that rely on Tor, build on Tor, or are accomplishing their missions better because Tor exists. Check out our blog each day to learn about our fellow travelers. And please support the Tor Project! We're at the heart of Internet freedom.
Donate today!


The Ahmia project

Onion services are used by thousands of people every day, yet they remain as elusive as ever. There is no central repository of onion sites, and there are no great ways to find the content you are looking for. We feel that this "foggy situation" severely impacts the user experience of onion services and hence also impedes their deployment and acceptance by the general public. It's easy to dismiss the onionspace as smelly if you only read media articles about the onion sites that stink the most.

How is one supposed to navigate in the onionspace if there is no map?

On the "normal Internet," people are used to using search engines to find the content they are looking for: blogs, shops, educational resources, cat pictures. Search engines act as streetlights in the dark alleyways of the Internet, allowing people to navigate and visit the places they want.

However, in the onionspace, search engines are not well established, and finding the right content is much harder. For years people have resorted to various DIY solutions for listing and finding onion addresses, but none of those solutions is particularly pleasant or complete.

Imagine Alice wants to start a blog about her cats in the onionspace. There is no good place for Alice to list her onion address so that other people can find it. Without a good search engine, it's hard for other cat fans to find her website and start building a community.

How is one supposed to catch 'em all if we don't know how many there are?

Hence, there is no better time to introduce Ahmia! Ahmia is a search engine for onion sites. The Ahmia project has been around for years, and it's been collecting public onion addresses and indexing them so that users can search for the content they are looking for.

Ahmia's indexing technology is improving, and the quality of the search results has gotten much better over the past year. Ahmia also provides an easy way for onion service operators to register their own onion sites with the search engine. Ahmia's onion site is here.

Juha Nurmi, the lead Ahmia developer, is still actively involved with the project; however, writing a low-budget search engine is not an easy job! Crawling the Internet requires heavy infrastructure and is technically complicated. Discovering onion links means searching in the deepest corners of both the normal Internet and the onionspace. Ahmia is always looking for more volunteers and sources of funding! Two years ago, Tor supported Ahmia by working with the project during Google Summer of Code 2014.

How is one supposed to walk around if the fog machine is on?

Finally, to close with a healthy dose of paranoia, we need to remember that centralized search engines might be a temporary solution for now, but they are never the end goal. Centralized services should be avoided in high-security systems like anonymity networks, and we should always strive to build decentralized systems and to research alternative ways to make anonymity systems more usable. There is lots of work to be done.

Donate and get involved!

Thank you for reading, and enjoy your Monday!

Seth Schoen

December 05, 2016

I guess a peer-to-peer search engine like YaCy is still a long way out? In the shorter term, I wonder if free (untrusted) federation would be possible among multiple operators running the software. I'll have to read more about it.

I don't know much about distributed search engines but YaCy indeed seems to be a long way out. It's underdeveloped and has had fundamental issues over the years.

We should look into whether there has been any research in this area in recent years, but it's definitely a very hard problem to solve (depending on the desired security properties).

Secure naming systems are another, better-understood class of systems that might help improve the UX of onion services. Also, onion services have long-term identity keys that could be used to provide some sort of authenticity even if the search engine is a centralized entity.

Secure naming was a hard problem too until the relatively recent discovery of the blockchain (which interestingly solved quite a lot of other formerly-hard problems, too), and Namecoin even has provisions for associating .bit names with .onion and .i2p addresses. I'm really unpleasantly surprised that Namecoin still hasn't gained much traction, even in the deep web.

That said, I don't see how secure naming is a replacement for a search engine from a usability perspective. They're both useful, but they solve different problems. I've never used or heard of Ahmia before this blog post, but onion search engines, even if centralized, are a good thing, and I guess for now we'll have to rely on diversity and numbers for availability and censorship-resistance in lieu of a federated network of search engines or a peer-to-peer approach.

As for .onions providing authenticity, they do in the same sense that conventional clearnet sites do in relation to search engines. In other words, they authenticate the content, but only after the search has led to it. I interpret the advantages of a secure distributed search engine as decentralization, censorship-resistance, and maybe privacy, but I suppose we would have to come up with a clear and specific threat model before tackling the problem.

Seth Schoen

December 06, 2016

Just a question (no malice whatsoever): does Ahmia honor "robots.txt", including its general and/or specific prohibitions? If it does, what user-agent(s) does it use or recognize as its own?

Seth Schoen

December 06, 2016

The Ahmia link in your article returns a "502 Bad Gateway nginx/1.6.2" message.

Useless.

Seth Schoen

January 11, 2017

Does Ahmia censor any search results based on any country's laws or opinions, or is it entirely based on freedom of speech?

I believe the answer is no: they don't filter results based on the laws of any particular country.

But they do choose not to list certain onion addresses, based on their own decisions about what they want their site to be.