meejah's blog

Exploring Tor with carml

carml is a command-line, pipe-friendly tool for exploring and controlling a running Tor daemon. Most of the sub-commands will be interesting to developers and tinkerers; a few of these will be interesting to end users. This post concentrates on the developers and tinkerers.

carml is a Python program written using Twisted and my library txtorcon. If you're familiar with Python, create a new virtualenv and pip install carml. There are more verbose install instructions available. Once this works, you should be able to type carml and see the help output.

Connecting to Tor

carml works somewhat like git, in that a normal invocation is carml followed by some global options and then a sub-command with its own options. The most-useful global option is --connect <endpoint> which tells carml how to connect to the control-port. Technically this can be any Twisted client endpoint-string but for Tor will be one of tcp:<port> (or simply a port) or unix:/var/run/tor/control for a unix-socket.

For Tor Browser Bundle, use carml --connect 9151. Typically a "system" Tor is reachable at carml --connect 9051 or carml --connect unix:/var/run/tor/control. You may need to enable the control-port in the configuration and re-load (or re-start) Tor. More details are in the documentation.

Start Exploring

The most interesting general purpose command is probably carml monitor -- try running it for a while and you can see what your Tor client is doing. This gives some good insight into Tor behavior.

A (very basic) usage graph is available via carml graph to see what bandwidth you're using (this needs work on the scaling -- PRs welcome!)

Explicit Circuits

Sometimes, you want to use a particular circuit. For example, you're trying to confirm some possibly-nefarious activity of an Exit. We can combine the carml circ and carml stream commands:

carml circ --build "*,*,4D08D29FDE23E75493E4942BAFDFFB90430A81D2"

This means make a 3-hop circuit through any entry-guard, any middle and then one particular exit (identified by ID). You can*= identify via name (only if it's unique!) but hashes are highly recommended. Of course, you could explicitly choose the other hops as well. Note that the stars still leave the selection up to carml / txtorcon which cannot (and does not) use Tor's exact selection algorithm.

Next, you'll want to actually attach circuits to that stream. It will have printed out something like "Circuit ID 1234". Now we can use carml stream:

carml stream --attach 1234

This will cause all new streams to be attached to circuit 1234 (until we exit the carml stream command). In another terminal, try torsocks curl https://www.torproject.org to visit Tor Project's web site via your new circuit. Once you kill the above carml stream command, Tor will select circuits via its normal algorithm once again.

Note that it's not currently possible to attach streams destined for onion services (this is a Tor limitation, see connection_edge.c).

Debugging Tor

The control protocol reveals all Tor events, which includes INFO and DEBUG logging events. This allows you to easily turn on DEBUG and INFO logging via the carml events command:

carml events INFO DEBUG

This can of course be piped through grep or anything else. You can give a --count to carml events, which is useful for some of the other events.

For example, if you want to "do something" every time a new consensus document is published, you could do this:

carml events --once NEWCONSENSUS

This will wait until exactly one NEWCONSENSUS event is produced, dump the contents of it to stdout (which will be the new consensus) and exit. Using a bash script that runs the above (maybe piped to /dev/null) you can ensure a new consensus is available before continuing.

Events that Tor emits are documented in torspec section 4.1. You can use carml to list them, with carml events --list.

Another example might be that you want to ensure your relay is still listed in the consensus every hour. One way would be to schedule a cron-job shortly before the top of each hour which does something like:

carml events --once NEWCONSENSUS | grep 
# log something useful if grep didn't find anything

Raw Commands

You can issue a raw control-port command to Tor via the carml cmd sub-command. This takes care of authentication, etc. and exits when the command succeeds (or errors). This can be useful to test out new commands under development etc (as the inputs / outputs are not in any way validated).

Every argument after cmd is joined back together with spaces before being sent to Tor so you don't have to quote things.

carml cmd getinfo info/names
carml cmd ADD_ONION NEW:BEST Port=1234

End-User Commands

Briefly, the commands intended to be "end-user useful" are:

carml pastebin: create a new hidden service and serve a directory, single file, or stdin at it. You can combine with carml copybin or simply torsocks curl ... on the other side. Still an "exercise to the reader" to securely distribute the address.

carml tbb: download, verify and run a new Tor Browser Bundle. This pins the public-key of torproject.org and bundles the keys of likely suspects that sign the bundles. It is less useful now that TBB auto-updates.

carml newid: sends the NEWNYM signal, which clears the DNS cache and causes Tor to not re-use any existing circuits for new requests.

carml monitor shows you what Tor is doing currently. Similarly, carml graph shows you just the current in/out bandwidth.

Pure Entertainment

Commands that can provide hours of entertainment include:

  • carml xplanet
  • carml tmux

I hope you find carml useful. Suggestions, bugs, and fixes all welcome on carml's GitHub page.

See Also

There is also a curses-based Tor tool called ARM (blog post). This is being re-written as "Nyx" currently.

Tor at the Heart: Tahoe-LAFS

During the month of December, we're highlighting other organizations and projects that rely on Tor, build on Tor, or are accomplishing their missions better because Tor exists. Check out our blog each day to learn about our fellow travelers. And please support the Tor Project! We're at the heart of Internet freedom.
Donate today!

Overview

Tahoe-LAFS is a free and open source decentralized data storage system, with provider-independent security and fine-grained access control. This means that data stored using Tahoe-LAFS remains confidential and retrievable even if some storage servers fail or are taken over by an attacker.

Using a Tahoe-LAFS client, you turn a large file into a redundant collection of shares referenced via a filecap. Shares are encrypted chunks of data distributed across many storage servers. A filecap is a short cryptographic string containing enough information to retrieve, re-assemble and decrypt the shares. Filecaps come in up to three variants: a read-cap, a verify-cap and (for mutable files) a write-cap.

Starting with version 1.12.0, Tahoe-LAFS has added Tor support to give users the option of connecting anonymously and to give node operators the option of offering anonymous services.

Data Storage

At the lowest level, Tahoe-LAFS is essentially a key-value store. The store uses relatively short strings (around 100 bytes) called capabilities as the keys and arbitrary binary data (up to "dozens of gigabytes" and beyond) for the values.

On top of the key-value store is built a file storage layer, with directories, allowing you to share sub-trees with others (without, for example, revealing the existence or contents of parent directories).

A "backup" command exists on top of the file storage layer, backing up a directory of files to the Grid. There is also a feature called "magic folder" built on top of the filesystem layer which automatically synchronizes a directory between two participants.

Encryption

When adding a value, the client first encrypts it (with a symmetric key), then splits it into segments of manageable sizes, and then erasure-encodes these for redundancy. So, for example, a "2-of-3" erasure-encoding means that the segment is split into a total of 3 pieces, but any 2 of them are enough to reconstruct the original (read more about ZFEC). These segments then become shares, which are stored on particular Storage nodes. Storage nodes are a data repository for shares; users do not rely on them for integrity or confidentiality of the data.

Ultimately, the encryption-key and some information to help find the right Storage nodes become part of the "capability string" (read more about the encoding process). The important point is that a capability string is both necessary and sufficient to retrieve a value from the Grid -- the case where this will fail is when too many nodes have become unavailable (or gone offline) and you can no longer retrieve enough shares.

There are write-capabilities, read-capabilities and verify capabilities; one can be diminished into the "less authoritative" capabilities offline. That is, someone with a write-capability can turn it into a read-capability (without interacting with a server). A verify-capability can confirm the existence and integrity of a value, but not decrypt the contents. It is possible to put both mutable and immutable values into the Grid; naturally, immutable values don't have a write-capability at all.

Sharing Capabilities

You can share these capabilities to give others access to certain values in the Grid. For example, you could give the read-capability to your friend, and retain the write-cap for yourself: then you can keep updating the contents, but your friend is limited to passively seeing the changes. (They need to be connected to the same Grid).

To delete a value, you simply forget (i.e. delete) the capability string, after which it is impossible to recover the data. (Storage servers do have a way to garbage-collect unreferenced shares).

System Topology

In a Tahoe-LAFS system (usually called a Grid) there are three types of nodes: an Introducer, one or more Storage nodes and some number of Client nodes. A node can act as both a Storage and Client node at the same time.

An Introducer tells new clients about all the currently known Storage nodes. If all of the Introducers fail, new clients won't be able to discover the Storage servers but the Grid will continue to function normally for all existing users. Client nodes connect to all known Storage servers. It's also possible to run a Grid without any Introducers at all, by distributing a list of Storage servers out-of-band.

These connections use TLS via an object-capability system called Foolscap which is based on the ideas of the E Language. The important two things about this are: the transport is encrypted, and it does not rely on Certificate Authorities for security.

The storage redundancy also happens to enable faster downloads! Because the values are redundantly-stored across several Storage servers, a Client can download from many Storage servers at once (kind of like BitTorrent). For example, a "2-of-3" encoding means you need 2 shares to recover the original value, so you can download from 2 different Storage servers at once.

Tor Connections

Recently, Tahoe-LAFS has added full Tor support. This means the ability to make client-type connections over Tor -- for example, a Client connecting to an Introducer or a Client connecting to a Storage server and also the ability to listen as an Onion service for Introducer and Storage nodes is now possible! This allows for a fully Tor-ified Tahoe-LAFS Grid, where all network connections are done via Tor and the network locations of all participants are kept hidden by Tor.

One immediate advantage of using Tor is for users behind NAT (Network Address Translation) routers, such as most home users. Making a Storage node available over a Tor Onion service means users don't have to change firewall rules (or similar techniques, like STUN) in order for other users to connect to their Storage node. This is because all Tor connections are made out-bound to the Tor network.

While the Foolscap URIs used internally by Tahoe-LAFS already have integrity-assurance, the use of Onion services also provides benefits in the form of self-certifying network addresses: instead of, for example, relying on DNS and Certificate Authorities, a user receiving an Onion URI from a trusted source can be assured they're connecting to the intended service.

Some Grid operators may want assurance that all clients are using Tor to access their service. Setting up the Grid to listen only via Tor Onion Services provides such assurance. Of course, users running a Client can also choose to use Tor at their own option for connections to the Grid regardless of whether the Grid itself is using Tor onion services. This can help clients who are in hostile network environments reach their data in a secure way.

The Tahoe-LAFS Project is actively working towards an easy to use data- storage system that respects the user and Tor is a great compliment to that mission.

More Information

This short article only provides a brief overview of the Tahoe-LAFS system. We are always interested in attention to our cryptographic protocols or code! You can reach us on https://tahoe-lafs.org or on GitHub at https://github.com/tahoe-lafs/tahoe-lafs and the IRC channel #tahoe-lafs on freenode.

Thanks to Chris Wood, Brian Warner, Liz Steininger and David Stainton for feedback on this post.

Syndicate content Syndicate content