[liberationtech] Censorship resistance attacks and counterattacks
Bram Cohen
bram.cohen at yahoo.com
Sat Sep 11 17:17:40 PDT 2010
For the sake of clarity of discussion, here are my general thoughts on
censorship resistance attacks and counterattacks. My apologies if this writeup
is a little rough; I just threw it together.
For the purposes of these notes, by 'censorship resistance tools', I'll be
referring to ones for browsing the web from inside of countrywide firewalls
which are trying to limit access, such as Freegate, Ultrasurf, and the like.
Obviously there are other forms of censorship and resistance to it, but that's
what's being discussed for now.
The usage pattern for censorship resistance tools goes something like this:
1) system sends information about proxies to users
2) users use proxies to browse the web freely
3) firewall operator finds out IPs of proxies and blocks them by IP
4) go back to step 1
It's an ongoing cat and mouse game involving cycling through a lot of IPs and a
lot of careful secrecy.
An attacker might also, instead of outright blocking an IP, artificially create
a very high packet loss rate going to it, which might make users conclude that
the anti-censorship system doesn't work very well and give up on it. That could
be countered by trying to guess when there's an artificially high packet loss
rate, but that's potentially an insidious game - the attacker might, for
example, determine where the machines developers use for testing are, and not
artificially drop packets to those.
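As a rough illustration of guessing at artificial packet loss: compare the loss
rate seen toward a proxy with the loss seen toward unrelated control
destinations probed from the same network. This is only a sketch. The function
names, the probe counts, and the 0.2 margin are assumptions made up for the
example, and a censor who spares the machines doing the measuring defeats it,
as noted above.

```python
def loss_rate(sent: int, received: int) -> float:
    """Fraction of probes that got no reply."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return (sent - received) / sent


def looks_throttled(proxy, controls, margin=0.2):
    """Guess whether loss toward a proxy is artificially high.

    proxy:    (sent, received) probe counts toward the proxy
    controls: list of (sent, received) counts toward unrelated hosts
    margin:   hypothetical threshold; how much worse than the worst
              control the proxy must be before we call it suspicious
    """
    baseline = max(loss_rate(s, r) for s, r in controls)
    return loss_rate(*proxy) > baseline + margin


# Proxy loses half its probes while the controls lose at most 5%.
print(looks_throttled((100, 50), [(100, 98), (100, 95)]))
```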
There's considerable concern about the threat model of the censor finding out
which users are using the proxies and doing bad things to them. I'll just cut to
the chase on that issue - the resistance to attacks of that form is inherently
weak. The censor can simply record the destinations of all outgoing connections,
and retroactively correlate them to discovered proxies, unveiling the IP of a
user. This is a vicious attack which can't be completely eliminated. Possession
of the tool might also be incriminating.
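To make it concrete how little the retroactive attack requires, here is a
minimal sketch. It assumes the censor has a log of (timestamp, source,
destination) records and a list of proxy IPs discovered later. The record
layout and the name users_of_proxies are assumptions for illustration; the
addresses are documentation-range placeholders.

```python
from collections import defaultdict


def users_of_proxies(flow_log, discovered_proxies):
    """Map each source IP to the times it contacted a known proxy.

    flow_log: iterable of (timestamp, src_ip, dst_ip) tuples
    discovered_proxies: iterable of proxy IPs found after the fact
    """
    proxies = set(discovered_proxies)
    hits = defaultdict(list)
    for ts, src, dst in flow_log:
        if dst in proxies:
            hits[src].append(ts)
    return dict(hits)


log = [
    (1, "192.0.2.10", "203.0.113.5"),   # later found to be a proxy
    (2, "192.0.2.11", "198.51.100.1"),  # ordinary web site
    (3, "192.0.2.10", "203.0.113.5"),
]
print(users_of_proxies(log, ["203.0.113.5"]))
```

The point is that the whole thing is one pass over stored logs, which is why
the defenses listed below are about not standing out rather than about
preventing the lookup.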
High level methods of avoiding detection include:
* Have lots of cover traffic - that is, lots of users, so attacking them all is
impractical. This is probably the ultimate solution, because a tool which
doesn't have enough users to provide cover traffic isn't successful, and a
successful tool implicitly provides lots of cover traffic.
* Have users use shared/ephemeral IPs. This is a low tech approach having little
to do with the protocol.
* Use no software, that is, http/https proxies. This makes the user have no
recurring evidence, but can expose what the user is doing to snooping.
* Use ephemeral or easy to dispose of software. This is a good idea, but the
techniques for doing it are tricky or rely on physical security.
* Run proxies on web sites running other services which are also used by users
within the target area. This is a great approach, but requires cooperation of a
web site which has the willingness to be (or confidence it won't be) blocked.
* Use actual skype connections. This is an interesting approach which has the
benefit of lots of cover traffic, but suffers from limitations on the bandwidth
skype intermediaries will provide, and could be attacked by an attacker running
lots of high quality skype nodes and noticing the very suspicious traffic.
* Dial down the level of paranoia. In the end a certain amount of this may be
necessary.
Censors have multiple ways of finding IP addresses which are used by the
anti-censorship system:
* Use the same methods as the software. This is a very insidious approach,
putting the anti-censorship system in a position of trying to simultaneously
publish new IPs and keep their distribution limited.
* Correlation attacks on existing known IPs. This is also a very insidious
attack - the attacker simply takes IPs which are known to use the
anti-censorship tool, and looks for otherwise unpopular IPs which a lot of those
are connecting to.
* Probing - an attacker can connect to suspected proxies and try to get them to
give themselves away by doing a handshake. Depending on the type of proxy
connection used, this can be very effective, sometimes in combination with
reverse DNS.
* Trick proxy users into hitting a web site and observe what IPs the connections
come from, observing the IPs of the proxies directly.
* Deep packet inspection and traffic pattern analysis, including packet sizes,
connection number and duration, etc. These can be extremely effective, but can
be extremely expensive for the censor to set up. Connection number and
duration are probably the most telling pieces of information, and the cheapest
to implement, as well as the easiest for the anti-censor to manipulate.
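The correlation attack in the list above can be sketched in a few lines. This
assumes the censor has (source, destination) flow pairs and a set of IPs
already known to use the tool. The thresholds min_known and max_others are
hypothetical knobs invented for the example, not anything a real system is
known to use.

```python
from collections import defaultdict


def suspect_destinations(flows, known_users, min_known=2, max_others=1):
    """Find destinations popular with known users but with almost nobody else.

    flows: iterable of (src_ip, dst_ip) pairs
    known_users: IPs already known to use the anti-censorship tool
    """
    known = set(known_users)
    known_clients = defaultdict(set)
    other_clients = defaultdict(set)
    for src, dst in flows:
        bucket = known_clients if src in known else other_clients
        bucket[dst].add(src)
    return sorted(
        dst
        for dst, srcs in known_clients.items()
        if len(srcs) >= min_known and len(other_clients.get(dst, ())) <= max_others
    )


flows = [
    ("A", "popular.example"), ("B", "popular.example"),
    ("C", "popular.example"), ("D", "popular.example"),
    ("A", "203.0.113.9"), ("B", "203.0.113.9"),
]
print(suspect_destinations(flows, known_users=["A", "B"]))
```

This is also why having each client connect to only one IP matters: it keeps
any single destination from collecting many known users.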
There are several ways for an anti-censor to make it hard to find their IPs:
* Use lots of IPs. If each user can be given their own dedicated IP then the
system is extremely hard to attack. The problem is that this approach requires
procurement of lots of IPs, which isn't easy.
* Limit how many users the info is given to. This is a good idea, but difficult to
do.
* Encrypt info with not widely circulated keys. This moves the problem to key
distribution and management, which is a good idea.
* Distribute fake IPs including stuff the censor would regret blocking. I think
this is kind of fun.
* Have clients only connect to one IP. This is a very good idea! Should be
followed as closely as possible.
* Make traffic go through more than one hop, masking the IPs of proxies to
connections on the outgoing side. While clearly a good idea, this doubles the
bandwidth used, which kind of sucks.
* Rely on deep packet inspection being hard. Less unreasonable than you might
imagine - deep packet inspection systems are very expensive and take a while to
upgrade, and intelligence on what the deep packet inspection can do is sometimes
available.
* Steganographically encode connections to proxies - this obviously must be
done, although it isn't obvious what the best approach is.
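One possible way to approximate "each client only connects to one IP" is to
derive the assignment from a keyed hash of some stable user identifier, so the
same user always gets the same proxy and an attacker running k accounts learns
at most k proxies. This is a sketch under assumptions: the idea of a user
identifier, the server-side secret, and the name assign_proxy are all made up
here, and the scheme does nothing about an attacker who can cheaply create
many identities.

```python
import hashlib
import hmac


def assign_proxy(user_id: str, proxies: list[str], secret: bytes) -> str:
    """Pick one proxy per user, stable across calls.

    The HMAC key stays on the distribution server, so users cannot
    predict or enumerate other users' assignments.
    """
    if not proxies:
        raise ValueError("no proxies available")
    digest = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(proxies)
    return proxies[index]


proxies = ["198.51.100.1", "198.51.100.2", "198.51.100.3", "198.51.100.4"]
secret = b"example-secret"  # placeholder; a real key would be random
print(assign_proxy("alice", proxies, secret))
```

Note that plain modulo assignment reshuffles most users whenever the proxy
list changes, which works against the one-IP goal during churn. Something
like consistent hashing would reduce that, at the cost of more code.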
There are several things proxy connections could be made to look like -
* HTTP - while there's plenty of cover traffic for HTTP, deep packet inspection
and probing can probably be very effective in recognizing patterns in it, making
it not very appealing for stego connections
* SSL/TLS - there's a decent amount of cover traffic for TLS connections in the
form of HTTPS, and using the HTTPS port is probably a good approach, especially
since the traffic patterns are going to match http anyway, since that's what it
is. There's some concern that man in the middle attacks might be launched,
although those are difficult, and an attacker might get suspicious if reverse
DNS doesn't return believable information. Still, this may be the best option,
and is certainly the simplest to implement.
* BitTorrent - BitTorrent has lots of cover traffic, and the obfuscated version
of the protocol looks fairly generic, although its traffic patterns are very
distinctive and wouldn't be closely matched by anti-censorship web browsing.
* uTP - uTP is a UDP-based TCP-alike originally designed for BitTorrent. It has
the advantage that some deep packet inspection systems just plain don't support
UDP, and it's easy to use as a swap-in replacement for TCP. It has some of the
same cover traffic problems as regular BitTorrent.
* SSH - tunneling over SSH is not uncommon, so using SSH for proxy connections
is no more suspicious than long-lived high-throughput SSH connections are to
begin with. But that's already a high level of suspiciousness, so this probably
isn't a great approach.
* skype - skype traffic has good cover traffic, but is a very poor match in
terms of usage patterns.
* noise - a TCP connection which has just plain garbage going over it is a
surprisingly reasonable approach. Lots of weird miscellaneous things on the
internet are hard to classify, and obfuscated BitTorrent provides a decent
amount of cover.
There are several methods a censorship resistance system can use to get IP
addresses out -
* offline - this is the most secure way, but it's very slow and expensive
* spam cannon - a spam blast can be sent out containing addresses of proxies.
This works but is moderately slow and moderately expensive. It's also
potentially very easy to intercept.
* to existing users - client software can be sent IPs of fallback proxies when
it makes a proxy connection. This works and is fast, but has the problem that an
attacker can run client software and use it to find proxies as well.
* via web stego - this technique hasn't been used yet, but IPs could be encoded
steganographically in real web traffic. Given the tremendous popularity of
censorship resistance tools in the west, it might be possible to enlist the help
of lots of web sites, and make it essentially impossible to filter them all out.
I'm working on technology for this.
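Purely as a toy to show what encoding an IP in ordinary web content could mean
(this is not the technology mentioned above, and the scheme is an assumption
invented for the example): hide the 32 bits of an IPv4 address in whether each
of 32 lines of cover text ends with a trailing space. Trailing whitespace is
trivially detectable and easily stripped, so a real scheme would need a far
less obvious carrier.

```python
import ipaddress


def embed(ip: str, cover_lines: list[str]) -> list[str]:
    """Hide an IPv4 address in trailing spaces: space = 1, none = 0."""
    bits = format(int(ipaddress.IPv4Address(ip)), "032b")
    if len(cover_lines) < len(bits):
        raise ValueError("need at least 32 cover lines")
    out = [line.rstrip() for line in cover_lines]
    for i, bit in enumerate(bits):
        if bit == "1":
            out[i] += " "
    return out


def extract(lines: list[str]) -> str:
    """Recover the address from the first 32 lines."""
    bits = "".join("1" if line.endswith(" ") else "0" for line in lines[:32])
    return str(ipaddress.IPv4Address(int(bits, 2)))


cover = [f"line {i} of some ordinary page" for i in range(40)]
stego = embed("203.0.113.7", cover)
print(extract(stego))
```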