[liberationtech] on the traceability of circumvention tools
Roger Dingledine
arma at mit.edu
Fri Sep 17 02:16:01 PDT 2010
On Fri, Sep 17, 2010 at 12:11:46AM -0700, Mehdi Yahyanejad wrote:
> > http://metrics.torproject.org/recurring-users-graphs.html#iran
> >
> > The recent graphs look like about 40,000 people were using Tor at the
> > end of August from Iran.
> >
>
> I appreciate that you shared the updated data. I hadn't checked it during the
> past two months. The numbers I had recalled were only a few thousand, which
> based on your graphs was the case up until two months ago.
Actually, I believe the jump in numbers a few months ago on those graphs
is due to changing our geoip database to be more accurate. You can see
it best at
http://metrics.torproject.org/graphs/direct-users/iran-direct-all.png
where really, the whole line should be fluctuating around the 10k or
20k mark.
More generally, these numbers are very rough. They come from aggregating
a few vantage points of the network (relays that report which countries
they see connections from), and extrapolating what other relays "should"
have seen based on the relative weightings of these relays compared
to the rest of the network. We built an anonymity network, after all,
and one of the downsides of a decentralized network is that it's tough
to learn much about your users as a whole.
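To make the extrapolation step concrete, here is a rough sketch in Python.
It is only an illustration of the scaling idea described above, not the code
the metrics site actually runs: the function name, the data layout, and the
notion of a single "weight" number per relay are all assumptions made for
the example.

```python
def estimate_users(observed_by_relay, weight_by_relay, total_network_weight):
    """Scale per-country counts from reporting relays up to the whole network.

    observed_by_relay:    {relay_name: {country_code: connections_seen}}
    weight_by_relay:      {relay_name: weight} for the reporting relays
    total_network_weight: summed weight of every relay in the network

    All three inputs are hypothetical; how weights are defined is an
    assumption of this sketch.
    """
    reporting_weight = sum(weight_by_relay[r] for r in observed_by_relay)
    if reporting_weight <= 0 or total_network_weight <= 0:
        raise ValueError("weights must be positive")

    # Add up what the reporting relays saw, per country.
    seen = {}
    for counts in observed_by_relay.values():
        for country, n in counts.items():
            seen[country] = seen.get(country, 0) + n

    # Assume the relays that do not report "should" have seen connections
    # in proportion to their share of the total weight.
    return {c: n * total_network_weight / reporting_weight
            for c, n in seen.items()}


# Made-up numbers: two reporting relays carry 5% of the network weight.
observed = {"relayA": {"ir": 400, "de": 900}, "relayB": {"ir": 100}}
weights = {"relayA": 30, "relayB": 20}
estimate = estimate_users(observed, weights, total_network_weight=1000)
```

With these invented inputs, 500 observed connections from "ir" at a 5% share
of the weight scale up to an estimate of 10,000. A small error in that share
moves the result a lot, which is one reason the numbers are so rough.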
So you should look at these graphs more as an order-of-magnitude estimate,
and sometimes maybe not even that. We're working on getting better user
count algorithms, but we want to do it in a way that doesn't impact user
privacy, and that turns out to be hard. See e.g.
http://metrics.torproject.org/papers/wecsr10.pdf
> I didn't consider
> a few thousand significant compared to how ultrasurf/freegate was doing. It
> seems Tor has made quite a bit of progress in the past two months. That's great.
Without trying to step into too much of a minefield here, there is
a longstanding question in the circumvention tool world about how to
measure user counts. Last I checked (a couple of years ago), the GIFC
folks were measuring their users by how many web pages they handle per
day. It gives them great-sounding results, but it isn't measuring the same
thing. The problem is that the incentives are lined up for a spiraling
arms race to show the biggest numbers to the media and to funders. Soon
tools will all have to be proudly showing 50 million users in Iran or
nobody will even give them a second glance.
I'll close with another anecdote, unfortunately slightly anonymized. I
was talking to the CSO of one of those web 2.0 companies that everybody
in Iran wanted to reach last year right after the elections. They were
datamining their weblogs to see where the Iranian users were connecting
from that month, since they couldn't connect directly. The answer was
that about 10,000 of them per day were coming through Tor, and the rest
of them (about 90,000 per day) were coming through Amazon cloud proxies.
So a) I like that ratio. At this point I'm just fine with Tor being the
safer but less publicized option (this goes back to the earlier discussion
about the relation between publicity and attracting attention). Though
I am sad that a lot of people were using plaintext proxies, and a lot
of those people learned a lesson about that choice. Then b) Where were
the other tools? At the time, Freegate and Ultrasurf were filtering
connections from Iran, because it wasn't their issue (and because
bandwidth costs them money).
--Roger