[liberationtech] Concept for takedown-resistant publishing
Arturo Filastò
art at globaleaks.org
Fri Feb 3 07:26:08 PST 2012
On Feb 3, 2012, at 2:02 AM, Daniel Margo wrote:
> Torrent is an open protocol. There are numerous open source torrent libraries. Opera can download torrents natively, for Firefox there are extensions to do this, and so far as I know every browser has some kind of plug-able framework for MIME type (Internet media type) handling. To have a browser download and then open a torrent of a page is totally realistic, and probably not that much code.
>
> The problem is that this is not the actual technical challenge. The actual challenges are these:
> 1. Name resolution: How do I find the torrent file, and then the P2P cloud itself?
> 2. Updates: How do I share a Web site as a living, updating document?
> 3. Server backend: Is it realistic to run a modern Web site without a client-server relationship?
>
> 1. Name resolution comes in two parts: finding the torrent file, and then finding the actual P2P cloud. Web sites like The Pirate Bay function as name resolution services for finding torrent files, but obviously any single such server can be blocked. What I presume we actually desire is a name resolution service just like DNS for Web sites: you type in a URL, it gets resolved to a torrent file.
>
> URLs have a hostname and a path, e.g. "www.hostname.com/path/to/torrent.file". "www.hostname.com" is resolved by DNS to a server, and then "/path/to/torrent.file" is resolved by that server. So unfortunately, you don't gain any takedown resistance by hosting your torrent file at "www.myserver.com/my.torrent", because if "www.myserver.com" is taken down, then "/my.torrent" can't be resolved. So what you would actually need is for DNS to provide hostname resolution directly into torrents. This is a sweet idea, but changing DNS is hard.
>
> Even if you could do that, well, what's in a "torrent file"? A torrent file contains information to get you joined into the P2P cloud; specifically, it contains the address of a tracker, a central server that gates entry into the cloud. If the tracker is down or blocked by some technology, you can't enter the cloud. This is primarily a weakness of the BitTorrent protocol, and not the idea itself; there are extensions to BitTorrent and other P2P protocols that are resilient to this weakness. But fundamentally, finding your way into the P2P cloud is an act of name resolution to find other peers in the cloud, which is most easily done by a name resolution server, in this case the tracker. If we're going about making changes to DNS, probably the most technically sane (but politically unrealistic) solution would be for DNS itself to provide tracking capability for the cloud. Again, that is a sweet idea, but changing DNS is hard.
>
> I don't mean to suggest these issues are insurmountable, merely that this is the actual Hard Part.
This is a quite accurate analogy, though you are only considering one site per torrent. Something that would
make this scheme much more powerful (as it would have more seeders) is a single torrent containing multiple sites.
You could then use something like the magnet URI scheme to reference a particular site (that site's file or files). You
still have the problem of a particular tracker being taken down, but at least the link itself
(the .torrent, replaced here by the magnet URI) can be served over any other channel (see the sketch below).
You are not necessarily worried about DNS censorship, since you can provide the tracker address as an IP address.
You are still exposed to censorship through other means (IP blocking), but that can be overcome by running multiple
trackers.
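
As a rough illustration of what such a link could look like (the hash, the IPs, and the "x.site" parameter below
are all made up for the example), here is a minimal Python sketch that builds a magnet URI carrying the bundle's
info-hash plus several trackers addressed directly by IP, with a hypothetical extra parameter selecting one site
inside the bundle:

from urllib.parse import quote

# Hypothetical values: the info-hash of a multi-site bundle and two
# trackers reachable directly by IP (so no DNS lookup is needed).
INFO_HASH = "c12fe1c06bba254a9dc9f519b335aa7c1367a88a"   # made-up hex digest
TRACKERS = [
    "udp://203.0.113.10:6969/announce",    # documentation-range IPs
    "udp://198.51.100.23:6969/announce",
]

def magnet_for_site(info_hash, trackers, site_path=None):
    # A standard magnet link: the info-hash plus one "tr" parameter per
    # tracker. "x.site" is a non-standard, purely illustrative parameter
    # meant to select one site's files inside the bundle.
    parts = ["xt=urn:btih:" + info_hash]
    parts += ["tr=" + quote(t, safe="") for t in trackers]
    if site_path:
        parts.append("x.site=" + quote(site_path, safe=""))
    return "magnet:?" + "&".join(parts)

print(magnet_for_site(INFO_HASH, TRACKERS, site_path="alice-blog/index.html"))

A client that understood the bundle convention could use "x.site" to pick out one site's files; an ordinary
BitTorrent client would simply ignore the parameter and fetch the whole bundle.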
>
> 2. In order to support dynamic content there has to be a cryptographic distinction between updates coming from the legitimate publisher of the site (Alice), and subterfuge coming from Evil Eve. Cryptographic authentication might be provided by the data being shared (e.g. this data is signed by Alice, and everybody knows her signature) or by the P2P network itself (e.g. in addition to sharing the torrent, the servers also provide a distributed authentication service that will only accept updates signed by Alice). I am by no means an expert on this subject, so I will refrain from talking about it extensively, but I bring it up merely because cryptography is non-optional for any dynamic scheme, and I'm not aware of any update-able, cryptographically-secured P2P torrents. It sounds like maybe they should exist? It also sounds Hard, and Google isn't turning anything up.
>
> Again, I'm not suggesting these issues are insurmountable; in some sense Google Docs does all this. But they do it with a pretty sophisticated backend that glues many technologies together (I guarantee Google has a killer internal name resolution and authentication service), and I have no idea how Hard it would be to make those parts takedown-resistant (in the sense that there are no central servers. There are unquestionably central servers at Google).
>
This is a very tough problem, and such a feature would be a core part of the "multiple sites per torrent" scheme.
I disagree, though, that there should be an update feature. The best option is probably an append-only
mechanism, so that you do not have to worry about Evil Eve changing content that is already there.
This would also benefit the resilience of the network, since the protocol would have no concept of
"modification" (and therefore deletion) of content. Nobody could force you to remove your content, because it would
not be technically possible.
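
A minimal sketch of what I mean, assuming Ed25519 signatures from the third-party Python "cryptography" package
(the record format itself is invented for illustration): each appended entry signs its content together with the
hash of the previous entry, so Alice can keep publishing while nobody, Eve included, can rewrite or delete what is
already in the log.

import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

def append(log, priv_key, content):
    # Each entry signs its content plus the hash of the previous entry,
    # so earlier entries cannot be silently rewritten or removed.
    prev_hash = log[-1]["hash"] if log else b"\x00" * 32
    data = prev_hash + content
    log.append({
        "content": content,
        "hash": hashlib.sha256(data).digest(),
        "sig": priv_key.sign(data),
    })

def verify(log, pub_key):
    # Check every entry: an intact hash chain and a valid signature from Alice.
    prev_hash = b"\x00" * 32
    for entry in log:
        data = prev_hash + entry["content"]
        if hashlib.sha256(data).digest() != entry["hash"]:
            return False
        try:
            pub_key.verify(entry["sig"], data)
        except InvalidSignature:
            return False
        prev_hash = entry["hash"]
    return True

alice_key = Ed25519PrivateKey.generate()
log = []
append(log, alice_key, b"post 1: hello world")
append(log, alice_key, b"post 2: new content, appended rather than edited")
print(verify(log, alice_key.public_key()))  # True

Anyone in the swarm can run the verification side with only Alice's public key; the private key never needs to
leave her machine.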
> 3. The real elephant in the room is that modern Web sites are best thought of as programs, not files, and program distribution is infinitely harder than file distribution. When you visit a Wordpress blog, what you appear to receive is an HTML file: but in actuality that HTML was streamed by a PHP script running on the server talking to a MySQL database. The number of layers in this onion is arbitrary; it's anything you could run on a computer. Arbitrary code, or I could hook MSPaint up to the thing if I wanted it enough. Distributing *and executing* arbitrary code like this is Quite Possibly Impossible, and if it is possible it is Very Far Away. At any rate, BitTorrent can't do this.
>
> The word "arbitrary" is important. In specific cases, you can certainly find a case-specific resolution. Diaspora is building something like a distributed social network "program", and I do honestly believe that with a lot of code and hard thinking, you could distribute a Wordpress blog's backend on P2P. It would probably require a total rewrite of Wordpress such that it wouldn't even be the same piece of software, and you would have to solve the other Hard Problems above, but I think it is technically possible at this time. But in the general case, you might have more luck working towards a Singularity and then asking the Machine-Gods for an answer.
>
Projects such as unhosted (http://unhosted.org/) lead me to believe that the future of web technologies lies in moving
more and more logic away from the server and into the hands of the client.
The spread of fast computers and the speed of current JavaScript engines have made it possible (and
in most cases even desirable) for clients to do what was once done only by servers.
While certain web sites will probably never be moved entirely into the client's hands (I am thinking of Google, for
example), a great part of the web we use today could very easily be moved to pure client-side logic, with the
server used only for data storage. A blog, for example, is just a set of posts (content) that is updated from time
to time. All the processing of that data (pagination, styling) can easily be moved into the client's hands (a sketch
follows below).
The only limit I see is that the data would flow in only one direction, from the site to the client, though
I don't see this as critical to the success of such a scheme.
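
A minimal sketch of the blog case (written in Python just to keep it short; in practice this logic would be
JavaScript running in the browser), assuming the publisher only ships a static JSON file of posts and the client
does the sorting and pagination itself:

import json
import math

# What the publisher uploads: plain data, no executable backend.
posts_json = json.dumps([
    {"title": "First post", "date": "2012-01-15", "body": "Hello world."},
    {"title": "Second post", "date": "2012-01-29", "body": "More content."},
    {"title": "Third post", "date": "2012-02-02", "body": "Still more."},
])

# What the client does after fetching that file: sort, paginate, render.
def render_page(raw_json, page, per_page=2):
    posts = sorted(json.loads(raw_json), key=lambda p: p["date"], reverse=True)
    pages = math.ceil(len(posts) / per_page)
    start = (page - 1) * per_page
    for post in posts[start:start + per_page]:
        print(f"{post['date']}  {post['title']}\n  {post['body']}")
    print(f"-- page {page} of {pages} --")

render_page(posts_json, page=1)

The "server" (or the torrent, in this scheme) only has to carry that static file; everything dynamic happens after
the data reaches the client.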
> What seems more realistic to me is taking snapshots of a Web site and distributing those as files instead, rather than trying to distribute the actual Web site program. That's why we designed Mirror As You Link to work that way.
>
> These are all the hard issues I can think of, as a technical person with some distributed systems background. There may be others, since in some cases we're really exploring uncharted waters here.
> - Daniel Margo
Thanks for your good analysis of the issue.
- Art.