Freedom and privacy in the cloud: a call for action

This is a post about freedom: the freedom to keep your data to yourself and the freedom to run free software. You should be able to reclaim and enjoy these freedoms even when using web applications.

If you are a supporter of the free software movement, you can easily opt for GIMP instead of Photoshop, or Firefox instead of Internet Explorer. You can also protect the privacy of your data by using the many encryption tools that are available (GPG, TrueCrypt, …). But when it comes to web applications, things get complicated.

The benefits of web apps (ubiquitous access, seamless upgrades, reliable storage, …) are many, but quite often users lose their freedom to study, modify and discuss the source code that powers those web apps.

Furthermore, we are forced to trust web application providers with our data (bookmarks, text documents, chat transcripts, financial info, … and now health records), which no longer resides on our hard disks but is stored somewhere “in the cloud”.

It’s not a nice situation when you have to choose between convenience and freedom.

Let me be clear: web apps are great and I’m in love with them. But I think it’s time to ask for more freedom and more privacy. Here is a three-step plan to achieve both.

1. Choose AGPL

Why is AGPL important? Because it means that, if you are an application service provider and your services are based on software with an AGPL license, you have to make the source code available to anyone who uses the service! FSF guidelines suggest adding a “Source” link, right in the web application interface, that leads users to an archive of the code.

(Don’t ask me why it took so long to tackle this problem within the free software community!)

Action points

  • Help Clipperz to assemble an “AGPL Suite”: a collection of web applications that provides tools for the most common needs.

    The suite should include: word processor, web chat, password manager, wiki, address book, to-do list, calendar, bookmark manager, … Each web app must be released under an AGPL license! So forget Google, del.icio.us, Plaxo, Meebo, … at least until they switch to AGPL.

    There are already a couple of candidates for inclusion (Ajax Chat for the web chat and, of course, Clipperz for the password manager), but most of the spots in the suite are still vacant!

  • Join Clipperz in its effort to evangelize the benefits of AGPL to the maintainers of open source web projects. Ask them to convert to AGPL.

2. Add zero-knowledge sauce

Web developers and web users are still largely ignoring the opportunity offered by browser-based cryptography to bring the privacy and security of traditional software programs to web applications.

At Clipperz we envisioned a new architecture paradigm called “zero-knowledge web apps” (here is a more detailed description) that combines the idea of host-proof hosting with a set of rules focused on the “learn nothing” mantra.

The name is both an homage to cryptography (a “zero-knowledge proof” is a standard cryptographic protocol) and a promise of a specific relation between the application provider and the users. The server hosting the web app should know nothing about its users, not even their usernames! Clipperz applied this paradigm to implement its online password manager.

Action points

  • Apply zero-knowledge techniques to each component of the “AGPL Suite”. Converting an existing web application to the zero-knowledge architecture is not easy, but at Clipperz we have considerable experience with the subject and we will be happy to share our knowledge and code base.

    We could eventually enjoy a web-based word processor that can’t read our documents, a truly off-the-record web chat, a wiki where we could store valuable information without worry, and so on.

  • Build and maintain a list of ASPs that host the whole “AGPL Suite”. It will be a useful reference for those who value free software and privacy, but don’t possess the necessary skills and resources to run web apps from their own server.

3. Build a smarter browser

We are almost there, but we still need to provide users of web apps with an even more flexible and secure environment. In fact, given the architecture of a zero-knowledge web app, the server typically performs the following tasks:

  • delivers the JavaScript code (the actual program) to the user’s browser;
  • optionally authenticates the user (using a zero-knowledge protocol);
  • retrieves and stores encrypted data as requested by the user’s browser.
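
The division of labor in the list above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Clipperz implementation: the names derive_keys, encrypt, decrypt and BlobServer are hypothetical, the fixed salt is a simplification, the authentication step is left out, and the SHAKE-256 keystream is a toy stand-in for a vetted cipher such as AES. Python is used here for brevity; in a real zero-knowledge web app the client part would run as JavaScript in the browser.

```python
import hashlib
import hmac
import os


def derive_keys(username: str, passphrase: str) -> tuple:
    """Client side: derive a record locator and an encryption key."""
    secret = (username + ":" + passphrase).encode("utf-8")
    # The server indexes records by this locator; it never sees the username.
    locator = hashlib.sha256(b"locator|" + secret).hexdigest()
    # Fixed salt for brevity (an assumption of this sketch, not a recommendation).
    key = hashlib.pbkdf2_hmac("sha256", secret, b"enc-key", 100_000)
    return locator, key


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream; a real application would use a vetted cipher instead.
    return hashlib.shake_256(key + nonce).digest(length)


def _mac_key(key: bytes) -> bytes:
    # Separate key for the integrity tag, derived from the encryption key.
    return hashlib.sha256(b"mac|" + key).digest()


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Client side: returns nonce + ciphertext + integrity tag."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(_mac_key(key), nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Client side: verify the tag, then recover the plaintext."""
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_mac_key(key), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("blob was modified")
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class BlobServer:
    """Server side: stores and returns opaque bytes, nothing else."""

    def __init__(self):
        self._blobs = {}

    def put(self, locator: str, blob: bytes) -> None:
        self._blobs[locator] = blob

    def get(self, locator: str) -> bytes:
        return self._blobs[locator]
```

The point of the design is that BlobServer only ever handles a hash-derived locator and encrypted bytes, so even a curious provider learns neither the username nor the content.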

Free software implies full control over anything that runs on my computer. Therefore two questions arise:

  • How can I run a modified version of the JavaScript code instead of the one loaded by the server?
  • How can I be alerted to changes in the JavaScript code that the server loads into my browser?

I recently had the tremendous honor of exchanging thoughts with Richard Stallman himself about the above issues, and he proposed a smart solution to both problems.

Stallman suggests adding a feature to the browser that allows a user to say: “When you get URL X, use the JavaScript from URL Y as if it came from URL X.” If the user invokes this feature, he can run his own copy of the JavaScript and still be able to exchange data with the server hosting the web application.

A browser with such capabilities could also easily check whether the JavaScript from URL X differs from the alternative JavaScript stored at URL Y. If the user trusts the present release of the JavaScript code from URL X, he could make a copy of it at URL Y and be alerted if any change occurs.

This solution protects the user from malicious code that could be unknowingly executed by his browser, stealing his data and destroying the whole zero-knowledge architecture.
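
To make the idea concrete, here is a minimal sketch of the lookup such a browser feature might perform. Everything in it is an assumption for illustration: the ScriptOverrides class, its method names and the injected fetch function are hypothetical, and Python merely stands in for whatever language a real browser add-on would use.

```python
import hashlib


class ScriptOverrides:
    """Hypothetical table: 'when you get URL X, use the script from URL Y'."""

    def __init__(self, fetch):
        # fetch is any callable mapping a URL to the bytes found there.
        self._fetch = fetch
        self._overrides = {}  # URL X -> URL Y

    def use_instead(self, url_x: str, url_y: str) -> None:
        """Record the user's choice to substitute Y for X."""
        self._overrides[url_x] = url_y

    def resolve(self, url_x: str):
        """Return (script_to_run, changed).

        changed is True when the copy served at URL X no longer matches
        the user's trusted copy at URL Y.
        """
        served = self._fetch(url_x)
        url_y = self._overrides.get(url_x)
        if url_y is None:
            return served, False
        trusted = self._fetch(url_y)
        changed = (hashlib.sha256(served).digest()
                   != hashlib.sha256(trusted).digest())
        # The trusted copy runs as if it had come from URL X.
        return trusted, changed
```

With an override in place, the browser always executes the copy from URL Y and can warn the user whenever the changed flag is set.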

Action points

  • Write add-ons for the major free browsers (Mozilla, WebKit, …) that implement Stallman’s solution.

  • Advocate for including the “AGPL Suite” along with the above enhanced browsers into GNU/Linux distributions.

How to contribute

  • Keep reading this blog where I will post regular updates.
  • Send in your comments and suggestions.
  • Spread the word by writing on your blog, posting in forums, …
  • Make a donation.

Last but not least: what would you name this ambitious project?
Let me know in the comments!

I’ve mentioned this in the

I’ve mentioned this in the forum before, but I think what’s needed to provide confidence in zero-knowledge hosting situations is some type of 3rd party code review/hosting service. In this model it seems that there are two components of the data being hosted: the interface / business logic and the actual user’s data. If we’re to assume that the user’s data is sensitive, it seems critical that the interface code be both open and subject to scrutiny, and that the published code matches the code that’s actually being used.

While I think the ‘Stallman solution’ is a step in the right direction, if we’re talking about simply moving from one centralized location to another, we still have a single point of failure. It would seem that the best solution would be to allow users to either download or cache the interface code and to provide an easy way to verify its integrity after download (this would only need to happen once). In other words: use the cloud to manage user data, but use the user’s computer to execute software. This seems like the best of both worlds, as your data is ubiquitous yet secure.

No single point of failure

The “Stallman solution” does not imply any “centralized location”. “URL Y” could be any place on the net trusted by the user, or his/her own computer where the code has been previously downloaded.

Marco

AGPL isn't a necessary component

I can understand why one would choose the AGPL for a project in order to avoid freeloaders. However, AGPL is not a security prerequisite. The source code has to be made available, but almost any license that allows for reviewing the code will do. The PGP license and Microsoft's Shared Source licenses are good enough, for example.

I think it would be relatively easy to create a distributed document hashing service that one can use to detect tampering of files keyed by their URL. However, it would be easier to just rely on signed browser plugins.
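
One way such a hashing service could be consulted is sketched below, under loud assumptions: the registries are modeled as plain dictionaries standing in for independent services, the function names are hypothetical, and a simple quorum rule is just one possible way to decide which hash to believe.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """SHA-256 fingerprint of a downloaded file."""
    return hashlib.sha256(content).hexdigest()


def consensus_digest(url: str, registries: list, quorum: int):
    """Return the digest that at least `quorum` registries report for url.

    Each registry is a dict mapping URL -> hex digest. Returns None when
    no digest reaches the quorum.
    """
    votes = Counter(reg[url] for reg in registries if url in reg)
    if not votes:
        return None
    value, count = votes.most_common(1)[0]
    return value if count >= quorum else None


def is_untampered(url: str, content: bytes, registries: list,
                  quorum: int = 2) -> bool:
    """True when the content matches the digest agreed on by the registries."""
    expected = consensus_digest(url, registries, quorum)
    return expected is not None and expected == digest(content)
```

Because the expected hash comes from several independent parties, a single compromised registry cannot vouch for a tampered file on its own.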

In reality, it is very difficult to find competent code reviewers who are willing to donate the hundreds of hours necessary to review code for security defects. Otherwise, the OpenSSL fiasco would never have happened and Linux would be exploit-free by now.

The vast majority of "tampering" is benevolent. If the service provider provides a critical security fix, you will be left unprotected if you have to wait a week or more for an independent team to prove that the update is "safe." (That is on top of the time it took to code the fix and to have it reviewed internally.) In that time you are vulnerable to a much more likely set of exploits.

(I hate CAPTCHAs. It would be awesome if you guys used your security expertise to solve the "I am anonymous but I can prove I am not a spammer" problem.)

We hate CAPTCHAs too

@ Brian

And in fact, there are no CAPTCHAs to register for a Clipperz account. We implemented a protection based on hashcash. I’m not aware of any Drupal module (the CMS used for the Clipperz website) that offers something similar. We would love to use it.
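
For readers unfamiliar with the idea, a hashcash-style check asks the client to spend a little CPU time before a request is accepted. The sketch below is a simplified illustration, not the Clipperz code: it uses SHA-256 and a bare challenge:nonce string rather than the original hashcash stamp format, and the function names and the default difficulty are assumptions.

```python
import hashlib
from itertools import count


def leading_zero_bits(data: bytes) -> int:
    """Count how many zero bits the byte string starts with."""
    bits = 0
    for byte in data:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def _stamp_hash(challenge: str, nonce: int) -> bytes:
    return hashlib.sha256(f"{challenge}:{nonce}".encode("utf-8")).digest()


def mint(challenge: str, bits: int = 16) -> int:
    """Client side: search for a nonce whose hash has enough zero bits."""
    for nonce in count():
        if leading_zero_bits(_stamp_hash(challenge, nonce)) >= bits:
            return nonce


def verify(challenge: str, nonce: int, bits: int = 16) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    return leading_zero_bits(_stamp_hash(challenge, nonce)) >= bits
```

The asymmetry is what discourages bulk registrations: minting costs about 2^bits hash attempts on average, while verifying costs one.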

Any Drupal expert out there? We are ready to help.

Yes, AGPL is not necessary from a security point of view, but I like the freedom it provides both to users and developers.

Marco

hosted Web apps

I’m also currently researching the web apps space to put together a mashup of what I think everyone should be able to access from everywhere. So far I haven’t gotten very far, as there doesn’t seem to be much choice. I hope something will come up before M$ Live Mesh hits the shelves. Starting with a word processor: I like Google Docs, but I haven’t seen anything similar in the FOSS world. So far my idea is Roundcube as a mail client and Dekiwiki as the wiki, since it seems to me to be the most user-friendly (although I hate the fact that you have to install Mono). I’m currently trying to develop a wiki version of Typo3 with frontend-only features, but the project is still at the very beginning.

Anyway, with regard to your security concerns: these applications will have to communicate with each other and with the user, and since they are separate apps, the browser plugin will have to talk to all of them individually. Honestly, I have no idea how one would approach this problem.

thomas

Great projects, but GPLed

Roundcube, Dekiwiki and Typo3 are certainly great projects, but all of them are released under the GPL, not the AGPL.

Marco

Please add re-direct security to limit scammers

More thought will need to go into the redirectable scripts idea, to prevent scam artists and spammers from taking advantage of unsuspecting users (e.g., redirecting a login script so that it sends the ID and password to a scammer’s site).

An idea to limit the damage might be a user-configurable option to allow redirects, plus a local signing requirement that forces the user to type in a different password when authorizing a redirect (to limit scriptable hacks).

call it lin-4?

what’s a lin for? It’s for protecting freedom, privacy, etc.

firefox chrome application

A Firefox chrome application is downloaded and installed from a given URL Y and stays exactly the same when used to interact with another URL X. Also, the user is part of the application update process. Firefox extensions work like this. Within chrome, XUL and XBL make a great platform for creating an application.

Relevant Ad Revenue?

I really like the sound of this. But suppose I ask Google to switch to a zero-knowledge system? They come back and say something like “but we need that to make search and ads more relevant to YOU personally…” and I have to admit, this is one of their strong points. I even sometimes click on ads! Amazing!

Anyway…

What tactics should we employ so that one of my favorite companies is still able to make the money to pay the bills WITHOUT giving up my privacy? Is this an issue?

In short, how do I respond to the “relevancy” argument?

Oh, and Richard Stallman, you rock!

what RMS forgets

is that typically people need to make a living, ie., trade or work to pay bills. Just because Stallman doesn’t have to buy soap or water for instance doesn’t mean that the rest of the world wants to go around smelling like a pig. Maybe when Stallman takes a shower and gets a job he can refine his views a bit and maybe he won’t be ignored by most of the world like he is now.

Re: Drupal and spam

Re: Drupal and spam blocking… have you investigated Mollom?

http://www.mollom.com http://drupal.org/project/mollom

Granted, it’s not an AGPL project… ; )

Also, while the service will algorithmically detect and squash spam, it will still present a CAPTCHA if it’s not quite sure.

The Greasemonkey solution

“Stallman suggests adding a feature to the browser that allows a user to say: “When you get URL X, use the JavaScript from URL Y as if it came from URL X.” If the user invokes this feature, he can run his own copy of the JavaScript and still be able to exchange data with the server hosting the web application.”

Maybe I’m missing something, but this seems pretty similar to what Greasemonkey (and GM4IE) already provide. The big difference is that URL Y has to be local. OTOH using a remote server seems to open one up to various attacks (e.g., site Y suddenly starts serving nefarious scripts).

A zero-knowledge based web browser already exists! :)

Hello. We built a collaborative web browser based on the zero-knowledge protocol SRP:

http://srp.stanford.edu

We built it as a custom, internal development project for some hedge funds in Boston. It has a lot of group-messaging and workflow features. I personally use it as my main web browser and RSS reader daily, and love that I can search a private repository of 20K+ of my own bookmarks.

http://sourceforge.net/projects/suprabrowser

It’s a very sophisticated application, but more suited to expert users as opposed to normal end users. It takes about a day of actual usage before figuring out what the heck is going on.

However, once a user gets used to it, it’s totally awesome. It has the ability to manage mailing lists, tag collaboratively and search an individual or group repository of bookmarks. It has file sharing, threaded chat, contextual highlighting of posts and bookmarks, and a bunch of other features. It’s based on the Gecko rendering engine and already has these attributes that you mentioned:

  • loads the Javascript code to the user’s browser (the actual program);
  • optionally authenticates the user (using a zero-knowledge protocol);
  • retrieves and stores encrypted data as requested by the user’s browser.

We would be very happy to collaborate. It’s hard to know how community gets built around something like this. We have the same vision as you, and have a working application. If you are comfortable starting with where we currently are and taking a leadership position with respect to the direction that this application and open source project goes, I would be most thankful. We unfortunately are exceptionally busy and don’t have the time to build community, especially when it seems somewhat random as to how people start to get excited about something.

We are open to changing direction and incorporating new features as long as it seems that other people would like what it is that we are doing. We intend to make it more accessible to other people and less sophisticated end users, but haven’t seen any interest whatsoever from the open source community.

There are so many ideas and possibilities for different directions to take, and if it seems that people rally around this, we will make sure that the resources can be allocated to bringing it about. The missing element so far is any interest at all in a product that took 5+ years to develop.

Sincerely, David Thomson

TiddlyWiki

Have you ever seen TiddlyWiki? It’s entirely client-side, but there are server-side implementations. It is published under a BSD license, though.

A name suggestion

My name suggestion is: Cristal Web

Cristal, like cristal clear, transparent. But also, Cristal, like hard as cristal.

zero knowledge information exchange

“What tactics should we employ so that one of my favorite companies is still able to make the money to pay the bills WITHOUT giving up my privacy? Is this an issue?”

Yes, this is an issue, but it’s not that hard to solve. If, say, Google needs to target an ad to some keywords in your encrypted data, they could receive your data in plain text but in an anonymized format, crunch it, and send ads back. They don’t need to know who you are or where they are sending the ads. All this should only happen with your consent of course, but hey, who doesn’t want ads? ;)

Much already possible

I can already achieve a great deal of this. I have an Amazon S3 account, which provides cheap encrypted but not anonymous storage: they know who I am, but not what I store. This seems to be necessary unless you are going to allow people to store unlimited quantities for free (no, Google doesn’t allow that).

I use the closed-source program JungleDisk to access the S3 system as if it were a drive on a Linux 386, Windows, or Mac system — one of the reasons I chose JD is that they have a GPLed command line retriever. There are various other programs that do more or less the same thing.

Now I can run OpenOffice.org or any other office programs on the files in the simulated drive. I just have to be willing to download one of them to the computer I am using. And that buys me a lot more power and function than any downloadable Javascript app would. This is particularly important in a spreadsheet program: someday Javascript-based spreadsheet programs may be competitive in speed with C++ ones, but that day is not today.

Re: We hate CAPTCHAs too

Hi Marco,

Simon Rycroft has created a Hashcash project for Drupal.

Crystal Web

I second the name Crystal Web (but please, write it with ‘y’!).

Thanks for this post, is

Thanks for this post, it is really helpful.
