Christian Scholz and his Data Portability Project pals have roped me into their Data Without Borders podcasts. On Friday, Christian and Trent Adams and Steve Greenberg and I had some fun relaunching the series by talking about the DPP Terms of Service and End-User License Agreement (TOS/EULA) task force.
Steve was passionate in describing this work. I think he’s right when he says that you first have to ensure that people are aware of a site’s terms of service; disclosing them in a form human beings can grok (Ã la Creative Commons or the nutrition label approach I wrote about here) can begin to empower humans to change things if they so desire, using a variety of means.
At one point we talked about the Archive Team project run by Jason Scott, which I think of as “data portability of last resort”. These folks are like digital historian ninjas who swoop in to save data that might otherwise be lost forever — like everything on GeoCities.
The thing is, website-sanctioned bulk import and export of data isn’t all that huge an improvement on this kind of rescue operation. True data portability wants granularity and timeliness. For example, if you choose to host (so to speak) your current location info at FireEagle, you might still want to reuse it in other places for other purposes, and luckily OAuth lets FireEagle, Dopplr etc. give you a nimble and safe way to “port” this data back and forth.
This is a kind of data statelessness, in that when you tell various sites they can set, read, and republish your location, they’re letting go of any pretense of exclusive hosting control so that they can offer you a different kind of value.
Now, in the IdM and VRM worlds, some of us have been talking about identity statelessness for a while, which is similar but looks more like straight data-sharing (reading) rather than arbitrary service access (setting). For some reason this is a tougher sell — even though CRM systems and user accounts are shot through with pale copies of stale data (and, in the enterprise case, even though syncing directories and replicating databases is brittle and no fun).
Even when one party — say, you yourself — is authoritative for some piece of personal data (like your home address), all the sites insist on making you provision a copy of this data into their profile pages by hand and by value, and insist on thinking they own something truly valuable even after you move and forget to tell them.
In short: To the extent data is volatile, copies of it leak value. If the chain of evidence between its authoritative source and a recipient of data is broken, it quickly becomes value-free. And if the chain of authorization breaks, you’ve got digital shadow cruft. Why oh why can’t we get to a place where, as Scott Cantor put it to me once, identity-aware apps think in terms of data caching rather than data replication?
The Data Portability TOS/EULA work is helping us raise our standards for what true data portability should look like: Open Arms – Ever Fresh – Graceful Exit. OAuth already helps us get a bit beyond disclosure of site terms, closer to a world where users have an active say in what sites do with our stuff. I’m hoping UMA (recent deep-dive Technometria podcast here) can help us go even further because of its notion of user-dictated terms that recipients must meet in order to have the privilege of fresh access.
We’re likely to discuss this topic in the DWB podcast sometime soon, so I hope you’ll give a listen.
Because caching, especially if you do it badly, is easy, whereas replication is hard. In addition, sites that ask for your address even though they will never need it just want to peddle it to postal spammers.
Then we will have to hope that a business model based on a lack of respect for individual users and their desires is evolutionarily unstable and that the alternative propositions we’re working on are more attractive. :-)