Documentation: Distributed architectures for decentralized data governance

alberto · March 01, 2016 10:46

I do not expect much more action on the hackpad, so I am putting this content on the website directly.

The EC’s DG CONNECT has a new call out called Distributed architectures for decentralized data governance. The EU struggles with the seemingly unavoidable tendency of data to flow towards central repositories owned by big companies (Google, Facebook etc.). This creates an imbalance of power: value is created by aggregation, but almost all of it is appropriated by these companies. The response to this has so far been regulatory – making new laws or antitrust-type moves. This call is looking in a different direction: can the response be, instead, socio-technical? Can different solutions be created that are attractive enough to be adopted by a critical mass of people? The technical problem has been solved before. Two solutions have been mentioned: peer-to-peer networks and blockchain. They technically work, but have so far been unable to alter the big picture.

The call funds proofs of concept: solutions, use cases, technical and socio-technical architectures. The deadline is April 12 2016; the total budget is 5 million. DG CONNECT expect to receive fewer than 10 proposals and to fund about 2 projects.

The discussion revolved around what a prototype solution might look like – especially since it would need to be invented and deployed in a highly centralized world dominated by “winner take it all” dynamics, not in a tabula rasa situation. For example, Diaspora might have been more successful if it had not been born in a world which already had Facebook.

More information: Research and innovation

sr_gio · March 02, 2016 10:11

Thanks Alberto.

We might be interested in the topic of the Call;

not sure we’ll be able to make it to participate though…

markomanka · March 02, 2016 13:35

This is not a trivial challenge. When blockchain and p2p/mesh networks are already on the table but are seen as falling short for lack of impact at scale, I read this as a call for system architecture.

Would we be able to take a radical enough step back?

@Alberto, I presume you are going to call for a conference call to brainstorm on this. When will this hangout be?

gandhiano · March 02, 2016 14:35

My opinion is that a focus on the technological aspect alone will not change the playing ground.

In particular, I see blockchain as an over-hyped technology. Apart from some specific use cases, we really don’t need full anonymity and machine-controlled ledgers to achieve decentralized, commonly owned infrastructures. What we need is to develop patterns of social and technological interaction which enable us to replicate the same logic of ownership, collaboration and control that Elinor Ostrom observed and documented in the millennia-old practice of commons worldwide.

P2P/Mesh networks, on the other hand, provide much more interesting use cases. Initiatives like Freifunk (in Germany, basically citizen-owned internet/wireless networks) are particularly successful in getting local communities to care for their technology and implement it.

alberto · March 02, 2016 15:16

Not a technological problem

I asked Fabrizio: “When you present the call to computer scientists, is their reaction to say they solved this problem 20 years ago?” And he replied “Pretty much.”

So, you all are right, I believe. People could have a completely new technology in mind. But if you had a socio-technical model that would work with either p2p networks or blockchain, that would be seen as a valid proposal, at least in principle.

In terms of OpenCare, it would make a ton of sense if we could get a prototype for decentralized storage of personal data on health. Maybe these data are perceived as sensitive enough that people would consider storing them somewhere other than a large, centralized “Facebook for health”.

markomanka · March 03, 2016 09:24

Something has been tingling in my head for a while now…

http://www.tandfonline.com/doi/full/10.3109/19401736.2015.1101541

and ftp://ftp.irit.fr/IRIT/SIG/2015_JASIST_C.pdf

alberto · March 03, 2016 09:58

Good call!

a central repository for DNA sequence for every experiment on earth might be the least evolutionarily conserved lesson in evolution humankind will ever face.

I like it. But we should focus on personal data… which, of course, in OpenCare blend into research data.

melancon · March 03, 2016 12:00

DNA data is not personal data

My two cents, but I am no expert on this topic. I believe @MassimoMercuri @gandhiano @markomanka are much more educated on the issue. I read both papers Marco pointed at – let’s hope I read them right this time.

It is one thing to imagine and build a decentralized and distributed (cloud and tutti quanti) solution to make genomes available to the scientific community. This type of solution can indeed serve the community well if there are people willing to develop services and interfaces so that machines can easily grasp the data and users don’t have to deal with its distributed nature.

Just like when you use a P2P service: knowing it is P2P is okay, but you want the app to properly deal with collecting the bits of data and aggregating them into a proper entity (a file) at the end. This means developing APIs and web services here and there that apps can access.
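To make the "collecting bits and aggregating them into a file" step concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular P2P system works: peers are simulated as in-memory dictionaries, and the names `make_manifest`, `fetch_chunk` and `reassemble` are hypothetical. The idea shown is content addressing: a manifest lists the SHA-256 digest of each chunk in order, and the app accepts a chunk from any peer only if its hash matches.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut the data into consecutive fixed-size pieces (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def make_manifest(data: bytes) -> list[str]:
    """Ordered list of SHA-256 digests, one per chunk."""
    return [hashlib.sha256(c).hexdigest() for c in split_into_chunks(data)]


def fetch_chunk(digest: str, peers: list[dict[str, bytes]]) -> bytes:
    """Ask each (simulated) peer in turn; accept the first copy whose hash matches."""
    for peer in peers:
        chunk = peer.get(digest)
        if chunk is not None and hashlib.sha256(chunk).hexdigest() == digest:
            return chunk
    raise LookupError(f"no peer holds a valid copy of chunk {digest[:8]}")


def reassemble(manifest: list[str], peers: list[dict[str, bytes]]) -> bytes:
    """Rebuild the original entity by fetching chunks in manifest order."""
    return b"".join(fetch_chunk(d, peers) for d in manifest)


# Usage: three hypothetical peers, none of which holds the whole file.
original = b"ACGTTTGACCGATA"
manifest = make_manifest(original)
chunks = split_into_chunks(original)

peer_a = {manifest[i]: chunks[i] for i in (0, 1)}
peer_b = {manifest[i]: chunks[i] for i in (1, 2, 3)}
peer_c = {manifest[0]: b"XXXX"}  # a corrupted copy, rejected by the hash check

restored = reassemble(manifest, [peer_c, peer_a, peer_b])
```

The user of such an app only ever sees `restored`; which peer supplied which piece stays hidden behind the `reassemble` call.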

So far so good when the data is anonymized scientific data. But what if the data relates to individuals? Then information privacy raises a number of issues, some of them difficult. Initiatives to build archives of accessible data (cf. the IRIT paper pointed to by Marco) have to be supervised/controlled/forbidden (choose whatever is relevant). I agree the solution resides more in the societal dimension of the problem, but I fear it won’t admit any implementation before there are reliable and easily deployable tech solutions to potential privacy problems.

markomanka · March 04, 2016 09:47

@melancon, sorry for the belated reply, with the mentions not working I discovered your comment almost by chance.

Indeed those papers dealt with “simplified” cases (apart from humans, all other genomes have no individual rights, but are only regulated by possible IP claims)… but for the sake of OpenCare, let me stress that DNA is very much personal data, if it is the human genome we are talking about.

The questions you raise are perfectly valid, and are actually what I find fascinating about envisioning an extension of these solutions to sensitive data. The challenge is how to nest appropriate cryptography, routing and communication strategies so as to guarantee security effectively (both as data protection, and as data availability when needed).

That we have so little time to prepare the EU documentation is what scares me, because otherwise this is something SCImPULSE is already trying to work on for slightly different purposes, and we could bring on board our partners from the other effort…

markomanka · March 04, 2016 09:52

and this…

https://medium.com/@Ledger/introducing-bolos-blockchain-open-ledger-operating-system-b9893d09f333

alberto · March 03, 2016 09:51

Make the call!

There is not much time on this call. We need to decide:

whether to participate at all
if we do, how we do the work.

Let me be clear: I am personally NOT going to do the work I did with OpenCare, that is taking personal responsibility for hammering the proposal home. I can do some homework, but someone else needs to cover the two roles needed for a successful proposal: the visionary, the person who “gets it”, and the leader, the person who assigns the homework and keeps the ball rolling.

Realistically, I think this only works if we find some tech partner that knows a technical solution (p2p networks or blockchain) AND if we can latch it onto OpenCare by dreaming up a use case on personal health data.

I suggest a call in the next day or two to make the decision.

jimmytidey · March 03, 2016 10:12

Distributing data without (visible) peer to peer / block chain

I did a bit of thinking about why Diaspora failed. Fundamentally, if the end user has to understand federation or block chain, the user experience is broken. You might as well provide a service that requires the user to learn SQL. If the tech is hidden, then the offer becomes hard to understand for the user.

When Alberto talks about decentralised storage, what about a device that lives in your house and manages your family health records (or whatever)? Backup works by creating an encrypted image stored on a thumb drive you keep on your key chain.

It’s decentralised in the sense that you own the server, rather than being federated across multiple servers where ownership is unclear.

I have no idea if this is the kind of pitch that would be welcome. OTOH, these guys look like they are doing it: https://mydex.org/

melancon · March 04, 2016 12:17

There’s DNA and DNA

You are absolutely right. The DNA I was referring to is the anonymized DNA stored in scientific databases (if I understand correctly, these sequences are obtained by consensus after processing quite a number of different DNA chains). Personal DNA is of course DNA as well, but with the additional information that it has been obtained from, and is carried in, the body of a person.

As far as Op3nCare is concerned, will we be dealing with personal DNA data? That being said, any medical information about a person puts us in front of the exact same problem.

markomanka · March 04, 2016 12:58

@melancon another good question… We won’t deal with personal genomics by design, as far as I am concerned, but this is a community project and with today’s toolkits somebody could think of sharing unforeseen data. As long as we do not offer dedicated hosting and tools, such data wouldn’t truly be on the OpenCare platform, but the discussions about them could go quite a long way and the borders could get blurry… those conversations could be as sensitive as personal genomics data, depending on the content.