Decentralized risks: Hosting information for others comes at a cost


#1

I was just in Berlin for the Data Terra Nemo conference and also ended up attending a DGOV meet-up. Data Terra Nemo is a conference that brings together the community developing open source protocols and clients for decentralized peer-to-peer applications. A very simplified introduction for those who are unfamiliar with the technology: Decentralized web applications are hosted without traditional servers. Instead, every participant is both a host and a client. You can often still access these sites from traditional browsers, since many of them have been implemented in JavaScript. This shift allows for a lot of interesting applications, and this is the scope of Data Terra Nemo.

Many of the sessions touched upon the current challenges of decentralized technologies.

Blindly hosting information for others comes at a cost

‘Gossip’ protocols like Scuttlebutt allow applications to share a pool of messages between users that are connected to each other. These messages are then interpreted by client applications, which can be social media applications, book recommendation services, chess games or private messages. I wrote a separate post about Scuttlebutt, and I am quite inspired by the community which has grown up around that technology. Largely because of the human-centric values of the core developers, it has a unique feel and has attracted a big community for being such a bleeding edge technology. My reading of the role that Scuttlebutt plays in this space is that of the experimental avant-garde, the playground where new and radical ideas can be tested and implemented. Dominic Tarr, original developer of Scuttlebutt, had an interesting reflection on his own modus operandi that I think has translated into Scuttlebutt itself, which is to “not build the next big thing, but rather build the thing that inspires the next big thing, that way you don’t have to maintain it”. And it turns out that the developers of Scuttlebutt are working on problems that affect a distributed technology at its core.

One of the core elements of Scuttlebutt is that I can host data for other people on the network without being connected to them directly. I can also host data on my own computer that is encrypted communication between users that I am connected to. This is a feature that is very useful in situations where Bob sends a message to Alice, who is in a country where traffic to the outside internet is highly restricted. If Cindy, who has the privilege of a VPN connection, is connected to both Bob and Alice, then Alice will receive the message from Bob as soon as Alice and Cindy connect to the same network. Cindy doesn’t even know that she is carrying a message between Bob and Alice, as both the message and the information about who is talking are encrypted. This is possible because, by default, everyone replicates the entire message stream of their entire network. In fact, many people replicate every message in their networks 2-3 hops away from themselves.

This creates some challenges. If Hans is in your network and connects to Lars, who is a neo-nazi connected to a group of other neo-nazis, you might actually be acting as a server for information that you would really rather not propagate. This is not a hypothetical case. There are already known instances of the Norwegian alt-right using Scuttlebutt as their preferred means of communication. Luckily for people that don’t want to deal with them or host their information, they are still quite an isolated part of the network and prefer to keep it that way. Nevertheless, it only takes one well-connected person on their network island connecting to another well-connected person in the Scuttlebutt mainstream for that isolation to be broken.
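The hop-based replication described above can be sketched as a breadth-first walk over a follow graph. This is only an illustration, not Scuttlebutt's actual code: the `follows` dictionary, the function name `accounts_within_hops` and the account names are assumptions made up for the example.

```python
from collections import deque


def accounts_within_hops(follows, me, max_hops):
    """Return {account: hop distance} for every account reachable within max_hops.

    `follows` is a hypothetical mapping from an account to the accounts it follows.
    """
    distance = {me: 0}
    queue = deque([me])
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # do not walk past the replication horizon
        for followed in follows.get(current, ()):
            if followed not in distance:
                distance[followed] = distance[current] + 1
                queue.append(followed)
    return distance


# Hypothetical follow graph: you -> Hans -> Lars -> the rest of Lars's group.
follows = {
    "you": ["hans", "alice"],
    "hans": ["lars"],
    "lars": ["lars_group_member"],
}

two_hops = accounts_within_hops(follows, "you", 2)
three_hops = accounts_within_hops(follows, "you", 3)
```

With a horizon of 2 hops, Lars's feed is already inside the set you store and relay, even though you never followed him; at 3 hops his group is included as well.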

Scuttlebutt has tried to solve this by giving users the ability to block accounts they find abusive, but those replicating data from a large network might never know who is posting that content. One solution that has been talked about is subscribable block-lists. This way, I could outsource blocking to a person or group I trust to maintain a block-list.
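A minimal sketch of how a subscribable block-list could work, assuming each curator publishes a plain list of account names. The function names and data shapes here are hypothetical and do not describe an existing Scuttlebutt feature.

```python
def effective_blocklist(own_blocks, subscriptions, published_lists):
    """Merge my own blocks with the lists of every curator I subscribe to."""
    blocked = set(own_blocks)
    for curator in subscriptions:
        # A curator I subscribe to may not have published anything yet.
        blocked |= set(published_lists.get(curator, ()))
    return blocked


def feeds_to_replicate(candidates, blocked):
    """Keep only the candidate feeds that are not blocked, preserving order."""
    return [feed for feed in candidates if feed not in blocked]


# Hypothetical data: one trusted curator maintains a list on my behalf.
published_lists = {"trusted_group": ["lars", "spam_bot"]}
blocked = effective_blocklist(
    own_blocks=["rude_account"],
    subscriptions=["trusted_group"],
    published_lists=published_lists,
)
to_replicate = feeds_to_replicate(["alice", "hans", "lars", "rude_account"], blocked)
```

The design choice is that the subscriber never has to see the blocked content: trust is placed in the curator rather than in one's own review of every feed.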

Luckily, the most worrisome case of abusive photographs and other image content is not as big of a problem. Images are only downloaded to your computer when your client actually displays them, making it clear to you that foul content has made it into your stream. You might still host links to those images without knowing, but you are not likely to have images on your hard drive that you have never seen.

Bottom line is that the community of Scuttlebutt is charging head first into the future, working on challenges that many more will be facing as the benefits of these technologies reach more hands. Distributed systems are both democratizing and empowering, and they come with a whole new set of possibilities.

I’d love to hear perspectives from @elch, @zelf and @hendrikpeter on this. Do you worry about this when using Scuttlebutt? How can we make these peculiarities of decentralized technologies visible without scaring people off?


#2

I hope you don’t mind if I butt in just a little (pun intended) :wink:

I feel a little wary about being an SSB conduit and also a little reckless for not paying more attention to whom I am propagating. The all-or-nothing block is also a little rough (I know it needs to be rough since it’s trailblazing!). It feels a little inhuman to be either fully accepting and propagative of signal or not at all receptive to anything a person is trying to communicate. I hope the future can hold more ambiguity. Also, attempting to create circles of trust is something I’ve been thinking a fair bit about in this space too.

I don’t want to feed any type of Scuttlebutt/Holochain rivalry hanging around the decentralized communities, but I hope it could be interesting to hear a little of how these problems are considered in the HC space (AFAIK). There, there are two main parts to handling this kind of stuff, peer-validation and warrants.

Peer-validation is basically an infrastructure to ensure that peers do not publish things to the app (the shared DHT) that break the rules that the users of the app have agreed on. It might take a toll on encryption though. I am not sure how that is going to pan out; I suppose only people that have the keys to decrypt can act as replicators of data in those cases.

Warrants are similar to the concept of blacklists. When one actor deems another malicious in some way, a warrant is issued which includes the piece of malicious data signed by the actor that sent it (there could be multiple kinds of malicious behavior with multiple corresponding warrants). As warrants circulate, the app logic states how to resolve them. Banning? Flagging? Starting a process? All that is also meant to be dependent on the reputations and prior histories of the actors issuing warrants as well as those on the receiving end.

Also, theoretical and only partially implemented at this stage (again AFAIK).

I don’t know how all that jives with what is cooking in the SSB cauldrons but my purpose in life is cross-pollination so I hope it might help in these early days of evolution :honeybee:


#3

How is this supposed to work? I’m curious. What does “validation” mean in this context?


#4

Validation means making sure that entries are entered into the shared space only so long as they follow the rules specified in the application. When somebody enters a public entry to their local chain, it is also going to be published to the DHT. Peers will receive the entry and pass it through the validation functions declared in the app to make sure it follows the rules before adding it to their part of the DHT.

Validation could consist of simple stuff like, no images larger than 2MB or the oft quoted no messages longer than 140 characters. But it could also be to make sure that you are not editing some material that you don’t have authorship over or posting in a space where you only have reading rights or something like that.
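To make the kinds of rules listed above concrete, here is a rough sketch of a validation function that every peer could run before accepting an entry. It is written in Python for readability and is not the Holochain API; the entry fields, the constant names and the `owners` lookup are all assumptions for illustration.

```python
MAX_MESSAGE_CHARS = 140            # the oft-quoted message limit
MAX_IMAGE_BYTES = 2 * 1024 * 1024  # "2MB", assumed here to mean 2 MiB


def validate_entry(entry, owners):
    """Return a list of rule violations; an empty list means the entry is accepted.

    `entry` is a hypothetical dict with a "kind" and an "author";
    `owners` maps existing entry ids to the account that authored them.
    """
    problems = []
    if entry["kind"] == "message" and len(entry["text"]) > MAX_MESSAGE_CHARS:
        problems.append("message too long")
    if entry["kind"] == "image" and entry["size_bytes"] > MAX_IMAGE_BYTES:
        problems.append("image too large")
    if entry["kind"] == "edit" and owners.get(entry["target"]) != entry["author"]:
        problems.append("author does not own the edited entry")
    return problems


owners = {"post-1": "alice"}
ok = validate_entry({"kind": "message", "author": "bob", "text": "hello"}, owners)
too_long = validate_entry({"kind": "message", "author": "bob", "text": "x" * 141}, owners)
bad_edit = validate_entry({"kind": "edit", "author": "bob", "target": "post-1"}, owners)
```

Because every peer runs the same function on the same entry, they should reach the same verdict without any central moderator.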

For mutual credit currency situations, validation in apps will ensure you stay within your credit limits and not allow you to go further positive or negative than your calculated limit is (that could be based on other things like prior transactions or integrated activity from other app spaces).
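A sketch of what such a credit-limit rule might look like, assuming a single fixed limit that applies in both directions. The function name, the `balances` dictionary and the fixed `credit_limit` are hypothetical; a real app could compute the limit from prior transactions as mentioned above.

```python
def validate_transfer(balances, sender, receiver, amount, credit_limit):
    """Accept a transfer only if both parties stay within +/- credit_limit."""
    if amount <= 0:
        return False
    new_sender_balance = balances.get(sender, 0) - amount
    new_receiver_balance = balances.get(receiver, 0) + amount
    # Neither side may go further negative or further positive than the limit.
    return new_sender_balance >= -credit_limit and new_receiver_balance <= credit_limit


balances = {"alice": 20, "bob": -40}
within_limit = validate_transfer(balances, "alice", "bob", 50, credit_limit=50)
over_limit = validate_transfer(balances, "bob", "alice", 20, credit_limit=50)
```

In the second call Bob would end at -60, below the assumed limit of -50, so peers would reject the entry.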

Potentially it could also be things like checking links against an external list of websites in order to flag them as inflammatory or blocking users from spamming until they have achieved some amount of credibility in the community of users for that app. It is all written on a per-app basis and DHTs are shared only within an app. An app in this context is just the data layer, since UIs can bridge a whole bunch of back-end “apps” and read/write to many DHTs for rich user experiences.


#5

Thanks for the clear explanation, @zaunders. And thanks @hugi for explaining what really is in the SSB folder on my laptop.

I was thinking about this, and it dawned on me that the European Union’s General Data Protection Regulation was definitely not written with SSB in mind. The GDPR seems to have been written with centralization in mind. For example, if you process data, you should have a data protection officer. This is someone who stands watch on the data; she is in charge of enforcing the user’s rights, like having their data deleted and knowing what uses are being made of said data.

But with SSB, this model glitches. In fact, there is no SSB. My right of having my data deleted from SSB is unenforceable, because the data are sitting on people’s hard drives. These people do not even know they have my data on their hard drive! Most of them, like me, will also have an automated backup procedure.

The GDPR gives people a right to sue for misuse of their data. But in the case of something like SSB, there is no one to sue. This might be a feature and not a bug: people are expected to think before they post, and when they post they can never un-post. I do not know, it is so orthogonal to the regulatory frame of mind that I struggle to transpose the model to SSB and its federated ilk.

Don’t get me wrong, I am a big fan of the GDPR. It was designed to limit the power of large centralized corps, and I am already seeing beneficial effects in this sense. But I don’t see how it can provide cover against Norwegian nazis using my laptop as a communication device, or help me keep control of data I share on SSB. Does anyone have any ideas?


#6

hahahahha this had me laughing out loud, I’d missed that he said this! :smile: He might be onto something considering Beaker Browser’s new development… (wanted to link but can’t find posts about it at the moment) I’m gonna keep reading now though. I think this might tie into more thoughts down the line…


#7

This touches upon an aspect I worry about in regards to Scuttlebutt as well, yet it is of a different kind. In your example you mention being faced with data you would rather not partake in. I have found that blocking does indeed hinder most forms of data sharing which I would like to avoid. The one case I have witnessed a lot of discussion around is the same as with Facebook/Instagram: the “annoying uncle/aunt syndrome”, in which one has a social obligation to not block the individual yet would rather not see the unfiltered spouting on their feed. Facebook/Instagram have solved this by simply allowing you to “opt out” of seeing their content in your feed. This would be simple to implement on the application layer of SSB, for example.
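As a rough sketch of the difference between blocking and this kind of opt-out, assume blocking acts on replication while muting acts only on what the client displays. The function names and message shape are made up for the example and are not taken from any SSB client.

```python
def feeds_to_replicate(followed, blocked):
    """Blocking removes a feed from replication entirely."""
    return sorted(set(followed) - set(blocked))


def visible_messages(messages, muted):
    """Muting only hides messages at display time; the data is still stored and relayed."""
    return [m for m in messages if m["author"] not in muted]


followed = ["alice", "uncle"]
stored = [
    {"author": "uncle", "text": "unfiltered rant"},
    {"author": "alice", "text": "hello"},
]

replicated = feeds_to_replicate(followed, blocked=[])
shown = visible_messages(stored, muted={"uncle"})
```

The uncle's feed stays in the replication set, so the social tie is intact, but his posts never reach the rendered timeline.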

On the other hand, speaking of hops and the spread of data, when comparing the amount of privacy (in the sense of who has access to your data), SSB is by far more private than current internet standards and platforms. This ties into the core issue of all kinds of flat-structure tools/platforms used for private communication: they can just as easily be used for hiding shady information, something one inevitably has to take a stance on.

In regards to designing for communicating privacy, the lack thereof, and the quirks of the new protocols, security expert Eileen Wagner had a great workshop about this at Radical Networks last year :slight_smile:


#8

This is actually interesting to discuss though! I had a conversation with an IT guy who had done a lot of research into GDPR lately to ensure his company fit the regulations. What he said he’d found was that GDPR did initially serve its purpose of ensuring that European data stayed within Europe, but it had simultaneously opened up a new market for American companies to make money by charging the companies who used their services (such as GDrive) for ensuring that their data would be stored in the companies’ European servers.

Inherently I don’t think GDPR suits its purpose of ensuring data privacy for its “citizens”, as the issue is rooted in the infrastructure of the default web rather than in how the already faulty system is implemented.

In general, spot on regarding GDPR’s effect on SSB though @alberto :smile:


#9

Could you say more about this issue rooted in the infrastructure of the default web? How would you describe it?


#10

That seems a very partial take on the GDPR. Its main effect is that people are waking up to the fact that the cowboy era of data hoarding is over. “Data minimalism” has become a thing (for example, it is a tenet of City of Amsterdam’s digital strategy: a far cry from the halcyon days of the “smart city”). IT folks focus on the costs of compliance, but the bite of the GDPR is that it creates digital rights; puts the liability for infringing those rights on the entities that collect data; and then steps aside and lets the courts do their job. The GDPR has inherently more bite for large corps than for small ones, because class actions are much more of a real risk for them. No one is going to go through the trouble of suing Edgeryders. Facebook, though… that’s another matter.


#11

Yess! I completely agree, it’s a much needed statement indeed, setting a precedent for the future and targeting the big companies. In reality though it makes it difficult for smaller companies to continue their work as they are reliant on the bigger companies which in turn can profit from this reliance with the rules of GDPR as a backing.

But yes, it’s a much needed statement. Whether it is executed in a proper manner is disputable, as is whether it’s even possible to take action in a positive form when the infrastructure itself directly contradicts personal ownership of data.

This leads into @johncoates’ question:

Could you say more about this issue rooted in the infrastructure of the default web? How would you describe it?

The infrastructure of the default web is

  1. Centralized
  2. Built so that ownership of data is inherently distributed away from the users
  3. Reliant on middlemen to deliver the data, and those middlemen see all the metadata

With the structure above as the basic foundation of how the HTTPS protocol works, it is practically impossible to organize for private data where the individual owns how the data is used, since the user can’t control who has access to the data or how it is stored.

The distributed and decentralized web movements are all centered on re-organizing this foundational infrastructure, and some go further, as in the case of mesh networks, which look at the hardware infrastructure of the internet as well.


#12

It looks like a movement that is gaining in numbers, energy and power.


#13

I think that is why Tim Berners-Lee is doing the Solid thing now. It’s not quite there yet, but if we get this right, it’ll be a nice middle ground where everyone stays in control of their data, but we will still have sort of centralized app & service providers. (Because, let us be frank, no one wants to have to worry about the uptime or safety of their data storage.)


#14

And another similar initiative is Wireline. I saw their demo recently, and it seemed pretty stable. Apparently they are very close to releasing.

Maybe @leobard has some updates? Last time we talked he was hanging out in a chat channel with Tim Berners-Lee.


#15

Hosting info for others definitely comes at a cost, both for the host and the guest, considering “there’s no free lunch”. I’m quite interested in Solid, as it feels like a “middle way” from the mainstream Internet as we know it, but with the capacity to give users more power and control over their data, especially if they self-host it. There is also the possibility of doing it in an association, co-op or a company they own or are a member of. It works well with the https://mydata.org framework, and I would love to see a combination.


#16

I looked at the blurbs of both Solid and Wireline. The idea has been floating around for quite some time: I remember hearing about it for the first time at an event called Public Services 2.0 in 2009. So, I guess my questions would be:

  • In a world that normally moves quite fast, what is delaying deployment? Maybe @RobvanKranenburg has some answers here.
  • What is keeping entities that access your “pod” or “data wallet” or whatever from saving a copy of your data, and then cross-referencing it with whatever else? Technically, of course, they have to copy your data. Legally (at least under the scenario of restrictive data protection regulations) they are supposed to delete them, but… will they? Facebook is rumored to have a “you account” even if you yourself do not have a Facebook account, and never had one. Would this kind of scenario be prevented by Solid/Wireline? Because if it would not, we go back to good old antitrust policy: forget about the tech, just never allow companies to grow too big, break them up, nationalize them, whatever.

Edit: Cory Doctorow seems to share this point of view.


#17

Or maybe regulating them in certain ways makes them stronger.