Putting them here for now.
The NGI Forward SSN has five ethnographic codes that contain the word “data”. They are:
| Code | Number of occurrences |
|---|---|
| personal data | 43 |
| location data | 8 |
| open data | 6 |
| data protection | 10 |
| data storage | 3 |
The other four all co-occur with personal data, which acts as the hub of the subgraph. Below we show the ego network of personal data, with the codes above shown in yellow.
An edge in this graph means that the two codes at either end of the edge occur together in the posts of at least two participants in the NGI Exchange forum. Edge color maps to the number of co-occurrences: the redder the edge, the more co-occurrences. We call the number of co-occurrences k and interpret it as an indicator of how strongly the two codes connected by the edge are associated: a higher k means a stronger association. The strongest connection of personal data is with contact tracing (k = 46), reflecting the intense debate on the COVID-19 contact tracing apps. Other high-k connections are with privacy, app, agency, user control and business model.
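The edge definition above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used to produce the graph: the `posts` data and the function name `cooccurrence_edges` are hypothetical, and the sketch assumes that k counts the posts in which both codes appear, which the text does not spell out.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_edges(posts, min_participants=2):
    """Return {(code_a, code_b): k} for pairs of codes that co-occur
    in the posts of at least `min_participants` distinct participants.

    Assumption: k is the number of posts in which both codes appear.
    """
    counts = defaultdict(int)    # pair -> number of posts containing both codes
    authors = defaultdict(set)   # pair -> distinct participants using both codes
    for author, codes in posts:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
            authors[pair].add(author)
    return {
        pair: k
        for pair, k in counts.items()
        if len(authors[pair]) >= min_participants
    }


# Hypothetical toy data: (participant, codes annotated on one post).
posts = [
    ("alice", {"personal data", "contact tracing", "privacy"}),
    ("bob", {"personal data", "contact tracing"}),
    ("alice", {"personal data", "privacy"}),
    ("carol", {"location data", "contact tracing"}),
]

edges = cooccurrence_edges(posts)
print(edges)
```

In this toy example only personal data and contact tracing are linked: personal data and privacy co-occur twice, but both times in posts by the same participant, so that pair does not qualify as an edge.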
The other four codes, albeit much less connected in this graph, tend to connect to some of the codes that are also connected to personal data: for example, location data connects to privacy and app, and open data to user control.
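For readers who want to reproduce this kind of view, here is a minimal sketch of how an ego network can be extracted from a table of edges. It assumes the common definition of an ego network (the ego, its direct neighbours, and all edges among them). The function name `ego_network` is hypothetical, and every k value below except the k = 46 quoted in the text is invented for illustration.

```python
def ego_network(edges, ego):
    """Keep the ego, its direct neighbours, and every edge among them."""
    alters = set()
    for a, b in edges:
        if a == ego:
            alters.add(b)
        elif b == ego:
            alters.add(a)
    nodes = alters | {ego}
    return {(a, b): k for (a, b), k in edges.items() if a in nodes and b in nodes}


# Toy edge table {(code_a, code_b): k}. Only k = 46 comes from the text;
# all other values are invented.
edges = {
    ("contact tracing", "personal data"): 46,
    ("personal data", "privacy"): 30,
    ("location data", "privacy"): 5,
    ("location data", "personal data"): 4,
    ("contact tracing", "working remotely"): 12,
}

ego_edges = ego_network(edges, "personal data")
for (a, b), k in sorted(ego_edges.items()):
    print(f"{a} -- {b} (k = {k})")
```

The edge between location data and privacy is kept because both codes are neighbours of personal data, while the edge to working remotely is dropped because that code is not directly connected to the ego.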
But where are these codes in the structure of the conversation? The following image shows, for legibility, only the strongest associations (k >= 10). At this zoom level, only two of the five codes, personal data and location data, are still visible. personal data is itself one of the keystone codes of the entire NGI Exchange corpus. It sits close to the part of the network that is about change: technologies (artificial intelligence), legal and technical standards (open source), values (privacy, very central, but also agency and user control) and actions (imagining alternatives and imagining the future).
location data, via its strong connection with contact tracing, ends up in a part of the network centered on the Internet’s role in the societal response to COVID-19. Codes like working remotely and uncertainty show up here.
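The reduction to the strongest associations (k >= 10) described above amounts to a simple threshold on the edge table. A minimal sketch follows. The function names are hypothetical, and all k values except the k = 46 quoted earlier are invented for illustration.

```python
def strongest(edges, k_min=10):
    """Keep only edges whose co-occurrence count k is at least k_min."""
    return {pair: k for pair, k in edges.items() if k >= k_min}


def visible_codes(edges):
    """Codes that still have at least one edge after filtering."""
    return {code for pair in edges for code in pair}


# Toy edge table {(code_a, code_b): k}. Only k = 46 comes from the text;
# all other values are invented.
edges = {
    ("contact tracing", "personal data"): 46,
    ("contact tracing", "location data"): 12,
    ("data protection", "personal data"): 6,
    ("open data", "personal data"): 3,
}

reduced = strongest(edges, k_min=10)
visible = visible_codes(reduced)
print(sorted(visible))
```

Codes whose edges all fall below the threshold, such as open data and data protection in this toy table, disappear from the reduced graph.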
The analysis of this corpus seems to suggest that data are central in the project of re-imagining the Internet, as well as the society it serves. They are also the focal point of societal tensions: the re-affirmation of values around personal data indicates that these values need to be re-affirmed, because they are under threat.
On the other hand, location data sits in a “cluster of despair”: it is not clear that data have enabled a surgical response to the main challenge of 2020, the COVID-19 pandemic. After all, the response of major governments has mostly been based on almost medieval remedies, like quarantines and curfews.

