Webinar: Making sense of a COVID-19 world: applying collective intelligence to big data (3rd of June)

Join the webinar to get introduced to a large body of research on COVID-19-related data, and to interpret and discuss it live together.

How to read and interpret data?

As we at NGI Forward watched SARS-CoV-2 cut into our worlds, we felt the same as everybody else. We feared for loved ones, and for the long-term freedom and prosperity of our societies. We felt powerless, and anxious in the face of uncertainty. We still do.

But, at some point along the way, we realized there was something we could do. We have in our midst a team of data scientists. When the pandemic hit, Kristof and Michal were analysing trends in the evolution of Internet technology. Theirs is a big data method: they do text analysis on large repositories of academic papers, news articles, social media. We realized we could use the same method to investigate the virus and its companion illness. It would be like redirecting our searchlight onto a new target.

So we did that. In the space of only a few weeks, Kristof and Michal came back with results, neatly arranged in an interactive website. And that’s where we need your help.

Here’s the thing: this type of analysis scales well, but its results are not easy to interpret. We can see, for example, that technologies associated with the pandemic fall into four groups: remote work, medical, social distancing, contact tracing. This makes sense at first sight, but what does it mean? That people are using these technologies a lot? That researchers are working to improve these technologies over others? Or is the result an artifact of the way we look at the data? Is there a way that we can determine if there are obvious gaps in our technological arsenal? If we found any, we could look into investing to plug them.

This type of analysis becomes much more actionable when you provide context to it. As I pored over early results, I could not help noticing a positive sentiment for PEPP-PT. Except that, in those very same days, the PEPP-PT consortium had collapsed due to the withdrawal of several research institutes over privacy concerns. Before the collapse, I might have interpreted the positive sentiment as validation: “Look, the tech community likes this protocol!” In the light of the collapse, it made more sense to read it as naïve enthusiasm, or even spin. A possible interpretation could be: “it is a good idea to let the community poke holes in any specific solution before we endorse it”. Another one is: “scientists and engineers are still humans. In a crisis, they reach for a solution, and want it to work. Their critical abilities are, for a time, weakened.”

In sum, we have a large trove of data on COVID-19, which seems to hold important knowledge, if we could just unlock it. So, we propose an experiment. Let’s get together (online), and question the data.
Let’s look at them, turn them around in our heads, interrogate each other on their possible meaning. Let’s formulate hypotheses, and see if we can think of ways to test them. Kristof and Michal can transform them into queries and visualizations in real time, or close enough. In other words, let’s augment big data analysis with collective intelligence methods. It will be a way for us to learn more about both COVID-19 and this particular big data methodology.


3rd of June, 5PM CEST.

How to see what came of it:

Follow this link for the transcript of the session:

We will record this webinar for research purposes. You can find more information here: Edgeryders Calls and Webinars? - PARTICIPANT INFORMATION SHEET

This event is part of the NGI Forward project of the Next Generation Internet (NGI) initiative, launched by the European Commission in the autumn of 2016. It has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 825652 from 2019-2021. You can learn more about the initiative and our involvement in it at https://ngi.edgeryders.eu



ping @martin, is this interesting for you?

ping @hugi, would that maybe be interesting for the people who followed your COVID crisis updates on Facebook and such during the first few weeks of the crisis?


For everyone interested in open-source answers to COVID-19: we scanned GitHub and collected metadata of projects.

The most influential (most forked) projects include handy dashboards, medical datasets, data forecasting, and contact-tracing apps. These are interesting in themselves, but we also caught projects that are no longer available for political reasons: https://qz.com/1846277/china-arrests-users-behind-github-coronavirus-memories-page/

Join us in a webinar on 3rd of June to discuss the chances and dark sides of Covid-19 tech!
Our analyses are available at: https://covid.delabapps.eu


cc @stefanoboski useful for those lists…


ping @felix.wolfsteller, @mattias, @unclecj, @Emile, @erik_lonroth is that project metadata interesting for you or someone you know?

ping @BlackForestBoi, would you and the worldbrain crowd maybe be interested in joining this discussion? Your perspective on validating data, and on sharing it and its interpretations in a trustworthy way, would be very welcome!

Heads up, @krystof and @Michal: I am going to ask you what you can do, with your data, to validate results 1-4 from the surveillance pandemic event. You can find them here. They are intentionally phrased as statements that lend themselves to falsification. Could you give some thought to how to query the dataset so that the response will have something to say about one or more of the four results?

Feel free to ping me if you want to discuss this.


Here are some of the questions we would like to discuss with you tomorrow:

  1. One of the things that stood out to @johncoate when looking at your data was that anxiety is up while reported depression seems to have gone down during the crisis.
    How could we interpret that? What other enquiries or studies would be needed to understand this better?
    Could we use the Mental health graph on https://covid.delabapps.eu/ to explore how to read and interpret such data - and how not to?

  2. During the surveillance pandemic session, one of our results was that “Locational data are impossible to anonymize and of limited utility. Capacity for data governance is bad”. How do you source and deal with locational data? How do you evaluate it? You have graphs regarding contact tracing on https://covid.delabapps.eu/, maybe we can use that to focus the conversation.

  3. Can you “find” data without searching? How did you set out to select what you would collect and display for https://covid.delabapps.eu/?

  4. What has been the most striking (data)point or pattern you have found in your COVID data research?

  5. How would you “read” or interpret the GitHub repository map you provide on your platform, and what does it tell us about the international open-source communities?

  6. How does a data scientist emotionally process what you learn? And how does everyone? We are dealing with large datasets which contain the lives and deaths of millions of people. How does one fathom that? Should one try to fathom that? Statistically feeling bad


We can check 4 things on the go in news articles:

  1. if a term is trending
  2. which are the co-occurring terms
  3. sentiment of the paragraphs containing a term (from -1 to 1)
  4. most pos/neg co-occurrences based on the sentiment of paragraphs containing both terms

points 2-4 take a few minutes to compute
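To make the four checks a bit more concrete, here is a minimal sketch on a toy corpus. It is not our actual pipeline: the word lists, the `ratio` threshold, the function names and the sample paragraphs are all made up for illustration, terms are assumed to be single words, and a real run would use a proper sentiment model instead of a six-word lexicon.

```python
import re
from collections import Counter

# Hypothetical toy lexicon and stopword list, for illustration only.
POSITIVE = {"effective", "promising", "safe"}
NEGATIVE = {"breach", "concern", "risk"}
STOPWORDS = {"a", "are", "is", "the"}


def tokens(paragraph):
    """Lowercased words of a paragraph, minus stopwords."""
    words = re.findall(r"[a-z]+", paragraph.lower())
    return [w for w in words if w not in STOPWORDS]


def sentiment(paragraph):
    """Score in [-1, 1]: (positive - negative) / matched words, 0.0 if none match."""
    words = tokens(paragraph)
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


def is_trending(term, periods, ratio=1.5):
    """Check 1: mentions in the latest period exceed `ratio` times the earlier mean."""
    if len(periods) < 2:
        raise ValueError("need at least two periods to compare")
    counts = [sum(term in tokens(p) for p in paragraphs) for paragraphs in periods]
    *earlier, latest = counts
    return latest > ratio * (sum(earlier) / len(earlier))


def cooccurring_terms(term, paragraphs, top=3):
    """Check 2: words that most often share a paragraph with `term`."""
    counts = Counter()
    for p in paragraphs:
        words = set(tokens(p))
        if term in words:
            counts.update(words - {term})
    return counts.most_common(top)


def term_sentiment(term, paragraphs):
    """Check 3: mean sentiment of the paragraphs containing `term` (None if absent)."""
    scores = [sentiment(p) for p in paragraphs if term in tokens(p)]
    return sum(scores) / len(scores) if scores else None


def ranked_cooccurrences(term, paragraphs):
    """Check 4: co-occurring words ranked by mean sentiment of the shared paragraphs."""
    scores = {}
    for p in paragraphs:
        words = set(tokens(p))
        if term in words:
            for w in words - {term}:
                scores.setdefault(w, []).append(sentiment(p))
    means = {w: sum(v) / len(v) for w, v in scores.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)


# Made-up paragraphs, grouped by period (e.g. week), oldest first.
PERIODS = [
    ["tracing apps raise a privacy concern"],
    ["tracing is promising", "masks are effective"],
    ["tracing is effective", "tracing apps are safe", "a tracing breach is a risk"],
]
ALL = [p for paragraphs in PERIODS for p in paragraphs]

print(is_trending("tracing", PERIODS))
print(cooccurring_terms("tracing", ALL, top=1))
print(term_sentiment("tracing", ALL))
print(ranked_cooccurrences("tracing", ALL)[:3])
```

Checks 2-4 each need a pass over every paragraph containing the term, which gives a rough idea of why they are the slow ones on a large news corpus.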

The insights mentioned are very nuanced for this type of analysis, so we can address them only in general terms. We have some analyses on contact tracing (concepts, initiatives, open source projects); the passport topic has not been picked up in the analysis (not trending).


OK, let’s see:

Look for “solutionist” language (solution, effective, efficient, real-time, scalable etc.). See if it trends (but do we have a counterfactual?); see if it co-occurs with epidemics language (COVID, Coronavirus, pandemic) and tech solutions thereto (contact tracing app), especially linked to surveillance companies (Palantir, others?). See if the sentiment is positive. If none of these occur, the hypothesis is not supported.

Repeat the game: identify a batch of words related to anonymity (anonymize, breach, locational) and another linked to the possible issues identified by the Surveillance Pandemic focus group (takeup, adoption, lock-in, credibility, confidence), then look at occurrence, co-occurrence and sentiment.

You get the idea. We can do the same with results 3 on immunity passports and 4 on locational data.
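For what it's worth, the game could be written down roughly like this. Treat it as a sketch under loud assumptions: the `Paragraph` record, the precomputed `sentiment` field, the crude "last week beats first week" trend test and the sample paragraphs are all hypothetical, and matching is plain substring search so that multi-word phrases like "contact tracing app" work. It does nothing about the counterfactual problem.

```python
from dataclasses import dataclass
from statistics import mean

# Word batches taken from the post above; extend as needed.
SOLUTIONIST = ("solution", "effective", "efficient", "real-time", "scalable")
EPIDEMIC = ("covid", "coronavirus", "pandemic", "contact tracing app")


@dataclass
class Paragraph:
    text: str
    sentiment: float  # assumed to be precomputed elsewhere, in [-1, 1]
    week: int


def mentions(paragraph, batch):
    """True if any term of the batch appears in the paragraph (substring match)."""
    text = paragraph.text.lower()
    return any(term in text for term in batch)


def weekly_hits(paragraphs, batch):
    """Number of paragraphs per week that mention the batch, oldest week first."""
    hits = {}
    for p in paragraphs:
        hits[p.week] = hits.get(p.week, 0) + mentions(p, batch)
    return [hits[week] for week in sorted(hits)]


def check_hypothesis(paragraphs, batch_a, batch_b):
    """Return the three pieces of evidence and whether any of them occurs."""
    counts = weekly_hits(paragraphs, batch_a)
    both = [p for p in paragraphs if mentions(p, batch_a) and mentions(p, batch_b)]
    evidence = {
        # Crude trend test (an assumption): more hits in the last week than the first.
        "trends": len(counts) > 1 and counts[-1] > counts[0],
        "cooccurs": bool(both),
        "positive": bool(both) and mean(p.sentiment for p in both) > 0,
    }
    # "If none of these occur, the hypothesis is not supported."
    return evidence, any(evidence.values())


# Made-up sample paragraphs.
SAMPLE = [
    Paragraph("Masks were scarce in March.", -0.2, 1),
    Paragraph("A scalable solution for the pandemic is ready.", 0.6, 2),
    Paragraph("The contact tracing app is an efficient, real-time tool.", 0.4, 2),
    Paragraph("Critics doubt the app.", -0.5, 2),
]

evidence, supported = check_hypothesis(SAMPLE, SOLUTIONIST, EPIDEMIC)
print(evidence, supported)
```

Swapping in the anonymity batch and the takeup/adoption batch is then just a matter of passing two different tuples to `check_hypothesis`.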

3 posts were merged into an existing topic: Listening to more data - Applying the DeLab method to questions form the listening session.

Here is the text from the June 3 Webinar:
DE Lab NGI Webinar June 3 2020.txt (50.5 KB)



A post was merged into an existing topic: Transcript - Webinar: Making sense of a COVID-19 world

You can find the transcript of the webinar here: Transcript - Webinar: Making sense of a COVID-19 world

Dear @kristof_gyodi and @michal_wolny, great work, as far as I could follow from here, and on your website as well! Thanks for sharing. Ref to the comment by an X researcher at the end of the webinar, I fully agree and would like to offer you an opportunity to present your research for further dissemination to an interested public. We’re working on a blog that will be connected to this one here: https://boasblogs.org/ The audience is mainly media, data and social informatics researchers interested in critical STS research and data activism: an “Outreach” blog for interdisciplinary work. Would you be interested in writing a blog piece on your work?


Is this the right place for this to be? This topic does not have the ethno-ngi-forward tag, so the ethno team will ignore it. The rest of the topic is certainly about organization, not content, so up to now this has been appropriate.

@amelia what is the protocol for coding transcripts? Do you want to coordinate for making one and updating the coding wiki?

A post was merged into an existing topic: :green_book: Consent Process Manual

Hi @atelli, thank you very much for your support! We would be happy to write a post :) Could you please provide us with some editorial guidelines? My address is k.gyodi@delab.uw.edu.pl
Thanks for the opportunity :)

You can find a conversation about using DeLab’s method for further exploration and contribute your own questions/requests here: