Webinar: Making sense of a COVID-19 world: applying collective intelligence to big data (3rd of June)

MariaEuler · May 13, 2020, 3:41pm

Join the webinar to get introduced to a large body of research on COVID-19-related data, and to interpret and discuss it live together.

How to read and interpret data?

As we at NGI Forward watched SARS-CoV-2 cut into our worlds, we felt the same as everybody else. We feared for loved ones, and for the long-term freedom and prosperity of our societies. We felt powerless, and anxious in the face of uncertainty. We still do.

But, at some point along the way, we realized there was something we could do. We have in our midst a team of data scientists. When the pandemic hit, Kristof and Michal were analysing trends in the evolution of Internet technology. Theirs is a big data method: they do text analysis on large repositories of academic papers, news articles, social media. We realized we could use the same method to investigate the virus and its companion illness. It would be like redirecting our searchlight onto a new target.

So we did that. In the space of only a few weeks, Kristof and Michal came back with results, neatly arranged in an interactive website. And that’s where we need your help.

Here’s the thing: this type of analysis scales well, but its results are not easy to interpret. We can see, for example, that technologies associated with the pandemic fall into four groups: remote work, medical, social distancing, contact tracing. This makes sense at first sight, but what does it mean? That people are using these technologies a lot? That researchers are working to improve these technologies over others? Or is the result an artifact of the way we look at the data? Is there a way that we can determine if there are obvious gaps in our technological arsenal? If we found any, we could look into investing to plug them.

This type of analysis becomes much more actionable when you provide context to it. As I pored over early results, I could not help but notice a positive sentiment for PEPP-PT. Except that, in the very same days, the PEPP-PT consortium had collapsed due to the withdrawal of several research institutes over privacy concerns. Before the collapse, I might have interpreted the positive sentiment as validation: “Look, the tech community likes this protocol!” In the light of the collapse, it made more sense to read it as naïve enthusiasm, or even spin. A possible interpretation could be: “it is a good idea to let the community poke holes in any specific solution before we endorse it”. Another one is: “scientists and engineers are still humans. In a crisis, they reach for a solution, and want it to work. Their critical abilities are, for a time, weakened.”

In sum, we have a large trove of data on COVID-19, which seems to hold important knowledge, if we could just unlock it. So, we propose an experiment. Let’s get together (online), and question the data.
Let’s look at them, turn them around in our heads, interrogate each other on their possible meaning. Let’s formulate hypotheses, and see if we can think of ways to test them. Kristof and Michal can transform them into queries and visualizations in real time, or close enough. In other words, let’s augment big data analysis with collective intelligence methods. It will be a way for us to learn more about both COVID-19 and this particular big data methodology.

When:

3rd of June, 5PM CEST.

How to see what came of it:

Follow this link to reach the transcript of the session:

We will record this webinar for research purposes. You can find more information here: Edgeryders Calls and Webinars? - PARTICIPANT INFORMATION SHEET

This event is part of the NGI Forward project of the Next Generation Internet (NGI) initiative, launched by the European Commission in the autumn of 2016. It has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 825652 from 2019 to 2021. You can learn more about the initiative and our involvement in it at https://ngi.edgeryders.eu

MariaEuler · May 18, 2020, 10:58am

ping @martin, is this interesting for you?

MariaEuler · May 18, 2020, 11:04am

ping @hugi, would this maybe be interesting for the people who followed your COVID crisis updates on Facebook and elsewhere during the first few weeks of the crisis?

kristof_gyodi · May 21, 2020, 3:55pm

For everyone interested in open-source answers to COVID-19: we scanned GitHub and collected metadata of projects.

The most influential (most forked) projects include handy dashboards, medical datasets, data forecasting, and contact-tracing apps. These are interesting in themselves, but we also caught projects that are no longer available for political reasons: China arrests users behind GitHub coronavirus memories page

Join us in a webinar on the 3rd of June to discuss the opportunities and dark sides of COVID-19 tech!
Our analyses are available at: https://covid.delabapps.eu

nadia · May 21, 2020, 4:36pm

cc @stefanoboski useful for those lists…

MariaEuler · May 22, 2020, 8:04am

ping @felix.wolfsteller, @mattias, @unclecj, @Emile, @erik_lonroth is that project metadata interesting for you or someone you know?

ping @BlackForestBoi, would you and the worldbrain crowd maybe be interested in joining this discussion? Your perspective on validating data and sharing it and its interpretations in a trustworthy way would be very welcome!

alberto · May 27, 2020, 3:42pm

Heads up, @krystof and @Michal: I am going to ask you what you can do, with your data, to validate results 1-4 from the surveillance pandemic event. You can find them here. They are intentionally phrased as statements that lend themselves to falsification. Could you give some thought to how to query the dataset so that the response will have something to say about one, or more, of the four results?

Feel free to ping me if you want to discuss this.

MariaEuler · June 2, 2020, 3:17pm

Here are some of the questions we would like to discuss with you tomorrow:

One of the things that stood out to @johncoate when looking at your data was that anxiety is up while reported depression seems to have gone down during the crisis.
How could we interpret that? What other enquiries/studies would need to be made to understand this better?
Could we use the Mental health graph on https://covid.delabapps.eu/ to explore how to read and interpret such data - and how not to?

During the surveillance pandemic session, one of our results was that “Locational data are impossible to anonymize and of limited utility. Capacity for data governance is bad”. How do you source and deal with locational data? How do you evaluate it? You have graphs regarding contact tracing on https://covid.delabapps.eu/, maybe we can use that to focus the conversation.

Can you “find” data without searching? How did you set out to select what you would collect and display for COVID-19 NGI Forward?
What has been the most striking (data) point or pattern you have found in your COVID data research?
How would you “read” or interpret the GitHub Repository map you provide on your platform, and what does that tell us about the international open-source communities?

How does a data scientist emotionally process what they learn? And how does everyone else? We are dealing with large datasets which contain the lives and deaths of millions of people. How does one fathom that? Should one try to fathom that? Statistically feeling bad

kristof_gyodi · June 3, 2020, 11:50am

We can check 4 things on the go in news articles:

1. if a term is trending
2. which are the co-occurring terms
3. sentiment of the paragraphs containing a term (on a scale from -1 to 1)
4. most pos/neg co-occurrences based on the sentiment of paragraphs containing both terms

Points 2-4 take a few minutes to compute.

The insights mentioned are very nuanced for this type of analysis, so we can address them only in general terms. We have some analyses on contact tracing (concepts, initiatives, open source projects); "passport" has not been picked up in the analysis (it is not trending).
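
For readers who want to see what these four checks amount to, here is a minimal sketch in Python. It is not DeLab's actual pipeline: the toy sentiment lexicon, the function names, the single-word terms and the earlier/later comparison used for "trending" are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy lexicon; a real analysis would use a trained sentiment model.
POSITIVE = {"effective", "promising", "safe"}
NEGATIVE = {"breach", "risk", "failure"}

def tokens(paragraph):
    """Lowercase a paragraph and return its set of words, punctuation stripped."""
    return {w.strip(".,;:!?()\"'").lower() for w in paragraph.split()} - {""}

def sentiment(paragraph):
    """Score a paragraph in [-1, 1] from lexicon hits; 0.0 when there are none."""
    words = tokens(paragraph)
    pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)

def is_trending(term, earlier, later):
    """Check 1 (assumed definition): the term's share of paragraphs has grown."""
    def share(paragraphs):
        if not paragraphs:
            return 0.0
        return sum(term in tokens(p) for p in paragraphs) / len(paragraphs)
    return share(later) > share(earlier)

def cooccurring(term, paragraphs):
    """Check 2: count the words that share a paragraph with the term."""
    counts = Counter()
    for p in paragraphs:
        words = tokens(p)
        if term in words:
            counts.update(words - {term})
    return counts

def term_sentiment(term, paragraphs):
    """Check 3: mean sentiment of paragraphs containing the term, or None."""
    scores = [sentiment(p) for p in paragraphs if term in tokens(p)]
    return sum(scores) / len(scores) if scores else None

def pair_sentiment(term, paragraphs):
    """Check 4: mean sentiment of paragraphs containing the term and each other word."""
    scores = {}
    for p in paragraphs:
        words = tokens(p)
        if term in words:
            s = sentiment(p)
            for other in words - {term}:
                scores.setdefault(other, []).append(s)
    return {w: sum(v) / len(v) for w, v in scores.items()}
```

Sorting the result of `pair_sentiment` by value would give the most positive and most negative co-occurrences. Multi-word terms such as "contact tracing" would need phrase matching, which this sketch leaves out.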

alberto · June 3, 2020, 1:05pm

OK, let’s see:

A surveillance pandemic? Results of the community listening post on risks for freedom in the wake of COVID-19

Result 1: there is cause to worry, but also leverage for defense

There are several good reasons to be on our guard.

Policy makers tend to overestimate the effectiveness of technology-based surveillance vis-a-vis the pandemic. People spoke of pervasive solutionism (in the sense of Evgeny Morozov – “a little magic dust can fix any problem”).

Digital surveillance companies are treating COVID-19 as a business opportunity. Some of these have dubious track records on the respect of human rights online. In the words of one participant:

In the last week, it’s been reported that around a dozen governments are using Palantir software and that the company is in talks with several more. They include agencies in Austria, Canada, Greece and Spain, the US, and the UK.

The public is scared, so willing to accept almost anything.

Look for “solutionist” language (solution, effective, efficient, real-time, scalable etc.). See if it trends (but do we have a counterfactual?); see if it co-occurs with epidemics language (COVID, Coronavirus, pandemic) and tech solutions thereto (contact tracing app), especially linked to surveillance companies (Palantir, others?). See if the sentiment is positive. If none of these occur, the hypothesis is not supported.
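
As a rough sketch of how this test could be written down, assuming each paragraph already carries a period label and a sentiment score in [-1, 1]: the `Paragraph` record, the "before"/"after" split and the word lists below are illustrative assumptions, not the structure of the DeLab dataset. The before/after comparison is only a crude stand-in for a real counterfactual.

```python
from dataclasses import dataclass

# Word lists taken from the suggestion above; extend as needed.
SOLUTIONIST = {"solution", "effective", "efficient", "real-time", "scalable"}
EPIDEMIC = {"covid", "coronavirus", "pandemic"}

@dataclass
class Paragraph:
    text: str
    period: str       # "before" or "after" an assumed cut-off date
    sentiment: float  # assumed to be precomputed, in [-1, 1]

def words(text):
    """Lowercase the text and return its set of words, punctuation stripped."""
    return {w.strip(".,;:!?()\"'").lower() for w in text.split()}

def evidence(paragraphs):
    """Return the three signals: trend, co-occurrence, positive sentiment."""
    solutionist = [p for p in paragraphs if words(p.text) & SOLUTIONIST]

    def share(period):
        total = [p for p in paragraphs if p.period == period]
        hits = [p for p in solutionist if p.period == period]
        return len(hits) / len(total) if total else 0.0

    both = [p for p in solutionist if words(p.text) & EPIDEMIC]
    return {
        "trending": share("after") > share("before"),
        "co_occurs": len(both) > 0,
        "positive": bool(both) and sum(p.sentiment for p in both) / len(both) > 0,
    }

def supported(paragraphs):
    """The hypothesis is not supported only if none of the signals occur."""
    return any(evidence(paragraphs).values())
```

Company names such as Palantir could be handled the same way, with a third word list intersected against each paragraph.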

A surveillance pandemic? Results of the community listening post on risks for freedom in the wake of COVID-19

Result 2. Contact tracing apps are ineffective against COVID-19, but may help in the next pandemic

[…] Among failure modes, people cited:

Data governance issues: possible breaches, difficulty to anonymize the data, and so on. More on this below.

Lock-in effects: for these apps to work, they need 50-60% of the population to take them on. It’s a “winner-takes-all” service. There is potential for companies to lock authorities into long term contracts, invoke all kinds of confidentiality to protect their business models, and so on. This situation could prevent better solutions from emerging.

Loss of confidence: if the authorities roll out an app, and it does not deliver, the public may lose confidence in any app. This could happen as new cases rise again after lockdown is loosened, as is happening currently across Asia. This might burn an opportunity to help contain the next pandemic at an early stage.

Repeat the game: identify a batch of words related to anonymity (anonymize, breach, locational) and another linked to the possible issues identified by the Surveillance Pandemic focus group (takeup, adoption, lock-in, credibility, confidence), then look at occurrence, co-occurrence and sentiment.
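
One possible way to count co-occurrence across the two batches, sketched in Python. The function name is hypothetical and the matching is exact, so inflected forms such as "anonymized" or "breaches" would be missed; a real pipeline would presumably lemmatize first.

```python
from collections import Counter
from itertools import product

# The two batches of words proposed above.
ANONYMITY = {"anonymize", "breach", "locational"}
ISSUES = {"takeup", "adoption", "lock-in", "credibility", "confidence"}

def cross_cooccurrence(paragraphs):
    """Count paragraphs in which a word from each batch appears together."""
    counts = Counter()
    for text in paragraphs:
        found = {w.strip(".,;:!?()\"'").lower() for w in text.split()}
        # Every (anonymity word, issue word) pair present in this paragraph.
        for pair in product(sorted(found & ANONYMITY), sorted(found & ISSUES)):
            counts[pair] += 1
    return counts
```

The resulting counts can be read as a small table with one batch on each axis; sentiment per pair could then be averaged over the same paragraphs.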

You get the idea. We can do the same with results 3 on immunity passports and 4 on locational data.

MariaEuler · September 10, 2020, 1:19pm

3 posts were merged into an existing topic: Listening to more data - Applying the DeLab method to questions form the listening session.

johncoate · June 23, 2020, 7:04pm

Here is the text from the June 3 Webinar:
DE Lab NGI Webinar June 3 2020.txt (50.5 KB)

@kristof_gyodi
@LouisSH

MariaEuler · September 10, 2020, 1:07pm

A post was merged into an existing topic: Transcript - Webinar: Making sense of a COVID-19 world

MariaEuler · June 24, 2020, 1:27pm

You can find the transcript of the webinar here: Transcript - Webinar: Making sense of a COVID-19 world - #2 by MariaEuler

atelli · June 26, 2020, 10:24am

Dear @kristof_gyodi and @michal_wolny, great work, as far as I could follow from here as well as on your website! Thanks for sharing. Regarding the comment by an X researcher at the end of the webinar, I fully agree and would like to offer you an opportunity to present your research for further dissemination to an interested public. We’re working on a blog that will be connected to this one here: https://boasblogs.org/ Its audience is mainly media, data and social informatics researchers interested in critical STS research and data activism: an “Outreach” blog for interdisciplinary work. Would you be interested in writing a blog piece on your work?

alberto · June 29, 2020, 1:16pm

Is this the right place for this to be? This topic does not have the ethno-ngi-forward tag, so the ethno team will ignore it. The rest of the topic is certainly about organization, not content, so up to now this has been appropriate.

@amelia what is the protocol for coding transcripts? Do you want to coordinate for making one and updating the coding wiki?

MariaEuler · September 10, 2020, 1:24pm

A post was merged into an existing topic: Consent Process Manual

kristof_gyodi · July 3, 2020, 7:47am

Hi @atelli, thank you very much for your support! We would be happy to write a post. Could you please provide us with some editorial guidelines? My address is k.gyodi@delab.uw.edu.pl
Thanks for the opportunity

MariaEuler · September 10, 2020, 1:28pm

You can find a conversation about using DeLab’s method for further exploration and contribute your own questions/requests here: