AMA - Data Openness and Covid

MariaEuler · February 05, 2021 13:40

For over ten years, we have been hearing that data have become a key resource, “the new oil”. In combination with the Internet as a general delivery infrastructure and new data processing techniques, like machine learning, they power new services. Some of these are potentially benign (recommendation algorithms), others decidedly creepy (facial recognition); both types are changing our societies. Estimates circulated in 2019 put the projected value of the data economy in the EU at over 1 trillion euro.

Civic-minded technologists have argued that many important datasets, especially those produced by the public sector or funded by taxpayers, should become public goods, accessible to all without restriction. Open, machine-readable public information would not only benefit the economy, but improve transparency and strengthen democracy. Policy makers, at least in Europe, heard the call; the EU Public Sector Information Directive dates back to 2003, and got an overhaul as the Open Data Directive in 2019.

It looks like all stakeholders agree on the strategic importance of open data as a common societal resource. And yet, after all this time, when the COVID-19 crisis erupted in early 2020, the open data pipelines burst at the seams. In many countries, it has been surprisingly difficult for researchers, journalists and citizens to access even basic data on how the pandemic played out. And this happens at a time when authorities are making controversial decisions to limit the freedoms of citizens – decisions that, naturally, citizens would like to better understand.

Why, after all this time and lofty rhetoric, are we so bad at basic open data?
Is it a capacity problem, with public servants working out of Excel sheets rather than with robust data delivery pipelines?
Is it the temptation to use information on COVID-19 strategically, which struggling politicians simply cannot resist? Or what else?


We are Giorgia Lodi and Andrea Borruso, and we care about the issue. Andrea is a co-founder of Italian civic tech NGO OnData. OnData launched a petition to prime minister Giuseppe Conte to release all epidemiological data as open data, and to open the algorithms that convert those data to risk indicators (like Rt) that the authorities base their policy decisions on. To date, the petition has 46,000 signatories, including 162 organizations. Meanwhile, Giorgia, formerly with the Italian government agency for the digital society, is involved in an epic struggle for the openness of the data in Italy.

Join us and others on March 2nd at 18:30 CET to AMA about the openness of covid data!

What is an AMA and how does it work?
AMA, or Ask Me Anything, is an interviewing format popularised via Reddit. In short, you ask the featured “expert” questions, and they answer them live for an hour.

How does it work?
Anyone is welcome to post questions/comments below. On the announced date, the featured speaker(s) come back to this thread to start answering questions they find interesting. They will do their best to reply to as many questions as possible, but please note that not all questions/comments will be addressed.

Who can join?
Anyone! As the conversations get going during the hour, you will see multiple threads naturally emerging. There is an open invitation for everyone to contribute–please feel free to reach out to other community members, either on the thread or via DM, to continue the discussion!

Register here to get a reminder and have the link resent an hour before the event: https://tell.edgeryders.eu/15357

MariaEuler · March 02, 2021 16:03

18 posts were merged into an existing topic: AMA on COVID-data planning - WIP

Mcx · February 15, 2021 19:05

What would you like to ask Giorgia Lodi and Andrea Borruso:

What do you think will be the next big topic to focus on and to demand open data after the Pandemic crisis?

mfortini · February 15, 2021 19:27

What would you like to ask Giorgia Lodi and Andrea Borruso:
Thinking in terms of baby steps, which would be the first 3-4 steps to move forward?

GregorySech · February 15, 2021 23:49

What would you like to ask Giorgia Lodi and Andrea Borruso:

alberto · February 16, 2021 11:03

My question to @giorgia.lodi and Andrea Borruso refers to an episode of last year, that we learned about recently. A German magazine called Welt am Sonntag got hold of intense correspondence between Germany’s interior ministry and scientists at the Robert Koch Institute. In these emails, state secretary Markus Kerber instructed researchers to provide a model predicting some kind of Coronapocalypse, so as to scare Germans into accepting lockdown (source, French).

Researchers complied. In only 4 days, they developed a report with a worst case scenario showing that over a million Germans could die of COVID. It was declared secret, but then “distributed via various media” in the following days. In the wake of these revelations, Kerber had to appear before a Bundestag commission, where he declared that his boss, interior minister Horst Seehofer, had greenlighted the operation (source, German). Seehofer has not resigned, and is now busy closing Germany’s borders.

This story gives three hints, on which I would like your opinion.

First: there is a structural tendency of governments towards a strategic use of information. Open data and open modelling limit such use, so that is an obstacle to their adoption.

Second: there is an issue with the independence of the institutions whose job is to gather, publish and use the data. Science should be independent, but can the Koch Institute really annoy the all-powerful federal minister? Here I see a role for civil society, and organizations such as Andrea’s own OnData.

And third: publicly accessible data can help in building public trust at a time when even governments manufacture fake news (though, in Seehofer’s defense, his intentions might have been good). But, de facto, this does not seem to be working. We have (some) data, but trust is super low. In Belgium anti-vaxxers run rampant.

What, in your opinion, is going on?

muredduf · February 18, 2021 12:10

What would you like to ask Giorgia Lodi and Andrea Borruso:
- To what extent is keeping the Covid data closed hampering the fight against the pandemic?

saramarcucci_nestait · February 18, 2021 14:48

What would you like to ask Giorgia Lodi and Andrea Borruso:

Glen · February 19, 2021 06:49

What would you like to ask Giorgia Lodi and Andrea Borruso:

johncoate · February 20, 2021 01:38

What would you like to ask Giorgia Lodi and Andrea Borruso:
I have read your petition and I followed the link to Integrated COVID-19 Surveillance of the Istituto Superiore di Sanità. The “open data” they offer shows cases by date and cases by region. Is what you are demanding the right to see the raw data that creates those totals? How personal is that data if it is disaggregated?

pcmasuzzo · February 22, 2021 11:06

What would you like to ask Giorgia Lodi and Andrea Borruso:
What’s thing number one you would do (with the data, but not only) if Italian COVID data actually became open and accessible to everyone?

Markus_D · February 24, 2021 09:57

Thanks, this looks really interesting!

What would you like to ask Giorgia Lodi and Andrea Borruso:

What kinds of public sector/COVID data should generally NOT be made accessible and why?
How do you reconcile the desire for open data with individual rights, such as privacy (e.g. some ostensibly “non-personal” data can be de-anonymised or aggregate data used to discriminate against groups)?
If you already hold large data sets and have significant resources, you’re better-placed to derive additional value from new data. Does open data run the risk of exacerbating centralisation in the digital economy? Are there any progressive or differential approaches to public sector data sharing that can address this imbalance?

alberto · March 01, 2021 16:04

I have a second question, which concerns data on the COVID tracing apps. Back in the spring of 2020 there was a lot of attention, with most major governments launching their own apps, and a lively debate on whether those apps would work, and whether their benefits were worth the risks to privacy. We also participated in that debate. People in this community tended to see the emphasis on apps as a sort of technological Hail Mary, and were cautious about the usefulness of the apps in this pandemic, though more upbeat about the next one.

Almost a year later, we are left wondering about the effects of those efforts. At least, we should have ample grounds for an evaluation. But we are left wondering: the Italian one has been criticized; so, apparently, was the German one. There are more positive analyses for Switzerland (paper, paywalled) and, especially, the UK, where it is estimated that every patient who tested positive and agreed to notifying her condition via the app resulted in 1-2 averted infections (paper).

In all this, the Italian app has the merit of maintaining a repo of open data on GitHub. The English/Welsh NHS COVID-19 app also has a page of downloadable data, although I cannot find a license. But it is not so for the Scottish and Northern Irish versions of the app. One of the saddest cases might be that of Belgium, where I live: no official data at all. A recent working paper by U Gent had to resort to an online survey. Only 2.2% of respondents had ever received a notification of exposure from the app (paper).

What do you make of this situation? Why are these data so difficult to obtain? And who do you think is publishing them? Is it the companies who maintain the software (that seems the case, as I cannot find the data from the app on the Italian government’s open data portal, they appear to be only on GitHub)? Is it the public health administrations?

Update:

French data
Swiss data
England-and-Wales data (not sure these are open, cannot find the license).

atelli · March 02, 2021 14:58

I think one of the main reasons for the resistance to governments and health authorities collecting this data via the apps is that the duration of data aggregation is unclear, and so is the temporality of the pandemic. This might only be the start of an era where such digital certifications, facilitated by AI tech, will be issued with less and less concern for the privacy of individuals. We already know how the pandemic is creating extra vulnerabilities for people on the move, and around border regimes, class-based inequalities, etc. There is now a strong debate about vaccination certificates and whether these should be only in paper form or also in a digitized version that talks to the tracking app data. There are many concerns regarding travel rights, the right to not be vaccinated… The reason for data openness not being there yet is also the bleak approach of EU policies as well as national policies on this. The whole thing involves a multi-layered approach surrounded by many actors; even the health authorities and their relation to state governments as well as EU governance are a problematic issue at the macro level. The micro and meso levels require more nuance and include insider wars at the moment.

giorgia.lodi · March 02, 2021 17:22

We definitely need to focus on the quality of the data and on sustainable processes! Currently, at least in my country, we are still struggling with this, and the results show it. In addition, from my point of view there is a serious problem with health data and their degree of openness. It is true that it is not an easy domain, because it is important to combine openness and privacy. Probably I would invest in this. It is so important to find the right balance: privacy cannot be an obstacle, but rather a right to be always preserved while keeping the overall knowledge open.

Asjad · March 02, 2021 17:24

I will join the live chat on March 2nd, 18:30 CET:

alberto · March 02, 2021 17:27

Hey @Asjad, you made it! Welcome!

giorgia.lodi · March 02, 2021 17:27

Not sure they are baby steps at all, but I am sure they are necessary and we really need to start from these:

1. Start training civil servants and raise awareness in an even larger number of communities. We need to change the mindset: “the data is mine and I decide how and when to open or share it” is something OLD that prevents us from moving forward. It is no longer acceptable in the Internet, Web, Social and AI era. We need to start slowly changing the way things are done; therefore point 2.
2. Revise processes. Take one important case with a big impact and work on the overall process and governance in order to identify blocking elements and to let open data be, by design, part of the process. The process will become sustainable, the data will probably become of high quality and an impact, I am sure, will show up!
3. Start reasoning in terms of community. Public/private/research institutions and communities with their organisations should start working TOGETHER and not in silos! Promote greater collaboration among these actors. It is not necessarily bad if it is private.

aborruso · March 02, 2021 17:28

The only important step is that “open data” goes away. At least for the data of the public administration.
Let me try to explain.
Open data must no longer be something special, but one of the automatic consequences of the work on public information assets. It must be like opening a water tap at home, a fact that can be taken for granted: quality assurance, available on request, whose contact person is known, whose process documentation exists, etc.

Enough talking about open data; let’s talk about data and their value.