I came across a magnificent essay from Harvard researcher Maxwell Neely-Cohen about long-term archival of data. It starts by asking a deceptively simple question: what would you do to store some data so that they are retrievable in 100 years?
The hidden complexity is that digital forms of storage depend on entire ecosystems: physical storage media, hardware to read them, firmware to drive that hardware, application software to read the data, operating systems for the computers that the application software runs on, all the way through to industry standards and funding mechanisms. For the data to be still readable in 100 years, all of these components must still be functional a century from now, which means entrusting their stewardship to organizations that will not wither, collapse or lose interest.
The essay is excellent, and it made me think of @matthias, one of the few minds comfortable grappling with “hardcore sustainability” issues. So, I made a mental bookmark to share it with him; but then, in the spirit of archiving for accessibility, why not start that conversation on the Edgeryders forum? So here we are.
In truth, what I want to discuss with Matt is not so much the theoretical question of century-scale storage, but a more practical one. How long do we think Edgeryders data will remain available? What, if anything, should we do to extend this period?
Notice that we are already an outlier in the modern Internet landscape. Back in the 2000s, when it became common practice for government-sponsored projects to launch their own websites, it became very clear to me that these websites were abandoned the second the project was wrapped up and the funding dried up. By “abandoned” I don’t mean “retired”, but left online with nobody home. Within one year, the domain name registration would expire; even if you had saved the IP address that the domain name pointed to, the bitrot would set in soon after.
I hated this way of working with a passion. On the one hand, in the age of the so-called Web 2.0, I saw the rhetoric of participation being deployed everywhere: get involved! Your voice counts! Collaborate! Join the community! On the other hand, if you did make an effort to participate, that effort was not honored. How can “your voice” be taken seriously, when the time and effort you spent in these online participatory processes resulted in digital artefacts that were abandoned to bitrot almost as soon as you produced them?
Bitrot and neglect were also the expected fate of Edgeryders, which started in 2011 as a European Commission-funded project of the Council of Europe to learn how young Europeans were transitioning to adult life during a time of economic crisis.
As was foretold, the project ended in June 2012. The Council of Europe had zero interest in keeping the website online with no one to foot the bill, however small it might be. What happened then was this: as project director, I had managed to convince my superiors to publish the entire forum content as open data. That meant we could (technically) export the site’s database and re-import it somewhere else. To make this possibility a reality, we needed three more things:
- An organization to shoulder the burden.
- A revenue stream to fund the organization.
- A business model for the organization that made use of the forum and the conversation it hosted, to incentivize the organization to take good care of both the data and the community.
Solution: aggregate and sell the voice of “people from the edge” of new lifestyles and societal practices. In 2013, we registered a social enterprise called Edgeryders and started to look for opportunities.
This is not the place to look back on the entire Edgeryders history. But, from an archival perspective, the result of all this is that 2025 is here, and the very first Edgeryders posts (September 2011) are still online. For example, here is Nadia reporting from the meeting of the 15M movement in Barcelona. If you still remember it, you are an outlier too.
This was not simply “storage”. The initial Edgeryders platform ran on a Drupal 6 distribution called “Community”, which in retrospect was a nightmare because it did a lot of things badly. When we first migrated, we upgraded to Drupal 7, but that too was poorly maintained and left users very frustrated, as they were increasingly getting used to commercial social media, especially Facebook, and their much sleeker user experience. In 2017 Matt led a second migration onto the current Discourse platform, which has itself been upgraded multiple times over the years.
From a content perspective, the data were also not simply “stored”, but greatly expanded. In June 2012 we were looking at fewer than 200 users having written 2,500 posts. As I write this, this forum has nearly 8,000 accounts, of which 3,700 have contributed at least one post to its grand total of over 140,000 posts. Since we migrated to Discourse, the https://edgeryders.eu forum has served almost 9 million pages.
Since 2019, the forum has also served as the main online workspace of a group of people, mostly in Brussels, who share an interest in cohousing, and of which I myself am part. In late 2021, we decided to launch a formal cohousing project: over the three years leading to the end of 2024, this resulted in a very lively and ongoing debate (11,000 posts in The Reef), as well as in the purchase of a 1,600 m² plot of land in Brussels for a million euros.
I do not know a single EU project that can claim anything remotely like this. So, in a sense we are already unusually good at long-term storage. Additionally, the Edgeryders company is just starting a new project, so there is no danger that this forum will shut down any time soon. But it is probably a good time to start thinking about exit strategies and worst-case scenarios. Also, it is so interesting! So, if anyone has ideas, I’m listening.