What if we turned the sci-fi economics wiki into data?

alberto · July 29, 2025, 2:45pm

Many moons later, somehow, the stars aligned, and it came to pass that @yudhanjaya came to visit us in Brussels. It was really wonderful to finally meet in person! And, as a very welcome side effect of that encounter, we ended up revisiting my old project of turning our Sci-Fi Economics wiki into some kind of structured data.

This is supposed to be the age of Artificial Intelligence, and for once entity extraction is actually a good use case for modern-day Large Language Models. So Yudha cooked up, almost on the fly, an entity extractor that reads reviews of science fiction books, extracts economic concepts from them and arranges them in a JSON file. The magic of entity extraction is farmed out to an appropriately prompted LLM; the entire thing is packaged into a Python script that calls the LLM’s API.
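For readers who want to picture the pipeline, here is a minimal sketch of how such a script could be organised. It is not Yudha's code: the prompt wording, the JSON shape and the `call_llm` parameter (a stand-in for whatever client actually talks to the model) are all assumptions made for illustration, and `fake_llm` only exists so the sketch runs offline.

```python
import json
from typing import Callable

# Hypothetical prompt: the wording is an assumption, not the one in the real script.
PROMPT_TEMPLATE = (
    "Read the following review of a science fiction book. "
    "List the economic concepts it discusses. "
    "Reply with a JSON array of strings and nothing else.\n\n"
    "Review:\n{review}"
)


def extract_concepts(title: str, review: str, call_llm: Callable[[str], str]) -> dict:
    """Ask the model for the economic concepts in one review and return a record."""
    raw = call_llm(PROMPT_TEMPLATE.format(review=review))
    try:
        concepts = json.loads(raw)
    except json.JSONDecodeError:
        concepts = []  # the reply was not valid JSON; keep the book with no concepts
    if not isinstance(concepts, list):
        concepts = []
    # Trim whitespace, drop empty or non-string items, de-duplicate and sort.
    cleaned = sorted({c.strip() for c in concepts if isinstance(c, str) and c.strip()})
    return {"title": title, "concepts": cleaned}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real API call (assumption), returning a canned reply."""
    return '["Job Guarantee", "Markets", " Markets "]'


record = extract_concepts("Example Book", "(review text goes here)", fake_llm)
print(json.dumps(record, indent=2))
```

Passing the model call in as a function keeps the parsing and clean-up logic testable without network access; swapping `fake_llm` for a real client would be the only change needed.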

The main problem is getting the granularity of the concepts right. For example, “economics” is not granular enough – it contains too much stuff that we want to resolve into more operational concepts. On the opposite side, “market mechanisms” is probably too granular, especially if “markets” is also present, since these two expressions have the same meaning in economics discourse. So I did some tweaking to the code and fed it the economic analyses of 25 sci-fi works (mostly novels) that had been posted on this forum over the years.

The entire list of titles and their authors:

2312, by Kim Stanley Robinson

A Half-Built Garden, by Ruthanna Emrys

Another Now, by Yanis Varoufakis

Autonomous, by Annalee Newitz

Distraction, by Bruce Sterling

Freedom TM, by Daniel Suarez

Gamechanger, by L. X. Beckett

Makers, by Cory Doctorow

MetaGame, by Sam Landstrom

New York 2140, by Kim Stanley Robinson

Numbercaste, by Yudhanjaya Wijeratne

Our Shared Storm: A Novel of Five Climate Futures, by Andrew Dana Hudson

Red Plenty, by Francis Spufford

Scholomance Trilogy, by Naomi Novik

Stealing Worlds, by Karl Schroeder

The Caryatids, by Bruce Sterling

The Culture Series, by Iain Banks

The Dispossessed, by Ursula Le Guin

The Lost Cause, by Cory Doctorow

The Ministry for the Future, by Kim Stanley Robinson

The Moon is a Harsh Mistress, by Robert A. Heinlein

The Terraformers, by Annalee Newitz

Utopia Five, by A. E. Currie

Walkaway, by Cory Doctorow

Webs of Varok, by Cary Neeper

I ended up with 140 economic concepts associated with the books; I then narrowed them down to 135 by merging some that were very close (for example Creative Destruction and Disruptive Innovation). Next, because everyone loves a graph, I used Tulip to induce a 2-mode network where books connect to economic concepts. Books are represented as green nodes, economic concepts as blue nodes.
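The merge-then-link step can be sketched in a few lines. This is only an illustration under stated assumptions: the book titles and their concept lists are placeholders rather than the real extraction output, and `MERGE` and `build_edges` are hypothetical names. Only the Creative Destruction / Disruptive Innovation pair comes from the text above.

```python
import csv
import io

# Placeholder extraction output (assumption): book -> list of concepts.
extracted = {
    "Book A": ["Creative Destruction", "Markets"],
    "Book B": ["Disruptive Innovation", "Job Guarantee"],
    "Book C": ["Markets"],
}

# Near-synonyms mapped onto one canonical label.
MERGE = {"Disruptive Innovation": "Creative Destruction"}


def build_edges(extracted, merge):
    """Return sorted, de-duplicated (book, concept) pairs after merging synonyms."""
    edges = set()
    for book, concepts in extracted.items():
        for concept in concepts:
            edges.add((book, merge.get(concept, concept)))
    return sorted(edges)


edges = build_edges(extracted, MERGE)
concepts = {concept for _, concept in edges}

# Serialise the 2-mode network as a CSV edge list that a graph tool can load.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["book", "concept"])
writer.writerows(edges)
print(buffer.getvalue())
```

After the merge, Book A and Book B share a concept node, which is exactly how merging close concepts adds connections between books.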

Since some economic concepts feature in more than one book, the network is a small-world network, where most of the nodes can reach one another through shared concepts.

What does this tell us? As a macro property, not much. The fictional economies in 22 of my 25 books appear to share some features, with only three of them forming “islands” of concepts to the north of the graph. The action – if there is any to be had – is going to be local, with economic institutions and concepts gluing together pieces of work from different authors describing different worlds. Community detection analysis might reveal “families” of economic concepts that are not obvious to scholars – a contribution of sci-fi authors as a group to economic analysis. Another approach would be to extract not generic “economic concepts”, but some more homogeneous – hence tractable – aggregate: for example, economic policies (like Job Guarantee), or indicators (like Social credit scoring).
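Spotting the main cluster and the islands does not need any special tooling. The sketch below finds connected components of a book-concept edge list with a breadth-first search. Note that this is plain connected components, a much cruder tool than community detection, and that the edge list is made up for illustration (the names `edges` and `connected_components` are hypothetical, not taken from the repo).

```python
from collections import defaultdict, deque

# Placeholder (book, concept) edges (assumption), not the real data.
edges = [
    ("Book A", "Markets"),
    ("Book B", "Markets"),
    ("Book B", "Job Guarantee"),
    ("Book C", "Job Guarantee"),
    ("Book D", "Concept X"),  # an island: shares no concept with the other books
]


def connected_components(edges):
    """Return the components of the 2-mode network, largest first."""
    adjacency = defaultdict(set)
    for book, concept in edges:
        # Tag nodes by type so a book and a concept with the same name stay distinct.
        b, c = ("book", book), ("concept", concept)
        adjacency[b].add(c)
        adjacency[c].add(b)

    seen = set()
    components = []
    for start in list(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


components = connected_components(edges)
for index, component in enumerate(components):
    books = sorted(name for kind, name in component if kind == "book")
    print(index, len(component), books)
```

The first component plays the role of the giant component; any further components are the islands.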

There is also something to be said for looking at methods for extraction other than LLMs – for example ethnographic coding. There are some puzzling phenomena in the data: why did the bot extract Social credit scoring from Karl Schroeder’s Stealing Worlds and Sam Landstrom’s MetaGame, but not from Yudha’s own Numbercaste? Yes, Yudha frames his number more in terms of reputation, but it is a score, and it maps to social credit, and it is the entire point of the novel. I could redo the coding manually, or just edit the bot’s coding – in this case I could merge Social credit scoring with Algorithmic reputation system, which would connect Numbercaste and the economic concepts therein to the graph’s giant component.

Anyway, this is obviously a sort of “hello world”, rather than a piece of research. Grateful for ideas on how to improve it, especially from @zazizoma, @petussing, @yudhanjaya, @Nica… The GitHub repo is here, including graphs and input files.