Edgeryders and the quest for collective intelligence: a research agenda

I am knee-deep in the research work for opencare. I think I am learning new things on how to use collective intelligence in practice. This has far-reaching implications for my own work in Edgeryders, and beyond. Far beyond, in fact. If we crack collective intelligence, we gain access to a new source of cognition. Forget my own work; this has profound implications for the future of our species. If you think that’s radical, go read the work of cultural evolution scholars like Boyd, Richerson or Henrich. They think homo sapiens has started a major transition: evolutionary forces are pulling us towards a larger, more integrated “collective brain”. We are en route to becoming to primates what ants are to flies.

Collective intelligence is an elusive concept. It appeals to intuition, but it is hard to define and harder to measure and model. And yet, model it we must if we are to go forward. The good news is: I think I see a possible way. What follows is just a back-of-the-envelope note, plotting a rough course for the next three years or so.

1. Data model: semantic social networks

I submit that the raw data of collective intelligence are in the form of semantic social networks. By this term I mean a way to represent human conversation. The representation is a social network, because it involves humans connected to each other by interactions. And it is semantic, because those interactions encode meaning.
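To make this concrete, here is a minimal sketch (Python, standard library only; the people, post ids and codes are invented for illustration) of how such a representation could be stored as two linked layers: a social layer of who-interacts-with-whom, and a semantic layer of the meaning each interaction carries.

```python
from collections import defaultdict

# Each interaction: (author, person replied to, ethnographic codes).
# People, post ids and codes are invented examples.
interactions = [
    ("alice", "bob", {"community care", "trust"}),
    ("bob", "alice", {"trust"}),
    ("carol", "alice", {"diy medicine"}),
]

social = defaultdict(int)    # social layer: who talks to whom, and how often
semantic = defaultdict(set)  # semantic layer: what each pair's exchange is about
for author, target, codes in interactions:
    social[(author, target)] += 1
    semantic[frozenset((author, target))] |= codes

print(semantic[frozenset(("alice", "bob"))])  # the meaning carried by one link
```

The point of the two layers is that neither alone is enough: the social layer without codes is just a who-replied-to-whom graph, and the codes without the network are just a tag cloud.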

2. Network science: it's all in the links

Collective intelligence is not additive: it’s interactional. We can only generate new insight when the information in my head comes into contact with the information in yours. So, what makes a collectivity more or less smart is the pattern of linking across its members. Network science is what allows a rigorous study of that linking, looking for the patterns of interaction associated with the smartest behaviors.
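As a toy illustration of studying that linking, the sketch below computes network density, one of the crudest measures of how much of the information held by members can actually come into contact. The members and links are invented for illustration.

```python
from itertools import combinations

# Toy conversation network: which pairs of members have interacted.
# Members and links are invented examples.
links = {("ann", "ben"), ("ben", "cleo"), ("cleo", "ann"), ("cleo", "dan")}
members = {m for pair in links for m in pair}

# Density: realized links over possible links. A crude first proxy for
# how much the information in different heads can mix.
possible = len(list(combinations(members, 2)))
density = len(links) / possible
print(round(density, 2))  # 4 links out of 6 possible pairs
```

Real analyses use much richer measures (centrality, clustering, community structure), but they all start from this same move: reduce the conversation to a graph, then interrogate the graph.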

3. Ethnography: harvesting smart outcomes

Suppose we accept that the hive mind can generate powerful insights and breakthroughs. How can we, individual human beings, lift them from the surrounding noise? Looking at what individual members of the community say and do would likely be fruitless. The problem is understanding how the group represents to itself the issue at hand; no individual you ask will be able to hold all the complexity in her head. We do have a discipline that specializes in this task: ethnography. Ethnographers are good at representing a collective point of view on something. Their skills are useful to understand just what the collective intelligence is saying.
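One way to lift a collective representation out of individual contributions, sketched below with invented posts and codes, is to count which ethnographic codes co-occur in the same contribution: codes that travel together hint at how the group as a whole frames the issue, even though no single member put it that way.

```python
from collections import Counter
from itertools import combinations

# Each post, once coded by an ethnographer, reduces to its set of codes.
# Posts and codes are invented examples.
coded_posts = [
    {"trust", "community care"},
    {"trust", "diy medicine"},
    {"community care", "diy medicine", "trust"},
]

# Count how often each pair of codes appears in the same post.
cooccurrence = Counter()
for codes in coded_posts:
    for pair in combinations(sorted(codes), 2):
        cooccurrence[pair] += 1

print(cooccurrence[("community care", "trust")])  # these two codes travel together
```

The co-occurrence counts can themselves be read as a network of codes, which connects this step back to the network science of the previous section.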

4. "Shallow" text analytics: casting your net wider

Ethnography is like a surgical knife: super sharp and precise. But sometimes what you need is a machete. As I write this, the opencare conversation consists of over 300,000 words, authored by 137 people. This is a very big study by ethnography standards, and these numbers are likely to double again. We are already pushing the envelope of what ethnographers can process.

So, the next step is giving them prosthetics. The natural tool is text analytics, a branch of data analysis centered on text-as-data. It comes in two flavors: shallow-and-robust and deep-and-ad-hoc. I like the shallow flavor best: it is intuitive and relatively easy to make into standard tools. When the time of your ethnographers is scarce and the raw data is abundant, you can use text analysis to find and discard contributions that are likely to be irrelevant or off topic.
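A shallow filter of that kind can be as simple as the sketch below. The threshold and the on-topic terms here are invented; in practice they would be tuned against the ethnographers' own keep/discard judgments.

```python
# Crude "shallow" relevance filter: keep posts that are long enough and
# mention at least one on-topic term. Terms and threshold are invented.
CARE_TERMS = {"care", "health", "patient", "community"}

def looks_relevant(post: str, min_words: int = 5) -> bool:
    words = post.lower().split()
    return len(words) >= min_words and any(t in words for t in CARE_TERMS)

posts = [
    "thanks!",
    "our community clinic lets patients plan their own care together",
    "nice weather today, went for a long walk along the canal",
]
print([looks_relevant(p) for p in posts])  # [False, True, False]
```

Filters like this are dumb on purpose: they only need to be good enough that the ethnographers stop wasting time on "thanks!" posts.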

5. Machine learning: weak AI for more cost-effective analysis

Beyond the simplest levels, text analytics uses a lot of machine learning techniques. It comes with the territory: human speech does not come easily to machines. At best, computers can evolve algorithms that mimic classification decisions made by skilled humans. A close cooperation between humans and machines just makes sense.
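A toy version of that cooperation might look like the sketch below, with invented training data standing in for a human coder's decisions. A real system would use a proper machine learning library rather than raw word counts, but the division of labour is the same: the human supplies labeled examples, the machine generalizes from them.

```python
from collections import Counter

# Mimic an ethnographer's keep/discard decisions by counting which words
# appear in kept vs discarded posts. All training data is invented.
def train(labeled):
    counts = {True: Counter(), False: Counter()}
    for text, label in labeled:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    score = lambda label: sum(counts[label][w] for w in text.lower().split())
    return score(True) >= score(False)

labeled = [
    ("community care for elderly patients", True),
    ("open source health monitoring", True),
    ("lol nice meme", False),
    ("totally off topic chatter", False),
]
model = train(labeled)
print(predict(model, "patients need community support"))  # True
```

The human stays in the loop: whenever the classifier's output disagrees with the ethnographer, that disagreement becomes a new labeled example.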

6. Agent-based modelling: understanding emergence by simulation

We do not yet have a strong intuition for how interacting individuals give rise to emergent collective intelligence. Agent-based models can help us build that intuition, as they have done in the past for other emergent phenomena. For example, Craig Reynolds’s Boids model explains flocking behaviour very well.
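Agent-based models can be surprisingly small. The sketch below (all values and the network are invented) is not Boids, but a minimal consensus model in the same spirit: each agent only ever averages opinions with its immediate neighbours, yet a shared group opinion emerges with no central coordination.

```python
# Minimal agent-based model of emergence: four agents on a line network
# repeatedly average their opinion with their neighbours'. No agent sees
# the whole group, yet consensus emerges. All values are invented.
opinions = [0.0, 0.2, 0.9, 1.0]
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

for _ in range(100):  # simulation steps
    opinions = [
        (opinions[i] + sum(opinions[j] for j in neighbours[i]))
        / (1 + len(neighbours[i]))
        for i in range(len(opinions))
    ]

print([round(o, 2) for o in opinions])  # all four agents converge
```

The interesting part is what you cannot read off the rules: where the consensus lands depends on the network's shape, not just on the initial opinions. That is the kind of intuition such models are for.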

The above defines the “long game” research agenda for Edgeryders. And it’s already under way.

  • I have been knee-deep in network science since 2009. We run real-time social network analysis on Edgeryders with Edgesense. We have developed an event format called Masters of Networks to spread the culture beyond the usual network nerds like myself. All good.
  • We have been collaborating with ethnographers since 2012. We have developed OpenEthnographer, our own tool to do in-database ethno coding. I'd love to have a blanket agreement with an anthropology department: there is potential for groundbreaking methodological innovation in the discipline.
  • We are working with the University of Bordeaux to build a dashboard for semantic social network analysis.
  • I still need to learn a lot. I am studying agent-based modelling right now. Text analytics and machine learning are next, probably starting towards the end of 2016. 

With that said, it’s early days. We are several breakthroughs short of a real mastery of collective intelligence. And without a lot of hard, thankless wrangling with the data, we will have no breakthrough at all. So… better get down to it. It is a super-interesting journey, and I am delighted and honored to be along for the ride. I look forward to making whatever modest contribution I can.

Photo credit: jbdodane on flickr.com CC-BY-NC

What an intellectual journey!

So passionate, @Alberto. You know how to make us want to learn with you. Humbled to be along for the ride.


Aww, come on. Just a regular network geek, just another day at the office.

But yes, this is interesting. Wonder what’s waiting behind the corner…?

If I was a ‘network nerd’…

It sounds very interesting, but I don’t have the background to understand it. (Still looking for the ‘for dummies’ section)

What is the performance criterion when doing these analyses?

Speaking of ‘hive mind’

a phrase coined by my old friend and Whole Earth colleague Kevin Kelly, I was looking for a place to put this quote from him, so I’ll put it here:

“It’s taken a while, but we’ve learned that while top down is needed, not much of it is needed. The brute dumbness of the hive mind is the raw food ingredients that smart design can chew on. Editorship and expertise are like vitamins for the food. You don’t need much of them, just a trace even for a large body. Too much will be toxic, or just pushed away. The proper dosage of hierarchy is just barely enough to vitalize a very large collective.

The exhilarating frontier today is the myriad ways in which we can mix large doses of out-of-controlness with small elements of top-down control.

Now we’re trying the same trick with collaborative social technology: applying digital socialism to a growing list of desires – and occasionally to problems that the free market couldn’t solve – to see if it works. So far, the results have been startling. We’ve had success in using collaborative technology in bringing health care to the poorest, developing free college textbooks, and funding drugs for uncommon diseases. At nearly every turn, the power of sharing, cooperation, collaboration, openness, free pricing, and transparency has proven to be more practical than we capitalists thought possible. Each time we try it, we find that the power of the sharing is bigger than we imagined.

The power of sharing is not just about the nonprofit sector. Three of the largest creators of commercial wealth in the last decade – Google, Facebook, and Twitter – derive their value from unappreciated sharing in unexpected ways.”


Machine learning to enhance understanding

Back in the 90s when I ran the sfgate.com website, we had an abundance of news stories from a variety of sources and not enough staff to manually do what Yahoo used to do, which is make links for you that enhance your understanding of a given document or article.

We worked with what was a new UK company, Autonomy, to provide the reader with links to related news stories, based on their crawler and interpreter, which analyzed the text of documents to discern the document’s meaning.  This was pretty revolutionary stuff back then and we did successfully offer to the public news stories where links to related stories were available right there on the same page.  This was before Google (and Google News) existed and at the time I saw it as a core part of the future of smart news delivery.

The main use of this technology was for huge corporations that generate gigantic amounts of written documents and find themselves unable to associate the information and data contained in them without hiring armies of librarians to do it for them.  We convinced Autonomy that it would be useful to see how this worked in a news environment, and they provided us with a $100K software solution (that was a lot of money in the 90s) that did an amazingly good job of it.

In 2001 sfgate.com was acquired by a corporate owner that did not wish to conduct such experiments and turned the site into the mess it is today.  And a few years back, Autonomy was bought by Hewlett-Packard and their technology was bundled into the HP Enterprise stuff, where it is part of a suite of tools that only the richest corporations could afford.

But the basic technology is still alive and I found an open source version of the crawler here: https://www.norconex.com/an-open-source-crawler-for-autonomy-idol/

(And here is a link to the HP site describing it: http://www8.hp.com/us/en/software-solutions/information-data-analytics-idol/  )

It has been awhile and I admit I have not studied this open source version in a lot of detail, but I wonder if we might find a way to take advantage of what it does as we design or at least dream up tools that assist collaboration, sharing and group understanding in networked environments.


Always the pioneer, eh?

I should have imagined you were there at the start, @johncoate :slight_smile:

Norconex appears to be a web crawler. Good to know it’s there, but I am thinking much smaller and simpler than that: I am thinking of running classifiers in Edgeryders. We want to go deep rather than wide, and the only way to have relatively simple software do sophisticated parsing is by incorporating prior knowledge into it. This requires homogeneous data. So, my idea is to have simple classifiers that will do one thing well: pick all the pieces of content that satisfy a specified set of conditions within our own database.

Of course, we could be thinking about spawning a text data analytics unit, that would also be able to comb sources other than Edgeryders. As I said, early days.


It crawls whatever you want it to.  The usual thing is to crawl documents.  But unless they changed it around too much, it should be able to be pointed to documents, data, conversations, etc.  And I am not particularly advocating it per se, but alerting that there has been a lot of work using machine understanding to do this kind of finding and associating written word from whatever disparate sources one selects. The crawling part is easy.  It is the interpreting/associating that is the special sauce and is what costs so much.  That said, I’m rather rusty with it all and need to study it more closely.  Guy might find it interesting…