Digital ethnography skunkworks episode 1: the take-home points

Hi all, as we discussed in Brussels, everyone is invited to share their take-home points. You will find the relevant materials here. We are still missing @Jan’s slides.

I will add my take-home points when I have had a chance to think a bit more about them.

@Richard @sander @martin @noemi @nadia @amelia @ilaria @hugi @melancon

Being external to the POPRebel / Edgeryders community / project, my first insight is about the methodological nucleus of the work: threads of discussion in an (open) community are analysed. First, the threads are indexed by codes with ethnographic meanings. Second, the network of codes is described to identify the structure of thinking associated with the interacting people.
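To make the second step concrete, here is a minimal sketch of how a network of codes could be built from coded posts. It assumes each post is reduced to the set of codes attached to it, and that two codes are linked whenever they appear on the same post. The function name `code_cooccurrence` and the example codes are made up for illustration; this is not the project's actual pipeline.

```python
from collections import Counter
from itertools import combinations


def code_cooccurrence(coded_posts):
    """Count how often each unordered pair of codes appears on the same post.

    coded_posts: iterable of lists of code labels, one list per post.
    Returns a Counter mapping (code_a, code_b) -> number of shared posts,
    with each pair stored in alphabetical order.
    """
    edges = Counter()
    for codes in coded_posts:
        # Deduplicate and sort so that each pair has one canonical form.
        for a, b in combinations(sorted(set(codes)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical coded posts (labels invented for this example).
coded_posts = [
    ["populism", "distrust", "media"],
    ["populism", "distrust"],
    ["media", "nostalgia"],
]
edges = code_cooccurrence(coded_posts)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

The edge weights are the raw material for any later network description (density, clusters, central codes).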

This methodology seems to be generic; that is, it could be applied to various sets of texts. It should deliver a quantified (quantifiable) description of people’s thinking and its structure. I see very valuable potential in the method. Its mixed subjective-objective features (see below) should be turned into a strength.

The ‘indexing by codes’ may be a common or a diverse filter applied to the text, depending on whether it is conceived as part of the data-generation or the analytical part of the activity. To me, it seems necessary to have “training / test sets of text” to control the indexing / coding, going beyond a mere means of reducing the inter-coder variability. I wonder about the (measurable?) bias that could be introduced by given coders. Possibly this problem is well known, but the network analysis may give a means to quantify / assess its “size” and, subsequently, to turn it into a source of information.
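One common way to put a number on inter-coder variability is Cohen's kappa, which compares the agreement actually observed between two coders with the agreement expected by chance. The sketch below is only an illustration of that standard statistic, not something proposed at the skunkworks. It assumes the simplified case where each coder assigns exactly one code to each item of a shared test set; the labels are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who each assign one code per item."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Share of items on which the two coders chose the same code.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each coder picked codes independently,
    # following their own overall code frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical test set of four text snippets coded by two people.
coder_1 = ["x", "x", "y", "y"]
coder_2 = ["x", "y", "y", "y"]
kappa = cohens_kappa(coder_1, coder_2)
print(kappa)
```

A kappa near 1 means the coders agree far more than chance would predict; a value near 0 means their agreement is about what chance alone would give.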

Being speculative: by assessing the ‘inter-coder variability’, the ‘just words paradigm’ could possibly be analysed. Differences between coders should reflect the ‘associative cloud’ that a given coder has with a given string of words.

p.s. Depending on other posts I may post a complement.

Sander’s presentation was extremely interesting, logically thought-through and convincing.

I have long been convinced of the important role of narrative in constructing and interpreting lived experience. Narrative has become well established in the social and historical sciences since White’s The Content of the Form: Narrative Discourse and Historical Representation (1990), with Meyer’s Narrative Politics (2015) a more recent example in my field.

The discussion of different ‘non-Western’ conceptualisations of past and present was also interesting but doesn’t apply to the Czechs, Poles and Serbs, as far as I am aware.

The other points, about perceptions of risk and about the idea that individuals exist in a field of tension between nature and society and between past and future, were also interesting. Researchers interested in risk perception, for example, could certainly use the Edgeryders data to test specific hypotheses, but I wasn’t convinced that our interpretation of the Edgeryders data should be guided by these frames as a general approach. As @Amelia cogently put it, this would be a second level of interpretation. (Amelia, please correct me if I have misinterpreted what you said.) While Sander made a compelling argument for the importance of these concepts, had we invited another speaker, they might have advocated an entirely different set of concepts upon which to focus.

Also, to go looking for specific concepts and their interrelationships would go against the principles of Grounded Theory, an approach that has worked well with other Edgeryders projects in the past. With regard to Grounded Theory, there seemed to be some misunderstanding – even among the ethnographers – as to its key principles and procedures. It may be useful for Amelia to suggest a short reading on the topic, so that we are all on the same page.


@alberto @noemi @natalia_skoczylas @Richard @sander @martin @nadia @amelia @ilaria @hugi @melancon and all the others: here is a link to my Skunkworks Prezi (please let me know if it does not work):

I am finishing up my notes and reflections. Jan

Hello Jan - I got a message that the link has expired. - regards, Martin

@martin I am sorry. I am working on it. Please try now: A Conversation on Race -


Hello friends, after some rest and some reading and thinking I am ready to share my takehome points from the Skunkworks.

My goal, as you may recall, was to spot some low-hanging-fruit methodological advance in digital ethnography. I hoped to get that by starting from an empirical problem, that of European populism, and moving on from there. I came out of our two days with two promising candidates: Wirtz’s approach to finding out when informants are “sure about what they are saying”, and Beckert’s idea of “thinking the present as dependent from the future, rather than from the past”.

1. Wirtz’s interview entropy

Wirtz’s paper (@sander, did he consent to us sharing it? Otherwise I cannot link it here) has a promising idea: reusing the coding of interviews (or posts in our case) for some algorithmic analysis. Its purpose, in the way that Sander presented it, is to determine which statements are delivered with confidence as opposed to only tentatively. I like this idea, and I have myself been thinking about “reliability scores” that would be attributed to associations (i.e. edges in the semantic network), rather than statements. To compute them, I was thinking of using social network metrics.

His main tool is to compare two measures of Shannon entropy. One is the observed value, which has to do with the frequency with which each specific combination of codes occurs. The other one is the expected value, which is the frequency with which the same combinations would occur, if the codes were independently distributed.
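For readers who prefer code, here is one possible reading of that comparison. It is not Wirtz's own formula, which the paper does not spell out. The sketch assumes that every statement is coded as a tuple with one code per coding dimension (the dimensions and labels below are invented). The observed entropy is computed from the frequencies of the combinations; the expected entropy is what the same combinations would yield if the dimensions were independent, which equals the sum of the entropies of each dimension taken separately.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy, in bits, of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)


def observed_and_expected_entropy(statements):
    """Return (observed, expected) entropy for a list of code tuples.

    observed: entropy of the empirical distribution of code combinations.
    expected: entropy of the same combinations under independence, i.e.
              the sum of the entropies of the individual dimensions.
    """
    observed = entropy(Counter(statements))
    n_dims = len(statements[0])
    expected = sum(
        entropy(Counter(statement[i] for statement in statements))
        for i in range(n_dims)
    )
    return observed, expected


# Hypothetical statements coded on two dimensions: (time reference, stance).
coupled = [("past", "sure"), ("past", "sure"),
           ("future", "unsure"), ("future", "unsure")]
independent = [("past", "sure"), ("past", "unsure"),
               ("future", "sure"), ("future", "unsure")]

obs_c, exp_c = observed_and_expected_entropy(coupled)
obs_i, exp_i = observed_and_expected_entropy(independent)
print(exp_c - obs_c, exp_i - obs_i)
```

Under this reading, the gap between expected and observed entropy is never negative, and it is zero exactly when the codes really are independently distributed. The larger the gap, the more tightly the codes are bound together.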

However, after going through the paper itself, my enthusiasm has cooled considerably. Here is why, in decreasing order of importance:

  1. Wirtz does not actually claim, outside of the paper’s title, that he has constructed a “coherence detector”. Instead, he makes considerably weaker claims:

    The frequency of a class in a given set of consecutive statements is the number of occurrences of that class in the set. We superimpose the […] graph of this frequency on the three previous curves (observed information, expected information, and expected minus observed information). By looking at this graph we can see whether there is some degree of resemblance between the frequency time series and those of the three information statistics over part or all the interview. If there is such a correspondence for some classes, we will claim that this class carries the information during this part of the interview. [emphasis Wirtz’s]

    In the final part of the paper, the author discusses his six interviews one by one, and even there he makes only cautious claims. “If we assume that the observed information betrays a temporary difficult with expression, we deduce that the present seems to pose the greatest difficulty for this man”. With a working mathematical coherence detector, we should not need to assume.

  2. I am not convinced by the author’s introduction of a time dimension in the interview. His approach is to first “slice” each interview into statements (sentences, I presume). Next, he takes a “sliding window” of ten statements and assigns them a time code. Time 1 corresponds to statements 1 to 10, time 2 to statements 2 to 11, and so on. At this point, he computes information (via entropy) of the class that seems most meaningful for each window, so that both expected and observed information vary across the interview. This is not how @amelia describes coding: she treats each post as a whole, reading first the whole thing and only later starting to code statement by statement. As so often in data analysis in social sciences, Wirtz’s methodological move (just like our own) hides a non-neutral assumption about the social conditions on the ground. He assumes that interviewees are “winging it”; we, instead, take people at face value, assuming what they are actually saying is what they meant to say, and don’t try to second-guess them. Our assumption is maybe more realistic in texts that were delivered in writing, with the possibility to edit etc.; his is more realistic in texts that were delivered orally. Our assumption is also more in line with our ethical code of treating people on the platform as thinking adults, admittedly a leap of faith.

  3. The method’s scalability is neither demonstrated nor exemplified. Wirtz goes through each of his six (six! We are already looking at 1,000 posts in POPREBEL) interviews in painstaking detail. He does get some patterns, but his method is not more efficient than having a live ethnographer code for non-semantic codes like “assertiveness” and “hesitation”, and much less precise.

  4. I am not clear on how expected information in a class is computed. There is no formula; if I wanted to write some computer code to implement the method, I would have to ask the author, or ask someone like @melancon to help me interpret the paper.
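The sliding-window scheme described in point 2 is simple enough to sketch. The code below only reproduces the windowing and the per-window frequency of a class, as defined in the passage quoted in point 1. It assumes, for simplicity, that each statement carries a single class label; the labels and function names are invented, and the entropy step is left out because the paper gives no formula for it.

```python
def sliding_windows(statements, size=10):
    """Overlapping windows: time 1 = statements 1..size, time 2 = 2..size+1, ..."""
    return [statements[i:i + size] for i in range(len(statements) - size + 1)]


def class_frequency_series(statements, target, size=10):
    """Number of occurrences of `target` in each successive window."""
    return [
        sum(1 for label in window if label == target)
        for window in sliding_windows(statements, size)
    ]


# Hypothetical interview of 12 statements, each tagged with one class.
interview = ["past"] * 5 + ["present"] * 7
series = class_frequency_series(interview, "present")
print(series)
```

Twelve statements give three windows, so the series has three time points. An interview shorter than the window yields no time points at all, which hints at how poorly the scheme fits short forum posts.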

Overall, this method is cool but it does not seem worth the investment to me. Happy to reconsider if you disagree with me.

2. Beckert’s imagined futures

I read the first (theoretical) part of Beckert’s book. It argues that

  1. People need some idea of future states of the world in order to make decisions in the present.
  2. However, we have no way to “compute” such states with any degree of certainty. Rational expectations theory is rejected, which is not hard at all. Instead, Beckert proposes the concept of fictional expectations: expectations are formed as causally credible stories about how the future follows from the present.
  3. Once adopted, fictional expectations inform present behavior. Therefore, whoever controls the narrative about the future controls the present, as speculators and central bankers know well.

This is great. But I do not see how it can contribute to expanding SSNA. It can play a role in POPREBEL, if we decide to pay attention to how informants see the future (a Europe of nations? A liberal globalist conspiracy ruling the continent? and so on). It can even help us make predictions, because fictional expectations held today will influence individual behavior (and therefore social phenomena) in the near future. Maybe we could discuss this issue with the U Tartu folks. But I do not see a core module of SSNA.

So, as far as I’m concerned, it’s back to the drawing board.

What does everyone think?

Dear Alberto,

Thanks for your comments, which are much appreciated. I will try to put both the papers in context, which I may not have done at the session. But before I get to that, two preliminary remarks.

First of all, many people have reproached me for always thinking about future research projects rather than the ones they are working on. That has, certainly, also been some of what I did at the session.

Second, if you would prefer, I’d be happy to cancel our contract, saving you some money for other efforts – it was great fun being with you all on that occasion, and for me that is enough satisfaction.

Now, first about Wirtz’s paper.

Not me. I approached you for your unique point of view. You are giving it, so as far as I am concerned our agreement is being honored. No need to rediscuss. Just do your thing, don’t worry.

Thanks, Alberto – but what, from here forward, is “my thing”?


@sander my ideal scenario would be this: help us in coming up with, as you say, future research projects that (1) are achievable, (2) foreshadow progress in extending the reach, accuracy and efficiency of digital ethnography and SSNA and (3) take the legacy of POPREBEL in a relevant direction.


It would be good to have a skype about all that, but I cannot do that before September 24. If we could do that in that week, it would be very good.


But really, when I say “help us” I am intentionally blackboxing what you do. I am hoping someone of your experience can reflect on our interaction in the skunkworks and figure out what, exactly, would give a good push to our group. Injecting new skills / approaches? Going deep into some methodological aspects? Immersing ourselves more in the raw data and participant observation?

No rush for this, unless you see some urgency. The week of the 24th is a bad one for me, but we could speak the following one.

Dear Alberto,

Let’s speak the week after the 24th, which is the first week of October. I will be in Europe at the time, so easy to connect … and I have thus far no appointments …


Sure, deal.

@Jan, this link cannot be right. Your presentation is on :slight_smile:

@alberto @nadia @sander @martin @noemi @Richard Whoops! It was supposed to be a link to my Prezi, so it is wrong. I apologize. But more importantly, for some reason these links are unreliable. So, I just converted it to a pdf document and uploaded it to our drive (see below). And I am still working on the longer text/intervention on the significance of stories/narratives since Sander raised this issue. I also want to address some of the issues raised by Alberto on coding (via Wirtz’s piece). Sorry for the delay - a lot to do. My presentation is uploaded here:


For some reason I do not understand I do not have access to your comment through either of my Google accounts – can you send it to me as a pdf by email?


Sure. In a moment. Hope you are well. J

Hi Sander,

this is the only email address I have. I hope it works.

All my best,


Populism introduction for Skunworks.pdf (17.3 MB)