A submission to ICQE2022

Starting a thread on the ICQE22 paper, as we decided to go for it.


  1. all authors of the ANS paper
  2. @Wojt and @Maniamana
  3. @Nica (if she is interested)

The call for submissions ends on May 13th. The submission happens on EasyChair, which is not the easiest system, so I’d like us to be finished by Thursday, May 12th.

What we do

The paper builds on our draft paper for Applied Network Science. It is, however, much shorter: about 15 pages of 300 words each, so 4,500 words. The new paper:

  1. Introduces four techniques for codes co-occurrence network reduction, without too much mathematical discussion. This is done by me, by cutting the ANS paper to the bone.

  2. Explains how these techniques map onto major approaches used in socio/anthro. This is basically copy-pasted from the ANS paper. The relevant section (which we should all be familiar with) is the one titled “Mapping network reduction techniques onto four major approaches in sociology and anthropology”.

  3. Illustrates this mapping with examples from the POPREBEL corpus. This part needs to be done ex novo, led by @Wojt and @Maniamana, under the supervision of @Jan and @Richard. I am going to be contributing data visualizations.


  1. Delimit the corpus in a way that makes methodological sense. Decision to be made by Magda, Wojt, Jan, Richard, Nica. By April 25th.
  2. Deliver the coded data. By Magda and Wojt. By April 28th-ish. Ideally, we would have an ethno-ICQE2022 tag that I can use to extract the data.
  3. Deliver the visualizations. By me (I might need help from you, @bpinaud). This is one or more Tulip files with the visualizations of the corpus after undergoing the four reductions. I am going to need 2-3 days after I receive the coded corpus. I will try to deliver by May 2nd.
  4. Explore the data and confirm (or not) the paper’s intuition. Magda and Wojt, with Jan as a counterpart. This should probably be finished by May 9th.
  5. Write the text. Draft: me for parts 1 and 2 (see above, the “What we do” section), Magda and Wojt (and Jan?) for part 3. Supervision by everybody. Finished on May 12th.
  6. Submit the paper. Me. May 13th.

All agree?


yep, no problem. Do not hesitate.

Here is something you may want to consider when making this choice: if the corpus is too small, we might not be able to see a meaningful reduction effect. Larger corpora are more “granular” and both require and enable sophisticated data treatment as a pathway to exploration. If the Polish corpus consists of only 10 interviews (it’s just a random small number, I have no idea how many you have already) we might end up with trivial networks.

@Jan @Richard @Nica @Wojt @Maniamana

We were thinking about having a solid 20, that would be enough for the purpose of presentation, but let’s see. We hope to have it done asap, we are on it!

@alberto – you probably have the best sense of what is necessary, and what is ideal, in terms of the size of the corpus. But in terms of the logic of the corpus contours, if I am understanding correctly what you mean by “makes methodological sense”, I imagine it can be anything from “interviews from these X months” to “interviews with X or Y demographic” to “interviews that were over X minutes long” – whatever provides a coherent taxonomical narrative that aligns with the goal of demonstrating the reductions.

Correct, @Nica. The data should be “a corpus” rather than a collection of text.

@Wojt @Maniamana I need some info from you, for the submission.

  • ORCID. These are added to the actual paper, but, annoyingly, EasyChair does not identify authors by ORCID, so I also need the usual boilerplate info.
  • Full name
  • Home institution
  • Country
  • email

Magdalena Halina Góralska
University of Warsaw, University College London

Also from @Nica

I am now working on restructuring the paper. I worry about the potential incoherence between the quantitative approach of the Applied NetSci paper (three corpora, used only as graphs, with no regard for content) and the zoom onto specific subgraphs that this paper requires. Plus, page limit.
I don’t see any solution but restructuring the paper completely. This means, in terms of our draft:

  • Eliminate the “Data and pre-processing” section.
  • Eliminate all tables and figures 2 through 5. We give up comparing the techniques from a quantitative point of view, and instead cite our IC2S2 conference paper.
  • The “Techniques for network reduction” section is much reduced.
  • Figure 1 could also be eliminated.
  • Depending on how the new visualizations turn out, we could also get rid of Figure 6.

@bpinaud, would you have 30 minutes to discuss the computational element of this paper?


Wojciech Szymański
University College London

@Maniamana, @Wojt, you both forgot the ORCID.

ORCID: 0000-0002-7098-4338
Veronica Davidov
Monmouth University
United States

Oh, sorry.

It seems we may have about 15 coded interviews for tomorrow evening. In total they will contain about 500-800 coded posts. Is that ok?

My ORCID: 0000-0001-9491-6682
Let’s see how this body of data works for the paper. @alberto should we have a separate meeting on Friday to talk about the dataset?

Seems a bit on the thin side, but it is easy for me to induce a network, and then we’ll know.
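For readers who have not seen it done, here is a minimal sketch of what "inducing a network" from coded posts can look like. It assumes each post arrives as a plain list of code labels, and it treats two codes as connected whenever they annotate the same post. The function names, the toy data, and the minimum-count filter are illustrative assumptions; the filter is a generic stand-in and not necessarily one of the four reduction techniques the paper discusses.

```python
from collections import Counter
from itertools import combinations


def induce_cooccurrence_network(coded_posts):
    """Map each unordered pair of codes to the number of posts where both occur."""
    edges = Counter()
    for codes in coded_posts:
        # De-duplicate and sort so every pair has one canonical ordering.
        for pair in combinations(sorted(set(codes)), 2):
            edges[pair] += 1
    return edges


def reduce_by_min_weight(edges, min_weight):
    """Keep only edges whose co-occurrence count is at least min_weight (hypothetical filter)."""
    return {pair: weight for pair, weight in edges.items() if weight >= min_weight}


# Hypothetical toy data: four posts, each annotated with a few codes.
posts = [
    ["family", "church", "migration"],
    ["family", "church"],
    ["migration", "work"],
    ["family", "church", "work"],
]

network = induce_cooccurrence_network(posts)
reduced = reduce_by_min_weight(network, min_weight=2)

print(len(network), "edges before reduction")
print(len(reduced), "edges after reduction")
```

With so few posts, almost every edge has a count of 1, so even a mild filter leaves very little behind. That is the "trivial networks" risk mentioned earlier in the thread, and running something like this on the real coded corpus is a quick way to check whether 500-800 coded posts are enough.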

Great, let’s see how it turns out!!

OK guys, the first draft of the paper, minus the ethnographic analysis, is ready. Here is the Overleaf link, with editing privileges. It will need another couple of deep reads: it makes sense to me, but then I know the full paper off by heart, and most of that content has been excised from this one.

Cc @icqe22_authors

@alberto I am new to Overleaf and am figuring out the interface right now. Do you want edits, suggested edits, or comments? I see a few things I want to comment on / make suggestions about.