Let's submit a conference paper to IC2S2!

Thumbs up.

The point @alberto is making is that the reduction method is only one part of the SSNA method.

Point (iii) means: we are applying a mathematical transformation to the network, and only after applying it will we know whether it indeed improves legibility and supports inference. Results are uncertain.

Point (iv) means that the reduction is a component of the SSNA method. Is it unclear? Does anyone want to propose a different formulation?

I should not speak up, perhaps, as this is technical language in which I am not fluent, but I do not get how the sentence “We evaluate the extent that each reduction technique… combines with the data model and network construction technique above as part of the SSNA method” (that is point (iv)) simply means “the reduction is a component of the SSNA method.” How would you say it in plain English? I sense a lot of important stuff here, but it flies above my head…

If we had the space, we could report King’s list and establish a correspondence between his general criteria for good qualitative research and our specific evaluation of network reduction techniques. In this case:

“the content is the method” (King); “science can study anything. What makes such a study ‘scientific’ is a specific method” (Jan) => good qualitative research

becomes

“a reduction technique combines harmoniously with other parts of the SSNA method” => good reduction technique (abstract)

A reduction technique could also be inconsistent with the other parts of the method. For example, if we reduce on the basis of the number of co-occurrences, this reduction is consistent with the technique of constructing the network in the first place. A co-occurrence network is interested in showing what pairs of codes occur together. This is a different question from, say, listing the codes, or counting their occurrences. Reduction by number of co-occurrences shows what pairs of codes occur together the most times, and so it is tentatively good.

By contrast, reducing the network to a list of codes would be bad: we would lose the critical information of codes occurring together. Reducing it to a list of the highest co-occurring pairs would also be bad: we would lose the critical information about network structure (for example, two codes may not co-occur with each other, but still be tightly connected because they co-occur with the same neighbors). By contrast, losing the information about co-occurrences that appear only once seems like a smaller loss.
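To make this concrete, here is a minimal sketch of reduction by number of co-occurrences; it assumes a networkx graph whose edges carry a "weight" attribute holding the co-occurrence count, and the threshold value is purely illustrative.

```python
import networkx as nx

def reduce_by_cooccurrence(g: nx.Graph, min_count: int = 2) -> nx.Graph:
    """Keep only code pairs that co-occur at least `min_count` times.

    Assumes each edge has a 'weight' attribute with the number of
    co-occurrences of the two codes it connects (an assumption of this
    sketch, not necessarily the project's data model).
    """
    reduced = nx.Graph()
    reduced.add_nodes_from(g.nodes(data=True))
    for u, v, data in g.edges(data=True):
        if data.get("weight", 0) >= min_count:
            reduced.add_edge(u, v, **data)
    # A co-occurrence network is about pairs, so codes left isolated
    # by the reduction can be dropped as well.
    reduced.remove_nodes_from(list(nx.isolates(reduced)))
    return reduced
```

Note how the pair structure is preserved: we only forget the rarest pairs, not the fact of co-occurring itself.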

I would appreciate a suggestion on how to express it more clearly. Deadline tomorrow!

Of course you should! I am way out of my depth here. I am used to deploying methods that are already accepted: you might justify the choice of one statistical estimation technique over another (example: logit vs. probit for binary variables), but not argue that the whole exercise has meaning; that is done for me by the broader discipline.

I tried to gather our thoughts about this matter here: Qualitative research, knowledge and discovery: a discussion on the epistemological strengths and limits of ethnography and SSNA - #4 by alberto.

@alberto and @melancon It is very useful. I believe you are both right. We can reconstruct our technique, step by step. That will show, as one of the first steps, Alberto’s “ethnocoding” (a lot of interpreting happens already there - this is my “classificatory interpretation”), then a specific dataset emerges that is structured by co-occurrences, etc. (here all that analysis of algorithms, what they do and what they do not do, applies), and then we are back in the ethnographers’ “house”, where we all look at those beautiful graphs and come up with hypothetical interpretations (again, but on a “higher” level of abstraction). The method of their falsification/verification is/should be both iterative (we go back to any stage we want/need) and relentlessly, intersubjectively self-critical, as we try to reach an interpretive consensus.


@alberto et al. Here is my shot: Following [King et al. 1994], we evaluate the extent to which each reduction technique: (i) usefully supports inference, understood as an interpretation of the emerging intersubjective picture of the world; (ii) reinforces the reproducibility and transparency that help increase the researcher’s ability to assess equivalence between any two implementations; (iii) does not foreclose the possibility of updating via abductive reasoning (algorithms alone do not decide how parameters should be set to obtain optimal readability); (iv) combines harmoniously (also with the network construction technique) with the other parts of SSNA.


@jan thanks for this. We are in agreement. I submitted the final version of the abstract, now we wait.


We are delighted to inform you that your submission

“Comparing techniques to reduce networks of ethnographic codes co-occurrence”

has been accepted for an oral presentation (by video) at IC2S2 2021. This decision was based on blind reviews of the abstract that assessed the content and fit with the scope of the conference. Please find the reviews below.

We will get back to you with additional organizational details and recommendations for preparing your video. Video presentations are 12 minutes long and have to be uploaded by July 2. We will release the full program as soon as it becomes available.

This is kind of nice, because quite a bit of the work is done already. 🙂


Calling co-authors @Jan, @Richard, @amelia, @melancon, @bpinaud and @brenoust.

Following the acceptance of the extended abstract, we have work to do. We need to prepare a 12-minute video presentation. I am not sure if there are proceedings, and what goes into them: most likely that would be the extended abstract we submitted. I wrote to the conference chair to verify this.

But it would probably also make sense to write an actual paper. The reviews are quite interesting and encouraging (see below). POPREBEL has another year to go, so we might be able to write, submit, and get a full paper accepted. Writing the paper would also help with the video.

Shall I set up a call to discuss this?


Reviewer 1

SCORE: 1 (weak accept)

This paper presents an interesting way forward in applying network analytic techniques to a corpus of ethnographic material, which is in itself an important and promising avenue of research. In this network, edges are annotations that connect ethnographic stories (snippets) to codes / keywords. This is a two-mode network. A one-mode projection is a network of codes, connected when they both occur in a snippet or story. The paper investigates techniques to reduce this rather dense network in order to make it ready for interpretation.

The paper seems to be work in progress and results are not yet presented, which is a pity. The backbone reduction strategy especially would be of interest.

Coding is only one way of dealing with ethnographic material, and a rather crude one; the authors should qualify this. At the same time, the approach might be useful for a much broader set of qualitative studies.

The main problem, however, seems to be that the thick and dense structure of the projected network is a direct result of moving from a two-mode to a one-mode network. This is the case, and a problem, in many social networks. Why not move a step back and start working with the original, bipartite / two-mode data? And, perhaps, do the reduction there instead of on the projection.

Also, it is not obvious that this is a social network. It is keywords connected to texts, not social interactions.

Reviewer 2

SCORE: 2 (accept)

The contribution fits the conference well. It has a proper theoretical background and a decent level of applicability.

Reviewer 3

SCORE: 1 (weak accept)

The submission discusses how the quality of text-based co-occurrence networks can be improved via four reduction techniques: (1) dropping edges that occur only once or a few times; (2) dropping edges associated with a low number of informants; (3) dropping edges not belonging to high-k k-cores; (4) dropping edges with low neighborhood homophily. While the reduction techniques are interesting to examine, it would be useful to go into more detail on how the parameters for the reduction techniques are decided, e.g. what does “a few times” mean for technique (1)? For technique (2), what is considered “low”? For technique (3), what is considered a “high” k: a k of 3? For technique (4), what is the threshold for homophily?

It would also be interesting to discuss more results in terms of how this approach helps with processing ethnographic data in particular. I could see potential applications to text-based data beyond just ethnographic data. The submission did attempt to motivate the problem well, substantiated with some seminal papers in network analysis in general, and semantic networks in particular.

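To make the reviewer’s parameter question concrete, here is a minimal sketch of how techniques (2) and (3) could be implemented; it assumes a networkx co-occurrence graph with a hypothetical "informants" edge attribute (a set of contributor IDs), and the thresholds (3 informants, k = 3) are illustrative, not the values used in the submission.

```python
import networkx as nx

def reduce_by_informants(g: nx.Graph, min_informants: int = 3) -> nx.Graph:
    """Technique (2): drop edges supported by fewer than `min_informants`
    distinct contributors. The 'informants' edge attribute (a set of
    contributor IDs) is an assumption of this sketch."""
    reduced = g.copy()
    weak = [(u, v) for u, v, d in g.edges(data=True)
            if len(d.get("informants", ())) < min_informants]
    reduced.remove_edges_from(weak)
    return reduced

def reduce_by_k_core(g: nx.Graph, k: int = 3) -> nx.Graph:
    """Technique (3): keep only the k-core, i.e. the maximal subgraph in
    which every code has at least k neighbours. Assumes self loops have
    already been removed (networkx's k_core rejects them)."""
    return nx.k_core(g, k=k)
```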

Excellent news!!! Yes, let’s set up a Zoom call very soon to discuss this.


Hello all, I am getting down to homework.

@melancon, @bpinaud, @brenoust: you can find the Tulip file and its README on GitHub. So that’s done.

@jan, @Richard: before I start creating the text structure, I would like confirmation that you are comfortable working on Overleaf, which means LaTeX syntax. I know @amelia is. If that is an obstacle, I would rather fall back on Google Docs or CryptPad.

Mostly for @amelia, here are the meeting notes.


I feel we need to move fast on this, so I just went ahead and wrote the structure, for now on CryptPad. If you, @jan and @Richard, are OK with LaTeX, I will move everything onto Overleaf fairly trivially. If not, we stay here. Moving over will become non-trivial as we work more on it, so a quick decision would be appreciated.

Link to the CryptPad document here

I’ve not used LaTeX before but it looks quite straightforward.

Hello all,
Ok, got it. Thanks.
The tlpx file contains a large number of self loops and multiple edges. Is that normal?
@amelia told me this was normal for the self loops during MoN, if I remember correctly, but I do not remember why.

Bruno, it is possible indeed, because the same code may be present in more than one annotation on the same post.

However, we want to drop those for proper analysis. Are there any in the stacked graphs? I thought I had dropped them.
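For reference, a minimal sketch of the cleaning step being discussed, using networkx rather than Tulip purely for illustration: self loops are dropped and parallel edges are collapsed into a single weighted edge (the "weight" attribute name is an assumption of the sketch).

```python
import networkx as nx

def simplify(multigraph: nx.MultiGraph) -> nx.Graph:
    """Collapse a multigraph of code co-occurrences into a simple graph.

    Self loops (the same code annotated more than once on one post) are
    dropped; parallel edges are merged into one edge whose 'weight'
    counts the co-occurrences."""
    simple = nx.Graph()
    simple.add_nodes_from(multigraph.nodes(data=True))
    for u, v in multigraph.edges():
        if u == v:
            continue  # self loop
        if simple.has_edge(u, v):
            simple[u][v]["weight"] += 1
        else:
            simple.add_edge(u, v, weight=1)
    return simple
```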

Yup, pretty intuitive. But how do you want us to add our text? I do not see a review function. Should we use a different font color, and whatever is accepted will be turned black?

Jan, what you are looking at is CryptPad, not Overleaf, and the syntax is not LaTeX. OK, I guess we stay in CryptPad then.

No, please, none of that! I have added instructions at the top of the document itself.

Thanks @alberto, data downloaded. I see you already organized the data into three separate graphs (great!). Can you explain in two words what is meant by, and what the differences are between, the stacked and non-stacked datasets?

@brenoust Do you have a ready-made Tulip implementation of Mutual Information (you suggested we use this to compare the reduction schemas, which I think is a good idea)?
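In case a quick stand-in is useful while we wait: a minimal sketch of how two reduction schemas could be compared via mutual information, treating the kept/dropped status of each original edge as a label and using scikit-learn's mutual_info_score. networkx and scikit-learn are assumptions of this sketch, not the project's Tulip tooling.

```python
import networkx as nx
from sklearn.metrics import mutual_info_score

def keep_labels(original: nx.Graph, reduced: nx.Graph):
    """1 if an edge of the original graph survives the reduction, else 0."""
    return [1 if reduced.has_edge(u, v) else 0 for u, v in original.edges()]

def compare_reductions(original: nx.Graph,
                       reduced_a: nx.Graph,
                       reduced_b: nx.Graph) -> float:
    """Mutual information between two reduction schemas, computed over the
    kept/dropped status of each original edge. Higher values mean the two
    schemas' keep/drop decisions are strongly associated."""
    return mutual_info_score(keep_labels(original, reduced_a),
                             keep_labels(original, reduced_b))
```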