POPREBEL Ethnography Code Review Thread

Jan · May 7, 2021, 2:28am

Are we meeting tomorrow (Friday, 7 May)? Please let me know if and where. I would love to hear all about the hackathon.

Maniamana · May 7, 2021, 12:54pm

If anyone wants to meet today, Jan, Wojt and I are already in here: Launch Meeting - Zoom

amelia · May 7, 2021, 1:00pm

I will be on in 10 minutes!

amelia · May 7, 2021, 2:18pm

Quick notes from today’s meeting, to be fleshed out later:

Use multiple instances of codes in one post now that we are working with interview data (no need to limit to 1 per post).
Use the 6 coded Polish interviews to explore some kind of proximity index for codes (discuss with @alberto) since interviews are long-form. Breaking up interviews ourselves doesn’t make sense because we want to be able to explore them as units.
Before the next meeting, make sure to review the new categories and think about whether they work for us.
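One way the proximity idea could be prototyped, as a rough sketch only: count how often two different codes are annotated within a fixed character window of each other inside a single long interview. The `(code, offset)` input format, the `proximity_counts` name, and the 500-character window are all assumptions made for illustration, not something the platform exports in this shape.

```python
from collections import Counter
from itertools import combinations


def proximity_counts(annotations, window=500):
    """Count pairs of distinct codes annotated within `window` characters.

    `annotations` is a list of (code, start_offset) tuples for ONE interview.
    Both the input shape and the window size are illustrative assumptions.
    """
    counts = Counter()
    for (code_a, pos_a), (code_b, pos_b) in combinations(annotations, 2):
        if code_a != code_b and abs(pos_a - pos_b) <= window:
            # Sort the pair so (a, b) and (b, a) land in the same bucket.
            counts[tuple(sorted((code_a, code_b)))] += 1
    return counts


# Made-up annotations for a single long interview.
annotations = [
    ("anger", 120),
    ("distrust", 300),
    ("anger", 2000),
    ("church", 2300),
    ("distrust", 5000),
]
result = proximity_counts(annotations, window=500)
```

With this toy input, only the pairs that sit close together in the text are counted, so the interview stays intact as a unit while still yielding pairwise links between codes.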

Jirka_Kocian · May 21, 2021, 9:30am

Hello everybody, I am really sorry but I will not be able to attend the meeting today, have a good discussion as always!

amelia · May 21, 2021, 12:15pm

@rebelethno, is everyone else able to join today? Our plan was to discuss the new categories and our coding practice going forward.

amelia · May 21, 2021, 2:35pm

Notes from Biweekly POPREBEL Ethnography Meeting
2021-05-21T13:00:00Z → 2021-05-21T14:30:00Z

1. Splitting Interviews

We decided that the most ideal data model is to post interviews as their own topics. Each interviewer question is its own post, and each response should be attached to a pseudonymised Edgeryders account as its own post. The interview then is a thread that consists of questions by the ethnographer and responses by the interviewee.

(@Djan gives a good model here and can show @jitka.kralova and @Maniamana how to format this practically.)

This helps us solve our issue of giant interview posts (which prevent us from coding precision) and also the conundrum of how to “artificially” separate posts (because it follows the conversation of the ethnographer and accounts for the fact that their questions will shape the next response). This also allows us to do analysis on the co-occurrences in the interview as a whole, to better map out individuals’ thoughts and feelings.
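To make the thread structure concrete, here is a minimal sketch of the model described above: one topic per interview, with alternating question and response posts. The class names, field names, and the example pseudonym are hypothetical and only illustrate the shape of the data, not how Discourse stores it.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str  # ethnographer account or pseudonymised interviewee account
    role: str    # "question" or "response"
    text: str


@dataclass
class InterviewTopic:
    title: str
    posts: list = field(default_factory=list)


def build_topic(title, ethnographer, pseudonym, exchanges):
    """Turn (question, answer) pairs into an alternating thread of posts."""
    topic = InterviewTopic(title)
    for question, answer in exchanges:
        topic.posts.append(Post(ethnographer, "question", question))
        topic.posts.append(Post(pseudonym, "response", answer))
    return topic


# Hypothetical interview with two exchanges.
topic = build_topic(
    "Interview 01",
    ethnographer="ethnographer_account",
    pseudonym="pseudonymised_account",
    exchanges=[
        ("How do you feel about the current government?", "Mostly angry."),
        ("What makes you angry?", "Nobody listens to us."),
    ],
)
```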

2. Splitting existing interviews.

To make existing interviews fit the above model, we will need to split them. For already coded posts, @alberto suggests that we ask dan or @matthias to help split them with a script (since the Polish forum has most interviews coded). And Alberto, if you could specify how you see this most easily done, that’d be great, as you explained it well in the call!

From @Wojt and @Jirka_Kocian, we need a list of the interviews and post IDs so that we can do this.

3. Code Categories

We are reaching a good collective system in the backend. Before next meeting, look at the backend to see what codes could be merged. We envision being able to create an analytical system from the top-level categories, categorising interviews based on the predominant code or group of codes from each category. (e.g. Emotion predominantly anger; Ideology predominantly left-wing; Actions predominantly seeking political alternatives and questioning leadership; Places predominantly Poland).
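As a rough illustration of the "predominant code per category" idea, the sketch below tallies annotations for one interview and picks the most frequent code in each top-level category. The `(category, code)` input format and the function name are assumptions; this is a plain occurrence count, and ties are resolved arbitrarily by first appearance.

```python
from collections import Counter, defaultdict


def predominant_codes(annotations):
    """Return the most frequent code per category for one interview.

    `annotations` is a list of (category, code) tuples, an assumed format.
    """
    per_category = defaultdict(Counter)
    for category, code in annotations:
        per_category[category][code] += 1
    return {
        category: counter.most_common(1)[0][0]
        for category, counter in per_category.items()
    }


# Made-up annotations for a single interview.
interview = [
    ("Emotion", "anger"),
    ("Emotion", "anger"),
    ("Emotion", "fear"),
    ("Ideology", "left-wing"),
    ("Places", "Poland"),
    ("Places", "Poland"),
    ("Places", "Brussels"),
]
profile = predominant_codes(interview)
```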

4. Intercoder reliability.

We may want to explore some quantitative measures of intercoder reliability, like these measures. Can explore further with @alberto, who is interested / probably already nerding out
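For a sense of what such a measure looks like, here is a small sketch of Cohen's kappa for two coders. Kappa is picked here only as a common example and may not be one of the measures linked above. It assumes both coders assigned exactly one label to each of the same segments, which is a simplification of how we actually code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders labelling the same segments."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both coders must label the same non-empty set of segments")
    n = len(labels_a)
    # Observed agreement: share of segments with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each coder's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(labels_a) | set(labels_b)
    )
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up labels from two coders for four segments.
coder_a = ["anger", "anger", "fear", "fear"]
coder_b = ["anger", "fear", "fear", "fear"]
kappa = cohens_kappa(coder_a, coder_b)
```

In this toy case the coders agree on 3 of 4 segments, chance agreement is 0.5, and kappa comes out at 0.5.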

5. Agenda for next time.

Next Friday we will have a 1.5 hour meeting.

3 to 3:30 Brussels: Discussing field ethnographers’ comparative decisions, what to hold consistent across fieldsites.

3:30 - 4:30: In-depth discussion of backend coding categories. Merging existing codes together to create a tighter schema.

(Also: @Wojt to report the two bugs on GitHub, code authorship in the backend and disappearing highlights; @jitka.kralova to coordinate a coding training schedule for interested field ethnographers.)

Great meeting, thanks @rebelethno!

alberto · May 21, 2021, 2:50pm

Done, as this GitHub issue.

Email coming up!

That looks like occurrence (not co-occurrence) count to me…

amelia · May 21, 2021, 3:22pm

Yep, that would be – just one suggested way of categorising single interviews based on the typology. We can and should definitely think of others based on co-occurrence rather than occurrence.
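To make the distinction concrete, a minimal sketch: occurrence counts how many posts a code appears in, while co-occurrence counts how many posts contain a given pair of codes together. Treating the post as the unit of co-occurrence is an assumption here, as is the input format (one list of codes per post).

```python
from collections import Counter
from itertools import combinations


def occurrence_counts(posts):
    """Number of posts in which each code appears at least once."""
    counts = Counter()
    for codes in posts:
        counts.update(set(codes))
    return counts


def cooccurrence_counts(posts):
    """Number of posts in which each unordered pair of codes appears together."""
    counts = Counter()
    for codes in posts:
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
    return counts


# Made-up interview thread: the codes attached to each post.
posts = [
    ["anger", "distrust"],
    ["anger", "distrust", "church"],
    ["church"],
]
occ = occurrence_counts(posts)
cooc = cooccurrence_counts(posts)
```

In this example every code occurs in two posts, so occurrence alone cannot tell them apart, while co-occurrence shows that anger and distrust travel together.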

Maniamana · May 22, 2021, 8:09am

Hi guys, great work! Happy to see that! I have updated myself on the changes and will proceed with the future interviews as agreed upon.
Looking forward to seeing what you will do with the old interviews regarding splitting them; I hope this can be done easily.
See you next Friday!

matthias · May 23, 2021, 12:16am

I had a think (see Github issue) about how to achieve what you want here. It’s a bit difficult, unfortunately.

To estimate what solution would be more economical, can I ask how many interviews would have to be split? Also, please provide 1-3 example links to these interviews so I can have a look.

alberto · May 23, 2021, 9:07pm

The news is not good. According to @matthias:

After some thinking, the following is the best process I could come up with:

Ethnographers split the interview topic into multiple posts as outlined by Alberto, but without deleting anything from the original first post that so far contains the whole interview. Most of the interview text will then appear twice in the topic.

@damingo develops a script that will re-anchor annotations appearing in the first post to text in later posts if possible. That can be done by looking to match the last part(s) of the XPath anchor, as typically whole paragraphs are moved to new posts. Where this fails, the “quote” anchor would be used, with the additional knowledge that the right quote is the first option immediately behind the previously processed annotation.

We run that script on these topics, then delete the duplicate part from the first posts of these topics.

And because that script is not the simplest thing to develop … how many interviews are we talking about? If it’s 4-5, then re-coding manually is certainly faster. (source)
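To give a feel for the "quote" fallback only, here is a very rough sketch. It is not the script itself and ignores the XPath step entirely. It assumes annotations are processed in document order, do not overlap, and that each annotation's quoted text appears verbatim in one of the new posts; the function name and data shapes are made up for illustration.

```python
def reanchor_by_quote(quotes, posts):
    """Locate each annotation quote in the split posts, in order.

    `quotes`: annotation quote strings in original document order (assumed).
    `posts`: texts of the new posts, in thread order (assumed).
    Returns one (post_index, start_offset) per quote, or None if not found.
    The search for each quote starts right after the previous match, which
    is how repeated phrases are disambiguated.
    """
    results = []
    post_idx, offset = 0, 0
    for quote in quotes:
        found = None
        i, start = post_idx, offset
        while i < len(posts):
            pos = posts[i].find(quote, start)
            if pos != -1:
                found = (i, pos)
                break
            i, start = i + 1, 0  # continue from the top of the next post
        results.append(found)
        if found is not None:
            # Non-overlap assumption: resume after the end of this match.
            post_idx, offset = found[0], found[1] + len(quote)
    return results


# Made-up split interview and three annotations on the same word.
posts = [
    "Q: How do you feel?",
    "I feel angry. Really angry about it.",
    "Q: And now?",
    "Still angry.",
]
anchors = reanchor_by_quote(["angry", "angry", "angry"], posts)
```

Anything that comes back as `None` would still need manual re-coding, which is part of why the number of affected interviews matters.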

@amelia, I leave the decision to you.

amelia · May 23, 2021, 9:45pm

@Wojt, can you answer this please? Thanks!

amelia · May 24, 2021, 8:34am

I need @Wojt and @Jirka_Kocian to make this decision, please, as you are the ones who will have to implement the solution. Please answer @matthias’s question and work with him to figure out a plan. Thanks!

jitka.kralova · May 26, 2021, 7:58am

@rebelethno let us know once this is decided so we know where to upload new interviews and whether to break up the old ones.

amelia · May 26, 2021, 8:09am

This is about existing interviews and doesn’t affect what we agreed on in the meeting. Please go ahead and break up non-coded interviews and upload new ones in the broken-up fashion we discussed. As we agreed, create a new thread/topic for each interview and agree on a naming convention with @Djan and @Maniamana.

Maniamana · May 26, 2021, 8:30am

Cool, roger that! Do we do all that in the protected area (if so, should we have a joint protected area for all languages, or have separate ones for the purpose of not having a mix up) or in the respective languages’ fora, or elsewhere? @amelia @wojt @Jirka_Kocian

Wojt · May 26, 2021, 5:10pm

@matthias
Thank you for your patience. For the Polish community it’s 8/9 interviews coded so far, with only two of them coded with multiple instances of the same code.

Wojt · May 26, 2021, 5:12pm

I can recode the Polish ones manually all right.
@matthias do you still want the links?
I can’t remember what the situation is like for Czechia.

amelia · May 27, 2021, 9:08am

For you to decide, as long as you coordinate. But for ethics reasons you need to have it in the protected area. If you want a new category made, please ask @matthias as I can’t make new Discourse categories