A first look at the German corpus by gender

So, tonight I delved into the data, but I soon ran into a data issue. Some people who authored posts in topics tagged with #ethno-rebelpop-deutsch-interviews do not appear in @Richard’s and @Djan’s “German gender” file. The latter has 41 entries, but I find 44 forum users in the dataset (one of them is Djan!). For your convenience, I have created a new version of the file which contains all the usernames, and put it here.

Hi Alberto. Djan and Sabine (Biene) were the interviewers. I only coded their questions if the subsequent answers didn’t make sense on their own (e.g. monosyllabic). The other ‘missing person’ was GERAnnette30b. I had overlooked her interview but have now coded it.


Thanks, @Richard! Here’s a first look at the data. Also ping @Nica and @Djan.

The German corpus is coded with 847 codes, whose ancestors are 77 additional meta-codes. The corpus has 28,119 co-occurrence edges, derived from 2,311 annotations. The participants are 42 informants plus 2 interviewers. 32 informants and one interviewer are female; the remaining 10 informants and the other interviewer are male.

I created co-occurrence networks by gender based on this file. The gender difference is large. The following table shows the number of codes and co-occurrence edges that can be found in the annotations of interviews with females, males, and both genders.

| All co-occurrences | codes | edges |
|---|---|---|
| females only | 299 | 4769 |
| males only | 341 | 5903 |
| both | 284 | 354 |
| overlap coefficient | 0.48 | 0.06 |
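
For reference, the overlap coefficients in the last row can be approximately reproduced from the counts in the table. This is a minimal sketch, assuming the Szymkiewicz-Simpson definition (size of the intersection divided by the size of the smaller set); the function name is illustrative and not taken from the actual analysis code.

```python
def overlap_coefficient(only_a: int, only_b: int, both: int) -> float:
    """Szymkiewicz-Simpson overlap: |A & B| / min(|A|, |B|).

    Assumption: the table rows give the counts exclusive to each
    gender plus the shared count, so |A| = only_a + both.
    """
    smaller = min(only_a + both, only_b + both)
    if smaller == 0:
        return 0.0  # no items at all: define the overlap as zero
    return both / smaller


# Counts copied from the table (females only, males only, both).
codes = overlap_coefficient(299, 341, 284)
edges = overlap_coefficient(4769, 5903, 354)
print(f"codes: {codes:.3f}, edges: {edges:.3f}")
```

Under that assumption the result is about 0.487 for codes and 0.069 for edges, which matches the 0.48 and 0.06 in the table if those values were truncated rather than rounded.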

A visualization of this situation can be seen below. This graph is already reduced: it shows only the 970 edges representing co-occurrences with association depth d > 2 in either gender. Green edges are derived from interviews with female informants, orange edges from interviews with male informants. Green nodes represent codes that appear only in interviews with female informants, and orange nodes represent codes that appear only in interviews with male informants. Gray nodes represent codes that appear in interviews with informants of both genders.

Below, you can see the deepest associations (d > 4) evoked by males (59 codes, 98 edges) and females (69 codes, 150 edges).

Most nodes are now gray, representing codes associated with both male and female interviews, but that does not mean that the two genders share the deepest associations: a code is colored gray if it is used at least once to code an interview with a female informant and at least once to code an interview with a male one. Indeed, there is little overlap for the deepest associations too.
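
The coloring rule, and why gray nodes do not imply shared edges, can be sketched as follows. This is only an illustration under assumptions: the function names are hypothetical, edges are treated as undirected, and the codes "A", "B", "C" are placeholders rather than codes from the corpus.

```python
def node_colors(female_codes: set, male_codes: set) -> dict:
    """Color each code by the gender(s) of the interviews it appears in."""
    colors = {}
    for code in female_codes | male_codes:
        if code in female_codes and code in male_codes:
            colors[code] = "gray"    # used at least once for each gender
        elif code in female_codes:
            colors[code] = "green"   # female interviews only
        else:
            colors[code] = "orange"  # male interviews only
    return colors


def shared_edges(female_edges, male_edges) -> set:
    """Edges present in both graphs, ignoring the order of endpoints."""
    def normalise(edges):
        return {frozenset(edge) for edge in edges}
    return normalise(female_edges) & normalise(male_edges)


# Placeholder data: A and B occur in both genders, yet no edge is shared.
female_edges = [("A", "B")]
male_edges = [("A", "C"), ("B", "C")]
colors = node_colors({"A", "B"}, {"A", "B", "C"})
common = shared_edges(female_edges, male_edges)
print(colors["A"], colors["B"], colors["C"], len(common))
```

In this toy case both endpoints of the female edge are gray, but the edge itself exists only in the female graph, which is the situation described above.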

| d > 4 | codes | edges |
|---|---|---|
| females only | 69 | 144 |
| males only | 59 | 92 |
| both | 14 | 6 |
| overlap coefficient | 0.24 | 0.07 |

Here is a list of shared codes:
'Unvaccinated', 'Vaccinated', 'negative attitude', 'face mask', 'stress',
 'working remotely', 'upside of COVID', 'Media', 'impact of COVID-19', 
'social distancing', 'infection rate', 'employment', 'Impact of COVID-19',
And here is the list of shared edges:

('face mask', 'social distancing'), ('Unvaccinated', 'Vaccinated'), ('LAinsecurity', 'stress'), ('upside of COVID', 'working remotely'), ('working remotely', 'Impact of COVID-19'), ('employment', 'Impact of COVID-19')

The very strongest edges (both in terms of depth and breadth) are highlighted. In the female graph, they form a triangle between GENinequality, impact of COVID19 and childcare:

In the male graph, some of the highest-d edges have b=1, so the choice of the strongest edges is more difficult. Provisionally, I chose the three highest-d edges with b > 1. They form a star, connecting impact of COVID 19 to digitisation, working remotely and positive change.
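
The provisional selection rule (drop edges with b = 1, then keep the three with the highest d) could look like the sketch below. The `Edge` record, the function name and the sample values are all hypothetical; the real edge data is not shown in this thread.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    target: str
    d: int  # association depth
    b: int  # association breadth


def strongest_edges(edges, min_breadth=2, k=3):
    """Keep edges with b >= min_breadth, then return the k deepest."""
    eligible = [e for e in edges if e.b >= min_breadth]
    # sorted() is stable, so ties in d keep their original order.
    return sorted(eligible, key=lambda e: e.d, reverse=True)[:k]


# Placeholder edges; the values are invented purely for illustration.
sample = [
    Edge("X", "a", d=9, b=1),  # deepest, but excluded because b = 1
    Edge("X", "b", d=7, b=3),
    Edge("X", "c", d=6, b=2),
    Edge("X", "d", d=5, b=4),
    Edge("Y", "e", d=4, b=2),
]
top = strongest_edges(sample)
print([(e.source, e.target) for e in top])
```

Filtering on breadth before ranking by depth keeps a single talkative informant from determining which edges count as the strongest.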


Thanks @alberto! The difference is really interesting. So would this reflect the idea that Covid-related changes in labor practices exacerbated inequality in the gendered division of labor, with women having a more negative experience (maybe the burden on them becoming greater) and men having a more positive experience (the advantages of working remotely without an increase in their share of the labor)?

@Richard do you think this fits with the idea of re-traditionalization of gender roles that emerged as one of the most salient points from the German data?