Starting the cleanup of POPREBEL

alberto · October 20, 2022 08:57

So, I presented our paper at ICQE22, and I am happy to report it was well received. I was paired with David Shaffer, who is seen as the Godfather of quantitative ethnography, so I had a very full room and lots of questions. I did miss @Nica and @Jan and @Richard to take the more anthropological questions (“wait, Gramsci was no anthropologist!”), but it went very well.

Now it’s time to clean up. On my end – besides, of course, the final report – the needed activities are:

Create the Zenodo upload for the data (in Tulip form) used for the ICQE22 paper.
Export all POPREBEL interview data, and put them in a Zenodo upload. This is going to be a bit of a hassle, because my current export script does not cover gender (nor should it), and because we have three corpora. I think I will export each (single-language) corpus separately, and manually modify the datapackage.json metadata file so that it has resource entries for three annotation files, three codes files, and so on. It should not be a problem, because codes have unique IDs, so people can decide to treat them as a unified corpus or keep them separate.
Export the codebook in human-readable form, and upload it to Zenodo.
Submit the Applied Network Science paper. @bpinaud, have you finished revising the text?
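
For illustration, the manual datapackage.json merge mentioned above could be sketched roughly as follows. This is only a sketch under assumptions: it assumes each single-language export yields its own descriptor with a top-level `resources` list, and the corpus labels, file names and the `merge_datapackages` helper are hypothetical, not part of the actual export script.

```python
import copy
import json


def merge_datapackages(packages, name):
    """Combine several Data Package descriptors into one.

    `packages` maps a corpus label (hypothetical, e.g. "pl") to a parsed
    datapackage.json dict. Resource names and paths are prefixed with the
    label so that the three annotation files, three codes files, and so
    on do not collide in the merged descriptor.
    """
    merged = {"name": name, "resources": []}
    for label, descriptor in packages.items():
        for resource in descriptor.get("resources", []):
            entry = copy.deepcopy(resource)
            entry["name"] = f"{label}-{resource['name']}"
            if isinstance(resource.get("path"), str):
                entry["path"] = f"{label}/{resource['path']}"
            merged["resources"].append(entry)
    return merged


# Toy descriptors standing in for the real exports (illustrative only).
example = {
    "pl": {"resources": [{"name": "annotations", "path": "annotations.csv"},
                         {"name": "codes", "path": "codes.csv"}]},
    "cz": {"resources": [{"name": "annotations", "path": "annotations.csv"},
                         {"name": "codes", "path": "codes.csv"}]},
}
merged = merge_datapackages(example, "poprebel-interviews")
print(json.dumps(merged, indent=2))
```

Because the codes keep their unique IDs, prefixing only the resource names and paths leaves the rows themselves untouched, so the files can still be concatenated into one corpus later.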

To do all this, I need you guys to stop tweaking the codes. Or, at least, to stop for now: you can resume in 2023, but at that point we will have consistency between the POPREBEL report and the POPREBEL data. We need to make sure that the former points to the version of the data I create now, and not to the dataset in general, since the latter will probably be updated later.

Does this work?

bpinaud · October 20, 2022 09:17

Yes. If you’d like, we can plan a zoom call (even with @melancon ) to do a review together.

alberto · October 20, 2022 09:22

Great. What about Monday?

bpinaud · October 20, 2022 09:47

after 4pm

alberto · October 20, 2022 18:10

I have assigned to all interviews a common Discourse tag: ethno-poprebel-all-interviews. This way, we can call both each separate language corpus (via ethno-rebelpop-polska-interviews etc.), and view all POPREBEL interviews as one large corpus. The former thing makes more sense for data analysis, but the latter is perhaps best to build the codebook, since many codes recur across the three corpora.

The POPREBEL codebook is, as you know, a deliverable. In order to build it, the process is:

First access the table rendering of the codebook relative to ethno-poprebel-all-interviews, and any creator:

https://edgeryders.eu/annotator/codes?discourse_tag=ethno-poprebel-all-interviews&order=name&view=table (note: for whatever reason, this link – though correct – returns a 404 error. So, please copy-paste it into the browser’s address line, or just head over to the Codes page in Annotator and manually select the ethno-poprebel-all-interviews tag, Creator => Any creator, Order => Name and View => Simple table).

We could also render it as a list, but the problem there is that children codes are nested under parent codes.
Copy the table onto a Google Doc, or whatever word processor you like, and add the logos etc.
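
In case it helps anyone who wants the same table for a single-language corpus, here is a small sketch that rebuilds the link above from its parts. The query parameter names (`discourse_tag`, `order`, `view`) are taken from that link; the `codebook_url` helper is hypothetical, and I am assuming the other tags work the same way.

```python
from urllib.parse import urlencode

# Base address of the Codes page, as it appears in the link above.
BASE = "https://edgeryders.eu/annotator/codes"


def codebook_url(tag, order="name", view="table"):
    """Build the table-view URL of the codebook for one Discourse tag."""
    query = urlencode({"discourse_tag": tag, "order": order, "view": view})
    return f"{BASE}?{query}"


url = codebook_url("ethno-poprebel-all-interviews")
print(url)
```

Swapping in a per-language tag such as ethno-rebelpop-polska-interviews should give the corresponding single-corpus view, under the same assumption.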

@SantosCardonaPR, you have been most involved in this, do you want to give it one last scan? Just follow the link above. And of course, @Nica’s nod of approval is important to me.

Once you give the nod, someone in EDGE will create the deliverable document proper.

Nica · October 20, 2022 21:32

@alberto – that page is actually not accessible to me. It says it either does not exist, or is private. As soon as you update the permission on that, I will take a look.

alberto · October 21, 2022 06:54

See my previous post, I corrected it.

alberto · October 21, 2022 07:14

Calendar invitation sent. @Jan, @Richard, you are also invited, but do not feel obliged: I think that @bpinaud, @melancon, @Nica and I can do this.

melancon · October 21, 2022 07:43

@alberto @bpinaud I just received the zoom invite, I’ll be there. I teach until 4pm, I might arrive a bit late.

alberto · October 24, 2022 08:50

Hello @rebelethno, I have now finished exporting, cleaning and uploading the data from our interviews. You should cite the dataset in your academic work: https://doi.org/10.5281/zenodo.7243713. Please note that this is a different dataset from the one used for the ICQE paper.

I still need to add some authors, though: specifically, the ethnographers who did the interviews and the coding of the Czech and German parts of the corpus. This means @Djan, @SantosCardonaPR, @jitka.kralova, @SZdenek and @Jirka_Kocian … am I forgetting anyone?

For each of you, I need: full name, affiliation and ORCID number.

bpinaud · October 24, 2022 14:19

I have been online since 4pm

alberto · October 24, 2022 14:52

Sorry, I was programming and lost myself! I am also online, and I see you logged in, but I guess you have (rightly) walked away from your computer.

SantosCardonaPR · October 24, 2022 17:50

Hi Alberto, I am including my information below:

Santos Rivera-Cardona
PhD Student
Rutgers University

https://orcid.org/0000-0002-7369-1119

alberto · October 25, 2022 14:51

@SantosCardonaPR thanks for that. Also, I found two more pairs of duplicate codes, created by @Richard and @Jirka_Kocian. I imagine you will want to merge them, right? Can you do it? The first pair in particular looks quite well connected and might show up on the radar.

children
- https://edgeryders.eu/annotator/codes/9701
- https://edgeryders.eu/annotator/codes/4285
civil rights
- https://edgeryders.eu/annotator/codes/9634
- https://edgeryders.eu/annotator/codes/4220
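
For what it is worth, pairs like these can be spotted by grouping codes by name. The sketch below is not how Annotator does anything: it assumes you already have the codes as a list of records with `id` and `name` fields (a hypothetical layout), and the last record is made up to show a non-duplicate.

```python
from collections import defaultdict


def find_duplicate_codes(codes):
    """Return {normalized name: sorted ids} for names used by more than one code."""
    by_name = defaultdict(list)
    for code in codes:
        # Compare names case-insensitively and ignore stray whitespace.
        key = code["name"].strip().lower()
        by_name[key].append(code["id"])
    return {name: sorted(ids) for name, ids in by_name.items() if len(ids) > 1}


codes = [
    {"id": 9701, "name": "children"},
    {"id": 4285, "name": "children"},
    {"id": 9634, "name": "civil rights"},
    {"id": 4220, "name": "civil rights"},
    {"id": 1, "name": "example unique code"},  # hypothetical, not a real code
]
duplicates = find_duplicate_codes(codes)
print(duplicates)
```

Matching on the name alone will also flag codes that share a label but sit under different parents, so each hit still needs a human look before merging.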

SantosCardonaPR · October 25, 2022 20:21

Hi Alberto,

I just merged them! Please let me know if there are any other codes we need to work on!

Have a lovely rest of your week!

alberto · October 25, 2022 21:18

OK, but now you assigned all POPREBEL annotations to the parentless code (9701), whereas it is 4285 that comes with the path from the Z category… is that intentional?

SantosCardonaPR · October 25, 2022 21:33

From what I understand, @Wojt or @Jan can correct me if I am wrong, the Z category is not supposed to be visualized; hence, we have been trying to eliminate them. I can erase the Z category and leave “children” on its own for visualization tho. What I did is basically leave the code “children” on its own because ultimately, the Z category would be deleted. Does that make sense?

Anyway, I just changed the parent code and made the code “children” that contains POPREBEL annotations fall into the Z-Y-X category. Now the code is 9701.

SZdenek · October 28, 2022 11:00

@alberto here it is:
Zdeněk Sloboda
Charles University in Prague, Institute of International Studies, Faculty of Social Sciences
research assistant
ORCID: 0000-0001-9721-7983

alberto · November 09, 2022 07:59

Ping this. Are we ready to submit? Needs to be done in the next three weeks, or not at all.

jitka.kralova · November 09, 2022 09:32

@alberto here is mine:

Jitka Králová
UCL School of Slavonic and East European Studies, PhD student
ORCID: 0000-0002-7513-6346