Summer report on POPREBEL WP2

As we prepare to finalize reporting, here is where we are.


As I write this, the POPREBEL conversation consists of 1,731 posts, authored by 266 different people and adding up to 365,000 words. Since my last check (2020-05-15) we have added on average 1.9 new posts a day, mostly in the international sub-forum.

The figure below shows the social network of the conversation, color-coded by sub-forum. In blue, Polish; in red, Czech; in green, English; in orange, Serbian; in purple, German. In gray, users that participate in more than one sub-forum – there are 28 of them out of 266. Users that only participate in one sub-forum are color-coded like that sub-forum.


As I write this, the POPREBEL corpus has been annotated with 2,393 annotations, employing 1,157 codes. Since 2020-05-15 we have been using a new strategy for coding, where we no longer provide support for coding in Serbian. This seems to have paid off: in just over two months, the number of annotations more than doubled, which means that we have been coding at a minimum of 6 times the rate we had kept before. Coding in Czech, and even more so in Polish, has made a big step forward. Coding in Serbian also made progress, and appears to have picked up pace on its own.

This is shown in the table below. Notice that, while adding annotations almost always means progress in the work, adding codes does not. Codes are created, but they are also periodically merged into each other, so a reduction in the number of codes can mean progress too, and it can go together with an increase in the number of annotations.

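| Forum | Annotations up to May 15th | Annotations up to July 27th | Annotations difference | Codes up to May 15th | Codes up to July 27th | Codes difference |
|---|---|---|---|---|---|---|
| English | 496 | 644 | 148 | 288 | 293 | 5 |
| Polish | 98 | 737 | 639 | 68 | 395 | 327 |
| Czech | 364 | 736 | 372 | 143 | 332 | 189 |
| Serbian | 210 | 276 | 66 | 120 | 137 | 17 |
| Total | 1,168 | 2,393 | 1,225 | 619 | 1,157 | 538 |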
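
For anyone who wants to re-check the totals and the "more than doubled" claim, here is a minimal Python sketch. The counts are copied from the table above; the variable names are just for illustration, and I am assuming both snapshots fall in 2020.

```python
from datetime import date

# Per sub-forum counts copied from the table above, in this order:
# (annotations May 15th, annotations July 27th, codes May 15th, codes July 27th)
counts = {
    "English": (496, 644, 288, 293),
    "Polish": (98, 737, 68, 395),
    "Czech": (364, 736, 143, 332),
    "Serbian": (210, 276, 120, 137),
}

# Column totals across the four sub-forums.
ann_before = sum(c[0] for c in counts.values())
ann_after = sum(c[1] for c in counts.values())
codes_before = sum(c[2] for c in counts.values())
codes_after = sum(c[3] for c in counts.values())

# Assumption: both snapshot dates are in 2020.
days = (date(2020, 7, 27) - date(2020, 5, 15)).days

# Growth factor of the annotation count and average new annotations per day.
growth = ann_after / ann_before
new_per_day = (ann_after - ann_before) / days

print(f"annotations: {ann_before} -> {ann_after} (x{growth:.2f})")
print(f"codes: {codes_before} -> {codes_after}")
print(f"{days} days, {new_per_day:.1f} new annotations per day on average")
```

The totals match the last row of the table, and the annotation count grows by a factor slightly above 2 over the 73 days between the two snapshots.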

We have updated the interactive dashboard with this most recent batch of coding. You can find it here and play with it as much as you like; there is nothing you can break.

Ping @Jan and @Richard.


Having looked at Graph Ryder and decided that we are still too unconnected (a suspicion we had, confirmed by the graph), we are doing a merging exercise next week (likely Thursday), to merge all the codes we have that have similar meanings, or to create new codes together that capture combined meanings. It’s also a good collective intelligence exercise, because it brings us into the same brain space as we see the entire map of each other’s codes and hear each other’s coding rationale and approach. I defined categories based upon my codes and sorted them to create a more manageable, visible breakdown; each of the other teams is now duplicating this and roughly categorising their existing codes in the same way. Once they have finished, I will create a master list of the codes by category (colour coded to show whose is whose) and then we will use it to create merges and hierarchies in real time on Thursday.

After Thursday, I suspect when we redisplay the SSNA it will be much more interlinked. And now that everyone is on track with coding, we can do this exercise regularly.

I agree with both your diagnosis and your response. Another thing that happens is that some codes co-occur with other codes named in different languages:


If GraphRyder displayed the English version in all cases, we would have a better view of how the people in this conversation see the issue, across all languages.


I think this only happens when someone has not assigned an English translation to the code. I can check to confirm that suspicion… otherwise it’s a technical fix.

Yep, suspicion confirmed – it’s only when there is no English translation assigned. @Jirka_Kocian @SZdenek @Wojt @Jan, please make sure that all your codes have English translations :slight_smile:
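
If it helps with the check, here is a rough Python sketch of how one could list codes that have no English translation. The data layout (a list of dicts with `name` and `english` keys), the function name and the sample entries are all hypothetical placeholders, not the actual schema or an export from our platform.

```python
def codes_missing_english(codes):
    """Return the names of codes whose English translation is absent or blank."""
    return [
        code["name"]
        for code in codes
        # Treat a missing key, None, and whitespace-only strings as "no translation".
        if not (code.get("english") or "").strip()
    ]


# Made-up placeholder entries, only to show the expected shape of the input.
sample_codes = [
    {"name": "example-code-1", "english": "example code one"},
    {"name": "example-code-2", "english": ""},
    {"name": "example-code-3", "english": None},
    {"name": "example-code-4"},
]

missing = codes_missing_english(sample_codes)
print(missing)
```

Each ethnographer could run something like this over their own list of codes and fill in the translations for whatever names come back.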
