Masters of Networks - POPREBEL Challenge documentation

My mistake, thanks for pointing it out, I forgot to change the date. I also rebranded the version 2.1, it should not have been 2.5!

I confirm the data in there are updated to yesterday.

2 Likes

Found that code #6816 does not exist. It is associated with post 63558.
May we have a new export? @amelia did a lot work on the dataset.

Hi @rebelethno ,
what time today are we re-grouping for the wrap up?

At about 14:30 in the main room!

2 Likes

@alberto, possible to have a new export of the codes?

New nicer version on the latest export on zenodo:

as a reminder : I took all posts in a chronological order. Then I built the code co-appearance dynamic network with a threshold>=5 in the example of the image and the video. The video shows a very basic animation of the graph construction (having a nice and smooth animation is a difficult problem). The tulip file contains 180 subgraphs. Each subgraph can be visualized and studied. The python code and the Tulip file are here: https://github.com/bpinaud/MoN5

2 Likes

@rebelethno, we are now back in the main room! Please join :slight_smile:

@alberto May we have again a new export of the data? @amelia just finished to edit all codes. My graphs will look much better.

Some of Ameliaā€™s notes from 3 straight days of categorizing, merging, and general cleanup of our codes (and big thanks to @Jirka_Kocian and @SZdenek for all the help) :

  • All the codes are now visible on one page!!! hoooraaaay!

  • ā€œActionsā€ codes are going to be useful, in my opinion, and we should try to format them the same (e.g. in ā€œgerundā€ form). same with Resource Needs and Problems. Could break Actions down into things like ā€œcoping strategiesā€ and ā€œsolutionsā€ or something

  • People and Identities has a few different sub-cats

    1. Politicians
    2. Other named people
    3. Ethnic / Religious / Racial IDs (can disaggregate if needed)
    4. social roles (eg. ā€˜employersā€™, ā€˜studentsā€™, ā€˜freelancersā€™, ā€˜doctorsā€™
  • Places also has a few subcats ā€“ proper nouns and more general places (e.g. villages, cities).

    1. thereā€™s a scalar element embedded in there that we might want to disaggregate
  • There is work to be done cleaning up and merging codes ā€“ we could condense many together. also, quite a few could be deleted as they are too vague.

  • Now when we code, we should make sure to add new codes to the new categories.

Other note ā€“ we really need a way to only see annotations done in POPREBEL ā€“ itā€™s confusing to see a code with 50+ annotations only to realise that only 2 are from POPREBEL. (@hugi, @alberto, any thoughts on how we could do this? The mixing in the backend is getting very dicey.)

Check out @bpinaudā€™s great visualisations based on these new cats as well, and with the ability to see the codes unfold over time as new posts are added. Letā€™s discuss in our next bi-weekly!

@Jan @Richard @Wojt @Jirka_Kocian @SZdenek

2 Likes

3 datasets in VR
Here is a early version of the virtual exhibition of the three data sets of data and the fun video about ā€œwalking and stomping on dataā€ from opencare

2 Likes

Iā€™ll be happy to continue working on this with you all and improve the visualizations. The code is on github: https://github.com/bpinaud/MoN5. It is the first time I used pandas for dealing with csv files. That is great. Usually, I let students do the workā€¦

3 Likes

Indeed! I have my own source of confusion: the API endpoint for codes carries the number of annotations the codes has been used in. But that number is platform-wide, not corpus-specific! So, the code migration in NGI appears to be very important, appearing in many annotations, but most of those are not in NGI. @hugi I believe a check-in on multi-tenancy is in order.

1 Like

Done now. Same Zenodo page, version 2.2.

1 Like

Used the new data. Now 185 subgraphs with threshold=5. New tlpx file uploaded on github (https://github.com/bpinaud/MoN5)

Yes, this is because it was a big open care code!

There are no corpus-specific aggregations in OE, as far as I know, and never have been. All corpus-specific calculations happen on the Graphryder end, but those are not available to you through the API. In the Neo4j database (without multi-tenancy), you just count the number of ā€˜REFERS_TOā€™ edges coming in to a given code node. This is a very cheap operation, almost as cheap as storing it in a property.

In the multi-tenant Graphryder Neo4j database, we will store the number of annotations per code per corpus as properties on the IN_CORPUS relation between codes and corpus nodes, where ā€corpusā€ is just a special label applied to tag nodes that define corpora. You will be able to access this through the GraphQL API, along with other pre-calculated data like the co-occurrence number per code pair and corpus.

That is just wonderful work, yā€™all!
I couldnā€™t attend the event because of a conference at my uni, but it looks just beautiful now!
Iā€™m getting down to going through all the novelties.

<3

2 Likes

@rebelethno
Ok, so Iā€™ve had a look at the meta ā€œZā€ categories and it would be of great use to us, I think, if we could apply the ā€œsort by: number of annotationsā€ tool to their entirety. For now it seems to work only for the annotations associated with a code that a given category derived its name from.
E.g. When aplying the sort by: number of annotations to the ethno-poprebel corpus we get COVID-19 with a 100 annotations for the COVID-19 code + 235 annotations for its children.
Next comes Emotions with 3 + 958
Then Ideology with 2 + 655
And so on.
Thus, maybe we could, if possible, also sort the categories by their total number of annotations (parents + children)ā€¦(on the other hand, with just a few of them, itā€™s not a big deal really, since if you click on them, they fold and you get this:
image
Maybe it could also be useful (again, if possible) to introduce a ā€œsort by: number of childrenā€ option? What do you think?
I feel that seeing a total number of annotations could give us a glimpse into how salient certain topics are (once we sort the issue with the number showing all possible annotations, not just those for poprebel) and by looking at the number of children we could maybe see how informed/thorough/nuanced the conversation on a given topic is?
Like, say, if we had 335 people just mentioning Covid in their stories, we would get 335 annotations under the Covid code/category. However, if the same number of people speaks of:

  • Impact of COVID-19 (110)
  • contact tracing (38)
  • lockdown (25)
  • upside of COVID (13)
  • anti-COVID measures (10)
  • vaccinations (8)
  • pandemia (7)
  • Lockdown Light (3)
  • infection rate (3)
  • face mask (2)
  • Mutation (2)
  • respirators (2)
  • aftermath of the pandemic (2)
  • positives of the pandemics (2)
  • false positives (1)
  • sanitary-epidemiological station (1)
  • mortality rates (1)
  • vaccine (1)
  • penalization for non-vaccination (1)
  • quarantine (1)
  • masks (1)
  • compulsory vaccinations (1)
    We can see that they donā€™t just say things along the lines of ā€œOh, so there is this pandemic and I donā€™t like itā€, but that they are discussing various aspects of COVID-19 - Category in detail - something which we canā€™t really see when we just look at the number of annotations (100+235 = 335).
    Of course, we can scroll and see how long the list of children is, but maybe it would be useful to automatically count them and put there an option to sort by that number?

That is, of course, under the proviso that the variance in the number of children is not just due to the codersā€™ bias. It seems rather clear to me that one coderā€™s interest in, and knowledge about, say, various aspects of the pandemic, will probably result in a greater number of children - they will be able recognize and distinguish between a greater number of different phenomena, name them and create specific codes for them.

PS Look at that curiosity - it shows zero codes created by me under poprebel; they appear only under Any Discourse Tag.
image
EDIT: You know how many of my codes appear under Any Discourse Tag? Seven. 7. Sieben. Sedm. Siedem kodĆ³w.
I distinctly remember creating more than 7ā€¦have the past several months been just some kind of a EU-funded figment of my imagination?

@rebelethno

What do you think about putting the proposed category CHANGE under Z Social and Political Processes - Category?
image