Graphryder (RyderEx) experiments

Let’s try it!

Testing to quote this post in another topic.

Also quoting this post in the same topic.


-- Posts quoted by post 100476
SELECT post_id, quoted_post_id FROM quoted_posts
WHERE post_id = 100476;

-- Posts that post 100476 is a reply to
SELECT post_id, reply_post_id FROM post_replies
WHERE reply_post_id = 100476;

Discourse counts quotes within a thread as replies to the quoted posts. This means that a single post can be a reply to multiple posts at once.

RyderEx combines the quotes and replies into a single relationship between participants to count participant interactions. However, I now realize that the interaction counter on that relationship is wrong, since it counts two interactions when somebody makes a quote which is also a reply. In the example above, the TALK_OR_QUOTED counter is 9, but it should in fact be 6.

Furthermore, Discourse does not consider a post A in topic X with first post Y a reply to post Y. At the moment, neither does RyderEx, though this is pretty easy to change.


This is now fixed. If a post is both a quote and a reply, it now only counts as a single interaction.
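For anyone reading along, here is a minimal sketch of the deduplication idea, not the actual RyderEx code. It assumes quotes and replies have both been normalized to (responding post, target post) pairs; the function name `interaction_counts` and the `author_of` mapping are hypothetical.

```python
from collections import Counter


def interaction_counts(replies, quotes, author_of):
    """Count interactions per (author, target author) pair.

    replies and quotes are iterables of (post_id, target_post_id) pairs.
    author_of maps a post id to the username of its author.
    """
    # The set union keeps a pair only once, so a post that both quotes
    # and replies to the same target counts as a single interaction.
    links = set(replies) | set(quotes)
    counts = Counter()
    for post_id, target_id in links:
        counts[(author_of[post_id], author_of[target_id])] += 1
    return counts


# Post 2 both replies to and quotes post 1, so it is counted once.
replies = [(2, 1), (3, 1)]
quotes = [(2, 1), (4, 1)]
author_of = {1: "a", 2: "b", 3: "b", 4: "c"}
counts = interaction_counts(replies, quotes, author_of)
```

Here `counts` holds 2 interactions from "b" to "a" and 1 from "c" to "a", rather than the 3 and 1 that a naive sum of both tables would give.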


I have now pushed an update that enables pseudonymization, the inclusion of protected content, and redaction of protected content titles and text on a platform level. This means that BBU, Blivande, and Edgeryders.eu can share the same installation and still have different settings.

Since the POPREBEL corpus includes quite a lot of protected content, I think this is a good compromise for now. I personally think that the pseudonymized usernames are a bit silly given how easy it is to circumvent by just clicking a post, but I suppose it gives a little extra layer of protection by obfuscation.

These are the current settings for the platforms:

Edgeryders dashboard
Usernames: Pseudonymized
Protected content: Included, with text, titles, and annotation quotes redacted
Consent policy: Only includes the content of users who have passed the research consent funnel

Blivande dashboard
Usernames: Original
Protected content: Included, with text, titles, and annotation quotes redacted
Consent policy: Assumes consent of all registered users

BBU dashboard
Usernames: Original
Protected content: Omitted
Consent policy: Assumes consent of all registered users
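As an illustration only, the three sets of settings above could be written down as a per-platform configuration along these lines. The class name, field names, and dictionary keys are all hypothetical and do not reflect the real RyderEx config format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformSettings:
    """Hypothetical per-platform settings; field names are assumptions."""
    pseudonymize_usernames: bool
    include_protected: bool
    redact_protected: bool
    require_consent_funnel: bool


# One entry per dashboard, mirroring the settings listed above.
PLATFORMS = {
    "edgeryders": PlatformSettings(
        pseudonymize_usernames=True,
        include_protected=True,
        redact_protected=True,
        require_consent_funnel=True,
    ),
    "blivande": PlatformSettings(
        pseudonymize_usernames=False,
        include_protected=True,
        redact_protected=True,
        require_consent_funnel=False,
    ),
    "bbu": PlatformSettings(
        pseudonymize_usernames=False,
        include_protected=False,
        redact_protected=False,
        require_consent_funnel=False,
    ),
}
```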


@alberto and @amelia

I noticed that the POPREBEL corpus has quite a few codes prefixed with *.
There is a function in the RyderEx import script that allows defining a list of prefixes, and any code with one of those prefixes is omitted along with all of its annotations. This config and list of prefixes is applied on a platform level. Let me know if you want to start using a prefix to omit annotations from the graph.

Adding or removing a prefix is very easy, and rebuilding the database only takes a couple of minutes.
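To make the behaviour concrete, here is a rough sketch of what prefix-based omission could look like. It is not the import script itself: the function name `filter_codes` and the data shapes (a dict of code ids to names, a list of annotation and code id pairs) are assumptions for illustration.

```python
def filter_codes(codes, annotations, omitted_prefixes):
    """Drop codes whose name starts with an omitted prefix, plus their annotations.

    codes: dict mapping code_id -> code name
    annotations: list of (annotation_id, code_id) pairs
    omitted_prefixes: iterable of prefix strings, e.g. ["*"]
    """
    # str.startswith accepts a tuple and returns False for an empty one,
    # so an empty prefix list omits nothing.
    prefixes = tuple(omitted_prefixes)
    kept_codes = {
        code_id: name
        for code_id, name in codes.items()
        if not name.startswith(prefixes)
    }
    # An annotation survives only if its code survived.
    kept_annotations = [a for a in annotations if a[1] in kept_codes]
    return kept_codes, kept_annotations


codes = {1: "*draft", 2: "privacy"}
annotations = [(10, 1), (11, 2)]
kept_codes, kept_annotations = filter_codes(codes, annotations, ["*"])
```

With the prefix list `["*"]`, only the "privacy" code and its annotation remain.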

@alberto, I needed to revert this back to using the union operator.
However, your hack works:

See your corresponding state on the new server.

In fact, this is a lot more useful than the old Graphryder click-on-edge method to bring up posts because we can look at more than two codes at once, like bringing up all posts that are coded with “privacy”, “personal data”, “making trade-offs” and “Google”.

Shame. What happened?

HYPERGRAPHS! :surfing_man:

As we finish the NGI and the POPREBEL report we will know more about this, and, if we are lucky, perhaps we can even pick a single measure, so it can make sense to postpone this development to late 2022.

However, it would be nice if we could get the single measure of link strength to be consistent with the methodological discussion we had with Jan and the ethnographers. This would have the extra advantage that we can get all of them to use the software in early 2022, and we would learn a lot from that.

Is there any way you could build edges based on a count of the occurrences of the two codes within a single post? If code1 occurs n times in a post and code2 occurs m times in the same post, then the contribution of that post to the weight of the code1 <=> code2 edge would be m × n. This would make RyderEx uniform with the analysis we are doing now (example).
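To pin down what is being asked, here is a small sketch of the proposed weighting, assuming annotations are available as (post id, code) pairs with one pair per annotation. The function name `cooccurrence_weights` is hypothetical and this is not how RyderEx currently computes edges.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(annotations):
    """Weight each code pair by the sum over posts of (count of a) * (count of b).

    annotations: iterable of (post_id, code) pairs, one per annotation.
    """
    # Count how many times each code occurs within each post.
    per_post = {}
    for post_id, code in annotations:
        per_post.setdefault(post_id, Counter())[code] += 1

    weights = Counter()
    for counts in per_post.values():
        # Sorting gives every undirected pair one canonical key.
        for a, b in combinations(sorted(counts), 2):
            weights[(a, b)] += counts[a] * counts[b]
    return weights


annotations = [
    (1, "privacy"), (1, "privacy"),
    (1, "google"), (1, "google"), (1, "google"),
    (2, "privacy"), (2, "google"),
]
weights = cooccurrence_weights(annotations)
```

In this example post 1 contributes 2 × 3 = 6 and post 2 contributes 1 × 1 = 1, so the "google" and "privacy" edge gets a weight of 7.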

Another question. How does Ouestware document the code? To be usable by someone who is not us, RX needs a good manual, at least. With Edgesense we relied a lot on inline help, and I wrote all the help text myself, inserting it into the appropriate places in the code repo. That was simpler software, though.

It turns out the filters are a little more complicated than I thought, and I would have to do more digging to make sure the change gets applied everywhere. It ended up breaking the graph filtering, so adding users to a scope and seeing the code graph specific to only those users no longer worked.

Database connection is set up now for all three platforms accessed by RyderEx. I have run graphryder-import-psql.py successfully. Runtime is around 150 s, just a few seconds slower than with a local database connection. And wayyy faster than old Graphryder, very nice :slight_smile:

I still have to do some security optimization (read-only user, encrypted connection) tomorrow. But connections are already restricted to a single combination of database, user and IP, so it’s ok for now.

P.S.: RyderEx drawing routines incl. zoom are very smooth and fast for a web-based application. Wondering what’s under the hood for drawing? WebGL?


Great, thanks @matthias!

Yes, I think so. In fact, that’s why we had to sacrifice the curved lines in the graphs - WebGL can’t draw them as efficiently.


These two go to the same link.

The general entry point for ryderex is now

@hugi, the coding work in POPREBEL is nearing its end. I would like to deliver them a graphryder dashboard, ideally with the “right” measure of link strength. We have, as I recall, already refinanced that work (you + Ouestware) in an internal budget reallocation for POPREBEL (cc @marina). What do you think about implementing it in the next month or so?

Alright, I have an idea how we could do it.

What’s the budget @marina?

Here: https://edgeryders.eu/t/an-opportunity-extra-activities-in-poprebel-in-2022/16491/16

So, 3k for staff going to you and 6k for services from Ouestware


I don’t think getting OuestWare to help us with this on such short notice will be possible.
Within the next month this is what I can do:

  • Build the new algorithm
  • Change the import configuration so that it can be set on a corpus level which metric to use for c
  • Indicate in the dashboard which metric is used for c in the corpus that is being explored

I could do this on the budget I have (3k). Doing it this way means we sidestep the thing we would need OuestWare for - which is to build a feature that lets you choose between different metrics for the same corpus. We can then use the 6k budgeted for services for other improvements later.

Makes sense @alberto?

Looks good, but you are the boss here.

Also, remember the issue of OR vs. AND. In the old Graphryder, you could choose the operator of what today we call the scope: “show me the people that authored contributions that were coded with code1 OR code2” vs. “show me the people that authored contributions that were coded with code1 AND code2”. This might be overkill, but you told me new Graphryder is nailed to OR. If it has to be nailed to anything, I believe AND to have more significance.
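The difference between the two operators can be sketched as follows. This assumes AND means that a single contribution carries all the scope codes, which is one possible reading; the function name `authors_in_scope` and the data shape are hypothetical and not taken from either Graphryder version.

```python
def authors_in_scope(coded_posts, scope_codes, operator="OR"):
    """Return the authors of posts that match the scope.

    coded_posts: iterable of (author, codes_on_post) pairs
    scope_codes: the codes selected in the scope
    operator: "OR" matches posts with any scope code,
              "AND" matches posts carrying every scope code
    """
    scope = set(scope_codes)
    authors = set()
    for author, codes in coded_posts:
        codes = set(codes)
        if operator == "AND":
            match = scope <= codes        # every scope code is on the post
        else:
            match = bool(scope & codes)   # at least one scope code is on the post
        if match:
            authors.add(author)
    return authors


coded_posts = [
    ("anna", ["privacy", "google"]),
    ("ben", ["privacy"]),
    ("cleo", ["trust"]),
]
either = authors_in_scope(coded_posts, ["privacy", "google"], "OR")
both = authors_in_scope(coded_posts, ["privacy", "google"], "AND")
```

With these sample posts, OR returns anna and ben, while AND returns only anna.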

You even made a code change, but then had to revert it:

Ping @hugi for a question.

When selecting the corpus, one is presented with a list to choose from:

image

However, it does not seem that the list is built from simply looking up the Discourse tags called ethno-SOMETHING. So, how do we add a corpus to that list (specifically: #ethno-icqe2022 )?

That’s odd. There seems to be a bug that prevents this particular tag from showing up in the dropdown. However, there is a simple workaround. Just go straight to server-2021.edgeryders.eu/dashboard/edgeryders/ethno-icqe2022.
