Graphryder 2.0 – Workplan

It also looks like some annotations are referring to topic tags?

MATCH (t:tag)<-[:REFERS_TO]-(a:annotation) RETURN t, a LIMIT 10

Yields

{
    "t": [
        {
            "id": 1,
            "label": "hugi-yes",
            "timestamp": "2019-12-25 15:04:08 UTC",
            "url": "https://bbu.world/tag/hugi-yes"
        },
        {
            "id": 3,
            "label": "filip-fav",
            "timestamp": "2019-12-25 14:30:54 UTC",
            "url": "https://bbu.world/tag/filip-fav"
        },
        {
            "id": 4,
            "label": "jakob-fav",
            "timestamp": "2019-12-25 14:30:54 UTC",
            "url": "https://bbu.world/tag/jakob-fav"
        },
        {
            "id": 5,
            "label": "jen-fav",
            "timestamp": "2019-12-25 14:30:54 UTC",
            "url": "https://bbu.world/tag/jen-fav"
        },
        {
            "id": 6,
            "label": "ola-fav",
            "timestamp": "2019-12-25 14:30:54 UTC",
            "url": "https://bbu.world/tag/ola-fav"
        },
        {
            "id": 35
        },
        {
            "id": 11,
            "label": "ola-yes",
            "timestamp": "2019-12-31 02:49:24 UTC",
            "url": "https://bbu.world/tag/ola-yes"
        },
        {
            "id": 9,
            "label": "alja-yes",
            "timestamp": "2019-12-31 02:49:24 UTC",
            "url": "https://bbu.world/tag/alja-yes"
        },
        {
            "id": 10,
            "label": "alja-no",
            "timestamp": "2019-12-31 02:49:24 UTC",
            "url": "https://bbu.world/tag/alja-no"
        },
        {
            "id": 262
        }
    ],
    "a": [
        {
            "id": 1,
            "label": "1",
            "quote": "Department",
            "timestamp": "2020-01-14 12:57:20 UTC"
        },
        {
            "id": 3,
            "label": "3",
            "quote": "collapsing chest",
            "timestamp": "2020-01-24 17:29:30 UTC"
        },
        {
            "id": 4,
            "label": "4",
            "quote": "my mind wanders into terror",
            "timestamp": "2020-01-24 17:29:57 UTC"
        },
        {
            "id": 5,
            "label": "5",
            "quote": "more grey hairs than last time",
            "timestamp": "2020-01-24 17:30:45 UTC"
        },
        {
            "id": 6,
            "label": "6",
            "quote": "more doctor’s appointments",
            "timestamp": "2020-01-24 17:31:14 UTC"
        },
        {
            "id": 7,
            "label": "7",
            "quote": "Less time",
            "timestamp": "2020-05-21 02:20:37 UTC"
        },
        {
            "id": 8,
            "label": "8",
            "quote": "each wall of her house",
            "timestamp": "2020-01-24 17:35:12 UTC"
        },
        {
            "id": 9,
            "label": "9",
            "quote": "orange",
            "timestamp": "2020-01-24 17:33:47 UTC"
        },
        {
            "id": 10,
            "label": "10",
            "quote": "Its warm",
            "timestamp": "2020-01-24 17:34:29 UTC"
        },
        {
            "id": 11,
            "label": "11",
            "quote": "pinned up in the hallway",
            "timestamp": "2020-04-07 09:09:47 UTC"
        }
    ]
}

This is a strange result. First of all, every returned tag that has a label (eight of the ten) is a topic tag, not an ethnography tag. Second, the return does not seem to be giving us what we asked for, since these annotations are not using these tags.

It looks like the return format is also a bit different than what we’re used to from Neo4j, but should be pretty easy to adapt to.
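To illustrate the adaptation, here is a minimal Ruby sketch that turns the column-oriented result above into one hash per row. It assumes that entries at the same index of each column belong to the same row, which this thread does not confirm, and `rows_from_columns` is a hypothetical helper name, not part of the plugin.

```ruby
require "json"

# Turn a column-oriented result ({"t" => [...], "a" => [...]}) into an array
# of rows, one hash per row. ASSUMPTION: all columns have the same length and
# entries at the same index belong to the same row.
def rows_from_columns(result)
  keys = result.keys
  lengths = result.values.map(&:length).uniq
  raise ArgumentError, "columns differ in length" if lengths.size > 1

  (0...(lengths.first || 0)).map do |i|
    keys.to_h { |key| [key, result[key][i]] }
  end
end

# Two entries per column, copied from the result above.
raw = <<~JSON
  {
    "t": [{"id": 35}, {"id": 11, "label": "ola-yes"}],
    "a": [{"id": 7, "quote": "Less time"}, {"id": 8, "quote": "each wall of her house"}]
  }
JSON

rows = rows_from_columns(JSON.parse(raw))
rows.each { |row| puts "tag #{row['t']['id']} <- annotation #{row['a']['id']}" }
```

If the columns turn out not to be aligned by index, this pairing would be wrong, so it is worth checking against a query whose answer is known.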

This works as expected, to get all users who have annotated with code 35:

MATCH (t:tag {id: 35})<-[:REFERS_TO]-(a:annotation)<-[:AUTHORSHIP]-(u:user ) RETURN u

Yields

{
    "u": [
        {
            "id": 438,
            "label": "Bojan",
            "avatar": "/letter_avatar_proxy/v4/letter/b/b782af/{size}.png",
            "timestamp": "2020-06-16 06:41:02 UTC",
            "url": "https://bbu.world/users/Bojan"
        },
        {
            "id": 458,
            "label": "lalalua",
            "avatar": "/letter_avatar_proxy/v4/letter/l/d26b3c/{size}.png",
            "timestamp": "2020-06-25 12:03:22 UTC",
            "url": "https://bbu.world/users/lalalua"
        }
    ]
}

Which is correct.

I think I’ve put in a fix for this; now only topics which have an allowed_group named annotator will perform graphryder syncing.

I also had totally not picked up on there being ‘annotation’ tags and ‘topic’ tags (I thought annotations were applied directly to topic tags), so I’ve swapped those out and they should be syncing now. I’ve also documented the current data model a bit here (which is based on the Discourse importer in the existing graphryder api, with a few additions)

Great @gdpelican! Before we ask @daniel to update, did you also do this?

By codes / code names do you mean AnnotatorStore::Tag and AnnotatorStore::TagName? I believe those are the active record models that map to https://edgeryders.eu/annotator/codes, in which case I’ve added them and the relevant relations to the plugin.

Alright, let’s try it! @daniel, could you update the plugin for us?

Ok, updated it. @gdpelican there were a couple of errors during the import - I updated the logfile with the new errors.

Cool, tried it! Code names and descriptions are now showing as expected.

MATCH (t:tag)<-[:REFERS_TO]-(a:annotation) RETURN t, a LIMIT 3

Yields:

{
    "t": [
        {
            "id": 1,
            "label": "hugi-yes",
            "timestamp": "2019-12-25 15:04:08 UTC",
            "url": "https://bbu.world/tag/hugi-yes"
        },
        {
            "id": 3,
            "label": "pain",
            "timestamp": "2020-05-21 00:31:21 UTC",
            "url": "https://bbu.world/tag/filip-fav",
            "name": "pain",
            "description": "Highly unpleasant physical sensation caused by illness or injury."
        },
        {
            "id": 4,
            "label": "feeling_dreadful",
            "timestamp": "2020-04-21 12:35:46 UTC",
            "url": "https://bbu.world/tag/jakob-fav",
            "name": "feeling_dreadful",
            "description": ""
        }
    ],
    "a": [
        {
            "id": 1,
            "label": "1",
            "quote": "Department",
            "timestamp": "2020-01-14 12:57:20 UTC"
        },
        {
            "id": 3,
            "label": "3",
            "quote": "collapsing chest",
            "timestamp": "2020-01-24 17:29:30 UTC"
        },
        {
            "id": 4,
            "label": "4",
            "quote": "my mind wanders into terror",
            "timestamp": "2020-01-24 17:29:57 UTC"
        }
    ]
}

Looks like there are still some associations to topic tags? Tag id 1 is a topic tag. The second and third tags (id 3 and 4) are indeed annotation tags, but their URLs are those of topic tags. Do we just need to hard reset with new data?
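One quick way to spot this kind of leftover data is to compare each tag's label with the last path segment of its URL. The sketch below is only a diagnostic idea: it assumes a tag's URL slug should equal its label, which holds for the topic tags shown here but is not confirmed for annotation tags, and `tags_with_mismatched_url` is a hypothetical helper name.

```ruby
# Return the tags whose URL slug differs from their label. Tags without a
# url (such as {"id": 35} in the earlier result) are skipped.
# ASSUMPTION: a consistent tag has a URL ending in its own label.
def tags_with_mismatched_url(tags)
  tags.select do |tag|
    url = tag["url"]
    next false if url.nil?

    url.split("/").last != tag["label"]
  end
end

# The three tags from the result above, reduced to the relevant fields.
tags = [
  { "id" => 1, "label" => "hugi-yes", "url" => "https://bbu.world/tag/hugi-yes" },
  { "id" => 3, "label" => "pain", "url" => "https://bbu.world/tag/filip-fav" },
  { "id" => 4, "label" => "feeling_dreadful", "url" => "https://bbu.world/tag/jakob-fav" }
]

stale = tags_with_mismatched_url(tags)
puts stale.map { |tag| tag["id"] }.inspect
```

On this sample it flags the tags with id 3 and 4, the two whose URLs still point at `filip-fav` and `jakob-fav`.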

Alright, I’ve updated the latest to cleanse tag descriptions the way we do with other fields

Yep I reckon that’s the case. @daniel I’ve put in a force option which will allow us to wipe the graph and re-import from ActiveRecord from scratch:

Graphryder::Importer.initialize(force: true)

Good, the re-import worked - just a few errors remaining (updated the logfile).

It looks like we need to be a bit careful with the queries we feed the endpoint. Redis crashed when I did this:

MATCH (a:annotation)-[:ANNOTATES]->( p ) RETURN labels(DISTINCT p)

@daniel is going to add Redis to be monitored so that we can restart it automatically if it crashes.

I also noticed something else.

Looking into the results now, but now the TopicTags don’t seem to be there anymore? We need those so that we can get the subgraph of annotations to posts that have been tagged with a certain topic tag.

Example:

On Edgeryders we might have two ethnographic datasets for projects A and B. We tell those datasets apart by applying “topic tags” (in Discourse), usually according to the pattern “ethno-A” and “ethno-B”. These topics and their posts are then annotated with annotator tags. Thus, to get the subgraph of ethnographic data for project B, we would add MATCH (tt:topictag {name: "ethno-B"})-[:TAGGED]->( p) to every query related to the ethno-B dataset and then work only with the nodes in ( p).
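As a rough sketch of how that scoping could be applied consistently, the Ruby helpers below prepend the topic-tag MATCH clause to the rest of a query. The `topictag` label, `name` property and `TAGGED` relationship come from the example above and depend on TopicTags being back in the graph; `topic_tag_scope` and `scoped_query` are hypothetical names, not existing plugin methods.

```ruby
# Build the MATCH clause that restricts a query to posts carrying one topic
# tag. Backslashes and double quotes in the tag name are escaped so the name
# cannot break out of the string literal.
def topic_tag_scope(tag_name)
  escaped = tag_name.gsub("\\") { "\\\\" }.gsub('"') { '\\"' }
  "MATCH (tt:topictag {name: \"#{escaped}\"})-[:TAGGED]->(p)"
end

# Prepend the scope to the rest of a query, which can then refer to (p).
def scoped_query(tag_name, rest)
  "#{topic_tag_scope(tag_name)} #{rest}"
end

puts scoped_query("ethno-B", "MATCH (a:annotation)-[:ANNOTATES]->(p) RETURN a LIMIT 10")
```

Building queries by string interpolation is only reasonable for trusted input. If the client in use supports query parameters, passing the tag name as a parameter would be the safer choice.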

We should do some more testing away from production until we can confirm that it’s stable. I’ve tried setting up our fork locally but I’m running into problems.

I will also need instructions for how to install RedisGraph on the main redis, ping @daniel.

Redis is now monitored by monit. I will test later today whether the automatic restart works properly.

When Rails is running in development mode, the RedisGraph module included in discourse-graphryder will be loaded automatically. I haven’t installed Redis on my local machine so far, so I can’t give any instructions on this. On the server we installed the RedisGraph module system-wide.
