Graphryder 2.0 – Workplan

The Graphryder plugin required a newer Discourse version than what we had installed. So we used the opportunity to upgrade the multisite installation and bring it up to date. :slight_smile: We are now on Discourse 2.4.5.

The plugin is not yet installed, as deployments with the Graphryder plugin included still fail ("Redis::CommandError: ERR Error loading the extension. Please check the server logs."). I’ll look into it later, but first I have to get some sleep. :sleeping:

We’re not able to get the included redisgraph.so module to work on the ER server (invalid ELF header), so we built a package for the ER server and installed it globally. This also seems more resilient, as a faulty Redis module can take down the redis-server, and when the redis-server is down all websites are down as well… Does this solution work for you @gdpelican @hugi ? If so, please update the discourse-graphryder plugin with redisgraph.so excluded.

@daniel That’s fine by me if we can assume that the redis instance has RedisGraph installed outside of the plugin; I’ve updated the plugin so that it just shows a warning if it detects that RedisGraph is not installed.

Was that ELF header error the cause of the Redis::CommandError earlier, or are there further things to troubleshoot?

The discourse-graphryder plugin is now installed on the multisite installation.

There were some errors during the initial import like

errMsg: Invalid input ' ': expected STARTS WITH, SET or START line: 1, column: 910, offset: 909 errCtx:

but at least some content was imported. For example,
Graphryder::Query.instance.perform("MATCH (u:user) RETURN u, count(*) as count") returns some data when executed on the Rails console.

The ELF header error was a different issue. The Redis::CommandError was due to a permission problem: either something related to our deployment setup, or the Redis user on the server did not have sufficient rights to access the redisgraph.so file.

Would it be possible to share the log of that initial import somehow? That’ll help me greatly with squashing any bugs around unexpected input it’s receiving from the prod data.

PS thanks for your effort here! :smiley:

Sure. I’ll send you a link to the log.

Not getting this to work yet with curl -X POST https://bbu.world/graphryder/query -d '{"query": "MATCH (u:user) RETURN u, count(*) as count"}' -H "Content-Type: application/json" -H "Api-Key: MYAPIKEY".
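For anyone who prefers to test from Ruby rather than curl, here is a minimal sketch of the same request. It only builds the request object and does not send it. The helper name build_graphryder_request is made up for illustration; the endpoint path and the two headers are taken from the curl command above.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper (not part of the plugin): builds the same POST request
# as the curl command, without sending it.
def build_graphryder_request(base_url, cypher, api_key)
  uri = URI.join(base_url, "/graphryder/query")
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request["Api-Key"] = api_key
  # The body is a JSON object with a single "query" key, as in the curl call.
  request.body = JSON.generate(query: cypher)
  request
end

# To actually send it (needs network access and a valid key):
#   req = build_graphryder_request("https://bbu.world", "MATCH (u:user) RETURN u", "MYAPIKEY")
#   res = Net::HTTP.start(req.uri.host, req.uri.port, use_ssl: true) { |http| http.request(req) }
#   puts res.code, res.body
```

Printing the response code and body this way should make it easier to see whether the request is rejected or simply returns nothing.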

What error(s) are you seeing specifically?

Cool, from looking at the log here it looks like there were 15 failed post imports (of ~3k total), and all of the other models imported successfully. Not perfect, but not bad for a first pass.

The ones that I see are failing on single quotes within code blocks, like this:

print('this is going to fail')

My guess is that Discourse escapes single quotes with something like ‘ for regular posts, but doesn’t do so within code tags. I’ll put in some kind of fix for that and then we can try again.
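One possible shape for such a fix, as a rough sketch rather than the plugin's actual code: escape backslashes and single quotes before the text is placed inside a single-quoted Cypher string. The helper names are hypothetical, and this assumes the graph's query parser accepts backslash-escaped quotes the way openCypher string literals do.

```ruby
# Hypothetical helpers, not taken from the plugin.

# Escapes backslashes and single quotes so the text can sit inside a
# single-quoted Cypher string literal. The block form of gsub is used because
# a replacement *string* containing \' would be read by Ruby as a
# back-reference (the text after the match) instead of a literal \'.
def escape_cypher_string(value)
  # nil.to_s is "", so a nil input yields an empty string instead of raising.
  value.to_s.gsub(/['\\]/) { |char| "\\#{char}" }
end

# Wraps the escaped text in single quotes, ready to interpolate into a query.
def cypher_string_literal(value)
  "'#{escape_cypher_string(value)}'"
end

puts cypher_string_literal("print('this is going to fail')")
```

If the query layer supports parameters, passing the text as a parameter instead of interpolating it would sidestep the escaping problem entirely.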

I’m just not getting anything; it returns the same as https://bbu.world/graphryder/query.
If you log in to https://bbu.world with your ER account, I can add you to the annotator user group for testing and send you an API key.

Alright, I’ve logged in there so should have an account now

Cool, just added you to the annotator cat. API key in PM.

@hugi Is the permit-api-cors plugin installed on https://bbu.world?

@daniel I’ve updated the discourse-graphryder plugin to account for apostrophes in the cooked input; can we give the import another go to see if there are any stray issues I’ve missed?

Yes, permit-api-cors plugin is installed. Just tried an API request and it seems to work. :slight_smile:

I installed the new version of the discourse-graphryder plugin, ran the initializer again and updated the log file with the output (same link as before). There are still a couple of errors, but far fewer than before.

Cool, I’ve put in some code to sanitize topic titles and annotation quotes as well; ready for another go

Two requests:

I would like us to also track something that the old Graphryder left out – the user who created an annotation. Could you add that relation @gdpelican? I suggest we call that relation “annotated”.

If we implemented this, I could use it immediately to answer a question we have in BBU – a list of all codes that have been used twice or more, grouped by the number of times they have been used, and the number of ethnographers that have used them. For a project like BBU where we are about to have eight ethnographers working together, it starts to become really interesting to track which codes are being applied by multiple ethnographers, and which codes are applied very often but only by one of the ethnographers.
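To make the requested report concrete, here is a sketch of the aggregation in plain Ruby, independent of the graph. The sample rows, the constant SAMPLE_ANNOTATIONS and the method code_usage_summary are all hypothetical; real input would be one (code, annotating user) pair per annotation.

```ruby
# Hypothetical sample data: one row per annotation.
SAMPLE_ANNOTATIONS = [
  { code: "trust", annotator: "alice" },
  { code: "trust", annotator: "bob" },
  { code: "trust", annotator: "alice" },
  { code: "care",  annotator: "carol" },
  { code: "care",  annotator: "carol" },
  { code: "money", annotator: "alice" }
].freeze

# Returns codes used at least min_uses times, grouped by how often they were
# used (most used first), with the number of distinct annotators per code.
def code_usage_summary(annotations, min_uses: 2)
  per_code = annotations.group_by { |row| row[:code] }.map do |code, rows|
    { code: code, uses: rows.size, annotators: rows.map { |r| r[:annotator] }.uniq.size }
  end

  per_code.select { |row| row[:uses] >= min_uses }
          .group_by { |row| row[:uses] }
          .sort_by { |uses, _rows| -uses }
          .to_h
end

code_usage_summary(SAMPLE_ANNOTATIONS).each do |uses, rows|
  rows.each { |row| puts "#{uses} uses: #{row[:code]} (#{row[:annotators]} annotators)" }
end
```

In this sample, "trust" is applied often and by two people, while "care" is applied twice but only by one person, which is the distinction the report is meant to surface.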

And, less importantly, do you think we could somehow access the graph with RedisInsight? That would be amazingly useful to validate the data and test out Cypher queries. Having a good Cypher graph GUI is the only thing I will miss from Neo4j. RedisInsight asks for this:

Does the Graphryder data live on its own instance that could be accessed this way?

Sure, that should be a trivial one; I’ll throw it in now.

EDIT: Turns out I’d put that in there already under an ‘AUTHORSHIP’ relation, heh :sweat_smile:

Hmmm possibly. I highly doubt Discourse supports external access to its redis instance by default, but I know there are some instances out there that have external redis instances and all sorts of wacky setups, so it must be possible.

Another possibility here is that the graphryder/query endpoint will accept arbitrary Cypher queries, so we could build a dead simple web interface that hits that and displays the results without too much effort, maybe requiring that you supply your own API key.

No more errors! :slight_smile:

Importing 418 Users...
Importing 23 Tags...
Importing 3075 Posts...
Importing 1054 Topics...
Importing 1581 TopicTags...
Importing 5073 AnnotatorStore::Annotations...
=> nil

Note: Just a little fix was required: FIX undefined method `gsub' for nil:NilClass · edgeryders/discourse@ba36237 · GitHub

Sweeeet, I’ve applied that fix to the plugin repo as well. This is really great news. :slight_smile:

I’ll continue hammering on the Graphryder API end of this puzzle, but this is an important piece done.

Ok, I’ve tried it out. Getting the Cypher queries to work, but results are not right yet.

I tried this query: MATCH (p:tag) RETURN p, count(p) as count

  • It returned a list of annotation tags, but just the ids. We also need the code name, which should be accessible through Rails Active Record. It also returned some topic tags, so it looks like the two kinds of tags are mixed together now.

  • count turns out as 1
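A note on the count: in Cypher, any non-aggregated item in RETURN acts as a grouping key, so RETURN p, count(p) counts within one group per node and each group holds exactly one node. A small Ruby analogy of that behaviour, using a made-up TagNode struct as a stand-in for graph nodes:

```ruby
# Hypothetical stand-in for tag nodes; every node is distinct, as in a graph.
TagNode = Struct.new(:id)
NODES = [TagNode.new(1), TagNode.new(2), TagNode.new(3)].freeze

# Analogous to: MATCH (p:tag) RETURN p, count(p)
# p is the grouping key, so each group contains a single node.
def count_per_node(nodes)
  nodes.group_by(&:id).transform_values(&:size)
end

# Analogous to: MATCH (p:tag) RETURN count(p) AS count
# No grouping key, so all nodes land in one group.
def count_all(nodes)
  nodes.size
end

p count_per_node(NODES)
p count_all(NODES)
```

So a count of 1 per row is what that query asks for; dropping p from the RETURN clause should give the total instead.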

There is also another issue that we should look into solving, which I hadn’t thought about. Right now, the posts in the graph include private messages. That’s not great since it means that we can’t give people access to the SSNA data endpoint without compromising private data. This wasn’t a problem with the old way of doing things, so I hadn’t considered it.

This definitely needs to be fixed before we use it for the Edgeryders platform. It seems logical to me that the SSNA graph should only include the data accessible to the annotator group. Would it be complicated to add a configuration to the plugin that limits the import to content accessible by a certain user group?
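As a starting point for discussion, here is a sketch of the filtering rule I have in mind, written against a simplified stand-in rather than real Discourse models. TopicStub, reader_groups and both method names are hypothetical, and it assumes private messages can be recognised by an archetype of "private_message".

```ruby
# Hypothetical, simplified stand-in for a topic. reader_groups is nil when the
# topic is readable by everyone, otherwise the list of groups allowed to read it.
TopicStub = Struct.new(:id, :archetype, :reader_groups, keyword_init: true)

# A topic is importable when it is not a private message and is either
# unrestricted or readable by the configured group.
def importable?(topic, group_name)
  return false if topic.archetype == "private_message"

  topic.reader_groups.nil? || topic.reader_groups.include?(group_name)
end

# Keeps only the topics that pass the rule above.
def importable_topics(topics, group_name)
  topics.select { |topic| importable?(topic, group_name) }
end

topics = [
  TopicStub.new(id: 1, archetype: "regular", reader_groups: nil),
  TopicStub.new(id: 2, archetype: "private_message", reader_groups: nil),
  TopicStub.new(id: 3, archetype: "regular", reader_groups: ["annotator"]),
  TopicStub.new(id: 4, archetype: "regular", reader_groups: ["staff"])
]
p importable_topics(topics, "annotator").map(&:id)
```

The group name would come from a plugin setting, and posts and annotations would follow the topic they belong to.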