Setting up Graphryder API with new dataset

hugi · September 29, 2018, 6:19pm

As per agreement with @alberto, I’ve been working on setting up a new instance of the Graphryder API to query the OpenVillage dataset. I’ve made some progress, but now I’m running into some issues that I’d like to check in with @melancon and @jason_vallet about.

What works

I’ve set up a new Neo4j database on AWS and installed the graphAware plugins.
Fixed a small bug resulting from changes in the neo4j python package
Got the API and dashboard up and running locally

Found the right configuration to import the correct dataset from ER Discourse. Not all of the needed variables are in the example config file, so that should be updated and documented.

[importer_discourse]
abs_path = https://edgeryders.eu/
users_rel_path = administration/annotator/users
user_rel_path = u/
tag_rel_path = tags/
tag_focus = ethno-openvillage-mena
topic_rel_path = t/
posts_rel_path = posts/
codes_rel_path = administration/annotator/codes
annotations_rel_path = administration/annotator/annotations
admin_api_key = MY_API_KEY
admin_api_username = MY_USER_NAME
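Since the example config file lacks some of these variables, a quick pre-flight check can list what is missing before the importer runs. The following is only a sketch using Python's standard configparser module. The helper name `missing_importer_keys` is hypothetical, and treating every key listed above as required is an assumption, not something the API is known to enforce.

```python
import configparser

# Keys copied from the [importer_discourse] section above.
# Assumption: all of them are required by the importer.
REQUIRED_KEYS = [
    "abs_path", "users_rel_path", "user_rel_path", "tag_rel_path",
    "tag_focus", "topic_rel_path", "posts_rel_path", "codes_rel_path",
    "annotations_rel_path", "admin_api_key", "admin_api_username",
]


def missing_importer_keys(config_text, section="importer_discourse"):
    """Return the required keys that are absent from the given section."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(section):
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if not parser.has_option(section, key)]


# A deliberately incomplete config, to show what the check reports.
example = """
[importer_discourse]
abs_path = https://edgeryders.eu/
tag_focus = ethno-openvillage-mena
"""
print(missing_importer_keys(example))
```

An empty list would mean the section carries every key shown in this post.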

Successfully ran the HardUpdateFromEdgeRydersDiscourse call to populate the new Neo4j database with ethno-openvillage-mena dataset. Just to demonstrate, here is a (slightly pointless) Neo4j cypher query showing the shortest path between me and the “coffee” tag:

[Image: Screen Shot 2018-09-29 at 19.41.00.png]
Dashboard conversation view renders correctly

[Image: Screen Shot 2018-09-29 at 19.45.03.jpg]

What does not work yet

After this, I run into problems. The next step should be to ask the API to generate the graphs by querying the Neo4j database and creating graph data files with the Tulip library. This is done by calling api-url/generateGraphs. The call seems to succeed, and files are generated in data/tlp which look something like this:

(tlp "2.3"
(date "09-29-2018")
(comments "This file was generated by Tulip.")
(nb_nodes 1958)
;(nodes <node_id> <node_id> ...)
(nodes 0..1957)
(nb_edges 4198)
;(edge <edge_id> <source_id> <target_id>)
(edge 0 0 1951)
(edge 1 0 1935)
(edge 2 0 1914)
(edge 3 0 1897)
(edge 4 0 1896)
...
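One way to confirm that a generated file is at least internally consistent is to compare the declared counts with what is actually listed. This is a rough sketch that only understands the few TLP lines shown above; `summarize_tlp` is a hypothetical helper, and it assumes node ids run from 0 to nb_nodes - 1, as the `(nodes 0..1957)` line suggests.

```python
import re


def summarize_tlp(text):
    """Compare the declared node/edge counts with the edges actually listed."""
    nb_nodes = int(re.search(r"\(nb_nodes (\d+)\)", text).group(1))
    nb_edges = int(re.search(r"\(nb_edges (\d+)\)", text).group(1))
    # Lines starting with ';' are comments and are skipped by the '^\(' anchor.
    edges = re.findall(r"^\(edge (\d+) (\d+) (\d+)\)", text, flags=re.MULTILINE)
    out_of_range = [
        e for e in edges if int(e[1]) >= nb_nodes or int(e[2]) >= nb_nodes
    ]
    return {
        "declared_nodes": nb_nodes,
        "declared_edges": nb_edges,
        "listed_edges": len(edges),
        "out_of_range_edges": len(out_of_range),
    }


# The truncated excerpt from this post; a real file lists every edge.
sample = """(tlp "2.3"
(nb_nodes 1958)
(nodes 0..1957)
(nb_edges 4198)
;(edge <edge_id> <source_id> <target_id>)
(edge 0 0 1951)
(edge 1 0 1935)
(edge 2 0 1914)
(edge 3 0 1897)
(edge 4 0 1896)
"""
print(summarize_tlp(sample))
```

On a complete file, `listed_edges` should equal `declared_edges` and `out_of_range_edges` should be zero. On the excerpt it naturally reports only the five edges pasted here.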

However, going to the Detangler View just renders empty views. On the backend, it looks like something has not loaded correctly, giving error traces ending in lines like this:

  File "/Users/hugiasgeirsson/gitrepos/graph-ryder-api/routes/tulipr/tulip_layout.py", line 25, in get
    private_gid = self.gid_stack[public_gid]
KeyError: 'usersToUsers'

There are actually three of these errors, ending in KeyErrors for usersToUsers, tagToTags and commentAndPost.
It looks like there are supposed to be keys in gid_stack that aren’t there.
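To make this failure easier to read, the lookup could report which graphs are known instead of raising a bare KeyError. The sketch below is not the code in tulip_layout.py. It assumes `gid_stack` is a plain dict mapping public graph ids to private ones, which is what the traceback suggests, and the names `GraphNotGenerated` and `resolve_private_gid` are hypothetical.

```python
class GraphNotGenerated(KeyError):
    """Raised when a public graph id has no entry in the id mapping."""


def resolve_private_gid(gid_stack, public_gid):
    """Look up a private graph id, failing with a descriptive message."""
    try:
        return gid_stack[public_gid]
    except KeyError:
        known = ", ".join(sorted(gid_stack)) or "none"
        raise GraphNotGenerated(
            "graph '{}' has not been generated yet (known graphs: {})".format(
                public_gid, known
            )
        ) from None


# Hypothetical state: only one graph has been generated so far.
gid_stack = {"complete": "a1b2c3"}
try:
    resolve_private_gid(gid_stack, "usersToUsers")
except GraphNotGenerated as err:
    print(err)
```

Because the exception subclasses KeyError, any existing handler that catches KeyError would keep working.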

Going to Code View or Code View Full simply ends up crashing python. Here’s all I get from the terminal before it crashes after going to Code View:

"GET /count/post/1443549843638/1538244243638 HTTP/1.1" 200 -
"GET /users HTTP/1.1" 200 -
"GET /count/comment/1443549843638/1538244243638 HTTP/1.1" 200 -
"GET /tags/0/1538244243655/10 HTTP/1.1" 200 -
"GET /tags HTTP/1.1" 200 -
Initializing
Export
127.0.0.1 - - [29/Sep/2018 20:04:05] "GET /generateTagDateGraph/2940/0/1538244243655 HTTP/1.1" 200 -

Right after that, python crashes.

Some background about my setup. I’m running the Neo4j database on AWS, having set it up with the template available on the AWS marketplace.

I’m running the API and dashboard locally, on OS X 10.13, running the API on Python 3.6.5.

So, @melancon or @jason_vallet, do you have any hunches as to what might be wrong? I can’t seem to find any additional instructions in the readme.

melancon · September 30, 2018, 6:09am

Hi @hugi
Wow, you are a determined person. The questions you ask are far too deeply rooted in the code for me; I won’t be able to help unless I can find the time to dig into it and track down the problem. @jason_vallet will be much better placed than I am to help and point you to where you should look. I’ll email him to see if he can help. Guy

hugi · October 9, 2018, 12:37pm

Jason responded by email. I haven’t had time to follow up yet, but I’ll post his response here for future reference and follow up on it soon.

Hey Hugi, nice work so far, looks like you are almost there!

While your first problem with the KeyErrors should be easily fixed, the second one is a tad more worrisome to me.

So quite simply, you only have to create the UsersToUsers and CommentAndPosts graphs (and redo after each api restart). Just go to the Settings page, check the corresponding check-boxes and click on the “Regenerate Graphs” button (also a great feature to keep at hand when flask starts having some troubles).
In the case of the TagToTag graph, it ‘should’ compute automatically if non-existent. Well, you can try to generate it using the method above, but the fact that python crashes directly after completing the GenerateTagDateGraph route must indicate that there is more going on. As the next route called after the generation is the drawing operation (/draw/tagsToTags/FM%5E3%20(OGDF)), I think (and fear) the problem may come from Tulip and Python.

What is your current Python version? I have not tested the Tulip package in depth with the latest version (3.7.X), so I do not know if it performs well. During Opencare, I was working on an LTS (conservative) installation using Python 3.5.2 and Tulip 4.10/11.

While I think that Tulip 5.X should be okay, if I were you, I would try going down to Python 3.5 to check if the problem gets solved.

If not, let me know. I will need a bit more info on your setup to help troubleshoot the issue.

hugi · October 11, 2018, 9:18am

I was using Python 3.6.5. I’ve now tried running it with Python 3.5, after first generating the graphs (I didn’t realise they had to be regenerated after each restart).

Now the API crashes as soon as I go into Code View, Code View Full or Detangler View.

$ python --version
Python 3.5.0

$ python app.py
 * Running on http://localhost:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger pin code: 249-287-545
Initializing
Read Nodes
Read Edges
Export
127.0.0.1 - - [11/Oct/2018 10:40:19] "GET /generateFullGraph HTTP/1.1" 200 -
Initializing
Read Users
Read Edges
Read Edges
Export
127.0.0.1 - - [11/Oct/2018 10:41:27] "GET /generateUserGraph HTTP/1.1" 200 -
Initializing
Read Edges
Export
127.0.0.1 - - [11/Oct/2018 10:41:59] "GET /generateCommentAndPostGraph HTTP/1.1" 200 -
Initializing
Initializing
Read Posts
Read Comments
Compute Tag-Tag graph
Filter occ
Export
127.0.0.1 - - [11/Oct/2018 10:42:18] "GET /generateTagFullGraph/1/0/1539247324800/1 HTTP/1.1" 200 -
127.0.0.1 - - [11/Oct/2018 10:42:27] "GET /count/post/1444552947163/1539247347163 HTTP/1.1" 200 -
127.0.0.1 - - [11/Oct/2018 10:42:27] "GET /users HTTP/1.1" 200 -
127.0.0.1 - - [11/Oct/2018 10:42:27] "GET /tags/0/1539247347168/10 HTTP/1.1" 200 -
127.0.0.1 - - [11/Oct/2018 10:42:27] "GET /count/comment/1444552947163/1539247347163 HTTP/1.1" 200 -
Initializing
127.0.0.1 - - [11/Oct/2018 10:42:28] "GET /tags HTTP/1.1" 200 -
Export
127.0.0.1 - - [11/Oct/2018 10:42:28] "GET /generateTagDateGraph/2940/0/1539247347168 HTTP/1.1" 200 -

$

I’ve tried upgrading Tulip to 5.1.0 but that yields the same result.

My setup is a MacBook Pro from 2016, running macOS High Sierra. I’ve set pyenv to use Python 3.5.0 in the API directory.
I’ll consult Jason to know where to go next. I’m curious to know if it’s ever been attempted to run the API on MacOS.

It looks like the tulip package is being tested and kept up to date for MacOS, but even Tulip 5.1.0 was released only a couple of months after the release of MacOS High Sierra. I’m noting this because a bug on MacOS was fixed in the 4.8.0.post1 release where tulip would crash when calling OGDF layout algorithms. Perhaps this problem has returned with newer versions on MacOS but not been caught yet.

hugi · October 11, 2018, 5:23pm

We’re continuing troubleshooting, running some of the Tulip stuff line by line in Python now. Posting logs here for documentation, but everything checks out.

From Jason:

I did not try deploying the api on a Mac personally but Norbert has used his Mac Air almost daily when he was working on GR and another related project and everything was working normally.

Looking at your log I can see that everything seems fine as each graph initialisation is going smoothly and the graph routes do not indicate any problem during the requests.

Normally, the next step should be the layout, so we can try to see if your installation handles the drawing algorithm as expected

I ran some lines of Python, but it all looks like it checks out, so we’ll have to dig deeper.

Python 3.5.0 (default, Oct 10 2018, 22:07:05)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin

>>> # import stuff
... from tulip import *
Tulip Python Plug-in "SIGMA JSON Export" loaded, Author: "Norbert Feron", Date: "01/06/2016", Release: "1.0"
>>> import os
>>> import tempfile
>>> # generate a graph
... ds = tlp.getDefaultPluginParameters('Grid Approximation')
>>> g = tlp.importGraph('Grid Approximation', ds)
>>>
>>> # apply a layout
... ds = tlp.getDefaultPluginParameters('FM^3 (OGDF)')
>>> g.applyLayoutAlgorithm('FM^3 (OGDF)', g['viewLayout'], ds)
(True, '')
>>>
>>> # export the graph
... ds = tlp.getDefaultPluginParameters('SIGMA JSON Export')
>>> path = tempfile.mkstemp()
>>> tlp.exportGraph('SIGMA JSON Export', g, path[1], ds)
True
>>>
>>> # test if file is valid
... os.path.isfile(path[1])
True
>>> exit()

hugi · October 24, 2018, 11:29pm

No reply yet, so started digging into this on my own now.
I’ve managed to narrow down that the crash happens in tulip_layout.py, at line 33, when calling applyLayoutAlgorithm from the Tulip python library.

tulip_graph.applyLayoutAlgorithm(layout)

Printing the variable ‘layout’ shows that it’s using string ‘FM^3 (OGDF)’ as expected.
As seen in the post above, just running this call as a test in a python interpreter works just fine.
I also tried running the code above, but instead of running

g.applyLayoutAlgorithm('FM^3 (OGDF)', g['viewLayout'], ds)

I instead ran

g.applyLayoutAlgorithm('FM^3 (OGDF)')

Which also works fine.

For some reason, the graph as generated by the API does not work and crashes python. However, the Tulip library yields no debugging information at all, it just dies silently. Since it’s not included in the API code but installed globally as a requirement, digging into it would require me to fork it and start messing with the code of the applyLayoutAlgorithm call. I might also have to start reading up on and understanding what data it is expecting and how what we are throwing at it with my setup is messing it up. In short, this is getting hairy.
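Short of forking Tulip, Python's standard faulthandler module can at least show which Python line was executing when the interpreter dies on a fatal signal. This is a minimal sketch, assuming the crash is a segfault or abort inside native code. The log file name is a hypothetical choice, and the dump covers only the Python-level stack, not the C++ frames inside Tulip.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
crash_log_path = os.path.join(tempfile.gettempdir(), "graphryder_crash.log")

# The file object must stay open for as long as the handler is enabled.
crash_log = open(crash_log_path, "w")

# On a fatal signal (SIGSEGV, SIGABRT, ...), dump the Python traceback
# of every thread into the log file before the process dies.
faulthandler.enable(file=crash_log, all_threads=True)

print("faulthandler enabled:", faulthandler.is_enabled())
print("crash log:", crash_log_path)
```

The same handler can be switched on without touching the code by starting the interpreter with `python -X faulthandler app.py` or by setting the `PYTHONFAULTHANDLER` environment variable, in which case the dump goes to stderr.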

Before I start going down the path of ripping the tulip library apart, I will probably want to try installing the API on another machine, like a VPS running Ubuntu or something similarly standard. Perhaps there is some incompatibility between recent MacOS releases and some dependencies of the tulip library.

Another possibility is that some other branch of the code is actually running for the OpenCare data, which would also explain why the documentation and example conf file were so out of date. Perhaps this has all been fixed in the version running on some mysterious server somewhere.

alberto · November 5, 2018, 5:30pm

@hugi, I have news. So spoke @melancon:

I happened to play with Tulip and Flask with students no later than yesterday and I had a systematic crash when calling the applyLayout method.

However, the crash comes from calling the applyLayout with 'FM^3 (OGDF)' as parameter (layout algorithm name). The crash does not happen when you call other layouts, as 'GEM (Frick)' for example. [It] is quite ok but a bit slower – nothing unacceptable with the sizes GraphRyder is dealing with.

hugi · November 5, 2018, 5:54pm

Ok, got it. I will attempt to see if I can rewrite GraphRyder to avoid the problematic algorithm. But it’s a bit worrying, since more of the layout methods might have the same problem. Whenever this bug shows up the API crashes, which is nasty since it renders the service unusable until someone notices and restarts the API. We have had this happen to the OpenCare GR a couple of times: it has gone down and needed to be restarted by the guys in Bordeaux. The reason might have been that someone chose a new layout method when playing with it and triggered a crash.
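One general way to keep a crashing native call from taking the whole service down is to run it in a child process and inspect the exit status. The sketch below illustrates only that isolation pattern, using the standard subprocess module. `run_isolated` is a hypothetical helper, the crash is simulated with `os.abort()`, and wiring an actual Tulip layout call into the child is not shown.

```python
import subprocess
import sys


def run_isolated(python_code, timeout=60):
    """Run a snippet in a child interpreter; return (ok, returncode).

    A crash in the child (for example a segfault in a native library)
    shows up as a non-zero return code instead of killing the caller.
    """
    result = subprocess.run([sys.executable, "-c", python_code], timeout=timeout)
    return result.returncode == 0, result.returncode


# A child that finishes normally.
ok, code = run_isolated("print('layout finished')")
print("normal run ok:", ok)

# A child that dies abruptly, standing in for a crashing layout call.
crashed_ok, crashed_code = run_isolated("import os; os.abort()")
print("crashing run ok:", crashed_ok)
```

The parent process survives both runs, so an API built this way could answer with an error for the failed layout rather than going down.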

It’ll be some time before I can get to this again, I’m in a very intense period of other work right now. But at least I know now that it’s not just my own setup that is the problem.

alberto · November 5, 2018, 6:38pm

Ping @amelia.

amelia · November 7, 2018, 12:24am

Ok. So no MENA code viz for a while?

hugi · November 7, 2018, 12:38am

Last mile to get this up and running is taking longer than expected, so no guarantees since it’s a bug that we’re having a hard time squashing.

bpinaud · November 7, 2018, 7:28am

Hello,
May I have the code from @melancon or the graph you are using? Please also tell me where I can find tulip_layout.py. I need more information to try to reproduce the bug. I can also easily use Tulip compiled in debug mode.
Which version of Python are you using? And also which version of Tulip (installed via pip?)?

Thanks,
Bruno

alberto · November 7, 2018, 9:15am

But, @hugi: does the dashboard work if you replace the string specifying the algorithm in line 33? I am guessing the variable layout contains the visualization algo, so it would be a (relatively simple) matter of setting it so that it contains 'GEM (Frick)' instead. With this, the dashboard would work, right? And, since the MENA corpus is already in the Neo4j, this would make the data explorable to us. Right? If it is so, I would ask you to put it up, so we decouple the possibly long process for a deep fix of the library by @bpinaud and the small analysis @amelia, @Nermine and I have been waiting to run. Doable?
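As a stopgap, the substitution could also be made in one place instead of editing the string by hand. This is a small sketch of that idea. The mapping and the helper name `safe_layout_name` are hypothetical, and only the 'FM^3 (OGDF)' to 'GEM (Frick)' pair comes from this thread.

```python
# Layout names reported in this thread to crash, mapped to a substitute.
LAYOUT_FALLBACKS = {"FM^3 (OGDF)": "GEM (Frick)"}


def safe_layout_name(requested):
    """Return a substitute for layouts known to crash, else the request itself."""
    return LAYOUT_FALLBACKS.get(requested, requested)


for name in ["FM^3 (OGDF)", "GEM (Frick)", "Circular"]:
    print(name, "->", safe_layout_name(name))
```

The result of `safe_layout_name(layout)` would then be what gets passed to applyLayoutAlgorithm, and other layouts found to crash could be added to the mapping.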

hugi · November 7, 2018, 9:24am

You can see tulip_layout.py here.

Versions of all libraries are in the requirements file.

And this is my setup:

Python 3.5.0 (default, Oct 10 2018, 22:07:05)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin

Replacing the layout string with 'GEM (Frick)' does make the dashboard work! However, not perfectly: the user graph does not display for some reason. But we are now getting somewhere.

I can put it up, but it’s still a couple of hours’ work since I’ve only been testing it locally. I still have to set up a server to run the API and dashboard. I will try to get to this ASAP, hopefully tonight.

alberto · November 7, 2018, 9:28am

Brilliant! BTW, once this thing is up I can download the graph as a Tulip file and handcraft “bespoke” visualizations for @amelia if she likes the other algo better.

melancon · November 7, 2018, 12:57pm

The code is pretty simple: it calls the FM^3 OGDF layout plugin. I remember it crashed in a previous install (you may ask @jason_vallet for details), and I ran into the same problem when running the plugin from OS X.

bpinaud · November 7, 2018, 2:47pm

Did you try the new version of Tulip Python? We recently released version 5.2.1 of tulip Python with tons of bugfixes since version 4.10. You have nothing to change in your code. The update should be straightforward.

hugi · November 7, 2018, 2:48pm

Yes. I tried upgrading Tulip to 5.1.0 but that yields the same result. I did not try 5.2.1 though, so I can have a crack at it.

alberto · November 8, 2018, 3:37pm

Please, let me and @amelia know when you are done!

hugi · November 8, 2018, 4:36pm

I will. I’ve had some pressing work getting in the way in the last couple of days, I will get to it soon.