IT Development Plan 2019-2021

matthias · April 16, 2019, 8:30pm

For archiving coded content, we can create / extend the export script so that it looks for audio / video content referenced in Discourse posts, and if that content has codings associated to it, it would download the raw content by calling a command-line tool like youtube-dl.
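A minimal sketch of how such an export step could look in Python. Several things here are assumptions, not part of the existing export script: the list of media hosts, the post dictionary keys `raw` and `has_codings`, and the output filename template. Only the `youtube-dl -o <template> <url>` invocation is taken from the tool itself.

```python
import re
import subprocess
from typing import Callable, Dict, Iterable, List
from urllib.parse import urlparse

# Assumption: only these hosts are treated as audio / video sources.
MEDIA_HOSTS = ("youtube.com", "youtu.be", "vimeo.com", "soundcloud.com")
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")


def is_media_url(url: str) -> bool:
    """True if the URL's hostname is one of MEDIA_HOSTS or a subdomain of it."""
    host = urlparse(url).hostname or ""
    return any(host == h or host.endswith("." + h) for h in MEDIA_HOSTS)


def find_media_urls(post_text: str) -> List[str]:
    """Return all media URLs referenced in the raw text of a post."""
    return [u for u in URL_PATTERN.findall(post_text) if is_media_url(u)]


def build_download_command(url: str, out_dir: str) -> List[str]:
    """Build the youtube-dl call; -o sets the output filename template."""
    return ["youtube-dl", "-o", f"{out_dir}/%(id)s.%(ext)s", url]


def archive_coded_media(
    posts: Iterable[Dict],
    out_dir: str,
    run: Callable = subprocess.run,
) -> List[str]:
    """Download media only for posts that have codings attached.

    Each post is a dict with the hypothetical keys 'raw' (post text)
    and 'has_codings' (bool). Returns the list of URLs handed to youtube-dl.
    """
    downloaded = []
    for post in posts:
        if not post["has_codings"]:
            continue  # uncoded content is not archived
        for url in find_media_urls(post["raw"]):
            run(build_download_command(url, out_dir), check=True)
            downloaded.append(url)
    return downloaded
```

Passing `run` as a parameter keeps the download step replaceable, so the selection logic can be checked without network access or an installed youtube-dl.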

alberto · April 16, 2019, 8:55pm

Great.

matthias · May 17, 2019, 10:51pm

As an update from the development side: as of yesterday, this feature is ready and deployed on edgeryders.eu by @daniel. This means ethnographers can now translate code names (their own or others’) into as many languages as they like. They will see code names in their preferred language (setting one is now mandatory; otherwise an error occurs). If a code name is not yet translated into their preferred language, they will see it in English, and if it is not available in English, then in the language the code was originally created in.
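The fallback order described above can be sketched as follows. This is only an illustration: the function name, the dictionary of translations keyed by language code, and the `original` parameter are assumptions and do not reflect the actual structure of codes.json.

```python
from typing import Dict, Optional


def resolve_code_name(
    names: Dict[str, str],
    preferred: str,
    original: str,
) -> Optional[str]:
    """Pick the code name to display.

    names: hypothetical mapping of language code -> translated code name.
    Order: preferred language, then English, then the language the code
    was originally created in. Returns None if none of them is present.
    """
    for language in (preferred, "en", original):
        if language in names:
            return names[language]
    return None
```

For example, a code created in Polish with translations `{"pl": "zaufanie", "en": "trust"}` would be shown as "trust" to a user whose preferred language is German.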

This feature also introduces a small API change in codes.json, which should be self-explanatory for the time being (example), and there is now also a new API endpoint languages.json that simply lists the available languages for code names.

I still have to update the formal documentation to cover this new feature. Also, don’t worry about the look & feel for now: it’s no better or worse than the rest of Open Ethnographer, and we will get to it when we are done with the whole multimedia coding feature (as that is a deliverable) and can start to work on usability improvements.

We are now starting the work on the multimedia coding feature, beginning with the one part where the requirements are (I think) already clear: image coding. This will allow selecting a rectangular area of an embedded image with the mouse and adding codes to it, just as one would usually add codes to selected text.

@alberto, @amelia and anyone you two want to involve: If you have input for us on image coding, and esp. if you have input / requirements / ideas for audio and video coding, the last chance for that is a discussion we can have right about now. Otherwise we will implement things as I proposed above (see bullet point “Multimedia coding” there).

matthias · June 5, 2019, 3:51pm

@alberto I guess we should talk about that and start to work on it …

alberto · June 5, 2019, 4:51pm

Yep. Friday?

matthias · June 5, 2019, 4:56pm

Yes. Friday early afternoon (≤14:00) and later in the evening (≥19:00) should both be ok.

matthias · October 23, 2019, 11:17am

I had to take a good hard look at the Graphryder software while installing it for the NGI Forward dataset. Due to that, and considering the available budget, the bits and pieces of the plan for the future of Graphryder have fallen into place. There is simply not much wiggle room – it’s pretty clear what we have to and can do with the Graphryder software within the H2020 projects. I have updated the plan for Graphryder above accordingly.

@hugi, this also means we don’t need you to make a dedicated software design for Graphryder anymore. But you’re welcome to contribute to the refactoring and development towards the “Graphryder interactive dashboard” v1 and v2 deliverables. If you have time for that, have a look at the plan above and let me know what part would interest you to work on. (Also, please let me know how much the Graphryder installation & documentation work took you in the end, and if you still need to receive money from Edgeryders for that.)

hugi · October 23, 2019, 11:33am

No, that was all covered by core funding already. That also means that the 1600 EUR assigned to me for that work is still not used.

nadia · October 23, 2019, 11:36am

it would be great if that could instead be redirected to cover some of the costs of the interface thing (Natalia’s thing).

matthias · October 23, 2019, 11:50am

Hmm no, that’s not possible. 1500 EUR of these costs is covered by the IT budget (see) and more is not possible with an IT budget of 3% of the project value, give or take the 1600 EUR. It’s already a very, very squeezed budget for handling a bunch of complex software applications.

(I wanted to write “Not in a thousand years @nadia” but I remembered in time that this is a public forum and we should appear more professional. But I think I’m clear: it’s really not possible.)

nadia · October 23, 2019, 11:54am

huh? I had interpreted @hugi’s post above to mean that there is an unspent 1.6K there. But clearly not, it seems.

matthias · October 23, 2019, 12:06pm

Well, not anymore. It was available for a microsecond and got immediately re-allocated to plug gaping holes in other underfunded parts of the IT infrastructure. Like that 4800 EUR hole that appeared out of nowhere at the beginning of the project when somebody (whom we no longer work with) had cut the IT budget down.

IT budgets are indeed quite flexible as you can cut down on software maintenance and quality for quite some time and nothing is seen on the outside. But to prevent a final collapse, there must be a limit to that cutting, and I think we’re well past it. So whatever budget I can find I’ll have to reinvest in software quality …

nadia · October 23, 2019, 12:07pm

Never ever a dull moment on planet edgeryders.

matthias · October 23, 2019, 12:08pm

That much is certain

hugi · October 23, 2019, 12:41pm

And, in fact, each installation of Graphryder API needs its own Neo4j instance. The community edition of Neo4j can’t handle more than one graph at a time. This means that we need to run multiple instances of Neo4j on the same server. This adds a lot of overhead, as Neo4j is a pretty expensive piece of software to run, especially from a memory point of view.

In light of this, I think it is a priority to rewrite the Graphryder API to use the same Neo4j database for multiple instances. This shouldn’t be incredibly difficult. What we need to do is to somehow label every node and relationship as belonging to a certain sub-graph. Another, more “graphy” way of doing it that might be more canonical for Neo4j would be to introduce the new node type “project” and create a “belongs_to” relationship from that project to all its associated nodes. This is a more memory-efficient way to do it in a graph database: since the relationships are all direct memory pointers from the first object, the only search operation is to get a node from a very small set of indexed project nodes.
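To make the two variants concrete, here is a rough sketch that only builds the Cypher query strings in Python; it does not connect to Neo4j. The node label `post`, the property names `project` and `name`, and the helper function names are assumptions for illustration. Only the “project” node type and the “belongs_to” relationship come from the proposal above.

```python
def _check_label(label: str) -> str:
    """Labels cannot be passed as Cypher parameters, so validate before interpolating."""
    if not label.isidentifier():
        raise ValueError(f"invalid node label: {label!r}")
    return label


def scoped_by_property(node_label: str) -> str:
    """Variant 1: every node carries a (hypothetical) 'project' property."""
    label = _check_label(node_label)
    return f"MATCH (n:{label} {{project: $project}}) RETURN n"


def scoped_by_relationship(node_label: str) -> str:
    """Variant 2: a 'project' node points to its nodes via 'belongs_to'."""
    label = _check_label(node_label)
    return (
        "MATCH (p:project {name: $project})"
        f"-[:belongs_to]->(n:{label}) RETURN n"
    )


if __name__ == "__main__":
    # The project name would be supplied as the $project query parameter.
    print(scoped_by_property("post"))
    print(scoped_by_relationship("post"))
```

In both variants the dataset is selected through the `$project` parameter, so one API instance could serve several datasets by passing a different value per request.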

matthias · October 23, 2019, 1:37pm

Good point. Didn’t know about that “business model” of the Neo4j community edition so far. I created a Github issue for your proposal, and added a few thoughts. Namely, when doing this we should probably go all the way and let one Graphryder API instance handle all the datasets.

hugi · October 23, 2019, 1:50pm

I would be up for working on Graphryder with you, especially if and when it comes to writing Cypher queries for Neo4j or working on the API in general. I’m less interested in working on the dashboard.

matthias · October 23, 2019, 2:19pm

Great. Then your own proposal is a good fit for you, in the variant “making Graphryder API multi-dataset capable”. As it would be part of the “interactive dashboard v1” deliverable, we’d need it by 2020-01-31 as a hard deadline. Paid by the hour under a normal H2020 project staff contract (no issue here at all as you’re already mentioned as staff in the Grant Agreement). But let me know should it become clear during the work that this would need more than 2500-3000 EUR in budget.

I’d do the dataset switcher inside Graphryder Dashboard myself, using the modified API that results from your work. And we’d both fix the most annoying software rot alongside as it suits the work and as we have the time (with about 4000 EUR additional budget for that). Taken together, that will be “interactive dashboard v1”. As all this repair and maintenance will be needed alongside, the main part of the interactive dashboard would be in v2, also a deliverable but due around the end of the project.

Let me know if that’s acceptable and I’ll send you the contract.

hugi · October 23, 2019, 2:21pm

Yup, it is! Sounds good. What would the deadline be?

matthias · October 23, 2019, 2:30pm

Since 2020-01-31 is our hard deadline for the deliverable (“uploaded to Github and documented”), let’s say 10 days before that so we have time to put it together with my work and test it all.

I’m ok with late deadlines like that if you show your progress as you work … so I can be reasonably sure we’ll be able to put together the deliverable in the end. Basically this means: make a little plan in the beginning (just edit the wiki above) and a feature branch, and update both as you make progress. Nothing special.