This is what Open Ethnographer will be like

matthias · November 24, 2014, 3:59pm

Forking is a good point; the rest can be decided later on

Previously I had intended collaboration to happen through the tag hierarchies: one would incorporate a code from another researcher, add to it by creating a second code for whatever the adopted code misses, and then create a logical code that combines the two (visible only in exports for analysis). The problems with this solution are that (1) the semantics of adding to someone’s code are not expressed as a concept in the software, (2) one cannot “uncode” anything from another researcher’s coded work, and (3) we might end up not using code hierarchies at all.

Forking seems to solve these. If we want this, it has to be prepared now in the way coding data is stored, but the feature itself can be implemented in a future version if it can’t happen now. So I will tentatively include it in the data model (the change simply means storing “code ID, user ID, word ID” triples rather than “code ID, word ID” pairs).
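To make the storage change concrete, here is a minimal sketch of what the triples could look like. It is only an illustration, not the actual schema: the names `Coding` and `words_coded` and all the IDs are made up for the example, and it assumes that a fork is simply the same code ID stored under a different user ID.

```python
from typing import NamedTuple


class Coding(NamedTuple):
    """One stored coding: a code applied to a word by a user (hypothetical layout)."""
    code_id: int
    user_id: int
    word_id: int


# Example data: users 1 and 2 each hold their own version of code 7.
codings = {
    Coding(code_id=7, user_id=1, word_id=100),
    Coding(code_id=7, user_id=1, word_id=101),
    Coding(code_id=7, user_id=2, word_id=101),
    Coding(code_id=7, user_id=2, word_id=102),
}


def words_coded(all_codings, code_id, user_id):
    """Return the set of word IDs that one user has tagged with one code."""
    return {
        c.word_id
        for c in all_codings
        if c.code_id == code_id and c.user_id == user_id
    }
```

With “code ID, word ID” pairs alone, the two versions of code 7 could not be told apart; with the user ID added, each researcher’s version of a code is just a filtered set of word IDs.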

This will not be “git style” collaboration with changesets and commits, but simpler (which is good). And I think forking should not work on the basis of coding files (all codings of one researcher, together) but on the basis of individual codes. This makes data storage, diff-ing etc. quite elegant to implement, and serves the disruptive potential of Open Ethnographer better (as I see it at least): being able to create public semantic metadata in a collaborative and aggregative way, which creates much more of this for later evaluation and research than we could create with current “one researcher, one coding file” methods.

So at every moment, a researcher will be able to compare the coding status of one of their codes with all the forked versions that are around. The software will indicate the differences from each forked version (codings both added to and removed from words), excluding only changes that the researcher manually rejected earlier. These differences can then be reviewed and adopted selectively. For the other use of the codings, rendering them as linked data (RDFa markup) in the public website output, Drupal could then use the union set of all publicly shared tags in a “fork set”.
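The comparison described here boils down to set operations on word IDs. The following is a rough sketch under my own assumptions, not the planned implementation: `diff_fork` and `fork_set_union` are hypothetical names, each version of a code is represented as a plain set of word IDs, and rejected changes are remembered as `("add", word_id)` or `("remove", word_id)` tuples.

```python
def diff_fork(mine, theirs, rejected=frozenset()):
    """Changes that would turn `mine` into `theirs`, minus earlier rejections.

    `mine` and `theirs` are sets of word IDs for the same code in two forks.
    Returns a set of ("add", word_id) and ("remove", word_id) tuples.
    """
    added = {("add", w) for w in theirs - mine}
    removed = {("remove", w) for w in mine - theirs}
    return (added | removed) - set(rejected)


def fork_set_union(forks):
    """Union of all word IDs coded in any publicly shared fork of one code.

    `forks` maps a user ID to that user's set of word IDs.
    """
    return set().union(*forks.values())


# Example: my version of a code versus another researcher's fork of it.
mine = {100, 101}
theirs = {101, 102}

pending = diff_fork(mine, theirs)
still_pending = diff_fork(mine, theirs, rejected={("remove", 100)})
public_words = fork_set_union({1: mine, 2: theirs})
```

In this example `pending` holds one addition (word 102) and one removal (word 100); once the removal has been rejected, only the addition is offered again. Working per code rather than per coding file is what keeps the diff this small.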

As for the whole code hierarchy discussion, I think we have to leave it for later, when it becomes possible to work with the new Open Ethnographer and see what’s needed. (Essentially, hierarchies on the coding side are just meant to help find a code faster, so once auto-suggest for codes is implemented, that could already be fast enough.) Postponing the decision does not hurt, as tags in eComma currently use neither hierarchies nor forking, so both would be additions anyway.