Codebook as a wiki: first iteration

alberto · August 23, 2019, 6:34am

This sounds like a great idea. Let’s hammer it into a proper metadata strategy for the data of SSNA studies.

This ties into this issue: I hope to convince you, @matthias, that we do need a database entity to refer to a whole SSNA study. The reason is metadata, which are essential to understanding and reusing the data. Some metadata can be generated automatically from the data: for example, the duration of the conversation can be inferred from the timestamps of the posts; the duration of the coding from the timestamps of the annotations; the number of researchers involved from the names of the annotations’ authors. Others cannot, specifically the methodological choices informing the coding (“we use a grounded theory approach…”, “we chose to use names of individual technologies, and even of specific applications, as in vivo codes. This is because the study’s ultimate purpose is to contribute to shaping technology policy…”).
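To make the "automatically generated" part concrete, here is a rough sketch of how those three pieces of metadata could be derived. It assumes posts and annotations are available as lists of dicts with hypothetical `created_at` (ISO 8601 string) and `author` fields; the real export format may well differ.

```python
from datetime import datetime


def derive_metadata(posts, annotations):
    """Compute study-level metadata that follows directly from the data.

    Field names ("created_at", "author") are assumptions made for this sketch.
    """
    post_times = [datetime.fromisoformat(p["created_at"]) for p in posts]
    ann_times = [datetime.fromisoformat(a["created_at"]) for a in annotations]
    return {
        # Duration of the conversation, from the timestamps of the posts
        "conversation_start": min(post_times).isoformat(),
        "conversation_end": max(post_times).isoformat(),
        "conversation_days": (max(post_times) - min(post_times)).days,
        # Duration of the coding, from the timestamps of the annotations
        "coding_days": (max(ann_times) - min(ann_times)).days,
        # Number of researchers, from the distinct annotation authors
        "researchers": len({a["author"] for a in annotations}),
    }
```

The methodological notes have no counterpart here: they would have to be written by hand and stored with the study.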

I find it useful to think about these things in terms of exporting a whole project for safekeeping on a repository such as Zenodo. What would a complete export look like? The ethnography proper would have:

pseudonymized topics and posts
annotations thereof, which include the associated codes
the codebook, automatically generated from the annotations and codes. It includes the discussion part as suggested by Matt.
methodological notes.
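
As an illustration of "automatically generated from the annotations and codes", a codebook could be assembled along these lines. This is only a sketch: the field names (`id`, `name`, `description`, `discussion`, `code_id`, `post_id`) are hypothetical and not taken from the actual database schema.

```python
from collections import defaultdict


def build_codebook(annotations, codes):
    """Return one codebook entry per code, with simple usage statistics.

    All field names are assumptions made for this sketch.
    """
    usage = defaultdict(lambda: {"annotations": 0, "posts": set()})
    for a in annotations:
        entry = usage[a["code_id"]]
        entry["annotations"] += 1
        entry["posts"].add(a["post_id"])

    codebook = []
    for code in codes:
        u = usage[code["id"]]  # codes never used get zero counts
        codebook.append({
            "name": code["name"],
            "description": code.get("description", ""),
            # Free-text discussion attached to the code, if any
            "discussion": code.get("discussion", ""),
            "annotation_count": u["annotations"],
            "post_count": len(u["posts"]),
        })
    # Alphabetical order makes the exported codebook easier to browse
    return sorted(codebook, key=lambda e: e["name"])
```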

@amelia, @ccs, @Leonie, am I missing anything?

For the SSN proper, we could take two roads.

The first is this: there is no need to export anything, because the SSN adds no information to the coded data. It is simply a network representation thereof. Interested researchers can rebuild it from the ethno data.
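
To show what "rebuild it from the ethno data" could mean in practice, here is a minimal sketch of a code co-occurrence network. It assumes that two codes are linked whenever they annotate the same post, and that the edge weight is the number of posts on which they co-occur; the field names `post_id` and `code` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_edges(annotations):
    """Map each unordered pair of codes to the number of posts they share."""
    codes_by_post = defaultdict(set)
    for a in annotations:
        codes_by_post[a["post_id"]].add(a["code"])

    edges = Counter()
    for codes in codes_by_post.values():
        # Sorting gives each pair a canonical order, so (A, B) == (B, A)
        for pair in combinations(sorted(codes), 2):
            edges[pair] += 1
    return edges
```

The resulting edge list could then be loaded into any network analysis tool; nothing in it goes beyond what the annotations already contain.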

Alternatively, we could add to our data either the Neo4j file powering Graphryder (after pseudonymization), or a Docker file containing the whole Graphryder shebang.

I think I am going to fork the final part of this discussion and move it to #ioh:workspace, so other partners in the consortium are more exposed to it.