Setting up Sandbox for Oxford QDA Students

amelia · January 20, 2021, 10:32am

Grant, the professor I work with, is enthusiastic about the idea of the OII MSc students using the platform for coding and analysis for our class. I imagine they’ll be particularly interested in coding the NGI data as well as their own interviews/field notes.

How do we proceed in setting this up? Thank you!

hugi · January 20, 2021, 12:11pm

Hooray!

For this part, we would need to give them annotator access to Edgeryders proper. This is potentially an issue since they are in the production environment of a live research project. Remember that all people with annotator access can add and remove annotations for anything on the platform.

Unless, @matthias, there is some straightforward way of creating a second user group (i.e. annotator-students) and giving them limited access to create annotations that only they can see.

I could then modify the Graphryder import to only include annotations by a certain user group. I have an idea for how to do it, so I could work on it pro bono, as I think it would be a great step forward for our stack to be used in this context. However, the same might not be true for @matthias, and indeed the work on that end might be more complex.

This is easier, as we would set up a custom new platform for them, which @matthias said he and Daniel would be willing to do. Once they have set it up, I will help configure it and manage access.

alberto · January 21, 2021, 9:48am

The way I would do that is to filter the annotations that feed Graphryder by the name of the annotation’s author: only the ones by @nextgenethno get in.
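A minimal sketch of that author filter, assuming annotations arrive as simple records with a `creator` username. The record layout and the function name are hypothetical and are not taken from the actual Graphryder import code.

```python
def annotations_by_authors(annotations, allowed_authors):
    """Keep only the annotations created by one of the allowed usernames."""
    allowed = set(allowed_authors)
    return [ann for ann in annotations if ann["creator"] in allowed]


# Hypothetical sample data: two project ethnographers and one student.
annotations = [
    {"id": 1, "creator": "amelia", "code": "privacy"},
    {"id": 2, "creator": "student_a", "code": "privacy"},
    {"id": 3, "creator": "katejsim", "code": "trust"},
]

project_only = annotations_by_authors(annotations, ["amelia", "katejsim"])
```

Filtering at import time leaves student annotations untouched on the platform and only keeps them out of the graph.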

That might be too much power, indeed. In the short run, we can solve it by drilling a sense of responsibility into them (and we have daily backups anyway).

amelia · January 21, 2021, 10:13am

It’s not a big class so I should be able to keep things under control, I hope!

katejsim · January 21, 2021, 5:35pm

omg this will be a revelation to them after the troglodyte programs they’re trained to use

amelia · January 21, 2021, 6:47pm

mass exodus from nvivo incoming

amelia · January 22, 2021, 9:35am

I was going to wait til next week to write this out but I couldn’t fall asleep last night because I was thinking about the data model for OE for people’s ethnographic projects (I know, nerd alert). So I’m going to write it now. I think rather than seeing this as a “sandbox”, we should use it as a beta test of this part of Open Ethnographer and build on it from there. @alberto @matthias @hugi, what do you think?

Open Ethnographer Data Model

First, we set up a custom platform attached to the existing ER platform (a “community” like the Blivande platform). I think we should call it Open Ethnographer for clarity.

The front page: A welcome, with welcome video and instructions on how to use OE, and then the list of ongoing projects (automatically generated because they will be categories).

If you’re an ethnographer, you can start your own project by creating a category. Let’s say: Smart Cities on Mars is my project.

Each project has its own unique Discourse tag. So I create my “overall” tag, which follows the pattern #ethno-projectname. So let’s say my tag is #ethno-marsscities.
I start adding data. Say I want to add an interview. I create a new post, title it “Interview with Sanjay”, and I apply two Discourse tags: my project tag (#ethno-projectname) and a tag labelling what kind of data it is, attached to my project (#interview-projectname). Example types of data:
#fieldnote-projectname, #focusgroup-projectname, #survey-projectname, #socialmediaposts-projectname, #forumconvo-projectname, #video-projectname, #webcontent-projectname, #archival-projectname. We standardise these by providing a suggested list of project tags so that everyone is using the same tag format, but people are also free to add their own. I paste my interview and I can now code it.
I can choose to make these posts private (locked to me alone) or public (once stripped of any personally identifiable data). Making a post public means that I am OK with other people using my data for their projects, and also means I can use their data too! All I have to do to add someone else’s data to my project is to throw my project Discourse tag on an open post.
In the backend of OE, I can now see all the codes attached to my particular project (by sorting by my project Discourse tag), all the codes for one kind of data (sort by datatype-projectname) or see all the codes I’ve assigned to any project of mine (by sorting by my name).
I can also choose in the user settings whose codes I want to see suggested when I create new codes – only my own, only ones from a particular project of mine, only those of a particular user group (say, a group of my colleagues), or everyone.
I can visualise my data in my own SSNA, because I’ve used Discourse tags! And if I make my data open, others can too.
I can also host conversations on the platform to add to my own data – holding digital focus groups, events, etc.
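The tagging convention above can be sketched in a few lines. This is an illustration only: the post records, the `public` flag and all function names are assumptions, not part of Open Ethnographer or Discourse.

```python
# Suggested data-type prefixes, taken from the list in the post above.
SUGGESTED_DATA_TYPES = (
    "interview", "fieldnote", "focusgroup", "survey", "socialmediaposts",
    "forumconvo", "video", "webcontent", "archival",
)


def project_tag(slug):
    """Overall project tag, e.g. ethno-marsscities."""
    return f"ethno-{slug}"


def data_tag(data_type, slug):
    """Data-type tag attached to a project, e.g. interview-marsscities."""
    return f"{data_type}-{slug}"


def tags_for_new_post(data_type, slug):
    """The two tags applied to every new piece of data."""
    return [project_tag(slug), data_tag(data_type, slug)]


def reuse_post(post, slug):
    """Add someone else's open post to my project by adding my project tag."""
    if not post["public"]:
        raise PermissionError("only public posts can be reused")
    post["tags"].append(project_tag(slug))


def codes_for_project(posts, slug, data_type=None):
    """All codes in a project, optionally limited to one kind of data."""
    wanted = data_tag(data_type, slug) if data_type else project_tag(slug)
    return sorted({code for p in posts if wanted in p["tags"] for code in p["codes"]})


posts = [
    {"title": "Interview with Sanjay", "public": False,
     "tags": tags_for_new_post("interview", "marsscities"),
     "codes": ["infrastructure", "trust"]},
    {"title": "Field notes, week 1", "public": True,
     "tags": tags_for_new_post("fieldnote", "otherproject"),
     "codes": ["governance"]},
]

reuse_post(posts[1], "marsscities")
all_codes = codes_for_project(posts, "marsscities")
interview_codes = codes_for_project(posts, "marsscities", "interview")
```

Because the reused post keeps its original tags, it still belongs to the other project as well.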

From there, we keep working on how to import different kinds of data sources, expanding what OE can do. For example, using Tell Forms to conduct surveys on the platform as well and upload that data (I want to develop this with @alberto).

This way, from now on if someone asks if they can use OE for their own projects our answer can be – sure! Come on over.

We also build a community of practicing ethnographers who can feed back and tell us what new functions they’d find useful!

alberto · January 22, 2021, 9:48am

This is a very strong argument.

I guess what is needed for this scheme to come true is a check on the permission structure: for example, ordinary users in Discourse cannot create restricted categories. Admins can… but then they can view other people’s restricted categories, so if we make everyone an admin they can still access each other’s data. I defer to @matthias here.

I would be very keen on having people re-use our own data, those that live on edgeryders.eu. How would that work?

An extreme solution is, of course, to start from the datasets on Zenodo and write an “inverse script” that puts the data back into a Discourse instance, except it would be a different Discourse instance. But it seems very baroque!

amelia · January 22, 2021, 9:56am

Hm, is there a way to allow ordinary users to create categories but not see all the data? I ask because the categories are really clear delineations, so having your own category would be great for organisational and visual clarity.

I share this desire but currently the Edgeryders forum is too chaotic to be a landing page for new users – they’ll panic. And the existing Discourse tagging is equally chaotic. I’d very much like to start from scratch.

alberto · January 22, 2021, 10:07am

What? To do my PhD thesis I had to learn:

Python + Tulip
Stata
NetLogo

When you are teaching yourself three programming languages, disentangling a messy user interface seems like the least of your problems! And acquiring data is very expensive.

amelia · January 22, 2021, 10:16am

Alberto, you’ve got to stop extrapolating your own experience to others. This is bad UX. “I figured out something way harder” is probably how Nvivo got so bad in the first place

You are not like other people, friend! Not everyone shares the pleasure of untangling messy user interfaces – it’s not a value-judgement on Edgeryders, it’s making things more accessible right off the bat. Especially if a key point of difference is that it’s not as messy and complicated as Nvivo, which will be the main draw for many!

FWIW, our interface isn’t the problem – it’s how we organise data/information on that interface. Fresh start with same interface is the dream here for me.

alberto · January 22, 2021, 11:15am

Hmm… bit unfair, I would say. “This is bad UX. OTOH, it does come with 300K EUR and 2 years worth of data collection that you can use for free” (for NGI alone) is more like it. But that is beside the point: I was just thinking aloud, you are doing the work and therefore the shots are yours to call! I concede unconditionally.

amelia · January 26, 2021, 12:46pm

OK, let’s run this scenario:

We don’t create a new forum, we create a new Category, “Qualitative Data Analysis” (can rename later). We make it protected.

Students can create their own subcategory within it (I can make one for each of them) and lock it so only they can see it.

They upload and code their own data.

I’ll make sure it’s tidy and establish tagging standards for them right away.

Lingering question is: is it ethical for them to code data that already exists? And can we remove their codes from the dataset?

amelia · January 26, 2021, 1:15pm

As long as there are no issues with the first, I can take care of the second manually, worst case scenario. If that’s OK, then we don’t need to set up a new forum, which means I don’t need to take up Matt’s time and energy – I can just create the new cat and go from there on my own. It also means they won’t need admin permissions and won’t be able to see each other’s data. For just the class, the UX panic isn’t a problem, so we can punt that discussion for the future. Let me know if you see any issues with this @alberto and @matthias.

alberto · January 26, 2021, 1:38pm

Could also be: we curate datasets on Zenodo. Update more often, maybe even use @hugi’s script to add a GraphML file.

This gives people full control over the experience. OTOH, they are now stuck with N-Vivo to do their own coding, and manipulating the data requires some datasci skills.

amelia · January 26, 2021, 2:07pm

Yeah, not ideal (boo nvivo). Any issues with me setting up this new cat?

alberto · January 26, 2021, 6:43pm

No, none at all. Just trying to find solutions that would avoid skin contact between people and the platform, but still give them access to the data.

If that is what they do, there are no issues. We could even create a form where they drag-and-drop the text, the pseudonym of the interviewee, tick the box for consent and then the form uses POST API methods to create the topics.
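A rough sketch of what such a form handler might build. It assumes the Discourse convention of creating a topic with a POST to `/posts.json`, authenticated with `Api-Key` and `Api-Username` headers. The function name, the title pattern and the example values are hypothetical, and the request is only constructed here, not sent.

```python
import json
from urllib import request


def build_topic_request(base_url, api_key, api_username, *,
                        text, pseudonym, consent, category_id):
    """Turn one form submission into a (not yet sent) topic-creation request."""
    if not consent:
        # No ticked consent box, no upload.
        raise ValueError("consent is required before uploading data")
    payload = {
        "title": f"Interview with {pseudonym}",
        "raw": text,
        "category": category_id,
    }
    return request.Request(
        url=f"{base_url}/posts.json",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Api-Key": api_key,
            "Api-Username": api_username,
        },
        method="POST",
    )


req = build_topic_request(
    "https://forum.example.org", "placeholder-key", "placeholder-user",
    text="Transcript goes here.", pseudonym="Sanjay",
    consent=True, category_id=1,
)
# Sending it would be: request.urlopen(req)
```

Keeping the consent check in the handler means a submission without the ticked box never reaches the forum.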

If they want to code our data (which would be super super nice!) we need to be careful to pass to graphryder only the annotations filtered by the EDGE annotation creators (@amelia, @katejsim, @Wolha etc.).

hugi · January 26, 2021, 8:01pm

I’m confident that we can do that, and that I can modify the code accordingly.

For the current Graphryder, it would mean that for each dashboard, we would only include annotations by a limited number of users.

For the RyderEx version, it would require filtering the selection of ethnographers by selection in the dashboard itself. It adds a little bit of complexity, but it’s probably something we need to do down the line anyway if we have different research projects overlapping on the same primary data with different coding ontologies for different research questions.
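To illustrate the overlapping-projects case, here is a small sketch in which two dashboards read the same primary data but each only sees codes from its own ethnographers. The dashboard names, usernames and record layout are made up for the example and do not reflect the Graphryder or RyderEx code.

```python
from collections import defaultdict


def codes_per_dashboard(annotations, dashboards):
    """Map each dashboard to {topic_id: set of codes} for its ethnographers only."""
    result = {name: defaultdict(set) for name in dashboards}
    for ann in annotations:
        for name, ethnographers in dashboards.items():
            if ann["creator"] in ethnographers:
                result[name][ann["topic_id"]].add(ann["code"])
    return {name: dict(by_topic) for name, by_topic in result.items()}


# Hypothetical configuration: which ethnographers feed which dashboard.
dashboards = {
    "ngi": {"amelia", "katejsim"},
    "qda-class": {"student_a"},
}

# Topic 10 is coded by both projects with different ontologies.
annotations = [
    {"topic_id": 10, "creator": "amelia", "code": "privacy"},
    {"topic_id": 10, "creator": "student_a", "code": "surveillance"},
    {"topic_id": 11, "creator": "katejsim", "code": "trust"},
]

views = codes_per_dashboard(annotations, dashboards)
```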

amelia · January 28, 2021, 11:42am

I’m coming across an issue – I assumed it would be easy to toggle privacy settings for a post, but it doesn’t seem possible (or I’m not finding it easily).

If I wanted to create a post that only I can see, would that be possible? Without this ability, this thing is dead in the water, because people have to be able to keep the data private until they can strip it of identifiable information, and you usually code before you do this – and you’ll often decide that the interview itself is too sensitive to release at all. In that case you’d keep it private always and just keep the codes from it in the dataset.

@alberto @matthias @hugi

The workaround that I can see right now is to keep everything private – by giving each student their own category and then setting the security in that category to be a user group of 1 (the student themselves). Tedious but works until we can find another solution. And it wouldn’t allow for certain data in the category to be visible to others, which is the goal of this thing ultimately (to share the data you can share). On top of that, they can’t convene platform conversations in their own category, because that would require allowing those users to see everything else in the category. So it’s not ideal.
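The workaround and its limitation can be modelled in a few lines. This is a toy model with hypothetical group and category names, not Discourse's permission code: visibility is decided per category, so there is no way to open up a single post.

```python
def plan_private_categories(students, parent_slug):
    """One group of one and one locked subcategory per student."""
    plan = []
    for username in students:
        plan.append({
            "group": f"qda-{username}",
            "members": [username],
            "category": f"{parent_slug}/{username}",
        })
    return plan


def visible_categories(plan, username):
    """Categories a user can see: only those whose group they belong to."""
    return [entry["category"] for entry in plan if username in entry["members"]]


plan = plan_private_categories(["student_a", "student_b"], "qualitative-data-analysis")
seen_by_a = visible_categories(plan, "student_a")
```

Sharing one topic with a colleague would mean adding them to the group, which exposes everything else in that category too.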

amelia · January 28, 2021, 12:35pm

Also, I’m creating an example project from my dissertation research, check it out:

https://edgeryders.eu/c/qualitative-data-analysis/amelias-smart-cities-project/383