Some thoughts on (Collective) Ethnographic Practice

But ethnography, as I understand it from conversing with Amelia as well as my observations here over the years, is a fairly subjective practice with much leeway for selection and interpretation. Disclosure won't be seen by everyone, but it will be seen by some. And for those who know the ethnographer is both participating and doing the evaluating, it could raise questions and even provoke discussion about the work, since it is their words that become the basis of the coding. That's an interesting situation… people commenting while knowing they are being evaluated. It could make them say things that are better thought through. But it could also cause a certain amount of intrusion into the ethnographic work by drawing the ethnographer/participant into discussions about how the work gets done. (And maybe that's a good thing too… but it is all time consuming.)

And I think it raises another ethical issue: the ethnographer may participate, but shouldn't try to influence the conversation, either consciously or subconsciously, based on whatever biases that person brings to it.

I don’t think getting paid or not has to enter into it, if the person doing it has the right attitude. At the WELL I got paid for being present with everyone, but I was a full participant and that was seen as not just ok, but a part of the value.

In my view, the ethical consent funnel is like the process I go through when I’m interviewing people – I send a participant information sheet explaining the research and the protections involved, get their consent, and continue. But again this is one of those ethical questions that’s ongoing even in offline research: when you’re doing participant-observation, you’re not going to be running around the city with a sign taped to your back saying I’M AN ETHNOGRAPHIC RESEARCHER. But when you’re doing an interview with someone where you could be getting information that, should they be identifiable, could potentially harm them, your ethical commitments are different. More explicit disclosure is necessary.

Most of the time, the ethical procedures required by the university are both too stringent and not stringent enough. Better to have a more robust ethical code that you can assess case by case rather than blanket DO THIS/DON'T DO THAT rules — it's relatively easy to know when something might be harmful, whether it's out on an online forum or said over a drink. Use your common sense as a human being, and err on the side of protection. The point is that you as a researcher have ethical commitments, in my view, and those commitments shouldn't be hidden. They should be easily discoverable (whether in post form, through an ethical consent funnel, whatever). And above all you shouldn't be using deception unless you have an extremely compelling reason to do so (not just, 'people are more likely to talk to me if they don't know I'm a researcher').

The trick here, I think, is to a) realise that there is no contribution free of 'bias' (there is no view from nowhere: every intervention you make comes from a place), and b) be as reflexive as possible about what that place is. Part of the goal of social research is to tease out things that are of interest to the researcher (whether that researcher is doing, say, an independent project on smart cities, or an H2020 project on populism). For example, it seems clearly beneficial to the research to stop the conversation from becoming about daisy-growing — but where the skill comes in is being trained enough to know when the conversation about daisy-growing actually IS about populism.

For example, in Open Care we'd get some really generic posts about happiness. A CM like Noemi might step in and say: hey, could you tell us specifically what you do? Who you do it with? What kinds of futures you'd like to see? In this way she's acting like an ethnographer — teasing out the specifics and asking people to think more deeply. She's guiding the conversation, but she's still asking for this person's thoughts and opinions from their own perspective. She's not saying "We think care is X, do you agree?", for example, because that really restricts the person's ability to respond freely. There are a lot of nuances to it.

Later on in the research, though, it might actually be useful to test the social theory you’ve come up with (this is the part of ethnography that stresses not just theorising about people up in your tower, publishing it, and moving on to the next project). So in Open Care, I might ask everyone in the community: hey, I think from what you all have been collectively articulating in various forms, care is more about bringing people together than inventing a technological fix: I think you’re saying that people are the best technology. Is this accurate or am I coming out of left field? And then people can engage in meaningful conversation around that theory before you run off and publish it.


Payment is tricky because either option has its ethical pluses and minuses for both researchers and participants. If you pay people, you're not extracting free labour from them for your own gains as a researcher, so that's a good thing. But it also means people aren't participating 'freely' and, in research into lower-income populations, payment could actually exert a form of coercion, since the financial incentive is so meaningful. Also, your data quality usually goes down, because people are just participating for the money and are often therefore interested in getting the task done as quickly as possible.

From the researcher side, my feeling is that most of the time researchers are getting paid for their research one way or another, either through a formal job agreement or through research funding (and PhD research is often somewhere in between). It’s rare that you have a researcher self-funding for the sake of knowledge, like the wealthy astronomers back in the day.

I very much resonate with this, @amelia. This is a better way to express what I called “being a player, not the referee”.

@amelia @alberto @johncoate This is quite an interesting thread. There is one aspect that you guys do not address, which I thought would pop up as I went down the page. When I read the post title "Collective Ethno Practice", my first thought was that you would be discussing what it means to collectively code conversations.
Well, you may ask why I had this thought to begin with. The reason is that, as a computer scientist helping you guys deal with and make sense of the data, one concern I have is to identify approaches that can help ER's methodology scale.
And one approach would be to have multiple people code "together". That would inevitably bring those coders to share their codes, which in turn would probably bring in additional work to make sure coders have a collectively coherent approach to coding, and a convergent approach to using codes, introducing new ones, modifying existing codes, etc.
I am eager to hear your thoughts on this.


Hi @melancon

Interesting that you mention that. It's definitely a conversation I have heard a number of times in different ER contexts, especially as we discuss scalability and the next steps of the ethnographic work. It may be that we haven't mentioned it in the discussions on this thread because we feel we've had some of those conversations already somewhere else on the platform. I will look through and see if I can find the threads where that idea has been opened up and discussed already. But when I think about it more, I specifically remember conversations around 'collective coding' happening face-to-face at OpenVillage.

Thanks @alex_levene ! I will keep an eye on the thread to hear about other places where I can grab ideas/experiences/opinions.


There's @alberto's conversation on github from 2 years ago about the OpenEthnographer project: Architectural choice: decide whether codes should have an author or not · Issue #60 · edgeryders/openethnographer-for-drupal · GitHub
In that there's a discussion about "the vision of ethnography as a collaborative research methodology".

More recently, there's the discussion in this thread Open Notebook Science for Ethnographic Coding: Open Codebooks about the idea of 'open codebooks' allowing multiple ethnographers to share the same reference hierarchies.

I’m sure others in this thread may be able to point to other examples of the evolution of this discussion

I see. That must be linked to discussions we had in the context of opencare (I think I remember discussions of this sort when we were at CERN in Geneva …). The open notebook post by @alberto, however, already addresses technical/methodological issues, while I thought colleagues here would address more "philosophical" issues. Personally, I am curious to hear from our experts how they foresee the influence coders may have on each other: how it affects the way coders individually code and, as a consequence, how it may affect the overall coding. I am not sure this makes sense …

Trust @melancon to zero in onto The Problem. The Problem is that ethnography is very far indeed from being a collaborative discipline.

As @alex_levene mentions, @amelia and I are now talking of something called an open codebook. This maps perfectly onto my idea of a “coding wiki”, with the great advantage that ethnographers relate to it: they know what a codebook is. The paper codebook serves as a metaphor for my wiki idea, just as the paper address book we used to have pre-computers serves as a metaphor for the piece of software called “address book” or similar.

The difficulty with that stuff is the obviously non-open culture of the discipline:

I found exactly the same thing among archaeologists. They also have a problem of “what exactly constitutes evidence”; they also have developed their own private repositories of knowledge that precede the writing of a paper (“field notes”, “gray literature”, etc.). And they also are very private about them and never share them.

Regarding getting paid:

The community managers for our 2 Horizon 2020 projects will get paid something; I'm not sure of the exact number, but it won't be much, at least 'per hour'. But they will come from close association with the partners, so they should be motivated to do the part of the work where you further the conversation. That part takes longer than the 'keeper of the rules' functions, because you have to really think about what the others are saying or trying to say, and the context from which they say it.


> as a computer scientist helping you guys to deal with and make sense of the data, one concern I have is to identify approaches that can help ER's methodology to scale.

Where does machine learning come into the picture, if it does at all? @hugi and I have discussed this a few times. I used it back in the 90s to analyze multiple news sources so I could combine them onto single pages based on subject (for which we devised a categorizing system). Back then it was all very proprietary, but a cursory search today shows that there are some possibly interesting open source products, such as RapidMiner with the Aylien machine learning extension, that evaluate text and graph it, including clustering graphs.

Do codes ever get “settled” enough to trust augmentation of this sort, or of some sort? Is this kind of augmentation too crude by our high standards?

Because it does scale. These products were created because large multinationals demanded so much from the "document delivery" community ("super searchers") that they needed help in their attempts to categorize and cross-reference thousands of pages of documents once they all got digitized. Our using this to analyze news was unique at the time, which caused one of the big companies in the field, Autonomy, to work with us gratis, or nearly gratis, so they could enlarge the scope of their product. It was pioneering stuff at the time.

So, in order to deal with how labor-intensive document analysis is, I wonder if we can go from one super-ethnographer (Amelia) to a larger community (taught to use OE via the Academy), to a relatively stable code set, to something helped by machine augmentation. Is this realistic? Or does it break because each data set (such as a set of conversations about populism vs. a set about the future of the internet) needs too many specific codes? Or does it break because of something else?
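For the curious, the kind of machine augmentation described here can be sketched in a few lines with off-the-shelf tools. This is purely an invented illustration using scikit-learn with made-up posts, not a description of anything Edgeryders actually runs: posts are turned into TF-IDF term vectors and clustered, so posts with similar vocabulary end up grouped together.

```python
# Sketch: grouping similar forum posts with TF-IDF + k-means clustering.
# Invented posts; not a description of any existing Edgeryders tool.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

posts = [
    "Care is about bringing people together, not a technological fix.",
    "We built an open hardware sensor for air quality monitoring.",
    "Community care networks helped my neighbourhood during the crisis.",
    "The sensor firmware is open source and easy to modify.",
]

# Turn free text into weighted term vectors, then cluster them.
vectors = TfidfVectorizer(stop_words="english").fit_transform(posts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
print(labels)  # one cluster id per post; similar posts share an id
```

Whether grouping by shared vocabulary survives contact with real, messy conversational data is exactly the open question here.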

Ah ah! I did not mention anything – on purpose – about using machine learning or similar approaches. We made some very naive attempts on the occasion of the opencare Bordeaux Masters of Network hackathon – and they did not show anything valuable. That being said, I sincerely think we simply did not try seriously enough.
In any case, if I were to use any machine learning stuff, it would only be as a suggestion to a human coder. One thing we considered was ranking posts to indicate those that should perhaps be addressed first. Another would be to suggest codes based on similarities in content, codes, or topology features.
I still have hopes of spending time trying these out – maybe on the occasion of a future EU project?
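A minimal sketch of the "suggest codes based on content similarity" idea: a new post inherits, as a suggestion only, the codes of its most similar already-coded neighbour. All posts and codes below are invented, and scikit-learn is assumed.

```python
# Sketch: suggest ethnographic codes for a new post based on its most
# similar already-coded post. All posts and codes here are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

coded_posts = {
    "We organise weekly dinners so elderly neighbours are not alone.": ["community care"],
    "Our app matches volunteers with people who need help.": ["technological fix"],
    "The neighbourhood kitchen brings isolated people together.": ["community care"],
}
new_post = "We cook together every week so nobody in the street is isolated."

texts = list(coded_posts) + [new_post]
vectors = TfidfVectorizer(stop_words="english").fit_transform(texts)

# Compare the new post (last row) against every coded post.
sims = cosine_similarity(vectors[len(texts) - 1], vectors[: len(texts) - 1]).ravel()
best = int(sims.argmax())
suggested = coded_posts[list(coded_posts)[best]]
print(suggested)  # → ['community care']
```

The point of keeping this as a suggestion is that the human coder stays in charge: the machine only orders the candidates.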

We are still a fair way off using ML, IMHO, because there simply is no sizable training dataset. So yes, EU-funded is the only scenario where I would consider putting time behind trying to make it work.


This is true and something I had meant to mention - it does require a large dataset. For news we went through years of archives. Not sure if smaller datasets can work with today’s products.


Quite relevant. However, the size of the dataset you need depends on what methods you use (SVMs can achieve things with much smaller datasets) and, of course, on what you intend to do with these ML methods. My experience: using just half a thousand comments (audio notes from wine experts, recorded on their mobile phones and converted to text), we were able to put up a system that could automatically fill in a tasting note (a bunch of checkboxes on a wine's visual, olfactory and/or gustatory attributes) for new wines – with acceptable success rates, although it's true that in most cases the note could only be partly filled.

That gives me modest hope of being able to categorize posts as to whether they fit this or that code, and maybe of coming up with a suggestive ranking to help coders prioritize posts/comments.
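The tasting-note setup can be framed as multi-label classification: one small SVM decides each "checkbox". Below is a toy sketch loosely in that spirit; the notes, labels, and pipeline are all invented (scikit-learn assumed), not the actual wine system.

```python
# Sketch: multi-label text classification with one linear SVM per
# "checkbox", loosely in the spirit of the wine tasting-note system.
# All notes and labels are invented; this is not the real pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

notes = [
    "deep ruby colour with ripe cherry aromas",
    "pale straw colour, citrus nose, crisp acidity",
    "dark garnet, blackberry and spice on the palate",
    "light gold, fresh lemon and green apple",
]
labels = [["red", "fruity"], ["white", "crisp"], ["red", "fruity"], ["white", "crisp"]]

binarizer = MultiLabelBinarizer()
y = binarizer.fit_transform(labels)  # one binary column per attribute

model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LinearSVC()),  # one SVM decides each checkbox
)
model.fit(notes, y)

# Predict the checkboxes for an unseen note.
pred = binarizer.inverse_transform(model.predict(["ruby wine with ripe cherry fruit"]))
print(pred)
```

Because each label is an independent classifier, the model can legitimately fill only *some* of the checkboxes, which matches the "partly filled note" behaviour described above.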

But as you mention, that would be something we’d try within the context of a research project.


Relevant new article in American Anthropologist.
If you can’t access it, let me know and I can send along the PDF.

This is an exciting topic. I would like to give some insights I have on the discussion, but, legal disclaimer: I'm discovering ethnography (I have read a lot of posts on the platform these last days, but that's it), so forgive my possibly poor understanding :slight_smile:

It's funny, because this issue of participant observation, with the need to disclose one's researcher status and what it possibly implies (influencing the discussion), reminds me of the observer effect in quantum mechanics, where the act of measurement changes the system being observed. I'm working on creating and then analysing surveys (about stress and well-being at work), and when designing them we have to take into account the bias induced by, for instance, the question formulation or the periodicity of the campaign, and there are methodologies to reduce such errors inherent in survey measurements. But it seems that being a part of the observation should reduce this bias and allow for more precise analysis, because, for example, you discover the ideas/concepts covered almost on the fly, and intuitively it would converge more naturally towards something. (And it is more interesting to participate in the subject you are studying; it should be more immersive.) I would be very happy to learn more about this field :slight_smile:

That being said, to join another topic of this discussion: when dealing with open-ended questions, we have to find topics, and there is also the issue of scaling when we have many answers. We tried to do it automatically via NLP techniques (as a first step, before manually sharpening the distinctions between topics) with 4000 answers. Given the noise induced by open-ended questions (it is as if you hand someone a blank page and ask them to tell you how they feel today; you are left to the participant's imagination, without the ability to readjust anything), we can get some results, and some visualisation of the topics as well (we are using NMF, for those who are familiar with ML, with a lot of cleanup beforehand).
So maybe it is usable in ethnography, as a first step to define codes (the big ones).


I completely relate to that, and I confirm, as Alberto was noting, that even we as comm managers get more excited about some posts than others, and it would be slightly dishonest on my part to say I am equidistant in how I respond or in the attention I give each contribution. It's in the community's best interest to be part of an interesting and deep conversation, which means it's our duty to learn, or be intuitive about, how to tap into some content and actually induce the participants to develop ideas further, as well as to diminish the noise by not trying to aggressively squeeze the same kind of value from, for example, those who are not so available or simply don't have it at a given moment and place (very circumstantial… but that's what it is).
I would sum it up as having a very human approach to community management, far from something that could be automated or reduced to 5-steps-and-you're-done in the near future.

For what it's worth, regarding the two very intertwined roles of comm managers and ethnographers and how we are approaching collective intelligence: I'm with you on this, Amelia. We are all asking, I think, for two different people to do the two roles, not because they are incompatible or because it creates bias, but because they are both too important and each requires its own strong capacity.
That being said, I have to say I was, and still am, tempted to join a team of ethnographers in one of the projects, and to experience conversations from that angle. Also to learn more. If there is ever an opening – English and Romanian are the only two languages I master that much, I'm afraid – please consider me!

PS I’d also like to attend your trainings in the projects, at a minimum as a learner.