We are still a fair way off using ML, IMHO, because there simply is no sizable training dataset. So yes, an EU-funded project is the only scenario in which I would consider putting time into trying to make it work.
This is true and something I had meant to mention - it does require a large dataset. For news we went through years of archives. Not sure if smaller datasets can work with today’s products.
Quite relevant. However, the size of the dataset you need depends on the methods you use (an SVM can get results from much smaller datasets) and, of course, on what you intend to do with these ML methods. My experience: using only about five hundred comments (audio notes that wine experts recorded on their mobile phones, converted to text), we were able to build a system that could automatically fill in a tasting note (a bunch of checkboxes on a wine’s visual, olfactory and/or gustatory attributes) for new wines, with acceptable success rates, although it’s true that in most cases the note could only be partly filled.
That gives me modest hope that we could categorize posts by whether they fit this or that code, and maybe come up with a suggested ranking to help coders prioritize posts/comments.
But as you mention, that would be something we’d try within the context of a research project.
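To make the "suggested ranking" idea a bit more concrete, here is a minimal sketch, not the system from the wine project. It assumes a single hypothetical code ("care"), a handful of invented example posts, and a plain linear SVM trained by subgradient descent on the hinge loss over bag-of-words vectors. All function names and data below are made up for illustration.

```python
import math
from collections import Counter

def vectorize(text):
    """L2-normalised bag-of-words, stored as a sparse dict."""
    counts = Counter(w for w in text.lower().split() if w.isalpha())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {w: c / norm for w, c in counts.items()}

def score(weights, text):
    """Higher score means the post looks more like the positive examples."""
    return sum(weights.get(w, 0.0) * v for w, v in vectorize(text).items())

def train_linear_svm(examples, epochs=100, eta=0.1, lam=0.01):
    """Linear SVM without bias term: hinge loss + L2 penalty, fixed step size."""
    weights = {}
    data = [(vectorize(text), label) for text, label in examples]
    for _ in range(epochs):
        for x, y in data:
            margin = y * sum(weights.get(w, 0.0) * v for w, v in x.items())
            for w in weights:                      # L2 shrinkage
                weights[w] *= 1.0 - eta * lam
            if margin < 1.0:                       # hinge loss is active
                for w, v in x.items():
                    weights[w] = weights.get(w, 0.0) + eta * y * v
    return weights

def rank_posts(weights, posts):
    """Sort posts so the ones most likely to fit the code come first."""
    return sorted(posts, key=lambda p: score(weights, p), reverse=True)

# Invented training posts: +1 fits the hypothetical code "care", -1 does not.
TRAINING = [
    ("my neighbour looks after her elderly mother every day", 1),
    ("we organise volunteers to care for sick neighbours", 1),
    ("the clinic offers free care to elderly people", 1),
    ("the new bus timetable starts on monday", -1),
    ("our hackathon produced a mapping app", -1),
    ("the budget meeting was moved to friday", -1),
]
NEW_POSTS = [
    "the hackathon budget was discussed on monday",
    "volunteers care for elderly neighbours at the clinic",
    "thanks for sharing",
]

weights = train_linear_svm(TRAINING)
ranked = rank_posts(weights, NEW_POSTS)
for post in ranked:
    print(round(score(weights, post), 2), post)
```

The point of ranking rather than hard classification is that the human coder still makes every decision; the model only suggests which posts to look at first. With a few hundred real coded posts, a library implementation would be the sensible choice over this hand-rolled one.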
Relevant new article in American Anthropologist.
If you can’t access it, let me know and I can send along the PDF.
This is an exciting topic, and I would like to share some thoughts on the discussion. But first, a disclaimer: I’m only just discovering ethnography (I have read a lot of posts on the platform these last few days, but that’s it), so forgive me if I misunderstand something.
It’s funny, because this issue of participant observation, with the need to disclose one’s researcher status and what that may imply (influencing the discussion), reminds me of the observer effect in quantum mechanics, where measuring a system changes the system being measured. I work on creating and then analysing surveys (about stress and well-being at work), and when designing them we have to take into account the bias induced by, for instance, the wording of the questions or the periodicity of the campaign; there are methodologies to reduce such errors inherent in survey measurements. But it seems that being part of what you observe should reduce this bias and allow for more precise analysis, because, for example, you discover the ideas and concepts covered almost on the fly, and intuitively the analysis should converge more naturally on something. (And it is more interesting to participate in the subject you are studying; it should be more immersive.) I would be very happy to learn more about this field.
That being said, to pick up another topic of this discussion: when dealing with open-ended questions, we have to find topics, and there is also the issue of scaling when we have many answers. We tried to do it automatically with NLP techniques (as a first step, before manually sharpening the boundaries between topics) on 4000 answers. Open-ended questions induce a lot of noise: it is as if you hand someone a blank page and ask them how they feel today, leaving everything to the participant’s imagination, with no ability to readjust anything. Even so, we do get some results, and some visualisation of the topics as well (we are using NMF, non-negative matrix factorization, for those who are familiar with ML, with a lot of cleanup beforehand).
So maybe it is usable in ethnography, as a first step to define codes (the big ones).
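For those who have not seen NMF applied to text, here is a small self-contained sketch of the idea. It is not our actual pipeline: the answers, the stopword list and all function names are invented for illustration, and it uses the textbook multiplicative update rules on a tiny word-count matrix rather than a library implementation.

```python
import random

# Tiny illustrative stopword list; real cleanup would be far more thorough.
STOPWORDS = {"and", "too", "all", "my", "no", "from", "to", "i", "would", "like", "more"}

def build_matrix(docs):
    """Return a documents x words count matrix and the vocabulary."""
    tokens = [[w for w in d.lower().split() if w not in STOPWORDS] for d in docs]
    vocab = sorted({w for doc in tokens for w in doc})
    index = {w: j for j, w in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in docs]
    for i, doc in enumerate(tokens):
        for w in doc:
            V[i][index[w]] += 1.0
    return V, vocab

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def frobenius_error(V, W, H):
    """Squared reconstruction error ||V - WH||^2."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra))

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor V (n x m) into non-negative W (n x k) and H (k x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return W, H

def top_words(H, vocab, n=3):
    """The n highest-weighted words of each topic (each row of H)."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in H]

# Invented survey answers, standing in for real open-ended responses.
ANSWERS = [
    "too many meetings and too many emails",
    "meetings all day and endless emails",
    "my manager gives no feedback",
    "no feedback and no support from my manager",
    "emails and meetings leave no time to focus",
    "I would like more support and feedback",
]

V, vocab = build_matrix(ANSWERS)
W, H = nmf(V, k=2)
topics = top_words(H, vocab)
for t, words in enumerate(topics):
    print("topic", t, words)
```

Each row of H is a candidate topic (a weighting over words) and each row of W says how strongly an answer loads on each topic. The number of topics k has to be chosen by hand, which is one place where the manual sharpening step comes in.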
I completely relate to that, and confirm, as Alberto was noting, that even we as comm managers get more excited about some posts than others, and it would be slightly dishonest on my part to say I am equidistant in how I respond or in the attention each contribution gets. It’s in the community’s best interest to be part of an interesting and deep conversation, which means it’s our duty to learn, or be intuitive about, how to tap into some content and actually encourage participants to develop their ideas further, as well as to reduce the noise by not trying to aggressively squeeze the same kind of value from, e.g., those who are not so available or simply don’t have it at a given moment and place in time (very circumstantial… but that’s what it is).
I would sum it up as having a very human approach to community management, far from something that could be automated or reduced to 5-steps-and-you’re-done in the near future.
For what it’s worth, about the two very intertwined roles of comm managers and ethnographers and how we are approaching collective intelligence: I’m with you on this, Amelia. I think we are all asking for 2 different people to do the 2 roles, not because they are incompatible or because combining them creates bias, but because they are too important and each requires its own strong capacity.
That being said, I have to say I was, and still am, tempted to join a team of ethnographers in one of the projects and experiment with conversations from that angle. Also to learn more. If there is ever an opening (English and Romanian are the only two languages I master well enough, I’m afraid), please consider me!
PS I’d also like to attend your trainings in the projects, at a minimum as a learner.