GIGO problem
“Garbage in, garbage out”.
I would not trust those data. People do not really keep their profiles up to date.
Semantic networks are a better bet. Once we have some orderly SSN data, we can link people to ethno codes via the content they authored. For example, user John authored the post “Communities from offline to online”, which has been associated with the codes “community”, “facilitation” and “conflict management”. You could then ask the SSN questions like “give me everyone who has at least 3 codes in common with John”. @melancon is working on this, and we should have an implementation on opencare data in a few months.
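To make the kind of query concrete, here is a rough sketch in Python. It is not the implementation @melancon is working on, only an illustration under stated assumptions: the data layout (a list of author, title, codes tuples), the function names and every user and post other than John's example are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: (author, post title, ethno codes on that post).
# Only John's post comes from the example above; the rest is made up.
POSTS = [
    ("John", "Communities from offline to online",
     {"community", "facilitation", "conflict management"}),
    ("Maria", "Running a neighbourhood assembly",
     {"community", "facilitation"}),
    ("Maria", "When volunteers disagree",
     {"conflict management", "care"}),
    ("Ahmed", "Caring for carers",
     {"community", "care"}),
]


def codes_by_author(posts):
    """Collect, for each author, the union of codes over all their posts."""
    result = defaultdict(set)
    for author, _title, codes in posts:
        result[author] |= codes
    return dict(result)


def people_sharing_codes(posts, person, min_shared=3):
    """Return {author: shared codes} for authors with at least
    `min_shared` codes in common with `person`."""
    by_author = codes_by_author(posts)
    target = by_author.get(person, set())
    matches = {}
    for author, codes in by_author.items():
        if author == person:
            continue
        shared = target & codes
        if len(shared) >= min_shared:
            matches[author] = shared
    return matches


if __name__ == "__main__":
    for author, shared in people_sharing_codes(POSTS, "John").items():
        print(author, sorted(shared))
```

With this toy data only Maria qualifies, because her two posts together carry all three of John's codes. A real SSN would more likely be queried as a graph, but the logic (people linked to codes through the content they authored) is the same.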
We will not be able to do this on the whole of the Edgeryders conversation, not in the short or medium run. At 1,000 participants and over 20,000 comments, it is too expensive to code entirely. But opencare is a significant subset: we already have 150 unique participants, with some 230 long-form posts and challenge responses and over 1,600 comments. In the long run, we could use machine learning: the human-coded sections of the conversation double as training data for classifiers. Once the classifiers are good enough, we can have them code the rest of Edgeryders. Maybe. More thoughts on this here.