Imported from the hackpad: https://lote5.hackpad.com/SAT-0930-1045-MASTERS-OF-NETWORKS-NETWORKS-OF-CARE-hackathon-for-network-scientists-doctors-and-patients-to-make-sense-o-vxaFSnxaNTg. Thanks to all note takers, especially Rossella B.
This is the main conversation
You can download Tulip from here
You can download the data file here https://drive.google.com/file/d/0B7HgdYQcOLwncWhwQ3lKdmZXMmM/view
The data we have are about Edgeryders users: which posts they wrote and who reacted to them.
We want to see how they interact, so we want to link them to one another.
If you click on “user to post” in Tulip, the users who wrote original posts are highlighted, each linked to the posts they wrote.
If you click on one of the posts you see information about the post, the text, the date, the title, plus some information that is internal to the software and is not so interesting for us.
Edgeryders supports in-platform ethnographic tagging. Ethnography is a qualitative method to analyze texts and attach tags that represent what each text says. The tags have to be standardized, so that different researchers use them consistently.
In the past, ethnographic research was limited because it was costly to gather the texts via interviews and transcripts. Such studies are not statistically representative.
Of course, most content in Edgeryders is NOT ethnographically tagged: ethnography is expensive, and we only do it as part of contracts in which ethnography is used to lift collective expert advice from the conversation.
Online the texts are already written, so they don’t need to be transcribed, and the ethnographers’ codes can be kept together with the texts.
These tags are organized into super-classes (e.g. “Topic” or “Place”), which can be nested; for each super-class you can enter different values (e.g. “Charity” or “Cairo”).
An example: a user writes “Our task will be to explore how we may then model interaction between users,” -> the ethnographer attaches the tag “interaction”. This tag is added to the HTML code of the platform, directly linked to the original text.
When this happens, a semantic layer is added to Edgeryders content. We can add semantics to the social network representation of the Edgeryders conversation: not just “who is talking to whom”, but, adding semantics, “who is talking to whom about what”.
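A minimal sketch of what "who is talking to whom about what" could look like as data, assuming each comment record carries its author, the user being replied to, and the ethnographer's tags. The record layout, the field names and the user names are hypothetical, not the Edgeryders export format.

```python
from collections import namedtuple

# A tag is a (super_class, value) pair, e.g. ("Topic", "Charity").
Tag = namedtuple("Tag", ["super_class", "value"])

# Hypothetical comment records (field names are assumptions).
comments = [
    {"author": "anna", "reply_to": "bob", "tags": [Tag("Topic", "Charity")]},
    {"author": "bob", "reply_to": "anna",
     "tags": [Tag("Topic", "Charity"), Tag("Place", "Cairo")]},
    {"author": "carl", "reply_to": "anna", "tags": []},  # not coded
]


def semantic_edges(comments):
    """Map each directed (source, target) pair to the set of tags
    attached to the texts exchanged along that edge."""
    edges = {}
    for comment in comments:
        key = (comment["author"], comment["reply_to"])
        edges.setdefault(key, set()).update(comment["tags"])
    return edges


edges = semantic_edges(comments)
for (source, target), tags in sorted(edges.items()):
    labels = sorted(f"{t.super_class}:{t.value}" for t in tags)
    print(source, "->", target, labels)
```

An edge with an empty tag set is still part of the social network; it simply carries no semantics because nobody coded it.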
On Edgeryders, the ethnographic data are linked to other kinds of data, like the users’ profiles (age, location etc.) and the circumstances surrounding the text. We have these ethnographer-coded data only for a specific conversation (because we had financing to gather it). We want to do this again in Open Care.
Of course you do not have all the information about the context of a conversation; for instance, you don’t know what happened outside the online platform. We will still miss a lot, and make misinterpretations.
Question (Ezio Manzini): what is the object that you are observing? Is it a conversation on Open Care or is it a conversation on Open Care in the context of the Edgeryders platform, and such and such.
A problem with “large scale conversations”: a large number of users does not imply that many people will participate in the conversation. In the course of an EU project called CATALYST we learned that almost no threads have more than 10 participants, even in “large communities” like Loomio, with 10K users.
Ezio says we should rephrase the question into one that can be answered more precisely by this tool. Yes, but we need to use the tool to discover the question. You have some questions, and in the process you will discover more questions.
We have users who initiate posts and reply to posts with comments. We have ethnographers who will tag some of the texts. We want to discover how people interact and what the conversations will be about.
When you click
- "user to user" in Tulip, you see the people who actually interact (have a conversation).
- "user to comments", you see which comments each user made.
- "post to groups", you see the posts that have been added to each group.
- "user to tags", you see which users are talking about a tag. If you believe in collective intelligence you need more than this, because you still don't know if these users are talking to each other. You want to see the conversations between the users that contain this tag: that is when the semantics become important.
- We also want to link concepts to other concepts.
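The views listed above can be sketched as edge sets derived from the same records. This is only an illustration under assumed field names (`author`, `post`, `group`, `tags`); it is not how the Tulip file was actually built.

```python
# Hypothetical miniature dataset; all field names are assumptions.
posts = [
    {"id": "p1", "author": "anna", "group": "opencare", "tags": ["care"]},
    {"id": "p2", "author": "bob", "group": "opencare",
     "tags": ["open-source software"]},
]
comments = [
    {"id": "c1", "author": "bob", "post": "p1", "tags": ["care"]},
    {"id": "c2", "author": "carl", "post": "p2", "tags": []},
]


def build_views(posts, comments):
    """Return one set of (source, target) pairs per network view."""
    post_author = {p["id"]: p["author"] for p in posts}
    names = ("user_to_post", "user_to_comment", "post_to_group",
             "user_to_user", "user_to_tag")
    views = {name: set() for name in names}
    for p in posts:
        views["user_to_post"].add((p["author"], p["id"]))
        views["post_to_group"].add((p["id"], p["group"]))
        for tag in p["tags"]:
            views["user_to_tag"].add((p["author"], tag))
    for c in comments:
        views["user_to_comment"].add((c["author"], c["id"]))
        # The commenter interacts with the author of the post.
        views["user_to_user"].add((c["author"], post_author[c["post"]]))
        for tag in c["tags"]:
            views["user_to_tag"].add((c["author"], tag))
    return views


views = build_views(posts, comments)
print(sorted(views["user_to_user"]))
print(sorted(views["user_to_tag"]))
```

Note that "user to tag" alone says that anna and bob both talk about "care", while only "user to user" says that they talk to each other.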
Here you can see if a tag was interesting and if it was important. Sharing something and commenting on it are different things. Relevance can become important as well (this has to do with sentiment analysis), but it is not the core of what we want to do in Open Care. It can be important to single out the people who talk about a topic.
An idealistic idea of human interaction: deep conversation, rather than big data.
User-to-tags networks are not easy to interpret. Guy and Benjamin Renoust introduced semantic edges: for example, show me only the edges that include “real action” as a tag. Another idea is visualising two networks, one of people to people and one of tags to tags. If you "lasso" a topic, you see the network of people who talked about it. If you lasso two topics, you see the people who talked about either (but you can switch view and see who talked about both). If the people have been talking to each other about the tag, then the edge becomes red as well; otherwise only the node does. If you select a user, you can see the tags that user has been talking about.
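The "either" and "both" lasso views amount to a set union and a set intersection over the users attached to each tag. A small sketch, assuming semantic edges are available as (source, target, tag) triples and that the source of an edge is the person who talked about the tag; the data and function names are made up for illustration.

```python
# Hypothetical semantic edges as (source, target, tag) triples.
tagged_edges = [
    ("anna", "bob", "real action"),
    ("bob", "anna", "real action"),
    ("carl", "anna", "open-source software"),
    ("dora", "carl", "real action"),
    ("dora", "carl", "open-source software"),
]


def users_for(tag, edges):
    """Users who wrote something carrying the tag (the edge source)."""
    return {source for source, _, t in edges if t == tag}


def lasso(tags, edges, mode="either"):
    """Users who talked about either (union) or both (intersection) tags."""
    user_sets = [users_for(tag, edges) for tag in tags]
    if not user_sets:
        return set()
    if mode == "either":
        return set.union(*user_sets)
    return set.intersection(*user_sets)


def highlighted_edges(tag, edges):
    """Edges that would be drawn red: the exchange itself carries the tag."""
    return {(source, target) for source, target, t in edges if t == tag}


selection = ["real action", "open-source software"]
either = lasso(selection, tagged_edges, mode="either")
both = lasso(selection, tagged_edges, mode="both")
red = highlighted_edges("real action", tagged_edges)
print(sorted(either), sorted(both), sorted(red))
```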
In this version of the network, people are connected if one of them reacts to the other’s post or comment. If someone reacts to a comment on somebody else’s post, then they will be connected to the author of the comment but not to the author of the post. With these data it is possible to make different choices.
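The connection rule just described, and one possible alternative, can be sketched as follows. The item structure (`author`, `parent`) is an assumption, and the `link_to_post_author` option is a hypothetical example of a "different choice", not something the group settled on.

```python
# Hypothetical content items: each has an author and an optional parent.
items = {
    "p1": {"author": "anna", "parent": None},  # original post
    "c1": {"author": "bob", "parent": "p1"},   # comment on the post
    "c2": {"author": "carl", "parent": "c1"},  # reply to bob's comment
}


def interaction_edges(items, link_to_post_author=False):
    """Directed edges from the person reacting to the person reacted to."""
    edges = set()
    for item in items.values():
        parent_id = item["parent"]
        if parent_id is None:
            continue  # original posts react to nobody
        # Rule used in this version: link to the direct parent's author only.
        edges.add((item["author"], items[parent_id]["author"]))
        if link_to_post_author:
            # Alternative choice: also link to the author of the root post.
            root_id = parent_id
            while items[root_id]["parent"] is not None:
                root_id = items[root_id]["parent"]
            edges.add((item["author"], items[root_id]["author"]))
    return edges


print(sorted(interaction_edges(items)))
print(sorted(interaction_edges(items, link_to_post_author=True)))
```

With the default rule carl is linked to bob only; with the alternative he is also linked to anna, which makes post authors look more central.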
Start from a tag, say “open-source software” and
- see if the people talking about it are also talking to each other. If so, we can speak of an emergent group of specialists, emergent because it "happened", nobody assigned them to do that. The word "group" is critical here: if people were just speaking about open source software in isolation, they could be in total disagreement and not even know about it, but if they are talking to each other about open source software there can be a convergence going on, like when Wikipedians weed out mistakes in the encyclopedia's articles.
- lasso those guys and see what else they are also talking about, and get an indicative map of how this emergent group of experts sees the topic.
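The two steps above can be sketched like this: keep only the exchanges that carry the tag, look for connected groups among them, then count the other tags used by the members of a group. The data, the names, and the choice of connected components as the definition of "group" are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical semantic edges: (source, target, set of tags on the exchange).
tagged_edges = [
    ("anna", "bob", {"open-source software", "licensing"}),
    ("bob", "anna", {"open-source software"}),
    ("carl", "dora", {"open-source software", "funding"}),
    ("dora", "eve", {"care"}),
]


def talking_to_each_other(tag, edges):
    """Undirected pairs of users whose exchange carries the tag."""
    return {frozenset((s, d)) for s, d, tags in edges if tag in tags and s != d}


def connected_groups(pairs):
    """Connected components of the undirected graph defined by the pairs."""
    adjacency = {}
    for pair in pairs:
        a, b = tuple(pair)
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for start in adjacency:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups


def co_occurring_tags(group, tag, edges):
    """Other tags on contributions written by members of the group."""
    counts = Counter()
    for source, _, tags in edges:
        if source in group:
            counts.update(t for t in tags if t != tag)
    return counts


topic = "open-source software"
groups = connected_groups(talking_to_each_other(topic, tagged_edges))
for group in groups:
    print(sorted(group), dict(co_occurring_tags(group, topic, tagged_edges)))
```

Here the topic splits into two separate pairs rather than one group, which is exactly the case where people may disagree without knowing it.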
The present data are a “post-mortem” of the conversation. The ideal situation is to be able to analyze the conversation with this tool almost in “real time”, let’s say on a daily basis. This, however, has the problem that ethnographic research does not really work that way: ethnographers find it easier to maintain consistency if they work in batches.
Guy: The tool could influence people to do more if it was available. It can be a user tool, besides a research tool. Alberto: However the Edgesense tool was made available to the community, and there do not seem to be that many people looking at it.
The color coding shows the nodes that are more connected to each other than to other members of the group. If you are very active you have “your own” group. For instance Noemi: her group is not exactly a star, but very central.
The links are directional: two nodes can be linked by two curved lines, one from node a to node b, the other in the opposite direction.
You can show nodes by degree (number of connections), or select only the edges that link to and from the moderators. You can also hide them and see that the network is still holding. Some people fall out, but the giant component remains.
This example also helps clarify Ezio’s question above. What is the question that this network representation can answer? The sociology of the network. It helps manage the group. For the network we saw in Tulip, it might be worth spending some time to find a bridge between what the tool can do and the general question “what is open care”. In which ways can it be helpful?
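Hiding the moderators and checking that the network still holds can be expressed as comparing the largest connected component before and after removing them. A sketch on a toy graph; the node names and the single moderator are invented.

```python
def largest_component(nodes, edges):
    """Largest connected component, ignoring edge direction and
    any edge touching a node that is not in `nodes`."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        if a in adjacency and b in adjacency:
            adjacency[a].add(b)
            adjacency[b].add(a)
    best, seen = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Toy network (hypothetical): one moderator talking to several users.
nodes = {"moderator_1", "anna", "bob", "carl", "dora", "eve"}
edges = [
    ("moderator_1", "anna"), ("moderator_1", "bob"), ("moderator_1", "eve"),
    ("anna", "bob"), ("bob", "carl"), ("carl", "dora"),
]
moderators = {"moderator_1"}

before = largest_component(nodes, edges)
after = largest_component(nodes - moderators, edges)
fallen_out = nodes - moderators - after
print(len(before), len(after), sorted(fallen_out))
```

In this toy case eve falls out because she only ever talked to the moderator, while the other four users stay connected.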
This network is updated in real time. What would be a proper time frame for semantic coding? The coding needs some consistency, so the ethnographer will say: you have a conversation, and then we tag it. The project on open care does not include real-time semantic coding. That is a lot of work and would be a big addition to the project, needing its own financing.
Keywords. Quality tracking. Network authoritativeness. You could use the latter to filter what the ethnographer has to code, so the method becomes scalable. Not by using robots (they do not work with ethnography) but by using network math to filter the content.
Author-generated tags can work if implemented with auto-prompting and proper incentives; Harry Halpin has published on this about Delicious. Ethnography would not agree, Alberto believes: we can’t trust it, although we can try it. But the added value is in small numbers, because then it can be used by smaller communities as well.
Ezio: the tool also needs to help develop some intervention. For instance spotting the emergent group of experts could be a useful intervention for the Comune di Milano. We need to develop a narrative like that about the tool we saw in Tulip for Open Care. Alberto believes he has a narrative, a strong hypothesis. Collective intelligence. Small scale collective intelligence. Problems are local, resources are local. So if we have a permanent think-tank on a platform and a cheap way to harvest from it, then we have a tool for a local community to use. You can identify the emergent groups of experts on specific topics and look at them rather than having to make a large scale study.
Instead of going up, going big, I’m trying to go down, go deep, and focus on a small group of people, without drawing a line around them (because then you’d lose the openness).
Ezio: I’ve never posted. It would be different if someone would ask around if they have something to say. Now I am here and I talk to people, I feel committed. But now I understand that we have to write, because if we only talk we cannot record what we say. Alberto: This is the open. You commit to making the things you said available to people you have never even met.
Ezio: we are collectively writing a document. You should know who the contributors are. At a certain point, you can say this is the book we wrote and these people wrote it. The way this book was written has made it possible to work in a certain way. Alberto: yes, in the end, the ethnographic report will say we had this number of people and…
Ezio: if I want to participate I have to write and I have to be very attentive. Alberto: you can also choose to be a marginal author, just have a role along the side. And you still are a coauthor. Also the cook who made the lunch at the Lote5 meeting mattered in the making of the book.
Ezio: co-design process. These tools could be part of the solution. We are now a network, and there could then be a network of nurses, doctors etc. Can this discussion happen in a similar way among care-givers? Costantino: the conversation will not be the result of this group, but of a larger group. But the network of care-givers might not be committed, skilled, or motivated to use an online tool.
As an example, it is possible to use the results of the ethnographic study to feed the discussion at the Comune di Milano. Alberto: Ideally you bridge across. Costantino: this is the hard part. If the discussion is not well documented and published in the coming days, then the online community will miss it. And vice versa. So the tricky part is right here: how can we bring them across to each other? In terms of artifacts, in terms of codesign, our commitment is to at least make transparent what the needs of the people we are meeting are, and how the codesign sessions are organised. Upload the design of an artifact online so that other people can contribute. Many levels: design, conversation… We are going far beyond ER as it is right now. We are not only creating a book or a report, we are also committed to creating real prototypes. So we need to go one step further, so that we have the tools to manage not only the online conversation but also the other levels. But the online conversation is key to all this.
Alberto: we are not trying to invent a new methodology, just trying to do it well. The online community is key because of permanency, findability, linkability…
A discussion about Wikipedia: the major difficulty is that we do not own that platform, so we can’t control it. If I’m in a village and need info, I go to Wikipedia, and ER might use that.
You can’t brief too much about what will emerge in terms of design.
People in Milano have clear problems: where do I leave my kids? You try to solve them, the solutions kind of work, and we try to use this process to make them better.
The process that we are following here is not the prototype. The prototype we talk about is some care solution. Online the local problem in Milano goes global and someone from France can intervene in the conversation. The prototyping part is in Milan, but care issues are similar all over the world.
In the Spot the Future project we saw that the international interaction was very valuable. The capability of transferring the knowledge of solutions found abroad to other contexts is exactly what makes it possible to find new solutions.
Two groups: quality of data and Wikipedia data
Quality of data session
We used Tulip and Detangler to explore a semantic network dataset similar to the one that will be used for Open Care. We filter our data according to two proxies for quality:
- post/comment length as quality: discard all posts and comments under, say, 100 characters. This would keep only thoughtful contributions.
- user PageRank as quality: PageRank and other eigenvector centrality measures are often taken to correlate with authoritativeness. So, we can reduce the network to the contributions made by the highest-ranking individuals.
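Both proxies can be sketched in a few lines. This is not the group's reducing code: the PageRank below is a plain power iteration written from scratch, the damping factor 0.85 and the number of iterations are conventional assumptions, and the toy edges are invented. An edge (a, b) means "a reacted to b", so b receives rank.

```python
def filter_by_length(contributions, min_chars=100):
    """Keep posts and comments with at least `min_chars` characters."""
    return [c for c in contributions if len(c["text"]) >= min_chars]


def pagerank(edges, damping=0.85, iterations=100):
    """Plain power-iteration PageRank on directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            # Nodes without outgoing links spread their rank evenly.
            targets = out_links[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def top_users(rank, fraction=0.15):
    """Names of the highest-ranking fraction of users (at least one)."""
    keep = max(1, round(len(rank) * fraction))
    return set(sorted(rank, key=rank.get, reverse=True)[:keep])


# Toy data (hypothetical).
contributions = [
    {"author": "anna", "text": "x" * 120},
    {"author": "bob", "text": "+1"},
]
edges = [("bob", "anna"), ("carl", "anna"), ("dora", "anna"), ("anna", "bob")]

long_enough = filter_by_length(contributions)
rank = pagerank(edges)
best = top_users(rank)
print(len(long_enough), sorted(best))
```

The two filters can be applied separately or combined; which one better preserves the semantic network is what the exercise is meant to find out.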
The hypothesis is that the reduced semantic social networks would not be much different from the non-reduced ones. If that is confirmed, the methodology is scalable: if you get a ton of content, just filter it for quality, and work only on the top-quality 15%.
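One possible way to make "not much different" measurable is to compare which tag pairs co-occur in the full and in the reduced data. The Jaccard index used here is an assumption chosen for simplicity, not the measure the group used, and the contributions are invented.

```python
from itertools import combinations


def tag_cooccurrence(contributions):
    """Set of tag pairs that appear together on at least one contribution."""
    pairs = set()
    for contribution in contributions:
        for a, b in combinations(sorted(set(contribution["tags"])), 2):
            pairs.add((a, b))
    return pairs


def jaccard(a, b):
    """Share of elements common to both sets (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical coded contributions with their text length in characters.
full = [
    {"text_len": 450, "tags": ["care", "funding"]},
    {"text_len": 30, "tags": ["care", "funding"]},
    {"text_len": 300, "tags": ["care", "open-source software"]},
    {"text_len": 20, "tags": ["funding", "licensing"]},
]
reduced = [c for c in full if c["text_len"] >= 100]

similarity = jaccard(tag_cooccurrence(full), tag_cooccurrence(reduced))
print(round(similarity, 2))
```

A value close to 1 would support the hypothesis; in this toy case one tag pair is lost because it only appears on a short comment.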
We walked through the exercise and found that our reducing code works. However, we found inconsistencies in the data (many comments and posts with apparently no text – though, when we checked them on the Edgeryders platform, of course the text was there!). We do not feel confident to draw any conclusion. The group resolved to redo the exercise starting from a fresh extraction. Guy Melançon is taking the lead for this.
Wikipedia data session
We made a sketch of a graphic representation of the medical articles on Wikipedia in all languages. The larger the node, the more often a page has been visited; links represent the hyperlinks between pages. We use this representation to explore the data and look at emerging patterns. For instance, we will compare the structure of the networks in the different languages to see if there might be cultural or geographical properties that can lead to interesting research questions.
The code for the Wikipedia session can be found here: https://github.com/spaghetti-open-data/visualizing-self-diagnosis