As a part of my quest towards designing for emergent social dynamics in online conversations, I got involved in an EU project on collective intelligence called CATALYST (wearing my Wikitalia hat). I head a small team of two working on – you guessed it – network analysis for the masses. Specifically, we are making a Drupal module for social network analysis of online conversations, based on Python, NetworkX, sigma.js and D3. @Luca Mearelli is doing the heavy lifting. We call it Edgesense – the name was chosen by Luca, and has nothing to do with Edgeryders, though it does fit rather neatly.
@Matthias, Luca has produced a working demo that eats JSON files extracted by me using Views Datasource, and we are now looking into integrating it into Drupal. Since we have been using Edgeryders (2012) data to test the demo, he has requested the authorization to set up an exact copy of Edgeryders on his own server to work on the integration. For this he needs:
the Edgeryders platform code
a dump of the Edgeryders database
Can he be accommodated? I take responsibility for data protection. As far as the code is concerned, I believe you maintain a GitHub repository from which the code can be downloaded, so that should be no problem. @La_Gaia, you should like this too: barring accidents, in a month we will have basic network analysis support, updated daily. More or less in the same amount of time we should be able to release an alpha as a Drupal module, so that anybody can use it.
Hi @Matthias, Alberto has already explained what we are working on. It would be really valuable for the development to be able to work on a copy of the Edgeryders site (it’d give us an environment as close as possible to what we’d find in a live, production site), so let me know if you need more information or anything.
Nice clean and informative interface … I think I understand how to use it. Bottom right graph is probably “community’s share of content”?
I can provide a database dump, with limitations to protect the privacy of our community members: I will clean out all e-mail addresses and real names, and replace usernames with pseudonyms. I will also leave only the public groups. The idea is to create a database dump that only contains the Creative Commons licensed stuff, so that we are able to offer it for public download as well: for others to analyze, but also because the database contains most of the Drupal configuration, without which the code repository is pretty useless.
I can provide this stuff on Tuesday. Tell me if I should prioritize it higher.
(For the Drupal integration, if and when the Edgesense interface becomes available to non-admins, note that the non-public content needs to be excluded from these views as well.)
Tuesday is more than adequate. Luca and I will be in Germany for a consortium meeting from Monday to Wednesday, so Thursday is the earliest we can look at it. Incidentally, are you anywhere near Wuppertal? It would be nice to meet for a beer.
The database dump is nearly ready, but not completely (and I don’t know if I can get it finished today … sorry). At least I have a new system for selective database exports now that creates compact 15 MiB database dumps (much better for sharing than the 130 MiB database dump size of our full backups). What’s still missing is proper anonymization of the database content, but I’m working on it …
Don’t worry. We have a new dev road map (see my comment below). The priority is to do a “manual” installation on MT2019 only, or on MT2019 and Edgeryders if you are up for it. There is no need for the database dump to do that.
Of course, Luca still needs the DB dump at some point (and yes, we’ll make it downloadable from GitHub), so your work is much appreciated. Just don’t stress.
I did not get that the dev strategy changed, but even better. Anonymizing is a manual process at this time (need to script it one day), so I can save that effort now and do it later for a more current version – just tell me again when you need it. As a nice side effect of the work on this so far, our database backup system is now more flexible and integrated into Drupal itself (using backup_migrate), so no time was wasted.
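A minimal sketch of how the anonymization step could be scripted one day, assuming user records exported as plain dictionaries. The field names (`name`, `mail`, `real_name`) and the secret key are hypothetical and would need to be mapped to the actual Drupal tables; this is an illustration, not the procedure used for the Edgeryders dump.

```python
import hashlib
import hmac

# Hypothetical secret: it must stay out of the published dump, otherwise
# pseudonyms could be recomputed from known usernames.
SECRET_KEY = b"change-me"


def pseudonym(username: str, key: bytes = SECRET_KEY) -> str:
    """Map a username to a stable pseudonym (same input, same output)."""
    digest = hmac.new(key, username.encode("utf-8"), hashlib.sha256).hexdigest()
    return "user_" + digest[:10]


def anonymize_user(record: dict) -> dict:
    """Return a copy without e-mail and real name, with the username replaced."""
    cleaned = {k: v for k, v in record.items() if k not in ("mail", "real_name")}
    cleaned["name"] = pseudonym(record["name"])
    return cleaned


# Made-up sample records (field names are assumptions, see above).
users = [
    {"uid": 1, "name": "alice", "mail": "alice@example.org", "real_name": "Alice A."},
    {"uid": 2, "name": "bob", "mail": "bob@example.org", "real_name": "Bob B."},
]
anonymized = [anonymize_user(u) for u in users]
```

Because the pseudonym is derived from the username, the same person keeps the same pseudonym across tables, so the conversation network stays intact. Usernames mentioned inside post or comment text are not covered by this sketch and would need separate handling.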
Edgesense is part of a strategy to tool up the Edgeryders community website to do the sort of distributed knowledge work stuff that we like. It may fail, but we are pursuing it with a pure heart, and pouring into it substantial amounts of work. You, @Ben, can now see why I dislike any manoeuvre that pulls interaction away from Edgeryders. Here, content is preserved and re-used to create metacontent (like validation via inference on network structure); elsewhere, it typically is not. I would be glad to tank the damn thing myself and move over to a platform because of better tools for detecting and interpreting emergent social dynamics; but invariably, when people pull away it is because of user experience considerations.
Hah! I still (ok, barely) remember pre-Internet electronic communication, BBSs, dialup connections, installation floppy disks etc. People wanted to reach out, so they put up with all of the gritty tech. How is it that the possibility of adding functionalities like Edgesense is never factored into these conversations? When did we get so spoilt for bloody Ajax?
To access an interactive demo start here. No installation needed, it’s just an html page with JavaScript code. It works on “old” Edgeryders data, so it should make sense to you all.
@Matthias, I respectfully ask whether you would be up for installing a very early version of this. “Very early” means that it would not yet be deeply integrated with Drupal. Installing it would imply a Skype/Hangout with @Luca Mearelli and pip-installing NetworkX. Luca estimates two hours to get it up and running. We would be ready for this in about two weeks (some tweaks to the code needed - you can refer to the development milestones and related issues in the GitHub repo if you are curious about the tweaks. They are all to facilitate layout and interpretation: the code seems solid, and has not yet crashed once on me).
Just played a bit with the demo version: it works nicely, and just the way I expected it to be used. (With one difference: I expected the time slider to have two sliding dots for selecting an arbitrary range. Does that make any sense in terms of network analysis? If so, I would just file an enhancement issue report.)
And sure, I’m up for installing the early version of Edgesense here on the live site (after trying it on dev.edgeryders.eu first, of course). Tell me when you’re ready with the software for it. Of course it would be nice to just do drush dl edgesense && drush en edgesense for installation, but I’m sure Edgesense will get there some day.
Would you find a little time for a chat / hangout early next week, where I could explain a little bit how we might deploy a first early version on Edgeryders?
It’s going to be pretty much a manual thing still, but yes, in the end the target is to automate as much as possible.
RE: the slider, yes, it could make sense. The problem is that it would be too heavy computationally to do for all the possible intervals, which is what I’d need to do with the current architecture, where the metrics are precomputed and the JavaScript just visualizes them. (On the other hand, it’d be very easy to do just for the network representation.)
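To illustrate the trade-off described above, here is a rough sketch under the assumption that the single slider corresponds to one precomputed snapshot per time step. A two-handle slider would need one result per (start, end) pair instead, while filtering the raw interactions for one window is cheap. All names and the sample data are made up for illustration.

```python
from collections import Counter


def count_single_slider_snapshots(n_steps: int) -> int:
    """One precomputed snapshot per slider position."""
    return n_steps


def count_intervals(n_steps: int) -> int:
    """Every (start, end) pair with start <= end on a two-handle slider."""
    return n_steps * (n_steps + 1) // 2


def edges_in_window(interactions, start, end):
    """Weighted edge list for interactions with start <= timestamp <= end."""
    weights = Counter()
    for source, target, timestamp in interactions:
        if start <= timestamp <= end:
            weights[(source, target)] += 1
    return dict(weights)


# Made-up interactions: (commenter, recipient, time step).
interactions = [("a", "b", 1), ("b", "a", 2), ("a", "b", 3), ("c", "a", 5)]
window_edges = edges_in_window(interactions, 2, 3)
```

With 100 time steps, `count_intervals(100)` gives 5050 metric sets to precompute instead of 100, which is why only the network representation (the output of something like `edges_in_window`) looks cheap to support for arbitrary ranges.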
Can I haz urls to user profiles on the nodes in the page here on Edgeryders pleez? It would make it an invaluable network weaving and community-management tool.
@matthias I just spoke with Alberto and I’m ready to try out the edgesense tool with the Edgeryders community. The dashboard (for now) doesn’t need to be installed on the Edgeryders server as it can get the data from it via simple JSON views.
I’m going to create the views needed myself and then I’ll configure the dashboard on a separate server we are using to demo edgesense (in this way the impact on your server is really minimal). Alberto is aware of this setup and agrees with it, but let me know if you have any issues
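As a rough picture of what a dashboard could do with such JSON views, the sketch below turns two JSON documents into a weighted, directed edge list. The field names (`nid`, `uid`) and the edge definition (a comment creates a link from the commenter to the author of the commented post) are assumptions made for illustration; the actual Edgesense views and network definition may differ.

```python
import json
from collections import Counter


def build_edges(nodes_json: str, comments_json: str) -> dict:
    """Build {(commenter_uid, author_uid): number_of_comments}."""
    nodes = json.loads(nodes_json)
    comments = json.loads(comments_json)
    # Look up the author of each post by post id.
    author_of = {node["nid"]: node["uid"] for node in nodes}
    weights = Counter()
    for comment in comments:
        target = author_of.get(comment["nid"])
        # Skip comments on unknown posts and people commenting on their own posts.
        if target is None or target == comment["uid"]:
            continue
        weights[(comment["uid"], target)] += 1
    return dict(weights)


# Made-up sample data standing in for the output of two JSON views.
nodes_json = '[{"nid": 10, "uid": 1}, {"nid": 11, "uid": 2}]'
comments_json = (
    '[{"nid": 10, "uid": 2}, {"nid": 10, "uid": 2},'
    ' {"nid": 11, "uid": 1}, {"nid": 11, "uid": 2}, {"nid": 99, "uid": 3}]'
)
edges = build_edges(nodes_json, comments_json)
```

Note that nothing here needs names or e-mail addresses: numeric user ids are enough to reconstruct the structure of the conversation.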
Agree in principle with this setup, seems like a lightweight & good way to test things out. Just one thing: if the view contains user names, user e-mail addresses or other identifiable information, restrict access to it to logged-in users with the admin user role, or use an equivalent access protection system. While the content is all Creative Commons, we still have to protect user privacy when the data is going to be used for research.
Since the views are called from outside the Drupal site by a bash / Python script (in batch), I think I cannot use the regular authorization rules, so the views have public access (if you have a better suggestion, e.g. how to set up some token or basic auth, I can adapt the script).
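On the script side, sending HTTP basic auth credentials is straightforward; a minimal sketch with Python's standard library follows. The URL, user name and password are placeholders, and it assumes the server has been configured to accept basic auth for the view path, which is not something Drupal's regular role-based access does for a batch script by itself.

```python
import base64
import urllib.request


def build_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Create a request carrying an HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Placeholder values, not real endpoints or credentials.
request = build_request("https://example.org/json/comments", "edgesense-bot", "s3cret")

# The actual fetch would then be:
#     with urllib.request.urlopen(request) as response:
#         data = response.read()
```

Basic auth only encodes the credentials, it does not encrypt them, so this would only make sense over HTTPS.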
To comply with the request I’ve removed the name from the user view (I think that was the only privacy-sensitive data). At the moment I don’t have the actual content in the JSON files, but it would be useful to add it. Would that be ok?
Moreover, I’ve got a problem with the biggest view, the one that dumps the comments: if I don’t limit the rows extracted, the view times out and doesn’t return any data. I think this is something Alberto also noticed while creating a view. At first I thought that his problem was just related to opening a very large view interactively in the browser, but in my case I’m skipping the browser and returning only the JSON. What can we do?
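One generic way around a timeout on a very large result set is to fetch it in pages and stitch the pages together in the script. The sketch below shows only the client-side loop; `fetch_page` is a hypothetical callable, and whether the comments view can actually be configured to serve pages of a fixed size is an assumption that would need checking in Views.

```python
from typing import Callable, Iterator, List


def fetch_all(fetch_page: Callable[[int, int], List[dict]],
              page_size: int = 500) -> Iterator[dict]:
    """Yield rows page by page until a short (or empty) page signals the end."""
    page = 0
    while True:
        rows = fetch_page(page, page_size)
        yield from rows
        if len(rows) < page_size:
            break
        page += 1


# Stand-in for the real HTTP call: serves slices of a made-up comment list.
FAKE_COMMENTS = [{"cid": i} for i in range(1234)]


def fake_fetch_page(page: int, page_size: int) -> List[dict]:
    start = page * page_size
    return FAKE_COMMENTS[start:start + page_size]


all_rows = list(fetch_all(fake_fetch_page, page_size=500))
```

In a real script, `fake_fetch_page` would be replaced by a function that requests one page of the view and parses the JSON it returns.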
At first I tried keeping the content of the nodes/comments in, but it turns out that by removing the content (the comment text) I can also remove the limit on the number of comments. So for now I’m keeping the JSON dumps without any textual content (we’ll analyze just the structure of the network for the moment).