Heads up: user API not working + preparing for data publication

@Jason_Vallet , this view seems not to be working (you renamed it, but it does not seem to be normalized):

https://edgeryders.eu/en/opencare/users

I am not sure we should publish this file at all; it kind of defeats the point of anonymization.

Also note that I am building anonymized versions of the APIs and then exporting the dumps for upload to Zenodo. For example, https://edgeryders.eu/en/opencare/comments-anonymized is the same as https://edgeryders.eu/en/opencare/comments , except for the “user_name” field.
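For anyone who wants to reproduce this on an already downloaded dump, here is a minimal sketch in Python. It assumes the records have been parsed into a list of dicts and that anonymizing means dropping the “user_name” key; the `anonymize` helper and the sample records are hypothetical, not what the views do internally.

```python
def anonymize(items, fields=("user_name",)):
    """Return copies of the records with the given fields removed."""
    return [
        {key: value for key, value in item.items() if key not in fields}
        for item in items
    ]


# Hypothetical sample records standing in for a parsed comments dump.
comments = [
    {"user_id": "1", "user_name": "alice", "title": "A comment"},
    {"user_id": "2", "user_name": "bob", "title": "Another comment"},
]

anonymized = anonymize(comments)
print(anonymized[0])
```

The original records are left untouched, so the same dump can feed both the full and the anonymized export.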

Ping @melancon

Also, why “labels”?

@Jason_Vallet : both the opencare/comments and the opencare/content views have a field called “label”, which is the same as the “Title”. Can we get rid of one of them?

Killed it

I already did it. I can’t think of a reason why we should output the same field twice :slight_smile:

About the User API anonymisation and its fields names

@Alberto: There are several points to clarify concerning the user view and how we use it:

  • First of all, as you may have noticed, the number of users returned by the view (working once more) is limited to 44 users, which I am guessing is the core of the opencare consortium (?). So this only concerns a small set of users, which means that, on our visualisation platform, the rest of the users are created "on-the-fly" when going through the posts, comments, tags and annotations. These latter users are only identified by a given id and their given name.
  • The user view gives a lot of so-called "personal information" (like facebook, twitter, website, location..., which I do not really consider worthy of interest and yet left in anyway). On the other hand, every single piece of information we show in this view (or the others) is also available to everyone via the website. If somebody really wants it, a simple web scraper will certainly do the same job in roughly 10 minutes. So, that being said, anonymising the views will only make such a task a little bit harder.
  • Finally, concerning the use of the user's name, no big secret there either, considering anybody can infer who the author of a post or comment is based on the messages' titles. So unless we anonymise everything (like the message title, content and post date, and add some "noise" and "salt"), we do not truly anonymise anything.

About the “label” field, it is more of a convenience used to indicate what information I want to pass as a visual label on the graph, so there is no need to use it in your case.

TL;DR and to conclude: the user view does not contain essential information and is incomplete, so there is no need for it on Zenodo. Removing the “user_name” field is a good idea but ultimately does not ensure anonymisation in this case.

Ok, done

Thanks @Jason_Vallet , you rock.

One thing though: as I told Guy, I think you should scrap this users view with the 44 users. The data model used by edgeryders.eu is no longer based on the Drupal groups. So, the set of users participating in the opencare conversation should be derived as follows:

  • Call the content and comments APIs.
  • Iterate over the items, and save each distinct user_id and its user_name in a list.
  • Compute the length of the list.

At the moment, this list is 200 users long.
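The derivation above can be sketched in Python as follows. This is only an illustration: it assumes the content and comments responses have already been parsed into lists of dicts carrying “user_id” and “user_name” keys, and the `collect_users` helper and sample records are hypothetical. Users are keyed by id so that someone who wrote several posts or comments is counted once.

```python
def collect_users(*item_lists):
    """Return a dict mapping user_id to user_name across all item lists."""
    users = {}
    for items in item_lists:
        for item in items:
            user_id = item.get("user_id")
            if user_id is None:
                continue  # skip items that carry no author information
            # Keep the first name seen for each id; later duplicates are ignored.
            users.setdefault(user_id, item.get("user_name"))
    return users


# Hypothetical sample records standing in for the parsed API responses.
content = [
    {"user_id": "1", "user_name": "alice", "title": "First post"},
    {"user_id": "2", "user_name": "bob", "title": "Second post"},
]
comments = [
    {"user_id": "2", "user_name": "bob", "title": "Reply"},
    {"user_id": "3", "user_name": "carol", "title": "Another reply"},
]

users = collect_users(content, comments)
print(len(users))  # number of distinct participants
```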