Using the edgeryders.eu APIs

This topic is a linked part of a larger work: “Discourse Admin Manual”

Content

1. Overview

2. API access

3. Custom API endpoints

4. API client registry


1. Overview

1.1. API Overview

All relevant information in the edgeryders.eu database is accessible via APIs. We support the following APIs:

  • Public Discourse API. This API gives access to everything that can be viewed as a non-logged-in visitor on edgeryders.eu. No API key or user account is required. For the API documentation, see docs.discourse.org.

  • Protected Discourse API. All Discourse content that can only be viewed with a user account requires an API key to access it. See below on how to get an API key for your user account. The key gives you access to whatever your user can access, for example some protected categories. Some data, such as user e-mail addresses, requires the API key of a moderator user. For the API documentation, again see docs.discourse.org.

  • Protected custom API. This API is custom-made for edgeryders.eu and gives access to “secondary content” (the codes and annotations of Open Ethnographer), and to the data collected with our “ethical consent funnel” form. Access requires a user with access to this data, and a Discourse Admin API key for that user. For details, see section 2. API access.

1.2. Tips and tricks

API rate limiting. API access is throttled to avoid server overload. The limits are set in config/discourse.conf and are currently:

  • max_user_api_reqs_per_minute = 20 (default)
  • max_user_api_reqs_per_day = 2880 (default)
  • max_admin_api_reqs_per_key_per_minute = 300 (default was 60)

After adapting any of these limits, you have to restart the relevant Discourse process with, for example, sudo monit restart puma_discourse_production.
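
A client can stay below these limits by spacing out its own requests. The following is a minimal sketch, not part of the Edgeryders code: the Throttle class and its parameter names are made up for illustration, and it only enforces a per-minute limit on the client side.

```python
import time


class Throttle:
    """Space out calls so that at most `max_per_minute` happen per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        # Minimum number of seconds between two consecutive requests.
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


# With max_user_api_reqs_per_minute = 20, requests are spaced 3 seconds apart.
throttle = Throttle(max_per_minute=20)
```

Call throttle.wait() before each API request. Note that the daily limit of 2880 requests corresponds to a sustained rate of only 2 requests per minute, so long-running jobs under a User API key may need Throttle(max_per_minute=2) instead.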

Filtering by category with the API. In Edgeryders, we often want to look at content in a specific category. Discourse supports two levels of categories: top-level and sub-level categories. When retrieving content in a category, the API will by default also return all topics in all of its subcategories. If you want to exclude some subcategories (for example, you might want “everything in the category except what is in the Workspace subcategory”), proceed as follows:

  1. In a browser, visit the subcategory you want to exclude, and click on any of its topics. For example, visit https://edgeryders.eu/c/ioh/workspace, then https://edgeryders.eu/t/10334.

  2. Add .json to the topic’s URL in your browser. You will see the API response for that topic. It helps to have a JSON-prettifying add-on in your browser.

  3. Locate the category_id key in the JSON, and take note of its value (an integer).

  4. Adapt your code accordingly. For example, if you are trying to count the topics in a certain category, except those in its workspace, you can do something like this:

    # 328 is the category ID of /ioh/workspace
    if topic['category_id'] != 328:
      number_of_topics += 1
    

As an alternative to excluding the categories you don’t want, you can add up the topics from each sub-category you want, plus the topics that sit directly in the top-level categories you want (without their sub-categories). To get only the topics directly in a top-level category, append /none to its URL, for example https://edgeryders.eu/c/ioh/none.
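
The exclusion approach can be wrapped in a small helper. This is a sketch under assumptions: count_topics is a hypothetical function name, and the sample topics (including category IDs 100 and 101) are invented for illustration. Only the category_id key and the ID 328 come from the steps above.

```python
def count_topics(topics, excluded_category_ids=()):
    """Count topics whose category_id is not in the excluded set."""
    excluded = set(excluded_category_ids)
    return sum(1 for topic in topics if topic["category_id"] not in excluded)


# Invented sample data; real topic records contain many more keys.
sample_topics = [
    {"id": 1, "category_id": 328},  # in /ioh/workspace, to be excluded
    {"id": 2, "category_id": 100},
    {"id": 3, "category_id": 101},
]

print(count_topics(sample_topics, excluded_category_ids=[328]))  # prints 2
```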

1.3. Python library

We wrote a small Python module containing a collection of functions to fetch Edgeryders content, consent data and Open Ethnographer codes and annotations via API.

2. API access

For access to the edgeryders.eu APIs, you need the following:

  • Public Discourse API. You do not need an API key for the public Discourse API, see above.

  • Protected Discourse API. You need either a Discourse User API key (which you can generate yourself) or a Discourse Admin API key. With that key, you can access everything that the associated user can access.

  • Protected Custom API: ethical consent data. You need a Discourse Admin API key of a Discourse moderator or admin user.

  • Protected Custom API: Open Ethnographer data. You need a Discourse Admin API key of any user who has access to Open Ethnographer.

If you do need a Discourse Admin API key, ask @alberto or @matthias (or another edgeryders.eu Discourse admin) to create one for you. You cannot create it yourself, and you should save it somewhere because you cannot look it up in your Discourse account later.
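
Once you have a Discourse Admin API key, a request to the protected Discourse API could be prepared as in the following sketch. It assumes the Api-Key and Api-Username request headers described in the Discourse API documentation; build_api_request is a hypothetical helper name, and the key and username shown are placeholders.

```python
import urllib.request


def build_api_request(url, api_key, api_username):
    """Prepare a GET request that authenticates with Discourse API headers."""
    request = urllib.request.Request(url)
    request.add_header("Api-Key", api_key)
    request.add_header("Api-Username", api_username)
    request.add_header("Accept", "application/json")
    return request


# Placeholders: insert your own key and the username the key belongs to.
request = build_api_request(
    "https://edgeryders.eu/latest.json", "YOUR_API_KEY", "your_username"
)
```

To send the request, pass it to urllib.request.urlopen(request) and parse the response body as JSON.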

For Discourse admins, this is how you create Discourse Admin API keys:

  1. Go to “Admin → Users” and select the target user.
  2. Scroll down to find “Permissions” and under it “API key”.
  3. Click the Generate Key button.
  4. Tell the user their API key via a secure channel (for example, an encrypted Matrix chat).

3. Custom API endpoints

3.1. Ethnographic codes

The Open Ethnographer codes API endpoint is accessible at https://edgeryders.eu/annotator/codes.json.

A standard response looks like this:

{
    "id": 13,
    "description": null,
    "creator_id": 3323,
    "created_at": "2017-09-05T16:09:52.870Z",
    "updated_at": "2017-09-05T16:09:52.870Z",
    "ancestry": null,
    "names": [
      { "name": "accessible laboratories", "locale": "en" },
      { "name": "zugängliche Laboratorien", "locale": "de" }
    ]
}

Open Ethnographer supports code hierarchies. The ancestry field returns the parent code of the code at hand.

Codes can be filtered by creator like this:

https://edgeryders.eu/annotator/codes.json?creator_id=3323

You can request info about a single code with its ID like this:

https://edgeryders.eu/annotator/codes/13.json
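
Since each code carries its names in several locales, client code usually needs to pick one. A minimal sketch follows, with code_name as a hypothetical helper and the sample record copied from the response above:

```python
def code_name(code, locale="en"):
    """Return the name of a code in the given locale, or None if missing."""
    for entry in code.get("names", []):
        if entry["locale"] == locale:
            return entry["name"]
    return None


sample_code = {
    "id": 13,
    "description": None,
    "creator_id": 3323,
    "ancestry": None,
    "names": [
        {"name": "accessible laboratories", "locale": "en"},
        {"name": "zugängliche Laboratorien", "locale": "de"},
    ],
}

print(code_name(sample_code))        # prints: accessible laboratories
print(code_name(sample_code, "de"))  # prints: zugängliche Laboratorien
```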

3.2. Ethnographic annotations

The annotations endpoint is accessible at https://edgeryders.eu/annotator/annotations.json.

A standard response looks like this:

[
  {
    "id": 5579,
    "version": "v1.0",
    "text": null,
    "quote": "comfortable for the patient.",
    "uri": "/post/33751",
    "created_at": "2017-05-22T19:13:10.000Z",
    "updated_at": "2017-05-22T19:13:10.000Z",
    "code_id": 342,
    "post_id": 33751,
    "creator_id": 3323
  },
 ...
]

Fields have been named in such a way that their meaning is self-evident. The one exception is text: this is a comment field that ethnographers can use to explain their thinking behind the annotation.

The following GET parameters can be used to filter annotations in the response:

  • topic_id: Return annotations which belong to the given topic.
  • post_id: Return annotations which belong to the given post.
  • creator_id: Return annotations which were created by the given user.
  • discourse_tag: Return annotations which are tagged with the given Discourse tag. The tag’s name, rather than the tag’s ID, needs to be passed in. A list of available discourse tags can be found here: https://edgeryders.eu/tags
  • code_id: Return annotations which are tagged with the given Open Ethnographer code.

Filter parameters can be combined as needed. For example, to return all annotations that belong to a certain topic and were created by a specific user:

https://edgeryders.eu/annotator/annotations.json?topic_id=111&creator_id=222

The output is paginated. By default, one page contains at most 100 annotations. This can be changed with the per_page GET parameter:

https://edgeryders.eu/annotator/annotations.json?per_page=200

If there are more annotations available than the given per_page limit, they can be accessed on subsequent pages by using the page GET parameter:

https://edgeryders.eu/annotator/annotations.json?page=2

If no further annotations are available, an empty array is returned for that page.
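
The pagination behavior described above suggests a simple loop: request page after page until an empty array comes back. The following is a sketch under assumptions: fetch_page and fetch_all_annotations are hypothetical names, page numbering is assumed to start at 1, and authentication is left out of fetch_page for brevity.

```python
import json
import urllib.parse
import urllib.request

ANNOTATIONS_URL = "https://edgeryders.eu/annotator/annotations.json"


def fetch_page(params):
    """Fetch one page of annotations (network call; authentication omitted)."""
    url = ANNOTATIONS_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def fetch_all_annotations(filters=None, per_page=100, fetch=fetch_page):
    """Collect annotations from all pages until an empty page is returned."""
    annotations = []
    page = 1  # assumption: the first page is page=1
    while True:
        params = dict(filters or {}, per_page=per_page, page=page)
        batch = fetch(params)
        if not batch:
            break
        annotations.extend(batch)
        page += 1
    return annotations
```

For example, fetch_all_annotations({"topic_id": 111, "creator_id": 222}) would collect all annotations for that topic and creator. The fetch argument can be replaced with any function that takes a parameter dictionary and returns a list, which makes the loop easy to test without network access.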

3.3. Ethical consent data

The Edgeryders platform has a feature called the ethical consent funnel. It is accessible by API as described below, and (after fixing #194) on the user admin pages as field edgeryders_consent.

When a user tries to post for the first time, the ethical consent funnel is served as a popup form: it asks users to answer some questions before they are able to post in certain categories. They can only proceed past the form when they have answered its questions correctly. When a user answers the questions correctly, the platform updates the value of a field called edgeryders_consent. We interpret this sequence of events as the user having given informed consent to participating in a research project with Edgeryders, and having understood the nature of their part in the exercise.

Question definitions. The wording of the consent funnel questions and answers is contained in consent.hbs.

Data access by API. The data of field edgeryders_consent is accessible by JSON API at https://edgeryders.eu/admin/consent.json in conjunction with a suitable API key, which you can supply as an api_key GET parameter. In addition, currently you have to supply a request header Accept: application/json as a workaround for #212. Also note that this API endpoint does not support pagination: it provides the consent data for all users in one response.

Basic values. The interpretation of the edgeryders_consent field values as seen by JSON API access is as follows:

  • "edgeryders_consent": "1": User has given consent. Includes valid consent given on the Drupal platform that was later imported (with the consent timestamp reflecting the import time).

  • "edgeryders_consent": null: User has not gone through the consent funnel yet. This is true for many of the earlier users of Edgeryders, as the consent funnel was only fully implemented in July 2017.

    (In the Discourse database, the value is not actually NULL but a logical equivalent: the record in table user_custom_field for this user with name = edgeryders_consent will be missing.)

Additional values. In fall 2017, we attempted to obtain consent from 191 users who had contributed to OpenCare before 2017 (but never after). For this, we created three new values for the edgeryders_consent field:

  • "edgeryders_consent": "0": User was re-contacted after contributing to Edgeryders before July 2017, and has denied consent.

  • "edgeryders_consent": "no answer": User was re-contacted after contributing to Edgeryders before July 2017, but failed to answer after repeated attempts.

  • "edgeryders_consent": "unreachable": User was re-contacted after contributing to Edgeryders before July 2017, but the e-mail address they used to create the Edgeryders account is no longer active.
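
The field values listed above can be mapped to readable labels in client code. This is a sketch under assumptions: the helper names are hypothetical, and the shape of a user record (a dictionary with an edgeryders_consent key) is assumed, since the exact structure of the consent.json response is not described here.

```python
# Meanings of the edgeryders_consent values, as documented above.
CONSENT_MEANINGS = {
    "1": "consent given",
    None: "has not gone through the consent funnel",
    "0": "re-contacted, consent denied",
    "no answer": "re-contacted, no answer after repeated attempts",
    "unreachable": "re-contacted, e-mail address no longer active",
}


def describe_consent(value):
    """Return a readable label for an edgeryders_consent value."""
    return CONSENT_MEANINGS.get(value, "unknown value: %r" % (value,))


def has_consented(user):
    """True only if the (assumed) user record carries the value "1"."""
    return user.get("edgeryders_consent") == "1"


print(describe_consent("1"))   # prints: consent given
print(has_consented({"edgeryders_consent": None}))  # prints: False
```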

3.4. Multisite account creation

This allows authorized external applications, including JavaScript web applications, to create active Discourse accounts by API, which can then be used by these applications to post content to Discourse in the name of the new user. This is useful for various onboarding, survey and data collection purposes.

Endpoint URL. Since we use a single-sign-on (SSO) system where one Discourse account works on all of our Communities sites (see the top right menu!), the endpoint is provided on our login site because that is the SSO provider where the “master” record of each account is created:
https://communities.edgeryders.eu/multisite_account.json

Request type. GET (Will later be changed to POST once we have figured out how to set up the right CORS policy in Discourse. GET is not ideal as it should not be used for requests that cause state changes.)

Parameters. The possible request parameters are:

  • email: Required. The e-mail address that the user wants associated with the new account. If an account with that e-mail address already exists, account creation will fail.

  • username: Required. The username to use for the new account. If an account with that username already exists, account creation will fail. It is therefore advisable to check username availability beforehand using the Discourse public API (https://communities.edgeryders.eu/u/username.json), or to auto-generate a username that will most likely not exist yet. The user may change it later inside Discourse.

  • password: Required. The password to set for the new account that will be created.

  • accepted_gtc: Optional. true or false, referring to the GTCs of the Edgeryders Communities platforms. Assumed as false when not provided, which will result in an error message.

  • accepted_privacy_policy: Optional. true or false, referring to the Privacy Policy of the Edgeryders Communities platforms. Assumed as false when not provided, which will result in an error message.

  • edgeryders_research_consent: Optional. true if the user passed the Edgeryders Consent Funnel or equivalent questions, giving informed consent to the use of their content for research; false otherwise. Assumed as false when not provided. true is required only when requesting an API key for edgeryders.eu.

  • requested_api_keys: Required. Non-empty list of the domains of Edgeryders Communities sites for which the caller requests a Discourse Admin API key. Separate multiple values with whitespace.

  • auth_key: Required. A shared secret without which access to this API endpoint will be prohibited. The currently active auth_key is available in a protected page.

    Background info

    Since this system should result in published content before the user has confirmed their e-mail address, it is a good target for spam submissions and needs some form of authentication. auth_key is a shared secret to prevent automated spam submissions. The idea is to distribute it in a limited way and only to trusted parties, and to disable it once it has reached untrusted parties who start using it for spam submissions. This seems to be the best approach, as an external web application cannot be trusted when it says “I have let the user go through a good captcha” and as we don’t want to build a captcha-via-API system (and captchas are annoying anyway).

Example request. https://communities.edgeryders.eu/multisite_account.json?email=testuser2@example.com&username=testuser2&password=verysecretpassword123&accepted_gtc=true&accepted_privacy_policy=true&edgeryders_research_consent=true&requested_api_keys=edgeryders.eu&auth_key=8342……3274
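
A request URL like the one above is best assembled with proper URL encoding, for example because e-mail addresses contain an @ sign and multiple requested_api_keys values are separated by whitespace. A minimal sketch, in which build_account_url is a hypothetical helper and all argument values are placeholders:

```python
from urllib.parse import urlencode

ENDPOINT = "https://communities.edgeryders.eu/multisite_account.json"


def build_account_url(email, username, password, auth_key, sites,
                      accepted_gtc=False, accepted_privacy_policy=False,
                      research_consent=False):
    """Build the GET URL for a multisite account creation request."""
    def flag(value):
        return "true" if value else "false"

    params = {
        "email": email,
        "username": username,
        "password": password,
        "accepted_gtc": flag(accepted_gtc),
        "accepted_privacy_policy": flag(accepted_privacy_policy),
        "edgeryders_research_consent": flag(research_consent),
        # Multiple site domains are separated by whitespace.
        "requested_api_keys": " ".join(sites),
        "auth_key": auth_key,
    }
    return ENDPOINT + "?" + urlencode(params)


url = build_account_url(
    "testuser2@example.com", "testuser2", "verysecretpassword123",
    auth_key="YOUR_AUTH_KEY", sites=["edgeryders.eu"],
    accepted_gtc=True, accepted_privacy_policy=True, research_consent=True,
)
```

The consent flags default to false so that the calling application has to pass true explicitly, and only after the user has actually agreed.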

Response. A typical response for successfully creating an account would look like this, basically an abridged version of the user records that are normally returned by Discourse:

{
  "id": 5,
  "username": "new-username",
  "email": "username@example.com",
  "active": false,
  "created_at": "2019-09-05T08:35:01.000Z",
  "username_lower": "new-username",
  "trust_level": 0,
  "api_keys": [
    { "site": "edgeryders.eu", "key": "sgev47…fdffd0" }
  ]
}

As seen from the "active": false field, the account on communities.edgeryders.eu is not yet active. When the user tries to log in for the first time, she will be asked to activate the account by requesting a link sent to her e-mail address and clicking on it. However, the accounts created on Communities sites alongside this master account are already set to active. This makes it possible to use the returned Admin API keys without further steps and also makes sure the user is notified by e-mail about replies to the content she posted. Because the Communities accounts are already active, there is a slight risk that e-mails are sent to another person’s e-mail address if the user did not enter her own. If that turns out to be a problem for some applications, we can provide a parameter to switch this behavior on or off, so that creating active accounts would only be possible with some but not all auth_keys.

In the case of an error, the error status and message as provided by Discourse for an account creation error will be returned. This happens when:

  • An API endpoint parameter is missing or wrong. See the source code for details on the possible error messages that can happen.
  • An account with the given e-mail address already exists.
  • Other Discourse account creation errors.

Typical usage. A typical usage of this API endpoint would look like this:

  1. Content preparation. A client application of this API would first collect the content or data it wants to post to Discourse under a user’s name, and compile it into a Discourse post format.

  2. E-mail address input and consent. On the last screen of the content collection form, the user is asked to enter an e-mail address and to confirm to the Edgeryders ethical consent funnel, terms and conditions and privacy policy.

  3. Get the auth_key. The application needs a valid auth_key value to be granted access to the multisite_account.json API endpoint. This can come from several sources. In the simplest case, it would be written next to the screen on a survey computer and the user would enter it into the application. Or it can be provided as a GET parameter in a link (from social media etc.). Or it can be served from a configuration file that is publicly accessible to the JavaScript application from a server but not included in a source code repository.

  4. Request a new Discourse account. Now the web application will make a request to multisite_account.json as specified above.

  5. Get the new Discourse account. The Discourse serverside code would do the following: (1) confirm that auth_key is a valid token, (2) make sure that email and (if given) username are not yet associated with an account, (3) make sure the user gave the required consents, (4) create a new account on the SSO provider site, (5) sync that new account to the Discourse Communities sites for which API keys are requested, (6) create and collect the requested API keys on the Communities sites, (7) send a reply with all necessary information to the API client, as specified above.

  6. Show an account summary page. The web application would show a page saying that the content was successfully published and also providing information about the user’s new Discourse account (site URL, username, password, e-mail address). The user would be asked to take a photo or otherwise take note of this information but also be told that she can always reset the password after entering her e-mail address.

  7. Error handling. If the account could not be created, for example because an account using that e-mail address already exists, the web application should show the user’s text input and ask the user to either (1) provide a new e-mail address to post under or (2) log in to Discourse and copy and paste the text into a new topic there. The web application could also provide the option to log in to Discourse and post the text under that identity (after getting a Discourse User API key for that).

  8. Use the API key to post to Discourse. At this point, the API client application has a Discourse Admin API key of a new Discourse user and can use that to do actions on behalf of that user, via the normal Discourse API. It can then use it to create a new topic authored by that user.

    Note that the new user starts as a TL0 user, while all users who sign up on edgeryders.eu manually and confirm their e-mail address start as TL1 users. This is a spam protection measure and may lead to some issues if the user tries to post many links or images.

  9. Log in and forward to Discourse (later). This is for later, not for the use case as a survey system. At the end of this process, there would be a link to bring users directly to Discourse, to the topic they just created, allowing them to interact with other users as a normal Discourse user. The special part here would be that they end up on Discourse in logged-in state, without having to go through the communities.edgeryders.eu login site. To make that work, the web application would simply submit the username and password to communities.edgeryders.eu and get a login cookie in return (assuming proper CORS policy handling). Then it would forward the user to edgeryders.eu/login, which means that the user will end up there in logged-in state. We use that trick at the end of the communities.edgeryders.eu login process already. To bring the user to their own topic automatically, we could extend Discourse with an “edgeryders.eu/login?redirect=/t/…” mechanism.
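
Steps 5 and 8 of the list above can be sketched as follows: pick the API key for a site out of the account creation response, then prepare a request that creates a topic in the new user’s name. This assumes the standard Discourse endpoint POST /posts.json with title and raw fields and the Api-Key and Api-Username headers from the Discourse API documentation. The helper names, the key value and the topic text are placeholders.

```python
import json
import urllib.request


def api_key_for(account, site):
    """Return the API key for `site` from an account creation response."""
    for entry in account.get("api_keys", []):
        if entry["site"] == site:
            return entry["key"]
    return None


def build_topic_request(site, api_key, username, title, raw, category_id=None):
    """Prepare a POST request that creates a topic as the given user."""
    payload = {"title": title, "raw": raw}
    if category_id is not None:
        payload["category"] = category_id
    request = urllib.request.Request(
        "https://%s/posts.json" % site,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
    )
    request.add_header("Content-Type", "application/json")
    request.add_header("Api-Key", api_key)
    request.add_header("Api-Username", username)
    return request


# Abridged response with a placeholder key instead of a real one.
account = {
    "username": "new-username",
    "api_keys": [{"site": "edgeryders.eu", "key": "PLACEHOLDER_KEY"}],
}
key = api_key_for(account, "edgeryders.eu")
request = build_topic_request(
    "edgeryders.eu", key, account["username"],
    title="My survey response", raw="Text collected by the web application.",
)
```

Sending the request with urllib.request.urlopen(request) would create the topic. Discourse may reject titles or bodies that are too short, so the client should be prepared to handle an error response.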

Source. We provide the source code of this API endpoint under an open source licence.

4. API client registry

Our custom APIs frequently change due to improvements and refactorings. To avoid the complexity of legacy APIs or API endpoint versioning, our method of change management is this API client registry. When you register your client application here, you will be notified when an API endpoint it uses changes. Registration is optional, but you’ll have to track API changes here in the manual if you don’t register. To register, simply edit this wiki. API client means an instance of a software; one application could be run in several instances.


Thanks for your work @matthias and @daniel. I have a few questions.

If I understand this correctly, this could cause some complications.

Our proposed architecture was that a JavaScript front end application would post to the platform by talking to the API. However, if the front end application needs to pass an auth key to do so, that key can’t be stored in a client-side web app without leaving it exposed in the browser. Having a second micro backend service which keeps the key doesn’t sound like it will solve anything either since that could just as easily be abused as the first one. And if the key is kept on a publicly accessible server, that is not any different than just keeping the key in the code. Am I misunderstanding something?

In this request, there is a password parameter which is not in the documentation. Which password is this?

An admin API key? Since the suggested architecture is a client-side web app, that key would then be accessible by the end-user. Can that key be used to make changes on behalf of other users too, or just the user in question?

No, it’s fine, you can store it there. The sole purpose of auth_key is spam protection, not any strong authentication. (Should we name it antispam_token maybe?)

multisite_account.json is a public API, just as signup is a public function in Discourse. To protect against 98% of spam, we just don’t want the API to be completely unprotected, and we want a way to revoke access selectively. That’s why we have this shared secret mechanism. It will be more ephemeral in the future: a key would be used in one campaign and then replaced by another etc. … see a more detailed proposal. Just as it takes spammers some time to pick up a new e-mail address, it will take them some time to pick up a link with a new key. Won’t be perfect but I think, pretty good.

Not in terms of being public, no. But a key in a config file is simpler to modify on the server, as no code deployment is needed. That helps when the keys will be more ephemeral.

Just the user in question. Discourse has two types of API access: User API and Admin API. Bit of a misnomer. Anything that uses an API key in Discourse uses the “Admin API” type of access, and anything that uses interactive authentication of an API client (“connecting an app to my account” style) uses the User API. Admin API access does not give a user admin access (unless the user actually is an admin).

The password of the new account to create. Fixed this in the documentation now.
