This topic is a linked part of a larger work: “Discourse Admin Manual”
Content
- 3.1. Ethnographic projects
- 3.2. Ethnographic codes
- 3.3. Ethnographic annotations
- 3.4. Ethical consent data
- 3.5. Multisite account creation
- 3.6. Obtaining an API key by API
1. Overview
1.1. API Overview
All relevant information in the edgeryders.eu database is accessible via APIs. We support the following APIs:
-
Public Discourse API. This API gives access to everything that can be viewed as non-logged-in visitor on edgeryders.eu. No API key or user account is required. For the API documentation, see docs.discourse.org.
-
Protected Discourse API. All Discourse content that can only be viewed with a user account requires an API key to access it. See below on how to get an API key for your user account. The key gives you access to whatever your user can access: for example, an ordinary user's key covers the protected categories that user can see, while a moderator user's API key is required to get access to user e-mail addresses. For the API documentation, again see docs.discourse.org.
-
Protected custom API. This API is custom-made for edgeryders.eu and gives access to “secondary content” (the codes and annotations of Open Ethnographer), and to the data collected with our “ethical consent funnel” form. Access requires a user with access to this data, and a Discourse Admin API key for that user. For details, see section 2. API access.
1.2. Python library
We wrote a small Python module containing a collection of functions to fetch Edgeryders content, consent data and Open Ethnographer codes and annotations via API. See:
1.3. Tips and tricks
API rate limiting. API access is throttled to avoid server overload. The limits are set in config/discourse.conf and are currently, for both our edgeryders.eu and multisite Discourse installations:
- max_admin_api_reqs_per_key_per_minute = 300 (default was 60). This is the only relevant limit for scripts accessing the API (as they use an Admin API key). Note that this limit is per API key per minute, summing together the requests to different endpoints that your script may make in that minute.
- max_user_api_reqs_per_minute = 20 (default; only relevant for connected apps)
- max_user_api_reqs_per_day = 2880 (default; only relevant for connected apps)
- max_reqs_per_ip_mode = none (means no additional global rate limit restrictions; default was block)
The best strategy for long-running scripts is to be proactive: calculate the earliest allowable time for the next request so that your script never hits the rate limits. That spaces out requests equally over time, avoiding spikes in the server load. For information on the meaning of the different rate limits, see here.
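The proactive strategy described above could be sketched as follows. This is a minimal illustration, not part of our Python module: the class name `RequestPacer` and its parameters are made up for this example, and the value 300 is simply the admin limit quoted above.

```python
import time


class RequestPacer:
    """Spaces out requests evenly so a per-minute limit is never exceeded.

    Hypothetical helper for illustration only.
    """

    def __init__(self, max_requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_requests_per_minute  # seconds between requests
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time at which the next request may start

    def wait(self):
        """Block until the next request is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
        return now


# Usage: call pacer.wait() immediately before every API request.
pacer = RequestPacer(max_requests_per_minute=300)
```

Because the limit counts all requests made with one API key, a script would share a single pacer across all endpoints it calls.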
After adapting any of the limits above, you have to restart the relevant Discourse process with, for example, sudo monit restart puma_discourse_production.
Filtering by category with the API. In Edgeryders, we often want to look at content in a specific category. Discourse supports two levels of categories: top-level and sub-level categories. When retrieving content in a category, the API will by default also return all topics in all of its subcategories. If you want to exclude some subcategories (for example, you might want “everything in the category except what is in the Workspace subcategory”), you need to do as follows:
- In a browser, visit the subcategory you want to exclude, and click on any of its topics. For example, visit https://edgeryders.eu/c/ioh/workspace, then https://edgeryders.eu/t/10334.
- Add .json to the topic’s URL in your browser. You will see the API response for that topic. It helps to have a JSON-prettifying add-on in your browser.
- Locate the category_id key in the JSON, and take note of its value (an integer).
- Adapt your code accordingly. For example, if you are trying to count the topics in a certain category, except those in its workspace, you can do something like this:
    # 328 is the category ID of /ioh/workspace
    if topic['category_id'] != 328:
        number_of_topics += 1
As an alternative to excluding categories you don’t want, you can also add up topics from all sub-categories you want and then add those from the top-level categories you want, excluding the topics of their sub-categories. This is possible by appending /none to the URL of a top-level category, for example https://edgeryders.eu/c/ioh/none.
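The exclusion approach can be wrapped in a small function. The sketch below works on topic records that have already been fetched. The sample data is invented for illustration (category ID 100 is hypothetical; 328 is the workspace ID mentioned above).

```python
def count_topics(topics, excluded_category_ids=()):
    """Count topics whose category_id is not in the excluded set."""
    excluded = set(excluded_category_ids)
    return sum(1 for topic in topics if topic["category_id"] not in excluded)


# Invented sample records; real ones come from the Discourse API.
sample_topics = [
    {"id": 1, "category_id": 100},
    {"id": 2, "category_id": 328},  # a topic in /ioh/workspace
    {"id": 3, "category_id": 100},
]

print(count_topics(sample_topics, excluded_category_ids=[328]))
```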
2. API access
Required credentials. For access to the edgeryders.eu APIs, you need the following:
-
Public Discourse API. You do not need an API key for the public Discourse API, see above.
-
Protected Discourse API. You need either a Discourse User API key (which you can generate yourself) or a Discourse Admin API key, and then can access with that API key everything that the associated user can access.
-
Protected Custom API: ethical consent data. You need a Discourse Admin API key of a Discourse moderator or admin user.
-
Protected Custom API: Open Ethnographer data. You need a Discourse Admin API key of any user who has access to Open Ethnographer.
If you do need a Discourse Admin API key, ask @alberto or @matthias (or another edgeryders.eu Discourse admin) to create one for you. You cannot create it yourself, and you should save it somewhere safe because you cannot look it up in your Discourse account later.
Admin API key creation. For Discourse admins, this is how you create Discourse Admin API keys:
- Go to “Admin → Users” and select the target user.
- Scroll down to find “Permissions” and under it “API key”.
- Click the Generate Key button.
- Tell the user their API key via a secure channel (for example, an encrypted Matrix chat).
Admin API key usage. Once you have your Discourse Admin API key, you have to supply it in an HTTP header named Api-Key. This mechanism works for both the standard Discourse API and our custom endpoints. Notably, supplying the API key as a GET parameter has no longer worked since 2020-04 (source)! Sending the Api-Username header is not necessary, as it is only used to identify the user to impersonate when using master API keys. See also the official Discourse API access documentation. A working minimal command line example is as follows, to be used with your own Admin API key:
curl -X GET "https://edgeryders.eu/annotator/projects/{project_id}/codes.json" -H "Api-Key: 1fe3…"
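The same kind of request can be made from Python using only the standard library. This is a sketch under assumptions: the helper names `build_api_request` and `fetch_json` are invented here, and `YOUR_ADMIN_API_KEY` is a placeholder. The request is only built, not sent, until you uncomment the last line.

```python
import json
import urllib.request

BASE_URL = "https://edgeryders.eu"


def build_api_request(path, api_key):
    """Build a GET request that carries the API key in the Api-Key header."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Api-Key": api_key, "Accept": "application/json"},
    )


def fetch_json(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_api_request("/annotator/projects.json", "YOUR_ADMIN_API_KEY")
# projects = fetch_json(request)  # uncomment to actually send the request
```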
3. Custom API endpoints
3.1. Ethnographic projects
The Open Ethnographer projects API endpoint is accessible at https://edgeryders.eu/annotator/projects.json.
A standard response looks like this:
{
"id": 3,
"name": "ethno-ngi-forward",
"created_at": "2022-11-29T11:31:22.784Z",
"updated_at": "2022-11-29T11:31:22.784Z",
"codes_count": 1131
}
3.2. Ethnographic codes
The Open Ethnographer codes API endpoint is accessible at https://edgeryders.eu/annotator/projects/{project_id}/codes.json.
A standard response looks like this:
{
"id": 13,
"description": null,
"creator_id": 3323,
"created_at": "2017-09-05T16:09:52.870Z",
"updated_at": "2017-09-05T16:09:52.870Z",
"ancestry": null,
"annotations_count": 1,
"names": [
{ "name": "accessible laboratories", "locale": "en" },
{ "name": "zugängliche Laboratorien", "locale": "de" }
]
}
Open Ethnographer supports code hierarchies. The ancestry field returns the parent code of the code at hand.
Codes can be filtered by creator like this:
https://edgeryders.eu/annotator/projects/{project_id}/codes.json?creator_id=3323
The output is paginated. By default, one page contains at most 100 codes. This can be changed with the per_page GET parameter:
https://edgeryders.eu/annotator/projects/{project_id}/codes.json?per_page=200
If there are more codes available than the given per_page limit, they can be accessed on subsequent pages by using the page GET parameter:
https://edgeryders.eu/annotator/projects/{project_id}/codes.json?page=2
If no further codes are available, an empty array is returned for that page.
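A script can walk through all pages by requesting page after page until the empty array comes back. The sketch below assumes that page numbering starts at 1 (the text only shows page=2 for the second page). The function `fetch_page` is a stand-in that you would replace with a real HTTP call, and `fake_fetch_page` exists only to make the example runnable offline.

```python
def fetch_all_pages(fetch_page, per_page=100):
    """Collect items from a paginated endpoint until an empty page is returned."""
    items = []
    page = 1  # assumption: the first page is page 1
    while True:
        batch = fetch_page(page=page, per_page=per_page)
        if not batch:  # an empty array signals that no further items exist
            break
        items.extend(batch)
        page += 1
    return items


def fake_fetch_page(page, per_page):
    """Offline stand-in for the API: serves 250 dummy items in pages."""
    data = list(range(250))
    start = (page - 1) * per_page
    return data[start:start + per_page]


all_items = fetch_all_pages(fake_fetch_page, per_page=100)
print(len(all_items))
```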
You can also request info about a single code with its ID like this:
https://edgeryders.eu/annotator/projects/{project_id}/codes/13.json
3.3. Ethnographic annotations
The annotations endpoint is accessible at https://edgeryders.eu/annotator/projects/{project_id}/annotations.json
A standard response looks like this:
[
{
"id": 5579,
"version": "v1.0", (Annotator schema version.)
"text": null, (A comment field that ethnographers can use to explain their thinking behind the annotation.)
"quote": "comfortable for the patient.",
"uri": "/post/33751", (Used by Discourse Annotator to load annotation data.)
"created_at": "2017-05-22T19:13:10.000Z",
"updated_at": "2017-05-22T19:13:10.000Z",
"code_id": 342,
"post_id": 33751,
"creator_id": 3323,
"shape": null, (Shape of the annotation like 'rect'. Image or video annotations only.)
"units": "pixel", (Image or video annotations only.)
"geometry": null, (Geometry of the annotation. Image or video annotations only.)
"src": null, (Path to the annotated file. Image or video annotations only.)
"ext": null, (File extension. Image or video annotations only.)
"container": null, (Video annotations only.)
"start": null, (Point in time when the annotation starts. Video annotations only.)
"end": null, (Point in time when the annotation ends. Video annotations only.)
"topic_id": 15866,
"revision_number": null, (Discourse post revision number. All annotations that belong to the same post reference the same post revision.)
"post_creator_id": 7427
},
...
]
The following GET parameters can be used to filter annotations in the response:
- topic_id: Return annotations which belong to the given topic.
- post_id: Return annotations which belong to the given post.
- creator_id: Return annotations which were created by the given user.
- tag: Return annotations which are tagged with the given Discourse tag. The tag’s name, rather than the tag’s ID, needs to be passed in. A list of available Discourse tags can be found here: Edgeryders
- code_id: Return annotations which are tagged with the given Open Ethnographer code.
Filter parameters can be combined as needed. For example, to return all annotations that belong to a certain topic and were created by a specific user:
https://edgeryders.eu/annotator/projects/{project_id}/annotations.json?topic_id=111&creator_id=222
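Query strings like the one above are easy to get wrong by hand, so a script might assemble them with `urllib.parse.urlencode`. The helper name `annotations_url` is invented for this sketch. The accepted parameter names are the filters listed above plus the per_page and page parameters of this endpoint.

```python
from urllib.parse import urlencode

# Parameter names documented for the annotations endpoint.
ALLOWED_PARAMS = ("topic_id", "post_id", "creator_id", "tag", "code_id", "per_page", "page")


def annotations_url(project_id, **params):
    """Build an annotations URL, rejecting parameter names that are not documented."""
    unknown = set(params) - set(ALLOWED_PARAMS)
    if unknown:
        raise ValueError(f"Unsupported parameter(s): {sorted(unknown)}")
    url = f"https://edgeryders.eu/annotator/projects/{project_id}/annotations.json"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


print(annotations_url(3, topic_id=111, creator_id=222))
```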
The output is paginated. By default, one page contains at most 100 annotations. This can be changed with the per_page GET parameter:
https://edgeryders.eu/annotator/projects/{project_id}/annotations.json?per_page=200
If there are more annotations available than the given per_page limit, they can be accessed on subsequent pages by using the page GET parameter:
https://edgeryders.eu/annotator/projects/{project_id}/annotations.json?page=2
If no further annotations are available, an empty array is returned for that page.
3.4. Ethical consent data
The Edgeryders platform has a feature called the ethical consent funnel. It is accessible by API as described below, and (after fixing #194) on the user admin pages as field edgeryders_consent.
When a user tries to post for the first time, the ethical consent funnel is served as a popup form: it asks users to answer some questions before they are able to post in certain categories. They can only proceed past the form when they have answered its questions correctly. When a user answers the questions correctly, the platform updates the value of a field called edgeryders_consent. We interpret this sequence of events as the user having given informed consent to participating in a research project with Edgeryders, and having understood the nature of their part in the exercise.
Question definitions. The wording of the consent funnel questions and answers is contained in consent.hbs.
Data access by API. The field edgeryders_consent is accessible by JSON API at:
- https://edgeryders.eu/admin/consent.json in conjunction with a suitable API key, which you supply in the Api-Key HTTP header (see section 2. API access; supplying it as an api_key GET parameter no longer works). In addition, currently you have to supply a request header Accept: application/json as a workaround for #212. Also note that this API endpoint does not support pagination: it provides the consent data for all users in one response.
- https://edgeryders.eu/u/{username}.json in conjunction with a suitable API key, supplied in the same way. Without an API key, the hash “user → custom_fields” will be empty because that data is access protected. With an API key, you will find a hash key “user → custom_fields → edgeryders_consent” with a value. If you want to obtain the consent information of a larger (>20) number of users, use the consent.json endpoint instead (see above) because it reduces the server load and script runtime considerably compared to making one request per user.
Basic values. The interpretation of the edgeryders_consent field values as seen by JSON API access is as follows:
- "edgeryders_consent": "1": User has given consent. Includes valid consent given on the Drupal platform that was later imported (with the consent timestamp reflecting the import time).
- "edgeryders_consent": null: User has not gone through the consent funnel yet. This is true for many of the earlier users of Edgeryders, as the consent funnel was only fully implemented in July 2017. (In the Discourse database, the value is not actually NULL, but a logical equivalent: the record in table user_custom_field with name = edgeryders_consent will be missing for this user.)
Additional values. In fall 2017 we attempted to elicit consent from 191 users that had contributed to OpenCare before 2017 (but never after). For this, we created additional values for the edgeryders_consent field:
- "edgeryders_consent": "0": User was re-contacted after contributing to Edgeryders before July 2017, and has denied consent.
- "edgeryders_consent": "no answer": User was re-contacted after contributing to Edgeryders before July 2017, but failed to answer after repeated attempts.
- "edgeryders_consent": "unreachable": User was re-contacted after contributing to Edgeryders before July 2017, but the e-mail address they used to create the Edgeryders account is no longer active.
Known issue. For some users, the value of "edgeryders_consent" is not a simple string but a list:
{
"user": {
[...]
"custom_fields": {
"edgeryders_consent": [
"1",
"1"
]
}
}
}
We attribute this to a simple mistake:
It probably means that Discourse treats everything as a multi-value field by default. So you’d keep the last (most recent) value, and ignore the rest.
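A script reading consent data might normalize the field before interpreting it. The following is a sketch, not our module's implementation: the function names and the wording of the descriptions are invented here, while the values and the "keep the last value" rule come from the text above.

```python
# Meanings of the documented edgeryders_consent values (descriptions paraphrased).
CONSENT_MEANINGS = {
    "1": "consent given",
    "0": "consent denied",
    "no answer": "re-contacted, no answer",
    "unreachable": "re-contacted, e-mail address no longer active",
    None: "consent funnel not completed",
}


def normalize_consent(value):
    """Reduce a multi-value field to its last (most recent) entry."""
    if isinstance(value, list):
        return value[-1] if value else None
    return value


def describe_consent(user_json):
    """Interpret the consent field of a /u/{username}.json style response."""
    custom_fields = user_json.get("user", {}).get("custom_fields", {})
    value = normalize_consent(custom_fields.get("edgeryders_consent"))
    return CONSENT_MEANINGS.get(value, f"unknown value: {value!r}")


print(describe_consent({"user": {"custom_fields": {"edgeryders_consent": ["1", "1"]}}}))
```

Keep in mind that a request made without an API key also yields empty custom fields, which this sketch cannot tell apart from a user who never went through the funnel.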
3.5. Multisite account creation
This allows authorized external applications, including JavaScript web applications, to create active Discourse accounts by API, which can then be used by these applications to post content to Discourse in the name of the new user. This is useful for various onboarding, survey and data collection purposes.
Endpoint URL. Since we use a single-sign-on (SSO) system where one Discourse account works on all of our Communities sites (see the top right menu!), the endpoint is provided on our login site because that is the SSO provider where the “master” record of each account is created:
https://communities.edgeryders.eu/multisite_account.json
Only an HTTPS endpoint is provided, no HTTP version.
Request type. GET (will later be changed to POST once we have figured out how to set up the right CORS policy in Discourse; GET is not ideal as it should not be used for requests that cause state changes).
Parameters. The possible request parameters are:
- email: Required. The e-mail address that the user wants associated with the new account. If an account with that e-mail address already exists, account creation will fail.
- username: Required. The username to use for the new account. If an account with that username already exists, account creation will fail. Due to that, it is advisable to test before for username availability using the Discourse public API (https://communities.edgeryders.eu/u/username.json), or to auto-generate a username that will most likely not exist yet. The user may change it later inside Discourse.
- password: Required. The password to set for the new account that will be created.
- accepted_gtc: Optional. true or false, referring to the GTCs of the Edgeryders Communities platforms. Assumed as false when not provided, which will result in an error message.
- accepted_privacy_policy: Optional. true or false, referring to the Privacy Policy of the Edgeryders Communities platforms. Assumed as false when not provided, which will result in an error message.
- edgeryders_research_consent: Optional. true if the user passed the Edgeryders Consent Funnel or equivalent questions, giving informed consent to the use of their content for research; false otherwise. Assumed as false when not provided. true is required only when requesting an API key for edgeryders.eu.
- requested_api_keys: Required. Non-empty list of the domains of Edgeryders Communities sites for which the caller requests a Discourse Admin API key. Separate multiple values with whitespace.
- auth_key: Required. A shared secret without which access to this API endpoint will be prohibited. The currently active auth_key is available in a protected page.
Background info. Since this system should result in published content before the user had to confirm their e-mail address, it is a good target for spam submissions and needs some form of authentication. auth_key is a shared secret to prevent automated spam submissions. The idea is to distribute it in a limited way and only to trustable parties, and to disable it once it has reached untrusted parties who start using it for spam submissions. This seems to be the best approach, as an external web application cannot be trusted when it says “I have let the user go through a good captcha” and as we don’t want to build a captcha-via-API system (and captchas are annoying anyway).
Example request. https://communities.edgeryders.eu/multisite_account.json?email=testuser2@example.com&username=testuser2&password=verysecretpassword123&accepted_gtc=true&accepted_privacy_policy=true&edgeryders_research_consent=true&requested_api_keys=edgeryders.eu&auth_key=8342……3274
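A request URL like the one above can be assembled with proper URL encoding as sketched below. The function name `account_creation_url` and its signature are invented for illustration. All values are placeholders, and the real auth_key has to come from the protected page mentioned above.

```python
from urllib.parse import urlencode

ENDPOINT = "https://communities.edgeryders.eu/multisite_account.json"


def account_creation_url(email, username, password, auth_key, requested_api_keys,
                         accepted_gtc=False, accepted_privacy_policy=False,
                         research_consent=False):
    """Build the GET URL for the multisite account creation endpoint."""
    def flag(value):
        return "true" if value else "false"

    params = {
        "email": email,
        "username": username,
        "password": password,
        "accepted_gtc": flag(accepted_gtc),
        "accepted_privacy_policy": flag(accepted_privacy_policy),
        "edgeryders_research_consent": flag(research_consent),
        # Multiple domains are separated by whitespace, as documented above.
        "requested_api_keys": " ".join(requested_api_keys),
        "auth_key": auth_key,
    }
    return f"{ENDPOINT}?{urlencode(params)}"


url = account_creation_url(
    "testuser2@example.com", "testuser2", "verysecretpassword123",
    auth_key="PLACEHOLDER-AUTH-KEY", requested_api_keys=["edgeryders.eu"],
    accepted_gtc=True, accepted_privacy_policy=True, research_consent=True,
)
print(url)
```

The consent flags default to false so that a caller has to pass true explicitly, and only after the user has actually agreed.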
Response. A typical response for successfully creating an account would look like this, basically an abridged version of the user records that are normally returned by Discourse:
{
"id": 5,
"username": "new-username",
"email": "username@example.com",
"active": false,
"created_at": "2019-09-05T08:35:01.000Z",
"username_lower": "new-username",
"trust_level": 0,
"api_keys": [
{ "site": "edgeryders.eu", "key": "sgev47…fdffd0" }
]
}
As seen from the "active": false field, the account on communities.edgeryders.eu is not yet active. When the user tries to log in for the first time, she will be asked to activate the account by requesting a link sent to her e-mail address and clicking on it. However, the accounts created on Communities sites alongside this master account are already set to active. This allows using the returned Admin API keys without further steps and also makes sure the user is notified by e-mail about replies to the content she posted. The fact that the Communities account(s) are already active carries a slight risk that e-mails are sent to other people’s e-mail addresses in case the user did not enter her own e-mail address. If that turns out to be a problem at least for some applications, we can provide a parameter that allows switching this behavior on or off, and creating active accounts would then only be possible with some but not all auth_keys.
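A client would typically pick the key for one site out of the api_keys list in the response shown above. The sketch below uses an abridged, invented response body with a placeholder key, and the helper name `api_key_for_site` is hypothetical.

```python
import json


def api_key_for_site(response_body, site):
    """Return the API key for the given site from an account creation response."""
    account = json.loads(response_body)
    for entry in account.get("api_keys", []):
        if entry["site"] == site:
            return entry["key"]
    raise KeyError(f"No API key returned for {site}")


# Abridged sample response with a placeholder key.
sample_response = (
    '{"id": 5, "username": "new-username", "active": false, '
    '"api_keys": [{"site": "edgeryders.eu", "key": "PLACEHOLDER-KEY"}]}'
)

print(api_key_for_site(sample_response, "edgeryders.eu"))
```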
In the case of an error, the error status and message as provided by Discourse for an account creation error will be returned. This happens when:
- An API endpoint parameter is missing or wrong. See the source code for details on the possible error messages that can happen.
- An account with the given e-mail address already exists.
- Other Discourse account creation errors.
If you receive a “500 Server Error” response, it will be due to this open issue.
Typical usage. A typical usage of this API endpoint would look like this:
- Content preparation. A client application of this API would first collect the content or data it wants to post to Discourse under a user’s name, and compile it into a Discourse post format.
- E-mail address input and consent. On the last screen of the content collection form, the user is asked to enter an e-mail address and to agree to the Edgeryders ethical consent funnel, terms and conditions and privacy policy.
- Get the auth_key. The application needs a valid auth_key value to be granted access to the multisite_account.json API endpoint. This can come from several sources. In the simplest case, it would be written next to the screen on a survey computer and the user would enter it into the application. Or it can be provided as a GET parameter in a link (from social media etc.). Or it can be served from a configuration file that is publicly accessible to the JavaScript application from a server but not included in a source code repository.
- Request a new Discourse account. Now the web application will make a request to multisite_account.json as specified above.
- Get the new Discourse account. The Discourse server-side code would do the following: (1) confirm that auth_key is a valid token, (2) make sure that email and (if given) username are not yet associated with an account, (3) make sure the user gave the required consents, (4) create a new account on the SSO provider site, (5) sync that new account to the Discourse Communities sites for which API keys are requested, (6) create and collect the requested API keys on the Communities sites, (7) send a reply with all necessary information to the API client, as specified above.
- Show an account summary page. The web application would show a page saying that the content was successfully published and also providing information about the user’s new Discourse account (site URL, username, password, e-mail address). The user would be asked to take a photo or otherwise take note of this information, but also be told that she can always reset the password after entering her e-mail address.
- Error handling. In case the account could not be created, for example because an account using that e-mail address already exists, the web application should show the user’s text input and ask the user to either (1) provide a new e-mail address to post under or (2) log in to Discourse and copy and paste the text into a new topic there. The web application could also provide the option to log in to Discourse and post the text under that identity (after getting a Discourse User API key for that).
- Use the API key to post to Discourse. At this point, the API client application has a Discourse Admin API key of a new Discourse user and can use that to do actions on behalf of that user, via the normal Discourse API. It can then use it to create a new topic authored by that user.
Note that the new user starts as a TL0 user, while all users who sign up on edgeryders.eu manually and confirm their e-mail address start as TL1 users. This is a spam protection measure and may lead to some issues if the user tries to post many links or images.
- Log in and forward to Discourse (later). This is for later, not for the use case as a survey system. At the end of this process, there would be a link to bring users directly to Discourse, to the topic they just created, allowing them to interact with other users as a normal Discourse user. The special part here would be that they end up on Discourse in logged-in state, without having to go through the communities.edgeryders.eu login site. To make that work, the web application would simply submit the username and password to communities.edgeryders.eu and get a login cookie in return (assuming proper CORS policy handling). Then it would forward the user to edgeryders.eu/login, which means that the user will end up there in logged-in state. We use that trick at the end of the communities.edgeryders.eu login process already. To bring the user to their own topic automatically, we could extend Discourse with an edgeryders.eu/login?redirect=/t/… mechanism.
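The step "Use the API key to post to Discourse" might look like the sketch below. It relies on the standard Discourse API for creating a topic (a POST to /posts.json with title and raw fields, plus an optional category), as documented on docs.discourse.org. The helper name, the placeholder key and the sample texts are assumptions for illustration, and the request is only built, not sent.

```python
import json
import urllib.request


def build_create_topic_request(site, api_key, title, raw, category_id=None):
    """Build a POST request that creates a new topic as the API key's user."""
    payload = {"title": title, "raw": raw}
    if category_id is not None:
        payload["category"] = category_id
    return urllib.request.Request(
        f"https://{site}/posts.json",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


topic_request = build_create_topic_request(
    "edgeryders.eu", "PLACEHOLDER-KEY",
    title="My survey answers", raw="Text collected by the web application.",
)
# urllib.request.urlopen(topic_request)  # uncomment to actually create the topic
```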
Source. We provide the source code of this API endpoint under an open source licence.
3.6. Obtaining an API key by API
In order for external websites to interact with a user’s Discourse account (such as by posting chatlogs as Discourse topics, creating additional notifications etc.), that external website has to send properly authenticated API requests to Discourse. This endpoint makes that possible by providing the user’s Discourse admin API key after SSO authentication at our SSO provider site, communities.edgeryders.eu.
Endpoint. This API endpoint is accessible at:
https://communities.edgeryders.eu/multisite_account_api_key.json
Only an HTTPS endpoint is provided, no HTTP version.
Request type. GET (will later be changed to POST once we have figured out how to set up the right CORS policy in Discourse; GET is not ideal as it should not be used for requests that cause state changes).
Parameters
- Authentication cookie. To access this API endpoint, you first have to log in to your web application using communities.edgeryders.eu as SSO provider (instructions). That will, at the same time, log you in to communities.edgeryders.eu itself, which in turn authenticates you via a cookie to be able to access this API endpoint. This only works because we allow cross-domain use of the session cookie via the permit-api-cors plugin.
The Discourse authentication cookie is the _t cookie and looks like this: _t:"d1afe7345cd1f6389a0d2ab7792569". For testing, you can obtain it from an active communities.edgeryders.eu login under “Storage → Cookies” in the browser’s web developer tools.
- hostname: The hostname of the Edgeryders Communities site for which you want to obtain the Admin API key.
Response. The JSON response contains the user’s admin API key on the requested site. If you receive a “500 Server Error” response, it will be due to this open issue.
Example usage. An example GET request would be:
https://communities.edgeryders.eu/multisite_account_api_key.json?hostname=edgeryders.eu
In addition, the session cookie has to be sent along with this GET request. For testing and debugging purposes, using curl is a good way to create these requests. Together with the session cookie, the full request as a curl command would look like this:
curl 'https://communities.edgeryders.eu/multisite_account_api_key.json?hostname=edgeryders.eu' -H 'Cookie: _t=bbeb……4a86'
And the response would look like:
{
"site":"edgeryders.eu",
"key":"4c0b6…309da0"
}
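For scripted testing, the request can also be built in Python with the session cookie set by hand. This is a sketch only: the helper name is invented, the cookie value is a placeholder you would copy from your browser, and in a real web application the browser attaches the cookie automatically.

```python
import urllib.request
from urllib.parse import urlencode


def build_api_key_request(hostname, session_cookie):
    """Build a GET request for the API key endpoint, authenticated via the _t cookie."""
    query = urlencode({"hostname": hostname})
    return urllib.request.Request(
        f"https://communities.edgeryders.eu/multisite_account_api_key.json?{query}",
        headers={"Cookie": f"_t={session_cookie}"},
    )


key_request = build_api_key_request("edgeryders.eu", "PLACEHOLDER-COOKIE-VALUE")
# urllib.request.urlopen(key_request)  # uncomment to actually send the request
```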
Typical usage. Here is a full description of the process a web application would use to access this API endpoint and utilize its result:
- Get your web application’s domain added to “CORS allowed origins” on communities.edgeryders.eu.
- Initiate an SSO authentication in your web application, using communities.edgeryders.eu as the SSO provider.
- During SSO login, the user has to enter their username and password on communities.edgeryders.eu. As a result, they get the authentication cookie served from that site. Their browser stores it for domain communities.edgeryders.eu, giving them an active session on communities.edgeryders.eu (in addition to the SSO session on other sites via SSO login).
- From your web application, send a request to https://communities.edgeryders.eu/multisite_account_api_key.json.
- Since there exists a cookie for communities.edgeryders.eu, the browser sends it together with the request automatically. That behavior is allowed by the CORS settings made initially.
- That cookie authenticates your web application’s request to multisite_account_api_key.json and the API call should be successful.
- The API call returns the user’s Admin API key for the requested Discourse forum(s). Your JS application can then use that key to access them under the user’s account.
Source. We provide the source code of this API endpoint under an open source licence.
Ideas for future improvements (not implemented so far)
Future alternatives: User API. The Discourse User API is meant to be used for this scenario, but requires the user to grant access rights to client software in their Discourse account. Also, it would require serverless HTML+JS web applications to somehow store the User API key in permanent browser storage so that the granting of access rights does not have to be done every time. For the future, a good option would be to modify Discourse so that it auto-confirms the User API key requests of certain applications.
Future alternative: cookie authentication. The permit-api-cors plugin already allows accessing the multisite_account_api_key.json API via the communities.edgeryders.eu session cookie obtained from the SSO login. By installing that plugin also on the various communities sites, it should be possible to do a login on these sites by API, and then to use these sites with the appropriate session cookie for authentication, just like the browser does. The drawback is the rather complicated login process.
4. API client registry
Our custom APIs frequently change due to improvements and refactorings. To avoid the complexity of legacy APIs or API endpoint versioning, our method of change management is this API client registry. When you register your client application here, you will be notified when an API endpoint it uses changes. Registration is optional, but you’ll have to track API changes here in the manual if you don’t register. To register, simply edit this wiki. API client means an instance of a software; one application could be run in several instances.