Ok. So, it seems that:
1. All annotations should have a `tag_id` (but it should be called `code_id`, @matthias) set to something other than `null`.
2. Annotations whose snippet consists of the first word of the post are legit. They refer to the whole post.
3. No other cases of one-word snippets are legit.
How to check for 1
- Count the cases with a null `tag_id`. A few cases can be glitches; hundreds of cases point to a probable error in the import script.
- Check the creation dates of the annotations. If many were created in 2016-2017, there is probably something wrong with the import script.
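Here is a rough sketch of those two checks, assuming the annotations have already been parsed into a plain list of dicts. `tag_id` is the field discussed above; the `created_at` field name and its ISO-style timestamp format are assumptions on my part, so adjust them to whatever the endpoint actually returns.

```python
from collections import Counter


def count_missing_tag_id(annotations):
    """Count annotations whose tag_id is absent or null."""
    return sum(1 for a in annotations if a.get("tag_id") is None)


def creation_years(annotations, date_field="created_at"):
    """Tally annotations by creation year.

    Assumes (hypothetically) that each annotation carries a timestamp
    string starting with a four-digit year, e.g. "2016-05-01T10:00:00Z".
    Annotations without that field are skipped.
    """
    years = Counter()
    for a in annotations:
        stamp = a.get(date_field)
        if stamp:
            years[stamp[:4]] += 1
    return years
```

A handful of hits from `count_missing_tag_id` would be consistent with glitches; hundreds, or a `creation_years` tally dominated by 2016 and 2017, would point back at the import script.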
Checking for 2 is trivial, though probably tedious. I thought I could do these checks myself with 20 lines of code, but I hit an annoying glitch: the annotations endpoint returns an object that looks JSON-like, yet it is not a list but an “instancemethod”:
>>> import requests
>>> url = 'https://edgeryders.eu/administration/annotator/annotations.json?per_page=100000'
>>> response = requests.get(url).json()
>>> type(response)
<type 'instancemethod'>
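Once the data does come back as a list, the rule about one-word snippets could be checked along these lines. This is only a sketch: the field names `quote` and `post_id`, and the `post_text_by_id` lookup from post id to raw post text, are hypothetical and would need to match the real data.

```python
def suspicious_one_word_snippets(annotations, post_text_by_id,
                                 snippet_field="quote", post_field="post_id"):
    """Return annotations whose one-word snippet is NOT the first word of its post.

    Field names and the post lookup are assumptions, not the real schema.
    The comparison is exact, so trailing punctuation or different casing
    on the first word will cause an annotation to be flagged.
    """
    flagged = []
    for a in annotations:
        snippet = (a.get(snippet_field) or "").strip()
        if len(snippet.split()) != 1:
            continue  # empty or multi-word snippets are not covered by this rule
        post_words = post_text_by_id.get(a.get(post_field), "").split()
        first_word = post_words[0] if post_words else None
        if snippet != first_word:
            flagged.append(a)
    return flagged
```

Anything this returns would still need a look by hand, since an exact string match is stricter than what a human would accept as "the first word of the post".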