Platform chat integration for Discourse + Slack

owen · November 6, 2020, 10:51am

Had a call with @Amelia & @nadia today about possible chat integrations with the platform.

The idea being to easily import transcripts of chat messages, with two main goals:

Make it frictionless for people who are not on the platform to share their thoughts
Messages are posted as comments to a topic that are then codeable for research

@Amelia found this Discourse plugin which appears to offer a solid set of integrations with Slack - including sending chat transcripts and threaded replies to the platform.

The main hurdles are:

Setting up the plugin (@hugi, @matthias - I can’t assess this part, but could you take a look and give some feedback?)
Making sure dozens of new posts are not created for short, off-topic messages (i.e. threaded conversations work, and there is some sort of filter on what gets exported)
How does this work for existing platform users - are new users created/how does the account linking or set up process work?
Is Slack the right platform - keeping in mind the type of audience that gravitates towards Slack vs Discord vs Riot/Element.
Are there better alternatives?

Thanks for your thoughts!

owen · November 6, 2020, 11:00am

Another point - we also have a working solution integrated with the webkit and on tell.edgeryders.eu - that allows commenting on existing topics.

However the interest here is in chat applications, real time conversations. I’m aware that it would be possible to create a web application, sort of how now.edgeryders.eu (currently defunct) worked, with a chat export feature. This would be no small job though and I want to make sure there’s a conversation about the goals/needs before any decision is made.

matthias · November 7, 2020, 3:14pm

Ok, so here are my ideas for how this could be solved. This is indeed much better to discuss in writing, as I needed a day to think it through. I would not have had any proposal yesterday.

First of all, let me talk you out of using Slack. Not only is Slack the most sluggish, inefficient and bloated of the six web-based chat tools that I have open all the time; it is also not open source, and not frictionless for people who are not on the platform, because they need an account on our Slack (let’s say edgeryders.slack.com) even if they already have one or more other Slack accounts. And if they are going to need a new account anyway, they could just as well have made one on our Discourse platform.

Also I think that we should look for a more general solution than “just” being able to post Slack transcripts to Discourse. It should be applicable, with little work, to any chat tool; of which we use several already. For example, when community managers hold a webinar in Zoom or Google Meet, the chat among the audience might contain relevant content, and we’d want an edited transcript on Discourse.

Below is my proposal for how such a solution could be created technically. (But I have no idea how this could be funded: there is zero IT budget left for new features. Which should not be a surprise, given how underfunded IT has been for the current H2020 projects …)

Generic Chat Transcript Import for Discourse

Here’s how my proposed solution would be used, from chat to coding in Open Ethnographer:

People chat, in any tool. Could be Zoom, Google Meet, Matrix / Element, Slack, Rocket Chat, even Google Docs comments. The only requirement is that one can copy & paste the chat messages in text or HTML format, and that a filter has been defined in the transcript import for that particular chat application. Note that both Matrix / Element and Rocket Chat support chat room access by users without having to create an account, providing the frictionless experience you’re looking for.
After the event or chat session is over, somebody goes to Discourse and clicks + New Topic. Same as when posting anything else to Discourse.
In the Discourse editor, they now click “ → Import Chat Transcript”. A wizard dialog opens, which is possible in Discourse (compare “ → Create Poll” right now).
In the dialog, they now select the chat platform to import from, and copy & paste the chat transcript, possibly in multiple parts if only a few sections are relevant to post.
In the next step of the wizard dialog, the system will show a list of detected usernames, each with a box behind to select a corresponding Discourse username (if any).
In the next step, the wizard dialog will disappear and leave the user with a draft of a chat transcript inside the Discourse editor. Similar to what happens when using the Discourse / Slack integration to post to Discourse, as seen in this video.

Notably, the transcript will use the usual Discourse quote formatting to attribute chat messages to users (such as [quote="username"]…[/quote]). It will not create one Discourse post for every chat message. Usernames would correspond to Discourse usernames where a mapping has been provided, and otherwise would use a special prefix such as chatuser_.

Posting each chat message as a Discourse post is not a good idea because Discourse posts are meant to be longer. Many short chat messages will lead to a lot of notification spam, and also many users would have to be created who would never come and post anything in person on Discourse, thus cluttering the database.
The user posting the transcript can edit the draft at will (shortening, fixing spelling and grammar, editing out personal data etc.) and would then post this as a new topic to Discourse. It can serve as a conversation starter, allowing other Discourse users to add their own posts as usual.
When coding in Open Ethnographer, ethnographers would code the transcript as usual, just making sure that annotations use text from one quote only.

For this to work, we would extend Open Ethnographer so that it recognizes that the text is in a quote, and would record this into the database attributed to the quoted user. It would thus show up properly in the network analysis, just that some usernames would not be linked to Discourse users.
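The conversion described in the steps above could look roughly like the following. This is only a minimal sketch, not an existing plugin: the "HH:MM handle: message" line format, the function names and the slug rule for unmapped handles are assumptions made for illustration. Only the `[quote="username"]…[/quote]` output and the `chatuser_` prefix come from the proposal itself.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical input format "HH:MM handle: message". A real importer would
# need one such pattern (a "filter") per supported chat application.
LINE_RE = re.compile(r"^(\d{1,2}:\d{2})\s+([^:]+):\s*(.*)$")


def parse_transcript(text: str) -> List[Tuple[str, str]]:
    """Return (handle, message) pairs from a pasted transcript.

    Lines that do not start a new message are treated as a continuation
    of the previous message.
    """
    messages: List[Tuple[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match:
            messages.append((match.group(2).strip(), match.group(3)))
        elif messages:
            handle, body = messages[-1]
            messages[-1] = (handle, body + "\n" + line)
    return messages


def discourse_name(handle: str, mapping: Dict[str, str],
                   prefix: str = "chatuser_") -> str:
    """Use the mapped Discourse username if one was provided in the wizard,
    otherwise build a prefixed placeholder name from the chat handle."""
    if handle in mapping:
        return mapping[handle]
    # Assumed slug rule: collapse anything that is not a letter, digit or
    # underscore into a single underscore.
    slug = re.sub(r"[^A-Za-z0-9_]+", "_", handle).strip("_")
    return prefix + slug


def render_quotes(messages: List[Tuple[str, str]],
                  mapping: Dict[str, str]) -> str:
    """Render all messages as one post draft made of quote blocks."""
    blocks = []
    for handle, body in messages:
        name = discourse_name(handle, mapping)
        blocks.append(f'[quote="{name}"]\n{body}\n[/quote]')
    return "\n\n".join(blocks)
```

Calling `render_quotes(parse_transcript(pasted_text), {"alice": "alice_e"})` would produce a single draft for the editor, which the person posting can still shorten or clean up before publishing.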

As a nice side effect, this solution will also enable the ethnographic coding of video call transcripts. It’s just about manually creating a post made from quotes, in analogy to what the chat transcript importer would generate. This also means that only the Open Ethnographer extension is crucial to start with posting chat transcripts to Discourse, since adding the [quote="…"]…[/quote] syntax can also be done manually during a test period. Still, the Open Ethnographer extension would be 2000–4000 EUR, and better at the upper end because we need to use some more solid programming techniques from now on to keep Open Ethnographer more stable, as it’s becoming too complex for “just adding something here and there”.

Over to @amelia and @nadia for deliberation.

I liked the chat export idea for the now.edgeryders.eu video call application; it had login integration with Discourse anyway, so attributing the transcript properly was easier that way. However, let’s not reinvent the wheel I’d say: creating a chat application with the frictionless UX desired here is a lot of work, which others did already. Also, having our own chat application would still not let us use the transcripts from Zoom, Google Meet etc.

amelia · November 8, 2020, 12:24pm

Totally agree – we were lamenting that Slack is the only chat program set up to integrate fully with Discourse right now since it’s our least favourite haha.

Also totally agree with this. The funding, as you say, is the million dollar question. So we need to find 4000 EUR for this?

matthias:

Notably, the transcript will use the usual Discourse quote formatting to attribute chat messages to users (such as [quote="username"]…[/quote]). It will not create one Discourse post for every chat message. Usernames would correspond to Discourse usernames where a mapping has been provided, and otherwise would use a special prefix such as chatuser_.

Posting each chat message as a Discourse post is not a good idea because Discourse posts are meant to be longer. Many short chat messages will lead to a lot of notification spam, and also many users would have to be created who would never come and post anything in person on Discourse, thus cluttering the database.

The user posting the transcript can edit the draft at will (shortening, fixing spelling and grammar, editing out personal data etc.) and would then post this as a new topic to Discourse. It can serve as a conversation starter, allowing other Discourse users to add their own posts as usual.

This is the main thing we’ll have to work out – because ideally we do have unique users for each chat post, or something along those lines (maybe instead, an aggregation of every message sent by a single user? Or not auto-creating users due to this database cluttering? Will have to think about it). I’m interested in what coding a chat would look like, because my theory is that you’ll have shorter messages with only two or three codes (and some not coded) but more messages, so still an ability to track which concepts co-occur most.

One big block of text is very bad for ethnographic coding, because then everything co-occurs with everything else and we lose meaning AND we lose the social element (who is saying things) – so having it all in one big Discourse post isn’t ideal. Definitely would defeat the purpose of doing this in the first place from an ethno coding perspective.

Great!

matthias · November 8, 2020, 12:38pm

I think you misunderstood. From the perspective of ethnographic coding, the solution I proposed above will use unique users for each chat user. Just that these are not Discourse users in all cases (only where a mapping is provided). Instead, these are users in an additional Open Ethnographer user table. These users would be automatically identified by the handles they use in the chat transcript.

This does not contradict current coding practices, but extends them. So whenever in the future an ethnographer codes a quote of another user (using the [quote="username"]…[/quote] syntax), that code will be attributed to the quoted user, not the quoting one. We’d not apply this to other quotes, such as from books / newspapers / literature, as it makes no sense there.
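To illustrate the attribution rule, here is a rough sketch of how an annotation could be assigned to the quoted user rather than the quoting one. It is not Open Ethnographer code: the function names and the use of character offsets into the raw post are assumptions. It handles the `[quote="username"]` form, including the longer `[quote="username, post:1, topic:2"]` variant that Discourse generates, and it does not deal with nested quotes or with telling book quotes apart from user quotes.

```python
import re
from typing import List, NamedTuple, Optional

# Matches attributed quote blocks only; nested quotes are not supported here.
QUOTE_RE = re.compile(r'\[quote="([^"]+)"\](.*?)\[/quote\]', re.DOTALL)


class QuoteSpan(NamedTuple):
    username: str
    start: int  # offset of the first character of the quoted text
    end: int    # offset just past the quoted text


def quote_spans(raw: str) -> List[QuoteSpan]:
    """Find every attributed quote in the raw post and where its text sits."""
    spans = []
    for match in QUOTE_RE.finditer(raw):
        # Keep only the username part of 'username, post:1, topic:2'.
        username = match.group(1).split(",")[0].strip()
        spans.append(QuoteSpan(username, match.start(2), match.end(2)))
    return spans


def attribute_annotation(raw: str, post_author: str,
                         start: int, end: int) -> Optional[str]:
    """Decide who an annotation covering raw[start:end] belongs to.

    Returns the quoted user if the annotation lies inside one quote, the
    post author if it touches no quote, and None if it crosses a quote
    boundary (text from more than one speaker).
    """
    for span in quote_spans(raw):
        if span.start <= start and end <= span.end:
            return span.username
        if start < span.end and end > span.start:
            return None
    return post_author
```

A `None` result corresponds to the case ethnographers are asked to avoid, namely an annotation that uses text from more than one quote.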

amelia · November 8, 2020, 2:50pm

Ah amazing!! Yes, I misunderstood, sorry! This would be SO ideal.

matthias · November 8, 2020, 3:41pm

Yes, but the 4000 EUR is the bare minimum, paying for a version of Open Ethnographer that allows properly coding interviews entered as quote collections, or any other quotes for that matter. In that version, creating such posts from chat transcripts would be done by manually editing the transcript in the Discourse editor, not yet with the wizard-style dialog I proposed above.

Another caveat is that I don’t take fixed-sum IT projects; rather, it’s paid by time. But my estimates are not as far off as in earlier years … and I think 4000 EUR is realistic for the Open Ethnographer side of this proposal.

amelia · November 8, 2020, 4:04pm

@nadia, what do you think? Do we have budget for this somewhere or will we need to apply for €€?

owen · November 8, 2020, 11:11pm

Sounds about right, consider it a plan B.

The above-mentioned solution is doable as a simple standalone chat application - but if other services (Meet/Zoom/etc) are going to be used it is not the right path.

nadia · November 9, 2020, 8:50am

Hello and welcome back to sanity <3 Let me have a look at the budget this week and see if we can scrape something together. Also, ping @hugi ^^

hugi · November 9, 2020, 11:14am

I fully agree with everything @matthias says about Slack. It’s a nightmare.

I agree with what Matt is proposing. It would be a good option. There are some UX issues to work out though - for example, what to do when two users in the chat have the same display name and when that display name is the same as a third user on the platform. There are quite a lot of edge cases, and the edges will be slightly different for different chat transcripts.
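As a rough illustration of the display-name edge cases, the wizard could flag ambiguous names before any mapping is offered. This is only a sketch under assumptions: it presumes the transcript exposes some stable per-account identifier next to the display name, which many copy-pasted transcripts will not, and the function name and case-insensitive comparison are illustrative choices.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def find_name_conflicts(participants: Iterable[Tuple[str, str]],
                        platform_usernames: Set[str]) -> Dict[str, List[str]]:
    """Map each problematic display name to a list of detected problems.

    participants: (chat_account_id, display_name) pairs from the transcript.
    platform_usernames: usernames that already exist on the platform.
    """
    ids_by_name = defaultdict(set)
    for account_id, display_name in participants:
        # Compare names case-insensitively and ignore stray whitespace.
        ids_by_name[display_name.strip().lower()].add(account_id)

    platform_lower = {name.lower() for name in platform_usernames}
    conflicts: Dict[str, List[str]] = {}
    for name, ids in ids_by_name.items():
        problems = []
        if len(ids) > 1:
            problems.append("shared by %d chat accounts" % len(ids))
        if name in platform_lower:
            problems.append("matches an existing platform username")
        if problems:
            conflicts[name] = problems
    return conflicts
```

Names that come back from this check would need a manual decision by the person importing the transcript instead of an automatic mapping.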

alberto · November 11, 2020, 4:08pm

Really? Chat messages, I would say, make very poor ethno material: short messages, make sense only in context, almost uncodable. I mean, you could code this:

… but it would not be very interesting.

What is the use case?

owen · November 11, 2020, 7:26pm

Ask @nadia and @Amelia

I imagine in the context of webinars or live Q&A’s, relevant discussions will be posted in the chat. There are probably other use cases they have in mind.

nadia · January 15, 2021, 3:02pm

Hallo @matthias , we are picking up on this thread again. Help me understand - so there are two parts to this work:

Doing work on Open Ethnographer that allows properly coding interviews as quote collections
Developing a wizard style widget to transform the imported transcripts into collections of quotes from unique users that could be coded by the ethno team.

Correct?

If so…

a. Could we break this down: do part 1 as a first step, run a test with one chat, and then look at developing the widget at a later time?

b. How long would it take to get to something that would not be “perfect” but “enough” for us to do a test of one such chat-to-platform-to-ethno cycle?

c. Who in our crew would need to do what?

d. When would they have time to do it/ when could it realistically be ready to test?

e. Would the 4K be enough to cover the development work needed to get us to a point where we could run such a test?

hugi · January 15, 2021, 4:44pm

I would advise against doing this chat integration at all, having thought about it.

While this works technically, it is:

A nightmare regarding research consent and GDPR
Messy and hard to get consistently right across platforms

In addition, there is this:

My advice is that we should drop this.

johncoate · February 9, 2021, 5:34pm

On the plus side, the reason to do it is that it helps the platform be more sticky.