When run on my forum, the “most toxic posters” list corresponded well with the members our moderators are keeping an eye on. Aside from “toxicity”, the measures of “inflammatory”, “attack on author” and “attack on commenter” were also reasonably accurate - at least enough to highlight the most problematic behaviour on the forum.
There were one or two anomalies. For example, one of our “most toxic” members turned out to be a wonderfully kind chap who was one of our forum’s earliest joiners and freely donated money to help with our running costs. Why so toxic? It turns out that he signs off posts with his nickname, “Dick,” and as far as the Perspective API is concerned, that’s rather uncivil.
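For anyone curious how a sign-off like “Dick” ends up scored at all, here’s a minimal sketch of the kind of request the app would be making. The endpoint and request/response shape follow Perspective’s public docs; the helper names are my own, and you’d need your own API key.

```python
import json
import urllib.request

# Perspective's public "analyze" endpoint (v1alpha1 at the time of writing).
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def build_analyze_request(text):
    """Build the JSON body for a TOXICITY analysis request."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
        "doNotStore": True,  # ask Google not to retain the comment
    }

def toxicity_score(text, api_key):
    """POST the comment and return the summary TOXICITY score (0..1)."""
    body = json.dumps(build_analyze_request(text)).encode("utf-8")
    req = urllib.request.Request(
        f"{ANALYZE_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    return result["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
```

The API scores the text as a whole, so a friendly post that happens to end with “Cheers, Dick” is evaluated with that word in context-free isolation from the author’s identity, which is exactly the failure mode described above.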
Please give my app a try on your forum and let me know how you get on. I’d be grateful for any PRs and/or feedback on the code as I’m new to Postgres, Akka-HTTP, Doobie, Docker and Spray-JSON. Cheers!
Is the word list biased towards American cultural slang? It’s not uncommon for a word that is perfectly innocent in one country to have a not-so-innocent connotation in another.
Then there’s context too. “Dick” as a given name vs. as an insult is a perfect example of this.
There’s also “privilege”: if a gay person identifies themselves as gay, for example, that’s self-descriptive, not derogatory.
That said, as long as it’s more of a “this should be reviewed” thing and not an “auto-ban” thing, it sounds promising.
I’d say it’s extremely likely that it is very biased towards both American idioms (and spelling), and also the kinds of things you get in an online news comments section, as opposed to any other form of online communication.
I used the app to help decide whether installing the discourse-etiquette plugin was a good idea yet. I concluded that the flagging feature of the plugin would probably be useful to mods, but the automated feedback to users might be risky.
Here are the stats by category on my forum - seems mods urgently need to check out “2016 Parents” (an opt-in category) and work out what’s going wrong there!
Fantastically interesting subject, thank you for the hard work on the app.
I think this is surely one of the biggest issues in social networking.
Being from the UK I’ve seen a tremendously poor level of debate around Brexit. We need to encourage people to debate with civility and not constantly resort to personal attacks. People need to focus on the subject of debate and stop being cheap.
I wonder if this is something that could lead to ‘self reflection’ if available online within Discourse:
- in your own profile, or
- as part of the submission preview, or
- by a flag next to your post viewable by just you or a moderator …
all as a means of judging your own ‘toxicity’ metric, to give you a pre-warning that you’re becoming too heated in your debating style? Even as a moderator/admin, I find myself going back to edit posts to reduce the temperature of some of my statements. As a moderator/admin you clearly have a responsibility to set a good example, so this is even more critical (though of course I think one really knows when one’s truly crossed the line).
Haha, wonderful find! And great work all around. I’ve already forwarded this to the Google team, who can be emailed at conversationai-questions@google.com.
For those interested in reading more about Perspective’s models and biases, here’s some recommended reading:
So far, Perspective API has received a mixed response. The first release, while intended to be an early test version of their approach, seemed to have several serious deficits. Experiments by the interaction designer Caroline Sinders (who has also done work with us) suggested that it was missing some key areas of focus, while Violet Blue, writing for Engadget, used Jessamyn West’s investigation among others to show that the system was returning some truly troubling false positives. For their part, Jigsaw says it is aware of these issues, and recently wrote a blogpost about the limitations of models such as theirs.
So, will you be running with it? Keep in mind you can soft-disable the JIT notifications by setting a crazy high certainty score so that they’re never triggered. In the near future we will add the ability to disable JIT notifications altogether.
I love the approach of being able to scan the history of a forum and surface historical issues.
I think we should introduce a new model to store all this information and just have the official plugin backfill the model, then you can run queries about history in data explorer.
I would say a great first move here with the perspective plugin, @erlend_sh, would be to focus on validating this historically:
- Plugin runs in a “no warnings, no JITs, only update model” mode (site setting)
- Plugin has a job that backfills N posts every N minutes

We then have a few data explorer queries to report on:

- Most toxic categories
- Most toxic users
- Most toxic posts
- Most toxic posts today
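To make the proposal concrete, here’s a small sketch of the kind of aggregation those reports would do, run over backfilled scores. In Discourse this would really live in a plugin table queried via Data Explorer; the record shape, field names and sample data here are all illustrative.

```python
from collections import defaultdict
from datetime import date

# Hypothetical backfilled records: user, category, post date, toxicity score.
posts = [
    {"user": "alice", "category": "general", "date": date(2018, 5, 1), "score": 0.12},
    {"user": "bob",   "category": "general", "date": date(2018, 5, 2), "score": 0.81},
    {"user": "bob",   "category": "parents", "date": date(2018, 5, 2), "score": 0.66},
    {"user": "carol", "category": "parents", "date": date(2018, 5, 2), "score": 0.40},
]

def most_toxic_by(key, records, top_n=3):
    """Rank groups (users, categories, ...) by average toxicity, highest first."""
    scores = defaultdict(list)
    for rec in records:
        scores[rec[key]].append(rec["score"])
    ranked = sorted(
        ((sum(s) / len(s), group) for group, s in scores.items()),
        reverse=True,
    )
    return [(group, round(avg, 2)) for avg, group in ranked[:top_n]]

def most_toxic_posts(records, on_date=None, top_n=3):
    """Rank individual posts, optionally restricted to a single day."""
    pool = [r for r in records if on_date is None or r["date"] == on_date]
    return sorted(pool, key=lambda r: r["score"], reverse=True)[:top_n]
```

The same four reports fall out of these two helpers: `most_toxic_by("category", ...)`, `most_toxic_by("user", ...)`, `most_toxic_posts(...)`, and `most_toxic_posts(..., on_date=...)` for today.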
I’d feel much easier about enabling this if I could see how it performed historically.
Working through the history would allow forum admins to adjust all the params according to historical behavior in the forum.
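One way that tuning could work, sketched below under my own assumptions: look at the distribution of scores across past posts and pick the certainty threshold so that only, say, the top few percent of historical posts would ever have been flagged. The function name and the sample scores are made up; real ones would come from the backfill.

```python
# Illustrative historical toxicity scores for ten past posts.
historical_scores = [0.02, 0.05, 0.07, 0.11, 0.15, 0.22, 0.31, 0.44, 0.58, 0.93]

def threshold_for_flag_rate(scores, flag_rate):
    """Return the score cutoff that would flag roughly `flag_rate` of posts.

    Sorts the historical scores and takes the score at the boundary of the
    top `flag_rate` tail, so posts scoring at or above it get flagged.
    """
    ranked = sorted(scores)
    cut = int(len(ranked) * (1 - flag_rate))
    cut = min(max(cut, 0), len(ranked) - 1)  # clamp to a valid index
    return ranked[cut]
```

With the sample data, `threshold_for_flag_rate(historical_scores, 0.1)` puts the cutoff at 0.93, i.e. only the single worst post in ten would have triggered anything, which matches the “review, not auto-ban” spirit discussed earlier.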
I was looking forward to trying this out, but I got stuck in a loop trying to get an API key, got frustrated, and gave up trying. I’m not sure what Google wants or doesn’t like about my login values, so I’ll try again after digging around a bit.
Hi, I work on Perspective; sorry you’re running into issues.
Did you get approved for the API already (by applying on https://perspectiveapi.com/)? If so, did you have issues following our guide? Feel free to message me directly and I can try to help you out.