Discourse Etiquette: take action when users intend to post inappropriate remarks


(Madhukar Mudunuru) #21

menu.json (12.7 KB)


(Diego Barreiro) #22

Yes, still waiting for it…


(Erick Guan) #23

Yes, we can do this. The model doesn’t report the reason for its decision.

@ChrisBeach I’ll fix this in the next version, where the JIT notification will be replaced by a dialog.


(Erick Guan) #24

After playing around with it, I found the TOXICITY model doesn’t work like this. For now, we can’t tell which word caused the problem.


(Sam Saffron) #25

OK, testing this on meta, I am seeing an extra 400ms delay on posting (from Australia) because we need a full extra round trip here.

I wonder if you can hook into posts/create so you don’t require a full round trip to do this. It is a bit tricky because of the hijack part, though. I guess you could introduce a custom /posts/create that hijacks everything, including the create itself… I am not sure… we cannot do a partial hijack without a fiber server, so this is tricky.

Thinking about this.

This is super urgent though, disabling this on meta now:

It is just too easy to hit the rate limit, and once you do, no error is displayed!


(Chris Beach) #26

Sounds nasty. If the plugin hits Google’s rate limit, is the topic posting flow broken?


(Erlend Sogge Heggen) #27

Perhaps we could limit the dialog to a specified Trust Level (TL1 by default) while the rest of the posts are only subject to automated flagging, which doesn’t need to happen at post-time.
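
The trust-level split suggested here could look roughly like the sketch below. It is not taken from the plugin: the `Action` enum, the `MAX_DIALOG_TRUST_LEVEL` setting, and `choose_action` are hypothetical names, and the sketch assumes "limited to TL1" means users at or below that level.

```python
from enum import Enum


class Action(Enum):
    DIALOG = "dialog"          # warn the author at post time
    BACKGROUND_FLAG = "flag"   # score later in a background job


# Hypothetical setting: highest trust level that still sees the dialog.
MAX_DIALOG_TRUST_LEVEL = 1


def choose_action(trust_level: int,
                  max_dialog_level: int = MAX_DIALOG_TRUST_LEVEL) -> Action:
    """Use the post-time dialog for low-trust users, deferred flagging otherwise."""
    if trust_level <= max_dialog_level:
        return Action.DIALOG
    return Action.BACKGROUND_FLAG
```

With this split, only posts from low-trust users pay for the extra round trip at post time.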


(Chris Beach) #28

I’ve been running since yesterday and announced the feature to users this morning:

https://se23.life/t/automated-etiquette-analysis-on-se23-life/8279?u=chrisbeach

A mixed reception so far.

I created an opt-in category so members could play with the new feature, and we’ve uncovered one or two issues so far. It seems the browser must be refreshed for the feature to become active for some users (possibly a general issue with plugin updates).


(Chris Beach) #29

The plugin doesn’t seem to catch words that are caught by the censor watchlist (and are automatically blocked out). This seems like a bug.


(Erick Guan) #30

Basically, I want it to fail silently. The post will still be flagged by the background job.

@erlend_sh The rate limit for a user is 6/min. According to the Google API console, we didn’t reach the threshold.
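
As an illustration only, a per-user limit of 6 checks per minute that fails silently could be sketched as a sliding window. The class and method names are hypothetical and this is not the plugin's actual code; a `False` result means "skip the post-time check and leave the post to the background job".

```python
import time
from collections import defaultdict, deque


class PerUserRateLimiter:
    """Sliding-window limiter: at most `max_calls` per user per window."""

    def __init__(self, max_calls: int = 6, window_seconds: float = 60.0):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._calls = defaultdict(deque)  # user id -> timestamps of recent calls

    def allow(self, user_id, now=None) -> bool:
        now = time.monotonic() if now is None else now
        calls = self._calls[user_id]
        # Drop timestamps that have left the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False  # over the limit: caller skips the check quietly
        calls.append(now)
        return True
```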

This is now implemented in the hijack branch: GitHub - fantasticfears/discourse-etiquette at hijack. It doesn’t apply to post editing yet.

Also, please ask me if you want a lower timeout! It currently allows around 10 seconds.

This plugin has nothing to do with the censor watchlist, so it won’t complain about those words.


(Chris Beach) #31

The plugin would ideally operate on the raw post before it is processed by the censor watchlist.

Otherwise it won’t accurately score toxic posts.


(Erick Guan) #32

I think you just didn’t hit the threshold, since the raw text is sent to the server anyhow, even if it’s censored.


(Chris Beach) #33

Ah yes. I’ve just tried again using different censored words and managed to trigger the filter.

Looks like my original filthy words were British slang and weren’t part of Google’s American English lexicon.


(Erick Guan) #34

Google used datasets from Wikipedia talk pages. This could introduce bias, as you said. In fact, they have an API for submitting corrections so that they can reduce false negatives. The plugin doesn’t implement anything for this.


(Chris Beach) #35

FYI @fantasticfears - I’m seeing regular errors in my logs:


(Erick Guan) #37

Fixed. Thanks for reporting.


(Chris Beach) #38

Great! :+1: That appears to have fixed it for me. Thanks for sorting this out.

I also see a lot of these errors, which may be a problem at Google’s end rather than with the plugin.

Just FYI:


(Erick Guan) #39

Those may come from timeouts and exceeded limits. If the job checks a batch of posts, it is likely to hit the quota of 1000 requests per 100 seconds.
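
One way a batch job could stay under 1000 requests per 100 seconds is to space requests at least 100 / 1000 = 0.1 seconds apart. The sketch below is a generic pacing helper, not the plugin's job code; `Pacer` and its parameters are hypothetical, and Google may count its quota window differently from this simple spacing.

```python
import time


class Pacer:
    """Spaces calls so consecutive starts are at least `min_interval` apart."""

    def __init__(self, max_requests=1000, per_seconds=100.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = per_seconds / max_requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next request is allowed to start."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._clock()  # re-read in case the sleep overshot
        self._next_allowed = now + self.min_interval
```

A job would call `pacer.wait()` before each request; the `clock` and `sleep` arguments exist only so the pacing can be tested without real delays.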


(Chris Beach) #40

I had to be quite careful with quota management when I used the Perspective API for bulk analysis. Even quite fine-grained rate limiting on my side sometimes still threw rate limit errors on Google’s side.

I think we do have to be careful not to trigger too many rate limit errors on their server. I’m not sure of Google’s tolerance for it, but I remember Twitter blocked one of my keys once because I’d exceeded their limit too many times.
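
A common way to avoid hammering a server after a rate limit error is exponential backoff with jitter. The sketch below is generic and not tied to the Perspective API client: `RateLimitError` and the `request` callable are hypothetical stand-ins for whatever the real client raises and calls.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised when the API reports a quota problem."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Retry `request` on RateLimitError, waiting longer after each failure."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Cap doubles each time; a random wait below the cap spreads retries out.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

Capping the number of attempts keeps a persistent quota problem from turning into a stream of repeated rejected requests.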


(Alexander Wright) #41

Are there any analytics available on how many posts the plugin has analysed, and how many triggered the warnings?

If not, could that be added please?

Another useful feature would be a warning that the rate limit had been hit.