Image Upload - Image Recognition API Support

The SESTA/FOSTA laws effectively remove a lot of the Section 230 safe-harbor protections for social/forum/UGC webmasters, making them liable for what their users do.

In light of this, it might be wise to support using an image recognition API as one way to improve protection, automating the blocking of uploads of explicit content (nudity, gore, etc.).

It would also improve protection against exploits such as uploading inappropriate images in drafts and then hotlinking to those images elsewhere, effectively using Discourse as free anonymous image hosting. I'm not sure how exploitable this is in practice, but with the default `delete drafts older than n days` setting it looks like such an upload could sit unnoticed for 180 days after draft creation, without the webmaster ever knowing what has been uploaded.

Some APIs:

4 Likes

This would have to start with a plugin; it's unlikely to ever be a core Discourse feature.

8 Likes

Checking all images uploaded to Discourse via the Google Cloud Vision API would be really nice in order to stay safe for AdSense. We did that on our former website and never had any nude or gore pictures uploaded.

Google provides a Ruby gem, `google-cloud-vision`, for the Vision API:
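For reference, a minimal setup sketch (authentication via a service-account key is an assumption about your deployment; the path is a placeholder):

```ruby
# Gemfile
gem "google-cloud-vision"

# The client authenticates via a service-account key, typically by setting
# GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json in the environment.
```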

A potential plugin should hook into the main image upload process of Discourse for all images (posts, avatars, profile backgrounds, etc.) and reject images that contain disallowed content:

  puts "Adult:    #{safe_search.adult}"  puts "Spoof:    #{safe_search.spoof}"  puts "Medical:  #{safe_search.medical}"  puts "Violence: #{safe_search.violence}"  puts "Racy:     #{safe_search.racy}"
['UNKNOWN', 'VERY_UNLIKELY', 'UNLIKELY',        'POSSIBLE', 'LIKELY', 'VERY_LIKELY']
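As a rough illustration of the rejection logic, here is a hedged sketch, not the plugin's actual code; the helper name, checked categories, and default threshold are all assumptions:

```ruby
# Hypothetical sketch: reject an upload when any checked SafeSearch
# category reaches a configured likelihood threshold.
LIKELIHOODS = %w[UNKNOWN VERY_UNLIKELY UNLIKELY POSSIBLE LIKELY VERY_LIKELY].freeze

# `safe_search` is a SafeSearchAnnotation as returned above;
# `threshold` is one of the likelihood names, e.g. "LIKELY".
def disallowed?(safe_search, threshold: "LIKELY")
  limit = LIKELIHOODS.index(threshold)
  %i[adult violence racy].any? do |category|
    (LIKELIHOODS.index(safe_search.public_send(category).to_s) || 0) >= limit
  end
end
```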

Where would such a plugin need to hook into the Discourse code base?

Is anybody interested in developing it via the marketplace?

2 Likes

This is absolutely possible, but IMHO it would take away some of the seamlessness of the experience. If there's any way the model could be built into the plugin, that would be super cool.

Did you guys use the Vision API itself?

Here's a plugin built by @angus which can act as a starting point: https://github.com/angusmcleod/discourse-machine-learning

2 Likes

What exactly do you mean by “seamlessness”?

1 Like

I mean that checking the image before upload, by POSTing it to the API and waiting for a green flag, would take some time, right?

1 Like

Well, the upload process for inline images is already async, IMHO. And the Google API is very fast.

On the other hand, I would also be happy to check images after a user has published a new post, via an external webhook (Discourse API), and alter the user's post (e.g. remove the image and replace it with the text "IMAGE REMOVED BY ADMIN"). That part seems to be possible with the API, but I can't find any reference for how to actually DELETE the "bad" image via the API in such a case, because I don't want to keep it around somewhere in the shadows.
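For what it's worth, a minimal sketch of the post-editing half of that approach, using the Discourse REST API's post-update endpoint (the host, API key, username, and replacement text are placeholders; as noted, deleting the underlying upload is the part without an obvious endpoint):

```ruby
require "net/http"
require "json"
require "uri"

# Hypothetical sketch: overwrite a flagged post's raw content via the
# Discourse REST API (PUT /posts/{id}.json).
def censor_post(post_id, new_raw)
  uri = URI("https://forum.example.com/posts/#{post_id}.json")
  request = Net::HTTP::Put.new(uri)
  request["Api-Key"] = ENV.fetch("DISCOURSE_API_KEY")
  request["Api-Username"] = "system"
  request["Content-Type"] = "application/json"
  request.body = { post: { raw: new_raw } }.to_json

  Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
    http.request(request)
  end
end

# e.g. censor_post(1234, "IMAGE REMOVED BY ADMIN")
```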

2 Likes

I'm happy to work on this as a paid gig. Can you help with the API side of things, i.e. which API is used to detect offensive content, etc.?

It's all documented very well here for Ruby:

https://cloud.google.com/vision/docs/detecting-safe-search#vision_safe_search_detection-ruby

In PHP, we implemented this in Drupal in just under 2 hours.

3 Likes

2 hours should be acceptable for this. Should I send you a PM regarding this?

3 Likes

Yes, please. That would be great.

1 Like

@Terrapop - Something you may want to take into account is the accuracy of the recognition. It can be good to be able to see some of the content that was blocked, to ensure the configuration isn't too strict in terms of 'POSSIBLE', 'LIKELY' and 'VERY_LIKELY'. False positives and negatives are quite common.

I think it might be a better implementation to send any post that includes images above a certain 'possibly adult' level to the review queue. That way the post is never public, but you can still approve it if the recognition wasn't accurate. If it's rejected from there, the images will be deleted, I believe after the period set by `clean_orphan_uploads_grace_period_hours`.

This would allow using the 'POSSIBLE' detection level with more confidence.
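To make the idea concrete, a hedged sketch of such a routing policy (the likelihood names come from the API; the action mapping and method name are hypothetical):

```ruby
# Hypothetical sketch: route a post based on the highest SafeSearch
# likelihood among its images, rather than rejecting outright.
def action_for(likelihood)
  case likelihood.to_s
  when "VERY_LIKELY"        then :reject  # almost certainly explicit
  when "LIKELY", "POSSIBLE" then :review  # hold in the review queue
  else                           :allow   # publish normally
  end
end
```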

2 Likes

We have tested the API on our current website and know which levels work well for us.

@fzngagan is developing the plugin for us as open source, so once it's finished you can alter it and submit a pull request with an option to forward flagged posts to the mod queue instead of rejecting them outright.

3 Likes

Here’s the plugin.

I love the idea of tying the likelihood to the review queue when the image is part of a post. Happy to accept sponsorship/PRs in that regard. :slight_smile:

5 Likes

If that is optional as an addition on top, I am fine with that of course.

We used the API in our former community for quite some time and know the levels that are acceptable for us. Most of the time the API was right to deny an image, and the user simply uploaded a less severe image instead.

Also, I wanted not only posts but also avatar and profile background image uploads to be checked. I don't know whether a queue option is possible for those as well?

2 Likes