Prevent scraping scripts from running on my Discourse platform


(Anna Naumova) #1

Hello. My name’s Anna, and I want to run the Discourse platform on my own server.

I have already installed Discourse on my server and it’s working fine.

One concern: if a user accesses the site with a scraping script, for example one written in PHP or Python, I want to stop the server from sending content to that IP.

The idea is that if the user isn’t using a browser, the server doesn’t send them any content.

I would like this to be done by a plugin, but I am open to your suggestions.

I am also open to any budget suggestions from you. Thanks in advance. Anna.


#2

It’s not simple to detect scrapers, because today headless browsers are used to scrape.

A simple first step would be to block all the IPs you know are scrapers.
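
A minimal sketch of that IP-blocking step, written in Python purely for illustration. It is not a Discourse plugin or a Discourse API; the `BLOCKED_NETWORKS` list and the `is_blocked` helper are hypothetical names, and the addresses are reserved documentation ranges standing in for IPs you have actually identified as scrapers. In practice this kind of check usually lives in the firewall or reverse proxy in front of the forum.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges,
# used here as placeholders for IPs you know belong to scrapers.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # a whole range
    ipaddress.ip_network("198.51.100.17/32"),  # a single address
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    # Membership is False when the address and network differ in IP version.
    return any(addr in net for net in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("203.0.113.42", "198.51.100.18"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

Note that this only helps against scrapers whose addresses you already know; a scraper that rotates IPs will not be stopped by it.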

Also, how much harm are scrapers doing? What kind of content do you have?


(Anna Naumova) #3

Hello @PabloC.

Thanks for your reply. For now, I am just opening the site, so there is no content at all.

But I just want to detect whether the user is using a browser or not.

If they are not using a browser, then block sending content on the server side. Is it possible?


(Muhlis C) #4

Something like Cloudflare’s browser integrity check?


(Anna Naumova) #5

Thanks, @mbcahyono, for your reply.

Sure, very similar to that.

Is there any way to block content for non-browser users?


(Felix Freiberger) #6

Currently, there isn’t. Something like this would have to be a plugin, and wouldn’t be effective – as @PabloC mentioned, modern scrapers are browsers.
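
To illustrate why a server-side "is this a browser?" check is weak, here is a rough Python sketch of a naive header heuristic. The `looks_like_browser` function and the `BROWSER_HEADERS` tuple are hypothetical names made up for this example, not part of Discourse; the check simply assumes browsers send a `Mozilla/`-style `User-Agent` along with `Accept` and `Accept-Language`.

```python
# Headers this naive check assumes a real browser always sends (an assumption).
BROWSER_HEADERS = ("user-agent", "accept", "accept-language")


def looks_like_browser(headers: dict) -> bool:
    """Naive heuristic: required headers present and a Mozilla-style User-Agent."""
    normalized = {name.lower(): value for name, value in headers.items()}
    if any(not normalized.get(name) for name in BROWSER_HEADERS):
        return False
    return normalized["user-agent"].startswith("Mozilla/")


# A script using its HTTP library's default headers is caught...
default_script = {"User-Agent": "python-requests/2.31.0", "Accept": "*/*"}

# ...but the same script passes once it copies a browser's headers.
spoofing_script = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}

if __name__ == "__main__":
    print(looks_like_browser(default_script))
    print(looks_like_browser(spoofing_script))
```

Everything the heuristic inspects is supplied by the client, so a scraper can set it to whatever it likes, and a headless browser sends genuine browser headers to begin with.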


(Anna Naumova) #7

Thanks, @fefrei, for your reply.

I am going to give up on making that kind of plugin.

There is not much benefit in having it, even if I could get one built!

I have posted many times to get help, but I am finally giving up… Thanks anyway.


(system) #8

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.