Forums that forbid AI content… How's it going?

I’d love to hear about the experiences of staff at forums that don’t allow AI/LLM-generated content. How have you been able to communicate this to users? How do you detect it? How do you approach users who post it anyway?

All thoughts are welcome.

Most, if not all, AI text can be easily detected just by reading it. Google’s SynthID is cool tech for detecting AI images, and it claims to be able to detect text as well, though probably only text written by Gemini; as far as I know, OpenAI also supports the standard. Detecting the text myself is probably an acquired skill, but I appreciate the work being done to respond to the current crisis of not being able to detect AI imagery or text.

Mutes and suspensions are still the right way to go for this in my opinion, especially if the account is new. If a random new account joins your site and instantly posts an AI-generated topic, I see no reason why you shouldn’t just suspend the account and block it.

As for the entire scraping dilemma: my site is for internal communication & documentation within a small company at the moment, and I’m planning on using it as a backend for blogging eventually. It was not hard to set up a honeypot to deter the crawlers that opt to ignore the robots.txt files on my domains.
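For anyone wondering what a honeypot like that can look like, here is a minimal sketch, not my exact setup. The trap path `/trap/`, the `should_send_to_maze` helper, and the in-memory `flagged_clients` set are all made up for illustration, and the user-agent list is just a few crawler names as I understand them. The idea: robots.txt disallows a path that nothing legitimate links to, so any client that requests it has ignored robots.txt and gets flagged.

```python
import re

# Hypothetical trap path: disallowed in robots.txt and not linked anywhere
# a human would click, so a well-behaved crawler never requests it.
TRAP_PREFIX = "/trap/"

# What the matching robots.txt would contain (assumption for this sketch).
ROBOTS_TXT = "User-agent: *\nDisallow: /trap/\n"

# A few AI crawler user-agent tokens; treat this list as an example only.
AI_CRAWLER_PATTERN = re.compile(r"GPTBot|ClaudeBot|CCBot|Bytespider", re.IGNORECASE)

# Clients caught in the trap. In-memory only; a real setup would persist this.
flagged_clients = set()


def should_send_to_maze(client_ip, path, user_agent):
    """Return True if this request should be routed to the garbage maze."""
    # Requesting the disallowed path means the client ignored robots.txt.
    if path.startswith(TRAP_PREFIX):
        flagged_clients.add(client_ip)
    # Once flagged, always flagged; self-identified AI crawlers go there too.
    return client_ip in flagged_clients or bool(AI_CRAWLER_PATTERN.search(user_agent))
```

In practice this check would sit in the reverse proxy or a small middleware in front of the site, so that flagged traffic never reaches the real content.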

Just this one tactic alone has led to somewhere in the neighborhood of 6 MILLION requests over the span of two weeks (about 5 reqs/s to the domain on average).

Whenever an AI crawler visits the site, it is led into an infinite maze of spam served by the lovely iocaine project, self-hosted with a dataset of roughly 7000 made-up words, some gibberish HTML, random words, and fake news (made by an 8B Llama model).
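To give a rough idea of the "infinite maze" concept, here is a toy sketch. To be clear, this is not how iocaine is implemented or configured; the `maze_page` and `make_word` functions, the syllable list, and the `/maze/` URL prefix are all assumptions made up for the example. Each page is generated from a random generator seeded by the URL path, so the same URL always returns the same gibberish, and every page links to more generated URLs, so the crawl never runs out of pages.

```python
import hashlib
import random

# Tiny stand-in for a word list; a real dataset would be far larger.
SYLLABLES = ["ba", "lor", "mi", "ten", "ka", "zu", "fri", "on", "el", "dro"]


def make_word(rng):
    """Build one made-up word from 2 to 4 random syllables."""
    return "".join(rng.choice(SYLLABLES) for _ in range(rng.randint(2, 4)))


def maze_page(path, n_words=60, n_links=5):
    """Return a gibberish HTML page for `path` that links deeper into the maze."""
    # Seed from the path so the page is stable across repeated requests.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))

    text = " ".join(make_word(rng) for _ in range(n_words))
    # Every link points at another generated path, so there is no dead end.
    links = "".join(
        f'<li><a href="/maze/{make_word(rng)}/{make_word(rng)}">{make_word(rng)}</a></li>'
        for _ in range(n_links)
    )
    return f"<html><body><p>{text}</p><ul>{links}</ul></body></html>"
```

Nothing needs to be stored on disk with this approach, since any page can be regenerated from its URL alone.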

Obviously this is a nuclear “go away” tactic and is not for everybody, but it has been great for me in my goal of stopping LLMs from taking my code or text content. I remember reading a case study Anthropic did about LLM poisoning, but I can’t find the article anymore, so it isn’t attached here. Surely at some point they will need to block my domain when they realize the bot has recently sent a cool 5 million requests to it.
