Automatically add a staff notice about duplicates elsewhere on the internet?

The Akismet spam filter is quite good at finding duplicate posts between the Docker forum and, say, Stack Exchange sites (mostly Stack Overflow), GitHub and Reddit. These posts end up in the review queue, but the review does not reveal where Akismet may have found the matching post:


Akismet flagged this post as potential spam.

I guess I wanted to ask if the Akismet plugin could be configured to show URLs of other occurrences. But actually, I want more…

Often just copy/pasting part of the text into Google reveals the source after all. And on the Docker forums I then tend to reject the flag (approve the duplicate post) but also add a staff notice for the volunteers who’re answering questions. Like so:


:warning: This was also posted on Stack Overflow. If you want to spend time on answering, you may want to check if new details were added or if someone already answered there.

So, wondering: did anyone ever try to automate something similar?
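To make the idea a bit more concrete, here is a minimal sketch of the two local steps: deciding whether a forum post and a candidate post from elsewhere look like the same text, and building the notice. Everything in it is an assumption for illustration: the names `KNOWN_SITES`, `site_name`, `similarity` and `staff_notice` are hypothetical, the word-shingle comparison is just one simple way to spot copy-paste, and the notice wording is the one quoted above with a link added. It does not talk to Akismet or to the forum; finding the candidate URL and actually attaching the notice would still have to happen some other way.

```python
from urllib.parse import urlparse

# Hypothetical mapping from host name to display name; extend as needed.
KNOWN_SITES = {
    "stackoverflow.com": "Stack Overflow",
    "github.com": "GitHub",
    "reddit.com": "Reddit",
}


def site_name(url: str) -> str:
    """Return a friendly site name for a URL, falling back to the host."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    for domain, name in KNOWN_SITES.items():
        if host == domain or host.endswith("." + domain):
            return name
    return host or url


def shingles(text: str, size: int = 5) -> set:
    """Split text into overlapping runs of `size` lowercased words."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 5) -> float:
    """Jaccard similarity of the two texts' word shingles (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def staff_notice(url: str) -> str:
    """Build the staff notice text for a duplicate found at `url`."""
    return (
        f":warning: This was also posted on [{site_name(url)}]({url}). "
        "If you want to spend time on answering, you may want to check "
        "if new details were added or if someone already answered there."
    )
```

A moderator script could then call `similarity(forum_text, other_text)` and only suggest `staff_notice(url)` when the score is above some threshold. What threshold works is something one would have to find out by trying it on real posts.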

Asides:

  • I also tend to post a link back to the forum on Stack Overflow; that’s a manual action anyway. So even if the staff notice were automated, one would still want to be notified.

  • I quite often use the same approach for “New user typed their first post suspiciously fast, suspected bot or spammer behavior.” which is not detected (or marked) as a duplicate by Akismet (yet).


Hi @Arjan. :wave:

I didn’t know Akismet filtered for duplicate copies online; I’m guessing it was the inclusion of certain markup used in those examples that triggered Akismet.

I can’t find any mention of Akismet providing that service. Could you point me to where you saw it? If that information is available via their service, maybe we can tap into it. :slight_smile:


Hmmm, you may be right. I boldly assumed that Stack Exchange was also using Akismet (which I do not actually know). I think, but will need to find examples, that I also saw the review being triggered for existing posts after they were duplicated to Stack Exchange. Most often it seems the Stack Exchange post was older, which also explains copy-paste triggering the “typed their first post suspiciously fast” review.

Also, for some time, we surely saw many false positives after posts were edited. This made me assume the filter was confused by its own algorithm for finding duplicates, not understanding that the duplicate from some online database was the very same post on the very same forum. When searching for the cause of this, I did not find any references to such a feature in Akismet’s services.

So, many assumptions. I’ll try to find some examples, but maybe even more posts are duplicated between the forum and other places, and maybe I’ve only found a few of them after all. :thinking:

It seems Stack Exchange has its own homemade solution, or at least it did 2 years ago: How does spam protection work on Stack Exchange? - Stack Overflow Blog

Of course, Akismet could still subscribe to the public feed of Stack Exchange posts, but it’s not their goal to find duplicates. (Or maybe the Stack Exchange duplicates that Akismet flagged were also duplicated elsewhere. Oh well.)


@maiki I’ve not run into examples that confirm this actually happened. Akismet surely did flag existing posts as spam after some time had passed, but I know too little about its internals to tell why.
