Human-driven copy-paste spam

Mittineague · September 17, 2018, 11:32pm

Might it be possible to mark posts that were edited as “unreviewed”, or modify the WHERE in the query to include them, so that those sites that wanted to could use this plugin?

sam · September 17, 2018, 11:35pm

Discourse Moderator Attention already works that way, but I did not really want to open this Pandora’s jar. It is also an atomic way of handling this.

sam · September 17, 2018, 11:37pm

This is a very, very interesting use case, @rishabh. We want to be able to make some queries available to moderators in Data Explorer if flagged explicitly by admins; at the moment only admins have access to Data Explorer, for security reasons.

jsha · September 18, 2018, 12:11am

I had forgotten about Data Explorer! I would be interested in seeing this query, at least as a first pass so we can evaluate how much stuff we might be missing today.

codinghorror · September 18, 2018, 2:27am

Narrowing the allowed edit window to 60 minutes or 30 minutes (or even less) should adequately address this, I would think.

rishabh · September 19, 2018, 7:31am

Last 500 posts that were edited by TL0/TL1 users

-- Posts most recently self-edited by TL0/TL1 users
SELECT
  p.id AS post_id,
  p.topic_id
FROM posts p
  JOIN users u
    ON u.id = p.user_id
  JOIN topics t
    ON t.id = p.topic_id
WHERE p.last_editor_id = p.user_id   -- the last edit was made by the author
  AND p.self_edits > 0               -- and the author has edited at least once
  AND u.trust_level IN (0, 1)
  AND p.deleted_at IS NULL
  AND t.deleted_at IS NULL
  AND t.archetype = 'regular'        -- skip private messages
ORDER BY p.updated_at DESC
LIMIT 500

This query should do it: it lists recent posts that have been edited by the OP if the user has a trust level of 0 or 1.
Shoutout to @simon for helping me finish this query!
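For anyone who wants to see which rows the filter keeps before running it on a live site, here is a minimal sketch that runs the same WHERE clause against an in-memory SQLite database. The toy schema is an assumption: it contains only the columns the query touches, and the sample rows and the helper names (`build_demo_db`, `recently_self_edited`) are made up for illustration.

```python
import sqlite3

# Toy schema (assumption): only the columns the query references.
SCHEMA = """
CREATE TABLE users  (id INTEGER PRIMARY KEY, trust_level INTEGER);
CREATE TABLE topics (id INTEGER PRIMARY KEY, archetype TEXT, deleted_at TEXT);
CREATE TABLE posts  (id INTEGER PRIMARY KEY, topic_id INTEGER, user_id INTEGER,
                     last_editor_id INTEGER, self_edits INTEGER,
                     updated_at TEXT, deleted_at TEXT);
"""

QUERY = """
SELECT p.id AS post_id, p.topic_id
FROM posts p
  JOIN users u ON u.id = p.user_id
  JOIN topics t ON t.id = p.topic_id
WHERE p.last_editor_id = p.user_id
  AND p.self_edits > 0
  AND u.trust_level IN (0, 1)
  AND p.deleted_at IS NULL
  AND t.deleted_at IS NULL
  AND t.archetype = 'regular'
ORDER BY p.updated_at DESC
LIMIT 500
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, 0), (2, 3), (3, 1)])
    conn.executemany(
        "INSERT INTO topics VALUES (?, ?, ?)",
        [(10, "regular", None), (11, "private_message", None)],
    )
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (100, 10, 1, 1, 1, "2018-09-14 15:07", None),  # TL0 self-edit: listed
            (101, 10, 2, 2, 2, "2018-09-15 09:00", None),  # TL3 self-edit: skipped
            (102, 10, 3, 3, 0, "2018-09-16 09:00", None),  # never edited: skipped
            (103, 11, 3, 3, 1, "2018-09-17 09:00", None),  # private message: skipped
            (104, 10, 3, 3, 2, "2018-09-18 09:00", None),  # TL1 self-edit: listed
        ],
    )
    return conn


def recently_self_edited(conn):
    """Return (post_id, topic_id) rows, most recently updated first."""
    return conn.execute(QUERY).fetchall()
```

With the sample rows above, only posts 104 and 100 come back, newest edit first.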

jsha · September 19, 2018, 5:10pm

Wow, thanks. I am, as always, blown away by the level of friendliness and helpfulness here.

I’m poking through the output of that list, and here’s a good example of why this particular type of spam is so frustrating:

Post: https://community.letsencrypt.org/t/future-of-wildcard-certificates-obtaining/72213
Original: Future of wildcard certificates obtaining : letsencrypt

You can see that several forum members spent a while helping this person out with genuine answers to their questions, not realizing that those questions had actually been asked on Reddit and they were talking to a spammer.

Anyhow, the Data Explorer query is a cool tool; we’ll see how that works for us. And I’ll additionally lower the post edit time.

pfaffman · September 19, 2018, 5:37pm

Wow. That’s a fairly intricate and devoted attempt to spam. And the links have no-follow, so they don’t even do the spammers any good.

Stranik · September 19, 2018, 8:36pm

Nofollow links do not pass page weight along, but search engines still register the click-throughs from them. Behavioral factors have been weighted heavily recently. That’s why spammers want to place links anywhere there is real click-through traffic.

codinghorror · September 20, 2018, 3:37am

Looking at the example… the original post was at 9-14 5:28am, and the spammy edit was much much later at 9-14 3:07pm. So tightening the allowed edit window considerably would, again, be my first recommendation.
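To put a number on that gap, here is a small sketch that computes the delay between the original post and the edit and checks it against an edit window. The year 2018 is taken from the thread dates, the timestamps are treated as being in the same (unstated) timezone, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def edit_delay(created_at, edited_at):
    """Time elapsed between posting and the edit."""
    return edited_at - created_at


def outside_window(created_at, edited_at, limit_minutes):
    """True if the edit happened after the allowed edit window had closed."""
    return edit_delay(created_at, edited_at) > timedelta(minutes=limit_minutes)


# Timestamps from the example above (year assumed to be 2018).
created = datetime(2018, 9, 14, 5, 28)
edited = datetime(2018, 9, 14, 15, 7)

delay = edit_delay(created, edited)
print(delay)                                # 9:39:00
print(outside_window(created, edited, 60))  # True
```

The edit came 9 hours 39 minutes after the post, so a 60-minute window would have closed long before it.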

jsha · September 20, 2018, 4:01am

Yep, I’ve tightened to 60 minutes. We’ll see what effect it has! If the spammers continue without paying much attention, we may wind up answering a lot of copy-pasted questions and just not getting the follow-up edits that add the links. It’s entirely possible for both the spammers and us to continue oblivious after the change.

codinghorror · September 20, 2018, 4:26am

We are always interested in ways to better defeat spammers by default so keep us advised on the results!

jsha · September 22, 2018, 5:31pm

So far we’ve run into one example (that we know of) where the 60-minute window restricted legitimate activity: A forum regular (TL3) wanted to edit their post at Compatibility testing of No Common Name - Issuance Tech - Let's Encrypt Community Support and was surprised to find they couldn’t. They followed up on our Lounge thread.

Is it true that lowering the edit window also restricts the ability to “Make Wiki”? If so, that makes sense, but it could be clearer. Maybe by keeping the “Make Wiki” option but providing an informative error?

A TL4 user later came along and made the post a wiki. I assume “make any post a wiki” is a TL4 privilege, but it doesn’t appear to be listed at Understanding Discourse Trust Levels. Might make a good edit to that post!

jsha · October 15, 2018, 4:37am

As an FYI, we just had our first instance (AFAIK) of a reply that was edited within the 60-minute window to add spam links: Plesk wildcard certificate renewal fails - Help - Let's Encrypt Community Support.

The reply was made at 2:25 am and the edit was made at 3:21 am, which might be just a coincidence or might indicate intentional adaptation to the new limit.
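As a rough check on how close that was to the limit, the sketch below works out the delay and lists which candidate edit limits would have blocked the edit. It assumes both times fall on the same day; the candidate limits and the function names are illustrative only.

```python
def minutes_since_midnight(clock):
    """Convert an 'H:MM' clock time to minutes since midnight."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


def edit_delay_minutes(posted, edited):
    """Delay in minutes between two same-day clock times."""
    return minutes_since_midnight(edited) - minutes_since_midnight(posted)


def limits_that_block(posted, edited, candidate_limits):
    """Return the candidate limits (in minutes) shorter than the delay."""
    delay = edit_delay_minutes(posted, edited)
    return [limit for limit in candidate_limits if delay > limit]


print(edit_delay_minutes("2:25", "3:21"))               # 56
print(limits_that_block("2:25", "3:21", [60, 30, 15]))  # [30, 15]
```

The edit landed 56 minutes after the reply, four minutes inside the 60-minute window.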

codinghorror · October 15, 2018, 6:08am

No, I don’t think this is true. Have you found it to be the case? I’m unclear.

Your options at this point are to further tighten the time limit for editing, from 60 minutes to 30 minutes, 15 minutes, etc… or…

I believe a new release of Data Explorer should have the “show me recently edited posts” query bundled with it, but I am not sure when that will be released. What’s the planned date of release for that @rishabh?

rishabh · October 15, 2018, 6:18am

That was merged last week with:

github.com/discourse/discourse-data-explorer

FEATURE: Ship default queries with the Data Explorer

discourse:master ← rishabhnambiar:ship_default_queries

opened 10:15AM - 28 Sep 18 UTC

rishabhnambiar

+352 -11

**How it works:**
- Queries are added to the Data Explorer from json
- These de…fault queries are saved in the db IF they are run
- If changes are made to the json file, the new values are updated whenever we run a query.

- [x] Reads default queries from json and adds to serializer
- [x] Hides edit/delete buttons for default queries
- [x] Gives a persistent id to each default query
- [x] Updates last run at for default queries
- [x] Choose which default queries to ship and make a post on meta for selection
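A rough sketch of the behavior the PR description lists, for readers who find code easier to follow. This is not the plugin's implementation: the JSON layout, the negative id, the query name, and the `QueryStore` class are all assumptions made up for illustration, and a plain dict stands in for the database.

```python
import json

# Hypothetical JSON layout: a persistent id mapped to the query definition.
DEFAULTS_JSON = """
{
  "-1": {"name": "Recently edited posts", "sql": "SELECT 1"}
}
"""


class QueryStore:
    """Toy model: default queries live in JSON and are persisted when run."""

    def __init__(self, defaults_json):
        self.load_defaults(defaults_json)
        self.saved = {}  # stands in for the database table

    def load_defaults(self, defaults_json):
        raw = json.loads(defaults_json)
        self.defaults = {int(query_id): fields for query_id, fields in raw.items()}

    def run(self, query_id, now):
        if query_id in self.defaults:
            # Saved only if run; refreshed from the JSON values on every run.
            record = self.saved.setdefault(query_id, {})
            record.update(self.defaults[query_id])
            record["last_run_at"] = now
        return self.saved[query_id]

    def is_editable(self, query_id):
        # Edit/delete controls are hidden for default queries.
        return query_id not in self.defaults
```

In this model nothing is written to `saved` until `run` is called, and a later change to the JSON only shows up in the saved record the next time the query runs.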

Sites that are up to date can see this on admin/plugins/explorer:

jsha · October 15, 2018, 5:47pm

Yes, one of our forum users reported trying to self-wiki a post and failing, after I had changed the edit window. They were TL3 at the time. After I bumped them to TL4 they were able to wiki the post.

jomaxro · October 15, 2018, 6:34pm

I can confirm that this is true, just tested on try. Both the “edit” and “make wiki” buttons disappear outside the post edit time limit.

I seem to recall it was intentional, not a bug. If a user is restricted from editing their post, they shouldn’t be able to make the post a wiki such that they can edit it. That bypasses the edit restriction. In this case staff interaction is required. We see this occur even here on Meta with some of our older howto and plugin topics that aren’t already wiki’d.
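The rule as described in this thread can be sketched as a small predicate. This only models the cases reported here (TL3 blocked outside the window, TL4 and staff allowed) and is not Discourse's actual permission code; the function and parameter names are hypothetical, and it ignores any minimum trust level that self-wiki may require.

```python
def within_edit_window(minutes_since_post, edit_limit_minutes):
    """True while the author is still allowed to edit their own post."""
    return minutes_since_post <= edit_limit_minutes


def can_self_wiki(trust_level, is_staff, minutes_since_post, edit_limit_minutes):
    """Model of the behavior reported above.

    Staff and TL4 users can wiki regardless of the window; other users
    can only wiki their own post while they could still edit it.
    """
    if is_staff or trust_level >= 4:
        return True
    return within_edit_window(minutes_since_post, edit_limit_minutes)


# A TL3 user two hours after posting, with a 60-minute edit limit:
print(can_self_wiki(3, False, 120, 60))  # False
# The same user after being bumped to TL4:
print(can_self_wiki(4, False, 120, 60))  # True
```

Tying wiki to the edit window keeps a user from using "make wiki" to regain edit access they would otherwise have lost.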

codinghorror · October 15, 2018, 6:42pm

Reading more closely, it sounds like this is already the case so nothing else to do here then?

jsha · October 15, 2018, 7:00pm

Yep, I think this was all “working as intended.” It was a bit confusing since the post edit time limit setting didn’t mention that it also affected wiki’ing, and when attempting to wiki, there was no notification that “you can’t wiki this post because it’s outside the time limit.” Those might be a couple minor doc improvements, though I also acknowledge this is a pretty niche area, so I don’t feel strongly if you want to leave it as is.
