[Paid] Prune Spam users

What would you like done?

Remove spam users from Discourse. Specifically, remove users who have not logged in for a year and have never posted.

When do you need it done?

ASAP

What is your budget, in $ USD that you can offer for this task?

Please give me a quote!

Removing spam users would be good.

User.joins(:user_stat).where("user_stats.post_count = 0 AND previous_visit_at <= '2016-05-22'::timestamp").destroy_all
Should do the trick :slight_smile: Of course make sure to make a backup first :slight_smile:
Edit:
Sorry, forgot the usage tips :slight_smile: You have to log into your server, go into the Discourse directory [probably /var/discourse], enter your Docker container [probably ./launcher enter app], and then start the Rails console by typing rails c. These are the defaults on Docker-based installations :slight_smile:
@treb0r

Hey @MakaryGo, thanks dude, that’s kind of you.

I will give it a go.

Cheers :smile:

Don’t share email addresses in public, go ahead and PM each other as needed!

One caution: destroying users can be slow, so be sure you batch this somehow in case a giant mass delete times out or anything weird.
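For what it's worth, batching could look roughly like the sketch below. It is only an illustration: destroy_in_batches, batch_size, and pause are made-up names, and the helper assumes the scope behaves like an ActiveRecord relation (it responds to limit, and each record responds to destroy). It stops as soon as a pass destroys nothing, so a record that refuses to be destroyed cannot cause an endless loop.

```ruby
# Hypothetical helper (not part of Discourse): destroys the records in
# `scope` a few at a time instead of in one giant pass.
# Assumes `scope.limit(n).to_a` returns up to n records, as an
# ActiveRecord relation does, and that each record responds to `destroy`.
def destroy_in_batches(scope, batch_size: 100, pause: 0)
  total = 0
  loop do
    batch = scope.limit(batch_size).to_a
    break if batch.empty?

    # ActiveRecord's destroy returns false when the deletion is halted,
    # so only truthy results are counted.
    destroyed = batch.count { |record| record.destroy }
    total += destroyed

    # Nothing could be destroyed in this pass: stop rather than spin.
    break if destroyed.zero?

    # Optional breather between batches to go easy on the database.
    sleep(pause) if pause.positive?
  end
  total
end
```

In the Rails console this would be called with the query from earlier in the thread, for example destroy_in_batches(User.joins(:user_stat).where("..."), batch_size: 100). Take a backup first either way.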

There are probably fewer than 1000 users with zero posts who haven’t logged in in the past year, so this should be reasonably safe.

The users were originally imported into Discourse from BBpress, and that’s where the spam users came from.
I think there’s probably about 5000 or so.

What is the syntax to get a count of users before I run the destroy?

Put .count at the end of the line above, in place of .destroy_all.

I think you can add limit 1000 in the where clause to limit the query.

Thanks. That’s great.

I’ll give it a go…

Not quite. The limit thing works like this:

User.joins(:user_stat).where("user_stats.post_count = 0 AND previous_visit_at <= '2016-05-22'::timestamp").limit(1000)

You can add .destroy_all at the end of it.
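Putting the last few replies together, one cautious way to run this is to build the query once, count it, and only then destroy a limited slice. The sketch below uses a made-up helper name, stale_zero_post_users, and passes the model in as a parameter so the same lines can be tried outside Discourse. The date is bound with a ? placeholder instead of being written into the SQL string.

```ruby
require "date"

# Hypothetical helper (not part of Discourse): builds the query from this
# thread for a given cutoff date. `user_model` is assumed to be Discourse's
# User class, or anything else that responds to `joins` and `where` the way
# an ActiveRecord model does.
def stale_zero_post_users(user_model, cutoff)
  user_model
    .joins(:user_stat)
    .where("user_stats.post_count = 0 AND previous_visit_at <= ?", cutoff)
end

# In the Rails console (not executed here):
#   scope = stale_zero_post_users(User, Date.new(2016, 5, 22))
#   scope.count                    # dry run: how many users match
#   scope.limit(1000).destroy_all  # remove at most 1000 of them per pass
```

Repeating the last line until scope.count reaches 0 keeps each pass small.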

Just getting around to trying this.

Does anyone have any guidance on the best way to maintain a staged copy of a production Discourse forum?

I’m worried about running these kinds of queries on production, even if I do have a fresh backup.

I was thinking about setting up an LXD container on my local machine and installing Docker-based Discourse there. What’s the best way of handling this?

Here’s how I do it:

It’s enough to import a backup just once.

Then you can try anything out in your sandbox, including checking that the latest OS patches and Discourse updates are OK. Our live site is on AWS; if you run your sandbox on a different provider than your live site, you’ll be exposed to differences in OS/droplet changes between the two.

I don’t run any mail on the sandbox, so regardless of what I mess up it won’t start sending stuff to the users.

That’s great, thanks for the reply.

One question - what happens to the urls in this situation?

If I import a backup from mydiscourse.com into mydiscourse.dev do I need to manually change the address in the admin?

It depends. If you leave .com in .css, for example, then the sandbox will keep taking you to the live site.

IIRC, I just went through the settings and changed any that seemed likely to cause a problem, but to be absolutely safe you might want to change all of them.

Also, if you’re using S3 for storage, do something about that so that you are sure that you’re not influencing the live site.

Links to other posts inside the sandbox resolve to the corresponding item in the sandbox, but I think this might depend on what you do with the hostname in app.yml (if you copy your app.yml over from your live site then be veewy, veewy cawefuw…)
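If going through the settings by eye feels error-prone, the check could be scripted along these lines. This is a rough sketch, not a Discourse API: settings_mentioning is a made-up helper, it assumes the settings are already available as a plain Hash of name => value (how that Hash is obtained is left open), and the example values are invented.

```ruby
# Hypothetical helper: returns the names of settings whose value still
# mentions the live hostname, sorted alphabetically.
# `settings` is assumed to be a Hash of setting name => value.
def settings_mentioning(settings, live_host)
  settings
    .select { |_name, value| value.to_s.include?(live_host) }
    .keys
    .sort
end

# Invented example data, for illustration only.
example = {
  "notification_email" => "noreply@mydiscourse.com",
  "title" => "My Forum",
  "s3_cdn_url" => "https://cdn.mydiscourse.com",
}

# Lists the settings that would still point at the live site.
puts settings_mentioning(example, "mydiscourse.com")
```

Anything the helper lists is a candidate for changing in the sandbox before you start experimenting.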

Thanks. I’ll be careful :pray:

You could also change the DNS on the machine you’re browsing from, so you don’t keep getting pushed to the live site.

Great idea. Thanks. /etc/hosts did the trick.
