Delete deleted-posts permanently in bulk?

If I am not mistaken there is a field called deleted_at

If it is deleted, then there should be a date time Stamp.

If it is not deleted then is null

Maybe look for the entry that is <> Null and delete

1 Like

Thanks Gav, but the ones that I am targeting arenā€™t marked as deleted. Rather, their Topic has been deleted and then destroy_allā€™d.

2 Likes

I was playing around with this and i notice that part of your query was not returning the response you need.

SELECT topic_id from posts does not return an integer, but it returned a string

image

This might be the cause that your db still contains orphaned posts.

2 Likes

I think that you are seeing Data Explorer automagically converting the integer into URL, which it does when the label is topic_id.

When I run this query in Data Explorer, all of the posts that I am attempting to identify are captured (well over 5000):

SELECT id, topic_id
FROM posts
WHERE topic_id not in (select id from topics)
ORDER by id

Iā€™m clearly doing something wrong with my Rails syntax as this is what Iā€™m getting:

[1] pry(main)> Target = Post.where('topic_id not in (select id from topics)')
=> []

Can anyone tell me what I am doing wrong?

1 Like

Well, thanks to @pfaffman I have identified the relevant posts using this:

Post.find_by_sql("select id from posts where topic_id not in (select id from topics)")

I get this output:

[1] pry(main)> Post.find_by_sql(ā€œselect id from posts where topic_id not in (select id from topics)ā€)
=> [#<Post:0x000055df30d4ee90 id: 150>,
#<Post:0x000055df2e538ff0 id: 51097>,
#<Post:0x000055df2e50ba28 id: 83>,
#<Post:0x000055df2e4ee8b0 id: 40636>,
#<Post:0x000055df2e4a92d8 id: 62562>,
#<Post:0x000055df2e4b7978 id: 13522>,
etc

However, I cannot work out how to then apply destroy_all to this selection.

This might help (for my own benefit):

Any suggestions?

1 Like

I think this will do it

posts=
Post.find_by_sql("select id from posts where topic_id not in (select id from topics)")


posts.destroy_all

Or you could add the .destroy_all to your find_by_sql

2 Likes

Iā€™ve tried that. The data seems to be returned as an array with some post identifier and an ID (see Delete deleted-posts permanently in bulk? - #45 by nathank).

This is the error I get when I either add .destroy_all or use the posts= that you suggest:

[2] pry(main)> posts.destroy_all
NoMethodError: undefined method destroy_all' for #<Array:0x000055fe7bc7fc98> from (pry):3:in pryā€™

1 Like

Oh. Then maybe see if

 p=posts.first

Is a post_id. If t thatā€™s it then you can

x=Post.find(p)
x.destroy

Then you could loop through those.

I think youā€™d need to wrap your query in something to get it to be an array of posts rather than post ids.

1 Like

Thanks Jay. This is what I get:

[3] pry(main)> p=
[3] pry(main)* posts.first
=> #<Post:0x0000563a24cab908 id: 150>
[4] pry(main)> x=Post.find(p)
ArgumentError: You are passing an instance of ActiveRecord::Base to find. Please pass the id of the object by calling .id.
from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation/finder_methods.rb:467:in `find_oneā€™

Now, tell me about these wrap things. This is so annoying, because I know that @Samā€™s rails command here should work, but I think that Rails has changed since then:

1 Like

What did that do? I canā€™t imagine that rails changed.

1 Like

Does that give you the posts you want to destroy?

1 Like

I did read somewhere online when searching for converting an Active Record Array into an Active Record Relation that things changed between Rails 3.x and Rails 4 and the syntax needs to be different, but it went over my head really.

That seemed to pick up a few when I first ran it, and I duly destroy_allā€™d them. But there werenā€™t many. Now doesnā€™t pick up any at all, while the SQL picks up thousands when run in Data Explorer.

These are orphaned Posts, where the Topic has been Destroy_allā€™d.

1 Like

Can you use each{} to iterate the members of that array, calling destroy on each post individually?

Post.find_by_sql(ā€œselect id from posts where topic_id not in (select id from topics)ā€).each { |p| p.destroy }
2 Likes

Well, Iā€™ve given it a try. First it didnā€™t like the SQL wrapped up like that nor the syntax:

Post.find_by_sql(ā€œselect id from posts where topic_id not in (select id from topics)ā€).each { |p| p.destroy_all }
SyntaxError: unexpected `inā€™, expecting ā€˜(ā€™
ā€¦rom posts where topic_id not in (select id from topics)ā€)ā€¦
ā€¦ ^~
SyntaxError: unexpected local variable or method, expecting end-of-input
ā€¦t in (select id from topics)ā€).each { |p| p.destroy_all }

So I had another hacky crack by splitting it up:

posts=Post.find_by_sql("select id from posts where topic_id not in (select id from topics)")
posts.each do |p|
p.destroy
end

This seemed to run okay, but adding the p.destroy throws this up:

ActiveModel::MissingAttributeError: missing attribute: user_id
from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activemodel-6.1.4.1/lib/active_model/attribute.rb:222:in `valueā€™

Tried a few ways to get that in there but then gave up. @sam, can you help?

P.S.

I even tried to convert the SQL to AR via scuttle.io as per this:

Post.select([:id, :topic_id]).where(Topic.select(:id))

Alas, I get this error:

ArgumentError: Unsupported argument type: #Topic::ActiveRecord_Relation:0x000055c67a7131d0 (Topic::ActiveRecord_Relation)

3 Likes

When I copied and pasted that part from a previous message, it looks like the quotes got converted to curly quotes somewhere, I expect thatā€™s the real error. Sorry about that.

find_by_sql is described as returning an object with the values specified in the SQL query, which presumably means youā€™re getting back a Post object which only has the id property set, user_id and everything else is missing.

... find_by_sql("select * ... will deal with that. There is probably a subset of values you could select to achieve a destroy, rather than selecting everything, but I donā€™t know what that subset would be.

So the whole thing: (without curly quotes this timeā€¦)

Post.find_by_sql("select * from posts where topic_id not in (select id from topics)").each { |p| p.destroy }
4 Likes

:partying_face: Hallelujah!!! :partying_face:

Thank you Simon - that worked an absolute treat. The orphaned posts are all gone and I look forward to my uploads shrinking down to almost nothing over the next 24hrs or so.

later
And they did! 3.5GB down to 0.7GB. Sweet!!

7 Likes

Excellent, glad to hear that did the job for you. So for the sake of combining the answers, one can use the following in the Rails console to destroy all topics that were deleted more than 90 days ago, repeating as many times as necessary if there are more than 1000 topics:

Topic.with_deleted.where(deleted_at: ...90.days.ago).limit(1000).destroy_all

After that has completed, the following can be used to destroy all posts that have been orphaned from destroyed topics:

Post.find_by_sql("select * from posts where topic_id not in (select id from topics)").each { |p| p.destroy }

Itā€™s worth noting that the above commands will not destroy deleted posts however, only deleted topics and their orphaned posts. To also destroy deleted posts older than 90 days, use the following, again repeating as necessary:

Post.with_deleted.where(deleted_at: ...90.days.ago).limit(1000).destroy_all

P.S. Out of interest, did you try destroy_all without limit(1000) and have problems or did you not try it without the limit?

8 Likes

I tried it without the limit and something went a bit crazy - but I forget the details sorry.

Can we mark your post as a solution or use this from the OP?

image

3 Likes

No worries. If you hadnā€™t tried it, I would have added a note that the limit might not be necessary but since you had problems, I can just leave it as it is.

1 Like

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.