Recently I ended up being the last remaining admin and maintainer of a basic Discourse Docker image instance that was originally installed on our server in 2021 (I think) and mostly updated by someone else. For some time now, possibly right from the start we’ve been having a problem with uploads from soft-deleted posts not getting orphaned and purged and I’ve been trying to troubleshoot this issue again for a few days as the obsolete files keep piling up and wasting storage space. We are not using S3 and there’s plenty enough storage for the uploads that we actually want to keep available.
I’ve migrated the full Discourse backup file including uploads to a separate staging server for testing by rebuilding with our app.yml following the official Discourse Docker installation guides and after that restoring the backup from commandline. Both installations seem to be running identically fine with no other obvious problems, but the upload issue remains.
I can’t seem to find any relevant errors from any logs and Sidekiq is running the cleanup jobs as scheduled. I have run rake db:migrate on the staging version and rebuilt many times, tried destroying posts permanently and checking settings. After permanently deleting some posts straight from the rails console and trying to run the cleanup job manually, I noticed that the tombstone directory had grown in size just a little at some point and there was some files to begin with anyways so the mechanism must’ve been working in some situations, right? Judging from the small size increase, almost all of the obsolete files are still not detected as orphans.
Current relevant admin panel settings listed below. Can I set the last ones to 0 to effectively skip the grace periods during testing?
clean up uploads = true
clean orphan uploads grace period hours = 1
purge deleted uploads grace period days = 1
How can I troubleshoot this efficiently? I’m comfortable with the commandline but my database skills are rudimentary, so I’d really appreciate some tips to avoid going through every possible server setup detail without a clue about what I’m looking for at this point.
I’ve been desparately searching and reading this forum for similiar cases, but there’s only a few and those threads appear to stop either to a dead end or manual solutions for single files, so not directly suitable for this usecase.
Please ask me for more details if needed, I’m doing my best to solve this for good.
Those topics and a few more linked in them have become quite familiar to me while trying to solve this issue but unfortunately they haven’t provided any definitive solution to this problem.
Yesterday on the staging server I ran these commands modified for topics and posts deleted more than 9 days ago:
After this I noticed a slight increase in tombstone directory content size and I’m still monitoring the situation due to the grace period, still wondering if changing the relevant settings to zero hours/days would bypass the waiting time during testing.
Earlier on the original server I have tried removing uploads from the newest post revisions but the files were still available after the grace period.
At this point I’d be personally more than delighted to find out any working manual solution for permanently deleting even a single topic with its posts and uploads not referenced anywhere else, but this could be a big problem for other people running Discourse naturally assuming that the cleanup settings in the admin panel would be effective as described but not necessarily noticing if it’s not the case and ending up with potentially sensitive uploads expected to be permanently deleted but in fact staying in the filesystem. Our problem is luckily only considering wasted storage, but for someone else this could be much worse.
There’s another similiar mention just two months back:
So, any tips on how to figure out if this is a misconfiguration on our end or an actual bug? We’ve been very happy with Discourse otherwise and I’m very motivated to solve this and help others on the way.
This is purely speculative but from a quick look at the post, post_upload and upload models, you can probably find out if you have orphaned uploads (database objects) with this:
Upload.find_by_sql("select * from uploads where id in (select upload_id from post_uploads where post_id not in (select id from posts))")
I haven’t tested that so I can’t be sure if it will find orphaned uploads correctly or even execute without error. In case it doesn’t work as-is and someone else might be able to make it work, as well as just for anyone else interested, I’ll break down the intent.
Upload.find_by_sql() returns a collection of Upload objects that are matched by the provided SQL query.
(select id from posts) gets all of the IDs for existing posts.
(select upload_id from post_uploads where post_id not in () gets all of the IDs for post uploads for which no post exists.
select * from uploads where id in () gets all of the uploads matching those post upload IDs.
That’s just one possible avenue to investigate though, unfortunately I don’t know the upload system well enough to really contribute much otherwise, except to say that the above definitely does not account for all situations. Edited rather than deleted posts being an obvious one.
There are also other types of uploads not accounted for like user uploads which I assume is things like uploading a profile picture.
Plugins can also create and hold on to uploads, I don’t know what happens with those if say the plugin is removed. I think plugin data remains in the database after a plugin is removed which potentially means any uploads created by that plugin are never removed in that situation.
The query works but it lists only two uploads and their details. There should be hundreds or thousands of uploads matching the orphan criteria, most are image files originally uploaded by users while making normal posts.
There was some sort of an overhaul on the upload process a while after our original installation, wondering if it could be related to our situation somehow: A new era for file uploads in Discourse
The grace period should’ve been passed on the staging server by now but I see no effect in the upload directory size and test files are still available. What should I be looking for next? Could this be caused by some faulty filesystem permissions or such, is there an easy way to check? I’m running out of ideas for specific targets, everything else is running great and this is the only problem we currently have.
Going through similiar topics to gather possibly matching unsolved cases, here’s a good example on how these situations might even cause legal issues due to user uploads not getting orphaned and permanently removed like they should:
Another similiar situation all the way back from 2016:
These kinds of conditions create a huge opening for abuse and even targeted attacks for uploading illegal content that might not get permanently removed from the server even when the admins assume it would. Of course deleting single files manually straight from the filesystem is possible, but I don’t think people should be forced to take that route for such a basic need, especially when there’s a GUI setting indicating an automatic purge process and moderators often don’t have direct access to the server anyways. Also manual deletion is not practical with loads of files scattered around in different deleted topics.
Is here enough basis for an actual bug report? I’m still not ruling out possible misconfiguration on our end, but I’m baffled by the lack of error messages and everything else seems to be running just fine. I’ve spent a growing amount of days on troubleshooting and testing, gaining more knowledge about Discourse and its components in the process so I think with some guidance I could be able to help figuring out if there is some corner case detail triggering this weird behaviour. I hope it’s OK to ping @zogstrip at this point?
For a temporary solution, is it possible to manually move all uploads to the tombstone directory and use the upload recovery methods to restore only the non-orphan files back to their correct directories? I actually tried to do this today, but rake uploads:recover_from_tombstone didn’t restore any files. Could this be pointing to some bigger problem with uploads’ database entries?