Faster disk space calculation for upload-heavy instances

The ‘du’ command, which is used to retrieve the disk space used by uploads, was causing performance issues on my admin dashboard. Yes, we have a LOT of image uploads. Rather than disabling it altogether (we have a Grafana/Prometheus dashboard, after all), I decided to replace it with a much faster, though very rough, approximation using ‘df’. The admin can of course choose between the two, with ‘du’ remaining the default.

I have submitted a PR for this change. It is my first PR, so please go easy on me :))

You can view the PR here:

Do you have a rough idea how long the du was taking for you? I’m not really keen on this approach to resolve the performance issue and I think there are two alternatives:

  1. Just use Upload.sum(:filesize).to_i + OptimizedImage.sum(:filesize).to_i to determine uploads_used_bytes, as we already do for external stores.
  2. Introduce a background job that periodically recalculates the bytes used by uploads and caches the result in Redis.

Personally, I’m more keen on (1) as it is a much simpler solution. We do lose some accuracy, but we don’t need 100% accuracy here. Still, it will be much better than the approximation we get from df.
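
To make the two options concrete, here is a rough sketch in plain Ruby. It is not the actual Discourse code: UploadRow, OptimizedImageRow and CachedUsage are hypothetical stand-ins for the database tables and for a Redis-backed cache, and the uploads_used_bytes method only mirrors the name used above.

```ruby
# Hypothetical stand-ins for rows in the uploads and optimized_images tables.
# In a real Rails app these would be ActiveRecord models backed by the database.
UploadRow = Struct.new(:filesize)
OptimizedImageRow = Struct.new(:filesize)

# Alternative 1: add up the sizes already recorded for each file.
# No filesystem walk is needed, so the cost does not depend on disk speed.
# Rows with a missing (nil) filesize count as 0 bytes.
def uploads_used_bytes(uploads, optimized_images)
  uploads.sum { |u| u.filesize.to_i } +
    optimized_images.sum { |o| o.filesize.to_i }
end

# Alternative 2: recompute only when the cached value is older than `ttl`
# seconds and serve the cached value in between. A plain Ruby object stands
# in for Redis here, and recomputation happens lazily on read rather than in
# a scheduled background job.
class CachedUsage
  def initialize(ttl:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }, &compute)
    @ttl = ttl
    @clock = clock
    @compute = compute
    @value = nil
    @computed_at = nil
  end

  def bytes
    now = @clock.call
    if @computed_at.nil? || now - @computed_at >= @ttl
      @value = @compute.call
      @computed_at = now
    end
    @value
  end
end
```

With these stand-ins, CachedUsage.new(ttl: 3600) { uploads_used_bytes(uploads, images) } would run the sum at most once per hour, however often the dashboard is loaded. Option (1) on its own may already be cheap enough that the cache is unnecessary.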

Thanks for your feedback.

du was taking more than a minute on an HDD, and roughly 20 seconds on an SSD.

My reasoning was that if you have an instance that depends on uploads as heavily as ours does, you have a dedicated partition for uploads anyway, so df on that partition is a reasonable approximation.

But yes, your solution #1 looks more elegant. I’ll look into it and submit a modified PR.

Modified PR is here:

Thank you for your consideration.