List of all uploaded files

Hello friends,

what are you thinking about some extra tab at the admin interface to keep control over the current disk usage, files sizes, click rates, filtering them after extension and searching for specific terms?

I wish, I could get more control over all attachments.

Best

7 Likes

That’s something I definitely want done. Not sure when though.

11 Likes

Sounds useful!

There is a bunch of product questions around how this could/should work in my mind. It looks like a problem that has been already figured out or solved in other forum software, but I wonder how will this be framed / built around Discourse’s philosophy.

I am also wondering more about the manifestations of providing uploads settings, e.g.: if you chose to introduce extension configuration, the composer needs to cater for this when users are selecting/uploading attachments, and so on. Also, whether this should be in core, or as a plugin.

On the bright side, if we ignore all questions / assumptions / product / design considerations, it doesn’t look like it is technically tricky, I made a one hour spike here (this is very immature implementation though :blush:).

You can definitely extend this with a bunch of useful features: settings, search / filtering, sorting, etc (and some other considerations like supporting pagination, 
)

6 Likes

If youbtalking about admin tools. You can considure usong plugin data explorer whic allows admins to query database. When I hunting free space I am querying db there is table uploads whic gives you size location and other useful things. Also one good thing is that deleating rows from uploads also dealeting files. Becouse ther is sidekiq job pruge oprhan uploads.

I am totally open to adding something here, but I feel a top level tab is a bit too much .

Conceptually this feels like a “report” to me with a drilldown vs a section for uploading things.

I would like to see this link to the new report

When I think about this problem I think the main use case is around admins trying to get a handle on the 
 upload problem


  • Why do I have so many uploads?

  • Which users have the most/largest uploads ?

  • What are the 100 biggest uploads on my forum?

  • How many things were uploaded in the last month? That way I can keep track of trends.

@codinghorror / @joffreyjaffeux what are your thoughts here?

12 Likes

Yes I agree it could be a report, we might have to start working on the filtering logic I talked with you months ago. But other than that it should be good.

3 Likes

Great line of questioning @sam !

I think perhaps I lack some context on what is the admin’s job / use case for the upload problem, but it appears to be (if there is any research or perhaps admin opinions to confirm that, it would be great!) that it can be framed as: I am an Admin, I want to gain understanding of my instance’s uploads usage.

I wonder if there are any after actions when the job / use case is satisfied. For example, if the admin notices a problem or a trend, will the admin/site_settings/category/files is the place they can optimise their file uploading strategy?

Also, I agree that the Uploads section is heavy as a top level tab for this use case.

Any news about this topic?

Im looking for a option to look through the uploaded files.

3 Likes

I guess, sorting and listing files isn’t as much as important for Discourse as for us. :see_no_evil:

Similar issue:

Here are some Data Explorer queries for getting details about post uploads. Possibly something similar could be added to the admin reports section for sites that don’t have the Data Explorer plugin installed. Let me know if this data is not what you are looking for.

Which users have the most/largest uploads?

Returns the user, their upload count, and their total uploads in kb rounded to 2 decimals. The join on the users table is to prevent data from deleted users from being returned. Results are ordered by total upload size in descending order.

Including optimized images

WITH uploads_with_optimized AS (
SELECT
ul.user_id,
ROUND((SUM(COALESCE(oi.filesize, 0)) + SUM(ul.filesize)) / 1000.0, 2) AS total_kb
FROM post_uploads pul
JOIN uploads ul
ON ul.id = pul.upload_id
LEFT JOIN optimized_images oi
ON ul.id = oi.upload_id
GROUP BY ul.user_id
)

SELECT
uwo.user_id,
COUNT(uploads.user_id) AS upload_count,
total_kb
FROM uploads_with_optimized uwo
JOIN uploads
ON uploads.user_id = uwo.user_id
GROUP BY uploads.user_id, uwo.user_id, total_kb
ORDER BY total_kb DESC
LIMIT 50

Excluding optimized images

SELECT
ul.user_id,
COUNT(ul.user_id) AS upload_count,
ROUND(SUM(ul.filesize) / 1000.0, 2) AS total_kb
FROM post_uploads pul
JOIN uploads ul
ON ul.id = pul.upload_id
GROUP BY ul.user_id
ORDER BY total_kb DESC
LIMIT 50

What are the 100 biggest uploads on my forum?

Returns the user, the post with the upload, and the upload’s file size in kb. Results are ordered by upload size in descending order.

Including optimized images

SELECT
ul.user_id,
pul.post_id,
ROUND((SUM(oi.filesize) + ul.filesize) / 1000.0, 2) AS total_kb
FROM post_uploads pul
JOIN uploads ul
ON ul.id = pul.upload_id
JOIN optimized_images oi
ON ul.id = oi.upload_id
GROUP BY oi.upload_id, ul.user_id, pul.post_id, ul.filesize
ORDER BY total_kb DESC
LIMIT 100

Excluding optimized images

SELECT
ul.user_id,
pul.post_id,
ROUND(ul.filesize / 1000.0, 2) AS total_kb
FROM post_uploads pul
JOIN uploads ul
ON ul.id = pul.upload_id
ORDER BY total_kb DESC
LIMIT 100

Uploads for past 30 days

The query requires you to supply an :end_date parameter. The date should be in the form ‘yyyy-mm-dd’. For example ‘2020-01-08’. It returns results for the 30 period that ends with the end_date. It returns the day, upload count, and the day’s total uploads in kb for any days in the period that have post uploads. Results are ordered by day.

Including optimized images

--[params]
-- date :end_date

SELECT
ul.created_at::date AS day,
COUNT(1) AS upload_count,
ROUND((SUM(COALESCE(oi.filesize, 0)) + SUM(ul.filesize)) / 1000.0, 2) AS total_kb
FROM post_uploads pul
JOIN uploads ul
ON ul.id = pul.upload_id
LEFT JOIN optimized_images oi
ON ul.id = oi.upload_id
WHERE ul.created_at::date BETWEEN :end_date::date - INTERVAL '30 days' AND :end_date
GROUP BY ul.created_at::date
ORDER BY ul.created_at::date DESC

Excluding optimized images

--[params]
-- date :end_date

SELECT
ul.created_at::date AS day,
COUNT(1) AS upload_count,
ROUND(SUM(ul.filesize) / 1000.0, 2) AS daily_upload_kb
FROM post_uploads pul
JOIN uploads ul
ON ul.id = pul.upload_id
WHERE ul.created_at::date BETWEEN :end_date::date - INTERVAL '30 days' AND :end_date
GROUP BY ul.created_at::date
ORDER BY ul.created_at::date DESC

11 Likes

I’m a new (meta) community member, and have recently taken over the admin duties for another community using Discourse (and hosted by Discourse). On the dashboard, I just saw a “jump” in the space that the Uploads are using (from about 0.7GB to 1.2GB). It seems the Data Explorer is only available for Business plans or higher
 Would there be any other way of finding out what recent files are using the additional 0.5GB?

2 Likes

This is entirely my fault.

We had a big bug in the past where we were not properly counting size of all uploads.

When a user uploads an image we often resize it up to 3-4 times for various resolutions and optimization. These optimized images are stored in the cloud and still cost storage space.

To see the actual images you would:

SELECT * FROM optimized_images

@simon maybe update the above to account for optimized_images?

5 Likes

Ah, that makes a lot of sense. I did briefly wonder if that might be the case–after I wrote the post here :wink:

Hmmm, maybe it’s worth adding a brief sentence to that effect on the storage gauge on the dashboard, so people don’t think someone is siphoning off their storage space with unknown data :wink:

And if you ever create an “Uploads inspector” that doesn’t require the Business plan plugins, that’d be super appreciated!!

Cheers!

4 Likes

finally I didn’t understand how can I see list of member’s upload in different subjects and post!!??

1 Like

I would like to browse the uploaded files directory. My particularly use case is to investigate the details of this linkrotted URL:

Any interpreted admin support to do so would be a great help. Or have I missed something? With best wishes, R

1 Like

Another workaround is to:

  • create and download a site backup with “include uploads”
  • navigate to that archive and unzip it, say from the command line: $ tar -xvzf xxxx.tar.gz
  • swap to the uploads directory: $ cd uploads
  • select between original and optimized sections and explore the resulting file tree
  • none of the upload names are present, all names deploy random (or encoded) strings

A GUI interface on the admin portal would provide a nicer UX. So my vote is for that functionality. R

1 Like