S3 uploads fail on large files


(dobon) #1

Hi all,

I have configured my Discourse installation to upload files to S3, but I am seeing an unusual delay when uploading large files. To preempt the question: yes, I have configured nginx and Discourse to accept arbitrarily large files (in fact, I started a thread that sussed out a bug preventing uploads larger than 10MB due to a hard-coded limit on the frontend).

When I upload a large file (50MB-2GB), the uploader shows progress that crawls to 100% at a speed that seems right for my connection and file size, but then the bar gets stuck at 100% for a very long while, and may never complete. Some files (~25MB) seem to ‘upload’ for x minutes, then stall for around x + ~3 minutes, then ‘finish’ uploading and are available from S3. Others (~100MB) will seem to upload for x minutes until the bar says 100%, then stall indefinitely (or at least longer than overnight).

I am happy to investigate this issue, and will even contribute a PR to patch it, but I would like to check in first to see if this is a known feature/limitation, and if anyone is already working on this bug.

(Jeff Atwood) #2

What kind of files? What is the file extension? Are they images?

(dobon) #3

The upload fails on large binary files. I haven’t tried with very large image files, but regular (1-3MB) images upload correctly. I have seen this occur with .rar and .zip archives and with various video formats (.mkv, .avi, etc.).

(Régis Hanol) #4

The progress you see on the composer is only reporting the progress of the upload from your computer to the server. Then, the server has to validate the file and upload it to S3. This step depends on how large the file is and how fast your server can upload to S3. It’s not usually a problem since servers have much faster upload links but it might be worth checking the speed on your server by trying to upload some files manually to S3.
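To put a number on that second leg, here is a minimal Python sketch for timing a manual upload from the server. The helper names `measure_throughput` and `estimated_stall_seconds` are made up for illustration, and the estimate ignores the time spent validating the file.

```python
import time
from typing import Callable


def measure_throughput(upload: Callable[[], None], size_bytes: int) -> float:
    """Run `upload` once and return the observed throughput in MB/s.

    `upload` is any zero-argument callable that performs the transfer
    (hypothetical helper, not part of Discourse).
    """
    start = time.perf_counter()
    upload()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading when the call returns almost instantly.
    return size_bytes / (1024 * 1024) / max(elapsed, 1e-9)


def estimated_stall_seconds(size_bytes: int, mb_per_s: float) -> float:
    """Rough time the bar would sit at 100% while the server pushes to S3."""
    return size_bytes / (1024 * 1024) / mb_per_s
```

If boto3 is installed on the server, the callable could be something like `lambda: s3.upload_file(path, bucket, key)` with `s3 = boto3.client("s3")`; the path, bucket and key are whatever test values you choose. At a measured 2 MB/s, for example, a 100MB file would account for roughly 50 seconds of stall on its own.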

(dobon) #5

You’re right, the delay is from the server-to-S3 leg of the upload; I spent some time on the weekend reading through the relevant code. Once I set up a dev environment (probably this Wednesday evening), I’ll have a go at implementing ember-uploader as an alternative to the current jquery.fileupload.js, as it can communicate with S3 directly and will circumvent the unwanted server-to-server latency. Before I get too far into this plan, should I clear it with any of the maintainers?

The plan consists of adding a signing service to the rails app, and swapping out the file upload plugin in the ember app.
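As a rough illustration of what such a signing service would compute, here is a Python sketch (the real service would live in the Rails app) that builds the form fields for a browser-based S3 POST upload using AWS Signature Version 4. The function name `sign_post_upload`, its parameters, and the size limit are assumptions for the example, not existing Discourse code.

```python
import base64
import datetime
import hashlib
import hmac
import json


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sign_post_upload(access_key, secret_key, region, bucket, key, now,
                     expires_in=900, max_bytes=2 * 1024 ** 3):
    """Return the form fields a browser needs to POST a file straight to S3.

    `now` must be a UTC datetime. All names here are hypothetical.
    """
    date = now.strftime("%Y%m%d")
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    credential = f"{access_key}/{date}/{region}/s3/aws4_request"
    expiration = now + datetime.timedelta(seconds=expires_in)
    policy = {
        "expiration": expiration.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "conditions": [
            {"bucket": bucket},
            {"key": key},
            ["content-length-range", 0, max_bytes],
            {"x-amz-algorithm": "AWS4-HMAC-SHA256"},
            {"x-amz-credential": credential},
            {"x-amz-date": amz_date},
        ],
    }
    policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8")).decode("ascii")
    # Derive the SigV4 signing key: date -> region -> service -> "aws4_request".
    signing_key = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    for part in (region, "s3", "aws4_request"):
        signing_key = _hmac(signing_key, part)
    # For POST uploads, the string to sign is the base64-encoded policy.
    signature = hmac.new(signing_key, policy_b64.encode("ascii"),
                         hashlib.sha256).hexdigest()
    return {
        "key": key,
        "policy": policy_b64,
        "x-amz-algorithm": "AWS4-HMAC-SHA256",
        "x-amz-credential": credential,
        "x-amz-date": amz_date,
        "x-amz-signature": signature,
    }
```

The browser would submit these fields alongside the file in a multipart POST to the bucket endpoint, so the secret key never leaves the server and the file never passes through it.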

(Régis Hanol) #6

How will it handle all the “work” we do on images? When someone uploads an image, we automatically downsize it (if needed), we fix the rotation, we optimize the image before sending it to S3.

(dobon) #7

The best solution I can think of is to filter uploads by their file extension (or MIME content-type, or something else?): if the file is recognized as an image, upload it to the server for processing prior to upload to S3 (i.e., the same as the current system); otherwise, upload it directly to S3.
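The routing rule could be sketched like this, in Python for brevity (in practice the check would sit in the Ember app). The function name `choose_upload_route` and the two route labels are hypothetical.

```python
import mimetypes
from typing import Optional


def choose_upload_route(filename: str, content_type: Optional[str] = None) -> str:
    """Return "server" for images (so they still get processed), else "s3_direct".

    Prefers the content type declared by the browser, then falls back to a
    guess from the file extension. Hypothetical helper for illustration.
    """
    guessed, _ = mimetypes.guess_type(filename)
    mime = content_type or guessed or "application/octet-stream"
    return "server" if mime.startswith("image/") else "s3_direct"
```

For example, `choose_upload_route("photo.jpg")` picks the server path and `choose_upload_route("archive.zip")` picks the direct path. Both the extension and the declared content type are supplied by the client, so this only decides the route and is not a security check.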

Does this sound like an OK plan to you? I won’t get time to work on this until next week, at least.