Configure an S3 compatible object storage provider for uploads

Thanks for the thank you! I did flag for mod attention because I didn't have edit rights at that point, but I've got them now thanks to the post I made. Ironic how that works, huh?

But before I add a global blurb outside the MinIO section, can we confirm that Discourse as a whole no longer supports non-domain-based (path-style) bucket URLs, which is what the post that started my edit hunt was about?

If we know that Discourse as a whole does not support path-style addressing (i.e. minio.server.com/BUCKET/foo/bar/... paths) and only supports domain-style addressing (i.e. BUCKET.minio.server.com/foo/bar/...), then we can make that a global notice in the wiki, and I'd be happy to do so. However, I need to hear from someone far higher up the chain than me (a simple community person) that this IS in fact the requirement for Discourse. If it is, I'll edit it in; otherwise I'll just leave it as a blurb in the MinIO requisites.

2 Likes

MinIO is the only popular S3 clone with a history of using the now-deprecated S3 path style, so I do not think it warrants a global warning; a note in the MinIO section is enough.

4 Likes

Thanks, Falco. I've left it in the MinIO requisites, but put strong emphasis on that caveat section as well, because of the linked thread above that explains why I'm poking at this again.
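
To make the caveat concrete, here is a rough sketch of the kind of app.yml env settings the MinIO section is talking about, using domain-style (virtual-hosted) addressing. All hostnames, bucket names and keys below are made up, and the MinIO server itself also needs MINIO_DOMAIN set so that BUCKET.minio.example.com requests resolve:

  DISCOURSE_USE_S3: true
  DISCOURSE_S3_REGION: anything                        # placeholder; MinIO doesn't use AWS regions
  DISCOURSE_S3_ENDPOINT: https://minio.example.com     # hypothetical MinIO host
  DISCOURSE_S3_ACCESS_KEY_ID: myaccesskey              # hypothetical credentials
  DISCOURSE_S3_SECRET_ACCESS_KEY: mysecretkey
  DISCOURSE_S3_BUCKET: discourse-data                  # served as discourse-data.minio.example.com
  DISCOURSE_S3_BACKUP_BUCKET: discourse-backups
  DISCOURSE_BACKUP_LOCATION: s3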

2 Likes

I seem to have an issue:

I entered:

  after_assets_precompile:
    - exec:
        cd: $home
        cmd:
          - sudo -E -u discourse bundle exec rake s3:upload_assets

Rebuilt:

FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && sudo -E -u discourse bundle exec rake s3:upload_assets failed with return #<Process::Status: pid 2064 exit 1>
Location of failure: /usr/local/lib/ruby/gems/2.7.0/gems/pups-1.1.1/lib/pups/exec_command.rb:117:in `spawn'
exec failed with the params {"cd"=>"$home", "cmd"=>["sudo -E -u discourse bundle exec rake s3:upload_assets"]}
bootstrap failed with exit code 1
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
./discourse-doctor may help diagnose the problem.
1 Like

Can you follow the guidance in the error output and scroll up for the earlier error messages? There may be more than one, and ./discourse-doctor may help diagnose the problem.

3 Likes

Added a step to clean up old assets on S3 to the OP. Should work everywhere but GCP.
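
The added step is essentially one more rake task appended to the same after_assets_precompile hook shown earlier; roughly like this (task name from memory, so check the OP for the exact command):

  after_assets_precompile:
    - exec:
        cd: $home
        cmd:
          - sudo -E -u discourse bundle exec rake s3:upload_assets
          - sudo -E -u discourse bundle exec rake s3:expire_missing_assets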

3 Likes

Most of the 170K files uploaded on rake s3:migrate_to_s3, but I think 12 had this:

: Unsupported value for canned acl 'private' (Aws::S3::Errors::InvalidArgument)

Maybe those were in PMs? Is there something I can do to fix those?

2 Likes

Hey @Falco. Does this make sense? I’m going through the replies to see that they’ve been dealt with so that we can turn on the delete-after-30-days thing on this topic.

I checked a few of the uploads marked private and they were in regular topics, so I couldn’t figure out why they were marked secure. (Secure uploads wasn’t set?)

See above

2 Likes

Secure Uploads are AWS S3 only, so they indeed won't work with any of those clones.

2 Likes

Makes sense. I updated the top of the OP with that info. Any idea why local uploads would have been marked as secure on a site that didn’t have S3 or secure uploads enabled?

1 Like

Someone enabled that for some time and then rolled it back when they saw that it didn't work?

2 Likes

I think that this problem with uploading to Cloudflare R2 may be resolved with Upload.where(secure: true).update_all(secure: false). I’ll try to get to that before this message gets deleted (but I added a note to the OP).

2 Likes

Hmm, we don't have any secure uploads. I think I'm going to give Cloudflare R2 a go, as their free limits are pretty generous and they have a (beta) S3 migrator. I guess I'll find out, but did you see if R2 was ok in the end @pfaffman?

1 Like

I can no longer remember whether the issue appeared when I tried to migrate existing images or when I just uploaded a new one. Thinking about it again, I believe I tested it on a brand new site.

Migrating to a different S3 platform is rather tricky. There are some topics about that, but if you want to use Cloudflare R2 I would first try it out on a test site; there is a very good chance it won’t work.

1 Like

It sort of works, but it's not ready for production usage.

I had an older Discourse 2.7 local dev install hanging around and it worked fine when set up for Cloudflare R2: image uploads, use of the CDN, and backups to a private bucket all worked. I updated to the latest 2.9 (like our forum uses) and it seems to fail on the Jobs::UpdateGravatar catch-up processing, where it uses the bucket notation incorrectly for Cloudflare when it tries to cache a remote gravatar image to R2. Example (where my bucket name on R2 is 'uploads'):

Upload Update (0.3ms) UPDATE "uploads" SET "url" = '//uploads.123123redact.r2.cloudflarestorage.com/original/1X/123123example.jpeg', "updated_at" = '2022-12-12 20:44:02.929494', "etag" = '9c02b086b2aa5e2088ed44e1017fa63e' WHERE "uploads"."id" = 3

So I start up the UI and the avatars in my local test/dev set-up point to

//uploads.123123redact.r2.cloudflarestorage.com/original/1X/123123example.jpeg

So my best guess is that S3 is fine with the bucket dot notation and Cloudflare R2 isn't, or maybe the gravatar cache needs to use the S3 CDN value instead? It's a shame, as it seems pretty close…

2 Likes

I got a reply from Cloudflare that for R2, until they implement the object ACL correctly, they've changed it to return a 200 if they get values in that property that they don't support yet. Apparently this behavior changed on R2 around the end of November. Obviously not ideal (it's in a public bucket, with the API asking for it to be private), but it means that for this issue it now 'works'.

3 Likes

Oh! That does sound promising. I think maybe you’d need a separate bucket for uploads, though it would be quite a guessing game to get a backup’s filename.

I’ll try to have a look at this Real Soon Now.

2 Likes

I do use separate upload and backup buckets, and it did seem to work ok, in that the backup R2 bucket is not set to public and Discourse wrote a compressed backup into it fine via the admin UI. I downloaded it and had a look and it seemed sane (not the same as an actual restore, but good enough for a Monday). I've tested my uploads bucket with a custom domain setting for the S3_CDN and that seems to work well.

To be honest I can’t really tell what’s up with my 2.9 not rendering a UI in the ember-cli (it blanks on me after creating the admin user), but it’s probably something you’ll see wrong quickly.

EDIT: Oh, it seems like it's trying to load the plugin javascript assets from the pseudo-S3/R2, e.g. lots of 404s on things like assets/locales/en.br.js

I don't have any luck with bundle exec rake s3:upload_assets either, so that's my best current clue.

EDIT2: Yes, it’s assets related, in that if I clear my DISCOURSE_S3_CDN_URL then the UI loads. I’ll try just manually uploading the assets in place for now.

EDIT3: The issue just seems to be that the app needs the assets to be precompiled and available wherever DISCOURSE_S3_CDN_URL is pointing, and that's not usual for a local dev environment. I'll be interested in what @pfaffman finds, as I think this would work in a proper Docker-deployed self-hosted instance using the after_assets_precompile hook for R2.

2 Likes

Yeah. A CDN for a local dev environment? I can’t imagine that it could work. It does sound like it should work for a production deployment.

1 Like

Yep, in retrospect that's not surprising in a dev set-up. I think what I wasn't expecting was that with DISCOURSE_S3_CDN_URL set to the Cloudflare CDN for R2 but DISCOURSE_CDN_URL set to blank (or a local or ngrok URL), it still wanted to use the Cloudflare pull URL for the precompiled assets etc. and not just the uploads/images alone. I think it will work ok in production in a proper container, so I might do a quick try. My plan is something like: temporarily make media uploads TL4, switch the IDs etc. to Cloudflare in the app.yaml, test it out and, if good, leave it on R2, then rclone all the existing S3 objects across; after that I'll rebake the posts and hopefully everything will be fine :crossed_fingers:.
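
For reference, the app.yaml change I have in mind is roughly this; the account ID, keys, bucket names and CDN hostname are all placeholders (the endpoint format <account-id>.r2.cloudflarestorage.com matches the URLs earlier in the thread):

  DISCOURSE_USE_S3: true
  DISCOURSE_S3_REGION: auto                                              # R2 generally expects "auto" here
  DISCOURSE_S3_ENDPOINT: https://123123redact.r2.cloudflarestorage.com   # hypothetical account endpoint
  DISCOURSE_S3_ACCESS_KEY_ID: myr2accesskey                              # hypothetical R2 API token credentials
  DISCOURSE_S3_SECRET_ACCESS_KEY: myr2secretkey
  DISCOURSE_S3_BUCKET: uploads                                           # public uploads bucket
  DISCOURSE_S3_BACKUP_BUCKET: backups                                    # private backups bucket
  DISCOURSE_BACKUP_LOCATION: s3
  DISCOURSE_S3_CDN_URL: https://files.example.com                        # custom domain in front of the uploads bucket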

2 Likes