Configure an S3 compatible object storage provider for uploads

OK. I decided to post the step-by-step guide I created when I was performing this configuration.

How To Configure Discourse Forum S3 Backup And S3 CDN (Itechguides.com)

The steps in this guide are for DigitalOcean Spaces and StackPath CDN.

@pfaffman I was wondering whether you could help me with this additional backup problem:

After completing the S3 backup and cloning configuration, my automatic backup is not running. The screenshot below shows my configuration.

I can run backups manually. The problem is the scheduled automatic backups: they do not run.

Don’t know. You can look at the sidekiq jobs and make sure they are running. It should work.

4 posts were split to a new topic: Tips on Google Cloud S3

Hi,

Thanks for correcting my post. Unfortunately, I cannot edit my post to remove the misleading information.

Hi,

I’m a bit stuck and confused and hope somebody can help me.
I first had a Bitnami install and, realizing how much trouble it would give me along the way, I reinstalled using the standard install.
I was able to restore my backup and everything was fine, even though I went from 2.8 to the 2.9 beta.

I tested my backup again on my Google bucket and it still worked like a charm.

Note that all the S3 config was done through the web interface and not via ENV variables.

For GDPR reasons, I created a new backup bucket in Europe (let’s call it discourse-backup-eu). Now that I was able to change the ENV variables, I set DISCOURSE_S3_ENDPOINT: https://storage.googleapis.com, rebuilt the app, changed the backup bucket name in the web interface, reran the backup, and was very pleased to see the backup files appear in my new backup bucket in Europe.

Now I wanted the uploads to go to another bucket and avoid filling up my vm disk space.

So I configured a new bucket (let’s call it discourse-uploads), made it public, and added the Storage Legacy Bucket Owner role to my service account on that new bucket.
Then I added a rule to my existing load balancer (let’s call it https://www.example.com) to use a backend bucket with Cloud CDN enabled, as instructed here. The rule points /discourse-uploads/* to the bucket discourse-uploads.

I tested my CDN with a test.jpg in the root of the bucket, but I couldn’t reach it via https://www.example.com/discourse-uploads/test.jpg. I had to create a subfolder called discourse-uploads inside the bucket and move test.jpg into it; now I can see my test picture via https://www.example.com/discourse-uploads/test.jpg. (The load balancer forwards the full request path, including the /discourse-uploads/ prefix, as the object key, so the object has to live under that prefix.)

In the web UI, I changed the dummy bucket name under “s3 upload bucket” (which I had been forced to set earlier while setting up the backup) to discourse-uploads, filled in the CDN URL with https://www.example.com/discourse-uploads, and ticked “enable s3 uploads”.

From there on, if I tried to upload an image, I would get a popup in the browser saying Invalid Argument (coming from a 422 error with a JSON body saying basically the same).

I tried to rebake all posts, but it had no effect; I still got the error.

So I figured, I should try using the env variables instead of the Web UI.

and use the following config:

DISCOURSE_USE_S3: true
DISCOURSE_S3_REGION: whatever
DISCOURSE_S3_INSTALL_CORS_RULE: false
FORCE_S3_UPLOADS: 1
DISCOURSE_S3_ENDPOINT: https://storage.googleapis.com
DISCOURSE_S3_ACCESS_KEY_ID: MY_KEY_ID
DISCOURSE_S3_SECRET_ACCESS_KEY: MY_ACCESS_KEY
DISCOURSE_S3_CDN_URL: https://www.example.com/discourse-uploads
DISCOURSE_S3_BUCKET: discourse-uploads/discourse-uploads
DISCOURSE_S3_BACKUP_BUCKET: discourse-backup-eu
DISCOURSE_BACKUP_LOCATION: s3

I rebuilt the app.
Then I could not open Discourse anymore, because none of the assets had been uploaded to the bucket and they all return a 404, e.g.:
https://www.example.com/discourse-uploads/assets/admin-31467dc73634cbfb81799737c43df0e2939307d893ef32713f1d0770bcb3532c.br.js

I thought that trying to upload directly to a subfolder in the bucket was a bit of a stretch, even though the OP suggests it works (at least for the backup bucket).

So I changed the env variable to
DISCOURSE_S3_BUCKET: discourse-uploads
(thinking that later I could play with the host rule instead, to avoid having to upload to a subfolder)

and rebuilt to see if anything would get uploaded, but nothing is uploaded to the bucket and Discourse still fails to open because of the 404s.

So my questions are:

  • Do the Web UI and ENV variables collide?
  • When are the assets supposed to be uploaded to the bucket?
  • How can I debug this? I don’t see any errors in the logs.
  • Is it possible to set a subfolder of a bucket in the config?
  • Once this works, are the previously uploaded images transferred to the bucket? If I rebake, what will the URLs of the previously uploaded images look like?

Thank you !

Did you include this bit?

I submitted a PR with a template to do that a while back but I don’t think it ever got any attention.

Also, changing buckets is hard. You need not only to copy all of the assets from the old one to the new one, but also update the database to use the new bucket. There is a topic about it, I believe.
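The copy-then-remap procedure can be sketched as below. The bucket names are placeholders, and both the aws CLI invocation and the discourse remap console tool are assumptions to verify against the dedicated topic before running anything:

```shell
# Sketch of moving uploads between buckets; names are hypothetical.
OLD_BUCKET="discourse-old"
NEW_BUCKET="discourse-uploads"

# 1. Copy every object across (requires the aws CLI with credentials):
#    aws s3 sync "s3://$OLD_BUCKET" "s3://$NEW_BUCKET"

# 2. Rewrite the stored URLs in the database, inside the app container:
#    cd /var/discourse && ./launcher enter app
#    discourse remap "$OLD_BUCKET" "$NEW_BUCKET"

echo "remap $OLD_BUCKET -> $NEW_BUCKET"
```

The remap step matters because upload URLs are baked into cooked posts; copying the objects alone leaves posts pointing at the old bucket.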

If you use the ENV variables (which you should) those settings are no longer visible in the Web UI.

A post was merged into an existing topic: Tips on Google Cloud S3

Yes. If my memory serves me, there is discussion above about google not allowing something (list access, maybe?), but there was a workaround about using some “legacy” something. That’s what I remember. You’ll have to scroll through the above 100 messages to find it. If it works, it would be great if you could update the OP to say how you made it work so the next person who needs to know will be able to find it more easily.

Thanks again for your answer !
The warning about Google buckets was about using them for backups, because Discourse couldn’t list the files.
I already posted how to fix this.

Are you suggesting I update the OP with that information ? I don’t believe I can.

Again, the backup works, but the upload of the assets doesn’t. According to the OP, this was supposed to work even without the Storage Legacy Bucket Owner rights.

I think there might be a regression here, what do you think @Falco ?

There may be a regression. Are you sure you added the custom

that only Google needs?

Oh. Well, I thought that somebody had. :person_shrugging:

That was what I was suggesting. It’s a wiki, so I’m pretty sure you can, though I’m not 100% sure what trust levels are involved.

Thanks for your answer, yes I did include it:

Note that I tried with and without the subfolder
DISCOURSE_S3_BUCKET: discourse-uploads/discourse-uploads
and
DISCOURSE_S3_BUCKET: discourse-uploads

Thanks again

@tuanpembual initially did but referred to Storage Legacy Object Owner instead of Storage Legacy Bucket Owner

I’m only a “basic user”; that must be the reason I can’t edit it.

I will try to summarize the answers to my questions:

  • Do the Web UI and ENV variables collide?
    Yes. If you use the ENV variables (which you should), those settings are no longer visible in the Web UI.
  • When are the assets supposed to be uploaded to the bucket?
    By adding this snippet to the app.yml in the hooks section, they are uploaded after_assets_precompile (during the app rebuild).
  • How can I debug this? I don’t see any errors in the logs.
    By running:
cd /var/discourse
sudo ./launcher enter app
sudo -E -u discourse bundle exec rake s3:upload_assets --trace
  • Is it possible to set a subfolder of a bucket in the config?
    Yes, via prefixes. From the Amazon S3 documentation:

You can use prefixes to organize the data that you store in Amazon S3 buckets. A prefix is a string of characters at the beginning of the object key name. A prefix can be any length, subject to the maximum length of the object key name (1,024 bytes). You can think of prefixes as a way to organize your data in a similar way to directories. However, prefixes are not directories.
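As a small illustration of the “prefixes, not directories” point, the key used earlier in this thread is a single flat string; the leading “discourse-uploads/” is just its prefix:

```shell
# An object key like "discourse-uploads/test.jpg" is one flat key;
# "discourse-uploads/" is merely its prefix, not a real directory.
KEY="discourse-uploads/test.jpg"
PREFIX="${KEY%%/*}/"   # everything up to and including the first slash
echo "$PREFIX"         # prints: discourse-uploads/

# Listing objects under that prefix (aws CLI assumed):
# aws s3api list-objects-v2 --bucket discourse-uploads --prefix "$PREFIX"
```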

  • Once this works, are the previously uploaded images transferred to the bucket? If I rebake, what will the URLs of the previously uploaded images look like?
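
For reference, the app.yml hook mentioned above is not quoted in this thread; in the standard Discourse S3 guide it typically looks like the fragment below. Verify against the current guide before use:

```yaml
hooks:
  after_assets_precompile:
    - exec:
        cd: $home
        cmd:
          - sudo -E -u discourse bundle exec rake s3:upload_assets
```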

Hi, I’ve been looking at object storage providers, and I saw in the OP that for some of them you need “to skip CORS and configure it manually.” I’m not familiar with CORS or with configuring it, so should I steer clear of the ones needing this setting, or is it simple to set up?

If you need to ask (as I would) then I would go with another one.
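
That said, “configuring CORS manually” usually just means applying a small JSON policy with the provider’s CLI. A sketch assuming an S3-compatible endpoint, the aws CLI, and a hypothetical bucket name:

```shell
# Minimal CORS policy allowing the cross-origin GET/HEAD requests a browser
# makes for assets; tighten AllowedOrigins to your forum's origin in practice.
cat > cors.json <<'EOF'
{
  "CORSRules": [
    {
      "AllowedOrigins": ["*"],
      "AllowedMethods": ["GET", "HEAD"],
      "AllowedHeaders": ["*"],
      "MaxAgeSeconds": 3000
    }
  ]
}
EOF

# Apply it to the bucket (aws CLI assumed; endpoint and bucket are placeholders):
# aws s3api put-bucket-cors --bucket discourse-uploads \
#   --endpoint-url https://storage.googleapis.com \
#   --cors-configuration file://cors.json
```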

Just to confirm: once I’ve done the

rake uploads:migrate_to_s3
rake posts:rebake

steps, I can remove the local uploads folder in its entirety, yes?
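
A cautious alternative to outright deletion is to park the folder until the site has been verified; the path below assumes a standard /var/discourse install:

```shell
# After migrate_to_s3 + rebake have been verified, move the local uploads
# aside instead of deleting them, so they can be restored if anything 404s.
UPLOADS=/var/discourse/shared/standalone/uploads
# mv "$UPLOADS" "$UPLOADS.bak"   # run only once S3 serving is confirmed
echo "$UPLOADS.bak"
```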

Hey @mcwumbly. This was very easy to find when I could search for “S3 clone”, but I was unable to find it just now. Was there something wrong with that title? Is there a search that will find it? Could we add a (I can’t remember what it’s called) thing so it can auto-link on some words, like standard install does (but I can’t think of which words to use).

As someone who links that topic multiple times a week, I kinda agree :stuck_out_tongue:

Maybe adding “s3 clones” to the OP body helps the search-fu?
