Hi everyone
This feature was requested by a lot of community members in this post.
I’m planning to add support for DigitalOcean Spaces because it’s fully S3-compatible, but if the community prefers another cloud option, suggestions are welcome!
Here’s a link to the DigitalOcean API.
I’m excited to work on this feature, but I’m new to Ruby and the code base, so I would need a bit of guidance:
It would be super helpful if someone could share some documentation or resources about modifying site settings and integrations, as that would help me understand how Discourse integrations work.
What’s an ideal development environment for testing the existing AWS features? A VPS with a production Discourse instance? I ask because I might have to deal with nginx, etc.
To see if I’ve understood the requirements properly before I start working:
The goal here is to extend the S3 configuration and reuse the existing S3 code to support another compatible backup service. The official aws-sdk-s3 gem supports custom HTTP endpoints as shown here, so, for example, I would select “custom” in the s3 location and then add a custom_s3_url.
Is that correct?
@sam I did the research and yes, this is well supported by aws-sdk-ruby, even though the documentation is lacking. Below is a demonstration of aws-sdk-ruby being used to upload an archive to DigitalOcean Spaces. It works really well; I hope it gives everyone an idea of how compatible these services are.
Paste the Space name (bucket), key pair (access_key_id, secret_access_key), and the file name and path into the variables below. Run the code and the file should be visible in your Space.
require 'aws-sdk-s3'
name = 'test.tar.gz'
path = '/home/workspace/test.tar.gz'
# Before uploading, ensure that you've created a Space on DigitalOcean
bucket = 'discoursetest1'
# Configure an S3 Resource for use with Spaces
# Note: Generate Spaces Access Keys from cloud.digitalocean.com/settings/api/
s3 = Aws::S3::Resource.new(
  access_key_id: '',
  secret_access_key: '',
  endpoint: 'https://nyc3.digitaloceanspaces.com',
  region: 'nyc3'
)
# Add a file to a Space
obj = s3.bucket(bucket).object(name)
obj.upload_file(path)
@sam I’ve made this work by adding one optional field named spaces_endpoint to site settings.
This enables Spaces support for all existing S3 features like upload, delete, etc.
Does this look fine to you?
Can you put your code changes up as a PR on GitHub?
Also, that setting is being inserted as Aws::S3::Resource.new(... endpoint ...), right?
The setting should be named s3_endpoint, with a default of https://s3.amazonaws.com – as it’s not specific to DO Spaces.
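For reference, here’s a minimal sketch of what that wiring could look like (setting names follow the existing s3_* conventions; this is illustrative, not the actual Discourse code):
require 'aws-sdk-s3'
# Sketch only: with the default of https://s3.amazonaws.com this behaves
# exactly like plain AWS S3; any other value redirects the SDK.
s3 = Aws::S3::Resource.new(
  access_key_id: SiteSetting.s3_access_key_id,
  secret_access_key: SiteSetting.s3_secret_access_key,
  region: SiteSetting.s3_region,
  endpoint: SiteSetting.s3_endpoint
)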
There should also be some changes in the nginx template. It’s better to serve files via nginx, not directly from Spaces. It’s also strongly recommended to add nginx caching, to keep the number of read requests to the Spaces API to a minimum. And for local development (and for some production cases) you can use https://www.minio.io/ – it’s S3-API-compatible storage software.
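For anyone trying Minio locally, the earlier Spaces snippet only needs the endpoint and credentials swapped. A hedged sketch (the credentials are placeholders, and force_path_style is required because Minio addresses buckets by path rather than by subdomain):
require 'aws-sdk-s3'
# Point the same aws-sdk-s3 code at a local Minio server.
s3 = Aws::S3::Resource.new(
  access_key_id: 'YOUR_MINIO_KEY',        # placeholder
  secret_access_key: 'YOUR_MINIO_SECRET', # placeholder
  endpoint: 'http://127.0.0.1:9000',      # Minio's default port
  region: 'us-east-1',
  force_path_style: true # buckets as paths, not subdomains
)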
Yes, that’s much better. I’ve changed the name to s3_endpoint now.
I need to check the endpoint for DO to set DO-specific parameters, and that’s what I’ve done in s3_endpoint(). I wasn’t sure where to put the code for configuring different platforms. Thanks for the help!
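Roughly, the check has this shape (illustrative, not the exact PR code):
# Infer DO-specific parameters from the configured endpoint.
def s3_endpoint(opts)
  if opts[:endpoint].to_s.include?("digitaloceanspaces.com")
    # e.g. https://nyc3.digitaloceanspaces.com -> region "nyc3"
    opts[:region] = opts[:endpoint][%r{https://([^.]+)\.}, 1]
  end
  opts
end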
Shouldn’t everyone using S3-like services use a CDN? We even have a separate setting, DISCOURSE_S3_CDN, so people can have a CDN just for the S3-like storage.
Maybe after we merge the PR we can work out an optional template with nginx caching, and another one using CDNs (Cloudflare should fit here, since it will cache only static assets).
Not everyone uses a CDN. There are many small installations that just don’t need an additional layer.
Also, it’s easy to add caching to nginx for any kind of S3-compatible storage.
And in the case of Minio, small installations can use it to serve images directly.
Can we just ignore the region if opts[:endpoint] is different from the default? The objective is to allow any S3-compatible service to be used without depending on Discourse to add code for it.
I agree, that would be much cleaner.
From the Spaces API docs, I read that region and endpoint were both required parameters.
But I just tested it again without the region parameter and it still works. So yes, we can ignore the region when the given endpoint differs from the default.
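In code, the idea is something like this (method name illustrative, not the final PR):
# Drop the region and use the custom endpoint whenever the configured
# endpoint differs from the AWS default.
def apply_endpoint(opts)
  return opts if SiteSetting.s3_endpoint == "https://s3.amazonaws.com"
  opts.delete(:region)
  opts.merge(endpoint: SiteSetting.s3_endpoint)
end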
I’ve updated the PR. Thanks!
I’d like to remind everyone that DigitalOcean Spaces works with just an endpoint, such as https://sgp1.digitaloceanspaces.com or https://nyc3.digitaloceanspaces.com.
There is no need for a separate region field, at least for DigitalOcean Spaces support.
I will check if other services can also work in this manner, with a single endpoint.
None of the three platforms will require an additional region field, and the current PR will add support for DigitalOcean Spaces and Minio.
Minio: endpoint format: server_ip:9000 or a user domain.
It requires a region, but it works perfectly with the pre-existing s3_region options (drop-down menu) in site settings, so it does not need an extra region field.
Google Cloud Platform: endpoint format: https://storage.googleapis.com
It does not require a region parameter. I haven’t got it working yet because of an incorrect-header issue that I’m investigating, but it will work without an extra region field once I debug that.
@Falco Before we ship it, the last problem is that Minio requires { force_path_style: true } in s3_options, while AWS requires the setting to be false. This is because AWS and Minio use different addressing styles, so forcing one breaks the other.
I know we wanted to avoid this but I don’t see how we can make both work without adding something specific like this:
if opts[:endpoint].include?("minio")
  opts[:force_path_style] = true
end
But even this won’t work, because many Minio users might point their endpoint at a server IP address instead of a hostname containing “minio”. We have to think of a way to detect a Minio endpoint, and we might need Minio-specific code in Discourse if we want to support it.
We could easily support a lot of cloud options if we had a free-text field for entering key:value options; that flexibility would avoid issues like this. The region and force_path_style problems would both be solved. @riking How do you feel about a field like this?
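A sketch of what I have in mind (the setting name and the pipe-separated key:value format are hypothetical):
# Parse a free-text field like "force_path_style:true|region:nyc3"
# into extra options merged over s3_options.
def extra_s3_options
  SiteSetting.s3_extra_options.to_s.split("|").each_with_object({}) do |pair, opts|
    key, value = pair.split(":", 2).map(&:strip)
    next if key.nil? || key.empty? || value.nil?
    value = true  if value == "true"   # coerce booleans so flags work
    value = false if value == "false"
    opts[key.to_sym] = value
  end
end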
Sure, I’ve added that, and the PR now works great with S3, DigitalOcean Spaces, and Minio.
@Falco, one question: when enable_s3_backups is disabled and enable_s3_uploads is enabled, do we expect a remote backup on clicking ‘Backup’?
On my installation, a remote backup only occurs when enable_s3_backups is enabled.