S3 Uploads / IAM user / backups questions


#1

On the Admin>Settings>Backups page, i have checked “Upload backups to S3 when complete.” The text also mentions

IMPORTANT: requires valid S3 credentials entered in Files settings.

Very well. In Admin>Settings>Files, there is an “enable s3 uploads” checkbox.

Question #1: Presumably, i check this if i want all attachments my users create to be sent to a bucket instead of my DigitalOcean droplet (?) Should this box be checked or unchecked if i DO want s3 backups, but DO NOT want all uploads (attachments, etc) to go to a bucket?

Question #2: Do i actually want everything to go to the bucket anyway? Is that a good plan? i bought the el cheapo $10/mo droplet. Will i run into storage issues very quickly?

Question #3: Amazon S3 constantly yells at me to not use the root access keys, but to use an IAM user instead. i have one of those, BUT… the checkbox “s3 use iam profile” says

Use AWS EC2 IAM role to retrieve keys. NOTE: enabling will override “s3 access key id” and “s3 secret access key” settings.

Okay then. i check the box. But where exactly do i tell the system the name (and presumably some sort of password/access code) of that IAM user?


(Jeff Atwood) #2

We should clarify the copy here a bit. @mpalmer, do you have any suggestions?


(Matt Palmer) #3

Use AWS EC2 IAM role to retrieve keys.

Yeah, this isn’t the most trivial thing to describe to people who don’t need it. What it’s trying to say is that if you’re running your instance in EC2, you can potentially avoid the need to hard-code a keypair by having the S3 client library pull the creds it needs to use out of the instance metadata store.
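For anyone curious what "pull the creds out of the instance metadata store" looks like in practice, here is a minimal Python sketch. It assumes the commonly documented EC2 metadata path http://169.254.169.254/latest/meta-data/iam/security-credentials/<role-name> and parses a made-up sample of the JSON found there instead of calling the network. The sample values and the name parse_instance_credentials are illustrative only, and this is not a description of how Discourse or the AWS SDK do it internally.

```python
import json
from datetime import datetime, timezone

# Hypothetical sample of the JSON served by the metadata path above.
# Every value here is a placeholder, not a real credential.
SAMPLE_RESPONSE = """
{
  "Code": "Success",
  "LastUpdated": "2018-01-01T00:00:00Z",
  "Type": "AWS-HMAC",
  "AccessKeyId": "ASIAEXAMPLEKEYID",
  "SecretAccessKey": "example-secret",
  "Token": "example-session-token",
  "Expiration": "2018-01-01T06:00:00Z"
}
"""


def parse_instance_credentials(body):
    """Extract the values an S3 client needs from the metadata JSON."""
    doc = json.loads(body)
    if doc.get("Code") != "Success":
        raise ValueError("metadata service did not return usable credentials")
    # The credentials are temporary, so keep the expiry alongside them.
    expires_at = datetime.strptime(
        doc["Expiration"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return {
        "access_key_id": doc["AccessKeyId"],
        "secret_access_key": doc["SecretAccessKey"],
        "session_token": doc["Token"],
        "expires_at": expires_at,
    }


creds = parse_instance_credentials(SAMPLE_RESPONSE)
print(creds["access_key_id"], "expires", creds["expires_at"].isoformat())
```

The point of the sketch is that nothing is typed into Discourse: the keys are short-lived, handed to the instance by AWS, and only reachable from inside that instance, which is why the option makes no sense outside EC2.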

The easiest thing to do that will reduce confusion would be to remove the string “IAM” from the option and description, because people think of IAM as something for creating users, not machine roles. It’s not ideal, because most of the AWS-related documentation calls them “IAM roles”, but the “technically correct” (which is, of course, the best kind of correct) name is “instance profile”. I think a reasonable name for the setting itself is probably s3 use ec2 instance profile, with a description of something like:

Use an AWS EC2 instance profile to grant access to the S3 bucket. NOTE: requires Discourse to be running in an EC2 instance, and overrides the s3 access key id and s3 secret access key settings.

I’m not a fan of documentation that insinuates “you are too dumb to understand what this does”, but this might be a situation in which tacking onto the end of the description “If you don’t know what this means, you don’t need it” would be worth it.

In short, @Untoldent, you should follow AWS’ advice and not use the root access/secret keys, because they automatically grant full access to everything (including things like spinning up ridiculous numbers of huge EC2 instances), in the event that some miscreant managed to get a hold of them. Instead, follow our S3 uploads guide, which covers creating a restricted-permissions IAM user, amongst everything else.
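As a rough illustration of what "restricted-permissions" means, the Python sketch below builds an IAM policy document that only allows access to a single bucket. The bucket name is hypothetical and the list of actions is an assumption for illustration; the S3 uploads guide may call for a different set, so treat the guide as the authority.

```python
import json


def restricted_bucket_policy(bucket_name):
    """Build an IAM policy limited to one S3 bucket (illustrative action list)."""
    bucket_arn = "arn:aws:s3:::" + bucket_name
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading, writing, and deleting apply to objects in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [bucket_arn + "/*"],
            },
        ],
    }


# "my-discourse-backups" is a made-up bucket name.
print(json.dumps(restricted_bucket_policy("my-discourse-backups"), indent=2))
```

Unlike the root keys, a keypair belonging to a user with a policy like this cannot start EC2 instances or touch other buckets if it leaks.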


(Jeff Atwood) #4

Sure, feel free to make those copy changes as you see fit!


(Matt Palmer) #5

I’ve created this PR as a proposed alternate description. IIRC the option name comes from the variable name, which is rather tricky to change, so I’ve left it alone on the assumption that the description will be sufficient to inform people of what the option is used for.


#6

Thanks, guys. If i had read

i would immediately have known this setting wasn’t for me.

i’m finding in more than a few places that there’s a perfectly good tutorial for many of the things i want to do, but i don’t quickly find them via search (and part of the problem is you don’t know what you don’t know, so i might not even possess the words i need to use in a search query)

At any point do you (or have you considered) linking out to these tutorials directly from the description boxes next to the settings fields?


(Sam Saffron) #7

Not yet, but it’s an interesting suggestion for some of the settings. It has some downsides, because hardcoding links to meta would be a problem if you don’t have access to meta. I think improving descriptions is the best first step we can take.


#8

Not all that different from linking out to Quora or some other membership-walled site, no? (Also, the READ ME FIRST post links out to meta.discourse.org multiple times)

Anyway… i’m not sure i got answers to all three of my questions. Let me rephrase:

  1. What do i check to have my backups (and backups only) go to S3?

  2. Should i send all my uploads (backups and media both) to S3?

  3. When i click the button on the AWS site to regenerate a key for my root user (for the purpose of giving Discourse access to my bucket), will that break anything else that was using the previous key for access?

Thanks again!


(Matt Palmer) #9

It’s not the wall that’s the problem (and meta doesn’t have any walls anyway), it’s the fact that some Discourse installs are performed in disconnected environments (yes, it surprises me, too, that there are still computers not connected directly to the Internet). It’s not a huge problem, though, because it’s not a problem for most people, and the few people that are trying to run with both legs tied behind their back (how’s that for a mental image?) aren’t any worse off than they are now (because a non-existent link is no more informative than a link that doesn’t load).

  1. Fill in a keypair and tick “enable s3 backups” whilst ensuring “enable s3 uploads” isn’t ticked.

  2. Yes. No. Maybe. 42. We can’t answer that question for you, because we don’t run your site and don’t know the full extent of your financial, operational, and organisational constraints.

  3. Ignoring for now the admonition that you shouldn’t use root AWS credentials for granting access to S3 buckets, as long as you don’t retire or delete the previous key, generating another key should not interrupt service to anything using the existing key.
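The backups-only combination described above can be written down as a small checklist. This is only a Python model of the advice in this thread, not Discourse code: the dictionary keys are the setting names with underscores in place of spaces, and check_backups_only and the sample values are made up for illustration.

```python
def check_backups_only(settings):
    """Return a list of problems with a backups-only S3 configuration."""
    problems = []
    if not settings.get("enable_s3_backups"):
        problems.append("enable s3 backups is not ticked")
    if settings.get("enable_s3_uploads"):
        problems.append("enable s3 uploads is ticked, so attachments would go to S3 too")
    # The keypair lives in the Files settings even when only backups use it.
    for key in ("s3_access_key_id", "s3_secret_access_key"):
        if not settings.get(key):
            problems.append(key.replace("_", " ") + " is empty")
    return problems


# Placeholder values; a real keypair would come from a restricted IAM user.
backups_only = {
    "enable_s3_backups": True,
    "enable_s3_uploads": False,
    "s3_access_key_id": "AKIAEXAMPLE",
    "s3_secret_access_key": "example-secret",
}

print(check_backups_only(backups_only) or "looks like a backups-only setup")
```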


#10

Ok - here’s where you may need to speak slowly and clearly (and perhaps loudly) to me, because - i apologize - i am just not getting it.

Where does the keypair go - on the Backups page or on the Files page? i don’t see the option on the Backups page, so i’m assuming the Files page. This is one half of my confusion: i’m entering credentials for S3 in a completely different area than i think they should go. It feels like the flow should be:

  • Admin>Backups
  • Check “Upload backups to S3 when complete.”
  • Enter bucket name
  • Enter keypair

But instead, it’s

  • Admin>Backups
  • Check “Upload backups to S3 when complete.”
  • Enter bucket name
  • Go to a completely different page, and enter the keypair into fields that have labels describing something i’m not actually trying to do.

So! Next part of my confusion:

The fields where i think i’m supposed to enter the keypair, on the Files page. They say s3 access key id and s3 secret access key. Are those the correct fields, EVEN THOUGH they are labeled “The Amazon S3 access key (id) that will be used to upload images” and i’m not currently trying to configure Discourse to upload images? (Do you see what i mean?)

UPDATE: i think that’s where they go. What i think i’m really doing here is complaining of an unclear flow, with language and labels that are (i hope) understandably confusing.

Thanks again!


(Matt Palmer) #11

Honestly, I have no idea which category of setting anything is in. I just type s3 into the search box and up pops the relevant settings.

Some of your confusion comes, I think, from the fact that uploading backups to S3 was added some time after the ability to upload and serve images from S3, so the descriptions and classifications target the older terminology. PRs welcome to improve the wording, etc.

As far as the field names go for the keys, blame that one on Amazon: they’re appalling names, but if we tried to use anything else, even fewer people would know what we’re referring to.


#12

More missing knowledge! Do i have to branch the entire repo to suggest a small copy change like that? (i’d prefer not to, but i do want to contribute…)


(Andrew Schleifer) #13

Yes, you will have to fork the whole repository. The file to change will be server.en.yml in the discourse/discourse repository on GitHub (master branch).


#14

I just want to confirm that if we are in fact running Discourse under an EC2 instance, all we need to do is check the “Use an AWS EC2 instance profile to grant access to the S3 bucket” box, remove our previous S3 secret/access keys, and Discourse will then use the S3 client library to pull the creds it needs out of the instance metadata store all on its own?


(Andrew Schleifer) #15

Yes. Discourse uses the official AWS Ruby SDK to access S3.
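The precedence stated in the setting's note (the instance profile option overrides the “s3 access key id” and “s3 secret access key” settings) can be sketched as follows. This is a Python model of that one sentence, with made-up names (credential_source and the dictionary keys); it is not taken from the Discourse source or the SDK.

```python
def credential_source(settings):
    """Say which credentials would be used, per the setting's description."""
    # Ticking the instance profile box wins, even if keys are still filled in.
    if settings.get("s3_use_iam_profile"):
        return "instance_profile"
    if settings.get("s3_access_key_id") and settings.get("s3_secret_access_key"):
        return "static_keys"
    return "none"


# Placeholder values for illustration.
old_setup = {"s3_access_key_id": "AKIAEXAMPLE", "s3_secret_access_key": "example-secret"}
new_setup = dict(old_setup, s3_use_iam_profile=True)

print(credential_source(old_setup))
print(credential_source(new_setup))
```

Read this way, clearing the old keys after ticking the box is good hygiene, but going by the description it is the checkbox that decides which credentials get used.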


#16

Cool. So the connection is treated more or less like “localhost”, whereby the connection to the credentials library doesn’t require anything special given that the request is being made locally?


#17

I just tried it using the method discussed here and here’s what I got back from Discourse:

Access Denied /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/aws-sdk-core-3.6.0/lib/seahorse/client/plugins/raise_response_errors.rb:15:in call

S3 uploads via keys worked prior.