General guidelines for size of AWS Instance Types

Guys, I’m trying to figure out an instance type on AWS for about 200 users a month, each using the platform about twice a day. Is there a general rule of thumb I can follow to figure out what size instances I’d need to support those users?

The number seems pretty small, so I think I can get away with a medium instance, but I’d like to know if there are any scaling characteristics I should think about.

The best rule of thumb is “try it, and if you need more resources, get bigger instances, and if the instance is basically idle, get smaller instances”. Users is a terrible metric for sizing, BTW; page views is much better, particularly absolute peak page views per second. If you’re using auto-scaling you can cheat a bit and just use lots of little instances and have them appear and disappear, but that’s a lot more work (by like a couple of orders of magnitude) than just spinning up an instance per AZ and heading off to the pub.
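To make the "peak page views per second" point concrete, here is a rough back-of-the-envelope sketch. The 200 users and two visits a day come from the question (read as 200 active users each visiting twice daily); the pages per visit and the peak-to-average ratio are pure guesses for illustration, so plug in your own numbers.

```python
def peak_page_views_per_second(users, visits_per_user_per_day,
                               pages_per_visit, peak_to_average):
    """Estimate peak page views per second from rough traffic assumptions."""
    daily_page_views = users * visits_per_user_per_day * pages_per_visit
    average_per_second = daily_page_views / 86_400  # seconds in a day
    # Traffic is bursty, so scale the average by an assumed peak factor.
    return average_per_second * peak_to_average


# 200 users, twice a day each (from the question).
# 10 pages per visit and a 20x peak-to-average ratio are assumptions.
estimate = peak_page_views_per_second(200, 2, pages_per_visit=10,
                                      peak_to_average=20)
print(f"{estimate:.2f} page views/second at peak")  # 0.93 page views/second at peak
```

Even with a generous peak factor this lands at roughly one page view per second, which is why "try a smallish instance and watch it" is a reasonable starting point.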


Wouldn’t spinning up an instance per AZ be of similar complexity to the autoscaling option?
The only difference being that I’d have to add a bunch of rules on when to autoscale? (Or is the autoscaling part where the complexity increases tremendously?)

Static-machine-per-AZ is trickier than one machine for everything (separate database/redis, need to transport containers around on rebuild, etc), but that’s just the cost of doing business in AWS, where there are no guarantees about the longevity of anything running in a single AZ. Autoscaling, on the other hand, requires automating a whoooooooole pile of stuff (like central container build, pulling down the container image from somewhere, ensuring the app.yml is available, dealing with in-progress rebuilds, and a bunch of other stuff), and then on top of that, automating it quickly enough that the autoscaling is actually useful (no point taking 20 minutes to do all that stuff because by then the load spike’s probably gone because everyone’s left because your site was down), so you’re probably going to have to automate the build of the entire AMI on rebuild, which is another layer of entertaining complexity. Don’t forget the fiddling required to gracefully transition between launch configs when you roll your new AMI – and somewhere in there you need to run database migrations…
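The "quickly enough" part can be put into numbers. This is only a sketch: the function name and all the timings below are made up for illustration, and real figures depend on your alarms, AMI and container setup.

```python
def useful_scale_out_seconds(spike_duration_s, detection_delay_s, provisioning_s):
    """Seconds of a load spike that a newly launched instance can actually serve."""
    ready_after = detection_delay_s + provisioning_s
    # If the instance is ready only after the spike has ended, it served nothing.
    return max(0, spike_duration_s - ready_after)


# Hypothetical 15-minute spike, 2 minutes for the scaling alarm to fire.
slow = useful_scale_out_seconds(15 * 60, 2 * 60, 20 * 60)  # 20-minute build
fast = useful_scale_out_seconds(15 * 60, 2 * 60, 3 * 60)   # 3-minute boot from a prebuilt AMI
print(slow, fast)  # 0 600
```

With the 20-minute build the new instance arrives after the spike is over; with a prebuilt AMI it covers two thirds of it, which is the argument for baking the AMI ahead of time.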

AWS is not simple, and anyone who says differently is selling something. Probably AWS.


I’m currently undergoing this exercise myself - we want to host on AWS, but aren’t sure how we’d set up Discourse for “higher availability” without a lot of faff.

The way I’d see it, you’d need to do something like this:

  • RDS to host Postgres (with multi-AZ enabled - this allows configuration changes, including resizing the RDS instance, with minimal downtime, and gives you automatic failover if the DB seizes up)
  • EC2 r4.large instance for Redis (or ElastiCache Redis, but beware of vendor lock-in)
  • EC2 m5.large or any appropriate size for the actual Discourse instances
  • Classic ELB to load-balance traffic between the Discourse “web workers”
  • S3 for upload storage

As you can see, the majority of it involves breaking the Discourse stack up into its individual components and putting them onto separate machines. That obviously doesn’t give you automatic scaling (you’d need AMIs, a Launch Config and ASGs for that), and you’d obviously still have the issue with managing upgrades on all instances plus any DB migrations that are needed along with it.

Perhaps in the future, there could be a CloudFormation template for AWS, or a Packer template (for all providers), that does all this for you.

After you’ve gone through all that, you wonder if it’d just be easier to have a single AWS instance of the right size with daily backups going into S3. It’s not high availability, but it should suffice for all but the largest communities.

What I haven’t investigated yet (but will get to) is the possibility of using Kubernetes to provision/deploy Discourse. That way, in theory, you could set up a Kubernetes cluster on AWS with kops or something similar, and delegate responsibility for the Docker containers to Kubernetes. This might be worth reading.

If I make any progress, I’ll get back to you!


Also very helpful, even critical, to know: what percent of your pageviews are anonymous versus logged in?