Meta is moving to the Cloud 🌩

FYI: I had to re-login to meta since the move - I don’t know if it’s related.

8 Likes

Highly-likely related :wink:

2 Likes

This is very exciting news for those of us working globally. Look forward to hearing more.

Just to be clear, we estimate that running this site on AWS costs about $1000/month. It is extremely expensive. (Meta is deployed as an enterprise site on our end.)

5 Likes

Also @dax and @tobiaseigen have reported massively delayed oneboxing. @hawk has reported big delays in desktop notifications as well.

5 Likes

Wow that’s a lot.

Can you share your monthly page views please Jeff? Around a million a month?

More than double that.

5 Likes

Congratulations on the basically flawless move! I’d love to read what your architecture looks like on AWS.

3 Likes

Notification of this post just took 14 minutes to arrive:

UPDATE: It then took ~16 minutes to update this post to Onebox the above post reference.

We might have to just live with massively delayed notifications for a while, as @sam is working on another much more urgent internal problem.

4 Likes

Well, this one is a doozy. But I can explain it, and it makes perfect sense. Ever since we moved meta to AWS, AWS has been playing “silly buggers” with all our outbound mail. This is a known issue, caused by AWS throttling port 25 on EC2:

https://aws.amazon.com/premiumsupport/knowledge-center/ec2-port-25-throttle/

We have already opened an issue with AWS about this, but they are super slow to respond. To circumvent it we could use a different port (port 587, for example); however, that would require extensive reconfiguration of some public infrastructure we have. So instead, @supermathie is just going to move outgoing mail from meta to a third-party provider. We were able to get this working.
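For readers wondering what "use a different port" looks like in practice, here is a minimal Python sketch of sending mail through the submission port (587) with STARTTLS instead of port 25. It is not how Discourse sends mail; the host name, credentials, and the helper names `build_message` and `send_via_submission` are hypothetical, and the `smtp_factory` parameter exists only so the function can be exercised without a real mail server.

```python
import smtplib
from email.message import EmailMessage

# Mail submission port, used here instead of the throttled port 25.
SUBMISSION_PORT = 587


def build_message(sender, recipient, subject, body):
    """Build a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_submission(host, user, password, msg,
                        smtp_factory=smtplib.SMTP, timeout=10):
    """Send msg through host:587, upgrading the connection with STARTTLS.

    smtp_factory is injectable (hypothetical design choice) so the flow
    can be tested with a stand-in instead of a live SMTP server.
    """
    with smtp_factory(host, SUBMISSION_PORT, timeout=timeout) as server:
        server.starttls()             # encrypt before sending credentials
        server.login(user, password)  # submission normally requires auth
        server.send_message(msg)
```

The explicit `timeout` matters for the problem described here: a mail job that fails fast frees its worker sooner than one that hangs on a throttled connection.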

TLDR

  • Amazon clogged our mail by making mail jobs take forever.
  • Our mail jobs keep retrying, clogging our Sidekiq.
  • All jobs are massively delayed, including onebox, notifications and so on.
  • Profit

Should be fixed soon.
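To make the TLDR above concrete, here is a toy Python model of a fixed pool of workers draining one first-in, first-out queue. It is only a sketch: it is not Sidekiq's actual scheduler, it does not model retries, and the worker count, job durations, and function names are all made up for illustration. It shows how a handful of slow mail jobs can hold every worker and delay an unrelated notification job.

```python
import heapq


def simulate(jobs, workers):
    """Run jobs in FIFO order on a fixed worker pool.

    jobs: list of (name, arrival_seconds, duration_seconds), sorted by arrival.
    Returns {name: seconds the job waited before a worker picked it up}.
    """
    free_at = [0.0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    waits = {}
    for name, arrival, duration in jobs:
        earliest = heapq.heappop(free_at)   # next worker to become free
        start = max(arrival, earliest)
        waits[name] = start - arrival
        heapq.heappush(free_at, start + duration)
    return waits


def scenario(mail_seconds):
    """Four mail jobs at t=0, then one notification job at t=5 (hypothetical)."""
    jobs = [(f"mail-{i}", 0.0, mail_seconds) for i in range(1, 5)]
    jobs.append(("notification", 5.0, 1.0))
    return simulate(jobs, workers=2)


healthy = scenario(mail_seconds=1.0)     # mail delivers quickly
clogged = scenario(mail_seconds=300.0)   # mail hangs on a throttled port

print("notification wait, healthy:", healthy["notification"])
print("notification wait, clogged:", clogged["notification"])
```

With these made-up numbers the notification waits 0.0 seconds in the healthy case and 595.0 seconds in the clogged one, even though the notification job itself takes one second.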

Illustrated explanation:

image

17 Likes

Durrrrrrrrrrrrrrr oh man so obvious when you realize it.

Nice one!

3 Likes

That explains why the notification to a reply came three hours later :laughing: well done in sorting it Sam n co :clap:

Well, actually not; it is peanuts. We are not even considering this style of hosting for our regular standard/business/enterprise customers. For them, we plan to continue hosting on bare metal.

We are investigating a new region with bare metal in Europe at the moment to service people who must be hosted in Europe. They will get the same excellent performance and reliability our current hosting in SF provides.

AWS customers are in the “super enterprise” level. They are the type of customer that MUST be hosted on AWS, because of mantras such as “Nobody ever got fired for choosing IBM”. They must be on AWS because they simply must be. They are used to the very high costs of AWS; it is a fact of life.

Our “super enterprise” customers get an isolated “rack” in the cloud. This includes extensive monitoring, extensive levels of failover (multiple AZs), large EC2 instances and large DB instances, logs forwarded to Elasticsearch, and the list goes on and on. This means that for this “modest” meta config we have something like 12-15 EC2 instances plus dedicated database and ElastiCache instances.

Yes, there are economies of scale if we host a giant multisite in the cloud, at which point we get to share the monitoring aspect of our cloud infrastructure and cut costs; however, this is not in our plans for 2018. A rack in Europe is, though.

21 Likes

Thanks to @supermathie, mail is A-OK again and notifications will arrive nice and quick. The AWS :sneezing_face: is over.

10 Likes

As someone who has been following the “cloud vs. dedicated hardware” argument since Discourse’s early days, I’m seriously interested in a detailed discussion of the differences/costs/etc. What’s here is already instructive.

6 Likes

We’ve slowly been working costs down as we go, and we now have the cost of hosting meta down to around $1000/month on AWS – that’s with multiple tweaks over the last 6-8 months. When we started this it was closer to $3000/month. Really!

Before: $2,717 / month

image

After: $1,030.06 / month

image

Note that meta is deployed as “super enterprisey” in our testing so it is somewhat … overprovisioned. :wink:

We could do better long term by buying long-term reserved instances, which could cut the cost in half; that essentially means pre-paying for multiple years of service.
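For anyone who wants the arithmetic spelled out, here is a small Python sketch using the two monthly figures quoted above. The 50% reserved-instance discount is an assumption taken from the "cut the cost in half" remark, not a quoted AWS price, and the names are only illustrative.

```python
BEFORE = 2717.00   # monthly cost before the tweaks (from the post)
AFTER = 1030.06    # monthly cost after the tweaks (from the post)

# Assumption: reserved instances "could cut the cost in half".
RESERVED_DISCOUNT = 0.5


def percent_saved(before, after):
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100


monthly_saving = BEFORE - AFTER
reserved_estimate = AFTER * (1 - RESERVED_DISCOUNT)

print(f"Saved so far: ${monthly_saving:,.2f}/month "
      f"({percent_saved(BEFORE, AFTER):.1f}%)")
print(f"Rough estimate with reserved instances: ${reserved_estimate:,.2f}/month")
```

On these numbers the tweaks so far save about $1,686.94 a month (roughly 62.1%), and the reserved-instance guess would land near $515 a month.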

Any other thoughts this far in, @sam?

15 Likes

I am curious what changes/tweaks you’ve been doing to cut the cost in half? Is it more paying attention to AWS requirements, or is it tweaking code? A combination of many factors? Would the tweaks you’re making be useful information for others who may be, or may be thinking of, hosting on AWS?

The main piece of advice I have is … don’t do it. Don’t take on a complex, “enterprisey” cloud install unless you have to. It’s extremely expensive for what you get. Compare to a simple monolithic Digital Ocean droplet running our standard Docker image, which can get you a very long way even at the $40 and $80 per month price points.

9 Likes

Fair. I never plan to use AWS for it; as I said, it was more of a curiosity for me. :slight_smile: