YouTube embeds missing

Hello @Iceman

Yes, I searched the DB for you just now and could not find a table or field where that setting is stored (it was not in the site_settings table).

As for your comment:

… and the Ban is still up 5h later

I don’t have any special working knowledge of how Google manages these “bans”, but I suspect after their algos trigger a ban, it could take a much longer time for Google to “unban” (days or even weeks).

But then again, I do not have any working knowledge of how this process of banning “works” or even what the “official Google name of this banning process” is.

Do you?

Is there a Google support page where they discuss this? What is the exact name of this process of banning you are referring to?

2 Likes

There’s definitely not a support page where Google explains exactly how their DoS-prevention blocking, or any related system, works.

3 Likes

After Blood, Sweat and Tears, I think it’s solved.

However, I’m not particularly “proud” of the solution, but hey, it’s Google: they won’t talk or explain anything to you. So, conclusions:

  • First of all, one important lesson: Don’t enable IPv6 on DigitalOcean if you are using Discourse, because their IPv6 range is blocked by YouTube.

  • After the IPv6 issue was fixed, and regardless of host (I changed hosts a couple of times, what a journey), increasing traffic led YouTube to IP-block my Discourse installation, because of the number of YouTube videos posted on the site and how Discourse loads them.

  • To check for this block, either use your server as a proxy for a machine with a browser, or just run curl from the server and search the output for this line: “Sorry for the interruption. We have been receiving a large volume of requests from your network.” (there is this topic for reference)

  • Thanks to the help from @riking @neounix and @Overgrow I did a series of commands (that you can read above) to try to either stop, limit, or change the rate at which we bake the YouTube embeds. For most sites that would be enough, but we had the increased drama of being migrated after I tried a couple of hosts, so all the previous posts needed baking. As a matter of fact, limiting it to 1 every hour kind of solved it at first. But I guess my community just really likes to share videos because that didn’t last long.

  • Obviously there is no feedback or help from YouTube here, except for a couple of threads on their forum with the error and all the comments saying “yeah I also have that issue” but no solutions there.

  • Given the circumstances, remembering that infomercial logic of “There has to be another way!” I opted for a “Rambo” approach: Purchased another IP address. Then added a cron that switches the outbound IP address every hour. Issue solved.

It is expected that, if the site keeps growing and people just keep sharing that YouTube love I may need to acquire a third IP. But hey, until I figure out a correct way of doing a “distributed discourse” in a K8S or something, it’s as good as it gets.

Not the most elegant of solutions, I know.

Once again, thanks for all the help (and mostly patience, because I know I’m very n00bish with all the Rails/Sidekiq/RubyConsole combo, but I’m trying to improve by reading the Discourse Code).

Thanks!

7 Likes

Brilliant!

That’s a creative, effective, “out-of-the-box thinking” solution.

Congrats on solving your puzzle with style and finesse!

4 Likes

Having followed advice/recommendations I set up a CloudFront CDN for our AWS S3 bucket on our Discourse a few days ago.

I added the S3 CDN URL in our control panel, then duly issued a rebake command on 200,000+ posts.

Didn’t think much of it at that point; it was off working its magic for the next 12 hours or so.

We have many, many videos embedded in our Discourse. We are a drone/uav community and people are posting and sharing their pictures and videos all day long. Tens of thousands of YouTube videos are on our Discourse posts.

Hindsight…? After adding a CDN URL, I probably only needed to rebake posts matching a *.jpg pattern or similar :man_facepalming:t2: :cry:

Anyway, what’s happened?

YouTube have blocked the IP address of our server :pensive:

We can no longer onebox any YouTube links, our community is met with:

429 Too Many Requests

:pensive:

(a simple curl / wget on the server itself also returns the same thing)
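For anyone wanting to script that check, here is a minimal sketch in plain Ruby. It only classifies a response you have already fetched (with curl, wget, or anything else). The marker text is the one quoted earlier in this thread, and the method name `youtube_blocked?` is made up for illustration. YouTube may change the wording or the status code at any time, so treat this as a heuristic.

```ruby
# Marker text quoted earlier in this thread; YouTube may change it at any time.
BLOCK_MARKER = "We have been receiving a large volume of requests from your network"

# Returns true when a response looks like YouTube's rate-limit block.
# status: HTTP status code (Integer); body: response body (String or nil).
# Hypothetical helper, not part of Discourse.
def youtube_blocked?(status, body)
  return true if status == 429
  body.to_s.include?(BLOCK_MARKER)
end

# Example with canned responses (no network access needed):
puts youtube_blocked?(429, "")                    # true
puts youtube_blocked?(200, "<html>video</html>")  # false
```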

We obviously got blocked at some point during the rebake, as half of the existing posts that did have working videos don’t anymore :sob:

I’m assuming this block is permanent but as you’ll know, it’s impossible to find anyone at YouTube to contact and beg forgiveness.

On the off chance it’s permanent, a question for @Iceman please. Can you share the details of how you obtained a second IP address at DigitalOcean, and the changes you made to route “out” on that IP but leave incoming traffic on the existing IP?

And a question for everyone, does anybody know if this block is likely to be just temporary? :crossed_fingers:t2: And/or is there anything I can do in order to fix my now very broken YouTube posts?

For a heavily media-driven community, this is quite disastrous for us.

1 Like

It’s unlikely to be permanent, it will probably go away over time.

2 Likes

If someone is looking for the same change, you can use a Remap instead of a rebake, which is almost instant and doesn’t make requests to anywhere.
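To illustrate why a remap is so much cheaper, here is a rough sketch in plain Ruby, on the assumption that a remap boils down to a find-and-replace over text that is already stored. The hostnames and the `remap_text` helper are hypothetical and this is not Discourse's actual implementation. The point is only that nothing gets re-rendered, so no onebox lookup (and no YouTube request) happens.

```ruby
# Hypothetical hostnames, for illustration only.
OLD_BASE = "https://example-bucket.s3.amazonaws.com"
NEW_BASE = "https://cdn.example.com"

# Rewrites stored HTML instead of re-rendering it, so no onebox
# (and therefore no YouTube request) is triggered.
def remap_text(text, from, to)
  text.gsub(from, to)
end

cooked = '<img src="https://example-bucket.s3.amazonaws.com/a.jpg">' \
         '<iframe src="https://www.youtube.com/embed/VIDEO_ID"></iframe>'

# The image URL now points at the CDN; the YouTube embed is left untouched.
puts remap_text(cooked, OLD_BASE, NEW_BASE)
```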

5 Likes

Sorry for the late reply.

I can’t help you with DigitalOcean; I moved away from them when their IPv6 support was lacking.

I keep adding more and more IPs to the “switch” that handles outgoing requests to YT, rotating between IPs to avoid being banned. But they eventually get banned, and it’s a deadlock: if you are “banned” and that IP keeps making requests, it stays banned. You need to go between 1 and 8h without requests for YT to “unban” you. The more users you get and the more YT videos they post, the worse it gets.
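One way to get out of that deadlock is to stop sending requests as soon as a 429 shows up and stay quiet for a while. Here is a minimal sketch in plain Ruby. The `CooldownGate` class is hypothetical (it is not part of Discourse), and the 8h default is just the upper end of the range I observed, not a documented YouTube value.

```ruby
# Pauses outbound YouTube requests after a 429 so the block can expire.
# Hypothetical helper, not part of Discourse.
class CooldownGate
  # quiet_seconds: how long to stay silent after a 429 (assumed value).
  # clock: anything that responds to #call and returns the current Time.
  def initialize(quiet_seconds: 8 * 3600, clock: -> { Time.now })
    @quiet_seconds = quiet_seconds
    @clock = clock
    @blocked_until = nil
  end

  # True when it is OK to send a request right now.
  def open?
    @blocked_until.nil? || @clock.call >= @blocked_until
  end

  # Call with each response status; a 429 (re)starts the quiet period.
  def record(status)
    @blocked_until = @clock.call + @quiet_seconds if status == 429
  end
end

gate = CooldownGate.new
gate.record(429)
puts gate.open?   # false: stay quiet instead of hitting YouTube again
```

The caller would check `open?` before each request and skip (or queue) the onebox while the gate is closed.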

Thanks to the latest update to oneboxes it’s easier to spot (because you can see the 429 within Discourse instead of hunting for it because videos don’t display correctly). However, within my limited understanding I wonder if there is a better way of handling YT embedding. Because when the video is played the request does come from the client’s IP (I guess), but when the video is displayed it is your site doing the request per video.

3 Likes

Thanks @Iceman :+1:t2:

This is really useful to know.

I installed the onebox assistant plugin and routed all oneboxes through the embed.rocks proxy on Friday.

Today, on the server console, I tried a random wget of a YouTube video.

Sure enough, we’ve been unblocked!

Onebox assistant disabled again this afternoon and we’re pulling direct with no issues so far.

Had I not done that, I think you’re right and we would have never been unblocked because we’d be hitting YouTube every hour or so as people constantly post new videos :grimacing:

Thanks again :smiley:

3 Likes

Thanks for the additional info @Iceman and @Richie – it’s come up a fair bit recently, so any additional info about the way YouTube approaches rate limiting is super helpful.

We also want to reassure people that these “bans” are automatic in both directions – if you are patient and reduce the number of YouTube requests your site makes, you should be removed from Santa’s naughty list within a few days. :santa::page_with_curl:

4 Likes

I used a variant of this to gently process through a lot of posts. A low-risk, huge time-saver suggestion, thanks @riking!
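For anyone curious what “gently” can look like, here is a generic sketch in plain Ruby: work through a list in small batches and pause between them. This is not the command from earlier in the thread, just the general idea. `process_gently`, the batch size, and the pause length are all made-up illustrative values, and the block you pass in stands in for whatever per-post work (for example a rebake) you need.

```ruby
# Processes items in small batches, pausing between batches.
# batch_size and pause_seconds are illustrative values, not recommendations.
# sleeper is injectable so the pause can be swapped out or shortened.
def process_gently(items, batch_size: 10, pause_seconds: 60, sleeper: ->(s) { sleep(s) })
  done = 0
  items.each_slice(batch_size) do |batch|
    batch.each { |item| yield item }
    done += batch.size
    # No need to pause after the final batch.
    sleeper.call(pause_seconds) if done < items.size
  end
  done
end

# Example: the block is a stand-in for real work; the pause is zero for the demo.
count = process_gently((1..25).to_a, batch_size: 10, pause_seconds: 0) { |id| id }
puts count   # 25
```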

FYI I used this to track progress:

Post.where(baked_version: nil).count

1 Like