"Show Full Post" button doesn't work in subfolder installations

I recently moved our Discourse installation to a subfolder. After doing that, the “Show full post” button stopped working – you click to expand the content, but it doesn’t load the full post.

Nothing changed in my WP Discourse configs.

https://tecnoblog.net/comunidade/t/paramount-oferece-us-108-bilhoes-em-dinheiro-para-tomar-warner-da-netflix/157441

When accessing the embed url directly in the browser, it returns a 404 error:

https://tecnoblog.net/comunidade/posts/483289/expand-embed

This is unrelated; this route only responds with an application/json content type. https://tecnoblog.net/comunidade/posts/483289/expand-embed.json is returning:

"<div><div></div></div>\n<hr>\n<small>Este é um tópico de discussão auxiliar para a entrada original em <a href='https://tecnoblog.net/noticias/paramount-oferece-us-108-bilhoes-em-dinheiro-para-tomar-warner-da-netflix'>https://tecnoblog.net/noticias/paramount-oferece-us-108-bilhoes-em-dinheiro-para-tomar-warner-da-netflix</a></small>\n"

The <div><div></div></div> should be the content.

Did you also change the blog URL by any chance?

The onebox display also feels odd to me, I’d expect it to have a cached truncated content instead, so I’m assuming body.present? is false in the above conditional.

Can you enter the Rails console and check if TopicEmbed.where(topic_id: 157441).pick(:embed_url) shows you the correct blog content URL?

Can you spot any related errors on https://tecnoblog.net/comunidade/logs?

Oh, ok!

It returns the post url:

discourse(prod)> TopicEmbed.where(topic_id: 157441).pick(:embed_url)
=> "https://tecnoblog.net/noticias/paramount-oferece-us-108-bilhoes-em-dinheiro-para-tomar-warner-da-netflix"

I don’t think there are any related errors in the log.

Nope! The blog URL has always been `tecnoblog.net`

Also worth mentioning that the server’s IP is bypassed in Cloudflare’s firewall.

I’ve had to debug issues like this a couple of times, and it’s complicated, so bear with me.

Run the following script and share the output here:

# Replace with the topic ID or URL you're debugging
topic_id = 386983

# 1. Check if TopicEmbed exists and its content
te = TopicEmbed.find_by(topic_id: topic_id)
puts "TopicEmbed exists: #{te.present?}"
puts "Embed URL: #{te&.embed_url}"
puts "Content cache present: #{te&.embed_content_cache.present?}"
puts "Content cache length: #{te&.embed_content_cache&.length || 0}"
puts "Content SHA1: #{te&.content_sha1}"

# 2. Check the actual cached content (first 500 chars)
puts "\n--- Cached content preview ---"
puts te&.embed_content_cache&.truncate(500)

# 3. Try fetching from the remote URL
if te&.embed_url.present?
  puts "\n--- Attempting remote fetch ---"
  begin
    response = TopicEmbed.find_remote(te.embed_url)
    puts "Remote fetch success: #{response.present?}"
    puts "Remote body present: #{response&.body.present?}"
    puts "Remote body length: #{response&.body&.length || 0}"
    puts "Remote title: #{response&.title}"
    puts "Remote body: #{response&.body&.truncate(500)}"
  rescue => e
    puts "Remote fetch FAILED: #{e.message}"
  end
end

# 4. Check what expanded_for would return
if te.present?
  puts "\n--- Testing expanded_for ---"
  post = Post.find(te.post_id)

  # Clear cache to force fresh fetch
  Discourse.cache.delete("embed-topic:#{topic_id}")

  begin
    expanded = TopicEmbed.expanded_for(post)
    puts "Expanded content present: #{expanded.present?}"
    puts "Expanded content length: #{expanded&.length || 0}"
  rescue => e
    puts "expanded_for FAILED: #{e.message}"
  end
end

# 5. Check relevant settings
puts "\n--- Site Settings ---"
puts "embed_truncate: #{SiteSetting.embed_truncate}"
puts "allowed_embed_selectors: #{SiteSetting.allowed_embed_selectors}"
puts "blocked_embed_selectors: #{SiteSetting.blocked_embed_selectors}"

This will show why https://tecnoblog.net/comunidade/t/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento/157462?u=falco is failing

discourse(prod)> # Replace with the topic ID or URL you're debugging
discourse(prod)> topic_id = 386983
discourse(prod)>
discourse(prod)> # 1. Check if TopicEmbed exists and its content
discourse(prod)> te = TopicEmbed.find_by(topic_id: topic_id)
discourse(prod)> puts "TopicEmbed exists: #{te.present?}"
discourse(prod)> puts "Embed URL: #{te&.embed_url}"
discourse(prod)> puts "Content cache present: #{te&.embed_content_cache.present?}"
discourse(prod)> puts "Content cache length: #{te&.embed_content_cache&.length || 0}"
discourse(prod)> puts "Content SHA1: #{te&.content_sha1}"
discourse(prod)>
discourse(prod)> # 2. Check the actual cached content (first 500 chars)
discourse(prod)> puts "\n--- Cached content preview ---"
discourse(prod)> puts te&.embed_content_cache&.truncate(500)
discourse(prod)>
discourse(prod)> # 3. Try fetching from the remote URL
discourse(prod)* if te&.embed_url.present?
discourse(prod)*   puts "\n--- Attempting remote fetch ---"
discourse(prod)*   begin
discourse(prod)*     response = TopicEmbed.find_remote(te.embed_url)
discourse(prod)*     puts "Remote fetch success: #{response.present?}"
discourse(prod)*     puts "Remote body present: #{response&.body.present?}"
discourse(prod)*     puts "Remote body length: #{response&.body&.length || 0}"
discourse(prod)*     puts "Remote title: #{response&.title}"
discourse(prod)*     puts "Remote body: #{response&.body&.truncate(500)}"
discourse(prod)*   rescue => e
discourse(prod)*     puts "Remote fetch FAILED: #{e.message}"
discourse(prod)*   end
discourse(prod)> end
discourse(prod)>
discourse(prod)> # 4. Check what expanded_for would return
discourse(prod)* if te.present?
discourse(prod)*   puts "\n--- Testing expanded_for ---"
discourse(prod)*   post = Post.find(te.post_id)
discourse(prod)*
discourse(prod)*   # Clear cache to force fresh fetch
discourse(prod)*   Discourse.cache.delete("embed-topic:#{topic_id}")
discourse(prod)*
discourse(prod)*   begin
discourse(prod)*     expanded = TopicEmbed.expanded_for(post)
discourse(prod)*     puts "Expanded content present: #{expanded.present?}"
discourse(prod)*     puts "Expanded content length: #{expanded&.length || 0}"
discourse(prod)*   rescue => e
discourse(prod)*     puts "expanded_for FAILED: #{e.message}"
discourse(prod)*   end
discourse(prod)> end
discourse(prod)>
discourse(prod)> # 5. Check relevant settings
discourse(prod)> puts "\n--- Site Settings ---"
discourse(prod)> puts "embed_truncate: #{SiteSetting.embed_truncate}"
discourse(prod)> puts "allowed_embed_selectors: #{SiteSetting.allowed_embed_selectors}"
discourse(prod)> puts "blocked_embed_selectors: #{SiteSetting.blocked_embed_selectors}"
TopicEmbed exists: false
Embed URL:
Content cache present: false
Content cache length: 0
Content SHA1:

--- Cached content preview ---

--- Site Settings ---
embed_truncate: true
allowed_embed_selectors:
blocked_embed_selectors:
=> nil
discourse(prod)>

:thinking:

Are you sure this is the right topic id? https://tecnoblog.net/comunidade/t/-/386983 leads to a 404.

Oh, that’s it. The topic I linked is actually 157462.

My bad!

Here are the results for the correct topic ID:

TopicEmbed exists: true
Embed URL: https://tecnoblog.net/noticias/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento
Content cache present: true
Content cache length: 22
Content SHA1:

--- Cached content preview ---

<div><div></div></div>

--- Attempting remote fetch ---
Remote fetch success: true
Remote body present: true
Remote body length: 22
Remote title:
Remote body: 

--- Testing expanded_for ---
Expanded content present: true
Expanded content length: 309

--- Site Settings ---
embed_truncate: true
allowed_embed_selectors:
blocked_embed_selectors:
=> nil

Did your Cloudflare bypass work? It looks like the body for https://tecnoblog.net/noticias/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento is just 22 chars, with no title tag.

Yes! All requests from the Discourse server are bypassed.

What I’ve noticed is that the embed URL doesn’t have a trailing slash. All of our URLs should end with a trailing slash.

So maybe Discourse is not following the redirect?

But also, why is it saving the URL without the trailing slash?
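
One way to check the redirect theory is to look at what the blog answers for the slash-less URL before any redirect is followed. A minimal sketch using Ruby's standard Net::HTTP; `describe_response` is a hypothetical helper name, not part of Discourse:

```ruby
require "net/http"
require "uri"

# Hypothetical helper (not part of Discourse): summarize a single HTTP
# response. Nothing here follows the redirect; it only reports it.
def describe_response(response)
  if response.is_a?(Net::HTTPRedirection)
    "#{response.code} redirect to #{response['location']}"
  else
    "#{response.code} (no redirect)"
  end
end

# Usage from the Rails console or irb (this performs a real request):
#   uri = URI("https://tecnoblog.net/noticias/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento")
#   puts describe_response(Net::HTTP.get_response(uri))
```

If the slash-less URL reports a redirect to the slash-terminated one, the question becomes whether the fetch on the Discourse side follows it.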

That is easy to test. Try:

url = "https://tecnoblog.net/noticias/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento/"
response = TopicEmbed.find_remote(url)
puts "Remote fetch success: #{response.present?}"
puts "Remote body present: #{response&.body.present?}"
puts "Remote body length: #{response&.body&.length || 0}"
puts "Remote title: #{response&.title}"
puts "Remote body: #{response&.body&.truncate(500)}"

I think it works:

discourse(prod)> url = "https://tecnoblog.net/noticias/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento/"
discourse(prod)> response = TopicEmbed.find_remote(url)
discourse(prod)> puts "Remote fetch success: #{response.present?}"
discourse(prod)> puts "Remote body present: #{response&.body.present?}"
discourse(prod)> puts "Remote body length: #{response&.body&.length || 0}"
discourse(prod)> puts "Remote title: #{response&.title}"
discourse(prod)> puts "Remote body: #{response&.body&.truncate(500)}"
Remote fetch success: true
Remote body present: true
Remote body length: 3776
Remote title: Governo renova app da CNH para baratear obtenção do documento • Tecnoblog
Remote body: 


<figure><img src="https://files.tecnoblog.net/wp-content/uploads/2025/12/cnh-brasil-app-1060x596.jpg">

	<figcaption>Aplicativo CNH do Brasil (imagem: Emerson Alecrim/Tecnoblog)</figcaption></figure>

</div>

<details>
    Resumo
    <div><ul>
<li>App CNH do Brasil substitui CDT e passa a oferecer recursos para obtenção da CNH, em especial, aulas teóricas gratuitas;</li>
<li>Aulas práticas continuam obrigatórias, mas a carga horária mínima foi reduzida de ...
=> nil


Here without the trailing slash:

discourse(prod)> url = "https://tecnoblog.net/noticias/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento"
discourse(prod)> response = TopicEmbed.find_remote(url)
discourse(prod)> puts "Remote fetch success: #{response.present?}"
discourse(prod)> puts "Remote body present: #{response&.body.present?}"
discourse(prod)> puts "Remote body length: #{response&.body&.length || 0}"
discourse(prod)> puts "Remote title: #{response&.title}"
discourse(prod)> puts "Remote body: #{response&.body&.truncate(500)}"
Remote fetch success: true
Remote body present: true
Remote body length: 22
Remote title:
Remote body: 
=> nil

The same error happens in old posts, where the post slug has changed.

For example, in this post, the url used to be:

https://tecnoblog.net/486925/o-que-e-pirataria-digital/

Now, it changed to:

https://tecnoblog.net/responde/o-que-e-pirataria-digital/

That appears to be the principal issue. When using Embed Discourse comments on another website via JavaScript, you control that via a parameter, so it’s super easy to fix.

I’m not familiar with how WP-Discourse determines this. It should be using the post canonical, but I’m not sure about that. Any ideas, @angus?

Is there a way to force Discourse to update all the embed URLs in a category, following each one to its final destination?

I intend to migrate to the Discourse embed (the full embed that you’ve been testing) when it’s ready for production. But if the embed URLs don’t match, it would probably create new topics for every post and lose the comments…

Run

te = TopicEmbed.find_by(topic_id: 157462)
te.embed_url = te.embed_url + "/"
te.save

Does that fix https://tecnoblog.net/comunidade/t/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento/157462 ?

It works!

But is there a fix for cases like this?

Gemini suggested this code:

# Configuration
CATEGORY_SLUG = 'tb'
category = Category.find_by(slug: CATEGORY_SLUG)

unless category
  puts "ERROR: Category '#{CATEGORY_SLUG}' not found."
  exit
end

puts "Starting full URL scan in category '#{category.name}'..."
puts "This may take a while depending on the number of topics and how quickly your site responds..."

count_updated = 0
count_errors = 0
count_ok = 0

Topic.where(category_id: category.id).find_each do |topic|
  current_url = topic.custom_fields["embed_url"]

  # Skip if there is no embed_url
  next unless current_url.present?

  begin
    # Perform the GET request, following redirects
    response = Faraday.get(current_url)
    final_url = response.env.url.to_s

    # If the request succeeded (200 OK)
    if response.status == 200
      # Check whether the final URL differs from the URL saved in the database
      # The comparison could ignore subtle differences if needed, but here we compare exact strings
      if final_url != current_url
        puts "\n[UPDATE] Topic ##{topic.id}:"
        puts "   From: #{current_url}"
        puts "   To:   #{final_url}"

        topic.custom_fields["embed_url"] = final_url
        topic.save_custom_fields(true)
        count_updated += 1
      else
        # print "." # Uncomment to see visual progress (dots)
        count_ok += 1
      end
    else
      puts "\n[HTTP ERROR #{response.status}] Topic ##{topic.id} - URL: #{current_url}"
      count_errors += 1
    end

  rescue Faraday::ConnectionFailed, Faraday::TimeoutError => e
    puts "\n[CONNECTION FAILURE] Topic ##{topic.id} - URL: #{current_url} - #{e.message}"
    count_errors += 1
  rescue StandardError => e
    puts "\n[GENERAL ERROR] Topic ##{topic.id} - #{e.message}"
    count_errors += 1
  end

  # Optional: short pause to avoid overloading your WordPress server
  # sleep 0.1
end

puts "\n\nFinal Summary:"
puts "------------------------------------------------"
puts "Topics Checked (OK): #{count_ok}"
puts "Topics Updated:      #{count_updated}"
puts "Errors Found:        #{count_errors}"
puts "------------------------------------------------"

Finally some progress :sweat_smile:

A script like that is a great idea, just take a backup before running it.

Even a backup of just this small table would be great.
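
For the small-table backup, one lightweight option (in addition to a regular full backup, not instead of it) is to dump the current values to a file from the Rails console before changing anything. A sketch, assuming the standard `id` column plus the `topic_id` and `embed_url` columns used earlier in this thread; `embeds_to_json` and the output path are hypothetical:

```ruby
require "json"

# Hypothetical helper: turn [id, topic_id, embed_url] rows into a JSON
# document that can be used to restore the old values by hand.
def embeds_to_json(rows)
  records = rows.map do |id, topic_id, embed_url|
    { id: id, topic_id: topic_id, embed_url: embed_url }
  end
  JSON.pretty_generate(records)
end

# In the Rails console (the path is only an example):
#   rows = TopicEmbed.pluck(:id, :topic_id, :embed_url)
#   File.write("/tmp/topic_embeds_backup.json", embeds_to_json(rows))
```

Keeping the row `id` alongside the old `embed_url` makes it possible to put individual values back if the bulk update turns out to be wrong.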

Ok! I’ll try to run it later, when the team finishes their shift.

Hey guys, I see the trailing slash has struck again :slight_smile:

[Trailing slashes] appear to be the principal issue. When using Embed Discourse comments on another website via JavaScript, you control that via a parameter, so it’s super easy to fix.

Just a note that all topic embeds in Discourse strip trailing slashes from the embed_url; see TopicEmbed.normalize_url. As a result of a separate case involving the intersection of JavaScript embeds and WP Discourse embeds, we standardised this handling across both methods of embedding. See the pull request "Apply TopicEmbed url normalisation to embed urls inserted in the PostCreator" by angusmcleod (Pull Request #30641, discourse/discourse on GitHub).
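
To make the effect concrete, here is a rough illustration of trailing-slash stripping. It is not the actual TopicEmbed.normalize_url implementation (which may do more than this), and `strip_trailing_slash` is a hypothetical name:

```ruby
# Illustration only: NOT Discourse's TopicEmbed.normalize_url.
# strip_trailing_slash is a hypothetical helper.
def strip_trailing_slash(url)
  url.strip.sub(%r{/+\z}, "")
end

canonical = "https://tecnoblog.net/noticias/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento/"
stored = strip_trailing_slash(canonical)

puts stored
puts "differs from canonical: #{stored != canonical}"
```

If a blog only serves the article at the slash-terminated URL, fetching the stored value then depends on the redirect to the canonical URL being followed.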

@Thiago_Mobilon In the course of this move, did you also update your Discourse? It may be that the standardised embed_url normalization is now being applied to your WP Discourse embeds because Discourse was updated at the same time as the move to the subfolder installation. What version of Discourse are you currently running (and what version were you running prior to the move, if you know)?

Just a side note that when I run these two commands locally on the latest version of Discourse, I’m getting the same result, namely the HTML body of the article

# with trailing slash
TopicEmbed.find_remote("https://tecnoblog.net/noticias/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento/")

# without trailing slash
TopicEmbed.find_remote("https://tecnoblog.net/noticias/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento")

# produces the same result

Did you perhaps make a change on the WordPress side of things?

Edit: Ah, reading this topic a bit more closely, I see that your issue is perhaps not to do with your move of Discourse to a subfolder installation, or with trailing slashes; rather, it is perhaps the migration of your WordPress URLs, i.e.

For example, in this post, the url used to be:

https://tecnoblog.net/486925/o-que-e-pirataria-digital/

Now, it changed to:

https://tecnoblog.net/responde/o-que-e-pirataria-digital/

So perhaps the issue is that you have topic_embeds.embed_url with your old url structure and FinalDestination is not resolving the new urls for whatever reason (i.e. it’s unable to follow a redirect)

In that case you will either need to ensure that your old blog urls redirect to your new blog urls, or you’ll need to migrate the topic_embeds.embed_url. On the migration front, note that your script is incorrect, e.g. topic.custom_fields["embed_url"] is not where the embed_url is stored.

Here’s what I would suggest if you want to go the migration route (as opposed to redirecting your old blog urls to the new ones). First, confirm your issue is an incorrect blog url format in topic_embeds.embed_url by looking at an example, e.g. TopicEmbed.find_by(topic_id: 157441). Then, if you see that the issue is indeed that you have the old url format saved in that column, run this to update all of the old format embed_urls in a specific category:

category_id = 0 # placeholder: enter a category id here

TopicEmbed.joins(:topic).where(topics: { category_id: category_id }).find_each do |embed|
  # Replace the first all-numeric path segment with "/responde/"
  new_url = embed.embed_url.sub(%r{/\d+/}, "/responde/")
  embed.update!(embed_url: new_url) if new_url != embed.embed_url
end

Note that the regex substitution of the old format to the new (sub(%r{/\d+/}, "/responde/")) is just a guess based on the example you provided. You can test the effect of that on your real urls here: https://regex101.com/
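
The substitution can also be dry-run in plain Ruby before touching the database. A small sketch using the two URL shapes that appear in this topic; `migrated_url` is a hypothetical helper name:

```ruby
# Same pattern as the migration snippet: the first all-numeric path segment.
OLD_SEGMENT = %r{/\d+/}

# Hypothetical helper: returns what the migration would store for a URL.
def migrated_url(url)
  url.sub(OLD_SEGMENT, "/responde/")
end

samples = [
  "https://tecnoblog.net/486925/o-que-e-pirataria-digital/",
  "https://tecnoblog.net/noticias/governo-renova-app-da-cnh-para-baratear-obtencao-do-documento",
]

samples.each do |url|
  new_url = migrated_url(url)
  status = new_url == url ? "unchanged" : "would change"
  puts "#{status}: #{url} -> #{new_url}"
end
```

The first sample is rewritten to the /responde/ URL and the second is left alone. Be aware that the pattern matches any path segment made only of digits, so date-style paths such as /2025/12/ would also be rewritten; check a few real rows before running the update.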

Hi Angus!

No, these are different issues. The trailing slash problem started when we moved to the subfolder, but there are also old URLs from years ago that now have a different slug.

I had to rebuild the installation, so yeah, I think this new standard might be the cause.

My suggestion to fix this issue: can’t Discourse follow at least one or two redirects to retrieve the data? That would fix the trailing slash issue and also bulletproof the website against possible URL changes in the future.

It’s also safer, since there would be no need to run scripts to update old topics, which could do some damage to the database.
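
To illustrate what "follow one or two redirects" could look like, here is a minimal sketch in plain Ruby. It is not how Discourse's FinalDestination works; `resolve_final_url`, `DEFAULT_FETCH` and the `max_redirects` parameter are hypothetical names chosen for this example:

```ruby
require "net/http"
require "uri"

# Default fetcher: one real request, no redirect following.
# Returns [status_code, location_header_or_nil].
DEFAULT_FETCH = lambda do |url|
  response = Net::HTTP.get_response(URI(url))
  [response.code.to_i, response["location"]]
end

# Hypothetical helper: follow at most max_redirects redirects and return
# the final URL, or nil if the chain is still redirecting after that.
def resolve_final_url(url, max_redirects: 2, fetch: DEFAULT_FETCH)
  current = url
  (max_redirects + 1).times do
    status, location = fetch.call(current)
    return current unless (300..399).cover?(status) && location
    # Location may be relative, so resolve it against the current URL.
    current = URI.join(current, location).to_s
  end
  nil
end

# Usage (performs real requests):
#   resolve_final_url("https://tecnoblog.net/486925/o-que-e-pirataria-digital/")
```

The fetcher is passed in as a parameter so the loop can be exercised without network access; capping the number of hops avoids looping forever on a misconfigured redirect chain.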