AI Translation Progress Graph

I activated AI translations on my self-hosted Discourse server yesterday. Everything is going well… sort of.

The translation progress graph is either not refreshing or inaccurate. Here is what I am seeing:

This would indicate that I am at 99% for all posts since Feb 23, 2026.

This is not accurate. I have roughly 3,000 posts for this timeframe. Based on the translation logs, it is currently working on posts from about 6 days ago.

So, does anyone know:

  • What is the refresh rate for this graph?
  • The data explorer query for posts waiting to be translated?
  • The data explorer query for posts translated?
  • The data explorer query for posts where a translation was attempted but failed due to some error/reason?

Thank you in advance.

We had to cache this page as it was timing out on big sites.

The translation has basically two steps:

  1. Detect post original language
  2. Translate to every other language

At the beginning, your site will mostly be stuck doing step 1, which we initially showed in this translation progress page but have since removed because feedback said it was “too much information”.
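To illustrate how the page can read 99% while most of the backlog is untouched, here is a minimal Python sketch. It assumes (this is not confirmed in the thread) that the page measures translations only against posts whose language has already been detected. The function names and the numbers are invented for illustration.

```python
def progress_over_detected(detected: int, translated: int) -> float:
    """Percent translated, counting only posts whose locale is already detected."""
    return 100.0 * translated / detected if detected else 0.0


def progress_over_eligible(eligible: int, translated: int) -> float:
    """Percent translated, counting every eligible post, detected or not."""
    return 100.0 * translated / eligible if eligible else 0.0


# Hypothetical snapshot early in a backfill: 3,000 eligible posts,
# locale detected on 100 of them, 99 of those already translated.
eligible, detected, translated = 3000, 100, 99

print(f"over detected posts: {progress_over_detected(detected, translated):.1f}%")
print(f"over eligible posts: {progress_over_eligible(eligible, translated):.1f}%")
```

Under that assumption the first figure looks nearly finished while the second shows how little of the eligible workload is actually done.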

That said, this is good feedback. The progress page cache and the lack of information about locale detection make the experience awful at the beginning of the process.

One last thing: we set the backfill rate very conservatively, so you may want to increase it according to your budget.
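As a rough way to reason about the backfill rate, the sketch below estimates how long a backlog takes to clear at a fixed hourly rate. The function name and the rates are hypothetical, not Discourse settings or defaults, and a real backfill may need more than one unit of work per post (language detection plus one translation per target language).

```python
import math


def hours_to_clear(backlog: int, items_per_hour: int) -> int:
    """Whole hours needed to process `backlog` items at a fixed hourly rate."""
    if items_per_hour <= 0:
        raise ValueError("items_per_hour must be positive")
    return math.ceil(backlog / items_per_hour)


backlog = 3000  # roughly the post count mentioned earlier in the thread
for rate in (50, 200, 1000):  # hypothetical rates, not Discourse defaults
    print(f"{rate:>5} items/hour -> about {hours_to_clear(backlog, rate)} hours")
```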

I would suggest considering a simpler reporting approach.

Based on my understanding, the translation process works in two stages:

  1. Translate eligible topic data.
  2. Translate eligible post data.

From an administrative and reporting perspective, I am less interested in what is currently in flight and more interested in overall progress against the eligible workload. I would prefer to see reporting based on the configured translation eligibility rules.

For example:

Status

Backfill settings are configured to translate all content created after February 23, 2026.

| Area | Total | Eligible | Translated | Completed |
|---|---|---|---|---|
| Topics | 25,000 | 540 | 450 | 83% |
| Posts | 400,000 | 3,700 | 800 | 22% |

Failed Translations

| Post ID | Reason |
|---|---|
| 34543 | Malformed characters on line xxxx |

The current graph appears to show operational activity, which is certainly useful. However, what I really want to understand is how much of the eligible work has been completed.
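The "Completed" column in the example status table is simply translated items divided by eligible items. A small Python sketch of that calculation, using the example figures from the table (the class and property names are made up for illustration):

```python
from dataclasses import dataclass


@dataclass
class AreaProgress:
    area: str
    total: int
    eligible: int
    translated: int

    @property
    def completed_pct(self) -> int:
        """Translated items as a rounded percentage of the eligible workload."""
        if self.eligible == 0:
            return 100  # nothing eligible means nothing left to do
        return round(100 * self.translated / self.eligible)


rows = [
    AreaProgress("Topics", 25_000, 540, 450),
    AreaProgress("Posts", 400_000, 3_700, 800),
]
for r in rows:
    print(f"{r.area:<7}{r.total:>9,}{r.eligible:>9,}{r.translated:>11,}{r.completed_pct:>9}%")
```

The total column plays no part in the percentage; only the eligible workload is used as the denominator.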

Personally, I am not very interested in completion percentages by language. A topic or post is either translated or it is not. The key question for me is how much of the configured backlog has been processed successfully.

This approach also seems like it would be more database-friendly since it focuses on aggregate counts rather than tracking progress across every language combination.

If language-specific reporting is still valuable, perhaps it could be exposed through a filter. An administrator could select a language and view the same progress table for that language only.

Just some thoughts.

p.s.

What is the current cache timeframe?

I have been working on the data explorer SQL to create the report listed above. I am not a SQL expert, but these are working. The queries below will provide this level of information.

Translation Status Report

| Area | Eligible | Translated | To Be Translated |
|---|---|---|---|
| Topics | 540 | 450 | 90 |
| Posts | 3,700 | 800 | 2,900 |

Topics:

-- Setup:
-- Update the SQL to match your translation settings
-- Days to Backfill - INTERVAL 'xxx'
-- Category to ignore - category_id NOT IN ()
-- Topic type - regular or private_message
--
-- Translation status:
--  Total topics to translate: comment out both 'AND topics.locale' conditions
--  Topics not translated: uncomment only - topics.locale is null
--  Topics translated: uncomment only - topics.locale = 'en'

SELECT count(distinct topics.id)
     FROM topics
     JOIN posts  ON topics.id  = posts.topic_id
    WHERE posts.created_at >= NOW() - INTERVAL '100 days' 
     AND  posts.user_id > 0
     AND  topics.category_id NOT IN (22,3)
     AND  topics.archetype = 'regular'
--   AND  topics.locale = 'en'
--   AND  topics.locale is null

Posts:

-- Setup:
-- Update the SQL to match your translation settings
-- Days to Backfill - INTERVAL 'xxx'
-- Category to ignore - category_id NOT IN ()
-- Topic type - regular or private_message
--
-- Translation status:
--  Total posts to translate: comment out both 'AND posts.locale' conditions
--  Posts not translated: uncomment only - posts.locale is null
--  Posts translated: uncomment only - posts.locale = 'en'

SELECT count(*)
     FROM posts
     JOIN topics  ON topics.id  = posts.topic_id
    WHERE posts.created_at >= NOW() - INTERVAL '100 days' 
     AND  posts.user_id > 0
     AND  topics.category_id NOT IN (22,3)
     AND  topics.archetype = 'regular'
--     AND  posts.locale = 'en'
--     AND  posts.locale is null
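The three variants of the posts query can be collapsed into a single pass with conditional aggregation. The sketch below runs that idea against an in-memory SQLite database from Python so it can be tried without a Discourse install. The column names mirror the queries above, the sample rows are invented, and the date filter is left out because SQLite's date syntax differs from PostgreSQL's. It also assumes that any non-null locale means the post has been processed, which is a looser test than the `= 'en'` check above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topics (id INTEGER PRIMARY KEY, category_id INTEGER,
                         archetype TEXT, locale TEXT);
    CREATE TABLE posts  (id INTEGER PRIMARY KEY, topic_id INTEGER,
                         user_id INTEGER, locale TEXT);
    -- Invented sample rows: topic 2 sits in an ignored category.
    INSERT INTO topics VALUES (1, 5, 'regular', 'en'), (2, 22, 'regular', NULL);
    INSERT INTO posts  VALUES (10, 1, 7, 'en'), (11, 1, 8, NULL),
                              (12, 1, -1, NULL), (13, 2, 7, NULL);
""")

# One query returns all three counts instead of three separate runs.
QUERY = """
    SELECT COUNT(*) AS eligible,
           SUM(CASE WHEN posts.locale IS NOT NULL THEN 1 ELSE 0 END) AS processed,
           SUM(CASE WHEN posts.locale IS NULL     THEN 1 ELSE 0 END) AS pending
      FROM posts
      JOIN topics ON topics.id = posts.topic_id
     WHERE posts.user_id > 0
       AND topics.category_id NOT IN (22, 3)
       AND topics.archetype = 'regular'
"""

eligible, processed, pending = conn.execute(QUERY).fetchone()
print(f"eligible={eligible} processed={processed} pending={pending}")
```

The `SUM(CASE ...)` form is also valid PostgreSQL, so the same SELECT should carry over to Data Explorer once the `posts.created_at >= NOW() - INTERVAL '100 days'` condition is added back, though that has not been verified here.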

Thanks a ton for the feedback @LotusJeff

I’m noting down some of your points here and will make improvements soon™ (probably July).

Re: the DE queries, let me get back to you tomorrow for a more precise one.