Token usage problem after enabling AI translation

I followed the official tutorial to configure Discourse AI for translation and set it to translate all past posts over a span of days. As expected, this consumed a substantial number of input and output tokens. However, after two days I noticed that input tokens were still being consumed while no output tokens were produced. I am unsure of the cause. Could it be that all the previous posts have already been translated? If so, what can I do to reduce token input and keep costs down?

Hey there, have you followed these recommendations?

The usage graph definitely looks concerning. Can you try out this data explorer query:

-- Recent translation calls where a short source post produced a large response
SELECT
  a.id,
  a.language_model,
  LENGTH(p.raw) AS raw_length,
  a.response_tokens,
  a.raw_request_payload,
  a.raw_response_payload,
  a.topic_id,
  a.post_id
FROM ai_api_audit_logs a
LEFT JOIN posts p ON p.id = a.post_id AND p.deleted_at IS NULL
LEFT JOIN topics t ON t.id = a.topic_id AND t.deleted_at IS NULL
WHERE a.created_at > CURRENT_DATE - INTERVAL '1 day'
AND p.deleted_at IS NULL
AND t.deleted_at IS NULL
AND p.user_deleted = false
AND a.feature_name = 'translation'
AND LENGTH(p.raw) < 1000       -- short source post...
AND a.response_tokens > 10000  -- ...but a suspiciously large response
ORDER BY a.created_at DESC
LIMIT 100

The query shows the number of response tokens used relative to each post’s raw length. Ideally the two should be comparable, with no more than about 1.5x as many response tokens as characters in the post. The AiApiAuditLog records will help with determining what is going on.
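
To look into your original question about input tokens with no output, you can also aggregate the audit log per day. This is a rough sketch that assumes ai_api_audit_logs has a request_tokens column alongside response_tokens; verify the column name in the Data Explorer schema browser before running it:

-- Daily token totals for the translation feature
-- NOTE: request_tokens is an assumption; check the schema on your install
SELECT
  DATE_TRUNC('day', created_at)::date AS day,
  COUNT(*) AS calls,
  SUM(request_tokens) AS input_tokens,
  SUM(response_tokens) AS output_tokens
FROM ai_api_audit_logs
WHERE feature_name = 'translation'
AND created_at > CURRENT_DATE - INTERVAL '7 days'
GROUP BY 1
ORDER BY 1 DESC

If output_tokens really drops to near zero while input_tokens stays high, looking at the raw_response_payload of a few recent rows should show why (for example, error responses).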

Additionally, please share:

  • What model are you using?
  • What’s your backfill hourly rate? I suggest keeping it at a low value, like 50 for starters.
  • How many languages are you supporting? Does your selected model support them?

I have configured GPT-4.1 Nano as the translation model. The hourly backfill rate was previously set at 1,000, but today I adjusted it to 100. I have enabled support for both Japanese and English, and this model indeed supports these languages.

Yeah, 1000 is probably not a good idea and I should add a site setting limit here.

I am not sure how well the OpenAI API handles being hit about 3,000 times per hour. For your setup, each post involves three calls: (1) locale detection, (2) translation to Japanese, (3) translation to English, so a backfill rate of 1,000 posts per hour works out to roughly 3,000 API requests per hour. If you check /logs I suspect you’ll see your site hitting rate limits.
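
As a quick sanity check on the request rate, a sketch like this against the same ai_api_audit_logs table should show how many translation calls your site is actually making per hour:

-- Translation API calls per hour over the last two days
SELECT
  DATE_TRUNC('hour', created_at) AS hour,
  COUNT(*) AS translation_calls
FROM ai_api_audit_logs
WHERE feature_name = 'translation'
AND created_at > CURRENT_DATE - INTERVAL '2 days'
GROUP BY 1
ORDER BY 1 DESC

At a backfill rate of 1,000 posts per hour and roughly three calls per post, you’d expect on the order of 3,000 rows per hour here.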

I suggest lowering it further to 50 and seeing how it goes.

We’ll be implementing a way to view translation progress of the entire site as well in the near future.
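
In the meantime, if you want a rough idea of how far the backfill has gotten, something like the sketch below may work. It assumes your Discourse version stores AI translations in a post_localizations table keyed by post_id and locale; that table name is an assumption, so check the Data Explorer schema browser before running it:

-- Rough translation progress check
-- NOTE: post_localizations is an assumption; verify the table name on your install
SELECT
  COUNT(DISTINCT p.id) AS total_posts,
  COUNT(DISTINCT pl.post_id) AS posts_with_a_translation
FROM posts p
LEFT JOIN post_localizations pl ON pl.post_id = p.id
WHERE p.deleted_at IS NULL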

Also, when I run the query you provided, it returns no results. Does it perhaps need some customization or modification?

Hmm, this query should work. Do you have the discourse-data-explorer plugin installed?

Very well, I shall give it a try first. Thank you.

I will install it later and then give it another try. Rebuilding the forum right now is not feasible because users are still actively using it.