Content Localization - Manual and Automatic with Discourse AI

In this topic, we walk you through the Content Localization features and how to enable them. The features fall into two parts: what is available by default in Discourse, and automatic translation with Discourse AI. For quick access to the relevant sections, use the topic headings.

Localizing Your Community’s Content

An up-to-date version of Discourse (3.5.0.beta7-dev) gives you access to several localization features, which you can configure at:

  • <your-site-url>/admin/site_settings/category/content_localization
New Content Localization in Site Settings 📸

Getting information on your users

First, it helps to gather some information about your community. The following Data Explorer query gives you an idea of how many users have set their locale in /my/preferences/interface:

-- Number of users per explicitly chosen interface locale
SELECT locale, COUNT(*) AS count
FROM users
WHERE locale IS NOT NULL AND locale <> ''
GROUP BY locale
ORDER BY count DESC
Sample results from Data Explorer

Setting locales that your community supports

With the information above, you have a better idea of which locales your community should support.

In <your-site-url>/admin/site_settings/category/content_localization, you can select locales to support.

  • Content localization enabled - turns on the feature that replaces original written user content with localized content. Read on for auto and manual modes of localizing.
  • Content localization supported locales - the list of languages your site supports
  • Content localization anon language switcher - covered just below
List of locales in Site Settings 📸

Enabling the next setting, Content localization anon language switcher, also makes your community more accessible to non-logged-in users by showing them a switcher with the languages you chose in the list of supported locales:


Language switcher at the top right of the page

Viewing localized content


Localized welcome topic on meta.discourse.org

Viewers of localized content (all site visitors) can hover over the indicator next to the post’s date to see the post’s original language. This indicator only shows up if the post is not in their language.

If a user wishes to see only original content, they can use the toggle above the topic timeline to disable localizations for the whole site.

Automatic translations with Discourse AI :sparkles:

Discourse AI is the vitamin the localization feature thrives on, and it removes the need for manual translation.

As an admin, head to the new AI features section for Translation.

Discourse AI Features in Admin Settings 📸

Scroll down in /admin/plugins/discourse-ai/ai-features

To cover some important settings and recommendations:

  • AI translation backfill hourly rate - this setting is hidden in the UI and defaults to 0. :warning: Automatic translation will not begin if this value is 0. Assuming the rate is 50, your site will translate 50 posts, 50 topics, and 50 categories per hour, to the locales you have set in Content localization supported locales. Keep this to a low number when starting out.
  • AI translation backfill max age days - defaults to 5. This means topics and posts older than 5 days will not be translated. You may increase this to a large number to translate all topics and posts.
  • AI translation backfill limit to public content - defaults to true. This prevents PMs and content in private categories from being sent to the LLM. When set to false, group PMs and private categories are included in translations. PMs between individuals are not translated.
  • AI translation max post length - defaults to 10000. This is a safeguard and prevents posts above a certain length from being translated.
  • AI translation post raw translator persona (and other personas) - In more formal communities, admins may choose to create their own persona. This allows you to set a prompt that is more fine-tuned to the language or vocabulary you prefer.

You can refer to AI bot - Personas on how to configure suitable personas and fine-tune prompts for each function.
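
To get a feel for what a given hourly rate means in practice, here is a minimal Ruby sketch of the arithmetic. The helper names `backfill_hours` and `items_per_run` are hypothetical and not part of Discourse. The sketch assumes the hourly rate is reached every hour, and that it is split evenly across job runs five minutes apart (a reply further down this topic mentions the five-minute schedule). Whether the rate counts each target locale separately is not stated here, so the sketch simply counts items.

```ruby
# Hypothetical helper: rough number of hours needed to work through a backlog
# of one content type, assuming the hourly rate is always reached.
def backfill_hours(item_count, hourly_rate)
  raise ArgumentError, "hourly rate must be above 0" unless hourly_rate.positive?
  (item_count.to_f / hourly_rate).ceil
end

# Hypothetical helper: even share of the hourly rate per job run,
# assuming one run every five minutes (12 runs per hour).
def items_per_run(hourly_rate, run_interval_minutes = 5)
  runs_per_hour = 60 / run_interval_minutes
  hourly_rate.to_f / runs_per_hour
end

puts backfill_hours(10_000, 50)   # 200 hours for 10,000 posts at 50 per hour
puts items_per_run(50).round(2)   # 4.17 items per five-minute run
```

At a rate of 50, a backlog of 10,000 posts would take roughly 200 hours under these assumptions, which is why it is worth starting low and raising the rate once you are comfortable with the results and the cost.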

Translation Progress

You may find more information about how automatic translations are progressing in the Translation Progress chart on /admin/plugins/discourse-ai/ai-translations

This chart shows up when all of the following are true:

  • all translator personas have a valid LLM
  • discourse ai enabled :check_mark:
  • ai translation enabled :check_mark:
  • content localization supported locales is filled
  • ai translation backfill max age days is more than 0
  • ai translation backfill hourly rate is more than 0
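
If the chart does not appear, it can help to walk through the conditions above one by one. The following is a small illustrative Ruby sketch of that checklist. The method `missing_chart_conditions` and the hash keys are hypothetical; the hash is filled in by hand and nothing here reads from Discourse.

```ruby
# Hypothetical checklist mirroring the conditions listed above.
# `settings` is a plain Hash filled in by hand; it is not read from Discourse.
def missing_chart_conditions(settings)
  checks = {
    "all translator personas have a valid LLM" => settings[:personas_have_llm] == true,
    "discourse ai enabled" => settings[:discourse_ai_enabled] == true,
    "ai translation enabled" => settings[:ai_translation_enabled] == true,
    "content localization supported locales is filled" => !settings.fetch(:supported_locales, []).empty?,
    "ai translation backfill max age days is more than 0" => settings.fetch(:backfill_max_age_days, 0) > 0,
    "ai translation backfill hourly rate is more than 0" => settings.fetch(:backfill_hourly_rate, 0) > 0,
  }
  # Keep only the conditions that are not met and return their descriptions.
  checks.reject { |_description, ok| ok }.keys
end

# Example: everything is configured, but the hidden hourly rate is still at its default of 0.
example = {
  personas_have_llm: true,
  discourse_ai_enabled: true,
  ai_translation_enabled: true,
  supported_locales: ["en", "ja"],
  backfill_max_age_days: 5,
  backfill_hourly_rate: 0,
}
p missing_chart_conditions(example)
# ["ai translation backfill hourly rate is more than 0"]
```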

Manual localization

As localization is a core feature in Discourse, you can fill in and edit localizations manually in the event that automatic translations with Discourse AI are not available.

By default, admins and moderators are set up to edit localizations.

Localization allowed groups in Site Settings 📸


Admin Site Setting for Content Localization

Currently, localization covers post content, topic titles, and category names and descriptions. Tags are not supported yet, but will be in the near future. The sections below show how each of these works.

Category localization

Localized categories are visible in the following areas, with both category name and description localized:

Places where categories are localized 📸
  1. Homepage, sidebar, and category dropdown
  2. Categories page
  3. A specific category with subcategories

As an admin, you should be able to access category settings as usual, and find the new “Localizations” nav item on the left.

Editing category localizations in Category Settings 📸

Topic and Post localization

From the screenshots above in Category localization, you may have noticed topic titles and excerpts being localized.

There are some prerequisite settings:

  • Ensure your user is in content localization allowed groups
  • Add addTranslation to the post menu site setting. This allows the :globe_with_meridians: to show up in the post menu for users in content localization allowed groups
2 Site Settings 📸



Once again, the list of localizable languages is in the Content localization supported locales setting mentioned above.

Editing a localized post

If a user is viewing a localized post and wants to edit it, a dialog asks which version they would prefer to edit:

The appropriate composer opens once they choose.

Deleting a post’s translation for a certain locale

If you followed the instructions above for the post menu setting and you are in content_localization_allowed_groups, you should be able to do the following:

FAQ

I’ve set things up, but automatic translation is still not working for me
Confirm that the following are set up:

  • Content localization supported locales has at least one language
  • Content localization enabled is :check_mark:
  • AI translation enabled is :check_mark:
  • AI translation backfill max age days is not 0
  • AI translation backfill hourly rate is more than 12. This is a hidden site setting which requires console access.
  • You must have a working LLM set for each translation persona
  • You must have a working LLM set for each translation persona

If all else fails, you can enable SiteSetting.ai_translation_verbose_logs.
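
Because the hourly rate is hidden from the UI, it has to be changed from the Rails console. Below is a minimal sketch of that change, written so it also runs outside Discourse. The helper `set_backfill_rate` is hypothetical; the setting name `ai_translation_backfill_hourly_rate` is the one used in this topic, and the value 50 is only the example rate from earlier. How you reach the console depends on your install.

```ruby
# Hypothetical helper: set the hidden backfill rate on a settings object.
# In the Discourse Rails console, pass SiteSetting as `settings`.
def set_backfill_rate(settings, rate)
  raise ArgumentError, "rate must be above 0" unless rate.positive?
  settings.ai_translation_backfill_hourly_rate = rate
end

# Inside the Discourse Rails console SiteSetting is defined, so this applies the change.
# Anywhere else the line is skipped and nothing happens.
set_backfill_rate(SiteSetting, 50) if defined?(SiteSetting)
```

In the console itself, entering `SiteSetting.ai_translation_backfill_hourly_rate` on its own prints the current value, which is a quick way to confirm the change took effect.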

Is every post getting translated?
If AI translation backfill limit to public content is :check_mark:, all posts in public categories will be translated, except posts by bots (user id < 0).

Are the automatic translations saved, or are they sent to the LLM each time someone views a topic?
The translations are saved. Each post is sent only once per language, and the translations are reused.

If my forum supports English and Japanese (via Content localization supported locales), and someone writes in Spanish, will their post be translated?
Yes. All topics and posts will be translated to English and Japanese, regardless of the written language.

If the original post is edited, is it re-translated?
Yes, at most 2 times per day. When a post is edited, it is queued for re-translation 5 minutes later (or after the SiteSetting.editing_grace_period) to account for ninja edits.

Will translations be deleted if I change the Persona or LLM?
No, translations will typically persist across settings changes unless explicitly deleted using the post menu item or the translation composer.


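Is there a recommended way to do this in bulk for existing categories? Worst case, maybe via the API.

Hmm, good question. I'll make sure the API docs are updated to include the category update endpoint. :memo:

Will per-language moderators be supported? I'm thinking of Meta: I might volunteer there to check posts in a specific language and update them manually. Documentation in particular may need some human polishing. But you said only moderators can do this, and I will probably never be one.

Hmm, good suggestion. I think it can be done, but we need to work out the details of how it would be set up.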

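How do I access it? Could you provide a command?

Are there any Sidekiq jobs associated with this? Can it be triggered manually?

To add to Moin's post above: once you are in the console, just enter SiteSetting.ai_translation_backfill_hourly_rate. The job runs every five minutes and is rate limited accordingly.

Localization is now available in the docs. Thanks @nat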

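Awesome, great work, team! I'm testing it and will share my thoughts and overall experience.

We'd like Esperanto to be on the list. Can it "simply" be added, or does it need to be built into discourse-languages first?

Wow, you're fast. I was just about to report it here. :laughing:

Yes, pretty much. We want a complete localization experience, where the controls (buttons, labels, etc.) are properly and sufficiently translated via Crowdin (see https://meta.discourse.org/c/dev/translations/27); 70% is good enough. Then we can support the language.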

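Does content localization work with documentation categories? It seems to me that even when I localize the index topic, the sidebar content is not translated.

I've also noticed some odd behavior. When I'm viewing a localized topic in its original language and refresh, it switches to the localized version. I have to manually switch back to the original again.

Oh, great! Yes, it doesn't work yet, but @nat will keep track of it!

I wonder whether this will push us to come up with a better abstraction/data model for the sidebar documentation links.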

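Yes, that's right. There are many places in Discourse that need explicit translation, so I'll note them down as I find them. Recently we also localized topic titles in notifications. Here is an example of a feature topic I created: https://meta.discourse.org/t/show-translated-user-bios/378908.

I'll create a new topic and @ you to make sure we cover everything in the sidebar.

Edit: @tvavrda has covered it here: https://meta.discourse.org/t/translate-sidebar-documentation-links/379540. Please take a look and see whether it makes sense.

What do you mean by "switch again"?

Next time it happens, would you mind sharing a screen recording (including the address bar)? :folded_hands:t2: If the content isn't suitable for public sharing, feel free to PM me. Also, were you logged in at the time? Technically, these things are tracked via cookies, so it's a bit puzzling to me.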

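Video sent.

Another observation: I can't see a diff of the translated content, right? That could be useful when the content has been updated. It's not very important, but I think it makes sense.

And another: the backlinks under a topic don't show the localized topic title.

One more question: what is the point of the category description in the localized category settings? The category description should come from the localized version of the "About" topic, shouldn't it? The localized version doesn't support Markdown, so I can't use the links I want.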

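Hmm... the old GitHub - discourse/discourse-docs-sidebar component actually respects localization :slight_smile: I've switched to that component for now.

Yes, this isn't supported at the moment either, and it will be a big undertaking.

When the post version changes, we'll have a specially colored indicator (similar to the post edit indicator next to it) showing that the translation may be outdated.

I'm also seeing untranslated content in pinned topic excerpts. The topic list I see is in the translated language, but the pinned topic's excerpt is shown in the original language.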

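We can create or fix translations manually, but can we manually trigger a translation run? Something like on-demand work.

What I have in mind is allowing translation of topics going back a year or more. But if that year is counted from the current date, the limit keeps moving toward content that has already been translated. The biggest need, though, is valuable old content, and I'd like to be able to reach it quickly without a slow bulk operation.

I'm wondering, does anyone have cost data from using the translation feature? Our site has been running for a while, and although I'd like to translate the whole site, cost is a real concern. So if anyone could share rough cost figures from their experience, e.g. $1 per 1,000 posts, it would greatly help us estimate the cost.

Is content localization done once and then stored, rather than on demand? If so, is there anything stopping me from spinning up Ollama on my desktop with an open-source LLM such as Llama 3 or Deepseek 3 and letting it run until it's done?

Edit: I guess this might help reduce the initial translation cost, but not for new posts, unless you decide to run a local LLM permanently.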