Customizing AI summarizer to use non-English languages

Hello, I run `google/gemma-3-4b` locally with the latest Discourse. The model handles some languages well. When I test it through the API or LM Studio, it summarizes in the language I ask for.

At the moment, Discourse always summarizes in English. The steps below describe how to hardcode a non-English summarization language.


Important! Your changes will be lost during the next rebuild.

The hardcoded prompts live in the two files below. The database values from the `ai_personas` table are not used (as of July 2025). If you are experimenting in a non-production environment, you can hardcode your native language:

  1. SSH to your server.

  2. Copy the file `summarize.rb` from the container to the host filesystem:

    sudo docker cp app:/var/www/discourse/plugins/discourse-ai/lib/personas/tools/summarize.rb ./summarize.rb
    
  3. Now edit the file and replace the English system and user prompts with versions in the desired language:

    Original:

           system_prompt = <<~TEXT
           You are a summarization bot.
           You effectively summarise any text.
           You condense it into a shorter version.
           You understand and generate Discourse forum markdown.
           Try generating links as well the format is #{topic.url}/POST_NUMBER. eg: [ref](#{topic.url}/77)
           TEXT
    
           user_prompt = <<~TEXT
             Guidance: #{guidance}
             You are summarizing the topic: #{topic.title}
             Summarize the following in 400 words:
    
             #{text}
           TEXT
    

    Result, for example:

           system_prompt = <<~TEXT
           Вы — бот, выполняющий суммаризацию текста.
           Вы умеете эффективно сокращать текст до ключевых мыслей.
           Вы понимаете и умеете генерировать разметку Markdown в Discourse.
           При необходимости добавляйте ссылки в формате: #{topic.url}/POST_NUMBER, например: [ссылка](#{topic.url}/77)
           TEXT
    
           user_prompt = <<~TEXT
             Руководство: #{guidance}
             Вы суммаризуете топик: #{topic.title}
             Пожалуйста, предоставь ответ на русском языке.
             В ответе используй 400 слов:
    
             #{text}
           TEXT
    
  4. Next, do the same for the second file:

    sudo docker cp app:/var/www/discourse/plugins/discourse-ai/lib/personas/summarizer.rb ./summarizer.rb
    

    Edit:

    Note: you can override the language of the original text by replacing the "Maintain the original language" rule with one like this:

    - Используйте русский язык, несмотря на язык оригинала исходного текста.
    
    Original:

         <<~PROMPT.strip
           You are an advanced summarization bot that generates concise, coherent summaries of provided text.
           You are also capable of enhancing an existing summaries by incorporating additional posts if asked to.
    
           - Only include the summary, without any additional commentary.
           - You understand and generate Discourse forum Markdown; including links, _italics_, **bold**.
           - Maintain the original language of the text being summarized.
           - Aim for summaries to be 400 words or less.
           - Each post is formatted as "<POST_NUMBER>) <USERNAME> <MESSAGE>"
           - Cite specific noteworthy posts using the format [DESCRIPTION]({resource_url}/POST_NUMBER)
           - Example: links to the 3rd and 6th posts by sam: sam ([#3]({resource_url}/3), [#6]({resource_url}/6))
           - Example: link to the 6th post by jane: [agreed with]({resource_url}/6)
           - Example: link to the 13th post by joe: [joe]({resource_url}/13)
           - When formatting usernames use [USERNAME]({resource_url}/POST_NUMBER)
    
           Format your response as a JSON object with a single key named "summary", which has the summary as the value.
           Your output should be in the following format:
             <output>
               {"summary": "xx"}
             </output>
    
           Where "xx" is replaced by the summary.
         PROMPT
       end
    
    ...
           [
             "Here are the posts inside <input></input> XML tags:\n\n<input>1) user1 said: I love Mondays 2) user2 said: I hate Mondays</input>\n\nGenerate a concise, coherent summary of the text above maintaining the original language.",
             {
               summary:
                 "Two users are sharing their feelings toward Mondays. [user1]({resource_url}/1) hates them, while [user2]({resource_url}/2) loves them.",
             }.to_json,
           ],
    

    Result:

         <<~PROMPT.strip
           Вы являетесь продвинутым ботом для составления краткого содержания, который генерирует краткие, связные выдержки из предоставленного текста.
           Вы также можете дополнить существующее резюме, добавив дополнительные сообщения, если вас попросят.
    
           - Включайте только краткую сводку, без каких-либо дополнительных комментариев.
           - Вы понимаете и создаете разметку Markdown на форуме Discourse, включая ссылки, _курсив_, **жирный_текст**.
           - Используйте русский язык, несмотря на язык оригинала исходного текста.
           - Старайтесь, чтобы объем резюме не превышал 400 слов.
           - Каждая запись оформляется как "<POST_NUMBER>) <USERNAME> <MESSAGE>"
           - Цитируйте конкретные заслуживающие внимания публикации, используя формат [DESCRIPTION]({resource_url}/POST_NUMBER)
           - Пример: ссылки на 3-й и 6-й посты пользователя sam: sam ([#3]({resource_url}/3), [#6]({resource_url}/6))
           - Пример: ссылка на 6-е сообщение пользователя jane: [согласовано с]({resource_url}/6)
           - Пример: ссылка на 13-е сообщение Джо: [Джо]({resource_url}/13)
           - При форматировании имен пользователей используйте [USERNAME]({resource_url}/POST_NUMBER)
    
           Отформатируйте свой ответ в виде объекта JSON с помощью единственного ключа с именем "summary", который имеет значение "summary".
           Ваши выходные данные должны быть в следующем формате:
             <output>
               {"summary": "xx"}
             </output>
    
           Где "xx" заменяется на текст краткой сводки.
         PROMPT
       end
    
       def response_format
         [{ "key" => "summary", "type" => "string" }]
       end
    
       def examples
         [
           [
             "Вот записи внутри XML-тегов <input></input>:\n\n<input>1) user1 сказал: Я люблю понедельники 2) user2 сказал: А я ненавижу понедельники</input>\n\nСформулируйте краткое, связное изложение текста выше, сохранив язык оригинала.",
             {
               summary:
                 "Два пользователя делятся своими чувствами к понедельникам. [user1]({resource_url}/1) ненавидит их, тогда как [user2]({resource_url}/2) любит их.",
             }.to_json,
           ],
    
  5. Copy the modified files back into the container:

    sudo docker cp summarize.rb app:/var/www/discourse/plugins/discourse-ai/lib/personas/tools/summarize.rb
    sudo docker cp summarizer.rb app:/var/www/discourse/plugins/discourse-ai/lib/personas/summarizer.rb
    
  6. Then commit and restart the container:

    sudo docker commit app
    sudo /var/discourse/launcher restart app
    
  7. Check the result (on new topics).
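
When translating the prompts in steps 3 and 4, it is easy to translate a placeholder by accident, which would break the generated links or the JSON output. Below is a minimal sketch of a check you could run on the host before copying the files back. The helper `missing_tokens` and the list `SUMMARIZER_TOKENS` are my own hypothetical names, not part of Discourse, and the token list is only what I read from the prompt quoted above.

```ruby
# Hypothetical helper, not part of Discourse: returns the tokens from
# `required` that no longer appear verbatim in `prompt`.
def missing_tokens(prompt, required)
  required.reject { |token| prompt.include?(token) }
end

# Assumption: these are the literal tokens the summarizer.rb prompt
# quoted in step 4 relies on.
SUMMARIZER_TOKENS = [
  "{resource_url}",
  "POST_NUMBER",
  "<output>",
  "</output>",
  '{"summary": "xx"}',
].freeze

# A short excerpt of a translated prompt with all placeholders intact.
translated = <<~PROMPT
  - Цитируйте публикации, используя формат [DESCRIPTION]({resource_url}/POST_NUMBER)
    <output>
      {"summary": "xx"}
    </output>
PROMPT

# The same excerpt with one placeholder translated by mistake.
broken = translated.sub("{resource_url}", "{URL_ресурса}")

p missing_tokens(translated, SUMMARIZER_TOKENS) # => []
p missing_tokens(broken, SUMMARIZER_TOKENS)     # => ["{resource_url}"]
```

In practice you would pass `File.read("summarizer.rb")` instead of the excerpt. An empty list only means the tokens are still present; it does not prove the model will follow the translated instructions.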

There is no need to do all this. You can now change the Persona that does the summarization in the admin settings.

Create a new Persona based on the settings of the pre-existing one, change the system prompt as you want, and set the summarization feature to use it at /admin/plugins/discourse-ai/ai-features/1/edit.
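
As a rough illustration of the prompt change for such a Persona, here is a sketch that swaps only the language rule of the stock prompt quoted earlier in this topic and leaves everything else alone. `force_language` and `STOCK_RULE` are hypothetical names of mine, and the wording of the replacement rule is just one guess at something a model might follow; paste the resulting text into the Persona's system prompt field by hand.

```ruby
# Rule copied from the stock summarizer prompt quoted earlier in this topic.
STOCK_RULE = "- Maintain the original language of the text being summarized."

# Hypothetical helper: replaces the stock language rule with a fixed-language
# rule. Raises if the rule is not found, so a changed stock prompt is noticed.
def force_language(prompt, language)
  raise ArgumentError, "stock language rule not found" unless prompt.include?(STOCK_RULE)

  # Block form of sub avoids backslash handling in the replacement text.
  prompt.sub(STOCK_RULE) do
    "- Always write the summary in #{language}, whatever the language of the posts."
  end
end

# Excerpt of the stock prompt, enough to show the effect.
stock_excerpt = <<~PROMPT
  - Only include the summary, without any additional commentary.
  - Maintain the original language of the text being summarized.
  - Aim for summaries to be 400 words or less.
PROMPT

puts force_language(stock_excerpt, "Russian")
```

Changing one rule keeps the link and JSON format instructions identical to the stock Persona, which makes it easier to tell whether a remaining problem comes from the prompt or from the model.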

Well… The latest words about language support that I found were in this topic. Thanks for the reply.

My first attempt to create a proper summarization bot as a clone of the existing one has failed. It still produces English. I am probably doing something wrong.

I am not sure how well you will do with this model. It is not that powerful.
