Discourse AI - Sentiment Analysis

:bookmark: This topic covers the configuration of the Sentiment feature of the Discourse AI plugin.

:person_raising_hand: Required user level: Moderator

Sentiment keeps tabs on your community by analyzing posts and providing sentiment and emotion scores, giving you an overall sense of your community's state for any period of time. These insights can help you determine what kind of users are posting in your community and how they interact with one another.

Features

  • Overall sentiment: compares the number of posts classified as positive or negative.
  • Bar graph with a toggle to show numeric values for positive, negative, and overall scores.
  • Emotions: the number of topics and posts classified by multiple emotions, grouped by time frame:
    • Today
    • Yesterday
    • Last 7 days
    • Last 30 days
  • Reports for any period of time, accessible via settings:
    • Yearly
    • Quarterly
    • Monthly
    • Weekly
    • Custom range
  • Available to all staff (admins and moderators)

Enabling Sentiment

Configuration

Sentiment is enabled by default for hosted customers. For the manual steps, see below.

  1. Go to Admin → Plugins, find or search for discourse-ai, and make sure it is enabled.
  2. Enable the ai_sentiment_enabled setting to turn on sentiment analysis.
  3. Go to /admin/dashboard/sentiment to view the corresponding reports.

:information_source: Once enabled, Sentiment automatically classifies all new posts and backfills posts from the last 60 days via a scheduled job that runs every 5 minutes. To backfill posts older than 60 days, increase the ai_sentiment_backfill_post_max_age_days site setting.
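As a rough illustration of what the backfill window means, the sketch below checks whether a post's creation date falls inside a window of a given number of days. The helper name within_backfill_window? and the inclusive cutoff are assumptions made for this example; this is not the plugin's code.

```ruby
require "date"

# Hypothetical helper, not plugin code: decides whether a post falls inside
# the backfill window defined by a max-age value given in days.
# The cutoff day itself is treated as inside the window (an assumption).
def within_backfill_window?(post_created_on, max_age_days, today: Date.today)
  cutoff = today - max_age_days
  post_created_on >= cutoff
end

today = Date.new(2025, 6, 30)

# A post from 1 June is inside the default 60-day window.
puts within_backfill_window?(Date.new(2025, 6, 1), 60, today: today)

# A post from 15 January is outside 60 days but inside a 365-day window.
puts within_backfill_window?(Date.new(2025, 1, 15), 60, today: today)
puts within_backfill_window?(Date.new(2025, 1, 15), 365, today: today)
```

Raising the max-age value widens the window, which is why older posts only get picked up after the setting is increased.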

:discourse2: Hosted by us?

Contact us at team@discourse.org if you need help adjusting the backfill settings.

:mechanic: Self-hosted?

Increase the value of ai_sentiment_backfill_post_max_age_days in your site settings to cover the desired time range. The scheduled backfill job will then process the older posts automatically. For details on setting up the required model endpoints, see Self-Hosting Sentiment and Emotion for DiscourseAI.

Technical FAQ

How is topic/post data processed? How are scores assigned?

  • Sentiment works at per-post granularity. For each post we determine the sentiment, and we can then slice that data in different ways (by tag, category, time, etc.). It compares the number of posts classified as positive or negative. A post is counted as positive or negative when the corresponding score exceeds a fixed threshold of 0.6 (currently not configurable).
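The thresholding described above can be sketched as follows. Only the 0.6 threshold comes from the text; the sentiment_bucket helper, the shape of the scores hash, and the sample data are assumptions for illustration, not the plugin's implementation.

```ruby
# Fixed threshold taken from the text above; everything else is illustrative.
SENTIMENT_THRESHOLD = 0.6

# Hypothetical helper: buckets one post's scores into :positive, :negative,
# or :neutral. `scores` is assumed to look like { positive: 0.8, negative: 0.1 }.
def sentiment_bucket(scores)
  return :positive if scores.fetch(:positive, 0.0) > SENTIMENT_THRESHOLD
  return :negative if scores.fetch(:negative, 0.0) > SENTIMENT_THRESHOLD
  :neutral
end

# Made-up sample posts, one per bucket.
posts = [
  { id: 1, scores: { positive: 0.91, negative: 0.03 } },
  { id: 2, scores: { positive: 0.10, negative: 0.72 } },
  { id: 3, scores: { positive: 0.45, negative: 0.40 } },
]

# Count posts per bucket; neutral posts would simply not be charted.
counts = posts.map { |post| sentiment_bucket(post[:scores]) }.tally
puts counts.inspect
```

With these sample scores, one post lands in each bucket. A post whose scores both stay at or below 0.6 counts as neither positive nor negative.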

Are there plans to add support for other languages?

  • In the future, yes! This will be done both by adding simple multilingual machine learning (ML) models and by using multilingual large language models (LLMs) to classify the data instead of dedicated models.

Which models are used to power Sentiment?

Caveats

  • Posts classified as neutral (neither positive nor negative) are not shown.
  • Personal messages (PMs) are excluded from the calculations.

A post was merged into an existing topic: Problems with Sentiment Backfill


The OP has been updated with a new video showcasing the updated features of Sentiment including a ton more emotions and understanding which topics/posts are associated with each emotion

I configured the sentiment model config with a model_name, endpoint, and api_key copied from the LLM settings, where it passes the test, but I get the error below in /logs.

(But maybe I don’t understand correctly, because why doesn’t sentiment use one of the configured LLMs?)

Using claude-3-5-sonnet.


{"type":"error","error":{"type":"invalid_request_error","message":"anthropic-version: header is required"}} (Net::HTTPBadResponse)
/var/www/discourse/plugins/discourse-ai/lib/inference/hugging_face_text_embeddings.rb:71:in `classify'
/var/www/discourse/plugins/discourse-ai/lib/sentiment/post_classification.rb:142:in `request_with'
/var/www/discourse/plugins/discourse-ai/lib/sentiment/post_classification.rb:78:in `block (4 levels) in bulk_classify!'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/promises.rb:1593:in `evaluate_to'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/promises.rb:1776:in `block in on_resolvable'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:359:in `run_task'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:350:in `block (3 levels) in create_worker'
<internal:kernel>:187:in `loop'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:341:in `block (2 levels) in create_worker'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:340:in `catch'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:340:in `block in create_worker'
Status: 400


The sentiment module doesn’t use general LLMs, but models specifically fine-tuned for sentiment classification. If you want to run those models on your own, that is documented at Self-Hosting Sentiment and Emotion for DiscourseAI.


@Falco I just noticed that sentiment has stopped running as of Jan 2025. My guess is that this is related to the new setting ai_sentiment_model, which, as the link above explains, is for running your own dedicated sentiment model/image. I also noticed that after updating Discourse, ai_sentiment_model_configs is now blank (should it be blank?).

When I try to run a backfill rake it gives me an error:

rake ai:sentiment:backfill
rake aborted!
ActiveRecord::StatementInvalid: PG::SyntaxError: ERROR:  syntax error at or near ")" (ActiveRecord::StatementInvalid)
LINE 1: ...e_upload_id", "posts"."outbound_message_id" FROM () as posts...
                                                             ^
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/patches/db/pg.rb:69:in `exec_params'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/patches/db/pg.rb:69:in `exec_params'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:894:in `block (2 levels) in exec_no_cache'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:1004:in `block in with_raw_connection'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.2.1/lib/active_support/concurrency/null_lock.rb:9:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:976:in `with_raw_connection'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:893:in `block in exec_no_cache'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.2.1/lib/active_support/notifications/instrumenter.rb:58:in `instrument'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:1119:in `log'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:892:in `exec_no_cache'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:872:in `execute_and_clear'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:66:in `internal_exec_query'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/database_statements.rb:647:in `select'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/database_statements.rb:73:in `select_all'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/query_cache.rb:251:in `select_all'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/querying.rb:70:in `_query_by_sql'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1431:in `block (2 levels) in exec_main_query'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:415:in `with_connection'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_handling.rb:296:in `with_connection'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1430:in `block in exec_main_query'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/query_cache.rb:143:in `disable_query_cache'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/query_cache.rb:30:in `uncached'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/delegation.rb:78:in `block in uncached'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1355:in `_scoping'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:541:in `scoping'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/delegation.rb:78:in `uncached'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1450:in `skip_query_cache_if_necessary'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1414:in `exec_main_query'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1392:in `block in exec_queries'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/query_cache.rb:143:in `disable_query_cache'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/query_cache.rb:30:in `uncached'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/delegation.rb:120:in `public_send'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/delegation.rb:120:in `block in method_missing'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1355:in `_scoping'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:541:in `scoping'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/delegation.rb:120:in `method_missing'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1450:in `skip_query_cache_if_necessary'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1386:in `exec_queries'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1167:in `load'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:336:in `records'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/batches.rb:380:in `block in batch_on_unloaded_relation'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/batches.rb:378:in `batch_on_unloaded_relation'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/batches.rb:269:in `in_batches'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/batches.rb:157:in `find_in_batches'
/var/www/discourse/plugins/discourse-ai/lib/tasks/modules/sentiment/backfill.rake:7:in `block in <main>'
/usr/local/bin/bundle:25:in `load'
/usr/local/bin/bundle:25:in `<main>'

Caused by:
PG::SyntaxError: ERROR:  syntax error at or near ")" (PG::SyntaxError)
LINE 1: ...e_upload_id", "posts"."outbound_message_id" FROM () as posts...
                                                             ^
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/patches/db/pg.rb:69:in `exec_params'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/patches/db/pg.rb:69:in `exec_params'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:894:in `block (2 levels) in exec_no_cache'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:1004:in `block in with_raw_connection'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.2.1/lib/active_support/concurrency/null_lock.rb:9:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:976:in `with_raw_connection'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:893:in `block in exec_no_cache'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.2.1/lib/active_support/notifications/instrumenter.rb:58:in `instrument'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:1119:in `log'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:892:in `exec_no_cache'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:872:in `execute_and_clear'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:66:in `internal_exec_query'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/database_statements.rb:647:in `select'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/database_statements.rb:73:in `select_all'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/query_cache.rb:251:in `select_all'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/querying.rb:70:in `_query_by_sql'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1431:in `block (2 levels) in exec_main_query'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:415:in `with_connection'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_handling.rb:296:in `with_connection'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1430:in `block in exec_main_query'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/query_cache.rb:143:in `disable_query_cache'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/query_cache.rb:30:in `uncached'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/delegation.rb:78:in `block in uncached'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1355:in `_scoping'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:541:in `scoping'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/delegation.rb:78:in `uncached'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1450:in `skip_query_cache_if_necessary'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1414:in `exec_main_query'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1392:in `block in exec_queries'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/abstract/query_cache.rb:143:in `disable_query_cache'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/query_cache.rb:30:in `uncached'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/delegation.rb:120:in `public_send'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/delegation.rb:120:in `block in method_missing'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1355:in `_scoping'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:541:in `scoping'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/delegation.rb:120:in `method_missing'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1450:in `skip_query_cache_if_necessary'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1386:in `exec_queries'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:1167:in `load'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation.rb:336:in `records'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/batches.rb:380:in `block in batch_on_unloaded_relation'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/batches.rb:378:in `batch_on_unloaded_relation'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/batches.rb:269:in `in_batches'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/batches.rb:157:in `find_in_batches'
/var/www/discourse/plugins/discourse-ai/lib/tasks/modules/sentiment/backfill.rake:7:in `block in <main>'
/usr/local/bin/bundle:25:in `load'
/usr/local/bin/bundle:25:in `<main>'
Tasks: TOP => ai:sentiment:backfill
(See full trace by running task with --trace)

My question is: this was working fine until Nov/Dec 2024. How was it working earlier without a dedicated sentiment model/server? Is there a way to get it working again using a generic public service or a built-in service, without having to run a dedicated server for it? I really like that Discourse is a self-contained setup, which makes simple deployments easy. Custom deployments increase complexity and maintenance costs, which is troublesome for small deployments. Is there a way to get back to the pre-2025 setup where sentiment worked out of the box?

The plugin defaulted to a server located in DigitalOcean which I put together to make testing easier.

I have since changed the plugin defaults to a clean slate, and people who want AI classification need to run servers following the documentation here on Meta.

Indeed, but we were paying that cost for testing purposes. It’s not sustainable to offer that for every self hoster.

Worth mentioning, that we do offer this classification service on GPU accelerated servers as part of our hosting service.


Discourse AI Sentiment Analysis Issues: Hugging Face Model Format & Azure Endpoint Failure


Hi Discourse Community and Developers,

I’m encountering significant issues when attempting to configure and use the Sentiment Analysis feature within the Discourse AI plugin on my forum. It appears there are two distinct problems preventing it from working correctly.


Issue 1: Hugging Face Model Response Format Mismatch

I’ve configured the cardiffnlp/twitter-roberta-base-sentiment model from Hugging Face for sentiment analysis. While my API key is valid and I can successfully call the API from my Discourse instance using curl, the Discourse AI plugin seems to be parsing the response incorrectly due to a change in the Hugging Face model’s output format.

My curl command (confirming valid API key and new format):

curl -X POST https://api-inference.huggingface.co/models/cardiffnlp/twitter-roberta-base-sentiment \
  -H "Authorization: Bearer hf_xxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d "{\"inputs\": \"I love Discourse!\"}"

Output from curl (showing the new nested array format):

[[{"label":"LABEL_2","score":0.9891520738601685},{"label":"LABEL_1","score":0.009014752693474293},{"label":"LABEL_0","score":0.0018332178005948663}]]

Problem: The twitter-roberta-base-sentiment model used to return a single array of label-score hashes: [{"label": "LABEL_2", "score": 0.98}, ...]. However, it now returns a nested array: [[{"label": "LABEL_2", "score": 0.98}, ...]].

The Discourse AI plugin’s hardcoded parsing logic (specifically, classification["label"][/\d+/].to_i as indicated by the backtrace) does not account for this outer array layer. This leads to a TypeError, because the code ends up indexing an Array with a hash key where it expects a label-score hash.
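One way to tolerate both response shapes is to flatten the parsed response before reading the labels. The sketch below is illustrative only and is not the plugin's code: the helper names normalize_classifications and label_index are hypothetical, and the sample payloads are shortened versions of the output shown above.

```ruby
require "json"

# Accepts either the flat shape [{...}, ...] or the nested shape [[{...}, ...]]
# and returns a flat array of { "label" => ..., "score" => ... } hashes.
# Array#flatten only unwraps nested arrays; the hashes are left untouched.
def normalize_classifications(parsed)
  parsed.flatten
end

# Extracts the numeric part of labels such as "LABEL_2" (assumed label scheme).
def label_index(classification)
  classification["label"][/\d+/].to_i
end

nested = JSON.parse('[[{"label":"LABEL_2","score":0.98},{"label":"LABEL_1","score":0.01}]]')
flat   = JSON.parse('[{"label":"LABEL_2","score":0.98},{"label":"LABEL_1","score":0.01}]')

# Both shapes yield the same top-scoring label after normalization.
[nested, flat].each do |response|
  top = normalize_classifications(response).max_by { |c| c["score"] }
  puts "label index #{label_index(top)}, score #{top["score"]}"
end
```

Indexing the un-flattened nested response with a hash key is what raises the "no implicit conversion ... into Integer" TypeError, since the outer element is an Array rather than a Hash.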

Error Message (from Job Exception):

no implicit conversion of Symbol into Integer (TypeError)
/var/www/discourse/plugins/discourse-ai/lib/sentiment/post_classification.rb:163:in `block in transform_result'
/var/www/discourse/plugins/discourse-ai/lib/sentiment/post_classification.rb:163:in `each'
/var/www/discourse/plugin…

Full Backtrace for Hugging Face Issue:

/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/promises.rb:1268:in `raise'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/promises.rb:1268:in `wait_until_resolved!'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/promises.rb:998:in `value!'
/var/www/discourse/plugins/discourse-ai/lib/sentiment/post_classification.rb:93:in `bulk_classify!'
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/sentiment_backfill.rb:27:in `execute'
/var/www/discourse/app/jobs/base.rb:316:in `block (2 levels) in perform'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
/var/www/discourse/app/jobs/base.rb:303:in `block in perform'
/var/www/discourse/app/jobs/base.rb:299:in `each'
/var/www/discourse/app/jobs/base.rb:299:in `perform'
/var/www/discourse/app/jobs/base.rb:379:in `perform'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_scheduler-0.18.0/lib/mini_scheduler/manager.rb:137:in `process_queue'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_scheduler-0.18.0/lib/mini_scheduler/manager.rb:77:in `worker_loop'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_scheduler-0.18.0/lib/mini_scheduler/manager.rb:63:in `block (2 levels) in ensure_worker_threads'
<internal:kernel>:187:in `loop'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:341:in `block (2 levels) in create_worker'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:340:in `catch'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:340:in `block in create_worker'
no implicit conversion of Symbol into Integer (TypeError)
/var/www/discourse/plugins/discourse-ai/lib/sentiment/post_classification.rb:163:in `block in transform_result'
/var/www/discourse/plugins/discourse-ai/lib/sentiment/post_classification.rb:163:in `each'
/var/www/discourse/plugin...

Issue 2: Microsoft Azure Model Configuration Leads to Hugging Face Error

When I attempted to switch to the Microsoft Text Analytics model within the Discourse AI settings, I encountered a 404 Resource not found error, and surprisingly, the backtrace still points to hugging_face_text_embeddings.rb.

Error Message (from Job Exception):

Job exception: 416 errors
{"error":{"code":"404","message": "Resource not found"}} (Net::HTTPBadResponse)

Relevant Backtrace Snippet (pointing to Hugging Face despite Microsoft model selected):

/var/www/discourse/plugins/discourse-ai/lib/inference/hugging_face_text_embeddings.rb:76:in `do_request!'
/var/www/discourse/plugins/discourse-ai/lib/inference/hugging_face_text_embeddings.rb:51:in `classify_by_sentiment!'
/var/www/discourse/plugins/discourse-ai/lib/sentiment/post_classification.rb:156:in `request_with'

Observation: This indicates that even when I select and configure the Microsoft model’s endpoint and API key, the Discourse AI plugin appears to be hardcoded or incorrectly routing the sentiment analysis requests through Hugging Face-specific logic or endpoints. This prevents the Microsoft model from being used at all.


Configuration Screenshots:

I’ve attached a screenshot of my Discourse AI settings to show the configuration:

  • Detailed configuration for AI sentiment models (showing both Hugging Face and Microsoft models) - I tested with just Hugging Face or just Microsoft model configs with the same result

These issues make the sentiment analysis feature effectively unusable. It seems the plugin requires an update to handle the new Hugging Face response format and to correctly route requests when different sentiment providers are configured.

Any assistance or guidance on these issues would be greatly appreciated.

Thank you!


I am curious if sentiment reporting is functioning for others, or if I have misconfigured something. I would like to know what else I need to check or configure to enable sentiment reporting, as I am still experiencing the same issue.


We are trying to use this feature with Azure AI Language (from our self-hosted Discourse instance) - as we are already using our Azure subscription to integrate GPT-4.5 with Discourse (for summarization and chat-bot functionality):

…but we are getting no data in the sentiment dashboard, and can see these errors in the logs:

Discourse AI: Errors during bulk classification: Failed to classify 208 posts (example ids: 2256, 909, 2270, 2260, 2797) : JSON::ParserError : An empty string is not a valid JSON string.

The backtrace shows that Discourse might be trying to use HuggingFace - are these the only models supported at the moment?

Thanks,

N


So, is the only way to use this feature to set up your own instances of the models (which need resource-heavy GPU instances that would be costly)? This feature looks very useful, but it seems like it would cost me more to set up than my actual Discourse hosting.

Yes, the supported models are the ones listed in the OP.

We will eventually add support for classifying using LLMs for people for whom cost isn’t an issue.

Well, the whole feature is built around classifying posts using ML models, so yes, you need somewhere to run those.

And since Discourse can run on the very cheapest VPS out there, running ML models is indeed more expensive. If you want the feature in the cheapest way possible, it is doable to run it on a server with just a handful of CPU cores, as long as you have enough RAM to load the models.


Sorry if this question has already been asked before, but I was unable to find a reference to where exactly the threshold score can be configured :sweat_smile:


Unfortunately, the threshold score is not something that is configurable by users or admins. It’s a specific value set in the code base.
