Settings for Hugging Face bge-large-en embeddings? -> RAG bots unresponsive!

StevePlex · May 26, 2024, 10:54pm

Please advise… what are best settings to allow bge-large-en embedding model to work as the default Discourse AI vector service?

I have a bge-large-en instance running in AWS and I know my Discourse AI is talking to it (see test below) but embedding is not generally working ( openAI embedding does work fine.)

PROBLEM SUMMARY: RAG bots unresponsive when embedding set to HF bge-large-en

here’s the AWS embedding model:

here’s the Discourse AI settings:

here’s a Discourse custom LLM ‘Run test’ just to check connectivity …

here’s the bge-large-en logs on the AWS side:

Many thanks !!

Here’s the error log …

Job exception: can't quote Array

hostname ai-qa-ubuntu-s-1vcpu-2gb-amd-sfo3-01-app
process_id 1165935
application_version f9192835a7e4d2067c3d1844f43f9e7b69de39e7
current_db default
current_hostname ai-qa.net
job Jobs::CreateAiReply
problem_db default
time 7:22 pm
opts post_id 618
--- --- --- ---
--- ---
bot_user_id -1208
persona_id 5
current_site_id default


Backtrace

/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/connection_adapters/abstract/quoting.rb:25:in `quote'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/connection_adapters/postgresql/quoting.rb:69:in `quote'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/connection_adapters/abstract/quoting.rb:51:in `quote_bound_value'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:193:in `block in quote_bound_value'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:193:in `map!'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:193:in `quote_bound_value'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:171:in `replace_bind_variable'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:180:in `block in replace_named_bind_variables'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:176:in `gsub'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:176:in `replace_named_bind_variables'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:128:in `sanitize_sql_array'
/var/www/discourse/lib/mini_sql_multisite_connection.rb:21:in `public_send'
/var/www/discourse/lib/mini_sql_multisite_connection.rb:21:in `encode'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/connection.rb:64:in `to_sql'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/postgres/connection.rb:202:in `run'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/active_record_postgres/connection.rb:38:in `block in run'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/active_record_postgres/connection.rb:34:in `block in with_lock'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.8.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `handle_interrupt'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.8.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `block in synchronize'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.8.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.8.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/active_record_postgres/connection.rb:34:in `with_lock'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/active_record_postgres/connection.rb:38:in `run'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/postgres/connection.rb:99:in `query'
/var/www/discourse/plugins/discourse-ai/lib/embeddings/vector_representations/base.rb:272:in `asymmetric_rag_fragment_similarity_search'
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/personas/persona.rb:286:in `rag_fragments_prompt'
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/personas/persona.rb:156:in `craft_prompt'
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/bot.rb:54:in `reply'
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/playground.rb:424:in `reply_to'
/var/www/discourse/plugins/discourse-ai/app/jobs/regular/create_ai_reply.rb:18:in `execute'
/var/www/discourse/app/jobs/base.rb:305:in `block (2 levels) in perform'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/rails_multisite-6.0.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/rails_multisite-6.0.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
/var/www/discourse/app/jobs/base.rb:292:in `block in perform'
/var/www/discourse/app/jobs/base.rb:288:in `each'
/var/www/discourse/app/jobs/base.rb:288:in `perform'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:202:in `execute_job'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:170:in `block (2 levels) in process'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:177:in `block in invoke'
/var/www/discourse/lib/sidekiq/pausable.rb:132:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:179:in `block in invoke'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:182:in `invoke'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:169:in `block in process'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:136:in `block (6 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_retry.rb:113:in `local'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:135:in `block (5 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq.rb:44:in `block in <module:Sidekiq>'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:131:in `block (4 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:263:in `stats'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:126:in `block (3 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_logger.rb:13:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:125:in `block (2 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_retry.rb:80:in `global'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:124:in `block in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_logger.rb:39:in `prepare'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:123:in `dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:168:in `process'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:78:in `process_one'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:68:in `run'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/component.rb:8:in `watchdog'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/component.rb:17:in `block in safe_thread'

sam · May 28, 2024, 5:07am

Thanks for raising, we will have a look!

Falco · May 28, 2024, 1:18pm

What is the output of running the following commands in a rails console?

strategy = DiscourseAi::Embeddings::Strategies::Truncation.new
vector_rep = DiscourseAi::Embeddings::VectorRepresentations::Base.current_representation(strategy)
vector_rep.vector_from("test")

Also, our API is designed to work against someone running GitHub - huggingface/text-embeddings-inference: A blazing fast inference solution for text embeddings models themselves per the documentation so it’s possible that it doesn’t work against the hosted version.

If you provide the backtrace we can look into making it work.

StevePlex · May 29, 2024, 2:05pm

@Falco

here’s what happened when I ran the test code (with bge-large-en running on AWS dedicated endpoint instance configured as embedding model)

root@studyqa-ubuntu-s-1vcpu-2gb-amd-sfo3-01-app:/var/www/discourse# rails c

[1] pry(main)> strategy = DiscourseAi::Embeddings::Strategies::Truncation.new

puts "Strategy initialized"

vector_rep = DiscourseAi::Embeddings::VectorRepresentations::Base.current_representation(strategy)

puts "Vector representation obtained"

vector = vector_rep.vector_from("test")

[1] pry(main)> strategy = DiscourseAi::Embeddings::Strategies::Truncation.new

puts "Strategy initialized"

vector_rep = DiscourseAi::Embeddings::VectorRepresentations::Base.current_representation(strategy)

puts "Vector representation obtained"

vector = vector_rep.vector_from("test")

puts "Vector generated"

puts vector.inspect

Strategy initialized

Vector representation obtained

Vector generated

[:embeddings, [-0.0020444912370294333, 0.008787356317043304, -0.010865539312362671, 0.01865551434457302, -0.02099628746509552, -0.009864491410553455, -0.0011329081607982516, 0.02949545904994011, 0.027839021757245064, 0.043966952711343765, 0.0406080037355423, 0.0016647017328068614, 0.007204003632068634, -0.03770752251148224, -0.025242917239665985, -0.0015279072104021907, -0.02805529721081257, -0.020901955664157867, -0.029206447303295135, -0.006209365092217922, -0.02105099707841873,

etc.

it seems to be hitting bge-large-en in aws:

- 2024-05-29T13:57:34.609+00:00 Batches: 0%| | 0/1 [00:00<?, ?it/s] Batches: 100%|██████████| 1/1 [00:00<00:00, 4.80it/s] Batches: 100%|██████████| 1/1 [00:00<00:00, 4.79it/s]

• 2024/05/29 09:57:34
INFO | POST / | Duration: 212.84 ms


- 2024-05-29T13:57:53.806+00:00 Batches: 0%| | 0/1 [00:00<?, ?it/s] Batches: 100%|██████████| 1/1 [00:01<00:00, 1.97s/it] Batches: 100%|██████████| 1/1 [00:01<00:00, 1.97s/it]

• 2024/05/29 09:57:53
INFO | POST / | Duration: 1978.36 ms

Falco · May 29, 2024, 4:41pm

So looks like it’s working just fine?

Maybe the problem is the re-ranker? Can you unset the ai_hugging_face_tei_reranker_endpoint and test if RAG works?

StevePlex · May 29, 2024, 5:47pm

turned off reranker… no embedding yet… getting this message on both ends:

Discourse LLM run test:

Trying to contact the model returned this error: {“error”:“Body needs to provide a inputs key, recieved: b’{"model":"bge-large-en","temperature":0.7,"messages":[{"role":"system","content":"You are a helpful bot"},{"role":"user","content":"How much is 1 + 1?"}],"max_tokens":1009}'”}

bge-large-en log

• 2024/05/29 13:40:03

ERROR | Body needs to provide a inputs key, recieved: b’{“model”:“bge-large-en”,“temperature”:0.7,“messages”:[{“role”:“system”,“content”:“You are a helpful bot”},{“role”:“user”,“content”:“How much is 1 + 1?”}],“max_tokens”:1009}’

discourse b1b218aa99
discourse-ai d812ecf5

sam · May 30, 2024, 12:25am

This is not how we should be testing embeddings that is an LLM test not an embedding model test which would expect numbers back. The LLM UI is not where you would add this we would need an Embedding UI for it which we dont have yet. Embedding models are only configured in site settings.

StevePlex · May 30, 2024, 1:04am

Yes. Makes sense.

( I tried to note that I was only using the LLM run test to confirm “connectivity” (see below) ! should of made that more clear . )

Topic		Replies	Views
Self-Hosting Embeddings for DiscourseAI Self-Hosting ai , ai-search , related-topics	21	1971	April 14, 2025
Gemini Embeddings are not working Bug ai	4	128	January 2, 2025
API access to the embedding(s) for a post Feature completed	4	409	September 15, 2024
Discourse AI - Embeddings Site Management ai , ai-search , related-topics	24	5513	May 20, 2025
AI embeddings backfill rake aborted Support ai	4	490	January 30, 2024

Settings for Hugging Face bge-large-en embeddings? -> RAG bots unresponsive!

Related topics