Settings for Hugging Face bge-large-en embeddings? -> RAG bots unresponsive!

Please advise: what are the best settings to allow the bge-large-en embedding model to work as the default Discourse AI vector service?

I have a bge-large-en instance running in AWS, and I know my Discourse AI is talking to it (see the test below), but embedding is not working in general (OpenAI embedding works fine).

PROBLEM SUMMARY: RAG bots are unresponsive when the embedding model is set to HF bge-large-en.

Here's the AWS embedding model:

Here are the Discourse AI settings:


Here's a Discourse custom LLM 'Run test', just to check connectivity:

Here are the bge-large-en logs on the AWS side:

Many thanks!


Here's the error log:

Job exception: can't quote Array

hostname: ai-qa-ubuntu-s-1vcpu-2gb-amd-sfo3-01-app
process_id: 1165935
application_version: f9192835a7e4d2067c3d1844f43f9e7b69de39e7
current_db: default
current_hostname: ai-qa.net
job: Jobs::CreateAiReply
problem_db: default
time: 7:22 pm
opts:
  post_id: 618
  bot_user_id: -1208
  persona_id: 5
  current_site_id: default


Backtrace

/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/connection_adapters/abstract/quoting.rb:25:in `quote'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/connection_adapters/postgresql/quoting.rb:69:in `quote'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/connection_adapters/abstract/quoting.rb:51:in `quote_bound_value'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:193:in `block in quote_bound_value'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:193:in `map!'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:193:in `quote_bound_value'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:171:in `replace_bind_variable'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:180:in `block in replace_named_bind_variables'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:176:in `gsub'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:176:in `replace_named_bind_variables'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.8.1/lib/active_record/sanitization.rb:128:in `sanitize_sql_array'
/var/www/discourse/lib/mini_sql_multisite_connection.rb:21:in `public_send'
/var/www/discourse/lib/mini_sql_multisite_connection.rb:21:in `encode'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/connection.rb:64:in `to_sql'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/postgres/connection.rb:202:in `run'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/active_record_postgres/connection.rb:38:in `block in run'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/active_record_postgres/connection.rb:34:in `block in with_lock'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.8.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `handle_interrupt'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.8.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `block in synchronize'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.8.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.8.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/active_record_postgres/connection.rb:34:in `with_lock'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/active_record_postgres/connection.rb:38:in `run'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/mini_sql-1.5.0/lib/mini_sql/postgres/connection.rb:99:in `query'
/var/www/discourse/plugins/discourse-ai/lib/embeddings/vector_representations/base.rb:272:in `asymmetric_rag_fragment_similarity_search'
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/personas/persona.rb:286:in `rag_fragments_prompt'
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/personas/persona.rb:156:in `craft_prompt'
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/bot.rb:54:in `reply'
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/playground.rb:424:in `reply_to'
/var/www/discourse/plugins/discourse-ai/app/jobs/regular/create_ai_reply.rb:18:in `execute'
/var/www/discourse/app/jobs/base.rb:305:in `block (2 levels) in perform'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/rails_multisite-6.0.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/rails_multisite-6.0.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
/var/www/discourse/app/jobs/base.rb:292:in `block in perform'
/var/www/discourse/app/jobs/base.rb:288:in `each'
/var/www/discourse/app/jobs/base.rb:288:in `perform'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:202:in `execute_job'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:170:in `block (2 levels) in process'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:177:in `block in invoke'
/var/www/discourse/lib/sidekiq/pausable.rb:132:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:179:in `block in invoke'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:182:in `invoke'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:169:in `block in process'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:136:in `block (6 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_retry.rb:113:in `local'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:135:in `block (5 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq.rb:44:in `block in <module:Sidekiq>'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:131:in `block (4 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:263:in `stats'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:126:in `block (3 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_logger.rb:13:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:125:in `block (2 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_retry.rb:80:in `global'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:124:in `block in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_logger.rb:39:in `prepare'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:123:in `dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:168:in `process'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:78:in `process_one'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:68:in `run'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/component.rb:8:in `watchdog'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/component.rb:17:in `block in safe_thread'

Thanks for raising this, we will have a look!


What is the output of running the following commands in a rails console?

strategy = DiscourseAi::Embeddings::Strategies::Truncation.new
vector_rep = DiscourseAi::Embeddings::VectorRepresentations::Base.current_representation(strategy)
vector_rep.vector_from("test")

Also, our API is designed to work against a self-hosted instance of GitHub - huggingface/text-embeddings-inference (a blazing fast inference solution for text embeddings models), per the documentation, so it's possible that it doesn't work against the hosted version.

If you provide the backtrace we can look into making it work.
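
For reference, here is a minimal sketch of the request shape a self-hosted text-embeddings-inference server expects, assuming the documented {"inputs": ...} payload; the endpoint URL below is hypothetical, and this only checks connectivity outside of Discourse:

require "net/http"
require "json"

# Hypothetical TEI endpoint; replace with your real AWS endpoint URL.
uri = URI("https://your-tei-endpoint.example.com/")
req = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
# TEI expects an "inputs" key, not an OpenAI-style chat "messages" payload.
req.body = { inputs: "test" }.to_json
res = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") { |http| http.request(req) }
puts res.body # should print a nested array of floats (the embedding)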


@Falco

Here's what happened when I ran the test code (with bge-large-en running on an AWS dedicated endpoint instance configured as the embedding model):


root@studyqa-ubuntu-s-1vcpu-2gb-amd-sfo3-01-app:/var/www/discourse# rails c

[1] pry(main)> strategy = DiscourseAi::Embeddings::Strategies::Truncation.new

puts "Strategy initialized"

vector_rep = DiscourseAi::Embeddings::VectorRepresentations::Base.current_representation(strategy)

puts "Vector representation obtained"

vector = vector_rep.vector_from("test")

puts "Vector generated"

puts vector.inspect

Strategy initialized

Vector representation obtained

Vector generated

[:embeddings, [-0.0020444912370294333, 0.008787356317043304, -0.010865539312362671, 0.01865551434457302, -0.02099628746509552, -0.009864491410553455, -0.0011329081607982516, 0.02949545904994011, 0.027839021757245064, 0.043966952711343765, 0.0406080037355423, 0.0016647017328068614, 0.007204003632068634, -0.03770752251148224, -0.025242917239665985, -0.0015279072104021907, -0.02805529721081257, -0.020901955664157867, -0.029206447303295135, -0.006209365092217922, -0.02105099707841873,

etc.


It seems to be hitting bge-large-en in AWS:

- 2024-05-29T13:57:34.609+00:00 Batches: 0%| | 0/1 [00:00<?, ?it/s] Batches: 100%|██████████| 1/1 [00:00<00:00, 4.80it/s] Batches: 100%|██████████| 1/1 [00:00<00:00, 4.79it/s]

• 2024/05/29 09:57:34
INFO | POST / | Duration: 212.84 ms


- 2024-05-29T13:57:53.806+00:00 Batches: 0%| | 0/1 [00:00<?, ?it/s] Batches: 100%|██████████| 1/1 [00:01<00:00, 1.97s/it] Batches: 100%|██████████| 1/1 [00:01<00:00, 1.97s/it]

• 2024/05/29 09:57:53
INFO | POST / | Duration: 1978.36 ms

So it looks like it's working just fine?

Maybe the problem is the reranker? Can you unset ai_hugging_face_tei_reranker_endpoint and test whether RAG works?
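
If it helps, a minimal sketch for clearing that setting from the rails console, assuming it can be blanked like any other site setting:

# Clear the reranker endpoint so RAG skips the reranking step.
SiteSetting.ai_hugging_face_tei_reranker_endpoint = ""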

I turned off the reranker… still no embedding… getting this message on both ends:


Discourse LLM run test:

Trying to contact the model returned this error: {"error":"Body needs to provide a inputs key, recieved: b'{\"model\":\"bge-large-en\",\"temperature\":0.7,\"messages\":[{\"role\":\"system\",\"content\":\"You are a helpful bot\"},{\"role\":\"user\",\"content\":\"How much is 1 + 1?\"}],\"max_tokens\":1009}'"}


bge-large-en log

• 2024/05/29 13:40:03

ERROR | Body needs to provide a inputs key, recieved: b'{"model":"bge-large-en","temperature":0.7,"messages":[{"role":"system","content":"You are a helpful bot"},{"role":"user","content":"How much is 1 + 1?"}],"max_tokens":1009}'


discourse b1b218aa99
discourse-ai d812ecf5

This is not how we should be testing embeddings :slight_smile: That is an LLM test, not an embedding model test, which would expect numbers back. The LLM UI is not where you would add this; we would need an Embeddings UI for it, which we don't have yet. Embedding models are only configured in site settings.
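
To check which embedding-related settings are in effect, here is a minimal sketch you can run in the rails console; the setting-name prefixes are assumptions modeled on the reranker setting mentioned above:

# List ai_embeddings_* and ai_hugging_face_* site settings with their current values.
SiteSetting.all_settings
  .select { |s| s[:setting].to_s.start_with?("ai_embeddings", "ai_hugging_face") }
  .each { |s| puts "#{s[:setting]}: #{s[:value]}" }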

Yes. Makes sense.

(I tried to note that I was only using the LLM 'Run test' to confirm connectivity (see below)! I should have made that clearer.)


