Don  
                
                  
                    4 December 2024, 11:40am
                   
                  1 
               
             
            
              Hello,
I use text-embedding-3-large as my AI embeddings model, and something is wrong with it. I have had to top up my OpenAI account twice since 30 Nov, which is crazy because the balance should last for months… Has anything changed with related topics? Maybe it keeps backfilling topics that are already done, I don’t know.
It generates ~24 million input tokens / day.
Before 30 Nov it was ~60 - 220k.
             
            
              2 Likes
            
            
           
          
            
              
                Falco  
              
                  
                    4 December 2024, 3:40pm
                   
                  2 
               
             
            
              Please share the values of all embeddings settings:
ai_embeddings_enabled
ai_embeddings_discourse_service_api_endpoint
ai_embeddings_discourse_service_api_endpoint_srv
ai_embeddings_discourse_service_api_key
ai_embeddings_model
ai_embeddings_per_post_enabled
ai_embeddings_generate_for_pms
ai_embeddings_semantic_related_topics_enabled
ai_embeddings_semantic_related_topics
ai_embeddings_semantic_related_include_closed_topics
ai_embeddings_backfill_batch_size
ai_embeddings_semantic_search_enabled
ai_embeddings_semantic_search_hyde_model
ai_embeddings_semantic_search_hyde_model_allowed_seeded_models
ai_embeddings_semantic_quick_search_enabled
 
            
              1 Like
            
            
           
          
            
              
                Don  
              
                  
                    4 December 2024, 3:51pm
                   
                  3 
               
             
            
              ai_embeddings_enabled: true
ai_embeddings_discourse_service_api_endpoint: ""
ai_embeddings_discourse_service_api_endpoint_srv: ""
ai_embeddings_discourse_service_api_key: ""
ai_embeddings_model: text-embedding-3-large
ai_embeddings_per_post_enabled: false
ai_embeddings_generate_for_pms: false
ai_embeddings_semantic_related_topics_enabled: true
ai_embeddings_semantic_related_topics: 5
ai_embeddings_semantic_related_include_closed_topics: true
ai_embeddings_backfill_batch_size: 250
ai_embeddings_semantic_search_enabled: true
ai_embeddings_semantic_search_hyde_model: Gemini 1.5 Flash
ai_embeddings_semantic_search_hyde_model_allowed_seeded_models: ""
ai_embeddings_semantic_quick_search_enabled: false
 
            
              1 Like
            
            
           
          
            
              
                Falco  
              
                  
                    4 December 2024, 3:55pm
                   
                  4 
               
             
            
              How many embeddings do you have?
SELECT COUNT(*) FROM ai_topic_embeddings WHERE model_id = 7;
How many topics do you have?
SELECT COUNT(*) FROM topics WHERE deleted_at IS NULL AND archetype = 'regular';
             
            
              1 Like
            
            
           
          
            
              
                Don  
              
                  
                    4 December 2024, 4:14pm
                   
                  5 
               
             
            
              How many embeddings do you have?
How many topics do you have?
             
            
              1 Like
            
            
           
          
            
              
                Jagster  
              
                  
                    4 December 2024, 4:22pm
                   
                  6 
               
             
            
              I checked mine. It exploded on 27 Nov. Before that it was under 100k tokens a day, then it jumped to 7 million and has kept increasing every day; yesterday it was close to 20 million.
Edit: In October, embeddings cost 46 cents. Now, not quite four days into December: almost 6 dollars.
Yeah. I disabled embeddings.
             
            
              2 Likes
            
            
           
          
            
              
                Falco  
              
                  
                    4 December 2024, 6:57pm
                   
                  7 
               
             
            
              24M a day is your entire forum; that looks buggy. Unless you get updates in all those topics every day, that is most certainly a bug.
             
            
              1 Like
            
            
           
          
            
              
                Falco  
              
                  
                    4 December 2024, 7:44pm
                   
                  8 
               
             
            
              One thing that may be related is that we used to skip calling the embeddings API when the topic digest didn’t change, but we regressed this in gen_bulk_reprensentations, @Roman.
@Don, do you know how many embeddings requests you are making per day?
             
            
              2 Likes
            
            
           
          
            
              
                Jagster  
              
                  
                    4 December 2024, 8:05pm
                   
                  10 
               
             
            
              I’m not Don, but my API requests have increased from 80-100 to 3825.
             
            
              2 Likes
            
            
           
          
            
              
                Don  
                
                  
                    4 December 2024, 8:15pm
                   
                  11 
               
             
            
              It is generally ~150 - 200 requests / day,
but at the end of November it increased.
             
            
              1 Like
            
            
           
          
            
              
                Roman  
              
                  
                    4 December 2024, 8:51pm
                   
                  12 
               
             
            
              I’m really sorry, this was a bug in the new code we added to backfill embeddings faster. It should be fixed by:
  
  
    
    
  
      
    
      discourse:main ← discourse:bulk_embeddings_digest_check (opened 08:38PM - 04 Dec 24 UTC)
        
        
        
       
   
 
   
  
    
    
  
  
 
Please let me know if things are not back to normal.
             
            
              6 Likes
            
            
           
          
            
              
                Falco  
              
                  
                    4 December 2024, 8:59pm
                   
                  13 
               
             
            
              
 Don:
 
 
 
Given the batch size of 250 per hour, we have a hard limit of 6k per day. These numbers are still inside the limit.
However, if they are only getting triggered by our “update a random sample” of topics, it should be limited to 10% of that, which would, at worst, be 600 requests.
@Roman, is this limit here not getting applied somehow? Or is the problem elsewhere?
  
  
    
    
      
          # Finally, we'll try to backfill embeddings for topics that have outdated 
          # embeddings due to edits or new replies. Here we only do 10% of the limit 
          relation = 
            topics 
              .where("#{table_name}.updated_at < ?", 6.hours.ago) 
              .where("#{table_name}.updated_at < topics.updated_at") 
              .limit((limit - rebaked) / 10) 
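
The arithmetic behind those figures can be sketched as follows, assuming the backfill job runs once per hour as the post implies. The constant names are made up for illustration; they are not plugin settings or plugin code.

```ruby
# Hypothetical names; only the numbers come from the thread.
BATCH_SIZE_PER_RUN = 250 # ai_embeddings_backfill_batch_size from the settings above
RUNS_PER_DAY = 24        # assumption: one backfill run per hour

daily_hard_limit = BATCH_SIZE_PER_RUN * RUNS_PER_DAY
# The "outdated embeddings" step is meant to use only 10% of that budget.
daily_refresh_limit = daily_hard_limit / 10

puts "hard limit per day: #{daily_hard_limit}"       # 6000
puts "refresh limit per day: #{daily_refresh_limit}" # 600
```

Against these bounds, the 3825 requests a day reported earlier in the thread sit under the hard limit but well above the 600 expected from refreshes alone.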
       
     
  
    
    
  
  
 
             
            
              1 Like
            
            
           
          
            
              
                Roman  
              
                  
                    4 December 2024, 9:09pm
                   
                  14 
               
             
            
              Yeah, I think the bug I fixed revealed another one that the digest check was hiding.
I think the bug is here:
  
  
    
    
      
          posts = 
            Post 
              .joins("LEFT JOIN #{table_name} ON #{table_name}.post_id = posts.id") 
              .where(deleted_at: nil) 
              .where(post_type: Post.types[:regular]) 
              .limit(limit - rebaked) 
          # First, we'll try to backfill embeddings for posts that have none 
          posts 
            .where("#{table_name}.post_id IS NULL") 
            .find_in_batches do |batch| 
              vector_rep.gen_bulk_reprensentations(batch) 
              rebaked += batch.size 
            end 
          return if rebaked >= limit 
          # Then, we'll try to backfill embeddings for posts that have outdated 
          # embeddings, be it model or strategy version 
          posts 
            .where(<<~SQL) 
       
     
  
    
    
  
  
 
I changed it from find_each to find_in_batches last week (the former uses batches internally), and since both rely on limit to specify batch size, the original limit of limit - rebaked is ignored. We should use pluck + each_slice instead.
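
A minimal sketch of the `pluck` + `each_slice` idea, taking the diagnosis above at face value. It uses plain Ruby arrays in place of ActiveRecord so that it runs on its own: `candidate_ids` stands in for the ids a `pluck(:id)` call would return, and `fake_embed` stands in for `gen_bulk_reprensentations`. Both names, and all the numbers, are hypothetical.

```ruby
# Stand-in for posts.where(...).limit(limit - rebaked).pluck(:id).
# The cap is applied to the id list once, before any batching happens.
def ids_within_budget(candidate_ids, limit, rebaked)
  remaining = [limit - rebaked, 0].max
  candidate_ids.first(remaining)
end

# Stand-in for vector_rep.gen_bulk_reprensentations(batch).
def fake_embed(batch)
  batch.size
end

limit = 250
rebaked = 0
candidate_ids = (1..10_000).to_a # pretend every post needs an embedding

ids_within_budget(candidate_ids, limit, rebaked).each_slice(100) do |batch|
  # each_slice only decides how the capped list is chunked;
  # it cannot pull in more ids than the list already holds.
  rebaked += fake_embed(batch)
end

puts rebaked # prints 250
```

Because the list is already capped when batching starts, the slice size and the overall limit no longer compete for the same setting.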
             
            
              4 Likes
            
            
           
          
            
              
                Don  
              
                  
                    4 December 2024, 11:37pm
                   
                  15 
               
             
            
              Thanks for the fix 
I’ve updated my site, but it looks like there is an issue in /logs. I am not sure whether it is related to this…
Message
Job exception: ERROR:  invalid input syntax for type halfvec: "[NULL]"
LINE 2: ...1, 1, 'e358a54a79f71861a4ebd17ecebbad6932fc1f9a', '[NULL]', ...
                                                             ^
Backtrace
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/patches/db/pg.rb:110:in `exec'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/patches/db/pg.rb:110:in `async_exec'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/postgres/connection.rb:217:in `run'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/active_record_postgres/connection.rb:38:in `block in run'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/active_record_postgres/connection.rb:34:in `block in with_lock'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.2/lib/active_support/concurrency/null_lock.rb:9:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/active_record_postgres/connection.rb:34:in `with_lock'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/active_record_postgres/connection.rb:38:in `run'
/var/www/discourse/lib/mini_sql_multisite_connection.rb:109:in `run'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/postgres/connection.rb:196:in `exec'
/var/www/discourse/plugins/discourse-ai/lib/embeddings/vector_representations/base.rb:423:in `save_to_db'
/var/www/discourse/plugins/discourse-ai/lib/embeddings/vector_representations/base.rb:86:in `block in gen_bulk_reprensentations'
/var/www/discourse/plugins/discourse-ai/lib/embeddings/vector_representations/base.rb:86:in `each'
/var/www/discourse/plugins/discourse-ai/lib/embeddings/vector_representations/base.rb:86:in `gen_bulk_reprensentations'
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:131:in `block in populate_topic_embeddings'
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:130:in `each'
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:130:in `each_slice'
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:130:in `populate_topic_embeddings'
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:36:in `execute'
/var/www/discourse/app/jobs/base.rb:308:in `block (2 levels) in perform'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
/var/www/discourse/app/jobs/base.rb:295:in `block in perform'
/var/www/discourse/app/jobs/base.rb:291:in `each'
/var/www/discourse/app/jobs/base.rb:291:in `perform'
/var/www/discourse/app/jobs/base.rb:362:in `perform'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_scheduler-0.17.0/lib/mini_scheduler/manager.rb:137:in `process_queue'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_scheduler-0.17.0/lib/mini_scheduler/manager.rb:77:in `worker_loop'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_scheduler-0.17.0/lib/mini_scheduler/manager.rb:63:in `block (2 levels) in ensure_worker_threads'
 
            
              1 Like
            
            
           
          
            
              
                Roman  
              
                  
                    4 December 2024, 11:51pm
                   
                  16 
               
             
            
              At first glance, it doesn’t look related. Looks like it failed to generate the embedding and it’s trying to insert NULL. Could it be OpenAI is returning an error? Maybe something quota related?
Can you please run this from a console?
DiscourseAi::Embeddings::VectorRepresentations::Base
          .find_representation(SiteSetting.ai_embeddings_model)
          .new(DiscourseAi::Embeddings::Strategies::Truncation.new)
          .vector_from("this is a test")
          .present?
It should log the error in your logs if it raises a Net::HTTPBadResponse.
             
            
              1 Like
            
            
           
          
            
              
                Don  
              
                  
                    5 December 2024, 12:02am
                   
                  17 
               
             
            
              I got back in the console: true, and nothing in /logs.
Maybe this is a delay on OpenAI’s side, because I topped up my account again an hour ago and that process probably isn’t instant…
             
            
              
            
           
          
            
              
                Roman  
              
                  
                    5 December 2024, 1:00am
                   
                  18 
               
             
            
              That means it can generate embeddings then. Do these errors persist? You should see these errors every five minutes if so.
I ran some tests on my local instance against our self-hosted embeddings service and confirmed that backfilling works under the following conditions:
There are no embeddings. 
The digest is outdated and the embeddings’ updated_at is older than 6 hours. 
The digest is not outdated and the embeddings’ updated_at is older than 6 hours (it does not update in this case). 
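
The three cases above can be read as a small decision rule. The sketch below is only an illustration of that rule, not the plugin's code: `EmbeddingRow`, `regenerate?`, and the use of SHA1 for the digest are all assumptions, and the early return for recently touched rows mirrors the 6-hour condition in the snippet quoted earlier in the thread.

```ruby
require "digest"

# Hypothetical record shape; the plugin's real schema may differ.
EmbeddingRow = Struct.new(:digest, :updated_at)

STALE_AFTER_SECONDS = 6 * 60 * 60 # the 6-hour window mentioned above

# Returns true when an embeddings API call would be made for this text.
def regenerate?(row, text, now)
  return true if row.nil?                                    # no embedding yet
  return false if now - row.updated_at < STALE_AFTER_SECONDS # touched recently
  row.digest != Digest::SHA1.hexdigest(text)                 # only if content changed
end

now = Time.utc(2024, 12, 5, 1, 0, 0)
seven_hours_ago = now - (7 * 60 * 60)

unchanged = EmbeddingRow.new(Digest::SHA1.hexdigest("same text"), seven_hours_ago)
edited = EmbeddingRow.new(Digest::SHA1.hexdigest("old text"), seven_hours_ago)

puts regenerate?(nil, "new topic", now)       # true
puts regenerate?(edited, "new text", now)     # true
puts regenerate?(unchanged, "same text", now) # false
```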
 
             
            
              1 Like
            
            
           
          
            
              
                Don  
              
                  
                    5 December 2024, 6:21am
                   
                  19 
               
             
            
              
 Roman Rizzi:
 
Do these errors persist?
 
 
Nope, I don’t see those errors in /logs anymore, everything works now. Thank you 
             
            
              1 Like
            
            
           
          
            
              
                Falco  
              
                  
                    5 December 2024, 7:12pm
                   
                  21 
               
             
            
              
 Don:
 
I’ve updated my site
 
 
We merged another fix 5h ago, please update again.
  
  
    
    
  
      
    
      main ← embedding_backfill_limit (opened 09:30PM - 04 Dec 24 UTC)
        
        
        
       
   
 
  
    I'm trying to solve two issues here:
1. Using `find_in_batches` or `find_each… ` will ignore the limit because it relies on it to specify a batch size.
2. We need to individually apply the limit on each step since `rebaked` gets bigger every time we process a batch. 
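
The second point can be illustrated with a small stand-alone sketch: each step asks for the budget that is left at that moment, instead of reusing a cap computed before any work was done. `run_step` and the two work lists are hypothetical stand-ins, not the plugin's code.

```ruby
# Processes at most `budget` ids from `pending` and returns how many were done.
def run_step(pending, budget)
  return 0 if budget <= 0
  pending.first(budget).size
end

limit = 250
missing = (1..200).to_a    # step 1: items with no embedding
outdated = (201..400).to_a # step 2: items with an outdated embedding

# Stale cap: computed once while nothing had been processed, then reused.
stale_cap = limit - 0
stale_total = run_step(missing, stale_cap) + run_step(outdated, stale_cap)

# Fresh cap: recomputed right before each step.
rebaked = 0
rebaked += run_step(missing, limit - rebaked)  # uses 200 of 250
rebaked += run_step(outdated, limit - rebaked) # only 50 left

puts stale_total # prints 400, over the limit
puts rebaked     # prints 250
```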
   
   
  
    
    
  
  
 
After that, please let me know how the rate is looking.
cc @Jagster .
             
            
              2 Likes
            
            
           
          
            
              
                Jagster  
              
                  
                    5 December 2024, 7:16pm
                   
                  22 
               
             
            
              I don’t know anything about limits, but the number of API requests etc. dropped back to normal after the earlier fix. So thanks, guys, for the fast reaction.
             
            
              2 Likes