Don
December 4, 2024 at 11:40
1
Hello,
I use text-embedding-3-large as the AI embeddings model, and something is wrong with it. I have had to top up my OpenAI account twice since 30 Nov, which is crazy because one top-up should last for months… Has anything changed with related topics? Maybe it keeps backfilling topics that are already done, or I don't know.
It generates ~24 million input tokens/day.
Before 30 Nov it was ~60-220k.
Falco
(Falco)
December 4, 2024 at 15:40
2
Please share the values of all embeddings settings:
ai_embeddings_enabled
ai_embeddings_discourse_service_api_endpoint
ai_embeddings_discourse_service_api_endpoint_srv
ai_embeddings_discourse_service_api_key
ai_embeddings_model
ai_embeddings_per_post_enabled
ai_embeddings_generate_for_pms
ai_embeddings_semantic_related_topics_enabled
ai_embeddings_semantic_related_topics
ai_embeddings_semantic_related_include_closed_topics
ai_embeddings_backfill_batch_size
ai_embeddings_semantic_search_enabled
ai_embeddings_semantic_search_hyde_model
ai_embeddings_semantic_search_hyde_model_allowed_seeded_models
ai_embeddings_semantic_quick_search_enabled
Don
December 4, 2024 at 15:51
3
ai_embeddings_enabled: true
ai_embeddings_discourse_service_api_endpoint: ""
ai_embeddings_discourse_service_api_endpoint_srv: ""
ai_embeddings_discourse_service_api_key: ""
ai_embeddings_model: text-embedding-3-large
ai_embeddings_per_post_enabled: false
ai_embeddings_generate_for_pms: false
ai_embeddings_semantic_related_topics_enabled: true
ai_embeddings_semantic_related_topics: 5
ai_embeddings_semantic_related_include_closed_topics: true
ai_embeddings_backfill_batch_size: 250
ai_embeddings_semantic_search_enabled: true
ai_embeddings_semantic_search_hyde_model: Gemini 1.5 Flash
ai_embeddings_semantic_search_hyde_model_allowed_seeded_models: ""
ai_embeddings_semantic_quick_search_enabled: false
Falco
(Falco)
December 4, 2024 at 15:55
4
How many embeddings do you have?
SELECT COUNT(*) FROM ai_topic_embeddings WHERE model_id = 7;
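-- (assumption: model_id 7 is the id the plugin assigns to the
--  text-embedding-3-large representation Don is using)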
How many topics do you have?
SELECT COUNT(*) FROM topics WHERE deleted_at IS NULL AND archetype = 'regular';
Don
December 4, 2024 at 16:14
5
How many embeddings do you have?
5964
How many topics do you have?
5563
Jagster
(Jakke Lehtonen)
December 4, 2024 at 16:22
6
I checked mine. It exploded on 27 Nov; before that it was under 100k tokens a day, but then it jumped to 7 million and has been increasing every day, and yesterday it was close to 20 million.
Edit: in October, the embeddings cost was 46 cents. Now, almost four days into December: almost 6 dollars.
Yeah. I disabled embeddings.
Falco
(Falco)
December 4, 2024 at 18:57
7
24M tokens a day is your entire forum; that looks buggy. Unless you get updates in all of those topics every day, that is most certainly a bug.
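Rough arithmetic behind that read, using the counts Don posted above (a back-of-the-envelope sketch):

topics = 5563             # regular, non-deleted topics
daily_tokens = 24_000_000 # observed input tokens per day
daily_tokens / topics     # => ~4314 tokens per topic, i.e. roughly the
                          #    whole forum re-embedded every single day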
Falco
(Falco)
December 4, 2024 at 19:44
8
One thing that may be related: we used to skip calling the embeddings API when the topic digest didn't change, but we regressed this in gen_bulk_reprensentations @Roman.
@Don do you know how many embeddings requests you are making per day?
Jagster
(Jakke Lehtonen)
December 4, 2024 at 20:05
10
I'm not Don, but my API requests have increased from 80-100 to 3825.
Don
December 4, 2024 at 20:15
11
It is generally ~150-200 requests/day, but at the end of November it increased.
Roman
(Roman Rizzi)
December 4, 2024 at 20:51
12
I'm really sorry, this was a bug in the new code we added to backfill embeddings faster. It should be fixed by:
discourse:main ← discourse:bulk_embeddings_digest_check (opened 08:38PM - 04 Dec 24 UTC)
Please let me know if things are not back to normal.
Falco
(Falco)
December 4, 2024 at 20:59
13
Don:
ai_embeddings_backfill_batch_size: 250
Given the 250 per hour, we have a hard limit of 6k per day. These numbers are still inside the limit.
However, if they are only getting triggered by our "update a random sample" pass over topics, it should be limited to 10% of that, which would be, at worst, 600 requests.
@Roman is this limit here not getting applied somehow? Or is the problem elsewhere?
# Finally, we'll try to backfill embeddings for topics that have outdated
# embeddings due to edits or new replies. Here we only do 10% of the limit
relation =
  topics
    .where("#{table_name}.updated_at < ?", 6.hours.ago)
    .where("#{table_name}.updated_at < topics.updated_at")
    .limit((limit - rebaked) / 10)
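For reference, a quick sketch of the arithmetic behind the numbers above (assuming the backfill job runs hourly, which is what the 250-per-hour figure implies; names are illustrative):

limit_per_hour = 250                           # ai_embeddings_backfill_batch_size
hard_limit_per_day = limit_per_hour * 24       # => 6000 requests/day
outdated_cap_per_day = hard_limit_per_day / 10 # => 600, the "10% of the limit" slice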
Roman
(Roman Rizzi)
December 4, 2024 at 21:09
14
Yeah, I think the bug I fixed revealed another one that the digest check was hiding.
I think the bug is here:
posts =
  Post
    .joins("LEFT JOIN #{table_name} ON #{table_name}.post_id = posts.id")
    .where(deleted_at: nil)
    .where(post_type: Post.types[:regular])
    .limit(limit - rebaked)

# First, we'll try to backfill embeddings for posts that have none
posts
  .where("#{table_name}.post_id IS NULL")
  .find_in_batches do |batch|
    vector_rep.gen_bulk_reprensentations(batch)
    rebaked += batch.size
  end

return if rebaked >= limit

# Then, we'll try to backfill embeddings for posts that have outdated
# embeddings, be it model or strategy version
posts
  .where(<<~SQL)
I changed it from find_each to find_in_batches last week (the former uses batches internally), and since both rely on limit to specify the batch size, the original limit of limit - rebaked is ignored. We should use pluck + each_slice instead.
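A minimal sketch of that pluck + each_slice approach (vector_rep, table_name, limit, rebaked, and batch_size are taken as given from the snippets above; an illustration, not the exact shipped fix):

# Fetch at most the remaining budget of ids up front, then slice the
# batches ourselves so the batching helper can't clobber the limit.
posts
  .where("#{table_name}.post_id IS NULL")
  .limit(limit - rebaked)
  .pluck(:id)
  .each_slice(batch_size) do |ids|
    batch = Post.where(id: ids)
    vector_rep.gen_bulk_reprensentations(batch)
    rebaked += ids.size
  end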
Don
December 4, 2024 at 23:37
15
Thanks for the fix!
I've updated my site, but it looks like there is an issue in /logs. I am not sure if it is related to this…
Message
Job exception: ERROR: invalid input syntax for type halfvec: "[NULL]"
LINE 2: ...1, 1, 'e358a54a79f71861a4ebd17ecebbad6932fc1f9a', '[NULL]', ...
^
Backtrace
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/patches/db/pg.rb:110:in `exec'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/patches/db/pg.rb:110:in `async_exec'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/postgres/connection.rb:217:in `run'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/active_record_postgres/connection.rb:38:in `block in run'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/active_record_postgres/connection.rb:34:in `block in with_lock'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.2/lib/active_support/concurrency/null_lock.rb:9:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/active_record_postgres/connection.rb:34:in `with_lock'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/active_record_postgres/connection.rb:38:in `run'
/var/www/discourse/lib/mini_sql_multisite_connection.rb:109:in `run'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_sql-1.6.0/lib/mini_sql/postgres/connection.rb:196:in `exec'
/var/www/discourse/plugins/discourse-ai/lib/embeddings/vector_representations/base.rb:423:in `save_to_db'
/var/www/discourse/plugins/discourse-ai/lib/embeddings/vector_representations/base.rb:86:in `block in gen_bulk_reprensentations'
/var/www/discourse/plugins/discourse-ai/lib/embeddings/vector_representations/base.rb:86:in `each'
/var/www/discourse/plugins/discourse-ai/lib/embeddings/vector_representations/base.rb:86:in `gen_bulk_reprensentations'
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:131:in `block in populate_topic_embeddings'
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:130:in `each'
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:130:in `each_slice'
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:130:in `populate_topic_embeddings'
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:36:in `execute'
/var/www/discourse/app/jobs/base.rb:308:in `block (2 levels) in perform'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
/var/www/discourse/app/jobs/base.rb:295:in `block in perform'
/var/www/discourse/app/jobs/base.rb:291:in `each'
/var/www/discourse/app/jobs/base.rb:291:in `perform'
/var/www/discourse/app/jobs/base.rb:362:in `perform'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_scheduler-0.17.0/lib/mini_scheduler/manager.rb:137:in `process_queue'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_scheduler-0.17.0/lib/mini_scheduler/manager.rb:77:in `worker_loop'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/mini_scheduler-0.17.0/lib/mini_scheduler/manager.rb:63:in `block (2 levels) in ensure_worker_threads'
Roman
(Roman Rizzi)
December 4, 2024 at 23:51
16
At first glance, it doesn't look related. It looks like it failed to generate the embedding and it's trying to insert NULL. Could it be that OpenAI is returning an error? Maybe something quota-related?
Can you please run this from a console?
DiscourseAi::Embeddings::VectorRepresentations::Base
  .find_representation(SiteSetting.ai_embeddings_model)
  .new(DiscourseAi::Embeddings::Strategies::Truncation.new)
  .vector_from("this is a test")
  .present?
It should log the error in your logs if it raises a Net::HTTPBadResponse.
Don
December 5, 2024 at 00:02
17
I got back true in the console, and nothing in /logs.
Maybe this was a delay from OpenAI, because I topped up my account again an hour ago and that process is probably not instant…
Roman
(Roman Rizzi)
December 5, 2024 at 01:00
18
That means it can generate embeddings, then. Do these errors persist? If so, you should see them every five minutes.
I ran some tests on my local instance against our self-hosted embeddings service and confirmed that backfilling works under the following conditions:
There are no embeddings.
The digest is outdated and the embeddings' updated_at is older than 6 hours.
The digest is not outdated and the embeddings' updated_at is older than 6 hours (it does not update in this case).
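A self-contained sketch of the digest check those conditions describe (hypothetical helper, not the plugin's actual code):

require "digest"

# Regenerate only when an embedding is missing, or when it is stale (older
# than 6 hours) AND the stored content digest no longer matches the content.
def needs_new_embedding?(current_text, stored_digest, embedding_updated_at)
  return true if stored_digest.nil?                          # no embedding yet
  return false if embedding_updated_at > Time.now - 6 * 3600 # refreshed recently
  stored_digest != Digest::SHA1.hexdigest(current_text)      # digest outdated?
end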
Don
December 5, 2024 at 06:21
19
Roman Rizzi:
Do these errors persist?
Nope, I don't see those errors in /logs anymore; everything works now. Thank you!
Falco
(Falco)
December 5, 2024 at 19:12
21
Don:
I've updated my site
We merged another fix 5h ago, please update again.
main ← embedding_backfill_limit (opened 09:30PM - 04 Dec 24 UTC)
I'm trying to solve two issues here:
1. Using `find_in_batches` or `find_each` will ignore the limit because it relies on it to specify a batch size.
2. We need to individually apply the limit on each step since `rebaked` gets bigger every time we process a batch.
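A sketch of what point 2 implies (hypothetical backfill helper and relation names; not the merged diff): the remaining budget must be recomputed before each phase, because rebaked grows as batches complete:

# Phase 1: posts with no embeddings at all
rebaked += backfill(posts_missing_embeddings, limit - rebaked)
return if rebaked >= limit

# Phase 2: posts with outdated embeddings, using the budget left after phase 1
rebaked += backfill(posts_with_outdated_embeddings, limit - rebaked)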
After that, please let me know how the rate is looking.
cc @Jagster.
Jagster
(Jakke Lehtonen)
December 5, 2024 at 19:16
22
I don't know anything about limits, but the number of API requests etc. dropped back to normal after the earlier fix. So thanks, guys, for the fast reaction.