Discourse AI تتسبب في أخطاء SSL جديدة وإعادة تعيين الاتصال بواسطة Peer

Priority/Severity:
Recent repository changes make Discourse AI mostly unworkable with current OpenAI API.

Platform:

  • Self-hosted, using standard standalone build
  • Ubuntu 24.04 host VM, Docker containers
  • OpenAI API
  • Anthropic API

Description:

Discourse AI has been calling an external API - OpenAI with models and working well as of Feb 15 (last container rebuild). Today (Feb 21) I rebuilt the container and things are not working.

Here’s what I know:

As at Feb 15
OpenAI models configured and working well:

  • LLM/Persona
    • GPT4 Omni
    • GPT4 Omni Mini
  • Embeddings
    • text-embedding-ada-002

As at Feb 21

All OpenAI models have about 70-80% error rate for LLM calls with the error message “Connection Reset by Peer”. Some chats go through, some fail partway in between. Embedding calls fail with an Faraday::ConnectionFailed SSL error.

Additional OpenAI models fail:

  • o1-mini and o1-preview fail to test/save the LLM with a code error (‘developer’ is not a valid role) because developer role is only valid for o1 and o3 models, not their -mini versions. Source code github.com/discourse/discourse-ai/…/chat_gpt.rb:61 needs to be updated to do an exact model name match not a starts_with match. For the else case on line 73, there is no longer a system user and needs to be updated to be simply user. As of today, o1-mini cannot use tools.

Have tried:

  • Checked OpenAI platform limits and we are well under Rate Limits and OpenAI account is funded.
  • Rebuilding container
  • Deleting and recreating LLM personas and users
  • Deleting and creating LLM models
  • Creating new API token keys
  • Ensuring SSL and certificates were updated inside container
  • Logging into container and using bash & curl to call API (successfully)
  • Logging into rails console RAILS_ENV=production bundle exec rails console and using a http object to call the OpenAI API (successfully)
  • Anthropic API calls to claude-3.5-sonnet (successfully)

Reproducible steps:

Create new container build using the latest Discourse and add the Discourse AI plugin to plugins:

  ...
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/discourse-ai.git

Configure OpenAI LLM and Embeddings models with the following:

  • GPT4 Omni, GPT4 Omni Mini
    • All default values, insert your api key
    • Tokens: 64000
    • Click “Run test”, wait for response, sometimes success, often “Internal Server Error”. When successful, trying to chat with the persona gives the Inference LLM Model stack trace
  • text-embedding-ada-002, text-embedding-3-large
    • Saves successfully, generates error logs, multiple repeating every 5 mins

Internal Server Error Stack Trace

Internal Server Error stack trace
Message (2 copies reported)
Errno::ECONNRESET (Connection reset by peer)
app/controllers/application_controller.rb:427:in `block in with_resolved_locale'
app/controllers/application_controller.rb:427:in `with_resolved_locale'
lib/middleware/omniauth_bypass_middleware.rb:35:in `call'
lib/content_security_policy/middleware.rb:12:in `call'
lib/middleware/anonymous_cache.rb:409:in `call'
lib/middleware/csp_script_nonce_injector.rb:12:in `call'
config/initializers/008-rack-cors.rb:26:in `call'
config/initializers/100-quiet_logger.rb:20:in `call'
config/initializers/100-silence_logger.rb:29:in `call'
lib/middleware/enforce_hostname.rb:24:in `call'
lib/middleware/processing_request.rb:12:in `call'
lib/middleware/request_tracker.rb:385:in `call'
Backtrace
openssl (3.3.0) lib/openssl/buffering.rb:217:in `sysread_nonblock'
openssl (3.3.0) lib/openssl/buffering.rb:217:in `read_nonblock'
net-protocol (0.2.2) lib/net/protocol.rb:218:in `rbuf_fill'
net-protocol (0.2.2) lib/net/protocol.rb:199:in `readuntil'
net-protocol (0.2.2) lib/net/protocol.rb:209:in `readline'
net-http (0.6.0) lib/net/http/response.rb:625:in `read_chunked'
net-http (0.6.0) lib/net/http/response.rb:595:in `block in read_body_0'
net-http (0.6.0) lib/net/http/response.rb:570:in `inflater'
net-http (0.6.0) lib/net/http/response.rb:593:in `read_body_0'
net-http (0.6.0) lib/net/http/response.rb:363:in `read_body'
plugins/discourse-ai/lib/completions/endpoints/base.rb:374:in `non_streaming_response'
plugins/discourse-ai/lib/completions/endpoints/base.rb:160:in `block (2 levels) in perform_completion!'
net-http (0.6.0) lib/net/http.rb:2433:in `block in transport_request'
net-http (0.6.0) lib/net/http/response.rb:320:in `reading_body'
net-http (0.6.0) lib/net/http.rb:2430:in `transport_request'
net-http (0.6.0) lib/net/http.rb:2384:in `request'
rack-mini-profiler (3.3.1) lib/patches/net_patches.rb:19:in `block in request_with_mini_profiler' 
rack-mini-profiler (3.3.1) lib/mini_profiler/profiling_methods.rb:44:in `step' 
rack-mini-profiler (3.3.1) lib/patches/net_patches.rb:18:in `request_with_mini_profiler' 
(eval at /var/www/discourse/lib/method_profiler.rb:38):12:in `request'
plugins/discourse-ai/lib/completions/endpoints/base.rb:122:in `block in perform_completion!'
net-http (0.6.0) lib/net/http.rb:1632:in `start'
net-http (0.6.0) lib/net/http.rb:1070:in `start'
plugins/discourse-ai/lib/completions/endpoints/base.rb:105:in `perform_completion!'
plugins/discourse-ai/lib/completions/endpoints/open_ai.rb:44:in `perform_completion!'
plugins/discourse-ai/lib/completions/llm.rb:281:in `generate'
plugins/discourse-ai/lib/configuration/llm_validator.rb:36:in `run_test'
plugins/discourse-ai/app/controllers/discourse_ai/admin/ai_llms_controller.rb:128:in `test'
actionpack (7.2.2.1) lib/action_controller/metal/basic_implicit_render.rb:8:in `send_action'
actionpack (7.2.2.1) lib/abstract_controller/base.rb:226:in `process_action'
actionpack (7.2.2.1) lib/action_controller/metal/rendering.rb:193:in `process_action'
actionpack (7.2.2.1) lib/abstract_controller/callbacks.rb:261:in `block in process_action'
activesupport (7.2.2.1) lib/active_support/callbacks.rb:121:in `block in run_callbacks'
app/controllers/application_controller.rb:427:in `block in with_resolved_locale'
i18n (1.14.7) lib/i18n.rb:353:in `with_locale'
app/controllers/application_controller.rb:427:in `with_resolved_locale'
activesupport (7.2.2.1) lib/active_support/callbacks.rb:130:in `block in run_callbacks'
activesupport (7.2.2.1) lib/active_support/callbacks.rb:141:in `run_callbacks'
actionpack (7.2.2.1) lib/abstract_controller/callbacks.rb:260:in `process_action'
actionpack (7.2.2.1) lib/action_controller/metal/rescue.rb:27:in `process_action'
actionpack (7.2.2.1) lib/action_controller/metal/instrumentation.rb:77:in `block in process_action'
activesupport (7.2.2.1) lib/active_support/notifications.rb:210:in `block in instrument'
activesupport (7.2.2.1) lib/active_support/notifications/instrumenter.rb:58:in `instrument'
activesupport (7.2.2.1) lib/active_support/notifications.rb:210:in `instrument'
actionpack (7.2.2.1) lib/action_controller/metal/instrumentation.rb:76:in `process_action'
actionpack (7.2.2.1) lib/action_controller/metal/params_wrapper.rb:259:in `process_action'
activerecord (7.2.2.1) lib/active_record/railties/controller_runtime.rb:39:in `process_action'
actionpack (7.2.2.1) lib/abstract_controller/base.rb:163:in `process'
actionview (7.2.2.1) lib/action_view/rendering.rb:40:in `process'
rack-mini-profiler (3.3.1) lib/mini_profiler/profiling_methods.rb:115:in `block in profile_method' 
actionpack (7.2.2.1) lib/action_controller/metal.rb:252:in `dispatch'
actionpack (7.2.2.1) lib/action_controller/metal.rb:335:in `dispatch'
actionpack (7.2.2.1) lib/action_dispatch/routing/route_set.rb:67:in `dispatch'
actionpack (7.2.2.1) lib/action_dispatch/routing/route_set.rb:50:in `serve'
actionpack (7.2.2.1) lib/action_dispatch/routing/mapper.rb:32:in `block in <class:Constraints>'
actionpack (7.2.2.1) lib/action_dispatch/routing/mapper.rb:62:in `serve'
actionpack (7.2.2.1) lib/action_dispatch/journey/router.rb:53:in `block in serve'
actionpack (7.2.2.1) lib/action_dispatch/journey/router.rb:133:in `block in find_routes'
actionpack (7.2.2.1) lib/action_dispatch/journey/router.rb:126:in `each'
actionpack (7.2.2.1) lib/action_dispatch/journey/router.rb:126:in `find_routes'
actionpack (7.2.2.1) lib/action_dispatch/journey/router.rb:34:in `serve'
actionpack (7.2.2.1) lib/action_dispatch/routing/route_set.rb:896:in `call'
lib/middleware/omniauth_bypass_middleware.rb:35:in `call'
rack (2.2.11) lib/rack/tempfile_reaper.rb:15:in `call'
rack (2.2.11) lib/rack/conditional_get.rb:27:in `call'
rack (2.2.11) lib/rack/head.rb:12:in `call'
actionpack (7.2.2.1) lib/action_dispatch/http/permissions_policy.rb:38:in `call'
lib/content_security_policy/middleware.rb:12:in `call'
lib/middleware/anonymous_cache.rb:409:in `call'
lib/middleware/csp_script_nonce_injector.rb:12:in `call'
config/initializers/008-rack-cors.rb:26:in `call'
rack (2.2.11) lib/rack/session/abstract/id.rb:266:in `context'
rack (2.2.11) lib/rack/session/abstract/id.rb:260:in `call'
actionpack (7.2.2.1) lib/action_dispatch/middleware/cookies.rb:704:in `call'
actionpack (7.2.2.1) lib/action_dispatch/middleware/callbacks.rb:31:in `block in call'
activesupport (7.2.2.1) lib/active_support/callbacks.rb:101:in `run_callbacks'
actionpack (7.2.2.1) lib/action_dispatch/middleware/callbacks.rb:30:in `call'
actionpack (7.2.2.1) lib/action_dispatch/middleware/debug_exceptions.rb:31:in `call'
actionpack (7.2.2.1) lib/action_dispatch/middleware/show_exceptions.rb:32:in `call'
logster (2.20.1) lib/logster/middleware/reporter.rb:40:in `call'
railties (7.2.2.1) lib/rails/rack/logger.rb:41:in `call_app'
railties (7.2.2.1) lib/rails/rack/logger.rb:29:in `call'
config/initializers/100-quiet_logger.rb:20:in `call'
config/initializers/100-silence_logger.rb:29:in `call'
actionpack (7.2.2.1) lib/action_dispatch/middleware/request_id.rb:33:in `call'
lib/middleware/enforce_hostname.rb:24:in `call'
rack (2.2.11) lib/rack/method_override.rb:24:in `call'
actionpack (7.2.2.1) lib/action_dispatch/middleware/executor.rb:16:in `call'
rack (2.2.11) lib/rack/sendfile.rb:110:in `call'
plugins/discourse-prometheus/lib/middleware/metrics.rb:14:in `call'
rack-mini-profiler (3.3.1) lib/mini_profiler.rb:334:in `call'
lib/middleware/processing_request.rb:12:in `call'
message_bus (4.3.9) lib/message_bus/rack/middleware.rb:60:in `call'
lib/middleware/request_tracker.rb:385:in `call'
actionpack (7.2.2.1) lib/action_dispatch/middleware/remote_ip.rb:96:in `call'
railties (7.2.2.1) lib/rails/engine.rb:535:in `call'
railties (7.2.2.1) lib/rails/railtie.rb:226:in `public_send'
railties (7.2.2.1) lib/rails/railtie.rb:226:in `method_missing'
rack (2.2.11) lib/rack/urlmap.rb:74:in `block in call'
rack (2.2.11) lib/rack/urlmap.rb:58:in `each'
rack (2.2.11) lib/rack/urlmap.rb:58:in `call'
unicorn (6.1.0) lib/unicorn/http_server.rb:634:in `process_client'
unicorn (6.1.0) lib/unicorn/http_server.rb:739:in `worker_loop'
unicorn (6.1.0) lib/unicorn/http_server.rb:547:in `spawn_missing_workers'
unicorn (6.1.0) lib/unicorn/http_server.rb:143:in `start'
unicorn (6.1.0) bin/unicorn:128:in `<top (required)>'
vendor/bundle/ruby/3.3.0/bin/unicorn:25:in `load'
vendor/bundle/ruby/3.3.0/bin/unicorn:25:in `<main>'

Viewing the log errors are:

Embeddings Model

Error message in logs: (every 5 mins) Connection reset by peer (Faraday::ConnectionFailed)

application_version: 00907363d4b290df1c755df1a2494b95265e40b4

job: Jobs::EmbeddingsBackfill

Embeddings model error Stack trace

Embeddings model error Stack trace
Job exception: 5 errors
Connection reset by peer (Faraday::ConnectionFailed)
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/openssl-3.3.0/lib/openssl/buffering.rb:217:in `sysread_nonblock'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/openssl-3.3.0/lib/openssl/buffering.rb:217:in `read_nonblock'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-protocol-0.2.2/lib/net/protocol.rb:218:in `rbuf_fill'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-protocol-0.2.2/lib/net/protocol.rb:199:in `readuntil'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-protocol-0.2.2/lib/net/protocol.rb:209:in `readline'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-http-0.6.0/lib/net/http/response.rb:625:in `read_chunked'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-http-0.6.0/lib/net/http/response.rb:595:in `block in read_body_0'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-http-0.6.0/lib/net/http/response.rb:570:in `inflater'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-http-0.6.0/lib/net/http/response.rb:593:in `read_body_0'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-http-0.6.0/lib/net/http/response.rb:363:in `read_body'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-http-0.6.0/lib/net/http/response.rb:401:in `body'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-http-0.6.0/lib/net/http/response.rb:321:in `reading_body'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-http-0.6.0/lib/net/http.rb:2430:in `transport_request'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/net-http-0.6.0/lib/net/http.rb:2384:in `request'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/patches/net_patches.rb:19:in `block in request_with_mini_profiler'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/mini_profiler/profiling_methods.rb:50:in `step'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/patches/net_patches.rb:18:in `request_with_mini_profil...
Backtrace
concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/promises.rb:1268:in `raise' 
concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/promises.rb:1268:in `wait_until_resolved!' 
concurrent-ruby-1.3.5/lib/concurrent-ruby/concurrent/promises.rb:998:in `value!' 
/var/www/discourse/plugins/discourse-ai/lib/embeddings/vector.rb:50:in `gen_bulk_reprensentations' 
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:134:in `block in populate_topic_embeddings' 
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:133:in `each' 
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:133:in `each_slice' 
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:133:in `populate_topic_embeddings' 
/var/www/discourse/plugins/discourse-ai/app/jobs/scheduled/embeddings_backfill.rb:36:in `execute' 
/var/www/discourse/app/jobs/base.rb:316:in `block (2 levels) in perform' 
rails_multisite-6.1.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
rails_multisite-6.1.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
/var/www/discourse/app/jobs/base.rb:303:in `block in perform' 
/var/www/discourse/app/jobs/base.rb:299:in `each' 
/var/www/discourse/app/jobs/base.rb:299:in `perform' 
/var/www/discourse/app/jobs/base.rb:379:in `perform' 
mini_scheduler-0.18.0/lib/mini_scheduler/manager.rb:137:in `process_queue' 
mini_scheduler-0.18.0/lib/mini_scheduler/manager.rb:77:in `worker_loop' 
mini_scheduler-0.18.0/lib/mini_scheduler/manager.rb:63:in `block (2 levels) in ensure_worker_threads' 

Inference LLM Model

Error message in logs: Job exception: Connection reset by peer

application_version: 00907363d4b290df1c755df1a2494b95265e40b4

job: Jobs::CreateAiReply

LLM model error Stack trace

LLM model error Stack trace
Message
Job exception: Connection reset by peer
Backtrace
openssl-3.3.0/lib/openssl/buffering.rb:217:in `sysread_nonblock' 
openssl-3.3.0/lib/openssl/buffering.rb:217:in `read_nonblock' 
net-protocol-0.2.2/lib/net/protocol.rb:218:in `rbuf_fill' 
net-protocol-0.2.2/lib/net/protocol.rb:199:in `readuntil' 
net-protocol-0.2.2/lib/net/protocol.rb:209:in `readline' 
net-http-0.6.0/lib/net/http/response.rb:625:in `read_chunked' 
net-http-0.6.0/lib/net/http/response.rb:595:in `block in read_body_0' 
net-http-0.6.0/lib/net/http/response.rb:570:in `inflater' 
net-http-0.6.0/lib/net/http/response.rb:593:in `read_body_0' 
net-http-0.6.0/lib/net/http/response.rb:363:in `read_body' 
/var/www/discourse/plugins/discourse-ai/lib/completions/endpoints/base.rb:374:in `non_streaming_response' 
/var/www/discourse/plugins/discourse-ai/lib/completions/endpoints/base.rb:160:in `block (2 levels) in perform_completion!' 
net-http-0.6.0/lib/net/http.rb:2433:in `block in transport_request' 
net-http-0.6.0/lib/net/http/response.rb:320:in `reading_body' 
net-http-0.6.0/lib/net/http.rb:2430:in `transport_request' 
net-http-0.6.0/lib/net/http.rb:2384:in `request' 
rack-mini-profiler-3.3.1/lib/patches/net_patches.rb:19:in `block in request_with_mini_profiler' 
rack-mini-profiler-3.3.1/lib/mini_profiler/profiling_methods.rb:50:in `step' 
rack-mini-profiler-3.3.1/lib/patches/net_patches.rb:18:in `request_with_mini_profiler' 
(eval at /var/www/discourse/lib/method_profiler.rb:38):5:in `request'
/var/www/discourse/plugins/discourse-ai/lib/completions/endpoints/base.rb:122:in `block in perform_completion!' 
net-http-0.6.0/lib/net/http.rb:1632:in `start' 
net-http-0.6.0/lib/net/http.rb:1070:in `start' 
/var/www/discourse/plugins/discourse-ai/lib/completions/endpoints/base.rb:105:in `perform_completion!' 
/var/www/discourse/plugins/discourse-ai/lib/completions/endpoints/open_ai.rb:44:in `perform_completion!' 
/var/www/discourse/plugins/discourse-ai/lib/completions/llm.rb:281:in `generate' 
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/bot.rb:65:in `get_updated_title' 
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/playground.rb:252:in `title_playground' 
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/playground.rb:561:in `ensure in reply_to' 
/var/www/discourse/plugins/discourse-ai/lib/ai_bot/playground.rb:561:in `reply_to' 
/var/www/discourse/plugins/discourse-ai/app/jobs/regular/create_ai_reply.rb:18:in `execute' 
/var/www/discourse/app/jobs/base.rb:316:in `block (2 levels) in perform' 
rails_multisite-6.1.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
rails_multisite-6.1.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
/var/www/discourse/app/jobs/base.rb:303:in `block in perform' 
/var/www/discourse/app/jobs/base.rb:299:in `each' 
/var/www/discourse/app/jobs/base.rb:299:in `perform' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:202:in `execute_job' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:170:in `block (2 levels) in process' 
sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:177:in `block in invoke' 
/var/www/discourse/lib/sidekiq/pausable.rb:132:in `call' 
sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:179:in `block in invoke' 
sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:182:in `invoke' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:169:in `block in process' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:136:in `block (6 levels) in dispatch' 
sidekiq-6.5.12/lib/sidekiq/job_retry.rb:113:in `local' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:135:in `block (5 levels) in dispatch' 
sidekiq-6.5.12/lib/sidekiq.rb:44:in `block in <module:Sidekiq>' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:131:in `block (4 levels) in dispatch' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:263:in `stats' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:126:in `block (3 levels) in dispatch' 
sidekiq-6.5.12/lib/sidekiq/job_logger.rb:13:in `call' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:125:in `block (2 levels) in dispatch' 
sidekiq-6.5.12/lib/sidekiq/job_retry.rb:80:in `global' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:124:in `block in dispatch' 
sidekiq-6.5.12/lib/sidekiq/job_logger.rb:39:in `prepare' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:123:in `dispatch' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:168:in `process' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:78:in `process_one' 
sidekiq-6.5.12/lib/sidekiq/processor.rb:68:in `run' 
sidekiq-6.5.12/lib/sidekiq/component.rb:8:in `watchdog' 
sidekiq-6.5.12/lib/sidekiq/component.rb:17:in `block in safe_thread' 

Any ideas, suggestions, etc would be most welcome

إعجاب واحد (1)

لقد أجرينا أكثر من 2000 طلب لواجهة برمجة تطبيقات OpenAI على هذا الموقع في الشهر الماضي، وليس لدينا أي أخطاء من هذا القبيل في سجلاتنا.

لقد دخلت إلى موقع آخر نستضيفه يستخدم OpenAI بكثافة، مع 80 ألف طلب في الشهر الماضي وليس لديهم أي أخطاء Faraday، وفقط خطأين “Connection Reset by Peer” في تلك الفترة.

هل من الممكن أن يكون لديك شيء سيء في شبكة الخادم الخاص بك؟ لقد واجهت Errno::ECONNRESET (Connection reset by peer) مرة واحدة بسبب برنامج تشغيل بطاقة شبكة معيب.

سألقي نظرة على هذا.

إعجاب واحد (1)

كما كتبت، اعتقدت في الأصل أن هذه مشكلة في مكدس الشبكة، ومع ذلك، فإن استدعاءات OpenAI API المتكررة من نفس الحاوية تعمل بشكل جيد.

هذا أيضًا مع أحدث إصدار، التزامات اعتبارًا من 21 فبراير.

لإثبات ذلك (بتكلفة حرق الرموز)، قمت بإعداد برنامج نصي سريع للتحقق من مكدس شبكة OpenAI.

  • يعمل لمدة 600 ثانية (10 دقائق)
  • يقوم بإجراء مكالمة إكمال دردشة واحدة في الثانية
  • يغير المطالبة لتجنب ذاكرة التخزين المؤقت

قم بالتنفيذ داخل الحاوية ./launcher enter app، واحفظ البرنامج النصي أدناه، واجعله قابلاً للتنفيذ باستخدام chmod +x test_openai.sh ثم قم باستدعائه باستخدام OPENAI_API_KEY=.... ./test_openai.sh

test_openai.sh
#!/bin/bash

# Duration to run
DURATION_SECS=600

# Initialize counters
successful=0
unsuccessful=0
declare -A error_messages

# Function to calculate percentage
calc_percentage() {
    local total=$(($1 + $2))
    if [ $total -eq 0 ]; then
        echo "0.00"
    else
        echo "scale=2; ($2 * 100) / $total" | bc
    fi
}

# Function to print statistics
print_stats() {
    local percent=$(calc_percentage $successful $unsuccessful)
    echo "-------------------"
    echo "Successful calls: $successful"
    echo "Failed calls: $unsuccessful"
    echo "Failure rate: ${percent}%"
    echo "Error messages:"
    for error in "${!error_messages[@]}"; do
        echo "  - $error (${error_messages[$error]} times)"
    done
}

end_time=$((SECONDS + DURATION_SECS))

counter=1
while [ $SECONDS -lt $end_time ]; do
    # Make the API call with timeout
    response=$(curl -s -w "\n%{http_code}" \
        -X POST \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $OPENAI_API_KEY" \
        -d "{
            \"model\": \"gpt-4o-mini\",
            \"messages\": [{\"role\": \"user\", \"content\": \"Use this number to choose a one word response: $counter\"}]
        }" \
        --connect-timeout 5 \
        --max-time 10 \
        https://api.openai.com/v1/chat/completions 2>&1)

    # Get the last line (status code) and response body
    http_code=$(echo "$response" | tail -n1)
    body=$(echo "$response" | sed '$d')

    # Check if the call was successful
    if [ "$http_code" = "200" ]; then
        ((successful++))
    else
        ((unsuccessful++))
        # Extract error message
        error_msg=$(echo "$body" | grep -o '"message":"[^"]*"' | cut -d'"' -f4)
        if [ -z "$error_msg" ]; then
            error_msg="Connection error: $body"
        fi
        # Increment error message counter
        ((error_messages["$error_msg"]++))
    fi

    # Print current statistics
    print_stats

    ((counter++))
    
    # Wait for 1 second before next call
    sleep 1
done

بالنسبة للبرنامج النصي للاختبار، كان معدل الفشل لدي أقل من 0.5٪، وهو أمر مقبول بهذا الحجم.

هذا يخبرني أن المشكلة تكمن في برنامج Discourse البرمجي بدلاً من الحاوية أو مكدس الشبكة الذي يدعمها.

إذا لم يتم إصلاحه بتثبيت حديث، فسألقي نظرة أعمق عليه.

إعجابَين (2)

لقد أصلحت الانحدار حول o1-mini و o1-preview هنا:

لكنني مرتبك بشأن مشكلات SSL، لم نغير أي شيء في مكتبتنا الأساسية هنا.

ربما يتعلق هذا بالبث، حاول تعطيل البث على نموذج اللغة الكبير الخاص بـ OpenAI الخاص بك وانظر ما إذا كان هذا يحل المشكلة بنفسه؟ اختبارك هناك يقوم بـ gpt-4o-mini دون استخدام البث.

3 إعجابات

هذا رائع! أحسنت صنعًا!

في عملية التشخيص، وجدت خطأً آخر - في صفحة تكوين LLM (/admin/plugins/discourse-ai/ai-llms/%/edit)، يؤدي تحديد أي من الخيارين "تعطيل دعم الأدوات الأصلي (استخدام أدوات تعتمد على XML) (اختياري)" أو "تعطيل إكمال التدفق (تحويل طلبات التدفق إلى طلبات غير تدفقية)" والنقر فوق حفظ إلى عرض رسالة مؤقتة “نجاح!”، ولكن بعد إعادة تحميل الصفحة، يتم إلغاء تحديد كلا الخيارين أو أحدهما.

لا تزال مشكلات إعادة تعيين الاتصال قائمة وما زلت أبحث فيها، ومع ذلك يبدو أنها مزيج من التعليمات البرمجية Ruby (FinalDestination / تحليل DNS / Faraday) ومعالجة المقابس، جنبًا إلى جنب مع حاوية Debian 12 على جهاز افتراضي Ubuntu 24.04.

لقد قمت بتشغيل جهاز افتراضي تجريبي Ubuntu 22.04 ولا توجد مشكلات، وجميع التضمينات والاستدلال تعمل بشكل مثالي. لم أرَ إعادة تعيين واحدة.

سأستمر في العمل على ذلك، ربما يتعلق الأمر بطريقة جديدة تدير بها Ubuntu 24.04 مكدس TCP باستخدام netplan.

إعجابَين (2)

شكرا، تم حل مشكلة الاستمرارية اليوم، هل يمكنك الترقية والمحاولة مرة أخرى

3 إعجابات

حسنًا، تحديث بسيط - لم نتمكن من تشغيل اتصال OpenAI API المباشر على نطاق IP الخاص بالشركة. سترسل Cloudflare حزم RST بعد حوالي 1 مللي ثانية بعد TLS.

لذلك، قمنا بإعداد Cloudflare AI Gateway كبديل مباشر لنقطة نهاية OpenAI API من خلال عنوان URL، وهي تعمل بشكل لا تشوبه شائبة مع تكوين LLM.

يبدو أن Cloudflare لديها سياسة حدود معدل غير موثقة لنطاقات IP غير المعروفة (أي، ليست Azure أو AWS أو GCP، إلخ) والتي يتم تفعيلها. تجمع الاتصالات البالغ عددها 100 لـ Embeddings سيؤدي إلى تجاوز هذا الحد.

بشكل منفصل، تمتلك Cloudflare ميزة Authenticated Gateway التي تضيف رمز رأس خاص.

من وثائقهم:

curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
  --header 'cf-aig-authorization: Bearer {CF_AIG_TOKEN}' \
  --header 'Authorization: Bearer OPENAI_TOKEN' \
  --header 'Content-Type: application/json' \
  --data '{"model": "gpt-4o" .......

سيكون من الرائع لو كانت هناك ميزة لإضافة رؤوس لكل LLM في شاشة تكوين LLM.

بهذه الطريقة، يمكننا إضافة مفتاح وقيمة cf-aig-authorization إلى LLM لكل استدعاء نقوم به.

هذه مهمة صعبة، إنها الكثير من واجهة المستخدم لحالة هامشية.

هل هناك أي فرصة يمكنك تجربة openrouter.ai فقد يحل هذا الأمر أيضًا؟

أنا لست ضد السماح برؤوس تعسفية بشكل قاطع، ولكنها تكوين متقدم جدًا. ربما إذا كانت خلف إعداد موقع مخفي لكان الأمر جيدًا (إعداد موقع لتمكين واجهة المستخدم المتقدمة).

هل يمكن لشركتك المساعدة في دفع هذا المساهمة للمكون الإضافي مفتوح المصدر؟

إعجاب واحد (1)

لم أتمكن من الحصول على موافقة للمساهمة، ومع ذلك سنواصل العمل على ذلك. شكرًا للمساعدة حتى الآن!