Debugging adding a new LLM

I’m trying to add a custom LLM to the Discourse AI plugin. When I press the ‘Test’ button I get “Internal Server Error”.

Is there a way to debug this or get a better error message? When I go into the Docker container and curl /v1/models, it responds correctly.
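
For reference, the check I’m doing from inside the container is roughly the following (just a minimal sketch; the address is my local inference server, so adjust the base URL to your setup):

```python
# Minimal sketch of the connectivity check, assuming an OpenAI-compatible
# server reachable from the Discourse container (adjust the base URL to yours).
import json
import urllib.request

BASE_URL = "http://172.17.0.1:8081/v1"  # assumption: my local inference server

with urllib.request.urlopen(f"{BASE_URL}/models") as resp:
    models = json.load(resp)

# The "id" field of each entry is the model name to paste into the plugin,
# e.g. "models/Meta-Llama-3-8B-Instruct.Q6_K.gguf".
for model in models.get("data", []):
    print(model["id"])
```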

The model name is “models/Meta-Llama-3-8B-Instruct.Q6_K.gguf” and I’m not sure whether there could be any issue with special characters.

Trying another one gives: Trying to contact the model returned this error: {"error":{"code":404,"message":"File Not Found","type":"not_found_error"}}

But it doesn’t display which URL/model it is trying to fetch, which would help with debugging.

The same settings pasted into Open WebUI were able to contact both LLM endpoints and run inference correctly.

What inference server are you using? vLLM?

When configuring the URL, add the path /v1/chat/completions at the end.

This was the issue. Note that in most LLM software, it is customary to enter only up to /v1 as the endpoint URL; the /chat/completions etc. path is then normally appended by the software.
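
For anyone else landing here, a minimal sketch of the kind of request that ends up being made once the full path is configured. This is just a plain OpenAI-compatible chat completions call, not the plugin’s actual code, and the URL/model name are from my local setup:

```python
# Hypothetical OpenAI-compatible chat completions request. Here the URL field
# contains the full /v1/chat/completions path, whereas most other clients take
# only the base ".../v1" and append the path themselves.
import json
import urllib.request

URL = "http://172.17.0.1:8081/v1/chat/completions"  # full path, not just .../v1
payload = {
    "model": "models/Meta-Llama-3-8B-Instruct.Q6_K.gguf",
    "messages": [{"role": "user", "content": "Say hello"}],
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])
```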

I’m trying to get one running on localhost to test, so I set the URL to “http://172.17.0.1:8081/v1/chat/completions” and get an internal server error. I’m able to curl “http://172.17.0.1:8081/v1/models” from the Discourse Docker container, so the connectivity is working.

Are there any other gotchas (e.g. does Discourse allow non-https and arbitrary ports for the LLM endpoint)?

Both should work.

What is the error you see in /logs?

Ah. I didn’t know about /logs!

NameError (undefined local variable or method 'tokenizer' for an instance of DiscourseAi::Completions::Dialects::ChatGpt) app/controllers/application_controller.rb:424:in 'block in with_resolved_local

Hmm. The one that works is for a model that I quantized myself. I’ll try to quantize the others to see if it is a model format issue.

Has anyone managed to get the DeepSeek API working? I’m trying to figure out the right incantation to get it to work with Discourse.

I have it working in Open WebUI and other clients.

There’s a topic here about it.
