How do I use hugging face paid inference endpoints as Discourse custom LLMs

I have several open-source LLM models running on the Hugging Face Inference Endpoints service (essentially managed AWS) …

For all the models I have tested (Llama, Phi, Gemma, etc.) … I’m able to connect from the Discourse LLM settings page, but inference doesn’t work. Here’s the error:

“Trying to contact the model returned this error: Failed to deserialize the JSON body into the target type: missing field `inputs` at line 1 column 163”
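For anyone hitting this later, here is my reading of the error (an inference on my part, not confirmed in this thread): the endpoint’s deserializer is looking for an `inputs` field, which is the shape of Hugging Face’s native text-generation payload, while Discourse’s LLM providers send a chat-completion-style body. A minimal comparison of the two shapes, with illustrative values:

```python
# Chat-completion-style body, roughly what an OpenAI-dialect client sends.
# (Field values here are placeholders for illustration.)
openai_payload = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Native Hugging Face text-generation body: the prompt goes in "inputs".
hf_payload = {
    "inputs": "Hello",
    "parameters": {"max_new_tokens": 64},
}

# The error says the server could not find "inputs" in the JSON it received,
# which is consistent with it having been sent the chat-style body instead.
print("inputs" in openai_payload)  # False
print("inputs" in hf_payload)      # True
```

So the mismatch is between request formats, not credentials or connectivity, which is why the connection test succeeds but inference fails.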

What am I doing wrong!? Thanks much…

View from Hugging Face:

View from Discourse:

It’s been over a year since I last tried their API. Is it OpenAI compatible nowadays? If so you can try setting Provider to OpenAI and pointing to their endpoint.
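To expand on that suggestion: if the endpoint is running Text Generation Inference (TGI), recent versions also expose an OpenAI-compatible Messages API under `/v1/chat/completions`, so the usual stumbling block is pointing the client at the right path rather than the endpoint root. A sketch, where the endpoint URL is a placeholder you would replace with your own:

```python
# Hypothetical Inference Endpoint URL -- substitute your own.
base_url = "https://my-endpoint.us-east-1.aws.endpoints.huggingface.cloud"

# TGI's OpenAI-compatible route lives under /v1/chat/completions,
# so an OpenAI-style provider should use base_url + "/v1" as its API base.
chat_url = f"{base_url}/v1/chat/completions"

# OpenAI-style request body (contrast with the native {"inputs": ...} shape
# the non-OpenAI route expects).
payload = {
    "model": "tgi",  # TGI serves one model, so the name here is a placeholder
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 64,
}
print(chat_url)
```

If the endpoint is running an older TGI image or a different task handler, this route may not exist, in which case only the native `inputs` format will be accepted.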

I have tried almost all the providers available on the Discourse LLM setup screen, including OpenAI…

They either return the “Failed to deserialize the JSON body into the target type” error or “Internal Server Error”.

I also tried an actual OpenAI model on the HF endpoint service (GPT-2! :slight_smile:) but that didn’t work… same sort of errors.