Self-Hosting an OpenSource LLM for DiscourseAI

I’m using vLLM too. I also would recommend the openchat v3.5 0106 model, which is a 7B parameter model which performs very well.

I actually am running it in 4bit quantized so that it runs faster.