The @merefield plugin has been around for longer and has many more knobs to configure it. AI Bot is also a bit more ambitious (especially since we have GPT 4 access) in that we attempt to integrate it into the Discourse experience - it knows how to search and summarize topics, for example.
Notable differences as of today are probably:

- We stream replies and offer a stop button
- @merefield offers a lot more settings to tune stuff
- We offer a "command" framework for getting the bot to act on your behalf - albeit the experience is fairly flaky on GPT 3.5
- @merefield offers Discourse chat integration atm, we do not yet
How can we use it with Stable Diffusion? I have the API but I don't know how to prompt it (I tried to work that out from the code but didn't succeed).
To add: from my tests, it looks like AI Bot only works in PMs and Chatbot works everywhere, unless I'm doing something wrong with AI Bot.
Image generation and streaming are nicely done, as well as the search API; however, it sometimes still falls back to "I can't search the web or generate images". Are you using something similar to LangChain agents, which decide what steps to take?
Are we supposed to create a CX with scope for the full web, or just our instance URL?
That is correct. We will probably get to wider integration, but are taking our time here and trying to polish the existing stuff first.
Yes, this is the very frustrating thing about GPT 3.5 vs 4. Grounding the model for 3.5 is just super duper hard.
I am considering having an intermediary step in GPT 3.5 that triages prior to actually responding (e.g.: does this INTERACTION look like it should result in a !command, and if so, which?). It would sadly add cost and delay, so this is my last resort.
We use a "sort of" langchain, limited to 5 steps, but we try to be very frugal with tokens, so the balance is hard.
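For the curious, a minimal sketch of what a step-capped loop like that can look like, using the OpenAI Python client. The !command convention and the `run_command` helper are invented here for illustration; this is not the plugin's actual code:

```python
# Sketch of a "sort of langchain" loop capped at 5 steps.
# Assumes OPENAI_API_KEY is set; run_command() is a placeholder.
from openai import OpenAI

client = OpenAI()
MAX_STEPS = 5

def run_command(command: str) -> str:
    """Placeholder dispatcher: a real version would route to search,
    image generation, etc."""
    return f"(result of {command})"

def reply(messages: list[dict]) -> str:
    text = ""
    for _ in range(MAX_STEPS):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=messages,
        )
        text = response.choices[0].message.content
        if text.strip().startswith("!"):
            # The model asked for a command: run it, feed the result
            # back as a new turn, and loop again.
            messages.append({"role": "assistant", "content": text})
            messages.append({"role": "user", "content": run_command(text.strip())})
        else:
            return text  # a plain answer ends the loop early
    return text  # step budget exhausted: return the last output
```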
Up to you… I like having access to all of Google, it is mighty handy.
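For reference, a query against a Google Programmable Search Engine looks roughly like this (placeholder key and engine ID). Note that the whole-web vs. site-only scope is configured on the engine's control panel, not per request:

```python
# Sketch of a Google Programmable Search Engine (CSE) query.
# "cx" is the engine ID discussed above; key and query are placeholders.
import requests

params = {
    "key": "YOUR_GOOGLE_API_KEY",  # placeholder credential
    "cx": "YOUR_ENGINE_ID",        # the CX in question
    "q": "dark mode site feedback",
}
resp = requests.get("https://www.googleapis.com/customsearch/v1", params=params)
for item in resp.json().get("items", []):
    print(item["title"], item["link"])
```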
What I do to ground 3.5 is add a second, shorter system prompt lower in the final prompt to "remind" the model of some of the rules in the main system prompt.
So it would look something like this (typing from phone, trying…):
system role
user
assistant
…
…
system role "reminder"
new user prompt
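In OpenAI chat-completion terms, that ordering might look like the sketch below. The message contents are made up; the repeated, condensed system message near the end is the point:

```python
# Sketch of the "reminder" trick: repeat a condensed system prompt
# low in the message list so GPT 3.5 weighs it more heavily.
messages = [
    {"role": "system",
     "content": "You are a Discourse forum bot. Act only via !commands."},
    {"role": "user", "content": "Find topics about dark mode."},
    {"role": "assistant", "content": "!search dark mode"},
    # ... more conversation history ...
    {"role": "system",
     "content": "Reminder: act only via !commands; never say you cannot search."},
    {"role": "user", "content": "Now summarize the top result."},
]
```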
Just by repeating the most important system role contents, the model gives them more weight. I've been using this workaround for a few months now without too many strange responses.
Especially as prompts become longer, the model tends to "forget" things that are higher up in the final prompt. Things in AI are very hacky; I experience this in GPT models and langchain as well. Just today I got such a strong personality in langchain that, when asking the time in a random city, the actions were "checking my watch", "changing the timezone of my watch" and "asking a stranger".
I'm assuming you rely on formatted LLM output to decide the next action to take. That works way better with temperatures close to zero, which should help ground 3.5 and greatly improve results.
Yeah, I am working on splitting this into 2 prompts at the moment:

1. For triage
2. For answering
It is a rather big refactor of this code base, but it will allow us to have 2 temps at play, and from local testing I think this grounds both Claude and GPT 3.5.
We end up wasting one API call, but we save a tiny bit on the system prompt and may be able to shave off more.
Without a dedicated triage call, I don't think we have a chance with GPT 3.5.
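Roughly, the split could look like this sketch with the OpenAI Python client: a near-zero-temperature triage call that only emits a command token (per the earlier point about formatted output), then a separate, warmer call that writes the actual reply. The model names, temperatures, prompts, and command list here are illustrative, not the plugin's real values:

```python
# Sketch of the two-prompt split: cold triage call, then warm answer call.
from openai import OpenAI

client = OpenAI()

def triage(user_post: str) -> str:
    """Cold call: classify the post into a single command token."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0,  # formatted output is far more reliable near zero
        messages=[
            {"role": "system",
             "content": "Reply with exactly one of: !search, !image, NONE."},
            {"role": "user", "content": user_post},
        ],
    )
    return resp.choices[0].message.content.strip()

def answer(user_post: str, command_result: str = "") -> str:
    """Warmer call: write the actual reply, with any command output as context."""
    context = f"Command output:\n{command_result}\n\n" if command_result else ""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0.7,  # a freer temperature for the conversational reply
        messages=[
            {"role": "system", "content": "You are a helpful forum bot."},
            {"role": "user", "content": context + user_post},
        ],
    )
    return resp.choices[0].message.content
```

The extra call is the "wasted" one mentioned above, but it lets each prompt stay short and each temperature fit its job.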
Maybe not within scope, but it would be interesting to train a model on all the posts in my forum and use it to create an expert-user AI bot that users could interact with, or that could answer questions from users on its own in threads, linking to/quoting relevant posts from the past.
I don't want perfect to be the enemy of good here.
I changed it so all the extra fancy stuff, like search integration and image generation, is only implemented on GPT 4, which is able to properly deal with the very complicated prompt.
I have some ideas on bringing these features to GPT 3.5 / Claude as well, but in the interim the basics are mighty useful on the simpler models.
- Multiple people can interact with LLMs in a single session (something that is not possible in chat.openai.com)
- Stuff streams like it does in the official UIs and can be cancelled.
- We get access to our Markdown engine, so you can get it to draw Mermaid diagrams and other fancy things.
So this is very useful for general purpose tasks on the simpler models today.
afaik self-hosting open AI models costs more than 10 arms and 10 legs, so affording the resources to submit a PR here should be easy… totally open to a PR that adds a site setting for this.