I have been playing with the bot and it’s already great. Can I add a voice to enable semantic search? In my experiments so far this would make the bot much smarter, at least on our site. If I compare the results it finds and summarises or answers-using these are much worse than those it would use if it were doing semantic search.
Relatedly, is it possible to enable semantic search as the default when using
/? Again, I think most of our users would prefer the results. We have lots of knowledgebase type posts which don’t always use the keywords that people would actually say/search for but which are semantically related. This means a trad search tends to throw up lower-quality posts with people discussing issues informally, and are not the canonical answer to the question they have.
Can you share some example prompt/questions?
I too think the AI Bot is of great benefit for sites but my ideas of prompts/questions are not necessarily those of others and so looking for other prompts/questions for possible demonstration purposes.
Tbh even if you used the current semantic search as part of the bot workflow that would be cool. It works nicely now but just doesn’t have the right info context, even when the current semantic search would find them.
Would be great to have control over the prompts but I can see the ui is hard there too because it would be easy to break stuff with the wrong prompt. I think adding the concept of a persona for the bot which is entered as a system prompt might be a nice intro there.
Absolutely, this is something I really want, in fact I want forum admins to be allowed to craft custom personas with custom command sets
I can really see prompt tweaking being useful for us, although having done a bit of amateur “prompt engineering” for another project recently I think it will take a bit of handholding and lots of examples for people who are not familiar and the UI would probably benefit from a set of examples/default choices like “chatty/fun”, “neutral/accurate” through to “bookish/nerdy” to show how the persona wordings can can change the response.
I have also found that gauging the effect of prompt wording changes can be quite hard because of the inherent randomness of the models, and also because the effects may vary by the subject matter of the prompt. It might be nice to develop a standard test suite of user-inputs and use these to give a dry-run of how persona or instruction changes would alter the bot outputs. I guess this would be useful for your team too… although once the test set gets big you end up with the issue of how to evaluate it without taking lots of time.
Another dimension that I think users might often want to tweak is how strictly the llm sticks to the source material provided in the prompt. In my testing you have the be quite explicit (and repetitive) in instructing the model not import knowledge from outside the context and make it clear (more instructions) that you would rather have no answer than bad answers. You can also control the degree to which the models “shows its workings” and cites sources/gives examples, and I think that is often a good way of avoid hallucinations/bullshit responses when the context doesn’t include the actual answer or relevant material.
One final comment … I can see here that you guys have been worried about costs and being frugal with tokens which I guess makes sense for very large sites. However for smaller or higher-financial-value applications (e.g. customer support) I actually don’t think it would be a big deal, and this is only going to go down over time. The cost of extra queries to separate classifiers which sanity check the response, or implement user-defined “guard rails”, would definitely be worth it for us. For example, we have found prompts like “does this answer contain information not found in these sources” to be quite diagnostic, and definitely worth running before presenting information to the users. GPT 3.5 is definitely fine for this sort of task, even if the main job ran with GPT4.