Advice on a support bot for a technical support forum (Discourse AI vs Discourse Chatbot)

I’m trying to create an AI bot that can complement our human support team on a technical support forum. The goals are to provide support during off-business hours (currently users must wait until we open) and faster support during business hours (humans are slower to respond). If the bot doesn’t resolve the issue, humans would take over and handle it during business hours.

A few design goals are:

  1. It would need to learn from our historical forum data in order to have the domain knowledge to provide support. The data in our forum is much more relevant than general internet knowledge.
  2. It should only attempt to answer on-topic questions. It should be able to help someone troubleshoot our products or use our service but if they ask how to bake a cake it should not attempt to help. It should stick to topics similar to those already in the forum.
  3. It should provide a way for the user to tell the bot to stop answering so they can wait for a human instead if they feel like the bot isn’t helping.

I see there is the AI Bot module in the official Discourse AI plugin.

And there is the original Discourse Chatbot.

Which plugin would be better for this kind of application? I’ve read about each separately and this topic which is similar, but I’d love more explicit advice if you have any to share.

Do either of these plugins automatically learn or fine-tune with historical forum data? We can fine-tune GPT-4 manually (I think) but I’d like to utilize the capabilities of the plugin when possible and not duplicate work that’s already been done by smarter people. :slight_smile:

Do either of these plugins let you restrict the topics the bot will engage in so it doesn’t engage in general, off-topic conversations?

Is the original Discourse Chatbot intended to be developed and maintained long term? Or is it planned to eventually be replaced by the AI Bot in the official plugin, or folded into it? @merefield

4 Likes

I think that you might find the “Agent Mode” of Discourse Chatbot 🤖 (Now smarter than ChatGPT!*) the best fit for your needs.

This plugin is 3rd party and maintained as an open source project by a single passionate developer. As such, it pushes the boundaries of what is possible and is quite a bit more feature-rich.

If you find it works well for you, I’d encourage you to seek out the developer and offer to contribute towards its maintenance.

In contrast, the other (official) plugin/module has been developed by the Discourse Team with an emphasis on stability and will be supported by their impressive Dev team.

(P.S. it isn’t the best etiquette to @mention people here - especially CEOs and busy Devs - unless there is serious need)

5 Likes

Yes, both use RAG to leverage forum data in their responses.

Where would those conversations take place?

  • Topics
  • PMs
  • Chat
  • External application
2 Likes

I’d be happy to pay for help on this. I’ll reach out.

Sorry, I didn’t mean to be rude. I was just mentioning the people who posted about those plugins in case they wanted to chime in.

2 Likes

Initially, probably just PMs. We don’t currently use chat but it does seem like it might work well for this since a bot can reply in real time (unlike our very busy humans) so that might be good as well. Then if it works well in the private conversations I’d want to enable it on public topics.

2 Likes

So I think you should give our AI Bot feature in the Discourse AI plugin a try. Start with our “Forum Helper” persona; it’s an almost perfect fit for your use case.

After that, you can use the New AI Persona Editor for Discourse to create a persona specialized for your needs. Give it both search and read capabilities, write a good system prompt, and you are off to the races.

I gave you access to the feature here on Meta; refresh your browser to see it and play with it.

6 Likes

I have version 0.0.1 running on a copy of our forum. It’s OK but it doesn’t seem that smart. It mostly just searches our forum and provides a list of topics that might be relevant. It doesn’t seem to interpret the information and provide support conversationally. And I don’t have the persona editor. The one on Meta seems to work better so I suspect I don’t have the latest version.

2 Likes

Actually it does do better when I ask it to be more specific and walk me through the steps.

2 Likes

I have no intention of dropping support for Chatbot anytime soon.

It is in any case open source, so support and contribution from others is welcome.

The bot is designed to leverage existing data on your forum to answer user questions.

If you have special requirements that go beyond the current offering and have budget to apply to them, then you can always consider hiring me to help you.

4 Likes

Yep, I sent you a PM about that earlier. At this point the big question is which plugin to use and I don’t know enough about them (or bots in general) to make a confident decision. :slight_smile:

I need someone who is good at this to help us get started. I can pay for the help, and special features if we need any.

Sorry for @ ing you earlier, I didn’t know it was poor etiquette.

4 Likes

Good question. I don’t know how many users here have practical experience with either of the AI bots noted, but I can say that I have been using the Discourse AI bot on the OpenAI site, even during its development stage. I regularly give feedback and suggestions, many of which are listed on this site (list). I also do all of it for free.

Would it be possible for me to get an invite to the site so I can gain a better understanding of it?

If you have questions feel free to ask here or send me a private message.

FYI

If you don’t know what RAG is this is a nice short intro.

3 Likes

May I ask how you decided which of the two AI bot plugins to use in the first place? (Discourse AI or Discourse Chatbot)

1 Like

I did not decide; the Discourse one was the only option available to me.

Since the OpenAI Discourse site is a bit older, it still has the Lounge category. Sam from Discourse (not Sam from OpenAI) granted the TL3 users of the OpenAI site access to the Discourse AI bot; that is how I gained access, and why that bot was the one chosen. However, if you read the posts here and between the lines, you might see that Sam has been working with us, via topics in the Lounge, to get feedback from users who would know what such a bot is capable of doing.

As changes are made to the Discourse AI bot, we see them soon after the code is pushed and installed, and we try them out.

I can’t say anything about the Plugin bot and will not speculate.

HTH

1 Like

After taking a look at a few of the topics, here is what I learned and what I recommend.

The site is for support of the Surety home security system.

The goals for the AI used with the Discourse forum are (ref):

  • Complement our human support team for a technical support forum.
  • It would need to learn from our historical forum data.
  • Have the domain knowledge to provide support.
  • It should only attempt to answer on-topic questions.
  • It should provide a way for the user to tell the bot to stop answering so they can wait for a human instead if they feel like the bot isn’t helping.

And here are some of the questions from Ryan:

  • Do these automatically learn or fine-tune with historical forum data?
  • Not duplicate work that’s already been done by smarter people.
  • Restrict the topics the bot will engage in so it doesn’t engage in general, off-topic conversations?
  • Is the original Discourse Chatbot intended to be developed and maintained long term?
  • Or is it planned to eventually be replaced by the AI Bot in the official plugin, or folded into it?

Here is a question I posted to the Discourse AI bot currently enabled on the copy of the site I tried, along with an answer from ChatGPT for comparison.

The question:

I have an IQP4 with a PowerG to Hardwire. The wired sensors are working. I wanted to run a test on the bell/siren, but could not find the option on the IQP4. I can test the siren on the panel and it sounds along with the IQ Remote, but the hardwired bell does not. I have to trip the alarm to get the hardwired siren to sound.

Is there a way to test the bell on the PowerG to Hardwire from the IQP4?


Here is the reply using the Discourse AI bot with GPT-4


I found a relevant topic on our forum that may help you with your issue.

In the topic Test wired bell on PowerG to hardwire?, the user had a similar issue where they could not find the option to test the bell on the PowerG to hardwire from the IQP4.

Please have a look at this topic for possible solutions. If you still have questions or need further assistance, feel free to ask.


Here is the reply using the Discourse AI bot with GPT-3.5

I found a topic on the Surety Support Forum that discusses testing the wired bell on the PowerG to hardwire connection with the IQ Panel 4 (IQP4). You can find the topic here.

In the topic, the user mentions that they couldn’t find an option to test the bell/siren on the IQP4 itself. However, they were able to test the siren on the panel and it sounded along with the IQ Remote. The hardwired bell, on the other hand, did not sound unless the alarm was tripped.

You can visit the topic for more information and to see if there are any solutions or suggestions provided by other users.


Both the Discourse AI GPT-4 and GPT-3.5 bots found the same topic.



Here is the reply using ChatGPT

I don’t have specific information on the IQ Panel 4’s latest software version or features, as my training only includes knowledge up to January 2022. However, based on general principles of alarm system testing, here are some suggestions that might help you test your hardwired siren connected to the PowerG:

  1. System Test Mode:

    • Check if your alarm panel has a “System Test” or “Walk Test” mode. This mode is typically designed for testing sensors and alarms without triggering a real alarm event. If your panel has this feature, it should allow you to test the siren without setting off a full alarm.
  2. Output Control:

    • Look for an “Output Control” or “Auxiliary Output” option in your alarm panel settings. This feature may allow you to manually trigger outputs, such as the siren, for testing purposes.
  3. Zone Bypass:

    • Some systems allow you to bypass specific zones during testing. If your hardwired siren is associated with a particular zone, you might be able to bypass that zone for testing.
  4. User Manual:

    • Refer to the user manual or documentation that came with your IQ Panel 4. It should provide specific instructions on how to test and control different components of your alarm system.
  5. Technical Support:

    • If you’re unable to find a solution on your own, consider reaching out to the manufacturer’s technical support. They can provide guidance based on the specific features and capabilities of your system.
  6. Panel Software Update:

    • Check if there are any available software updates for your IQ Panel 4. Manufacturers often release updates to add new features or address issues, and an update might include a specific test mode for the siren.

Remember to exercise caution while testing to avoid accidentally triggering a real alarm event. If you’re unsure about any steps, it’s always a good idea to consult the user manual or contact technical support for assistance.

Clearly ChatGPT is giving a more detailed answer. I will go into possible reasons later in the topic.


Note: I do not have access to the Discourse Chatbot plugin by Merefield (GitHub); it would be nice to see its response to the same question posted here for comparison. :slightly_smiling_face:




Here is my feedback.

The AI behind these bots is based on LLMs, so their knowledge is limited by the training set. The bots also use RAG to augment that knowledge by accessing the topics in the forum.

The bots typically send a system prompt (think: persona), which is distinct from the user prompt, along with the request to the LLM API.

I will let ChatGPT explain the difference.

ChatGPT

What is a system prompt? How is it different from a user prompt?

Completion

A system prompt and a user prompt are terms often used in the context of language models and conversational AI systems like GPT-3.5. Let me clarify the difference between them:

  1. User Prompt:

    • A user prompt is the input provided by a human user to initiate a conversation or request information from the AI system.
    • It’s the text or query that a person types or speaks to interact with the AI model.
    • For example, if you’re using a chatbot or a language model, your input asking a question or making a statement would be the user prompt.
  2. System Prompt:

    • A system prompt, on the other hand, is the initial input or instruction given to the AI model to set the context or guide its behavior for a conversation.
    • It’s used to provide the AI with a starting point or context for generating responses.
    • The system prompt helps shape the tone, style, or content of the AI’s responses, and it serves as a sort of instruction to guide the model’s behavior.
    • While user prompts are specific to the user’s queries or statements, system prompts are more about giving a general context or instruction for the entire conversation.

In the case of GPT-3.5 or similar language models, a conversation typically starts with a system prompt followed by user prompts in an ongoing exchange. The system prompt helps influence the model’s responses throughout the conversation, but it’s the combination of system and user prompts that determines the model’s output.
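
To make that concrete, here is roughly what the two roles look like in an API call. This is a minimal sketch with the OpenAI Python client; the model name and prompt text are placeholders, and neither plugin exposes this call directly:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4",  # assumed model; use whatever your plugin is configured for
    messages=[
        # System prompt: sets the persona and ground rules for the whole conversation.
        {"role": "system",
         "content": "You are a support assistant for an alarm-system forum. "
                    "Only answer questions about our products and service."},
        # User prompt: the actual question typed by the forum user.
        {"role": "user",
         "content": "How do I test the hardwired siren from the IQ Panel 4?"},
    ],
)
print(response.choices[0].message.content)
```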


The reason the bots are not able to give a much more specific answer is that their knowledge base is missing the user manuals, installation instructions, etc. for these products. Furthermore, bots are often not able to extract images and diagrams from such documentation, such as the wiring diagrams and illustrations used in the installation manuals.

As a start, RAG should be used to expand the AI’s knowledge as much as possible, even if it is only the text from the documentation. While the Discourse AI bot can use RAG over all of the topics in the forum, I don’t know if they allow external information to be added. If not, one idea would be to import all of the documents as PDFs into a category, possibly hidden, so the Discourse code can get at them more easily since they would be in the forum, and then use LangChain to access the PDFs (ref).
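
A rough sketch of that idea is below. The module paths follow one recent LangChain layout and tend to move between releases, and the file name is a placeholder:

```python
# Sketch: index a product manual (PDF) so its text can be retrieved at question time.
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load the manual and split it into chunks small enough to embed.
pages = PyPDFLoader("iq_panel_4_installation_manual.pdf").load()  # placeholder file name
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(pages)

# Embed the chunks into a vector store; at question time the most similar
# chunks are retrieved and prepended to the LLM prompt (the RAG step).
store = FAISS.from_documents(chunks, OpenAIEmbeddings())
for doc in store.similarity_search("How do I test the hardwired bell from the IQP4?", k=4):
    print(doc.metadata.get("page"), doc.page_content[:80])
```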

With that knowledge in place, the system prompt can be tweaked to improve the replies, but this will still be far from what you seek. You will still most likely get replies like the first one: references to information, but no step-by-step instructions when needed. Also, there is no way to give feedback about an AI reply that the AI would recognize as feedback. In other words, if a user replied to the bot that something is wrong, the bot would read the reply but not understand that it needs to update its knowledge based on that.

To improve the quality of the replies three thoughts come to mind.

  1. Make use of HyDE (Hypothetical Document Embeddings)

“Precise Zero-Shot Dense Retrieval without Relevance Labels” by Luyu Gao, Xueguang Ma, Jimmy Lin and Jamie Callan (pdf)

I actually learned of HyDE from Discourse (ref) and know they are using it. (A short sketch of the idea follows below this list.)

  2. RLHF (Reinforcement Learning from Human Feedback). I don’t think either of the bots is capable of this; Discourse might be considering it, and I would like to hear their feedback if they care to share.

  3. Synthetic data with RLHF. This is really cutting edge and not something I would expect either bot to implement.

One of the most demonstrated and successful ways to improve an LLM is to have it train itself, but you need some metric so it knows whether a result is better or worse. For your site, the only way I currently know of to judge whether a solution is better or worse is for a human to rate the solution from the AI. To get more rounds of questions and solutions, the questions are generated by one AI while another AI generates the solutions; each solution is then scored and used to improve the AI that generates solutions.
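
To make the HyDE idea from point 1 above concrete: instead of embedding the user’s question, you first have the LLM write a hypothetical answer and embed that. A minimal sketch; the `embed` and `vector_search` helpers are placeholders for whatever embedding model and vector store are in use:

```python
from openai import OpenAI

client = OpenAI()

def hyde_search(question: str, k: int = 4):
    # Step 1: have the LLM invent a plausible answer, with no retrieval at all.
    hypothetical = client.chat.completions.create(
        model="gpt-4",  # assumed model
        messages=[
            {"role": "system",
             "content": "Write a short, plausible support answer to the question. "
                        "It does not need to be correct."},
            {"role": "user", "content": question},
        ],
    ).choices[0].message.content

    # Step 2: embed the hypothetical answer instead of the raw question and use
    # that vector to search the forum index; it tends to land closer to the
    # documents that actually contain the real answer.
    query_vector = embed(hypothetical)       # placeholder for your embedding model
    return vector_search(query_vector, k=k)  # placeholder for your vector store
```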


To improve the bot’s ability to help users, it should ask a series of questions to collect information and better understand the problem, similar to the way a medical doctor works toward a diagnosis and prognosis.

Another option is to look at similar technology that can be accessed with the AI bot such as Microsoft Copilot.

One item of value that you did not request or note is references back to the source. The reason this is important is to check whether the AI completion is a hallucination or not. If a reference is included, is real, and is not itself hallucinated, then it can be checked to see whether it comes from an authoritative source such as a manual, and the answer can be known not to be a hallucination. Adding references is easy to do with RAG and should be expected.
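
As a rough illustration of how that can be wired up (the `retrieved_docs` structure and field names here are assumptions, not either plugin’s internals):

```python
from openai import OpenAI

client = OpenAI()

def answer_with_references(question: str, retrieved_docs) -> str:
    # Each retrieved doc is assumed to carry a `source` label (forum topic URL
    # or manual section) alongside its `text`.
    context = "\n\n".join(
        f"[{i + 1}] {doc.source}\n{doc.text}" for i, doc in enumerate(retrieved_docs)
    )
    messages = [
        {"role": "system",
         "content": "Answer using only the numbered excerpts below. After each claim, "
                    "cite the excerpt you used, e.g. [2]. If the excerpts do not cover "
                    "the question, say so instead of guessing.\n\n" + context},
        {"role": "user", "content": question},
    ]
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    return reply.choices[0].message.content
```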


Now to address your specific goals.

Complement our human support team for a technical support forum.

That is more of a statement than a question, so I will not answer it as one.

It would need to learn from our historical forum data.

This was actually part of a larger statement that I broke up. The reason I broke it up is that while you are correct that the historical data in the forum is of value, base knowledge such as that in the manuals is also needed. The current bots use RAG with the forum posts but, AFAIK, do not know how to access external data.

It should only attempt to answer on-topic questions.

This can be accomplished by tweaking the system prompt. This would also need some of the technology in Discourse AI Post Classifier - Automation rule
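
For what it’s worth, the gating side of that can be a single extra classification call before the bot is allowed to answer. This is only a sketch of the idea, not the plugin’s actual implementation, and the model and topic list are placeholders:

```python
from openai import OpenAI

client = OpenAI()

ALLOWED_TOPICS = "alarm panels, sensors, sirens, automation devices, and our monitoring service"

def is_on_topic(question: str) -> bool:
    # One cheap classification call that gates whether the support bot replies at all.
    verdict = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed; a small model is usually enough for this
        messages=[
            {"role": "system",
             "content": f"You screen posts for a support forum about {ALLOWED_TOPICS}. "
                        "Reply with exactly ON_TOPIC or OFF_TOPIC."},
            {"role": "user", "content": question},
        ],
    ).choices[0].message.content
    return verdict.strip().upper().startswith("ON_TOPIC")

# e.g. only let the bot answer when is_on_topic(post_text) is True;
# otherwise leave the post for a human.
```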

It should provide a way for the user to tell the bot to stop answering so they can wait for a human instead if they feel like the bot isn’t helping.

This can be accomplished by tweaking the system prompt.
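
For example, a hypothetical system-prompt fragment along these lines might cover it (the wording is purely illustrative, not something that ships with either plugin):

```python
# Hypothetical wording only; adjust product names and rules to your forum.
SYSTEM_PROMPT = """\
You are the support assistant for our technical support forum.
Only answer questions about our products and service; politely decline anything else.
If the user asks you to stop, or says they would rather wait for a human,
acknowledge that, stop replying in this conversation, and let them know
staff will follow up during business hours.
"""
```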

Do these automatically learn or fine-tune with historical forum data?

Good question. While I touched earlier on how to start solving this, it really is a much harder question than it seems. The major LLM research groups are making good progress, though, and there might be some interesting announcements in the next few months that make this much easier.

Not duplicate work that’s already been done by smarter people.

I will take that to mean that the LLM should not regress on what it already knows. That depends on how the model is trained on new knowledge. It is possible to make a model worse, and it is even harder to get a model to unlearn something. AFAIK neither of the Discourse AI bots has the ability to learn, meaning change the weights in the LLM, but they do have the ability to use RAG with topic posts. So bad topic posts (garbage in) can create garbage out.

Restrict the topics the bot will engage in so it doesn’t engage in general, off-topic conversations?

This would be similar to Discourse AI Post Classifier - Automation rule

Is the original Discourse Chatbot intended to be developed and maintained long term?

Will leave that one for the Discourse staff.

Or is it planned to eventually be replaced by the AI Bot in the official plugin, or folded into it?

Will leave that one for the Discourse staff.

5 Likes

Thanks for your help and such a thorough post!

I wrote that before I knew about RAG. I just meant it needs to use our historical forum data, not necessarily be trained on it.

Yes, after feedback from you and others, a hidden category as a knowledge base seems like the way to go.

I wouldn’t want a user to be able to update the knowledge base anyway. Only staff should be able to do that. It would be fantastic if staff could review and rate the bot’s responses but it doesn’t seem like that’s possible with RAG.

That’s cool!

Good point. I wonder what it should provide as a reference if the source is in our hidden knowledge base though?

This question was poorly worded due to my naivety about how LLMs work. RAG should be sufficient for now.

What I meant was that I don’t want to develop custom code for features that Merefield or the Discourse dev team have already built into their plugins. I want to use what’s already available and only enhance it when necessary.

Thanks again Eric!

1 Like

Good point. Glad you noted this, I should have stated such in my reply.

Seems like it would segue into the review queue nicely. That is Discourse’s call.

No.

RAG is one of the many technologies needed to get what you seek. Rating the AI reply, however, is different from the purpose of RAG; human feedback on a generated completion is part of the RLHF technology.
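
If staff ratings were collected, even something as simple as the following sketch could serve as a starting point for that feedback loop (the file path and rating scale are arbitrary, and neither plugin records this today):

```python
# Sketch: record staff ratings of bot replies so they could later feed a
# preference-tuning / RLHF pipeline.
import json
import time

def record_feedback(question: str, bot_reply: str, rating: int,
                    path: str = "bot_feedback.jsonl") -> None:
    with open(path, "a") as f:
        f.write(json.dumps({
            "timestamp": time.time(),
            "question": question,
            "bot_reply": bot_reply,
            "rating": rating,  # e.g. 1-5 from a staff member in the review queue
        }) + "\n")
```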

It is!

Most of this technology is so new and moving so fast one should be open to ideas from anyone and anywhere.

The same as if a human were citing it in a response to a real person: I would expect the name of the manual, the publication date, and the section and paragraph.

Thanks for sharing that. I see many posts from many users and will tell you that it is a pleasure to help you.

That was funny, but many will not even know why. The technology is moving so fast that I am sure many people create code one month and then realize it has since been built into a technology, making what they created redundant and perhaps even obsolete. If you take the time to learn about LangChain, you may see users noting that they threw out just about every bit of code for their app and just started using LangChain.

Another source for learning such is the DeepLearning.AI short courses (presently free).

You are quite welcome.

Please ask more questions, that is why I am helping you. The more questions people ask about this technology the more we can help each other reach our goals.

3 Likes

Eric, I already answered this above.

My response wasn’t clear enough?

With all due respect, I’m not sure what these questions have to do with “Discourse staff”?

The repo is authored and maintained by me, as should be clear from the URL, the commit history, and its plugin topic?

If at any point Chatbot were to move GitHub org, it would likely be to Pavilion.

1 Like

When I read that, I was thinking of the bot you see when you first join Discourse, @discobot.
If the statement had used the word plugin, I would have thought of your plugin and not responded as I did.

1 Like

I’m not sure what you mean by “hidden”, but I assume that both the Chatbot and the AI plugin will only use RAG on content in publicly accessible categories.

Exactly. So I take it that instead of uploading the PDFs into a category and then extracting them, nowadays the best approach would be to upload those documents as files and use them in an assistant? This could be combined with on-the-fly RAG from forum contents.

2 Likes

It might be beneficial to have a category that’s used for RAG but not visible to users because it’s not organized/presented in an end-user friendly way. For things like raw user manuals or maybe common scripts that the bot might follow.

2 Likes