By hidden I mean a category that is not visible to most users. It does not have to be hidden, but a category holding just the PDFs used as the knowledge accessible by RAG could otherwise get cluttered up with posts from other users.
I used PDFs as an example because many manuals probably already exist as PDFs, and I suggested hiding them in a category because that is the easiest setup to comprehend. For other possibilities that would not need a hidden category, LangChain Retrieval is a good place to start.
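To make that concrete, here is a minimal LangChain retrieval sketch over a single PDF. This is only an illustration under some assumptions: the file name, chunk sizes, and question are placeholders, and it assumes the langchain, pypdf, faiss-cpu, and openai packages are installed with OPENAI_API_KEY set.

```python
# Minimal LangChain retrieval sketch. "manual.pdf" and the question are placeholders.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Load the PDF; each page becomes a Document carrying page-number metadata.
pages = PyPDFLoader("manual.pdf").load()

# Split pages into overlapping chunks so each embedding covers a focused passage.
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(pages)

# Embed the chunks and store them in a local FAISS vector index.
db = FAISS.from_documents(chunks, OpenAIEmbeddings())

# Fetch the chunks most similar to a user question.
for doc in db.as_retriever(search_kwargs={"k": 3}).get_relevant_documents("How do I reset the device?"):
    print(doc.metadata.get("page"), doc.page_content[:80])
```

The retrieved chunks would then be passed to the model as context, which is the retrieval half of RAG without needing a hidden category at all.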
I have no problem with that, it gets the idea of what needs to be done moving in the correct direction.
However, the word assistant seems to be losing ground in certain contexts to the new kid on the block: GPTs. GPTs also seem to be replacing ChatGPT plugins, but I haven't used a ChatGPT plugin in months so I am not really noticing, only seeing chatter about it in the forums.
I have to be honest. I did not know enough to respond so watched this video.
With regard to Assistants, it seems that GPTs give you:
Web Browsing
DALL·E Image Generation
Code Interpreter
I would agree with most of what you noted.
Also, having just learned so much more about GPTs from watching the video, I will say that it is impressive; it still will not meet the needs of most people, but it is making progress. It was also interesting to see some of the JSON that is the same as, or similar to, ChatGPT plugins.
With regard to this topic, I will note that many on the OpenAI forum do try to upload files for use with RAG and quickly find that there is a limit of 10 files. Also, a link to the GPT can be shared, but the other user needs to be a ChatGPT Plus subscriber too, so this is most likely a dead-end option for many with a commercial endeavor.
The OpenAI Assistants API is a bit early in its life to adopt and is effectively in preview:
It is reported to be prohibitively expensive: it fills up the available context with every iteration regardless of whether it needs to, and you are paying for all of that whether you need it or not.
It apparently suffers from an “infinite loop” issue where it occasionally finds itself iterating indefinitely. This is a surprising issue given the apparent expertise of the company because it’s very straightforward to design a mitigation for this.
I suspect it needs more time in the oven and needs to incorporate feedback from early adopters.
As a potential future simplification for downstream consumers of AI services, like Chatbot, I welcome it.
In any case you need an interface with Discourse.
Chatbot already provides the same power as the Assistants API without any of its issues, and in addition provides the interface with Discourse. It is ultimately API-iteration agnostic and will evolve with available services; it is very possible it may offer an Assistants API option in the future, but there is currently no benefit in doing that because it would only create issues (including the ones outlined above).
20, but yes, a very limited number - which can be sort-of circumvented by appending all the files you need into a single large file (which tends to become pretty expensive quickly)
and the big surprise is that you don’t know how expensive until later…
Almost - the Assistants API has the ability to include private, custom knowledge that is external to Discourse.
See the example above: if you have PDFs containing user manuals and they are uploaded to Discourse, they will not be processed and indexed. You would need to convert them to markdown, which is not only tedious but would require additional automation and conversions.
Reality maybe?
Truth is that most domain knowledge is not readily available in markdown.
Again, take the example given above. They have nice PDF user manuals which are created by an entirely different division, so a convert-once-and-edit-locally approach won't work.
Do you foresee using manuals with RAG for a Discourse AI bot on the Discourse site?
a. What format are such manuals currently in? Hard copy, PDF, web site, other?
b. How often will the knowledge base be updated with new or versions of the manuals?
c. Will the manuals be made public on the site?
What other information besides the Discourse forum could be of value to an AI for responding on the support site?
When creating a Discourse AI bot, do you foresee other needs for the bot? I ask this because if you read through the history of Discourse creating its AI bot, you will see it go through many changes; it was not the perfect bot in its first incarnations and is still seeing many changes.
Can you create some examples of how you see the bot or agents being used? For example show a few different cases of a user chatting with the bot.
Note I use the word agent in a sense similar to a LangChain agent, not to be confused with an OpenAI assistant.
I had an idea now that I have seen an OpenAI GPT being created in a YouTube demo.
You could create an OpenAI GPT (which requires a ChatGPT Plus subscription and probably access after being on a waitlist) similar to what you seek, to learn about the pros and cons of such technology; that can then be leveraged when creating a Discourse AI bot. I am not suggesting you create the bot this way, but you would have a better understanding of what the AI can do and where it will fail miserably.
PDF or HTML (hosted online) but I’m fine with converting them to text/markdown and maintaining them as Discourse posts. We can write a script to make it easier.
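A rough, untested sketch of such a script (the forum URL, category ID, API key, and file name are all placeholders; it uses the pypdf and requests libraries with the standard Discourse create-post endpoint):

```python
# Untested sketch: extract text from a PDF manual and create one Discourse topic
# from it. MANUAL, CATEGORY_ID, the forum URL, and the API key are placeholders.
import requests
from pypdf import PdfReader

MANUAL = "product-manual.pdf"   # placeholder file name
CATEGORY_ID = 42                # placeholder: the (possibly hidden) knowledge category

reader = PdfReader(MANUAL)
raw = "\n\n".join(page.extract_text() or "" for page in reader.pages)

resp = requests.post(
    "https://forum.example.com/posts.json",  # placeholder forum URL
    headers={"Api-Key": "YOUR_API_KEY", "Api-Username": "system"},
    json={"title": f"Manual: {MANUAL}", "raw": raw, "category": CATEGORY_ID},
)
resp.raise_for_status()
```

Real manuals would likely need cleanup of the extracted text and splitting into multiple posts, but this is the basic shape.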
Product manuals won’t be updated frequently but other content in the knowledge base will. Things like FAQs, frequent/trending issues, etc… would be updated more often. We post as much as possible publicly but some would make sense to keep private - not because it’s sensitive information but because it might look weird and confusing as a public post.
Yes, official product manuals are usually PDFs. For example…
For now, probably nothing. There’s too much risk letting it search the internet and getting results that aren’t true for us.
In the future, I could see value in giving the bot additional context about the customer’s own security system but that might never be possible due to privacy issues.
It seems to me that since it's all based on RAG, the most important thing is giving us fine-grained control over how the bot searches the forum and prompts the AI. From what I've read, merefield's Discourse Chatbot does that better today. I'm hoping to test that theory this week.
For us it’s always providing user/customer support by answering their questions or solving their problems. For example:
Helping them figure out what to buy to secure their specific home, sometimes taking into consideration home security equipment they already have.
Helping them install and use their system.
Helping them troubleshoot when they have problems.
Answering curiosity questions they have about how things work, what might be available in the future, etc…
I know this is just a simple link for you to share, but for the many others who will follow in your footsteps of getting a support site to have an AI bot respond, allowing everyone to see the interchange of ideas in a public forum is cutting edge. I know companies are doing this in private, but you are doing it publicly.
I was also happy to see the diagrams as that reaffirmed what I was thinking was correct.
I would not do this.
The bot will not actually search the forum the way text is searched for in posts. The keywords and related technology you need to learn for this are vector databases and nearest-neighbor search.
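In miniature, the mechanism looks like this. A toy sketch only: embed() is a stand-in for a real embedding model (such as OpenAI's text-embedding-ada-002), and a real system would use a vector database such as pgvector or FAISS rather than a Python list.

```python
# Toy nearest-neighbor search over embeddings. embed() is a placeholder; the
# random vectors only demonstrate the mechanics, not real semantic similarity.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder: a real implementation calls an embedding model/API here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=384)
    return v / np.linalg.norm(v)

posts = ["How to arm the system", "Replacing the panel battery", "Wi-Fi setup steps"]
index = [(text, embed(text)) for text in posts]

query_vec = embed("my battery is dead")
# Cosine similarity reduces to a dot product because the vectors are normalized.
best = max(index, key=lambda item: float(item[1] @ query_vec))
print("Nearest post:", best[0])
```

With a real embedding model, "my battery is dead" lands nearest to the battery-replacement post even though they share almost no keywords - that is the point of semantic search over plain text search.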
I have used the free online version a few times and, as I noted in my answer, I have not seen anything as good - and I have looked a lot over the years for ways to get text and other info from a PDF.
I want to add some more info on this, as some may think that the reference (name of the manual, the date of publishing, the section and paragraph) is part of the information used with RAG. I did not envision it that way, and I did not give some of the needed details. I did envision using metadata; for me, data and metadata have very different meanings and usage.
When I asked ChatGPT about this it gave a reply that was along the lines I envisioned but with a few more specifics like Elasticsearch.
Prompt
How can one use RAG (Retrieval Augmented Generation) with a PDF manual for a product and get the title, publication date, page and paragraph number from the manual as a reference with the noted information?
Completion
Using Retrieval Augmented Generation (RAG) with a PDF manual to extract specific information like the title, publication date, page, and paragraph number involves several steps. RAG, developed by Facebook AI, combines the power of retrieval (fetching relevant documents or data) with language generation, enabling the model to reference and incorporate external knowledge. Here’s a general approach to use RAG with a PDF manual:
Convert the PDF Manual to a Text Format:
First, you need to convert the PDF manual into a text format that can be processed. Tools like Adobe Acrobat, pdftotext (part of the Xpdf project), or Python libraries such as PyPDF2 or PDFMiner can be used for this purpose.
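For illustration, extraction with pdfminer.six (one option among those listed) looks like the following; the file name is a placeholder.

```python
# pdfminer.six extracts text from a whole PDF or from selected pages.
from pdfminer.high_level import extract_text

text = extract_text("manual.pdf")                           # whole document
page_three = extract_text("manual.pdf", page_numbers=[2])   # pages are 0-indexed
print(page_three[:200])
```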
Preprocess the Text:
Once you have the text, you may need to preprocess it. This includes removing unnecessary formatting, handling special characters, and possibly segmenting the text into sections, pages, and paragraphs with identifiable markers.
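A simple segmentation that tags each paragraph with page and paragraph markers might look like this sketch; the splitting rule will depend on how your extractor renders paragraph breaks.

```python
# Sketch: split extracted page text into paragraphs and attach page/paragraph
# markers so references can be reconstructed later.
def segment(page_text: str, page_num: int) -> list[dict]:
    paragraphs = [p.strip() for p in page_text.split("\n\n") if p.strip()]
    return [
        {"page": page_num, "paragraph": i + 1, "text": p}
        for i, p in enumerate(paragraphs)
    ]

chunks = segment("First paragraph...\n\nSecond paragraph...", page_num=12)
print(chunks[1])  # {'page': 12, 'paragraph': 2, 'text': 'Second paragraph...'}
```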
Indexing the Manual:
For RAG to work effectively, the content of the manual needs to be indexed. This means creating a database or a structured format where each piece of information (like a paragraph or a section) is stored with its corresponding metadata (page number, paragraph number, etc.). Elasticsearch or a similar search engine can be used for this purpose.
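With the official Elasticsearch Python client (8.x), indexing a paragraph together with its reference metadata could look like this sketch; the index name and field values are placeholders.

```python
# Index each paragraph with the metadata needed to cite it later.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

doc = {
    "title": "Product User Manual",   # placeholder metadata
    "published": "2023-06-01",
    "page": 12,
    "paragraph": 2,
    "text": "To replace the battery, first power down the unit...",
}
es.index(index="manuals", document=doc)
```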
Integrating RAG:
With your manual indexed, you can now integrate RAG. RAG typically combines a transformer-based language model (like BERT or RoBERTa) for retrieval and a sequence-to-sequence model (like BART or T5) for generation. You’ll need to set up RAG to query your indexed manual. This might involve customizing the retrieval component to work with your search engine.
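For a feel of the original Facebook AI RAG models, Hugging Face ships a pretrained retriever-plus-generator pair. This sketch uses their dummy demo index; pointing it at your own manual would mean building a custom RagRetriever index, which is considerably more work.

```python
# Pretrained RAG (retriever + seq2seq generator) from Hugging Face transformers.
# The dummy dataset is only for demonstration, not your manual's content.
from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration

tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)
model = RagSequenceForGeneration.from_pretrained("facebook/rag-sequence-nq", retriever=retriever)

inputs = tokenizer("how do I replace the battery?", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```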
Querying the System:
When you query the RAG system, the retrieval component first finds the most relevant sections of the manual based on your query. This information is then passed to the generation component, which formulates a response.
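Continuing the sketch from the indexing step, the query side might retrieve the top paragraphs from Elasticsearch and hand them to the generation model as context (the index and field names are the same placeholders as above):

```python
# Retrieve the most relevant paragraphs, then build a context-grounded prompt
# for whatever generation model you use.
def retrieve(es, question: str, k: int = 3) -> list[dict]:
    hits = es.search(
        index="manuals",
        query={"match": {"text": question}},
        size=k,
    )["hits"]["hits"]
    return [h["_source"] for h in hits]

def build_prompt(question: str, passages: list[dict]) -> str:
    context = "\n\n".join(p["text"] for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```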
Extracting Reference Information:
To get the title, publication date, page, and paragraph number, you’ll need to ensure that these details are either included in the retrieved segments or that the system can reference back to the indexed data to fetch this metadata. This might require additional logic in your system.
Post-Processing for References:
After the response is generated, you may need a post-processing step to format the references correctly, ensuring that the title, publication date, page, and paragraph number are accurately included.
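Since each retrieved passage carries its metadata from the index, the reference can be formatted in post-processing with something as simple as this (continuing the same sketch):

```python
# Format a citation from the metadata stored alongside each retrieved passage.
def format_reference(p: dict) -> str:
    return (f'{p["title"]} ({p["published"]}), '
            f'page {p["page"]}, paragraph {p["paragraph"]}')

# e.g. 'Product User Manual (2023-06-01), page 12, paragraph 2'
```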
Fine-Tuning and Evaluation:
Depending on your specific requirements and the complexity of the manual, you might need to fine-tune the RAG model on similar types of documents or queries. Evaluate the system’s performance and adjust your approach as needed.
Integration with User Interface:
Finally, if this system is meant for end-users, integrate it with a user-friendly interface where users can input queries and receive responses with the referenced information.
Remember that the effectiveness of this approach heavily depends on the quality of the indexed data, the capabilities of the retrieval model, and how well the generation model has been trained or fine-tuned for your specific use case.
Note: While I do use ChatGPT to generate information, I do not post information from ChatGPT if I think it is wrong. In other words, ChatGPT can be useful, but you have to know whether the information given is correct. While I have not implemented exactly what ChatGPT notes here, I have enough experience with this that I would use it or a variation of it.
The Discourse Chatbot plugin has a step where you create the embeddings, which uses the embeddings API from OpenAI. The Discourse AI plugin doesn't seem to do this (it showed no embeddings activity on the OpenAI dashboard), but it still finds posts on our forum. If it doesn't generate embeddings to search semantically, how does the Discourse AI plugin search the forum? Does it use regular text search? Does it compute its own embeddings?