This guide explains how to implement and use PDF processing capabilities within discourse-ai, including both basic text extraction and enhanced processing with LLM assistance.
Required user level: Administrator
Summary
The discourse-ai plugin supports PDF processing for RAG (Retrieval-Augmented Generation) in two distinct modes:
Basic text extraction
Enhanced processing with LLM analysis
Basic text extraction
This mode provides fundamental PDF processing capabilities:
This is really amazing news. Thanks team! Can’t wait for the enhanced processing to be finished. That’s gonna be critical for feeding LLMs research papers.
Also, is there any plan to allow doing RAG “chat-with-your-PDFs” by uploading PDFs in an AI bot PM or in a topic/post and mentioning the bot?
On my website (an Arabic forum) I ran a test in Arabic: I added legislation in the first post of a topic and then asked questions using the AI, but the answers were not accurate. I think this is because it is not doing RAG on that context.
First of all, thank you very much for your great work. I really like it.
After playing around with the settings and changing the AI Model to Gemini-Flash-2.0, it worked great for me. Here’s the situation I have:
We are a community of auditors, accountants, and tax consultants, and we needed a tool to share relevant laws and trigger discussions about them. These discussions should be very useful for visitors, as we are professionals in our field. We want the AI model to check and analyze legislation and answer our questions. The experiment went well and showed that we can really discuss the context added in the first post, and if the AI model is smart enough, it will answer our questions with very high-quality output.
Thank you again. I’m looking forward to PDF support, as it will make Discourse the best forum software.
Does it have to be enabled via console? Don’t see any advanced mode options via the UI.
Furthermore, I am getting an error when trying to upload this PDF. It is 34 MB, but I have my max attachment size set to 100 MB (in both admin settings and app.yml). What’s strange is that a compressed version, which is 16 MB, uploads just fine. But perhaps the larger PDF is simply too complex for now? There are lots of images, equations, etc.