Allow ChatBot to read PDFs so it can join in a group discussion

EricGT · August 29, 2023, 12:47pm

For those who have access to tools that allow one to chat with PDF(s), it would be nice if the Discourse AI - AI Bot could also read PDFs and join in the discussion.

Right now academics are eating this up like candy, but I don’t know of a way for a group of users to talk with the bot together about the paper(s). AFAIK one can only chat alone with the bot that read the paper. I am sure group chats about paper(s) exist elsewhere, but Discourse should have this too.

Think of it like a book club with a bot invited and the discussion being about one or more papers (pdfs).

If someone gets the bright idea that Discourse + AI model plugins (ref) = , hopefully this is the first place you read it.

As more and more different plugins and bots are created, one could eventually form a garage band, have a virtual programmer meetup, etc.

merefield · August 29, 2023, 2:47pm

As far as Discourse Chatbot 🤖 (Now smarter than ChatGPT!*) is concerned, PR welcome.

Anyone is free to contact me if they’d like to sponsor that work.

The framework I’ve created is easily extended and reading pdfs would be a great addition.

sam · August 31, 2023, 12:39am

Going to need dedicated personas for this kind of work. I do think it is doable: you chunk and embed the document and can then discuss it with the bot. But I am not sure I would mix this with “Forum Helper” … maybe a “Document Explorer” persona.

Very interesting use case, and given we have so much of the infra to upload documents, etc., it is not too much of a stretch to build.
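To make the "chunk" half of "chunk and embed" concrete, here is a minimal sketch in Python. It is not how Discourse AI does it; the function name `chunk_text`, the word-based splitting, and the default sizes are all assumptions chosen for illustration. It only shows the idea of cutting extracted text into overlapping pieces so that a sentence falling on a boundary still appears whole in at least one chunk.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks of at most `chunk_size` words.

    Hypothetical helper: real systems often split by tokens or sentences
    instead of whitespace-separated words.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made only of overlap is emitted.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk would then be embedded and stored; the overlap trades a little extra storage for fewer answers that get cut off at a chunk boundary.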

Falco · August 31, 2023, 1:46pm

Is this extracting text from the file and injecting it in the prompt? Sounds like an interesting feature if so.

EricGT · August 31, 2023, 2:05pm

First off, I have not created any of these so can only speculate.

Yes.

The few ChatGPT plugins that I have tried read the entire PDF; however, many read only the text, as extracting data from math expressions and graphs is beyond their capability. This is due to the fact that a PDF is designed for layout and presentation, not for context extraction or for passing along knowledge as a data interchange format.

Not sure exactly what you mean by that, but from what I understand they embed the knowledge into a vector database and then use the prompt to pick out the relevant parts and compose a reply.
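As a rough illustration of that flow (again speculation on my part, not any plugin's actual code), the sketch below uses a toy word-count "embedding" and a plain Python list in place of a real embedding model and vector database. The names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical. It ranks chunks by similarity to the question and injects only the best matches into the prompt.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption; a real system calls a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question: str, chunks: list[str], k: int = 2) -> str:
    """Compose a prompt that contains only the retrieved excerpts."""
    context = "\n---\n".join(retrieve(question, chunks, k))
    return (
        "Answer using only the excerpts below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    doc_chunks = [
        "The engine oil should be changed every 5000 miles.",
        "Brake pads wear faster in city driving.",
        "Tire pressure affects fuel economy.",
    ]
    print(build_prompt("How often should I change the engine oil?", doc_chunks, k=1))
```

So the answer to "is it injecting text in the prompt" is yes, but only the retrieved excerpts rather than the whole file, which is what lets it work with documents larger than the model's context window.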

The analogy I use to explain the concept to others: instead of focusing on the PDF itself, think about the ideas the author(s) are trying to pass along in the paper, and imagine that you are conversing with them.

If you can run plugins with ChatGPT, then at this site

https://pugin.ai/

search for PDF or paper and try some. The main difference I find among them is that many will read a single PDF (https://pugin.ai/p/chatwithpdf), while this one (https://pugin.ai/p/science) will pick the relevant papers out of 250M scientific papers.

LangChain has this

and there are similar repos on GitHub (ref), YMMV.

Here is a specific use case for such technology, for those who think it would be limited to academics.

Leveraging LLMs with Vast Mechanic Datasets and Guides

merefield · August 31, 2023, 3:12pm

How strange to put a model number in a repo name! Why wouldn’t it work with 3.5?

EricGT · October 12, 2023, 10:21pm

FYI

Others are also jumping in on similar ideas.

