Will RAG Support PDF Files in the Future?

Confirmed. Give me a few days here; I also want to try direct text extraction, which is something we can enable by default.

Then “rich” LLM-based extraction can sit behind flags.

The trouble with many PDFs is that they are huge and can be very taxing on server resources. Additionally, tools like Tesseract can be a bit tricky to install, though they can improve extraction quality.
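To make the idea concrete, here is a rough Python sketch of that split: direct extraction as the default, rich extraction only when a flag is on, and a size cap to protect the server. Everything named here (`PdfSettings`, `rich_extraction_enabled`, `max_pdf_bytes`, `extract_pdf_text`, the 20 MB limit) is made up for illustration and is not the actual implementation or any real setting. The extractors are passed in as plain callables, so no particular PDF library is assumed.

```python
import shutil
from dataclasses import dataclass
from typing import Callable, Optional

# An extractor takes raw PDF bytes and returns plain text.
Extractor = Callable[[bytes], str]


@dataclass(frozen=True)
class PdfSettings:
    # Hypothetical settings; names and defaults are assumptions.
    rich_extraction_enabled: bool = False   # "rich" path stays off by default
    max_pdf_bytes: int = 20 * 1024 * 1024   # reject very large uploads


def ocr_available() -> bool:
    # True if a `tesseract` executable is on PATH, so OCR can be skipped
    # on servers where it was never installed.
    return shutil.which("tesseract") is not None


def extract_pdf_text(
    pdf_bytes: bytes,
    direct: Extractor,
    rich: Optional[Extractor] = None,
    settings: Optional[PdfSettings] = None,
) -> str:
    settings = settings or PdfSettings()

    # Guard server resources before doing any work.
    if len(pdf_bytes) > settings.max_pdf_bytes:
        raise ValueError("PDF exceeds the configured size limit")

    # Use the rich extractor only when it is enabled and actually provided.
    if settings.rich_extraction_enabled and rich is not None:
        return rich(pdf_bytes)

    # Default path: cheap direct text extraction.
    return direct(pdf_bytes)
```

With the defaults, `extract_pdf_text(data, direct=my_direct_extractor)` never touches the expensive path; an admin would have to opt in explicitly, and `ocr_available()` could gate any Tesseract-backed step.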
