Current limitation:
• Forum semantic search only indexes post text, not PDF attachments
• PDF contents aren’t searchable via the AI toggle on /search
• To get round this, I had to manually upload PDFs separately to persona RAG
Proposed solution:
• Extract text from PDF attachments during embedding generation
• Index PDF contents alongside post text
• Make PDF-attached topics discoverable via semantic search
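To make the idea concrete, here is a rough sketch of the "index PDF contents alongside post text" step, in Python for illustration only. It is not how the forum software builds its embedding input today. The function name `build_embedding_input`, the `(filename, text)` pairs, and the `max_chars` cap are all assumptions of mine, and the actual PDF text extraction is left out on purpose.

```python
from typing import Iterable


def build_embedding_input(post_text: str,
                          attachment_texts: Iterable[tuple[str, str]],
                          max_chars: int = 8000) -> str:
    """Combine post text with text already extracted from its PDF attachments.

    attachment_texts holds (filename, extracted_text) pairs. How the text is
    extracted (a PDF library, OCR, ...) is outside this sketch.
    max_chars is a hypothetical stand-in for the embedding model's input limit.
    """
    parts = [post_text.strip()]
    for filename, text in attachment_texts:
        # Collapse the whitespace runs that PDF extraction tends to produce.
        cleaned = " ".join(text.split())
        if cleaned:
            parts.append(f"[Attachment: {filename}]\n{cleaned}")
    combined = "\n\n".join(p for p in parts if p)
    # Truncate from the end so the post text, which comes first, takes priority.
    return combined[:max_chars]
```

The post text goes first so that a long PDF cannot crowd it out when the combined text is cut to size. A real implementation would probably chunk long PDFs rather than truncate them.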
Benefits:
• Users find technical documentation via forum search
• No need to duplicate content (forum post + RAG upload)
• Better SEO (PDFs attached to indexed topics)
• Simpler architecture (Search command just works)
If you were to implement this, I could potentially:
• Remove forced tools (Search would naturally find PDF contents)
• Eliminate RAG uploads entirely (everything in forum topics)
I think a plugin could append the PDF text to the cooked post in an optionally hidden details element. That would put the text in the post itself, so search should find it, I think. If you’re self-hosted, I think it’d be only a few hundred dollars to have developed. Or, if it sounds like they’re interested, it could be submitted as a PR for about twice as much (to include tests and such).
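A minimal sketch of the details-element idea, written in Python just to show the shape of the transformation. A real plugin would be written against the forum's own plugin API, which this does not attempt to model. The function name `append_hidden_pdf_text` and the `pdf-text` class are hypothetical.

```python
from html import escape


def append_hidden_pdf_text(cooked_html: str, filename: str, pdf_text: str) -> str:
    """Append extracted PDF text to a post's cooked HTML in a collapsed block.

    Hypothetical helper: it only illustrates the markup, not a plugin hook.
    """
    text = pdf_text.strip()
    if not text:
        # Nothing was extracted, so leave the post untouched.
        return cooked_html
    details = (
        '<details class="pdf-text">'
        f"<summary>Text of {escape(filename)}</summary>"
        # Escape the text so PDF contents cannot inject markup into the post.
        f"<pre>{escape(text)}</pre>"
        "</details>"
    )
    return f"{cooked_html}\n{details}"
```

Because a details element is collapsed by default, readers would see only the summary line, while the text is still part of the post body. Whether search and embeddings actually pick up text inside it is something to verify, not something this sketch shows.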
FYI: I found that uploading PDF files to the persona stopped it from finding “normal” forum content in the AI-assisted search. I have therefore resorted to a combination of (a) converting the key ones to Markdown (so I can post them directly as topics) and (b) picking out the main keywords, ToC, etc. and posting them alongside the PDF files in the forums. I also had to switch from GPT 4.1 to Sonnet 4.5 and disable HyDE to make it reliable.