I’m using Discourse AI on my site, which runs on a subdomain (community.website.com), and I’d like to better understand what kind of user information might be shared with the language model (LLM) during interactions. Specifically, I’m curious about:
What types of user data (e.g., personal information, IP addresses) could potentially be exposed to the LLM?
Are there any safeguards in place within Discourse AI to limit or anonymize what gets sent?
For some additional context, my setup uses Caddy as a reverse proxy and Sucuri for DNS and firewall. If anyone has insights on how this configuration might affect what is exposed—or just general knowledge about how Discourse AI handles user data—I’d really appreciate the input!
Looking forward to hearing from those who have a better understanding of this.
I believe you’ve used my AI plugins, Chatbot and AI Topic Summary, at some point, since you’ve posted in those Topics, so I’ll answer for those; if you want more information, please post in those Topics.
Both of my plugins send usernames and raw Post content (i.e. the markdown). Note that if someone mentions a person’s name or an address in a Post, that will of course be sent as part of the markdown; otherwise, Users are identified only by their Usernames.
No other metadata is sent (e.g. IP addresses, User Profiles).
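To make the shape of that data concrete, here is a purely hypothetical sketch in Python — not the plugins’ actual request format or field names — of a payload built along those lines: only the username and raw markdown are copied over, while metadata fields like IP address and email are left behind.

```python
# Hypothetical illustration only -- not the actual request format used by
# the Chatbot or AI Topic Summary plugins.

def build_llm_payload(posts):
    """Copy only username and raw markdown from each post into the payload."""
    return {
        "messages": [
            {"username": p["username"], "content": p["raw_markdown"]}
            for p in posts
        ]
    }

posts = [
    {
        "username": "alice",
        # Anything typed inside the post body still travels in the markdown:
        "raw_markdown": "Thanks @bob, see you at 12 Main St.",
        "ip_address": "203.0.113.7",   # metadata: never copied into the payload
        "email": "alice@example.com",  # metadata: never copied into the payload
    }
]

payload = build_llm_payload(posts)
```

The point the sketch makes is the same caveat as above: names or addresses written in the post body are sent because they are part of the markdown, but account metadata never enters the payload.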
You can see the queries being sent in the logs if you enable the verbose logging option and set the logs to Warn (a separate setting) so they appear in /logs.
Thank you, Robert. Yes, I do use those plugins, which are excellent — I appreciate the feedback. After reading some of the LLM privacy policies, transferring sensitive user data would be concerning. Obviously, whatever context is in the chat will be sent, and the username by itself isn’t really concerning. Some of the LLMs’ terms are quite invasive, which is what spurred my inquiry. Thanks again.