Privacy feature request: disabling or hashing logs (using cryptolog or similar)

Hi everyone,

I’ve recently been playing around with an installation of Discourse and realized that, as an admin, I can see the IP addresses of all users at all times. That’s not exactly something I want to see or collect on the server that’s running this platform (read this post if you’re wondering why). I looked in the admin section and couldn’t find a setting to disable logging IP addresses, so I thought I’d ask for it as a feature request. While logging the IP address might not be exactly useful for most forums, maybe we can take advantage of something like EFF’s cryptolog to hash the IP addresses and anonymize them instead of disabling logging completely.

I was wondering whether anyone has experience with or insight on this topic, and whether this could be considered as an option in the admin section.

Thanks for the great work!


More ideas on this topic:
Maybe under a new section that could be called “data retention policies” we could make use of two settings.

- Flush all the logs
- Remove (or sanitize¹) logs after [30]² days

  1. Sanitize meaning removing the PII from the logs and keeping the rest for useful things like monitoring the number of HTTP errors, etc.

  2. If enabled, this could have a default value but should be configurable by the user, where 0 would basically mean never.
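
To make the two settings a bit more concrete, here is a rough Python sketch of what I have in mind. It is not Discourse code: `RETENTION_DAYS`, `sanitize_line`, and `is_expired` are names I made up for illustration, and the regex only catches IPv4 addresses, so a real version would also need to handle IPv6 and any other PII in the logs.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical setting name (not an existing Discourse option).
# 0 means "never remove or sanitize".
RETENTION_DAYS = 30

# Matches anything shaped like a dotted IPv4 address. IPv6 is not handled here.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def sanitize_line(line: str) -> str:
    """Replace IPv4-looking strings with a placeholder, keep the rest of the line."""
    return IPV4_RE.sub("[redacted]", line)


def is_expired(logged_at: datetime, now: datetime,
               retention_days: int = RETENTION_DAYS) -> bool:
    """True if an entry is older than the retention window; 0 disables expiry."""
    if retention_days == 0:
        return False
    return now - logged_at > timedelta(days=retention_days)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    old_entry = now - timedelta(days=45)
    line = '203.0.113.7 - - "GET /latest" 500'
    if is_expired(old_entry, now):
        # The status code survives, so HTTP error monitoring still works.
        print(sanitize_line(line))
```

The point is that the sanitized line still carries the request and the status code, so counting HTTP errors keeps working after the PII is gone.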


Related topic that I just found:

Totally get where you are coming from and understand that in some countries this is not something you want to collect.

The logs would be the easy part, using some sort of plugin. The trickier change is amending Discourse core not to store the last IP address for each user in the database.

It is all doable, not built-in, but can be done with the right hooks.


The interesting part is that I actually love the counter-abuse measures built into Discourse and don’t want to lose them, but I believe they can work without knowing the exact IP address. Maybe we can redact the last two bytes of the IP, or maybe we can replace it with a unique hash.
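
Here is a minimal Python sketch of both options, just to show the idea rather than how Discourse would actually do it. The function names `truncate_ip` and `pseudonymize_ip` are hypothetical, the /48 prefix for IPv6 is my own assumption, and I am using a keyed hash (HMAC-SHA256) because a plain hash of an IPv4 address can be reversed by simply trying all of the roughly 4 billion possible addresses.

```python
import hashlib
import hmac
import ipaddress


def truncate_ip(ip: str) -> str:
    """Zero out the host part: the last two bytes for IPv4 (a /16 prefix).

    The /48 prefix used for IPv6 is an assumption chosen for illustration.
    """
    addr = ipaddress.ip_address(ip)
    prefix = 16 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def pseudonymize_ip(ip: str, key: bytes) -> str:
    """Map an IP to a stable token using HMAC-SHA256 with a secret key.

    The same IP and key always give the same token, so repeat visitors can
    still be matched, but the address cannot be recovered without the key.
    """
    canonical = ipaddress.ip_address(ip).compressed
    return hmac.new(key, canonical.encode("ascii"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # Placeholder key for the example; a real key must stay secret.
    secret = b"replace-with-a-random-secret"
    print(truncate_ip("203.0.113.7"))
    print(pseudonymize_ip("203.0.113.7", secret))
```

The trade-off as I understand it: truncation makes matching coarser, since everyone in the same /16 looks identical, while the keyed hash keeps exact matching for things like spotting duplicate accounts but is only as private as the key. Rotating the key periodically would limit how long tokens can be linked.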

It might sound like too much to ask, but it would be fantastic if someone who’s already familiar with the code base could point out in which parts of the program IP addresses are being logged.