Removing IP address records?

privacy

(James Figgle) #1

I want to respect the privacy of my users and not retain personally identifying information longer than needed.

How do I delete IP addresses from database records more than a week old, or overwrite them with random data? I only need help with the database; I am already destroying the Rails and nginx logs with GNU shred. Keeping a week of data lets the anti-abuse features keep working while still protecting user privacy.


#2

Unfortunately I cannot answer this (who can?), but I am also very interested in how to reduce IP address logging (ideally before it happens) or to limit the retention period. That applies both to the logs generated by the Discourse application itself (disabling per-user logs should definitely be possible!) and to all the other components the Discourse distribution ships with or depends on.

Instead of using shred (good suggestion!), I use the following patch to templates/web.template.yml. It rewrites the nginx access log format so that 127.0.0.1 is logged in place of the client address (note that this does not affect error logs):

diff --git a/templates/web.template.yml b/templates/web.template.yml
index 316d835..9cd4106 100644
--- a/templates/web.template.yml
+++ b/templates/web.template.yml
@@ -152,6 +152,11 @@ run:
       from: /client_max_body_size.+$/
       to: client_max_body_size $upload_size ;
 
+  - replace:
+      filename: "/etc/nginx/conf.d/discourse.conf"
+      from: /log_format log_discourse \'\[\$time_local\] "\$http_host" \$remote_addr .+$/
+      to: "log_format log_discourse '[$time_local] \"$http_host\" 127.0.0.1 \"$request\" \"$http_user_agent\" \"$sent_http_x_discourse_route\" $status $bytes_sent \"$http_referer\" $upstream_response_time $request_time \"$sent_http_x_discourse_username\"\';"
+
   - exec:
       cmd: echo "done configuring web"
       hook: web_config
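
To check what the replace rule above does before rebuilding the container, the substitution can be tried out on a sample line. This is only a rough sketch in Python: the `ORIGINAL` directive is an assumption (the replacement string from the patch with `$remote_addr` put back where the patch writes 127.0.0.1), the function name `anonymize_log_format` is made up for illustration, and the real discourse.conf may look different.

```python
import re

# Shared shape of the directive, taken from the `to:` string in the patch.
TEMPLATE = (
    "log_format log_discourse '[$time_local] \"$http_host\" {addr} "
    "\"$request\" \"$http_user_agent\" \"$sent_http_x_discourse_route\" "
    "$status $bytes_sent \"$http_referer\" $upstream_response_time "
    "$request_time \"$sent_http_x_discourse_username\"';"
)

# Assumption: the unpatched directive logs $remote_addr in that position.
ORIGINAL = TEMPLATE.format(addr="$remote_addr")
# What the patch writes instead: a fixed loopback address.
REPLACEMENT = TEMPLATE.format(addr="127.0.0.1")

# Same pattern as the `from:` regex in the patch. re.MULTILINE makes `$`
# match at the end of each line, so the rest of the file is left alone.
PATTERN = re.compile(
    r"""log_format log_discourse '\[\$time_local\] "\$http_host" \$remote_addr .+$""",
    re.MULTILINE,
)


def anonymize_log_format(conf_text: str) -> str:
    """Return conf_text with the log_discourse format replaced (hypothetical helper)."""
    # A function is used as the replacement so that nothing in the
    # replacement text is interpreted as a backslash escape or group reference.
    return PATTERN.sub(lambda _match: REPLACEMENT, conf_text)


if __name__ == "__main__":
    sample_conf = "server_tokens off;\n" + ORIGINAL + "\nsendfile on;\n"
    print(anonymize_log_format(sample_conf))
```

If the pattern does not match (for example because the upstream log format has changed), the text comes back unchanged, so it is worth confirming that `$remote_addr` is really gone from the generated file after a rebuild.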