However, my forums do not have legal teams to pick through this new and poorly-defined law.
The fines are huge, and there are always axe-grinding members looking to cause trouble for a forum. For me to keep running forums, I need to know the software I’m using is compliant with the new law.
If I interpret the law correctly then we need to ensure the following:
If IPs have been stored for users without their consent, they absolutely need to be scrubbed from our database, and IPs must no longer be stored for anonymous visitors.
When a signed-up (or signing-up) user visits the forum, they need to see a consent screen with an unticked box and an explanation of how the IP will be used.
If consent is not given, they cannot be allowed to use the forum.
For the record, I absolutely deplore laws like this, as they do a poor job of protecting our rights yet harm millions of businesses and scare the hell out of well-meaning and ethical operators.
I’m absolutely relying on the Discourse team here to take some action to protect its forum operators.
They can be stored, but no longer than necessary for a legitimate purpose.
For rate limiting, there is a legitimate interest, and the retention period needed is pretty short.
For deduplicating link clicks, there is a legitimate interest but they need only to be stored in Redis for 24 hours. I don’t see any reason at all to keep them in the database.
I don’t see the purpose or a legitimate interest for keeping IP addresses in search logs or incoming links.
In contrast to the opening post, I do think topic_views and user_profile_views are problematic. After all, Redis is already deduplicating IP addresses, so there is no need to store the IP address longer than topic view duration hours.
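To make the "deduplicate in a short-lived store, never in the database" idea concrete, here is a minimal sketch. It is not how Discourse does it: `ExpiringDeduplicator` and `first_seen` are hypothetical names, and a plain in-memory dict stands in for a key-value store with per-key expiry such as Redis.

```python
import time
from typing import Callable, Dict, Tuple


class ExpiringDeduplicator:
    """Hypothetical in-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the expiry can be tested
        self._expires_at: Dict[Tuple[str, str], float] = {}

    def first_seen(self, resource: str, ip: str) -> bool:
        """Return True if (resource, ip) was not seen within the TTL window."""
        now = self.clock()
        # Purge expired entries so no IP address outlives the window.
        self._expires_at = {
            key: expiry for key, expiry in self._expires_at.items() if expiry > now
        }
        key = (resource, ip)
        if key in self._expires_at:
            return False  # duplicate within the window: do not count again
        self._expires_at[key] = now + self.ttl_seconds
        return True


# Usage: count a click only the first time an IP hits it in 24 hours.
dedup = ExpiringDeduplicator(ttl_seconds=24 * 3600)
clicks = 0
for ip in ["203.0.113.7", "203.0.113.7", "198.51.100.2"]:
    if dedup.first_seen("link:42", ip):
        clicks += 1
```

The point of the design is that the only thing persisted long term is the counter; the IP lives in the expiring store for the deduplication window and nowhere else.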
Recital 49 talks about usage of data for network and information security. Recital 47 mentions fraud prevention and direct marketing as a legitimate interest. Deduplicating link clicks and topic views could be considered fraud prevention.
There are no hard storage limits defined. The time you need to keep an IP address in order to deduplicate statistics depends on the granularity of the accumulated statistics.
Although I do absolutely welcome these PRs, I do want to emphasize that storing the IP addresses of visitors without an account (for longer than needed for deduplication) is a much more problematic issue, since those people cannot easily be asked to give their consent.
Yeah, I was starting to work on that and it’s a bit tricky due to all the various ways that topic view data is used for logged-in users! And topic views are interesting in that only the first time a user or IP sees a topic is counted right now - it doesn’t reset daily like some of the other data.
@riking once we get ALL of these sorted we can start looking at “data hoarding” reduction.
So, for example, we can roll up incoming links daily, throwing away IPs and keeping only anon vs. logged-in counts per day (and follow a similar pattern for search).
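A rough sketch of what such a daily roll-up could look like, purely for illustration. The `IncomingLink` record and its field names are assumptions, not the actual Discourse schema; the only thing taken from the discussion is the shape of the output (anon vs. logged-in counts per day, no IPs).

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Optional


@dataclass
class IncomingLink:
    """Hypothetical raw row; field names are illustrative only."""
    day: date
    ip_address: str
    user_id: Optional[int]  # None for anonymous visitors


def roll_up_daily(rows: Iterable[IncomingLink]) -> Dict[date, Dict[str, int]]:
    """Aggregate raw rows into per-day counts, discarding the IP addresses."""
    totals: Dict[date, Dict[str, int]] = defaultdict(
        lambda: {"anon": 0, "logged_in": 0}
    )
    for row in rows:
        bucket = "anon" if row.user_id is None else "logged_in"
        totals[row.day][bucket] += 1  # row.ip_address is never copied over
    return dict(totals)


rows = [
    IncomingLink(date(2018, 5, 1), "203.0.113.7", None),
    IncomingLink(date(2018, 5, 1), "198.51.100.2", 17),
    IncomingLink(date(2018, 5, 2), "203.0.113.7", None),
]
summary = roll_up_daily(rows)
```

Once the summary is stored, the raw rows for that day can be deleted, so the long-term data contains no IP addresses at all.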