Questions about user anonymization and GDPR

Hello,

On a public Discourse instance, a user recently requested deletion of their user account, to which a moderator replied by using Discourse’s anonymization feature.

However, I quickly noticed that the newly assigned user id was usable in the advanced search form to look for all past messages of that poster. It is quite easy for such posts to contain pieces of data that can help identify the user and link their online activities with the “anonymized” account to their physical existence (for example a surname, an age, the mention of the place they live in…).

This creates a problem because it does not fulfill those posters’ request to protect their privacy rights. They may even think their request was fulfilled while the information is actually left around and very easy for others to find. Also, perhaps they were a minor when using that account, which creates additional problems.

While I am not a lawyer, this is most certainly not compliant with the law of at least some European countries - including, I think, France, of which I am a citizen (for French-speaking people reading this, here is a link to our official data privacy protection agency’s dedicated page: Le droit à l’effacement : supprimer vos données en ligne | CNIL). It is generally understood that deleting user data really means deleting any kind of data that might help identify the user (it doesn’t need to be a complete official name or an unambiguous identifier: any mention of personal characteristics, however imprecise, is enough).

Given that you probably don’t want admins and moderators to go through the entire posting history of a user requesting anonymization and manually blank out any potentially personal information in those posts’ contents, it seems to me the only reasonable solution is to physically delete all posts by that person, and any quotes of them in subsequent posts.

Also, if there is a concern that this might disrupt past discussions, perhaps you want to give users and moderators two options: shallow anonymization (current behavior) and full deletion (as proposed above).

PS: I am not giving any links to the original Discourse instance, because it would only risk publicizing someone who asked to be forgotten about. Of course, I can send them in private to Discourse maintainers.

3 Likes

There is already an option to wipe an account completely. You just need to delete all posts then delete the account.

Additionally, if you want the posts to stay up but not be tied to a specific user, you can anonymize the account and then merge it with the system account or another account that can be a placeholder for all anonymized users.

Note: that warning is just a setting that can be overridden

7 Likes

Discourse does sort of have an option that could be used.

Merge Accounts

Simply merge each newly anonymized account into a single account that holds the previously anonymized ones.

Sure, a full delete or a set of keywords might be handy to remove possible identifiers. The above, though, would make it more difficult to use message chains.

Another idea: Anonymize could have a feature request to set the anonymized user’s profile to private. Then the user profile would not be viewable.

3 Likes

I’m not a lawyer either, but my basic understanding of GDPR compliance (not country-specific) is that it can be maintained with a policy of making sure people don’t publish any kind of personally identifiable information in forum posts.

That does require active monitoring to maintain, but as long as that is done, the anonymization feature technically isn’t even required to fulfill GDPR. A user account could simply be permanently suspended; as long as their username and user card don’t identify them, there is no personal data maintained. (Edit: no personal data in public view, but the database still has the e-mail + I.P. address unless the account is anonymized.)

This can be a challenge: if forum administrators don’t fulfill all of the requirements voluntarily, then all that can be done is to report them, and there may be a citation or a call from a regulator to enforce compliance.

Edit: The account does need to be anonymized to purge the I.P. address/e-mail from the database, which is important for GDPR, but those aren’t ever published to the public view by the system, unlike the username, which may be a legally identifiable name.

Also, once something has been published on the public internet for over two months, there may be public or non-public archives made of it, so just deleting the original Discourse posts won’t change that. It is important for people to be careful about not sharing personal information in forum posts if they want to be anonymous or to be able to have their account fully anonymized.

1 Like

Your sentence should continue: …that is collected by the system.

Including France, because that is an EU-level thing, not a national one.

IME, that doesn’t match the actual posting patterns in any real-world forum. Even in relatively impersonal settings (such as an open source community), some people still, occasionally, post personal information which might more or less easily help a reader identify them even after the username was anonymized.

IOW, I think the kind of forum that you’re describing probably doesn’t exist in practice.

Thanks for this. I’m not a moderator myself but I’m forwarding the information.

1 Like

These laws, as you point out, are open to interpretation.

Discourse provides multiple options which forum administrators are free to use at their discretion as they see fit to enforce their own interpretation of the law.

We have multiple customers with GDPR deletion request integrations pointing at their site; some of them delete the posts & accounts outright, some anonymise. Some do it by hand.

But what they do is their decision - we aren’t the data owner of the forums - they are. And if they find our handling of anonymisation is insufficient (they have) then we address it (we did).

If you feel that anonymisation was insufficient on the site you’re talking about, it’s best to address it with them.

10 Likes

Hm. It feels strange that site maintainers should be responsible for personal traces which people have interwoven into the web of conversations in a forum.

3 Likes

I don’t know what “IOW” means, but I have seen this practice maintained at a U.S.-based forum for FTC compliance, and also for GDPR for European users. I am in the U.S., not the E.U.

Here at Meta it is a policy not to sign posts, as the user card acts as a signature where people can have a link to their own site and some personal contact information. As long as that is restricted to the user card, it doesn’t need to be repeated in posts.

It is common for people to forget that this is a policy, so it does require active moderation to maintain. Maybe it is not a common practice, but it does exist.

1 Like

That’s useful, thank you!

Yeah, I know :slight_smile:

That’s what I did, and it was suggested in reply that I ask on meta.discourse.org, hence this topic :wink:

2 Likes

“In other words”

3 Likes

Thanks. Then there was another acronym, “IME”; that one is also a mystery.

I was talking specifically about direct personal contact information which is what the FTC regulates as well as the GDPR.

Statements like “I live in Sweden” don’t have to be censored for GDPR compliance.

That sounds to me like the admins on the forum in question are not aware of the available functionality. I can tell you with absolute conviction that we provide all the tools necessary to use Discourse in a way that is GDPR compliant. How they choose to use those tools is discretionary.

5 Likes

“IME” I believe means In My Example.

The EU is interesting, as the onus is on the site admin to “block” EU IPs if the site is not compliant with GDPR. I may not be recalling this entirely correctly.

1 Like

Small adjustment: I would say it is “In My Experience.”

3 Likes

It is! Sorry for that.

1 Like

While “I live in Sweden” is not distinctive enough, “I am 14 years old and I live in this small village in Sweden” probably is.

To my understanding, the GDPR’s definition is much broader than “direct personal contact information”, and so is that of national laws predating GDPR, such as in France.

See What is considered personal data under the EU GDPR? - GDPR.eu

And in particular this example:

There are millions of Roberts in the world, but when you say the name “Robert,” generally you are trying to get the attention of the person you are facing. By adding another data point to the name (in this example, proximity), you have enough information to identify one specific individual. These data points are identifiers.

No, it is not, not even remotely.

And GDPR is not for that at all. A user may reveal as much personal stuff as he/she/it wants.

Thanks for clarifying; it’s hard to keep track of all these abbreviations. :beers::sunglasses::+1::sparkles:

@pitrou all good, there are too many abbreviations to memorize :joy:. I often have to google them when I’m not familiar with one.

By default, all the anonymization feature does is remove the account’s I.P. address, e-mail, and username from the database. That is no guarantee of protecting someone’s anonymity, depending on what they have published on the site that hasn’t been edited out.
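
To make that concrete, here is a toy Python model. It is only a sketch, not Discourse’s actual code or database schema: the `User` and `Post` classes, the `anonymize` and `posts_by` helpers, and the sample data are all made up for illustration. It assumes anonymization only scrubs account-level fields, as described above, and shows that the post bodies stay reachable through the unchanged user id.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    username: str
    email: str
    ip_address: str


@dataclass
class Post:
    author_id: int
    body: str


def anonymize(user: User) -> None:
    """Scrub account-level identifiers only (hypothetical helper).

    Post bodies are deliberately left untouched, which is the point
    of this sketch.
    """
    user.username = f"anon{user.id}"
    user.email = ""
    user.ip_address = ""


def posts_by(posts: list[Post], user_id: int) -> list[Post]:
    """Return every post written by the given user id."""
    return [p for p in posts if p.author_id == user_id]


# Made-up sample data.
user = User(42, "robert", "robert@example.org", "203.0.113.7")
posts = [
    Post(42, "I am 14 years old and I live in a small village in Sweden."),
    Post(7, "Welcome to the forum!"),
]

anonymize(user)

# The account fields are gone, but the identifying text is still
# linked to the same author id and can be listed in one query.
remaining = posts_by(posts, user.id)
for post in remaining:
    print(user.username, "wrote:", post.body)
```

In this model, only deleting or editing the posts themselves would remove the identifying sentence; changing the account fields does not.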

For the situation you described it may be best to file a report with French authorities if the site administrators are not following through with their responsibility to the French GDPR and other relevant laws.