Deleting personal data

I am running into an issue that I cannot seem to find a good solution to. I have a right-to-be-forgotten request from an old member, which I am attempting to honor. The problem is that their personally identifiable information is not only on their profile but also within a number of topics. While I can delete the data from the posts, the revision history still contains it.

A good example of this is an entire topic devoted to that person, which contains numerous pieces of personal information that I need to be able to delete permanently and completely.

Is there a method I can use to permanently delete that data so that it is no longer available, even to admins? I would prefer not to have to start manipulating the database directly if possible.

Manipulating the database directly is the best (only) way to ensure the data is 100% gone.

Do you have a list of topic or post IDs for which you need to scrub the revision history?

I can dig through and remove the data directly from the database without much trouble. I had just hoped that, with GDPR and the like being the new reality, Discourse would have some built-in method to anonymize or eradicate posts and/or topics as needed, since personal information does exist outside of user profiles.

We support hard nuking all the data the user posted, but we don’t have a “hard nuke topic” or “hard nuke post”. Technically you could change ownership of all the posts in the topic to said user and then nuke the user, but it feels somewhat hacky. That said, this is very rare, so I guess it is a reasonable workaround?

I will look into that for handling future users and see how viable it is long term. My biggest concern is that entire posts do not necessarily need to be destroyed. Usually it would be enough to edit out the specific data, if we could also destroy the revision history. I absolutely understand why this isn’t something that is generally allowed.

I guess my bigger question is: with GDPR being put through its paces and other nations jumping on board, is this a very rare request, or will it become more and more common?

While I know preserving history is extremely important, I have to imagine there is a happy medium somewhere that allows for adherence to privacy without completely breaking accountability. Perhaps some option to eradicate a post history that retains the revisions but replaces the actual content of each revision with some customizable text like “Removed in adherence to right to forget request”. This would at least retain some historical data: that the post had been modified, how many times, and when. That is better than nothing, while allowing personal data to be eradicated completely. This certainly isn’t ideal and may not even be a good option. I am just throwing it out there as a potential way of finding a middle ground.
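To make the idea concrete, here is a rough sketch in Python. This is not Discourse code and does not reflect its actual schema; `Revision`, `redact_history`, and the field names are hypothetical and exist only to illustrate the approach. It keeps the number of revisions and their timestamps but overwrites every revision body with the placeholder text.

```python
from dataclasses import dataclass, replace
from datetime import datetime

# Customizable text that takes the place of the removed content.
PLACEHOLDER = "Removed in adherence to right to forget request"


@dataclass(frozen=True)
class Revision:
    """Hypothetical revision record; not Discourse's real model."""
    number: int
    edited_at: datetime
    body: str
    redacted: bool = False


def redact_history(revisions, placeholder=PLACEHOLDER):
    """Return copies of the revisions with every body replaced.

    The revision number and timestamp are kept, so the history still
    shows that edits happened, how many, and when.
    """
    return [replace(r, body=placeholder, redacted=True) for r in revisions]


# Made-up example history for a single post.
history = [
    Revision(1, datetime(2019, 1, 5, 9, 30), "Call Bob at 555-0100"),
    Revision(2, datetime(2019, 1, 6, 14, 0), "Call Bob"),
]
scrubbed = redact_history(history)

for rev in scrubbed:
    print(rev.number, rev.edited_at.isoformat(), rev.body)
```

In a real system the old bodies would also have to be overwritten in storage and in any backups; the sketch only shows the shape of the data that would be left behind.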

I would say destroying revisions is a reasonably rare request; hiding revisions is what usually tends to be used, per:

Technically this is doable from the console. For our hosted customers we would take care of it on demand, but the demand has been extremely low.
