Completely remove user data from the system

Hey, so we might have a situation in which we want to completely erase every trace of a user from the database. I won’t go into details as it’s sensitive, but due to GDPR we might also have to prove that we don’t hold any copy of the user’s information on our system.

There are some parts that I think I identified:

  • The current anonymisation process is fine for the user profile
  • Editing all of their replies to read “deleted by user request” would be acceptable as well, provided that the edit history is removed.

So, what I really think I need is a query (or a Ruby function?) that replaces all of a user’s messages with “deleted by user request” and wipes out the edit history for those messages.

Is there anyone with enough db/discourse code experience that can help with that?

Could you just delete them?

u = User.find_by_username('byebye')
posts = Post.where(user_id: u.id)

and then call PostDestroyer on all of the posts. (I don’t remember exactly how to do that offhand.)
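For what it’s worth, a rough sketch of the PostDestroyer route might look like the following. It assumes you are in the Discourse Rails console and that acting as Discourse.system_user is acceptable; the method name destroy_posts_for is made up for illustration. As far as I know, a staff-initiated destroy only soft-deletes the post, so on its own it may not satisfy an erasure request, and destroying the first post of a topic takes the topic with it.

```ruby
# Sketch only: run inside the Discourse Rails console, where User, Post,
# PostDestroyer and Discourse.system_user are available.
# `destroy_posts_for` is a hypothetical helper name, not part of Discourse.
def destroy_posts_for(username)
  user = User.find_by_username(username)
  return 0 if user.nil?

  count = 0
  # find_each loads the posts in batches instead of all at once.
  Post.where(user_id: user.id).find_each do |post|
    # Assumption: the system user is the acting user for the deletion.
    PostDestroyer.new(Discourse.system_user, post).destroy
    count += 1
  end
  count
end
```

Calling destroy_posts_for('byebye') returns the number of posts it processed, or 0 if no such user exists.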

If you really want to replace their posts with “deleted by…” then you’d do something like

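# update_all skips callbacks, so the cooked HTML is not regenerated; the posts still need rebaking afterwards
posts.update_all(raw: "deleted by user request")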

And then you’d need to wipe stuff from the PostRevision model, maybe

posts.each do |p|
  # remove every stored revision (the edit history) for this post
  PostRevision.where(post_id: p.id).destroy_all
end

You’d want to do a couple of those by hand as a test, or on a staging site if you’re really careful.

I don’t know, that’s why I asked :slight_smile:

I’ll try it on a staging environment in the next couple of days. Thanks for now Jay, always a treasure! :heart:

I’d rather edit, because I don’t want topics being destroyed just because someone is leaving when others may have contributed interesting discussion to the topic.

My one concern is users with tons of replies. I am talking about tens of thousands.
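For that volume, one option would be to work through the posts in batches rather than loading them all at once. This is only a sketch, assuming the Discourse Rails console; scrub_posts_for and BATCH_SIZE are names invented here, and the rebake step assumes each post’s cooked HTML has to be regenerated separately because update_all bypasses callbacks.

```ruby
# Sketch only: run inside the Discourse Rails console, where User, Post and
# PostRevision are available. BATCH_SIZE and `scrub_posts_for` are
# hypothetical names chosen for this example.
BATCH_SIZE = 1000

def scrub_posts_for(username, replacement: "deleted by user request")
  user = User.find_by_username(username)
  return 0 if user.nil?

  scrubbed = 0
  # in_batches yields one relation per batch of BATCH_SIZE posts.
  Post.where(user_id: user.id).in_batches(of: BATCH_SIZE) do |batch|
    ids = batch.pluck(:id)
    # Drop the stored edit history for the posts in this batch.
    PostRevision.where(post_id: ids).destroy_all
    # One UPDATE per batch; callbacks and validations are skipped.
    scrubbed += batch.update_all(raw: replacement)
    # Assumption: rebake each post so the cooked HTML matches the new raw text.
    batch.each(&:rebake!)
  end
  scrubbed
end
```

Calling scrub_posts_for('byebye') returns the number of posts whose raw text was replaced. Rebaking tens of thousands of posts will take a while, so a trial run on staging with a user who has only a few posts seems wise.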

If you have a user request for deletion of all their data, that won’t necessarily include all of their posts/replies. As you probably know, anonymization can be enough for GDPR, provided the posts don’t include highly personal information.

There have been a lot of topics about that here, and the laws vary from country to country.

Indeed, after the first user is anonymized, you can simply merge future anonymized users into that first one.

That would in effect make all anonymized accounts read as a single “deleted user,” much as other platforms like Discord display.

I wouldn’t recommend always doing that, but it is one option that can protect the identity of post authors, since the randomized number no longer maps to a single author. The downside is that topic conversations can become harder to follow when readers can’t tell whether posts are from the same or different authors.

That is quite often impossible, especially with users with a lot of replies.

I have noticed that the merge users functionality fails quite often when I try to merge a user with thousands of messages, even with just a new account. (Use case: an old user returns without their credentials and verifies their identity with me, all good, and I then try to merge the new account, which has a couple of replies, with the original old account.)

That sounds like you need to file a bug report.
