Providing data for GDPR

gdpr
privacy

(Richard Phillips) #11

Ah - I didn’t know that - thanks…


(Richard Phillips) #12

Got it, I think you are right (though for most users a PDF might be easier!). Just adding the personal information would probably make it legally compliant…


(Christoph) #13

Do we have clarity about what “the data” includes? I see two extreme possibilities:

  1. only the personal data (e.g. the name, IP address, etc.)
  2. all of the above plus all data somehow linked to these (e.g. how much time that person has spent reading each and every post on the forum).

I doubt that either of these extremes is the correct interpretation of “the data”, and I hope the “truth” doesn’t lie too close to #2.

BTW: To keep these discussions as focused as possible, I’d like to suggest that someone moves this latest topic digression about providing data into a new topic (and tags it #gdpr).


(Richard - DiscourseHosting.com) #14

It’s #2.

‘personal data’ means any information relating to an identified or identifiable natural person


(Richard Phillips) #15

Entirely agree that for the purpose of a data access request it is #2 - everything must be provided. All posts, the lot. Good thing is that existing functionality provides most of this, we just need the personal information adding.

For the purposes of the right to be forgotten… Would you agree that:

If a user anonymizes their account but leaves posts in place - the remaining posts are no longer personal data because they are no longer ‘linked’ to identifiers. If so, it would appear that the ‘anonymize’ function is sufficient to comply with the right to be forgotten - as long as it also removes IP addresses, email addresses etc in the background.

There remains a risk that posts contain personal identifiers. My thinking is to offer users (not via admin) both options - anonymize and accept that posts remain (so we have consent in the event some identifiers remain) or complete deletion.


(KajMagnus) #18

Isn’t that a risk even before the account gets deleted? Personal data in comments & topics is probably rather bad most of the time, because in general the staff don’t know whether the person identified in the text is okay with that. And if someone types his/her own name and details (which, intuitively, one should be allowed to do?), then in general the staff still wouldn’t know whether s/he really is who s/he claims to be. Maybe s/he is an impostor, and the real person doesn’t want any of his/her name & info there.

Maybe personal data in comments & posts should in general be deleted immediately (I mean, when the staff or core members see it) and the author be sent a warning, rather than waiting until the relevant account gets deleted. For example, Reddit has a policy against posting any PII; one can get banned quickly for posting PII. (Public figures, like politicians & celebrities, are exceptions.)

If someone wants to tell the world who s/he is — then s/he can use his/her profile bio text, for that. And if later on s/he deletes the account, then the bio disappears, all is fine.

Maybe enabling-deletion-of-all-one’s-old-posts could be a forum-wide option that could be turned on by admins? I’m thinking both alternatives make sense: some forums with sensitive data (e.g. health issues) might want to make their users feel extra safe & respected by enabling the “delete all my old comments” button, whilst the default, for “normal” forums, could be to disallow that (to avoid “destroying” old discussions).


(Blu McCormick) #19

This is a requirement in some form in Europe with real world consequences if you don’t comply. Having this might also be a selling point in this current anti-Facebook backlash. People seem to be turned off by the sharing of their personal info and knowing they have control over deleting personal information could be a plus.


(Christoph) #20

So on Discourse, that would mean all the read times for all posts etc.? If so, I suppose it would suffice to provide the post ID, right? Or does the post itself become part of my personal data because I have read it for X seconds and Discourse is saving that information in its database?


(Richard - DiscourseHosting.com) #21

No, reading a post will not make it part of your personal data.

Since a post ID is not visible to a user, the topic ID and post number would be better.
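As a rough illustration of that suggestion, here is a minimal Python sketch of a read-time export keyed by topic ID and post number. The `PostTiming` record and the `export_read_times` function are hypothetical names invented for this example; they are not Discourse's actual schema or API.

```python
import json
from dataclasses import dataclass


@dataclass
class PostTiming:
    """Hypothetical row: how long one user spent reading one post."""
    post_id: int       # internal database ID, not visible to users
    topic_id: int      # visible to users, e.g. in the topic URL
    post_number: int   # position of the post within its topic
    msecs_read: int


def export_read_times(timings):
    """Return a JSON string of read times keyed by topic ID and post number.

    The internal post_id is left out on purpose, and so is the post text:
    per the reply above, reading a post does not make it the reader's data.
    """
    rows = [
        {
            "topic_id": t.topic_id,
            "post_number": t.post_number,
            "msecs_read": t.msecs_read,
        }
        for t in sorted(timings, key=lambda t: (t.topic_id, t.post_number))
    ]
    return json.dumps(rows, indent=2)
```

A user receiving this export can match each row to a post they can actually find on the forum, which an internal ID would not allow.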


(Blu McCormick) #22

Here is a good starter article for people interested in this topic:

https://www.wileyrein.com/newsroom-newsletters-item-May_2017_PIF-The_GDPRs_Reach-Material_and_Territorial_Scope_Under_Articles_2_and_3.html


(Angus McLeod) #23

I don’t mean to offend, but this topic and its companion are a little misleading.

If you’re looking for reliable information on this subject you should restrict yourself to:

  1. Official sources, e.g. the European Commission’s Article 29 Working Party.

  2. Formal legal advice.

Don’t rely on 3rd party summaries (or even what folks are saying here, including me).

Regarding the substantive points, I would point out a few things

  1. Concerning the Article 29 Working Party’s Guidelines on the Right to Data Portability I note:

    • Availability of data via a JSON API is explicitly mentioned (multiple times) as a suitable data format. In fact one might even say it is encouraged vis-a-vis other methods.

    • There is no requirement to provide everything in a single package, or instantly. The data needs to be provided “within a reasonable time not exceeding one month”.

    • The thrust of the regulation is to avoid data “lock-in” and to promote interoperability.

    As far as I can tell, there is nothing that Discourse needs to add to its existing functionality to allow forums to which this directive applies to comply with it.

  2. Concerning the Right to Erasure (aka “Right to be forgotten”), I would reiterate that the applicable timeline (like with the Right to Data Portability) is one month. There is no need to provide a one-click “Forget me” button for users. It is quite possible to comply with requests to be forgotten within the existing functionality of Discourse.

    Moreover, it is not clear to me that it would be a good idea to allow users to completely erase all data concerning them by themselves, as the Right to Erasure explicitly requires the data controller to consider exceptions and countervailing rights when complying with a request.

The bottom line here is that, as far as I can tell, Discourse does not contain any structural impediments to your compliance with the GDPR. Compliance with the GDPR is up to you, as it arises in specific cases and is largely a matter of organisational management, not one of technical functionality.

If you think the GDPR may apply to you, you should at a minimum review the help documents provided by the relevant Data Protection Authority in your jurisdiction (as they will be the ones actually enforcing the GDPR), and seek legal advice if you have specific concerns. If you’re not sure which DPA applies to you, you can review the European Commission’s own documents I linked above, or just pick a DPA that uses a language you can understand.

None of the above constitutes legal advice, and I am not your lawyer.


(Sam Saffron) #25

This is one huge sticking point for me. If you signed up and accepted in the TOS that you are licensing your content to the forum operator, I am not sure you have a leg to stand on when asking for erasure. Asking for anonymization, sure, but erasure is far stronger and more disruptive.

For example, with Stack Overflow you are licensing your content to Stack Exchange under CC BY-SA 3.0 with attribution required. There are strong competing rights here, given an existing granted license.


(KajMagnus) #26

Actually that seems incorrect to me (so good advice to not listen to anyone then :slight_smile: ) and, reading the docs, it seems to me that a delete-account (revoke consent) button is needed. From the docs:

However, when consent is obtained via electronic means through only one mouse-click, swipe, or keystroke, data subjects must, in practice, be able to withdraw that consent equally as easily. Where consent is obtained through use of a service-specific user interface (for example, via a website, an app, a log-on account, the interface of an IoT device or by e-mail), there is no doubt a data subject must be able to withdraw consent via the same electronic interface, as switching to another interface for the sole reason of withdrawing consent would require undue effort. Furthermore, the data subject should be able to withdraw his/her consent without detriment. This means, inter alia, that a controller must make withdrawal of consent possible free of charge or without lowering service levels.

From ARTICLE29 Newsroom - [adopted, but still to be finalized] Guidelines on Consent under Regulation 2016/679 (wp259) - European Commission, section 5.2 Withdrawal of consent, on page 21.

Then they go on to describe an example where consent is given via a one-click web widget and withdrawn by making a phone call during business hours. And that’s not ok. To me it seems that having to switch to email and message the staff is not totally ok either (although not quite as bad).


#27

That’s giving and revoking consent. It’s essentially saying that if you check a box to give consent to your data being stored, then you need to be able to uncheck a box to revoke your consent. It does not say that unchecking the box must delete your data. What happens then is unrelated and, as @angus pointed out, can occur over a one-month period.


(KajMagnus) #28

@HAWK I didn’t write anything about the personal data having to get deleted immediately. I said apparently there does need to be a button (or checkbox), when someone else said no-button-or-checkbox-needed.

(I’m assuming people mean the same thing when they talk about a delete-account widget, forget-me widget, and a revoke-consent widget. I’m thinking it would delete the user’s personal data (but not the user’s CC-By licensed posts).)

In fact I think it can make sense to schedule the deletion a week later, in case the user changes his/her mind.
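A minimal Python sketch of such a delayed deletion, assuming a seven-day grace period during which the user can still cancel. `DeletionRequest` and `GRACE_PERIOD` are hypothetical names for illustration, not an existing Discourse feature.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "a week later" is taken literally as seven days.
GRACE_PERIOD = timedelta(days=7)


class DeletionRequest:
    """Hypothetical record of a user's request to delete personal data."""

    def __init__(self, user_id, requested_at):
        self.user_id = user_id
        self.requested_at = requested_at
        self.cancelled = False

    @property
    def execute_at(self):
        """Earliest moment at which the deletion may actually run."""
        return self.requested_at + GRACE_PERIOD

    def cancel(self, now):
        """Cancel if the grace period is still running; return True on success."""
        if now >= self.execute_at:
            return False
        self.cancelled = True
        return True

    def is_due(self, now):
        """True once the grace period has passed and the user did not cancel."""
        return not self.cancelled and now >= self.execute_at
```

A periodic job could call `is_due(datetime.now(timezone.utc))` on each open request and delete the personal data of those that return `True`. A one-week delay stays well inside the one-month window mentioned earlier in this topic.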


(Kane York) #29

This is important, and it would be reasonable to conclude that the rights of the other forum participants and readers to an accurate archive of conversations means that the Anonymize feature is plenty sufficient. Personal information in the posts themselves should be dealt with via case-by-case review and manual editing, either by the moderators or by the user.


(Matt Palmer) #30

That is a rather poor assumption to make.


#32

That implies that revoking consent means all data has to be deleted and I’m not sure that’s the case.

That said, what I think we should avoid doing is debating the semantics here. I think that until we see this in action we’re shooting into the dark.


(Richard - DiscourseHosting.com) #33

Please note the differences between GDPR Article 15.3 “Right of access” and GDPR article 20, “Right to data portability”.

The JSON approach is certainly an interesting one (thank you for bringing it to our attention), but I have two concerns:

  • does it apply to article 15 as well?
  • one of the differences between article 20 and article 15 is that for article 20 a subset of the data will suffice, where article 15 requires all data to be made available. Right now there seem to be some fields (for instance my sign-up IP address, post reading times) that are not available to a user by means of a JSON API call.

For implementing the Right to Erasure it is indeed not required to provide a one-click button to users. The process (currently: “send a PM to admin”) should be clearly documented though. I also agree with you that deleting post content is not required: the countervailing interest of the other users in keeping the discussion intact outweighs it.


However, I do see some impediments to GDPR compliance, and I think it’s a good idea to try to make a list. This is what I have right now:

  • IP addresses are stored in too many places without legitimate interest
  • GDPR does not require deletion of posts when a user is deleted, but Discourse does. You will have to anonymize instead, but that does not delete enough other data (see below)
  • Anonymizing a user leaves @ mentions
  • Anonymizing a user keeps (amongst others) IP addresses, reading times*, clicked links* in the database
  • If JSON is a usable method to implement article 15, it does not provide enough data

*) need to double check
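To make the anonymization gaps in that list concrete, here is a minimal Python sketch of a scrub that also clears IP addresses, reading times and clicked links, and rewrites @ mentions in other people's posts. The `user` dictionary layout and the `anonymize_user` function are assumptions made up for this example; they do not describe how Discourse stores or anonymizes data.

```python
import re


def anonymize_user(user, posts, placeholder="anon"):
    """Return (scrubbed_user, scrubbed_posts) without mutating the inputs.

    `user` is a hypothetical dict with the keys used below; `posts` is a
    list of post bodies from anyone on the forum that may mention the user.
    """
    old_name = user["username"]
    new_name = f"{placeholder}{user['id']}"

    scrubbed_user = {
        "id": user["id"],
        "username": new_name,
        "email": None,
        "ip_addresses": [],    # sign-up and last-seen IPs
        "reading_times": [],   # per-post read timings
        "clicked_links": [],   # link click history
    }

    # Match @old_name only as a whole mention: not inside an e-mail address,
    # and not as the prefix of a longer username such as @alice2 or @alice-b.
    mention = re.compile(
        r"(?<!\w)@" + re.escape(old_name) + r"(?![\w-])", re.IGNORECASE
    )
    scrubbed_posts = [mention.sub(lambda m: "@" + new_name, body) for body in posts]
    return scrubbed_user, scrubbed_posts
```

This only handles mentions in the literal `@username` form. Quotes, display names and identifiers typed into post text would still need the case-by-case review discussed earlier in this topic.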


#34

This is currently being addressed.