The data are available, if not easy to retrieve. There is a good chance that some other Discourse customer will get such a request before you do, and that an easy way to get the data will exist by then.
Worst case, you’ll have 30 days to solve the problem. At that point you can either do it yourself, or pay someone no more than a few thousand dollars to do it for you. You likely have many larger risks in your life.
Unfortunately for me, the worst-case scenario is here now: I am under pressure to come up with a way to prove I can export a GDPR-compliant dataset to meet data access requests before the act comes into force! I can do this, but I keep raising it because I am sure this is an issue for thousands of Discourse installations, whether they realise it or not.
Including a list of IP addresses in a self-service data export seems dangerous and counterproductive to me. There should only be enough there for what the user needs if they are migrating to another service or backing up their content before deleting it.
I can see what GDPR is trying to do but I’m not sure they have considered the implication of what happens when a hacker steals your account and is able to easily dump a local copy of all the data linked to that account, including the sensitive stuff. Even after you regain control of your account they still have all your data.
Those kinds of requests NEED to go through an information officer that manually verifies the identity of the requester.
Imagine if PayPal just had a button on your account page saying “download all my stored data” and that archive included all your credit card information for instance!
The final rules are the product of extensive debates and are very well thought out. We’re well past the point where expressing opinion really matters; it’s a simple question of complying, or else.
I’m very familiar with @richp10’s dilemma; my own clients span the NHS, government and the UK education sector. As soon as GDPR was finalized it entered the standards and practices for many UK public sector organisations, which impacts all new services.
This situation isn’t new. A few years ago HEIs took a similar stance with identity and access management, mandating SAML2 and SIFA. Until compliance can be proven, products such as Discourse can’t even be proposed at many levels.
Both true. However, in the case of “Right to Access,” the requested data must be provided within 30 days.
Therefore, a semi-manual process, run by an information control officer or other qualified person, and taking a few extra steps to verify the identity of the requester, would be compliant. It could also arguably be preferable from an account security viewpoint.
This is about a million miles from the reality of the public sector. It’s the key difference between what we as technical people know we can deliver on and the risk assessment that non-technical decision makers will run through when approving new products and services for inclusion in a live service.
It doesn’t mean that we can spend 29 days writing code and deliver on the 30th. A request will be received and validated, a change proposed and tested, and the exported data scrutinised (because disclosing more than was requested can be as big a risk as failing to disclose), all before release. No large organisation is going to let the 25th come and go without knowing exactly how they’re going to execute the above; the penalties are just too large.
In the kinds of organisation mentioned above a Right to Access Request will be handled by a records manager in exactly the same way a Subject Access Request is handled today. While there are outliers, they’re typically non-technical managers and need solutions that don’t involve rooting around in a database. Every new solution touted by their peers within the technical organisation has to meet key criteria for audit, access, and discovery of user data. It’s one of the big reasons that a lot of in-house projects get killed off in favor of COTS alternatives which guarantee compliance with one or more standards.
Unless we can demonstrate a toolset which makes the product compliant with GDPR, many projects will be halted. This isn’t hypothetical, though: I’ve already seen several streams of work which have been running for years put under review. It’s just like every other compliance exercise, and Discourse will be no exception.
That’s kinda the crux of it here, right? You can reasonably easily pull all data about a user (on request) within minutes. Whether or not your internal procedures make this onerous isn’t about the software.
Again, that’s my point. Software isn’t compliant or non-compliant. It is up to individual administrators to ensure that they comply – so perhaps step one for anyone that is concerned about whether they are compliant would be to source/write this query.
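For anyone wondering what such a query might look like, here is a minimal Python sketch. It runs against an in-memory SQLite database with a made-up `users` and `posts` schema; the table names, column names and the `export_user_data` helper are assumptions for illustration only, not the actual Discourse schema or any Discourse API. A real installation would need an equivalent query written against its own PostgreSQL database.

```python
import json
import sqlite3


def export_user_data(conn, email):
    """Collect the rows stored about one user into a plain dict.

    Hypothetical helper: the schema below is invented for this example.
    """
    conn.row_factory = sqlite3.Row
    user = conn.execute(
        "SELECT id, username, email, created_at FROM users WHERE email = ?",
        (email,),
    ).fetchone()
    if user is None:
        return None  # no account matches the request
    posts = conn.execute(
        "SELECT id, created_at, raw FROM posts WHERE user_id = ? ORDER BY id",
        (user["id"],),
    ).fetchall()
    return {"user": dict(user), "posts": [dict(p) for p in posts]}


# Demo data in an in-memory database (all values are made up).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, email TEXT, created_at TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT, raw TEXT);
    INSERT INTO users VALUES (1, 'alice', 'alice@example.com', '2018-01-05');
    INSERT INTO users VALUES (2, 'bob', 'bob@example.com', '2018-02-01');
    INSERT INTO posts VALUES (10, 1, '2018-03-01', 'Hello world');
    INSERT INTO posts VALUES (11, 2, '2018-03-02', 'Not mine');
    INSERT INTO posts VALUES (12, 1, '2018-03-03', 'Second post');
    """
)

export = export_user_data(conn, "alice@example.com")
print(json.dumps(export, indent=2))
```

The sketch only covers the extraction step. Verifying who is asking, and reviewing the export before it is released, would still be part of whatever manual process the administrator runs.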
Business or Enterprise customers could deal with these requests using the Data Explorer plugin so we wouldn’t need to be involved at all. Standard customers would need our assistance and we will talk to them on a case by case basis. There (obv) hasn’t yet been a requirement.