I am concerned about COPPA for under 13 users

Hi. I’m reviving this discussion to extend it beyond IP addresses, which is what everyone has talked about so far. Discourse logs, and displays publicly, far more information, none of it nearly as justifiable as IP logging: date joined, date of last post, trust level, time spent reading, number of topics read, number of hearts given and received, etc. In addition to the GDPR, I’m concerned about the US COPPA, because some of my site’s users will be under 13, and as a privacy enthusiast all this data collection just rubs me the wrong way.

If we really have to dig through the code to find all this, I suppose one of those 12-year-olds would be happy to do it. But really, I can’t be the only person who thinks this way, and I’d really like one big check box under Basic Setup saying “Maximize privacy.”

I can’t speak to GDPR/COPPA specifics (I’m not a lawyer), but I’m a bit confused here. Quite a bit of what you listed is already public: last post date, likes given/received, and most of the other data on the user summary page (topics and posts created, top replies, top topics, top links…). As for the rest, I’m failing to understand the concern. Knowing how many topics I’ve read, for example, doesn’t seem like personally identifiable information or a privacy concern.

If users are concerned, they can enable the “Hide my public profile and presence features” user preference to prevent non-staff users from seeing their profile page.


The problem with this is that “privacy” means something different to everyone. What is a privacy leak in your mind is perfectly fine in someone else’s. Also, whose privacy is the setting trying to protect? And from whom?

Can the site be public? That allows non-users of the site to see content. I guess the site becomes login required, which prevents anonymous users from accessing the site. Perhaps registration should be disabled, which prevents anonymous users from signing up. What about private email: can we trust users of the site not to forward emails they receive with content from the site?

I’m guessing none of the settings I just listed were what you were thinking for “maximize privacy”, but all change the privacy of user content. My point here is that a one size fits all setting for something as broad as “privacy” isn’t going to work.

Yes, exactly, that’s my point!

It would help a lot if that “Hide my public profile and presence features” user preference were split in two. Kids want to be able to show “I like Chinese food and anime” but not “I spent six hours here yesterday.” And then all I’d need is a way to turn on “Hide my presence” by default for new users.

But better than “hide” would be “don’t collect”, so staff don’t have access either, and (not that I expect this to be an issue for our young users) neither do people with subpoena powers.

“Hide my presence” isn’t what you think it is. Presence is the indicator in the composer that someone else is typing, nothing else. Disabling presence will not hide “I spent six hours here yesterday”.

“Don’t collect” would break Discourse. Don’t collect post date, no posting allowed. Don’t collect like count, no liking. Don’t track posts read, no indication to the user what they’ve read. Disabling all of this would turn Discourse from forum software into a static website.

All this data has to be stored somewhere. Sure, you could write a plugin to hide it from the web UI, but it would still need to be in the database, and thus still be very much subpoena’able.

I don’t understand. Why does one imply the other? Thanks.

Extreme over-simplification of the database to follow:

Going to use likes as the example as it’s simpler than posts.

When a user likes a post, that action causes multiple changes in the database.

  • The like_count for the post is incremented by 1.
  • The likes_given for the user is incremented by 1.
  • The fact that user foo liked post #123 is stored.

To show the like count underneath the post, we must increment like_count for the post. If likes_given isn’t incremented, that number isn’t as quick to obtain, but it can still be derived from the “user foo liked post #123” records.

If we also stop tracking which user likes which post, we would only be able to display the count of likes below each post, not who liked it. We’d also no longer be able to limit users from liking the same post multiple times, or limit how many likes they give each day. A single user could sit and like a post infinite times.
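
As a toy illustration of those three pieces of data (a sketch only: the class, field names, and the daily limit value are made up for this example, and this is not Discourse’s actual schema or code):

```python
from collections import defaultdict
from datetime import date


class LikeStore:
    """In-memory stand-in for the three like-related records described above."""

    def __init__(self, daily_limit=50):
        self.daily_limit = daily_limit        # hypothetical per-user daily cap
        self.like_count = defaultdict(int)    # post_id -> number of likes
        self.likes_given = defaultdict(int)   # username -> likes given (a cache)
        self.liked = {}                       # (username, post_id) -> day of the like

    def like(self, username, post_id, today=None):
        """Record a like; return False if it is a duplicate or over the daily cap."""
        today = today or date.today()
        if (username, post_id) in self.liked:
            return False  # only detectable because we store who liked what
        given_today = sum(
            1 for (user, _), day in self.liked.items()
            if user == username and day == today
        )
        if given_today >= self.daily_limit:
            return False
        self.liked[(username, post_id)] = today
        self.like_count[post_id] += 1
        self.likes_given[username] += 1
        return True

    def likes_given_recomputed(self, username):
        """Slower path: derive likes_given from the per-like records."""
        return sum(1 for (user, _) in self.liked if user == username)
```

Drop the `liked` mapping and both checks at the top of `like()` become impossible, which is the point being made above.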

Does that help?

I see the argument about likes (although I don’t like likes as a forum feature), but how does not collecting who posted when prevent posts? I see that it prevents preventing multiple posts. So for spam-prevention maybe you want to keep the posting data for 24 hours and then delete it.
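
Roughly what I have in mind, as a sketch only. The record type and function names are invented for illustration, this is not how Discourse stores posts, and which fields count as “posting data” is an assumption (here the timestamp is always dropped and the author only on request):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

RETENTION = timedelta(hours=24)  # the 24-hour window suggested above


@dataclass
class PostRecord:
    post_id: int
    body: str
    author: Optional[str]
    posted_at: Optional[datetime]


def scrub_old_metadata(posts, now, scrub_author=False):
    """Blank the timestamp (and optionally the author) on posts older than
    RETENTION. Returns the number of posts scrubbed in this pass."""
    scrubbed = 0
    for post in posts:
        if post.posted_at is not None and now - post.posted_at > RETENTION:
            post.posted_at = None
            if scrub_author:
                post.author = None
            scrubbed += 1
    return scrubbed
```

A job like this would run periodically; rate limiting and spam checks would only ever see the last 24 hours of timestamps.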

But, I get that I have very different ideas about these things from the Discourse team. You probably have a user community mostly of young adults. The needs of a forum for children are very different – and so are the preferences of someone old enough to remember when we had privacy. :slight_smile:

In and of itself it doesn’t, but it would make the forum anonymous, and completely change how it works. So I may have been a bit overzealous here - it may not completely prevent posting, but it would definitely break how one expects Discourse to work.

You’ve got that right :smiley:. We do not think about COPPA when building Discourse. Thinking back to helping set up an online service for someone under 13, I recall all sorts of extra hoops to jump through for registration, as well as full moderation of all content prior to posting. I don’t recall all posting being anonymous, though.

In any case, you could likely make Discourse work for your needs, but you’d need to do significant work via a plugin to make it happen. I don’t see this type of “don’t collect data” becoming a core feature.

I don’t feel “I don’t like likes” has much, if anything, to actually do with COPPA. I suggest consulting with a lawyer if you are building a site that caters to those 12 years or younger.

Yes, I was just going to say that to Joshua (time for me to consult the university’s lawyers) when you posted that.

I don’t want posts to be anonymous, by the way. So yes, by scraping the actual posts one could reconstruct a log. But that’s different from us keeping a log on purpose and displaying it with the user’s profile.

Long ago, when I was a callow youth and had never thought about privacy as an issue, we implemented a public display of users’ time of last logout. The idea was that our users woke up and slept in different phases, and you could use the last logout time to make inferences about whether your friend was awake now or not. To my astonishment, users complained; they were worried that people might instead make inferences about whether they were slacking. That was when I first learned that sometimes it’s better for sysadmins not to know some things.

Anyway, thanks for the discussion and the education.

The one thing you will achieve with any of the above is indicating which users are under 13, effectively marking them out.

Organisations I work with who have children or vulnerable adults posting usually enquire about additional protections, but quickly realise that unless these changes are made to all users they’re basically drawing a bullseye on the people they’re trying to protect.

IMHO you should consult with your lawyer.

My take on the COPPA Code of Federal Regulations
https://www.ecfr.gov/cgi-bin/text-idx?SID=4939e77c77a1a1a08c1cbf905fc4b409&node=16%3A1.0.1.3.36&rgn=div5
is that the primary intent is the scope

§312.1 Scope of regulations in this part.

This part implements the Children’s Online Privacy Protection Act of 1998, (15 U.S.C. 6501, et seq.,) which prohibits unfair or deceptive acts or practices in connection with the collection, use, and/or disclosure of personal information from and about children on the Internet.

I don’t see how any of the various forum related data could be considered within that scope. That is, I don’t see it as being either unfair or deceptive. Nor do I consider it to be PII.

But again, consult your lawyer.

Oh, yes, sorry I didn’t make myself clear. It’s because of young people that this issue is important, but the privacy protection should apply to all users.

I’m thinking due diligence would include getting parental consent. Unfortunately I can’t think of any easy way for Discourse to do this.

Are you suggesting that posts have neither owners nor time stamps? That doesn’t make sense. I must be missing something.

(I work with Brain, and am maintaining our discourse instance.)

I think in many ways, the COPPA-specific discussion is a bit of a distraction from the general concern, which is doing the most we can for user privacy. There is a difference between something being technically possible but difficult to do, and something that is already done for you. (Especially since, given a large enough body of text, even an anonymous user can be de-anonymized with reasonable certainty.)

(We are using SSO for our forum, so those issues of user validation and age are handled elsewhere.)

But in general, I think the request is more basic: we’d simply like to have forum defaults for users which are pretty privacy conscious. I.e., “By default user stats are not shown on profile pages”, similar to how you can hide suspension reasons or whitelist custom fields to show on user pages. A starting point of “whitelist forum data to show” would be really helpful.
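
A minimal sketch of that “whitelist forum data to show” idea, assuming a user record is just a dictionary. The field names and the helper are hypothetical and do not correspond to any existing Discourse setting or API:

```python
# Hypothetical allowlist: nothing is shown on a profile unless an admin opts it in.
PUBLIC_PROFILE_FIELDS = {"username", "bio"}


def public_profile(user_record, allowed=PUBLIC_PROFILE_FIELDS):
    """Return only the fields an admin has explicitly allowed to be displayed."""
    return {key: value for key, value in user_record.items() if key in allowed}


record = {
    "username": "foo",
    "bio": "I like Chinese food and anime",
    "time_read_seconds": 21600,   # "I spent six hours here yesterday"
    "topics_read": 42,
    "likes_given": 7,
}
print(public_profile(record))
```

The design choice is the default: a newly added stat stays hidden until someone decides it should be public, rather than the other way around. Note this only controls display; the data is still stored.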

If these are settings we could make plugins for, that’s probably a reasonable path too. :slight_smile:

There’s nothing stopping you from overriding these values directly in the database. For most everyone else, though, the things which have been flagged as privacy concerns above are actually central to promoting discussion.

Take the presence indicator on topics: I could be drafting a reply during a back-and-forth when I see that @codinghorror is also responding. It gives me the option to hold off on my reply until I’ve read his response. The indicator doesn’t facilitate any kind of abuse; it merely gives me more information and context around which I can make decisions.

Rather than broadly flagging the above as privacy concerns, could you elaborate for each on how you believe privacy is breached by displaying the information, and how it might be abused?

I think it should be reasonably easy to do what you’ve described here in a plugin.

I’ve been thinking about what is easily possible right now (I have omitted some settings). For example:

  • disable enable_badges so that various forum activity can’t be deduced from the badge requirements
  • disable enable_personal_messages so that all interactions will be public
  • disable enable_user_directory i.e. no “users” page
  • disable allow_uploaded_avatars and allow_profile_backgrounds so the avatar and profile can’t have PII images
  • set approve_unless_trust_level and approve_new_topics_unless_trust_level to a level members will not have, i.e. 4, so that all content can be vetted as being “safe” before it becomes public
  • use CSS to display: none all manner of things.
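
Collected in one place, the settings above might be kept as a checklist like this. The setting names are the ones listed above; representing “disable” as False and the small comparison helper are assumptions for illustration, not an official Discourse tool:

```python
# Desired values for the site settings listed above.
PRIVACY_SETTINGS = {
    "enable_badges": False,
    "enable_personal_messages": False,
    "enable_user_directory": False,
    "allow_uploaded_avatars": False,
    "allow_profile_backgrounds": False,
    "approve_unless_trust_level": 4,
    "approve_new_topics_unless_trust_level": 4,
}


def pending_changes(current, desired=PRIVACY_SETTINGS):
    """Return the settings whose current value differs from the desired one."""
    return {
        name: value
        for name, value in desired.items()
        if current.get(name) != value
    }
```

Given a dictionary of a site’s current values (however you obtain them), `pending_changes(current)` lists what is still left to change in the admin panel.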

That is why the previously mentioned site setting to disable user profiles already exists.

I think it is a very error-prone exercise to try to guess what settings exist, or what you want, before actively using the software at least a little.

That said, if you will have users under 13 you really, really need to talk to lawyers first.
