See which topics user has been reading

(James) #1

Is there a way to see which topics a user has been reading? It would also be nice to see if they’re clicking email links that are sent when new threads are answered, or weekly emails are sent out.

Some kind of comprehensive user log/history would be great.

That way I can learn what kind of interests my individual users have, etc.

(Jeff Atwood) #2

Do you work for the NSA by any chance? :wink:

Some third-party mailers will add open trackers (and even link trackers, via URL rewriting) to emails sent through their service; Mandrill offers this, for example.

You can see the count of the total # of topics a user has entered, and obviously this is tracked in the database, but we don’t show specific topics.

Longer term, we do eventually want to stop showing you “history” category topics if you never, ever visit the history category over a period of time. And conversely, if you always read the “politics” category, we want to show you a bit more of that. But there are no plans to make individual topic viewing histories directly visible to staff.

(Luke S) #3

In a case like this, a legitimate concern might be regulatory compliance.

  • In some jurisdictions, such as the USA, if it’s in the DB, it can be subpoenaed for use in law enforcement or before a court. So some tool for extracting this information may be necessary.
  • The only way around this is to not keep records at all, which is kind of impossible for Discourse functionality.

I don’t know what the best solution may be, but something to keep in mind.

(F. Randall Farmer) #4

“Might be?” Could you supply specific jurisdictions and requirements? Otherwise this is a non-requirement because it is under-specified.

Seriously. When it comes to privacy and legal requirements, don’t make shit up.

(James) #5

I see absolutely no privacy issues with an ability to see what a user is doing on your forum. That’s just using analytics to track a user’s path through a website.

My forum is relatively new, so I’m studying user behavior in order to create topics that are relevant to my users. I don’t want to keep putting up topics that no one cares about. Currently, none of my users are creating any topics themselves, so at this stage creating tailored content is super important.

Being a RoR newbie, what models/tables should I be looking into in order to learn what content specific users are viewing?

(Brentley Jones) #6

I’m in the same boat, trying to seed a new forum.

(F. Randall Farmer) #7

There’s a difference between wanting to know what users are reading in aggregate and wanting to keep a detailed history of what each individual user is reading, in what order, how often, etc.

For example, I read every thread once, then Mute most of them.

As an owner/moderator you can get a lot of information if you know which threads are most Favorited, Liked, Muted, Followed, Replied (thread length), etc. Whatever help the software can give in aggregate is great. What metrics are missing that you need to run a successful community?
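To illustrate the aggregate view described above, here is a minimal sketch in Python. The event records and their `(user_id, topic_id, action)` shape are assumptions made up for the example; they are not Discourse's schema or API. The point is only that per-topic totals can be computed while discarding who did what.

```python
from collections import Counter

# Hypothetical event records: (user_id, topic_id, action).
# This shape is an assumption for illustration, not Discourse's schema.
events = [
    (1, 101, "liked"),
    (2, 101, "liked"),
    (2, 102, "muted"),
    (3, 101, "replied"),
    (3, 102, "muted"),
    (1, 103, "favorited"),
]


def aggregate_by_topic(events):
    """Count each action per topic, discarding who performed it."""
    totals = {}
    for _user_id, topic_id, action in events:
        totals.setdefault(topic_id, Counter())[action] += 1
    return totals


totals = aggregate_by_topic(events)
for topic_id in sorted(totals):
    print(topic_id, dict(totals[topic_id]))
```

The result keeps no user IDs, so it answers "which topics are popular or muted" without recording an individual reading history.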

(James) #8

Is that data available somewhere?

(Jeff Atwood) #9

Yes, this is a much better way to phrase the request.

(Luke S) #10

(At last! Got back to a fast connection!)
Perhaps I sounded a little uncertain.

This was intended to be understood that the USA is a specific jurisdiction. Check out:

for an overview of how this works in the US. Of course, there are wrinkles.

  • The person or organization on which the subpoena is served may object. (No guarantees on whether the objection will be accepted)
  • One tidbit especially relevant to someone running a Forum site:

Rule 45, section (d), (B) Form for Producing Electronically Stored Information Not Specified. If a subpoena does not specify a form for producing electronically stored information, the person responding must produce it in a form or forms in which it is ordinarily maintained or in a reasonably usable form or forms.

  • It could be argued that a platform, such as Discourse, that has no way to retrieve a user’s activity records might fall under this one:

    Rule 45, section (d), (D) Inaccessible Electronically Stored Information. The person responding need not provide discovery of electronically stored information from sources that the person identifies as not reasonably accessible because of undue burden or cost. On motion to compel discovery or for a protective order, the person responding must show that the information is not reasonably accessible because of undue burden or cost. If that showing is made, the court may nonetheless order discovery from such sources if the requesting party shows good cause, considering the limitations of Rule 26(b)(2)(C). The court may specify conditions for the discovery.

The problem is, what happens when a court decides to enforce that last clause? Dump the entire DB? Then the privacy of all your other users is now compromised for no good reason.

  • Penalties for failure to comply depend on the exact local jurisdiction, but often include being held in contempt of court, with fines attached.

I don’t make stuff up. I did, however, assume that residents of the US would be aware that there is a legal mechanism for compelling a third party, such as a forum operator, to hand over data on one of the parties in a legal dispute. Some perspective: In my home area, an individual bringing suit in small claims court can have a subpoena issued for a witness without much in the way of review.

Case in point: (I actually saw this one!)

Party to Suit: Subpoenas witness, tries to bully him on the stand. Epic Fail.
Judge: asks witness a few questions to determine if he knows anything relevant to the case. He doesn’t.
Judge: "So, why are you here?"
Witness: “I have no idea.”

I don’t know what the best solution for this use case is, but I share your concern that such a tool could be abused. Perhaps someone with a better working knowledge of law would have some insights?

(James) #11

I’m actually fine with that at this point.

I have no idea which topics my users (in aggregate) are watching, following, or favoriting. That would be pretty useful to know.

Is making this data available in the dashboard on the roadmap?

(F. Randall Farmer) #12

Umm. None of those things you cited compel storage, only discovery of data that exists.

For example, no US-based major online site that I am aware of stores detailed logs beyond a limited number of days (90 was the de facto standard for a while; many are down to 30 days).

If it isn’t stored, it can’t be discovered.

On the other hand, you have proven the unchallenged assertion: If you store it, they can take it.

For the record, as head of Yahoo’s social platform design team, I worked closely with their legal and site security teams for about 5 years and participated in discovery on several cases. I wrote the security and data storage requirements for several design documents for major social services, including their forum software.

As of 2008, there was no US legal requirement to store a detailed access history for users, and Yahoo! did just about everything they could to avoid it. If for no other reason, they had over a billion users at the time I was there. But the truth was, it was a privacy nightmare with all the jurisdictions they were present in.
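The limited-retention practice mentioned above (detailed logs kept for 30 or 90 days) can be sketched as a simple pruning step. This is an illustration only: the log entry shape, the `prune_access_log` helper, and the 30-day default are assumptions for the example, not how any particular site or Discourse itself does it.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # assumed window; the thread mentions 30 and 90 days


def prune_access_log(entries, now, retention_days=RETENTION_DAYS):
    """Return only the entries newer than the retention cutoff."""
    cutoff = now - timedelta(days=retention_days)
    return [entry for entry in entries if entry["at"] >= cutoff]


# Hypothetical log entries, fixed in time so the example is reproducible.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
log = [
    {"user_id": 1, "topic_id": 101, "at": now - timedelta(days=45)},
    {"user_id": 2, "topic_id": 102, "at": now - timedelta(days=5)},
]

kept = prune_access_log(log, now)
print(len(kept), "entries kept")
```

Anything older than the cutoff is simply gone, which is the whole point of the policy: data that is not stored cannot be produced later.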

(Luke S) #13

I think that we have had a bit of a misunderstanding… I didn’t intend to imply that Discourse should store more data… My first post was looking more at this line:

My point, which probably could have been expressed better, was that since Discourse keeps track of our place in topics, and read status, at least some of users’ activity data is stored to make the platform function, and this data would be discoverable. Could it be accessed if there was a proper court order?

Okay. I’ll keep that in mind next time before I open my mouth. :blush:

(F. Randall Farmer) #14

That explains it. I thought you’d been responding to the original request - especially:

I know I was.

This is a different question altogether. Anything that is stored can probably be compelled (per your citations).

(Luke S) #15

I guess that my first read of the entire original post (possibly sloppy on my part) was as a request for a tool to extract, in an orderly fashion, all user data that Discourse currently stores, not to log even more stuff.

(Dan Dascalescu) #16

I would also love to see readership statistics in Discourse beyond /about and /admin.

Does Discourse already have this data, since it can present to new users the most “interesting” threads after they sign up?