How do I extract raw data from my Discourse community site?

I have the opportunity to work with a few social media business analysts to make recommendations on our community. However, this requires that they have access to more than the out-of-the-box Excel spreadsheets that I’m able to generate as the site’s admin.

The analysts are asking for aggregate, raw data on posts, handles, and everything related to who posts something, where they came from, how long they stayed, who reacted to their post, etc. Is there a way in Discourse to make this data available to analysts? I suspect it would be similar to the datasets I would need if I were migrating the community to another platform.

Does Discourse provide this dataset, or is this something individuals need to do through the API? Unfortunately, I am not a programmer, so unless the API configuration were a simple GUI, I would be at a loss.

I’m hoping someone can point me in the right direction for getting access to the raw data so that I can give it to the analyst team.

Thank you!

You can use the Data Explorer plugin: create queries that pull the data you need and give your analysts access to them. Data Explorer can query the data generated in your forum, and you can control who has access to each query through groups.
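
If your analysts are comfortable with a little scripting, a saved query can also be run over HTTP and the result turned into a CSV file. Below is a minimal sketch in Python, not a definitive implementation. It assumes an admin has created an API key, that the run endpoint is `/admin/plugins/explorer/queries/{id}/run`, and that the JSON response contains `columns` and `rows`; please check all of this against the plugin documentation for your version. The site URL, query ID, key, and function names in it are hypothetical placeholders.

```python
import csv
import io
import json
import urllib.request


def build_run_request(base_url, query_id, api_key, api_username):
    """Build (but do not send) a POST request that runs a saved query.

    Assumption: the endpoint path and the Api-Key / Api-Username headers
    match your Data Explorer version; verify before relying on this.
    """
    url = f"{base_url.rstrip('/')}/admin/plugins/explorer/queries/{query_id}/run"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the query is identified by its ID in the URL
        method="POST",
        headers={
            "Api-Key": api_key,
            "Api-Username": api_username,
            "Accept": "application/json",
        },
    )


def result_to_csv(result):
    """Convert a result shaped like {"columns": [...], "rows": [[...], ...]}
    (assumed response shape) into CSV text that opens in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(result["columns"])
    writer.writerows(result["rows"])
    return buffer.getvalue()


def run_query(base_url, query_id, api_key, api_username):
    """Send the request and return the decoded JSON result."""
    request = build_run_request(base_url, query_id, api_key, api_username)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

An analyst could then call something like `result_to_csv(run_query("https://forum.example.com", 7, key, "system"))` and save the text to a `.csv` file. If nobody on the team wants to script, running the queries in the admin interface and downloading the results there works without any code.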

Someone had a similar issue to the one you describe; see my reply here: Expose Data Explorer query as Report - #2 by cocococosti

Thank you for the response. I don’t think I can use this option. It seems that the plugin requires a self-hosted standard installation. The install instructions say: “We only support the standard method of install here, so these instructions assume you have a standard install.”

Our instance of Discourse is hosted.

Is there another way that I could run queries on our community data without installing a plugin? Does Discourse provide this service?

Thank you!

Looking at your site’s code, I think you may already have Data Explorer. If you’re hosted with Discourse on the Business or Enterprise plan, Data Explorer is included.

If it’s not already enabled, you should be able to do that from this setting:

https://community.graylog.org/admin/site_settings/category/plugins?filter=plugin%3Adiscourse-data-explorer

And once enabled you can find it here:

https://community.graylog.org/admin/plugins/explorer

Hopefully I’m right. :slightly_smiling_face::crossed_fingers:

JammyDodger:

Thank you so much! Your links were correct and helped me locate the plugin.

Do you know why I’m getting this error message in some cases:

The requested URL or resource could not be found.

It happens on most (but not all) queries:

  1. Who has been sending the most messages in the last week?
  2. Last 500 posts that were edited by TL0/TL1 users
  3. New Topics by Category
  4. Top 100 Active Topics
  5. Top 100 Likers
  6. User Participation Statistics
  7. Top 50 Largest Uploads
  8. Inactive Users with no posts
  9. Most Active Lurkers
  10. List of assigned topics by user
  11. Group Members Reply Count
  12. Total topics assigned per user
  13. Top tags per year

How would I fix this?

Thank you again for your help. Much appreciated!

That I’m not sure of. :slightly_smiling_face: But you can contact support at the email address shown on your dashboard, and they should be able to help. :+1:
