Run Data Explorer queries with the Discourse API

Virtually any action that can be performed through the Discourse user interface can also be triggered with the Discourse API. For a general overview of how to find the correct API request for an action, see Reverse engineer the Discourse API.

To run a Data Explorer query from the API, you need to make a POST request to /admin/plugins/explorer/queries/<query-id>/run. You can find the a Data Explorer query’s ID by visiting the query on your site and getting the value of the id parameter that you’ll see in your browser’s address bar.

This example query has an ID of 20 on my site. It returns a list of topics, ordered by views, for a given date:

--[params]
-- date :viewed_at

SELECT
topic_id,
COUNT(1) AS views_for_date
FROM topic_views
WHERE viewed_at = :viewed_at
GROUP BY topic_id
ORDER BY views_for_date DESC

This query can be run from a terminal with:

curl -X	POST "https://discourse.example.com/admin/plugins/explorer/queries/20/run" \
-H "Content-Type: multipart/form-data;" \
-H "Api-Key: <your_all_users_api_key>" \
-H "Api-Username: system" \
-F 'params={"viewed_at":"2019-06-10"}'

You will need to substitute your site’s base URL, query ID, and All Users API key into the example.

Handling queries that return over 1000 rows

:warning: The limit=ALL support for CSV exports has been removed, but should still work for JSON exports.

By default, the Data Explorer plugin returns a maximum of 1000 rows. You can bypass this limit by adding a limit=ALL parameter to the request. The query below will generate 10000 rows.

SELECT
*
FROM generate_series(1, 10000)

On my site, the query’s ID is 26. Here’s how to get all the query’s rows:

curl -X POST "https://discourse.example.com/admin/plugins/explorer/queries/26/run" \
-H "Content-Type: multipart/form-data;" \
-H "Api-Key: <your_all_users_api_key>" \
-H "Api-Username: system" \
-F "limit=ALL"

Paginating the results

For a query that returns a very large data set, you can paginate the results. Given this example query (it has an id of 27 on my site):

--[params]
-- integer :limit = 100
-- integer :page = 0

SELECT
*
FROM generate_series(1, 10000)
OFFSET :page * :limit
LIMIT :limit

you can return the first 100 rows with

curl -X POST "https://discourse.example.com/admin/plugins/explorer/queries/27/run" \
-H "Content-Type: multipart/form-data;" \
-H "Api-Key: <your_all_users_api_key>" \
-H "Api-Username: system" \
-F 'params={"page":"0"}'

You can keep incrementing the value of the page parameter until the result_count that is returned is 0.

Removing relations data from the results

When Data Explorer queries are run through the user interface, a relations object is added to the results. This data is used for rendering the user in UI results, but you are unlikely to need it when running queries via the API. To remove that data from the results, pass a download=true parameter with your request:

curl -X POST "https://discourse.example.com/admin/plugins/explorer/queries/27/run" \
-H "Content-Type: multipart/form-data;" \
-H "Api-Key: <your_all_users_api_key>" \
-H "Api-Username: system" \
-F 'params={"page":"0"}' \
-F "download=true"

API authentication

Details about generating an API key for the requests can be found here: Create and configure an API key. If the API key is only going to be used to run Data Explorer queries, select “Granular” from the Scope drop down, then select the “run queries” scope.

Common Questions

Is there any api endpoint I can use to get the list of reports and the ID numbers? I want to build a dropdown with the list in it?

Yes, you can make an authenticated GET request to /admin/plugins/explorer/queries.json to get a list of all queries on the site.

Is it possible to send parameters with the post request?

Yes, the first code example in the OP should give you what you are looking for. It passes a viewed_at parameter, but a similar approach would work with a category_id parameter.

Additional Resources

38 Likes

This comment seems to imply that you can do CSV export from the API. Is that possible? Just curious because I need the data as CSV. I can always get it as JSON and convert to CSV but if there is a built-in way to get CSV that would be a little easier.

Is it possible to do a query ‘like created or updated the last 50 seconds’?

:robot: AI says

-- [params]
-- int :seconds = 50

SELECT
    p.id AS post_id,
    p.created_at,
    p.updated_at,
    p.raw AS post_content,
    p.user_id,
    t.title AS topic_title,
    t.id AS topic_id
FROM posts p
INNER JOIN topics t ON t.id = p.topic_id
WHERE
    (EXTRACT(EPOCH FROM (NOW() - p.created_at)) <= :seconds
    OR EXTRACT(EPOCH FROM (NOW() - p.updated_at)) <= :seconds)
    AND p.deleted_at IS NULL
    AND t.deleted_at IS NULL
ORDER BY p.created_at DESC
LIMIT 50

Did a quick test, it seems to work :slight_smile:

I don’t see the “like :heart:” I added to a post using your query.

I think you need to use post_actions

--[params]
--string :timespan = 50 seconds

SELECT post_id,
       user_id, 
       created_at, 
       updated_at, 
       deleted_at
FROM  post_actions
WHERE post_action_type_id=2 AND updated_at > NOW() - INTERVAL :timespan

Version with more beautiful results

--[params]
--string :timespan = 50 seconds
--boolean :include_in_timespan_deleted = false

SELECT 
  post_id, 
  user_id, 
  CASE
    WHEN EXTRACT(EPOCH FROM (NOW() - created_at)) < 60 THEN CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - created_at))), ' seconds ago')
    WHEN EXTRACT(EPOCH FROM (NOW() - created_at)) < 3600 THEN CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - created_at)) / 60), ' minutes ago')
    WHEN EXTRACT(EPOCH FROM (NOW() - created_at)) < 86400 THEN CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - created_at)) / 3600), ' hours ago')
    ELSE CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - created_at)) / 86400), ' days ago')
  END AS relative_created_at,
  CASE
    WHEN EXTRACT(EPOCH FROM (NOW() - updated_at)) < 60 THEN CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - updated_at))), ' seconds ago')
    WHEN EXTRACT(EPOCH FROM (NOW() - updated_at)) < 3600 THEN CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - updated_at)) / 60), ' minutes ago')
    WHEN EXTRACT(EPOCH FROM (NOW() - updated_at)) < 86400 THEN CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - updated_at)) / 3600), ' hours ago')
    ELSE CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - updated_at)) / 86400), ' days ago')
  END AS relative_updated_at,
  CASE
    WHEN deleted_at IS NULL THEN 'no'
    WHEN EXTRACT(EPOCH FROM (NOW() - deleted_at)) < 60 THEN CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - deleted_at))), ' seconds ago')
    WHEN EXTRACT(EPOCH FROM (NOW() - deleted_at)) < 3600 THEN CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - deleted_at)) / 60), ' minutes ago')
    WHEN EXTRACT(EPOCH FROM (NOW() - deleted_at)) < 86400 THEN CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - deleted_at)) / 3600), ' hours ago')
    ELSE CONCAT(ROUND(EXTRACT(EPOCH FROM (NOW() - deleted_at)) / 86400), ' days ago')
  END AS relative_deleted_at
FROM 
  post_actions
WHERE 
  post_action_type_id = 2 
  AND updated_at > NOW() - INTERVAL :timespan
  AND (
    :include_in_timespan_deleted = false 
    OR (deleted_at IS NOT NULL AND deleted_at > NOW() - INTERVAL :timespan)
  )

2 Likes

Oooh I misread! I read the sentence as "Is it possible to do a query like ‘created or updated the last 50 seconds’?

(notice the single quote position)

1 Like

Thanks all,

I posted here because that is where ask.discourse pointed me to.
What I’m looking for is if the API is providing this capability.

1 Like

Yes, of course.

You create the query in data explorer, then you run the query through the API as described in this guide.

I just ran Moin’s query via the API and it properly returned the expected results.

3 Likes

I was wondering about this too. Is JSON the only way of exporting data via the API or is CSV export also supported for Data Explorer?